US Citizen
Green Card
EAD (OPT/CPT/GC/H4)
H1B Work Permit
Corp-Corp
W2-Permanent
W2-Contract
Contract to Hire
Consulting/Contract
UG: Not Required
PG: Not Required
No. of positions: 1
Posted: 29th Sep 2023
As a Big Data Engineer at LTI MINDTREE, you will play a critical role in managing and optimizing our data infrastructure, supporting data-driven decision-making, and enabling advanced analytics. You will work closely with cross-functional teams to design, develop, and maintain scalable Big Data solutions.
Key Responsibilities:
ETL Development: Create, maintain, and optimize Extract, Transform, Load (ETL) processes to efficiently ingest and process large volumes of data from various sources into our data warehouse.
Data Warehousing: Design and maintain our data warehousing architecture, ensuring data is stored in a structured and accessible manner for analytics and reporting.
Data Analytics: Collaborate with data analysts and data scientists to understand their requirements and provide clean, reliable datasets for analysis. Implement data models and algorithms to support data analytics initiatives.
PySpark Development: Develop and maintain PySpark applications to process and analyze Big Data efficiently. Optimize PySpark jobs for performance and scalability.
Hive Query Optimization: Write and optimize HiveQL queries for data retrieval and transformation. Ensure Hive tables are organized and partitioned for optimal query performance.
Data Quality Assurance: Implement data quality checks, validation rules, and monitoring processes to ensure data accuracy and consistency.
Documentation and Collaboration: Document ETL processes, data models, and solutions. Collaborate with cross-functional teams, including data scientists, analysts, and business stakeholders.
Performance Tuning: Continuously monitor and tune the performance of Big Data jobs and processes to meet SLAs and performance expectations.
Qualifications:
Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
Proven experience as a Big Data Engineer, Data Engineer, or in a similar role.
Strong proficiency in PySpark and Hive for Big Data processing.
Experience with ETL tools and methodologies in a Big Data environment.
Proficiency in SQL and relational databases.
Familiarity with data warehousing concepts and technologies.
Strong problem-solving skills and the ability to work in a collaborative team environment.
Excellent communication and documentation skills.
Knowledge of cloud-based Big Data platforms (e.g., AWS EMR, Google Dataproc) is a plus.