Position: Data Engineer
Location: Remote/Miami, Florida
Duration: 6+ Months
Mode of Interview: Phone and Skype
Skills: Tableau, SQL, Python, Spark, AWS, Linux
The client is looking for a Data Engineer with 5-8 years of experience in data management and data pipeline development. The engineer should be capable of designing, developing, and implementing data pipelines that integrate third-party data and packaged solutions into our data lake and data warehouse. The ideal candidate will have worked in a Microsoft Azure or AWS cloud stack, migrated from self-hosted data platforms such as SQL, Teradata, or Hadoop, and have experience with visualization tools such as Tableau or Power BI.
Key Responsibilities:
- Onboard data from a variety of sources and formats, such as text, Parquet, Avro, JSON, and XML
- Build the data pipelines required to extract, transform, clean, and move data to the data lake, data warehouse, or operational data store, covering the overall life cycle of data development
- Develop and test scripts and exception handling processes based upon design
- Develop policies, procedures, and other functional documentation related to maintaining data environments
- Consult with analysts, engineers, and other technical support personnel to support development activities
- Provide operational support to ensure timely delivery of data at a consistent quality
- Work with technical counterparts in the Information Services Department on physical environment management and development
- Troubleshoot and resolve system resource issues
- Troubleshoot and resolve data quality issues
- Resolve performance issues in large-scale data environments
- Document technical artifacts for developed solutions
- Create and manage published Tableau datasets integrated with existing and future data warehouses
Technical Skills: (Must have)
- 5+ years – SQL
- 5+ years – Python
- 3+ years – Spark
- 1+ year – Containers (Docker)
- 2+ years – Tableau
- 3+ years – Linux
- 2+ years – Relational data modeling
- Understanding of enterprise level source control procedures
- Continuous integration and deployment tools (Azure DevOps, Jenkins)
- Hadoop (Big Data, Any distribution)
- Relational databases (MS SQL, Teradata, etc.)
- Cloud and big data architecture
- AWS data stack (S3, Glue, Athena, Data Pipeline, Data Exchange, Kinesis, and Lambda)
- 3+ years – Working on a development team requiring collaboration on shared code
- Experience with work tracking tools such as Jira
- Experience handling Data Security and Data Encryption
Technical Skills: (Nice to have)
- R, TensorFlow, PyTorch, and other machine learning libraries
- Azure data management solutions
- NoSQL databases (MongoDB, etc.)
Soft Skills:
- Ability to interact and communicate effectively, both orally and in writing, with individuals at all levels of the organization
- Must be able to work in a fast-paced environment, with demonstrated ability to juggle multiple competing tasks and demands; experience working in Agile environments
- Should be self-motivated, goal-oriented, quality-driven, and capable of working independently
Education:
- Bachelor’s degree in Computer Science, Mathematics, or comparable job experience
- Certifications in data management tool stacks for development and administration on Azure or AWS