Required Skills

Distributed systems, Data architecture, HBase

Work Authorization

  • Citizen

Preferred Employment

  • Full Time

Employment Type

  • Direct Hire

Education Qualification

  • UG: Not Required

  • PG: Not Required

Other Information

  • No. of positions: 1

  • Posted: 20th Jul 2022


  • You will partner with teammates to create complex data processing pipelines in order to solve our clients' most complex challenges

  • You will collaborate with Data Scientists in order to design scalable implementations of their models

  • You will pair to write clean and iterative code based on TDD

  • Leverage various continuous delivery practices to deploy, support and operate data pipelines

  • Advise and educate clients on how to use different distributed storage and computing technologies from the plethora of options available

  • Develop and operate modern data architecture approaches to meet key business objectives and provide end-to-end data solutions

  • Create data models and speak to the tradeoffs of different modeling approaches

  • Seamlessly incorporate data quality into your day-to-day work as well as into the delivery process

  • Assure effective collaboration between Thoughtworks and the client's teams, encouraging open communication and advocating for shared outcomes

  • You're resilient and flexible in ambiguous situations and enjoy solving problems from technical and business perspectives

  • You have an interest in coaching, sharing your experience and knowledge with teammates

  • You enjoy influencing others and always advocate for technical excellence while being open to change when needed

  • Presence in the external tech community: you willingly share your expertise with others via speaking engagements, contributions to open source, blogs and more

  • You have a good understanding of data modelling and experience with data engineering tools and platforms such as Kafka, Spark, and Hadoop

  • You have built large-scale data pipelines and data-centric applications using any of the distributed storage platforms such as HDFS, S3, NoSQL databases (HBase, Cassandra, etc.) and any of the distributed processing platforms like Hadoop, Spark, Hive, Oozie, and Airflow in a production setting

  • Hands-on experience in MapR, Cloudera, Hortonworks and/or cloud (AWS EMR, Azure HDInsight, Qubole, etc.) based Hadoop distributions

  • You are comfortable taking data-driven approaches and applying data security strategy to solve business problems

  • Working with data excites you: you can build and operate data pipelines, and maintain data storage, all within distributed systems

  • You're genuinely excited about data infrastructure and operations, with a familiarity working in cloud environments


Company Information