Bachelor’s degree in data science, computer science, or a related technical field
5+ years of experience in statistical analysis and machine learning within the Python ecosystem, including TensorFlow, PyTorch, pandas, NumPy, and scikit-learn
3+ years of experience creating ML pipelines incorporating both structured and unstructured data
3+ years of experience working with varied data sources, including diverse file types, web service APIs, relational databases, and NoSQL databases
3+ years of experience implementing data science solutions on AWS, Google Cloud, or Azure
Experience with supervised and unsupervised learning techniques as well as classification, regression, and anomaly detection problems
Broad understanding of machine learning including model training, hyperparameter tuning, optimization, performance evaluation, inference, model interpretability, and GPU acceleration
Experience scaling and optimizing data workloads on distributed systems
Experience working with big data file and table formats such as Parquet, Avro, and Iceberg
Proven ability to perform development at all stages of the data science lifecycle
Foundational understanding of software engineering, algorithms, and data structures
Familiarity with model lifecycle management and MLOps concepts
History of continuous learning and incorporation of open data and software to solve problems