Postgraduate qualification in Statistics / Computer Science / Data Science from a premier institute
Exposure to tools – Python, R and R Shiny
Experience with a variety of data stores such as MongoDB, Cassandra, HBase, MySQL/Postgres
Must have experience in Agile development methods.
Experience in natural language processing technologies and services with emphasis on one or more of the following:
Data acquisition and NL modeling: harvesting NL resources from the Web, rapid bootstrapping of domain-specific and multilingual NL models for named-entity recognition, syntactic parsing and text classification
NL systems: large-scale development and deployment, performance monitoring, tuning and optimization of NL models
NL methodology: grammar-based, data-driven, machine learning-based and hybrid approaches
NL technologies: spoken language understanding, language translation, natural language search, syntax-semantics interfaces using ontologies, discourse processing, language generation and summarization, interactive NL systems
In-depth understanding of language processing technology, a strong software background and experience with processing large-scale multilingual texts through effective use of computing resources.
Hands-on experience with NLTK, tokenization, lemmatization, NER model architecture, and creating an NLP pipeline with sentence resolver, relationship extraction and chunk merge approach
Annotation of texts using NLP annotators such as INCEpTION
Transforming a model from research to production, and ability to fine-tune model performance in a production setting using adaptive/incremental machine learning techniques
Job Responsibilities:
Developing advanced algorithms that solve problems of large dimensionality in a computationally efficient and statistically effective manner
Implementing statistical and data mining techniques (e.g. hypothesis testing, machine learning and retrieval processes) on large amounts of data to identify trends, patterns and other relevant information
Knowledge of databases, data modeling, data normalization etc. is a must
Collaborating with clients and other stakeholders to effectively integrate and communicate analysis findings.