NLP Data Engineer
You will be working as an NLP Data Engineer on an investment team that is committed to producing
high-quality fundamental research using advanced data analytics to help inform investment decisions. You will
need to be strategic and innovative in the hunt for information and work closely with analysts when
obtaining new data sets and deriving unconventional data insights.
Responsibilities
As an NLP Data Engineer specializing in text data and natural language processing, you will develop and
scale innovative NLP/ML/DL algorithms to extract and normalize data insights from unstructured text. We
will rely on you to test the data to ensure accuracy and quality. You will also own the process for
identifying and rectifying data breaks, and for scaling algorithms as needed.
Qualifications
• Working knowledge and experience in NLP core components (NER, Entity Disambiguation)
• Experience with at least one of the following: Keras, TensorFlow, Caffe, or PyTorch
• Experience training large language models (LLMs)
• 3+ years of Python development experience.
• 3+ years of experience building and deploying machine learning or NLP-intensive AI
algorithms from scratch.
• Experience writing maintainable, testable, production-grade code in one or more
general-purpose languages (Java, C/C++, Python, etc.)
• Familiarity with techniques and tools for crawling, extracting, and processing data (Scrapy,
pandas, MapReduce, SQL, BeautifulSoup, etc).
• Experience with version control, open-source practices, and code review.
Preferred Qualifications
• Experience with deep learning NLP toolkits and models such as Hugging Face Transformers,
Deep Graph Library, DGL-KE, spaCy, ELMo, BERT.
• MS/PhD in NLP, ML, AI, Engineering or equivalent
• 3 years of experience as a Data Engineer handling large-scale web scrapes, data pipelines,
and platforms, with a strong understanding of building cloud-based data platforms
(AWS Spark, EMR, SageMaker, Comprehend).
• Experience with data quality and validation.