Anmol.pdf
EXPERIENCE
CIP Data Engineering Intern, Eaton India Innovation Center, Pune June 2023-Aug 2023
● Analyzed the data readiness of 9 adopters over 30-50 days and built a production-ready Kibana dashboard,
reducing data-related issues by almost 20%.
● Developed the ARC (Accessibility, Representability, Contextuality) model, resulting in a nearly 30% reduction
in model downtime.
● Used the data validation tool Great Expectations and the data drift tool WhyLabs, reducing false positives by
nearly 40% and enabling early detection of data anomalies.
● Built data pipelines with Apache Airflow, saving nearly 30% of data analysis time and improving
decision-making.
ACADEMIC PROJECT
Character-level nano-GPT model (from scratch) | PyTorch, Transformer, Transfer learning, GPT, ANN
● Developed a character-level language model for seq2seq text generation, initially employing a bigram model
with a train loss of 2.4691 and a validation loss of 2.4889.
● Enhanced the model by building a Nano-GPT (small GPT-2) transformer from scratch.
● Implemented multi-head decoder attention blocks, reaching a train loss of 0.2257 and a validation loss of
0.2442, a significant improvement over traditional RNN and LSTM models.
Uber Data Engineering Project | Python, Cloud storage, BigQuery, Looker Studio, MageAI
● Built a data analytics project with an interactive dashboard that surfaces insights from 0.1 million Uber records.
● Built a DAG with an ETL tool to establish a data pipeline, deployed on a Google Compute Engine instance, that
retrieves raw data from Google Cloud Storage and stores it in BigQuery.
● Designed and implemented a dimensional data model using SQL queries for CRUD operations.
TECHNICAL SKILLS
Programming Skills: C++, Python, SQL, Flask, PyTorch, JavaScript, HTML, CSS
Industrial Skills: OOP, Machine Learning, Deep Learning, Data Engineering, Statistics
Soft Skills: Presentation skills, Communication skills, Agile Development
Industrial Tools: Power BI, GCP, AWS, Apache Airflow, Elasticsearch, Linux, IBM SPSS, Git, Postman, Databricks