Machine Learning QnA Session @ DDU Jan 2025

The document outlines the educational and professional journeys of Vyom and Muktan, highlighting their experiences in computer science, machine learning, and research. Key learnings emphasize the importance of strong foundational knowledge, collaboration in research, and the value of practical experience through internships and competitions. It also includes resources and advice for beginners in machine learning and related fields.


Intros

Vyom

● BTech CE Undergrad @ DDU - Circa 2017 - 2021


○ Circa 2017 - 2018 - Understanding basics of Math, Data Structures, and
Algorithms
■ Math, Data Structures, and Algorithms are foundational to all subfields in CS!
■ Computer Networks and hardware protocols are foundational to all subfields in CS/CE as well!
■ ACM ICPC 2018 - University Rank 1
○ Circa 2018 - 2019 - Understanding Machine Learning Algorithms
■ GDSC events introduced ML as a sub-branch of CS
■ Andrew Ng Coursera Course to understand basics of ML
■ Kaggle Competitions for honing basic ML skills
■ GDSC ML Lead
■ Open source contributions to understand ML-related libraries - Pandas, Keras, Hugging Face, and PyTorch Ignite
● Collaboration in ML is high-signal learning
○ Circa 2019 - 2021 - Understanding Speech Modality
■ Solving problems in one of the human modalities [Speech, Text, and Vision] to understand the impact of deep learning
■ Research work with BSB sir in Speech Recognition to understand DL in
depth
● Published in ACM and ACL Anthology
■ Research work @ ISRO on the application of NNs to coastal water quality degradation
● Published in IEEE
■ Kaggle competition on a text classification problem to understand the text modality along with a different training framework (PyTorch)
● Silver Medal
● MS in CS @ University of Florida (UFL) - Circa 2021 - 2023
○ Pivoting to Text Modality [Language Models]
■ Research @ DSR Lab
● Worked on text embeddings and textual graph based problems
■ ML research intern @ Apple Siri text-to-speech (TTS)
● Built large acoustic models
■ Applied Science Intern @ Amazon Alexa
● Built attention based sequence to sequence classification system
● Internal research conference helped reinforce that research thrives on collaboration (reach out to paper authors - they are very kind in helping you, and you will probably also get to work on cool stuff with them!)
● Scaling Language Model Knowledge - Circa 2023-2024
○ Applied Scientist 2 @ Chronograph
■ Working on building systems at scale to solve financial NLP tasks
○ Researcher @ UFL
■ Working on improving faithfulness in reasoning for large foundation
models using Human/AI Corrections
● Focusing on research direction on large foundation models - Present
○ Applied Scientist @ Amazon
■ Working on scaling embedding model training
○ Independent Research
■ Working on robust reasoning for foundation models

Muktan
● Software intern @ Atliq
● Research work with BSB sir in Speech recognition
● Data science internship @ ISRO, worked on precipitation data analysis
● Kaggle Silver medal
● Open source contributions to Pandas, PyTorch Ignite, and Hugging Face
● MS in CS @ UTD
● ML internship @ Apple, worked on speech recognition
● Applied Scientist internship @ Amazon, worked on Reinforcement Learning based
recommender systems, published a paper at internal conference (AMLC '23)
● ML internship @ Finesse (fashion-tech startup), worked on scraping and ranking of
trending fashion outfit posts
● MLE @ Apple, working on Apple Speech Recognition team

Key Learning

● Make basic foundations strong


● Difficulty while learning often leads to a finer understanding of concepts
● Learning in groups/pairs provides accountability and rich external signals
● Collaboration in ML research/engineering is key
● Prioritization and open communication are key skills
● Not every paper has to be a great paper, especially for beginners
○ Definitely worth writing a paper just to learn
● Give yourself some breaks and have fun along the way too!

Reach out:
(include your questions in the LinkedIn requests so we can connect faster)
Vyom - https://www.linkedin.com/in/01-vyom/
Muktan - https://www.linkedin.com/in/muktan-patel/

Open discussion

(Optional) leave a name

Mods can add questions and answers here so we can keep these QnA session results for posterity.

● How to do math for ML?


○ Differentiation and Integration through DDU math courses are a good starting point.
○ Algorithms and Data Structures are important as well
○ Andrew Ng Coursera ML Course
○ https://youtu.be/ZK3O402wf1c?si=IHUUTSCZQzSfztD4
○ https://www.youtube.com/@3blue1brown
● Basic ML -> Learning Complex ML concepts
○ Kaggle Competitions for basic ML
○ DL - https://youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&si=6ohVORx7gx5QiytH
○ [1hr Talk] Intro to Large Language Models
○ ML Engineering
■ How to read, process, and prepare the data
○ Try reading papers and implementing them (a minimal attention sketch follows this list)
■ [1706.03762] Attention Is All You Need
○ Use AI to learn
■ https://chat.openai.com/
■ https://claude.ai/
■ https://notebooklm.google/
■ https://huggingface.co/
■ https://colab.research.google.com/
● Future of SaaS
○ Understand MLE
○ Use ML tools to get familiar with everything little by little
● AI replacing SaaS
○ Not in the near future because of the barrier of adaptation
● Working with a lead researcher
○ How to write a paper (which is different from writing report)
○ How to develop experiments
○ How to handle rebuttals
○ Going through the whole paper submission and publishing process is a very good learning experience
○ How to read papers
○ How to pick a research direction
○ How to pivot when an idea fails
○ How to shape an idea into a research paper
● Focus on CP or ML
○ Data Structures and Algorithms are the foundation
○ CP-level skill is not necessary for ML
○ Focus on newer DL techniques after understanding basic ML
○ CP itself is a good path too!
● AI/ML in India vs US
○ Learning and foundation can be done anywhere
○ However, the USA has better opportunities, and the barriers to getting into AI/ML are lower compared to India
○ It's harder to get a job if you are not in a top-10 college
○ Reach out to people for AI/ML internship/externship research opportunities
● Working on Gujarati ASR and challenges
○ Understanding huge Deep Learning models
○ Not enough resources (compute) to do training
○ Not enough data
○ Understanding NN and deep learning as a whole
○ Writing paper
● Open Source Contributions
○ Guidance
○ Getting your hands dirty!
○ How to find issues, github, etc…
● NLP or LLM
○ Basic DL
○ Basic NLP - CS224n: Natural Language Processing with Deep Learning
○ [1706.03762] Attention Is All You Need
○ [1hr Talk] Intro to Large Language Models
○ Andrej Karpathy - YouTube
○ https://twitter.com/home [not necessary]
● MLOps: W5H (who, what, when, where, why, how)
○ Docker - Kubernetes
○ GPUs - Distributed GPUs, CPUs, etc…
○ AWS / Other cloud
○ Parallel computing - Dask, Ray, MLflow, Spark
○ Black Box System - Wrapper
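
On the "Try reading papers and implementing them" point above, here is a minimal PyTorch sketch of scaled dot-product attention, the core operation in "Attention Is All You Need". Tensor shapes and the function name are illustrative choices, not from the session:

import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, heads, seq_len, seq_len)
    if mask is not None:
        # positions where mask == 0 are hidden from attention
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

# Toy usage: batch of 2, 4 heads, sequence length 5, head dimension 8
q = k = v = torch.randn(2, 4, 5, 8)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)  # torch.Size([2, 4, 5, 8]) torch.Size([2, 4, 5, 5])

Re-implementing small pieces like this and comparing against a reference implementation is a low-stakes way to check you actually understood the paper.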

Q) Can you share some resources for fusion techniques?


Answer)
1. https://www.kaggle.com/competitions/commonlitreadabilityprize/discussion/237129
2. https://www.kaggle.com/code/cdeotte/forward-selection-oof-ensemble-0-942-private
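
The linked discussions cover more advanced stacking/selection; as a minimal illustration of the basic idea, here is a sketch of weighted blending of two models' out-of-fold predictions. The arrays are random placeholders standing in for real OOF predictions and labels:

import numpy as np
from sklearn.metrics import roc_auc_score

# Placeholder out-of-fold predictions from two trained models, plus true labels
oof_model_a = np.random.rand(1000)
oof_model_b = np.random.rand(1000)
y_true = np.random.randint(0, 2, size=1000)

# Search the blend weight on OOF predictions; the best weight would then be
# reused to blend the two models' test-set predictions
best_w, best_score = 0.0, -1.0
for w in np.linspace(0, 1, 21):
    blend = w * oof_model_a + (1 - w) * oof_model_b
    score = roc_auc_score(y_true, blend)
    if score > best_score:
        best_w, best_score = w, score
print(f"best weight for model A: {best_w:.2f}, OOF AUC: {best_score:.4f}")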
Q) As a first-year student, how do I start ML? I know we should do DSA and Python libraries, but how should we proceed after that? And an honest question: is college enough? It covers DSA in the 2nd year. How should I proceed, given that I know Python so far and am in my 1st year?

Answer:
1. Start with the ML course by Andrew Ng and get the basics of ML clear (take your time to complete it)
2. Get hands-on experience by working on Kaggle competitions; start with the easier ones
3. Select one modality (text, image, tabular data) and work on a live Kaggle competition to get more experience
IMP: Try to work on your DSA basics in the 1st year and the first half of the 2nd year of university, then from the 4th semester start with the above steps
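
As a rough sketch of what step 2 can look like in practice, here is a tiny tabular baseline with scikit-learn; the built-in dataset is a stand-in for a competition's train.csv:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for reading a competition's tabular data, e.g. pandas.read_csv("train.csv")
data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

# A simple baseline model, checked with cross-validation before making a submission
model = RandomForestClassifier(n_estimators=200, random_state=42)
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

Getting a baseline like this scored on the leaderboard first, and only then iterating on features and models, is the usual Kaggle workflow.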

Q) How can I get started with computer vision after completing foundational courses in Machine Learning and Deep Learning? Please also share resources if you know of any.
Answer)
● Exploring problems to solve in Computer Vision
○ https://www.kaggle.com/competitions/global-wheat-detection - this can be a good starting point to learn the basics
○ Picking problems - Kaggle, reading papers, talking to people working in the CV research space (reaching out to people)
○ https://www.coursera.org/specializations/deep-learning
○ Augmentation (see the augmentation sketch after this list)
○ Image types
○ Understanding "Attention Is All You Need", but for Vision
■ Previous works from the below paper
■ [1512.03385] Deep Residual Learning for Image Recognition
○ Good ideas now -
■ Merging with Language Models - CLIP: Connecting text and images |
OpenAI
■ Image Generation
■ Image Understanding
■ Video Generation/Understanding - Veo 2 - Google DeepMind
● AI as a security breach / data privacy issue
○ Opt-outs for all software
○ Opt-outs for your internet data (websites, etc…) - robots.txt
○ https://www.lesswrong.com/posts/SGDjWC9NWxXWmkL86/keeping-content-out-of-llm-training-datasets
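
On the "Augmentation" point in the computer vision answer above, here is a minimal torchvision sketch of a typical training-time augmentation pipeline; the image path is a placeholder and the exact transforms/values are illustrative, not prescribed in the session:

from PIL import Image
from torchvision import transforms

# Typical training-time augmentations for an image classifier
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),  # ImageNet statistics
])

img = Image.open("example.jpg").convert("RGB")  # placeholder path
tensor = train_transforms(img)
print(tensor.shape)  # torch.Size([3, 224, 224])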
