Automatic Machine Learning
By: Himadri Mishra, 13074014
Overview: What is Machine Learning?
● Subfield of computer science
● Evolved from the study of pattern recognition and
computational learning theory in artificial intelligence
● Gives computers the ability to learn without being
explicitly programmed
● Explores the study and construction of algorithms that
can learn from and make predictions on data
Basic Flow of Machine Learning
Overview: Why Machine Learning?
● Some tasks are difficult to define algorithmically.
Example: Learning to recognize objects.
● Produces high-value predictions that can guide better
decisions and smart actions in real time without human
intervention
● Machine learning is a technology that helps analyze
large volumes of data
What is Automatic Machine Learning?
● Research area that targets progressive automation of
machine learning
● Also known as AutoML
● Focuses on end users without expert knowledge
● Offers new tools to machine learning experts:
○ Perform architecture search over deep representations
○ Analyse the importance of hyperparameters
○ Develop flexible software packages that can be instantiated
automatically in a data-driven way
● Follows the paradigm of Programming by Optimization (PbO)
Examples of AutoML
● AutoWEKA: Approach for the simultaneous selection of a machine learning
algorithm and its hyperparameters
● Deep Neural Networks: notoriously dependent on their hyperparameters;
modern optimizers have achieved better results in setting them than humans
(Bergstra et al., Snoek et al.).
● Making a science of model search: a complex computer vision architecture
could automatically be instantiated to yield state-of-the-art results on 3
different tasks: face matching, face identification, and object
recognition.
Methods of AutoML
● Bayesian optimization
● Regression models for structured data and big data
● Meta learning
● Transfer learning
● Combinatorial optimization.
An AutoML Framework
Modules of AutoML Framework, unraveled
● Data Pre-Processing
● Problem Identification and Data Splitting
● Feature Engineering
● Feature Stacking
● Application of various models to data
● Decomposition
● Feature Selection
● Model selection and HyperParameter tuning
● Evaluation of Model
Data Pre-Processing
● Tabular data is the most common way of representing data
in machine learning or data mining
● Data must be converted to a tabular form
Problem Identification and Data Splitting
Types of Labels
● Single column, binary values (Binary Classification)
● Single column, real values (Regression problem)
● Multiple columns, binary values (Multi-Class
Classification)
● Multiple columns, real values (Multiple-target Regression
problem)
● Multilabel Classification
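The label types above can be told apart from the shape and values of the label array. The following is a minimal sketch, assuming NumPy and a hypothetical helper named `identify_problem`; it is not the deck's own module. It also assumes that multi-class labels arrive one-hot encoded (exactly one 1 per row), so a single column of integer class IDs would be treated as regression here.

```python
import numpy as np

def identify_problem(y):
    """Guess the task type from the shape and values of the label array."""
    y = np.asarray(y)
    if y.ndim == 1:
        y = y.reshape(-1, 1)
    is_binary = np.isin(y, (0, 1)).all()
    n_cols = y.shape[1]
    if n_cols == 1:
        return "binary_classification" if is_binary else "regression"
    if not is_binary:
        return "multi_target_regression"
    # Assumption: one-hot rows (exactly one 1 per row) mean multi-class;
    # any other binary pattern is treated as multilabel.
    if (y.sum(axis=1) == 1).all():
        return "multiclass_classification"
    return "multilabel_classification"

print(identify_problem([0, 1, 1, 0]))
print(identify_problem([[1, 1, 0], [0, 1, 0]]))
```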
● Stratified KFold splitting for classification, so that each
fold keeps the class proportions
● Plain KFold splitting for regression
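A minimal sketch of this splitting rule, assuming scikit-learn. The helper name `make_folds` and the toy data are hypothetical and only illustrate the choice between the two splitters.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

def make_folds(X, y, task, n_splits=5, seed=0):
    """Return a list of (train_idx, valid_idx) pairs for the given task type."""
    if task == "classification":
        # Stratification keeps the class ratio roughly equal in every fold.
        splitter = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    else:
        splitter = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return list(splitter.split(X, y))

# Toy data: 20 samples, 5 of them positive.
X = np.arange(40).reshape(20, 2)
y = np.array([1] * 5 + [0] * 15)
folds = make_folds(X, y, "classification", n_splits=5)

for train_idx, valid_idx in folds:
    print(len(valid_idx), int(y[valid_idx].sum()))
```

With stratification, the five positive samples are spread evenly over the five validation folds; a plain KFold split gives no such guarantee.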
Feature Engineering
Types of Variables
● Numerical Variables
○ No Processing Required
● Categorical Variables
○ Label Encoders
○ One Hot Encoders
● Text Variables
○ Count Vectorizer
○ TF-IDF Vectorizer
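The encoders named above all have counterparts in scikit-learn. The sketch below assumes scikit-learn and uses made-up toy columns (`colors`, `texts`); it shows one call per encoder rather than a full feature-engineering module.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical toy columns: one categorical, one free text.
colors = ["red", "green", "red", "blue"]
texts = ["auto ml tunes models", "ml models learn", "auto tuning", "models models"]

# Categorical variable: integer codes, or one indicator column per category.
label_ids = LabelEncoder().fit_transform(colors)
one_hot = OneHotEncoder(handle_unknown="ignore").fit_transform([[c] for c in colors])

# Text variable: raw token counts, or TF-IDF weighted counts (both sparse).
counts = CountVectorizer().fit_transform(texts)
tfidf = TfidfVectorizer().fit_transform(texts)

print(label_ids, one_hot.shape, counts.shape, tfidf.shape)
```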
Feature Stacking
● Two kinds of stacking
○ Model Stacking
■ An ensemble approach
■ Combines the power of diverse models into a single model
○ Feature Stacking
■ Different features are combined after processing
● Our Stacker Module is a feature stacker
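Feature stacking in this sense amounts to placing the processed feature blocks side by side. A minimal sketch, assuming SciPy sparse matrices and hypothetical toy blocks; the deck's actual Stacker Module is not shown in the slides, so this is only an illustration of the idea.

```python
import numpy as np
from scipy import sparse

# Hypothetical processed blocks for the same three rows.
numeric = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
one_hot = sparse.csr_matrix([[1, 0], [0, 1], [1, 0]])
tfidf = sparse.csr_matrix([[0.0, 0.7, 0.7], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

# Stack column-wise; the dense block is converted so the result stays sparse.
stacked = sparse.hstack([sparse.csr_matrix(numeric), one_hot, tfidf], format="csr")

print(stacked.shape)
```

Keeping the result sparse matters when the one-hot and text blocks have many columns, since a dense copy could be far larger.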
Application of models and Decomposition
● Start with tree-based ensemble models:
○ Random Forest Regressor/Classifier
○ Extra Trees Regressor/Classifier
○ Gradient Boosting Machine Regressor/Classifier
● Linear models should not be applied without normalization
○ For dense features, use Standard Scaler normalization
○ For sparse features, scale to unit variance only, without
centering about the mean
● If the above steps give a “good” model, move on to the
hyperparameter optimization module; otherwise continue
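The two normalization cases can be sketched with scikit-learn's `StandardScaler`, assuming toy matrices. Centering a sparse matrix would turn its zeros into non-zeros, which is why the sparse case passes `with_mean=False` and only rescales.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Dense features: center to zero mean and scale to unit variance.
dense = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
dense_scaled = StandardScaler().fit_transform(dense)

# Sparse features: scale to unit variance only, so zeros stay zeros.
sparse_X = sparse.csr_matrix([[0.0, 2.0], [0.0, 0.0], [4.0, 6.0]])
sparse_scaled = StandardScaler(with_mean=False).fit_transform(sparse_X)

print(dense_scaled.mean(axis=0), sparse_scaled.nnz)
```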
● For high-dimensional data, PCA is used for decomposition
● For images, start with 10-15 components and increase the
number as long as results improve
● For other kinds of data, start with 50-60 components
● For text data, use Singular Value Decomposition after
converting the text to a sparse matrix
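A minimal sketch of both decomposition routes, assuming scikit-learn, random toy data for the dense case, and a four-document toy corpus for the text case. The component counts are illustrative: 50 follows the starting range above, while the text example uses 2 only because the toy vocabulary is tiny.

```python
import numpy as np
from sklearn.decomposition import PCA, TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Dense, high-dimensional data: PCA.
rng = np.random.default_rng(0)
dense = rng.normal(size=(100, 200))
dense_reduced = PCA(n_components=50, random_state=0).fit_transform(dense)

# Text data: vectorize to a sparse matrix, then apply truncated SVD,
# which works on sparse input without densifying it.
texts = ["auto ml tunes models", "ml models learn", "auto tuning", "models models"]
tfidf = TfidfVectorizer().fit_transform(texts)
text_reduced = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

print(dense_reduced.shape, text_reduced.shape)
```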
Feature Selection
● Greedy forward selection
○ Select the best features iteratively
○ Select features based on the coefficients of a model
● Greedy backward elimination
● For feature evaluation, use GBM for normal features and
Random Forest for sparse features
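Greedy forward selection can be sketched as a loop that adds, at each step, the feature that most improves a cross-validated score, and stops when nothing improves. The version below assumes scikit-learn, a synthetic regression dataset, and a hypothetical helper named `greedy_forward_selection`; the stopping rule and the use of a linear model are illustrative choices, not the deck's.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def greedy_forward_selection(model, X, y, max_features, cv=3):
    """Add one feature at a time, keeping the one with the best CV score."""
    selected, best_score = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        scores = {
            j: cross_val_score(model, X[:, selected + [j]], y, cv=cv).mean()
            for j in remaining
        }
        best_j = max(scores, key=scores.get)
        if scores[best_j] <= best_score:
            break  # no candidate improves the score, so stop early
        selected.append(best_j)
        remaining.remove(best_j)
        best_score = scores[best_j]
    return selected, best_score

# Synthetic data: 8 features, only 2 of which carry signal.
X, y = make_regression(n_samples=200, n_features=8, n_informative=2,
                       noise=0.1, random_state=0)
selected, score = greedy_forward_selection(LinearRegression(), X, y, max_features=3)
print(selected, round(score, 3))
```

Backward elimination is the mirror image: start from all features and repeatedly drop the one whose removal hurts the score least.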
Model selection and HyperParameter tuning
● Most important and fundamental process of Machine
Learning
Choice of Model and Hyperparameters
● Classification:
○ Random Forest
○ GBM
○ Logistic Regression
○ Naive Bayes
○ Support Vector Machines
○ k-Nearest Neighbors
● Regression:
○ Random Forest
○ GBM
○ Linear Regression
○ Ridge
○ Lasso
○ SVR
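Choosing a model together with its hyperparameters can be sketched as a search over several candidates, each with its own parameter grid. The example below assumes scikit-learn, a synthetic classification dataset, and small illustrative grids; it uses random search for brevity, which stands in for whatever optimizer a real AutoML system would use.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical candidate models with small illustrative search spaces.
candidates = {
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, 5, None]},
    ),
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"C": [0.01, 0.1, 1.0, 10.0]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    search = RandomizedSearchCV(model, grid, n_iter=4, cv=3,
                                scoring="roc_auc", random_state=0)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

# Keep the candidate with the best cross-validated AUC.
best_name = max(results, key=lambda k: results[k][0])
print(best_name, results[best_name])
```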
Evaluation of Model
Saving all Transformations on Train Data for reuse
Re-Use of saved transformations for Evaluation on validation set
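The save-and-reuse step can be sketched as fitting a transformer on the training data only, serializing it, and applying the restored object to the validation set. The example assumes scikit-learn and the standard-library `pickle` module; serializing to bytes stands in for writing to a file, and the toy arrays are made up.

```python
import pickle

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_valid = np.array([[2.0, 40.0], [4.0, 10.0]])

# Fit on the training data only, then save the fitted transformer.
scaler = StandardScaler().fit(X_train)
saved = pickle.dumps(scaler)  # in practice this would be written to disk

# Later: restore it and apply the same transformation to the validation set.
restored = pickle.loads(saved)
X_valid_scaled = restored.transform(X_valid)
print(X_valid_scaled)
```

Reusing the fitted object means the validation data is scaled with the training statistics, which avoids leaking validation information into the preprocessing.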
Current Research
Automatic Architecture selection for Neural Network
Automatically Tuned Neural Network
● Auto-Net is a system that automatically configures neural networks
● Achieved the best performance on two datasets in the human expert track of
the recent ChaLearn AutoML Challenge
● Works by tuning:
○ layer-independent network hyperparameters
○ per-layer hyperparameters
● The Auto-Net submission reached an AUC score of 90%, while the best human
competitor (Ideal Intel Analytics) only reached 80%
● First time an automatically constructed neural network won a competition
dataset
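The split between layer-independent and per-layer hyperparameters can be illustrated by sampling one configuration from a small search space. This is a sketch only: the hyperparameter names, ranges, and the helper `sample_network_config` are assumptions made for illustration and are not Auto-Net's actual search space, and plain random sampling stands in for its optimizer.

```python
import random

def sample_network_config(rng, max_layers=4):
    """Draw one network configuration from a hypothetical search space."""
    # Layer-independent hyperparameters (assumed names and ranges).
    config = {
        "learning_rate": 10 ** rng.uniform(-5, -1),
        "batch_size": rng.choice([32, 64, 128, 256]),
        "num_layers": rng.randint(1, max_layers),
    }
    # Per-layer hyperparameters: one independent set for each layer.
    config["layers"] = [
        {
            "units": rng.choice([64, 128, 256, 512]),
            "dropout": rng.uniform(0.0, 0.5),
            "activation": rng.choice(["relu", "tanh"]),
        }
        for _ in range(config["num_layers"])
    ]
    return config

config = sample_network_config(random.Random(0))
print(config["num_layers"], len(config["layers"]))
```

A tuner would evaluate many such configurations and keep the one with the best validation score.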
Conclusion
● Machine learning (ML) has achieved considerable successes
in recent years and an ever-growing number of disciplines
rely on it.
● However, its success crucially relies on human machine
learning experts to perform various tasks manually
● The rapid growth of machine learning applications has
created a demand for off-the-shelf machine learning
methods that can be used easily and without expert
knowledge
● AutoML is an open research topic and will very soon be
challenging state-of-the-art results in various
domains
Thank You