STATISTICAL LEARNING VS. MACHINE LEARNING
Venn diagram on the coverage of machine learning and statistical modeling in the universe
of data science
DEFINITION
• Machine Learning is …
an algorithm that can learn from data without relying on rules-based
programming.
• Statistical Modeling is …
the formalization of relationships between variables in the form of
mathematical equations.
A BUSINESS CASE
Let us now look at an interesting example published by McKinsey that differentiates the two
approaches:
– Case: Understand the risk level of customer churn over a period of time for a
telecom company
– Data available: Two drivers, A and B
– Next is a graph that shows the difference between a statistical model and a
Machine Learning algorithm.
[Graph: Statistical learning vs. Machine Learning]
OBSERVATION
• A statistical model is all about getting a simple formulation of a frontier in a classification
problem. Here we see a non-linear boundary which, to some extent, separates
risky people from non-risky people. But when we look at the contours generated by the
Machine Learning algorithm, we see that, for the problem at hand, the statistical model is
in no way comparable to the Machine Learning algorithm. The Machine Learning contours
seem to capture patterns beyond any constraint of linearity or even
continuity of the boundary. This is what Machine Learning can do for you.
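The contrast described above can be illustrated with a minimal sketch. It does not use the McKinsey data: the two drivers A and B, the `label` rule that marks a customer as risky, and the two grids are all hypothetical, chosen so that the risky customers sit in two disconnected pockets. A plain-Python logistic regression stands in for a model with a single smooth boundary, and a 1-nearest-neighbour rule stands in for a flexible Machine Learning algorithm.

```python
import math

def label(a, b):
    # Hypothetical rule for illustration only: a customer is risky (1)
    # when exactly one of the two drivers is below 0.5.
    return int((a < 0.5) != (b < 0.5))

def grid(offset):
    # 10 x 10 grid of (A, B) points inside the unit square.
    return [((i + offset) / 10, (j + offset) / 10)
            for i in range(10) for j in range(10)]

train = [(p, label(*p)) for p in grid(0.5)]
test = [(p, label(*p)) for p in grid(0.25)]

def fit_logistic(data, epochs=500, lr=0.5):
    # Batch gradient descent on the logistic loss; the resulting
    # decision boundary is a single straight line.
    w_a = w_b = bias = 0.0
    n = len(data)
    for _ in range(epochs):
        g_a = g_b = g_0 = 0.0
        for (a, b), y in data:
            p = 1.0 / (1.0 + math.exp(-(w_a * a + w_b * b + bias)))
            g_a += (p - y) * a
            g_b += (p - y) * b
            g_0 += p - y
        w_a -= lr * g_a / n
        w_b -= lr * g_b / n
        bias -= lr * g_0 / n
    return w_a, w_b, bias

def predict_linear(params, point):
    w_a, w_b, bias = params
    return int(w_a * point[0] + w_b * point[1] + bias > 0)

def predict_1nn(data, point):
    # Copy the label of the closest training point (squared distance).
    nearest = min(data, key=lambda item: (item[0][0] - point[0]) ** 2
                                         + (item[0][1] - point[1]) ** 2)
    return nearest[1]

def accuracy(predict, data):
    return sum(predict(p) == y for p, y in data) / len(data)

params = fit_logistic(train)
linear_acc = accuracy(lambda p: predict_linear(params, p), test)
knn_acc = accuracy(lambda p: predict_1nn(train, p), test)

print(f"single linear boundary: {linear_acc:.2f}")
print(f"1-nearest neighbour:    {knn_acc:.2f}")
```

On this synthetic pattern no straight line can classify more than three quarters of the points correctly, while the nearest-neighbour rule follows the disconnected pockets. This says nothing about real churn data, where the gap may be far smaller.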
DIFFERENCES BETWEEN MACHINE
LEARNING AND STATISTICAL MODELING
• Given the difference in the output of these two approaches, let us understand how the
two paradigms differ, even though both do a similar job:
– Schools they come from
– When they came into existence
– Assumptions they work on
– Types of data they deal with
– Nomenclature of operations and objects
– Techniques used
– Predictive power and human effort involved in implementation
• All the differences mentioned above do separate the two to some extent, but there is
no hard boundary between Machine Learning and statistical modeling.
THEY BELONG TO DIFFERENT SCHOOLS
• Machine Learning is …
– a subfield of computer science and artificial intelligence which deals with building
systems that can learn from data instead of following explicitly programmed instructions.
• Statistical Modeling is …
– a subfield of mathematics which deals with finding relationships between variables
to predict an outcome.
THEY CAME UP IN DIFFERENT ERAS
Statistical modeling has been around for centuries. Machine learning,
however, is a much more recent development. It came into its own in the 1990s
as steady advances in digitization and cheap computing power enabled
data scientists to stop building finished models and instead train
computers to do so. The unmanageable volume and complexity of the big
data that the world is now swimming in have increased the potential of
machine learning, and the need for it.
EXTENT OF ASSUMPTIONS INVOLVED
• Statistical modeling works on a number of assumptions. For instance, a linear
regression assumes:
– A linear relation between the independent and dependent variables
– Homoscedasticity
– Mean of error at zero for every value of the independent variable
– Independence of observations
– Errors normally distributed for each value of the independent variable
• Similarly, logistic regression comes with its own set of assumptions. Even a
non-linear model has to comply with a continuous segregation boundary. Machine
Learning algorithms do assume a few of these things but in general are spared
from most of these assumptions. The biggest advantage of using a Machine
Learning algorithm is that the boundary need not be continuous, as
shown in the case study above. Also, we need not specify the distribution of the
dependent or independent variables in a machine learning algorithm.
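Some of the linear regression assumptions listed above can be checked on the residuals of a fitted line. The following is a rough sketch on synthetic data, not a formal diagnostic: the true line `y = 2x + 1`, the noise level, and the crude two-slice comparison are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Synthetic data (assumption): a linear signal plus constant-variance noise.
xs = [10 * i / 199 for i in range(200)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

# Ordinary least squares for one predictor, in closed form.
x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Least squares with an intercept forces the overall residual mean to zero,
# so the informative checks look at slices of x instead.
mean_residual = statistics.mean(residuals)
low, high = residuals[:100], residuals[100:]   # xs is sorted, so these are x-slices

# Linearity / zero-mean error: each slice should average close to zero.
mean_low, mean_high = statistics.mean(low), statistics.mean(high)

# Homoscedasticity: the spread should be similar in both slices (ratio near 1).
spread_ratio = statistics.pstdev(high) / statistics.pstdev(low)

print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"slice means: {mean_low:.3f}, {mean_high:.3f}")
print(f"spread ratio (high/low): {spread_ratio:.3f}")
```

A slice mean far from zero would hint at a non-linear relation, and a spread ratio far from 1 would hint at heteroscedasticity. Independence and normality of the errors need separate checks that are not sketched here.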
TYPES OF DATA THEY DEAL WITH
Machine Learning algorithms are wide-ranging tools. Online learning tools make predictions
on the fly. These tools are capable of learning from trillions of observations, one at a time.
They make predictions and learn simultaneously. Other algorithms, like Random Forest
and Gradient Boosting, are also exceptionally fast with big data. Machine learning does
really well with wide (high number of attributes) and deep (high number of
observations) data. Statistical models, however, are generally applied to smaller data sets
with fewer attributes, or they end up overfitting.
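The "predict and learn simultaneously" loop of an online learner can be sketched as follows. This is a minimal illustration, not any particular tool: the `stream` generator, its noise-free rule `y = 3x + 1`, and the learning rate are hypothetical, and a simple linear model updated by stochastic gradient descent stands in for the learner.

```python
import random

rng = random.Random(0)

def stream(n):
    # Hypothetical data stream: one observation at a time, y = 3x + 1.
    for _ in range(n):
        x = rng.random()
        yield x, 3.0 * x + 1.0

weight, bias = 0.0, 0.0
learning_rate = 0.1
abs_errors = []

for x, y in stream(5000):
    prediction = weight * x + bias        # 1. predict on the fly
    error = prediction - y
    abs_errors.append(abs(error))
    weight -= learning_rate * error * x   # 2. learn from this single observation
    bias -= learning_rate * error         #    and then discard it

early_error = sum(abs_errors[:100]) / 100
late_error = sum(abs_errors[-100:]) / 100

print(f"weight={weight:.4f} bias={bias:.4f}")
print(f"mean |error|, first 100 predictions: {early_error:.4f}")
print(f"mean |error|, last 100 predictions:  {late_error:.6f}")
```

Because each observation is used once and never stored, memory use stays constant no matter how long the stream runs, which is what makes this style of learning workable on very large data.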
FORMULATION
Even when the end goal of machine learning and statistical modeling is the same, the formulations of the two
are significantly different.
In a statistical model, we basically try to estimate the function “f” in
Dependent Variable (Y) = f(Independent Variable) + error function
Machine Learning takes the deterministic function “f” out of the equation. It simply becomes
Output (Y) ----- > Input (X)
It will try to find pockets of X in n dimensions (where n is the number of attributes) where the occurrence of Y
is significantly different.
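The idea of finding pockets of X where the occurrence of Y differs can be sketched in one dimension. The example below is an illustration under stated assumptions: the toy data and the `best_split` helper are hypothetical, and a single threshold search is only the simplest building block of tree-based methods, not a full algorithm.

```python
# Toy data (assumption): one attribute X and a binary outcome Y that is
# mostly 0 for small X and mostly 1 for large X, with two exceptions.
xs = list(range(20))
ys = [0] * 12 + [1] * 8
ys[3] = 1
ys[15] = 0

def rate(values):
    # Share of observations in this pocket where Y occurred.
    return sum(values) / len(values)

def best_split(xs, ys):
    # Try every threshold and keep the one whose two pockets
    # (X < threshold, X >= threshold) differ most in the rate of Y.
    best_threshold, best_gap = None, -1.0
    for threshold in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        gap = abs(rate(left) - rate(right))
        if gap > best_gap:
            best_threshold, best_gap = threshold, gap
    return best_threshold, best_gap

threshold, gap = best_split(xs, ys)
print(f"pockets: X < {threshold} and X >= {threshold}, gap in rate of Y = {gap:.3f}")
```

No functional form for f is specified anywhere in this search. With n attributes, the same search would be repeated over every attribute and then again inside each resulting pocket.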
PREDICTIVE POWER AND HUMAN
EFFORT
• Nature does not assume anything before forcing an event to occur.
• So the fewer the assumptions in a predictive model, the higher its
predictive power tends to be. Machine Learning, as the name suggests, needs minimal
human effort. Machine learning works in iterations in which the computer
tries to find patterns hidden in the data. Because the machine does this
work on comprehensive data and is independent of most of these assumptions,
predictive power is generally very strong for these models. Statistical
models are mathematics-intensive and based on coefficient estimation. They
require the modeler to understand the relation between variables before
putting them in.
END NOTES
It may seem that machine learning and statistical modeling are two
different branches of predictive modeling, but they are almost the same. The
difference between the two has narrowed significantly over the past
decade. Both branches have learned a lot from each other and will
come closer still in the future. I hope we have motivated you enough to acquire
skills in each of these two domains and then compare how they
complement each other.