Introduction to Machine Learning concepts
1958
1998
Supervised
Given labeled examples, learn to predict new examples
[Figure: labeled examples (orange, banana, apple) → Trained model; New «unseen» data → Trained model → orange]
Spam…
or not spam?
Is he genuine or an impostor?
Which brand does the car belong to?
At what price should I sell my
beautiful house?
500’000 € ?
410’000 € ?
279’000 € ?
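The classification questions above can be sketched with a tiny nearest-neighbour classifier. This is only an illustration, not the method used in the slides: the two features (length in cm and a numeric hue) and all the values are hypothetical.

```python
import math

# Hypothetical labeled examples: (length_cm, hue) -> fruit label.
training = [
    ((7.0, 0.0), "apple"),
    ((7.5, 5.0), "apple"),
    ((8.0, 30.0), "orange"),
    ((8.5, 35.0), "orange"),
    ((19.0, 60.0), "banana"),
    ((20.0, 55.0), "banana"),
]

def predict(x):
    """Return the label of the training example closest to x (1-nearest neighbour)."""
    _, label = min(training, key=lambda example: math.dist(example[0], x))
    return label

# New «unseen» data gets the label of its closest labeled example.
print(predict((18.0, 58.0)))
print(predict((7.5, 28.0)))
```

Predicting a price instead of a label is the same idea with a numeric output (regression) rather than a class (classification).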
Unsupervised
Given data, but not labels, learn
to cluster/group similar data
Sport
Health
Business
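Grouping unlabeled data can be sketched with a very small k-means on one-dimensional values. The spending figures and the choice of k-means are assumptions for illustration; the slides do not name a clustering algorithm.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Cluster numbers into k groups (k >= 2) by repeatedly moving centroids to cluster means."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centroids evenly between min and max.
    centroids = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical customer spending, with no labels attached.
spending = [10, 12, 11, 95, 100, 105]
centroids, clusters = kmeans_1d(spending, k=2)
print(centroids, clusters)
```

No label is ever given to the algorithm; the two groups emerge only from how close the values are to each other.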
Reinforcement
Given a sequence of examples/states and a reward after completing that sequence,
learn to predict the action to take for an
individual example/state
… Win
… Lose
Why is Machine Learning possible?
MORE DATA AVAILABLE – LARGER MEMORY FOR HANDLING THE DATA – GREATER COMPUTATIONAL
POWER FOR CALCULATING – ONLINE CONTINUOUS LEARNING
How do we do Machine Learning?
Data gathering
MINING SOFTWARE REPOSITORIES – INTERVIEWS – SURVEYS – ONLINE OPEN DATASETS –
STREAM SOURCES
Data cleanliness and quality
BIGGER IS NOT BETTER!
Knowledge representation
Data formatting
VECTORS – MATRICES
Data visualization
IT IS OFTEN A LOT OF DATA
NEED FOR SPECIALIZED SOFTWARE!
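As a minimal sketch of data formatting, the snippet below turns raw records into numeric vectors and stacks them into a matrix. The field names (`size_m2`, `rooms`, `city`) and the values are hypothetical, and one-hot encoding is just one common way to represent a category as numbers.

```python
# Hypothetical raw records, e.g. gathered from a survey or an open dataset.
records = [
    {"size_m2": 80, "rooms": 3, "city": "Rome"},
    {"size_m2": 120, "rooms": 5, "city": "Milan"},
    {"size_m2": 55, "rooms": 2, "city": "Rome"},
]

# Fixed, sorted list of categories so every vector has the same layout.
cities = sorted({r["city"] for r in records})

def to_vector(record):
    """Numeric fields as-is, plus one 0/1 column per city (one-hot encoding)."""
    one_hot = [1.0 if record["city"] == c else 0.0 for c in cities]
    return [float(record["size_m2"]), float(record["rooms"])] + one_hot

# The feature matrix: one row per record, one column per feature.
X = [to_vector(r) for r in records]
for row in X:
    print(row)
```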
Strategy selection …
Strategy selection: The machine learning framework
f([image]) = apple
f([image]) = orange
f([image]) = 410’000 €
Apply a prediction function to a feature representation of the
data to get the desired output
y = f(x)
y: OUTPUT, f: PREDICTION FUNCTION, x: FEATURE
Training
Given a training set of labeled examples {(x_1, y_1), …, (x_n, y_n)}, estimate the
prediction function f by minimizing the prediction error on the training set
Testing
Apply f to a never-before-seen test example x and output the predicted value
y = f(x)
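One way to make the training and testing steps concrete is a least-squares line fit on a single feature. This is a sketch under assumptions: the house sizes and prices are invented, and squared error is only one possible choice of prediction error.

```python
# Hypothetical training set: x = house size in square metres, y = price in euros.
xs = [50.0, 80.0, 100.0, 120.0]
ys = [200_000.0, 290_000.0, 350_000.0, 410_000.0]

def fit_line(xs, ys):
    """Estimate f(x) = slope * x + intercept by minimizing the squared training error."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Training: estimate the prediction function from labeled examples.
slope, intercept = fit_line(xs, ys)

def f(x):
    return slope * x + intercept

# Testing: apply f to an example that was not in the training set.
print(f(110.0))
```

Because the invented prices lie exactly on a line, the fit recovers it and the training error is essentially zero; real data would leave some residual error.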
Strategy selection: The machine learning design phase
Training: Training data → Features + Training labels → Training → Learned model
Testing: Test data → Features → Learned model → Prediction
How do we choose training and test set?
The two most common techniques are percentile sampling
and k-fold cross-validation
Percentile sampling
Split the dataset into X% to
be used for training and Y% to be
used for testing, where X > Y and X + Y = 100
(e.g., 70/30)
[Figure: observations marked as training or testing]
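A percentage split can be sketched in a few lines. The function name `holdout_split`, the fixed seed and the ten-element dataset are assumptions made for this example.

```python
import random

def holdout_split(data, train_fraction=0.7, seed=0):
    """Shuffle a copy of the data, then cut it into a training part and a testing part."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical dataset of 10 observations, split 70/30.
train, test = holdout_split(range(10), train_fraction=0.7)
print(len(train), len(test))
```

Shuffling before cutting matters when the data is ordered (for example by date or by class); otherwise the test part may not resemble the training part.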
K-fold cross-validation
Randomly divide the dataset into k “folds”. Use one
fold as testing data and train on the remaining k-1 folds.
Rotate through the folds so that each one is used for testing
exactly once, then calculate the average error rate of the
estimations
Example with 3 folds: in each of the 3 rounds, train on two folds and test on the remaining one
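The rotation over folds can be sketched as follows. The labels are invented, and the "model" is a deliberately trivial placeholder (it always predicts the most common training label) so that the example stays focused on the splitting and averaging.

```python
import random
from collections import Counter

def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices); every index is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# Hypothetical labels for 9 observations.
labels = ["spam", "ham", "ham", "spam", "ham", "ham", "ham", "spam", "ham"]

def fold_error(train, test):
    """Placeholder model: predict the majority training label, return the test error rate."""
    majority = Counter(labels[i] for i in train).most_common(1)[0][0]
    return sum(labels[i] != majority for i in test) / len(test)

errors = [fold_error(train, test) for train, test in k_fold_splits(len(labels), k=3)]
average_error = sum(errors) / len(errors)
print(errors, average_error)
```

Compared with a single split, every observation contributes to both training and testing, at the cost of training k models.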
Select strategy → Apply strategy → Train model → Test model → Satisfied?
yes: done
no: Improve strategy and iterate
In summary
ML means learning from the past to predict the future
MACHINES NEED TO MAKE FEWER PREDICTION ERRORS (ERROR IS EVALUATED WITH
STATISTICAL INDEXES SUCH AS THE F-MEASURE)
At least 3 classes of ML exist
SUPERVISED LEARNING
UNSUPERVISED LEARNING
REINFORCEMENT LEARNING
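As a sketch of how the F-measure mentioned in the summary is computed, the function below uses the common F1 definition (harmonic mean of precision and recall) for one "positive" class. The labels are invented for illustration.

```python
def f_measure(actual, predicted, positive):
    """F1 score for one class: harmonic mean of precision and recall."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical true labels and model predictions.
actual = ["spam", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "spam", "spam"]
score = f_measure(actual, predicted, positive="spam")
print(score)
```

Here 2 spam messages are caught, 1 is missed and 1 ham is wrongly flagged, so precision and recall are both 2/3 and so is the F-measure.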