Lecture 01
Introduction to Machine Learning
What is Machine Learning?
 All useful programs “learn” something.
 Early definition of machine learning:
“Field of study that gives computers the ability to learn without
being explicitly programmed.” Arthur Samuel (1959)
 Computer pioneer who wrote the first self-learning program,
which played checkers and learned from experience.
 Tom Mitchell (1997): a program is “said to learn from experience with
respect to some class of tasks, and a performance measure P,
if [the learner’s] performance at tasks in the class, as measured
by P, improves with experience.”
Contd…
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
Machine learning is a central part of other
courses such as Natural Language Processing, Computational
Biology, Computer Vision, and Robotics.
How are things learned?
 Memorization (declarative knowledge):
 Accumulation of individual facts
 Limited by the time to observe facts
 Limited by the memory to store facts
 Generalization (imperative knowledge):
 Deduce new facts from old facts
 Limited by the accuracy of the deduction process
 Essentially a predictive activity
 Assumes that the past predicts the future
 Interested in extending this to programs that can infer useful
information from implicit patterns in data.
Basic Paradigm:
 Observe a set of examples: training data
e.g., football players, labeled by position, with height and
weight data.
 Infer something about the process that generated that data
 Find a canonical model of each position by statistics
 Use inference to make predictions about previously
unseen data: test data
e.g., predict the position of new players.
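The observe / infer / predict loop above can be sketched in a few lines. This is only an illustration: the heights (inches) and weights (pounds) are made-up values, and the "canonical model" is assumed to be the per-position mean, which is one possible reading of "by statistics".

```python
from statistics import mean

# Hypothetical training data: (height in inches, weight in pounds) by position.
training = {
    "receiver": [(73, 200), (74, 210), (72, 195)],
    "lineman": [(77, 305), (76, 310), (78, 300)],
}

# Infer: summarize each position by its mean height and mean weight.
model = {
    position: (mean(h for h, _ in players), mean(w for _, w in players))
    for position, players in training.items()
}

def predict(height, weight):
    """Return the position whose mean is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return (height - center[0]) ** 2 + (weight - center[1]) ** 2
    return min(model, key=lambda position: sq_dist(model[position]))

# Predict: apply the model to previously unseen (test) players.
print(predict(75, 290))
print(predict(72, 205))
```

The model here is deliberately tiny: two numbers per position. Any other summary of the training data could be substituted without changing the overall loop.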
Variations of Paradigm:
 Supervised Learning
 Learn an input-to-output map
• Classification: categorical output
• Regression: continuous output
 Unsupervised Learning
 Discover patterns in the data
• Clustering: cohesive grouping
• Association: frequent co-occurrence
 Reinforcement Learning
 Learning control
Machine Learning Tasks:
Task              Measure
 Classification   Classification error
 Regression       Prediction error
 Clustering       Scatter/purity
 Associations     Support/confidence
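A rough sketch of how some of these measures might be computed. The function names are hypothetical, mean squared error is assumed as one common choice of "prediction error", and the toy inputs are invented for illustration.

```python
def classification_error(y_true, y_pred):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)

def mean_squared_error(y_true, y_pred):
    """One possible prediction error for regression: the average squared difference."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent); assumes the latter is non-zero."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Toy data, invented for this example.
baskets = [{"bread", "milk"}, {"bread"}, {"milk"}, {"bread", "milk", "eggs"}]
print(classification_error([1, 0, 1, 1], [1, 1, 1, 0]))
print(mean_squared_error([2.0, 3.0], [2.5, 2.0]))
print(support(baskets, {"bread", "milk"}), confidence(baskets, {"bread"}, {"milk"}))
```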
Challenges:
 How good is a model?
 How do I choose a model?
 Is the data of sufficient quality?
 Errors in data, e.g., Age=225; noise in low-resolution images
 Missing values
Contd…
 How confident can I be of the results?
 Am I describing the data correctly?
 Are age and income enough? Should I look at Gender also?
 How should I represent age? As a number or as young,
middle age, old?
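To make the representation question concrete, here is a small sketch that turns a numeric age into a category. The cut-offs (30 and 60) and the plausibility range (0 to 120) are illustrative assumptions, not values from the lecture.

```python
def age_group(age):
    """Map a numeric age to a coarse category (cut-offs are assumed for illustration)."""
    if age < 0 or age > 120:
        return None  # implausible value such as Age=225: treat as a data error
    if age < 30:
        return "young"
    if age < 60:
        return "middle age"
    return "old"

for age in (25, 45, 70, 225):
    print(age, age_group(age))
```

The categorical form discards detail (25 and 29 become identical) but can be more robust to noise; which representation works better depends on the task.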
Supervised Learning:
[Figure: labeled training data; classification; possible classifiers]
Inductive Bias:
 Need to generalize
 Assumption about lines
 Encoding/normalization of data
 In general, inductive bias: language bias and search bias
Encoded examples:
X1 = <0.15, 0.25>, Y1 = -1
X2 = <0.4, 0.45>, Y2 = +1
Process of supervised learning:
Training data:
X1 = <30000, 25>, Y1 = doesnotbuycomputer
X2 = <80000, 45>, Y2 = buycomputer
Training data (X1, Y1), (X2, Y2), (X3, Y3), ... → Training Algorithm → Classifier
Testing data (X1, Y1), (X2, Y2), (X3, Y3), ... → Classifier → Validation
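The process can be sketched end to end with the two training examples from the slide. Several things here are assumptions: the scaling constants (200,000 for income and 100 for age) are inferred from the fact that they map the raw values onto the encoded ones shown earlier, the held-out examples are invented, and a 1-nearest-neighbour rule stands in for the unspecified training algorithm.

```python
# Raw training examples from the slide: (income, age) -> label.
raw_training = [
    ((30000, 25), "doesnotbuycomputer"),
    ((80000, 45), "buycomputer"),
]

# Assumed scaling constants, chosen so the raw values map to the encoded ones.
INCOME_SCALE = 200_000
AGE_SCALE = 100
LABEL_CODE = {"doesnotbuycomputer": -1, "buycomputer": +1}

def encode(x):
    """Normalize an (income, age) pair to comparable ranges."""
    income, age = x
    return (income / INCOME_SCALE, age / AGE_SCALE)

training = [(encode(x), LABEL_CODE[y]) for x, y in raw_training]

def classify(x):
    """Stand-in classifier: label of the closest encoded training example."""
    ex = encode(x)
    nearest = min(
        training,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(ex, pair[0])),
    )
    return nearest[1]

# Hypothetical held-out examples used for validation.
testing = [((35000, 30), -1), ((90000, 50), +1)]
accuracy = sum(classify(x) == y for x, y in testing) / len(testing)
print(training)
print("validation accuracy:", accuracy)
```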
Supervised Learning:
Applications:
1. Credit Card Fraud detection (valid transaction or
not)
2. Sentiment Analysis (Opinion mining, buzz
analysis etc)
3. Churn Prediction (Potential Churner or not?)
4. Medical Diagnoses (Risk Analysis)
Prediction or Regression:
Regression:
Applications:
1. Time series prediction (e.g., rainfall in a certain region, spend on
voice calls)
2. Classification
3. Data reduction
4. Trend analysis
5. Risk factor analysis
Examples of Classifying and Clustering:
Unlabeled Data:
Clustering examples into group:
Similarity based on weight:
Similarity based on Height:
Cluster into two groups using both attributes:
Labeled Data:
Finding Classifier surface:
 Given labeled groups in feature space, we want to
find a surface in that space that separates the groups.
 Subject to constraints on the complexity of the
surface
 In this example we have a 2D space, so find the line (or
connected set of line segments) that best
separates the two groups.
 When the examples are well separated, this is
straightforward.
 When examples in labeled groups overlap, we may
have to trade off false positives and false negatives.
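The trade-off between false positives and false negatives can be illustrated with a one-dimensional "surface", a single weight threshold. The weights below are hypothetical and chosen so the two groups overlap; "lineman" is arbitrarily treated as the positive class.

```python
# Hypothetical overlapping groups: weights in pounds.
receivers = [190, 200, 210, 230, 250]
linemen = [240, 260, 280, 300, 310]

def errors(threshold):
    """Classify weight >= threshold as 'lineman' and count both kinds of error."""
    false_pos = sum(1 for w in receivers if w >= threshold)  # receivers called linemen
    false_neg = sum(1 for w in linemen if w < threshold)     # linemen called receivers
    return false_pos, false_neg

for threshold in (225, 235, 255):
    print(threshold, errors(threshold))
```

Because the heaviest receiver outweighs the lightest lineman, no threshold gives zero errors of both kinds; moving the threshold only shifts errors from one type to the other.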
Labeled Data:
Adding some new data:
 Suppose we have learned to separate receivers
from linemen.
 Now we are given some running backs, and want to
use the model to decide if they are more like receivers or
linemen.
blount = ['blount', 75, 250]
white = ['white', 70, 205]
Clustering using Unlabeled data:
Classified using Labeled data:
Machine Learning Methods:
Requirements for Methods:
 Choosing training data and an evaluation method
 Representation of the features
 Distance metric for feature vectors
 Objective function and constraints
 Optimization method for learning the model
Feature Representation:
An Example:
Contd…
Lecture 09(introduction to machine learning)
Need to measure distance between features
Feature Engineering:
 Deciding which features to include and which
are merely adding noise to the classifier.
 Defining how to measure distances between
training examples (and ultimately between
classifiers and new instances)
 Deciding how to weight the relative importance of
different dimensions of the feature vector, which
impacts the definition of distance.
Measuring distance between animals:
 We can think of our animal examples as consisting
of four binary features and one integer feature.
 One way to learn to separate reptiles from
non-reptiles is to measure the distance between pairs of
examples and use that:
 To cluster nearby examples into a common class
(unlabeled data), or
 To find a classifier surface in the space of examples
that optimally separates different (labeled)
collections of examples from other collections.
Minkowski Metric:
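The formula from this slide did not survive extraction. The standard Minkowski distance of order p between feature vectors X1 and X2 is (sum over k of |X1[k] - X2[k]|^p)^(1/p); p = 1 gives the Manhattan distance and p = 2 the Euclidean distance. A minimal sketch, with a hypothetical function name:

```python
def minkowski_dist(v1, v2, p):
    """Minkowski distance of order p between two equal-length feature vectors."""
    if len(v1) != len(v2):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(v1, v2)) ** (1 / p)

print(minkowski_dist((0, 0), (3, 4), 1))  # Manhattan distance
print(minkowski_dist((0, 0), (3, 4), 2))  # Euclidean distance
```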
Euclidean Distance between animals:
Using Euclidean distance, the rattlesnake and boa
constrictor are much closer to each other than they are
to the dart frog.
Add an Alligator
The alligator is closer to the dart frog than to the snakes. Why?
 The alligator differs from the frog in 3 features, from the boa in
only 2 features.
 But the scale on “legs” is from 0 to 4; on the other features
it is 0 to 1.
 The “legs” dimension is disproportionately large.
Using Binary Features:
Now the alligator is closer to the snakes than it is to the
frog, which makes sense.
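The effect of the "legs" scale can be checked numerically. The feature table from the slides was an image and is not in this transcript, so the vectors below are assumed values, chosen to be consistent with the text (four binary features plus a leg count; the alligator differs from the frog in 3 features and from the boa in 2). The assumed feature order is egg-laying, scales, poisonous, cold-blooded, number of legs.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Assumed feature vectors: [egg-laying, scales, poisonous, cold-blooded, legs].
animals = {
    "rattlesnake":     [1, 1, 1, 1, 0],
    "boa constrictor": [0, 1, 0, 1, 0],
    "dart frog":       [1, 0, 1, 0, 4],
    "alligator":       [1, 1, 0, 1, 4],
}

def binarize_legs(features):
    """Replace the leg count with a 0/1 'has legs' flag so all features share a scale."""
    return features[:-1] + [1 if features[-1] > 0 else 0]

binary = {name: binarize_legs(f) for name, f in animals.items()}

for other in ("dart frog", "boa constrictor", "rattlesnake"):
    raw = dist(animals["alligator"], animals[other])
    scaled = dist(binary["alligator"], binary[other])
    print(f"alligator vs {other}: raw={raw:.3f}, binary legs={scaled:.3f}")
```

With the raw leg count, the difference of 4 legs dominates every other feature; once legs are a 0/1 flag, each feature contributes at most 1 to the squared distance.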
Clustering Approach:
 Suppose we know that there are k different groups
in our training data, but we don’t know the labels.
 Pick k samples (at random?) as exemplars
 Cluster the remaining samples by minimizing the distance
between samples in the same cluster (objective function):
put each sample in the group with the closest exemplar.
 Find the median example in each cluster as the new
exemplar
 Repeat until no change
Issues:
 How do we decide on the best number of clusters?
 How do we select the best features, the best
distance metric?
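The clustering steps above can be sketched as follows. This is one possible reading, not necessarily the lecture's implementation: the "median example" is interpreted as the medoid (the member with the smallest total distance to the rest of its group), the function name, the `seed` and `max_iter` parameters, and the sample data are all assumptions.

```python
import random
from math import dist

def cluster(samples, k, seed=0, max_iter=100):
    """Group samples around k exemplars, repeating until the exemplars stop changing."""
    rng = random.Random(seed)
    exemplars = rng.sample(samples, k)  # pick k samples at random as exemplars
    groups = []
    for _ in range(max_iter):
        # Put each sample in the group of its closest exemplar.
        groups = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda i: dist(s, exemplars[i]))
            groups[nearest].append(s)
        # New exemplar: the member with the smallest total distance to its group.
        new_exemplars = [
            min(g, key=lambda c: sum(dist(c, s) for s in g)) if g else exemplars[i]
            for i, g in enumerate(groups)
        ]
        if new_exemplars == exemplars:
            break
        exemplars = new_exemplars
    return groups

# Hypothetical unlabeled (height, weight) samples.
samples = [(70, 190), (72, 200), (71, 195), (77, 300), (78, 310), (76, 305)]
for group in cluster(samples, k=2):
    print(sorted(group))
```

The sketch does not address the issues listed above: k is supplied by the caller, and unscaled height and weight mean that weight dominates the distance.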
Classification Approach:
 Want to find boundaries in feature space that
separate different classes of labeled examples.
 Look for a simple surface (e.g., best line or plane)
that separates classes.
 Look for a more complex surface (subject to
constraints) that separates classes.
 Use voting schemes:
find the k nearest training examples and use a majority vote
to select the label.
Issues:
 How do we avoid over-fitting to data?
 How do we measure performance?
 How do we select best features?
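The voting scheme (k nearest training examples, majority vote) might look like the sketch below. The training examples are hypothetical (height in inches, weight in pounds); the two query players reuse the running-back values given earlier in the lecture, and the function name is an assumption.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Label a query by majority vote among its k nearest training examples."""
    nearest = sorted(training, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled training data: ((height, weight), position).
training = [
    ((73, 200), "receiver"), ((74, 210), "receiver"), ((72, 195), "receiver"),
    ((77, 305), "lineman"), ((76, 310), "lineman"), ((78, 300), "lineman"),
]

blount = ['blount', 75, 250]
white = ['white', 70, 205]
for name, height, weight in (blount, white):
    print(name, knn_predict(training, (height, weight)))
```

An odd k avoids tied votes when there are two classes. Choosing k too small makes the prediction sensitive to noisy examples, which relates to the over-fitting issue listed above.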
Classification:
 Attempt to minimize error on training data
 Similar to fitting a curve to data
Evaluate on training data.
Randomly Divide Data into Training and Test set:
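A minimal sketch of such a random split. The 80/20 proportion, the fixed seed, and the function name are assumptions made for illustration.

```python
import random

def split_data(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split it into (training, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_data(range(10))
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which helps when comparing models on the same test set.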
Two possible models for a Training Set
Confusion Matrices (Training Error)
Training Accuracy of Model:
 0.7 for both models?
Which is better?
 Can we find a model with less training error? Yes.
Applying Model to test Data:
Other statistical Measures:
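The slide content listing these measures was an image and is missing from this transcript. Measures commonly derived from a confusion matrix include accuracy, sensitivity, specificity, positive predictive value, and negative predictive value; whether the slide listed exactly these is an assumption. The counts in the example are invented.

```python
def measures(tp, fp, tn, fn):
    """Common statistics from confusion-matrix counts (assumes non-zero denominators)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),                # recall, true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "positive predictive value": tp / (tp + fp),  # precision
        "negative predictive value": tn / (tn + fn),
    }

# Hypothetical counts: true/false positives and true/false negatives.
for name, value in measures(tp=40, fp=10, tn=30, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Two models with the same accuracy can differ sharply on the other measures, which is one way to decide which of them is better for a given application.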
Summary:
 Machine learning methods provide a way of
building models of processes from data sets.
 Supervised learning uses labeled data to create
classifiers that optimally separate data into known
classes.
 Unsupervised learning tries to infer latent
variables by clustering training examples into
nearby groups.
 Choice of features influences results.
 Choice of distance measurement between
examples influences results.
 We will see some clustering methods like k-means.
Lecture 09(introduction to machine learning)

  • 1. Lecture 01 Introduction to Machine Learning
  • 2. What is Machine Learning?  All useful Programs “Learn” something.  Early definition of Machine Learning “ Field of study that gives computers the ability to learn without being explicitly programmed” Arthur Samuel (1959).  Computer pioneer who wrote first self-learning program, which played checkers- learned from experience.  Tom Mitchell(1997)- “Said to learn from experience with respect to some class of tasks, and a performance measure P, if[ the learner’s] performance at tasks in the class as measured by P, improves with experience”
  • 3. Contd… Traditional Programming: Machine Learning: Machine Learning is the central part of some other courses like Natural Language Processing, Computational Biology, Computer Vision, robotics etc. Computer Data Output Program Computer Data Program Output
  • 4. How are things learned?  Memorization:  Accumulation of individual facts Limited by- Time to observe facts Memory to store facts  Generalization:  Deduce new facts from old facts Limited by accuracy of deduction process  Essentially a predictive activity  Assumes that past predicts the future  Interested in extending to programs that can infer useful information from implicit patterns in data. Declarative Knowledge Imperative Knowledge
  • 5. Basic Paradigm:  Observe set of examples: Training data e.g., Football Players, labeled by position with height and weight data.  Infer something about process that generated that data  Find canonical model of position by statistics  Use interface to make predictions about previously unseen data: Test data e.g., predict position of new players.
  • 6. Variations of Paradigm:  Supervised Learning  Learn an input to output map • Classification: categorical output • Regression: continuous output Unsupervised Learning:  Discover patterns in the data • Clustering: cohesive grouping • Association: Frequent cooccurance  Reinforcement Learning  Learning control
  • 7. Machine Learning Tasks (task: measure): classification: classification error; regression: prediction error; clustering: scatter/purity; associations: support/confidence. Challenges: How good is a model? How do I choose a model? Is the data of sufficient quality? Errors in data (e.g., Age = 225; noise in low-resolution images). Missing values.
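The first two measures on this slide can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the function names and sample values are hypothetical, and mean squared error is only one common choice of "prediction error".

```python
def classification_error(true_labels, predicted_labels):
    """Fraction of examples whose predicted label differs from the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)


def mean_squared_error(true_values, predicted_values):
    """Average squared difference; one possible prediction error for regression."""
    if len(true_values) != len(predicted_values):
        raise ValueError("value lists must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return sum(squared) / len(squared)


# Hypothetical labels and values, for illustration only.
cls_err = classification_error(["buy", "buy", "no", "no"],
                               ["buy", "no", "no", "no"])
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
print(cls_err, mse)
```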
  • 8. Contd… How confident can I be of the results? Am I describing the data correctly? Are age and income enough? Should I look at gender also? How should I represent age: as a number, or as young, middle-aged, old?
  • 9. Supervised Learning: Labeled training data; classification; possible classifiers.
  • 11. Inductive Bias: Need to generalize. Assumption about lines. Encoding/normalization of data. In general, inductive bias comprises language bias and search bias. Example encoding: X1 = <30000, 25>, Y1 = does not buy computer becomes X1 = <0.15, 0.25>, Y1 = -1; X2 = <80000, 45>, Y2 = buys computer becomes X2 = <0.4, 0.45>, Y2 = +1. Process of supervised learning: training data (X1, Y1), (X2, Y2), (X3, Y3), ... → training algorithm → classifier; testing data (X1, Y1), (X2, Y2), (X3, Y3), ... → classifier → validation.
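The encoding step on this slide can be reproduced with a simple rescaling. The slide does not state the scale factors; dividing income by 200,000 and age by 100 is an assumption that happens to match the numbers shown (<30000, 25> becomes <0.15, 0.25>). The function and constant names are hypothetical.

```python
# Assumed scale factors, chosen only because they reproduce the slide's values.
INCOME_SCALE = 200_000
AGE_SCALE = 100


def encode_example(income, age, buys_computer):
    """Scale raw features into a small numeric range and map the label to -1/+1."""
    features = (income / INCOME_SCALE, age / AGE_SCALE)
    label = 1 if buys_computer else -1
    return features, label


x1, y1 = encode_example(30000, 25, buys_computer=False)
x2, y2 = encode_example(80000, 45, buys_computer=True)
print(x1, y1)
print(x2, y2)
```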
  • 12. Supervised Learning: Applications: 1. Credit card fraud detection (valid transaction or not) 2. Sentiment analysis (opinion mining, buzz analysis, etc.) 3. Churn prediction (potential churner or not) 4. Medical diagnosis (risk analysis)
  • 14. Regression: Applications: 1. Time series prediction (e.g., rainfall in a certain region, spend on voice calls) 2. Classification 3. Data reduction 4. Trend analysis 5. Risk factor analysis
  • 15. Examples of Classifying and Clustering:
  • 20. Cluster into two groups using both attributes:
  • 22. Finding Classifier Surface: Given labeled groups in feature space, we want to find a surface in that space that separates the groups, subject to constraints on the complexity of the surface. In this example we have a 2-D space, so we find the line (or connected set of line segments) that best separates the two groups. When the examples are well separated, this is straightforward. When examples in the labeled groups overlap, we may have to trade off false positives against false negatives.
  • 24. Adding some new data: Suppose we have learned to separate receivers from linemen. Now we are given some running backs and want to use the model to decide whether they are more like receivers or linemen. Blount = ['blount', 75, 250]; white = ['white', 70, 205]
  • 28. Requirements for Methods: Choosing the training data and evaluation method. Representation of the features. Distance metric for feature vectors. Objective function and constraints. Optimization method for learning the model.
  • 34. Need to measure distance between features. Feature Engineering: Deciding which features to include and which merely add noise to the classifier. Defining how to measure distances between training examples (and ultimately between classifiers and new instances). Deciding how to weight the relative importance of different dimensions of the feature vector, which affects the definition of distance.
  • 35. Measuring distance between animals: We can think of our animal examples as consisting of four binary features and one integer feature. One way to learn to separate reptiles from non-reptiles is to measure the distance between pairs of examples and use that either to cluster nearby examples into a common class (unlabeled data), or to find a classifier surface in the space of examples that optimally separates different (labeled) collections of examples from other collections.
  • 38. Euclidean Distance between animals: Using Euclidean distance, the rattlesnake and the boa constrictor are much closer to each other than they are to the dart frog.
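The comparison can be sketched as follows. The transcript does not preserve the feature table, so the feature order and values below (four binary features plus a leg count) are assumptions chosen to be consistent with the statements on the surrounding slides.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Assumed feature order: [egg-laying, scales, poisonous, cold-blooded, number of legs]
animals = {
    "rattlesnake": [1, 1, 1, 1, 0],
    "boa constrictor": [0, 1, 0, 1, 0],
    "dart frog": [1, 0, 1, 0, 4],
}

snake_to_snake = euclidean_distance(animals["rattlesnake"], animals["boa constrictor"])
rattlesnake_to_frog = euclidean_distance(animals["rattlesnake"], animals["dart frog"])
boa_to_frog = euclidean_distance(animals["boa constrictor"], animals["dart frog"])

print(round(snake_to_snake, 3), round(rattlesnake_to_frog, 3), round(boa_to_frog, 3))
```

With these assumed values, the two snakes are about 1.41 apart, while each is more than 4 away from the dart frog.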
  • 39. Add an Alligator: The alligator is closer to the dart frog than to the snakes. Why? The alligator differs from the frog in 3 features and from the boa in only 2 features. But the scale on “legs” runs from 0 to 4, while the other features run from 0 to 1, so the “legs” dimension is disproportionately large.
  • 40. Using Binary Features: Now the alligator is closer to the snakes than it is to the frog, which makes sense.
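The effect of rescaling the "legs" feature can be checked directly. This is a sketch under assumptions: the feature values are not in the transcript and are chosen to match the slides' statements (the alligator differs from the frog in 3 features and from the boa in 2), and `binarize_legs` is a hypothetical helper.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def binarize_legs(features):
    """Replace the leg count (last feature) with 1 if the animal has legs, else 0."""
    return features[:-1] + [1 if features[-1] > 0 else 0]


# Assumed feature order: [egg-laying, scales, poisonous, cold-blooded, number of legs]
animals = {
    "rattlesnake": [1, 1, 1, 1, 0],
    "boa constrictor": [0, 1, 0, 1, 0],
    "dart frog": [1, 0, 1, 0, 4],
    "alligator": [1, 1, 0, 1, 4],
}


def distances_from(name, table):
    """Distance from one named animal to every other animal in the table."""
    return {other: euclidean_distance(table[name], vec)
            for other, vec in table.items() if other != name}


raw = distances_from("alligator", animals)
binary = distances_from("alligator",
                        {name: binarize_legs(vec) for name, vec in animals.items()})

print({k: round(v, 3) for k, v in raw.items()})
print({k: round(v, 3) for k, v in binary.items()})
```

With the raw leg count, the frog is the alligator's nearest neighbor; once legs are reduced to a 0/1 feature, both snakes are nearer than the frog.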
  • 41. Clustering Approach: Suppose we know that there are k different groups in our training data, but we don’t know the labels. Pick k samples (at random?) as exemplars. Cluster the remaining samples by minimizing the distance between samples in the same cluster (the objective function): put each sample in the group with the closest exemplar. Find the median example in each cluster and use it as the new exemplar. Repeat until no change. Issues: How do we decide on the best number of clusters? How do we select the best features and the best distance metric?
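The loop described on this slide can be sketched as below. Several choices are assumptions rather than the lecture's method: "median example" is interpreted as the cluster member with the smallest total distance to the other members, the initial exemplars are passed in explicitly instead of being drawn at random so the run is reproducible, and the height/weight samples are hypothetical.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_exemplars(samples, exemplars):
    """Put each sample in the cluster of its closest exemplar."""
    clusters = [[] for _ in exemplars]
    for sample in samples:
        closest = min(range(len(exemplars)),
                      key=lambda i: euclidean_distance(sample, exemplars[i]))
        clusters[closest].append(sample)
    return clusters


def median_example(cluster):
    """Member with the smallest total distance to all members (assumed meaning)."""
    return min(cluster,
               key=lambda c: sum(euclidean_distance(c, other) for other in cluster))


def cluster_samples(samples, initial_exemplars, max_iterations=100):
    exemplars = list(initial_exemplars)
    clusters = assign_to_exemplars(samples, exemplars)
    for _ in range(max_iterations):
        new_exemplars = [median_example(c) if c else e
                         for c, e in zip(clusters, exemplars)]
        if new_exemplars == exemplars:  # repeat until no change
            break
        exemplars = new_exemplars
        clusters = assign_to_exemplars(samples, exemplars)
    return exemplars, clusters


# Hypothetical (height, weight) samples, for illustration only.
samples = [(70, 180), (71, 185), (72, 190), (76, 300), (77, 310), (78, 305)]
final_exemplars, final_clusters = cluster_samples(samples, [samples[0], samples[1]])
print(final_exemplars)
print(final_clusters)
```

To pick the starting exemplars at random, as the slide suggests, `random.sample(samples, k)` could supply `initial_exemplars`.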
  • 42. Classification Approach: We want to find boundaries in feature space that separate different classes of labeled examples. Look for a simple surface (e.g., the best line or plane) that separates the classes. Look for a more complex surface (subject to constraints) that separates the classes. Use voting schemes: take the k nearest training examples and use a majority vote to select the label. Issues: How do we avoid over-fitting to the data? How do we measure performance? How do we select the best features?
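The voting scheme mentioned on this slide (k nearest training examples, majority vote) can be sketched as follows. The training examples and queries are hypothetical, and `knn_predict` is an illustrative name, not code from the lecture.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(training_examples, query, k=3):
    """Label a query by majority vote among its k nearest training examples.

    training_examples is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training_examples,
                     key=lambda example: euclidean_distance(example[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (height, weight) training data, for illustration only.
training = [
    ((70, 180), "receiver"), ((71, 185), "receiver"),
    ((72, 190), "receiver"), ((73, 200), "receiver"),
    ((75, 290), "lineman"), ((76, 300), "lineman"),
    ((77, 310), "lineman"), ((78, 305), "lineman"),
]

light_player = knn_predict(training, (72, 195), k=3)
heavy_player = knn_predict(training, (76, 295), k=3)
print(light_player, heavy_player)
```

Because the features are unscaled here, weight dominates the distance; rescaling the features would change which neighbors count as nearest.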
  • 43. Classification: Attempt to minimize error on the training data. Similar to fitting a curve to data. Evaluate on training data.
  • 44. Randomly Divide Data into Training and Test set:
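A random split can be sketched in a few lines. The 80/20 proportion, the fixed seed, and the function name are assumptions for illustration; the slide does not specify them.

```python
import random


def split_train_test(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split it into (training, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data: ten example identifiers.
data = list(range(10))
train_set, test_set = split_train_test(data, test_fraction=0.2, seed=0)
print(len(train_set), len(test_set))
```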
  • 45. Two possible models for a Training Set
  • 47. Training Accuracy of Model: 0.7 for both models; which is better? Can we find a model with less training error? Yes.
  • 48. Applying Model to test Data:
  • 50. Summary: Machine learning methods provide a way of building models of processes from data sets. Supervised learning uses labeled data to create classifiers that optimally separate data into known classes. Unsupervised learning tries to infer latent variables by clustering training examples into nearby groups. The choice of features influences results. The choice of distance measurement between examples influences results. We will see some clustering methods like k-means.