Module-6
Learning systems
What is Machine Learning?
 Learning from Data
 Learns the relationship between the variables of a
system (input, output and hidden) from direct samples
of the system (data)
 Plenty of data and computational power make it possible to replace
rule-based models with probabilistic, data-driven models
 Learning from Big Data (scalable Learning)
 Smart / Autonomous / Intelligent systems
Why Machine Learning?
 There is no need to “learn” to calculate payroll etc….
 No Machine Learning is needed when the relationships
between all system variables (input, output, and
hidden) are completely understood! – known function
 This is NOT the case for almost any real world system!
A Generic Real World System
A system maps N input variables, through K hidden (internal) variables, to M output variables:
Input Variables: x = (x_1, x_2, ..., x_N)
Hidden Variables: h = (h_1, h_2, ..., h_K)
Output Variables: y = (y_1, y_2, ..., y_M)
Machine Learning – Why?
 When human expertise does not exist
 Humans are unable to explain their expertise
(speech recognition)
 Some tasks cannot be defined well, except by
examples (e.g., recognizing people).
 Correlations and patterns can be hidden within
large amounts of data.
Machine Learning – Why?
 The amount of knowledge available about certain tasks
might be too large for explicit encoding by humans
(e.g., medical diagnosis).
 The environment / solution changes over time.
 Solution needs to be adapted to particular cases
 Set of all possible behaviors given all possible inputs
is too large
 Process must generalize from the finite set of
examples to produce an output / decision in new
cases.
Learning from Data
 Data may belong to any domain such as text,
image, video, speech, bioinformatics, etc.
 Few Examples:
 Learning to recognize spoken words
 Learning to drive an autonomous vehicle
 Learning to classify new structures
 Learning to play games
Driverless Vehicle
• Learning to drive an
autonomous vehicle
– Associate steering commands
with image sequences
Google
Prototype
Task T: driving on public, 4-lane
highway using vision sensors
Performance measure P: average
distance traveled
Training experience E: sequence of
images and steering commands
recorded while observing a human
driver
Handwritten character recognition
• It is very hard to say
what makes a “2”
• Wide variability of same
numeral
• Handcrafted rules will
result in a large number of
rules and exceptions
• Better to have a
machine that learns
from a large training set
Face Recognition
Training examples of a person
Test images
Machine Learning Tasks
 Recognition / Classification
 Clustering
 Prediction / Regression
 Anomaly detection
 Retrieval
Other Applications
 Recognizing patterns:
 Speech Recognition
 Facial identities or facial expressions
 Handwritten or spoken words
 Medical images
 Recognizing anomalies:
 Unusual sequences of credit card transactions
 Unusual patterns of sensor readings in a nuclear power plant
 Video surveillance
 Prediction:
 Future stock prices
 Sales prediction
 Information Retrieval
Machine Learning – How?
 Supervised learning
 Training data includes desired outputs
 classification, regression, outlier detection
 Unsupervised learning
 Training data does not include desired outputs
 Grouping similar instances into clusters;
most Big Data has no labels.
 Semi-supervised learning
 Training data includes a few desired outputs
 Reinforcement learning
 Rewards or penalties from a sequence of actions
 After a set of trial-and-error runs, learns the best policy -
the sequence of actions that maximizes the total reward.
Machine Learning - Where?
 Handwriting Recognition
 x: Data from pen motion.
 f(x): Letter of the alphabet.
 Disease diagnosis
 x: Properties of patient (symptoms, lab tests)
 f(x): Disease (or recommended therapy)
 Face recognition
 x: Bitmap picture of person's face
 f(x): Name of the person.
 Spam Detection
 x: Email message
 f(x): Spam or not spam.
 So many………
Machine Learning – How?
 Data Representation
 Choice of Similarity or Distance measure
 Choice of Learning Algorithm
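To make the choice of similarity or distance measure concrete, here is a minimal sketch of two common measures between feature vectors. It assumes NumPy, which the slides do not name, so this is just one illustrative choice:

```python
import numpy as np

def euclidean_distance(a, b):
    # Straight-line distance between two feature vectors
    return np.sqrt(np.sum((a - b) ** 2))

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors (1.0 = same direction)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([2.0, 2.0, 4.0])
print(euclidean_distance(x1, x2))  # ~1.414
print(cosine_similarity(x1, x2))   # ~0.98
```

Euclidean distance suits dense numeric features; cosine similarity is common when only the direction of the feature vector matters (e.g., text).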
Machine Learning – How?
 Raw data is preprocessed to obtain a feature vector,
X, that adequately describes all the relevant features
for classifying examples
 Each x is a list of (attribute, value) pairs. Example:
X = [Person:Susan, EyeColor:Brown, Age:40, Sex:Female]
 The number of attributes is fixed
 Each attribute has discrete or continuous values
 An example can be interpreted as a point in an n-
dimensional feature space, where n is the number
of attributes
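As a concrete illustration of the Susan example above, here is a minimal sketch (the attribute codes and helper names are made up for illustration, not from the slides) that turns a list of (attribute, value) pairs into a fixed-length numeric feature vector - a point in an n-dimensional feature space:

```python
# Fixed set of attributes; discrete values are mapped to integer codes,
# continuous values are kept as-is.
EYE_COLOR_CODES = {"Brown": 0, "Blue": 1, "Green": 2}
SEX_CODES = {"Female": 0, "Male": 1}

def to_feature_vector(example):
    # example: dict of (attribute, value) pairs, e.g. the Susan record above
    return [
        EYE_COLOR_CODES[example["EyeColor"]],  # discrete attribute
        float(example["Age"]),                 # continuous attribute
        SEX_CODES[example["Sex"]],             # discrete attribute
    ]

susan = {"Person": "Susan", "EyeColor": "Brown", "Age": 40, "Sex": "Female"}
x = to_feature_vector(susan)   # a point in a 3-dimensional feature space
print(x)                       # [0, 40.0, 0]
```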
Machine Learning – Techniques
• Decision Tree induction
 Tree is constructed in a top-down recursive divide-and-conquer manner
 Simple rules
 Overfitting
• Bayesian classifier
 A statistical classifier that performs probabilistic prediction - it predicts class
membership probabilities based on Bayes' Theorem
 Prior knowledge can be incorporated
 Easy to implement, good results
 Dependence among attributes cannot be modeled (addressed by Bayesian Belief
Networks)
• Neural Networks
 Given enough hidden units and enough training samples, they can closely
approximate any function
 Long training time
 Require a number of parameters typically best determined empirically
 Ability to classify untrained patterns
 Well-suited for continuous-valued inputs and outputs
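A minimal sketch of the first two techniques, assuming scikit-learn is available (the slides name no particular library), fitted on the bundled Iris data:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# Decision tree: built top-down by recursively splitting on the most
# informative attribute (divide and conquer); prone to overfitting if unpruned.
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Naive Bayes: predicts class-membership probabilities via Bayes' theorem,
# P(class | x) proportional to P(x | class) * P(class), assuming independent attributes.
nb = GaussianNB().fit(X, y)

print(tree.predict(X[:2]))
print(nb.predict_proba(X[:2]).round(2))
```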
Machine Learning Techniques
• Lazy Learning: K-Nearest Neighbor (KNN)
• Gaussian Mixture Model (GMM)
• Adapted GMM
• Hidden Markov Model (HMM)
• Support Vector Machine (SVM)
• Posterior probability SVM
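To show why K-Nearest Neighbor is called lazy learning, here is a tiny illustrative NumPy implementation (a sketch, not from the slides): nothing is learned up front, and all work - computing distances to the stored training examples - happens at query time.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    # Lazy learning: no model is fit; distances are computed at query time.
    distances = np.sqrt(((X_train - x_query) ** 2).sum(axis=1))
    nearest = np.argsort(distances)[:k]                     # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]   # majority label among them

X_train = np.array([[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [7.5, 8.2]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))  # -> 0
```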
Supervised learning
 Supervised learning, as the name indicates, has the
presence of a supervisor as a teacher.
 Basically, supervised learning is when we teach or
train the machine using data that is well labeled,
which means some data is already tagged with the
correct answer.
 After that, the machine is provided with a new set of
examples (data) so that the supervised learning
algorithm analyses the training data (a set of training
examples) and produces a correct outcome from the
labeled data.
Training data
Testing data
Prediction Steps
[Figure: supervised learning pipeline - during training, image features are extracted from the training images and, together with the training labels, used to learn a model; during testing, features extracted from a test image are passed to the learned model to produce a prediction.]
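A minimal end-to-end sketch of the training/testing steps above, assuming scikit-learn; the image feature extraction step is omitted and the bundled digits data stands in for the image features:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# "Training images" -> features X, "training labels" -> y
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # learn the model
y_pred = model.predict(X_test)                                   # prediction on test images
print("accuracy:", accuracy_score(y_test, y_pred))
```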
Types of Supervised Learning Algorithms
 Supervised learning is classified into two categories
of algorithms:
 Classification: A classification problem is when
the output variable is a category, such as “red” or
“blue”, or “disease” and “no disease”.
 Regression: A regression problem is when the
output variable is a real value, such as “dollars” or
“weight”.
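The classification case is sketched earlier; for the regression side, here is a minimal example, again assuming scikit-learn, with made-up numbers where the output is a real value (a price in dollars):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: house size in square metres -> price in dollars (made-up numbers)
X = np.array([[50], [80], [120], [200]])
y = np.array([150_000, 240_000, 360_000, 600_000])

reg = LinearRegression().fit(X, y)
print(reg.predict([[100]]))  # predicted price for a 100 m^2 house (~300,000)
```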
Other types
 Supervised learning deals with, or learns from,
“labeled” data. This implies that some data is already
tagged with the correct answer.
 Other types:-
 Logistic Regression
 Naive Bayes Classifiers
 K-NN (k nearest neighbors)
 Decision Trees
 Support Vector Machine
 Advantages:-
 Supervised learning allows you to collect data and produce
outputs based on previous experience.
 Helps to optimize performance criteria with the help of
experience.
 Supervised machine learning helps to solve various types of
real-world computation problems.
 Disadvantages:-
 Classifying big data can be challenging.
 Training for supervised learning needs a lot of computation
time, so it can be time-consuming.
Unsupervised learning
 Unsupervised learning is the training of a machine using
information that is neither classified nor labeled and
allowing the algorithm to act on that information without
guidance.
 Here the task of the machine is to group unsorted
information according to similarities, patterns, and
differences without any prior training of data.
 Unlike supervised learning, no teacher is provided, which
means no training labels are given to the machine. Therefore
the machine must find the hidden structure in the
unlabeled data by itself.
 For instance, suppose the machine is given an image containing
both dogs and cats, which it has never seen before.
 The machine has no idea about the features of dogs and
cats, so it cannot categorize the image as ‘dogs’ and ‘cats’.
 But it can categorize them according to their similarities,
patterns, and differences, i.e., we can easily categorize the
above picture into two parts.
 The first may contain all pics having dogs in them and the second part
may contain all pics having cats in them.
 Here the machine has not learned anything beforehand, which means there is no
training data or labeled examples.
 It allows the model to work on its own to discover patterns
and information that was previously undetected. It mainly
deals with unlabelled data.
Types of Unsupervised learning
 Unsupervised learning is classified into two
categories of algorithms:
 Clustering: A clustering problem is where you want
to discover the inherent groupings in the data, such
as grouping customers by purchasing behavior.
 Association: An association rule learning problem
is where you want to discover rules that describe
large portions of your data, such as “people who buy X
also tend to buy Y”.
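A minimal clustering sketch, assuming scikit-learn, that groups made-up "customers" by two behavioral features without any labels:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy customer data: [visits per month, average spend] -- no labels provided
customers = np.array([[2, 20], [3, 25], [2, 22],
                      [10, 200], [12, 220], [11, 210]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)           # cluster assignment discovered for each customer
print(kmeans.cluster_centers_)  # centres of the discovered groups
```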
Other Types
 Types of Unsupervised Learning:-
 Clustering
 Exclusive (partitioning)
 Agglomerative
 Overlapping
 Probabilistic
 Clustering Types:-
 Hierarchical clustering
 K-means clustering
 Principal Component Analysis
 Singular Value Decomposition
 Independent Component Analysis
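And a minimal dimensionality-reduction sketch for Principal Component Analysis, again assuming scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)   # labels are ignored: this is unsupervised
pca = PCA(n_components=2).fit(X)
X_2d = pca.transform(X)             # project 4-D examples onto 2 principal components
print(X_2d.shape, pca.explained_variance_ratio_)
```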
Parameters | Supervised machine learning | Unsupervised machine learning
Input Data | Algorithms are trained using labeled data. | Algorithms are used against data that is not labeled.
Computational Complexity | Simpler method | Computationally complex
Accuracy | Highly accurate | Less accurate
Reinforcement Learning
 Refer to the website below:
https://www.geeksforgeeks.org/what-is-reinforcement-learning/