
SUPPORT VECTOR MACHINE
Prof. Subodh Kumar Mohanty
The idea of support vectors and their importance
Introduction
• The support vector machine is currently considered one of the best off-the-shelf learning algorithms and has been applied successfully in various domains.
• Support vector machines were originally designed for binary classification.
• They were later extended to solve multi-class and regression problems.
• However, they remain most widely used for classification tasks.
• The objective of the support vector machine algorithm is to find a hyperplane
in an N-dimensional space (N — the number of features) that distinctly
classifies the data points.
• To separate the two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane that has the maximum margin, i.e. the maximum distance between the data points of both classes.

Possible hyperplanes
• Maximizing the margin distance provides some reinforcement so that
future data points can be classified with more confidence.

• The goal is to choose a hyperplane with the greatest possible margin
between the hyperplane and any point within the training set, giving a
greater chance of new data being classified correctly.
• Hyperplanes are decision boundaries that help classify the data points.
• Data points falling on either side of the hyperplane can be attributed to
different classes.
• Also, the dimension of the hyperplane depends upon the number of
features.
• It becomes difficult to visualize the hyperplane when the number of features exceeds 3.
• Support vectors are the data points that lie closest to the hyperplane and influence its position and orientation.
• Using these support vectors, we maximize the margin of the classifier.
• Deleting the support vectors will change the position of the hyperplane.
• These are the points that help us build our SVM (that works for a
new/test data).

Two test data points on either side of the hyperplane: one will be predicted as square, the other as circle.
• But what happens when there is no clear hyperplane?
• A dataset will often look more like the jumbled balls below, which represent a linearly non-separable dataset.

• In order to classify a dataset like the one above, it is necessary to move away from a 2D view of the data to a 3D view.
• Explaining this is easiest with another simplified example.
• Imagine that our two sets of colored balls above are sitting on a sheet
and this sheet is lifted suddenly, launching the balls into the air.
• While the balls are up in the air, you use the sheet to separate them.
• This ‘lifting’ of the balls represents the mapping of data into a higher
dimension.

• This is known as kernelling.
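To make the "lifting" idea concrete, here is a standard illustrative feature map (not taken from these slides, but the textbook example for this situation): 2D points that can only be separated by a circle become linearly separable after a quadratic mapping.

φ(x_1, x_2) = (x_1^2, √2 x_1 x_2, x_2^2)

A circular boundary x_1^2 + x_2^2 = r^2 in the input space becomes the plane z_1 + z_3 = r^2 in the new (z_1, z_2, z_3) space, which is linear. Moreover, φ(x)^T φ(z) = (x^T z)^2, so the inner product in the lifted space can be computed directly from the original 2D vectors; this observation is exactly what kernelling exploits.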


Pros & Cons of Support Vector Machines
Pros
• High accuracy
• Works well on smaller, cleaner datasets
• It can be more efficient because it uses only a subset of the training points (the support vectors)

Cons
• Isn’t suited to larger datasets as the training time with SVMs can be high
• Less effective on noisier datasets with overlapping classes
Applications
• SVM is used for text classification tasks such as category assignment,
detecting spam and sentiment analysis.
• It is also commonly used for image recognition challenges,
performing particularly well in aspect-based recognition and
color-based classification.
• SVM also plays a vital role in many areas of handwritten digit
recognition, such as postal automation services.
Derivation of Support Vector Equation
Comparison with logistic regression

Training set: {(x^(1), y^(1)), ..., (x^(m), y^(m))} with m examples, y ∈ {0, 1}

Hypothesis (sigmoid function): h_θ(x) = g(θ^T x), where g(z) = 1 / (1 + e^(-z))

Threshold classifier output at 0.5:
If h_θ(x) ≥ 0.5 (i.e. θ^T x ≥ 0), predict "y = 1"
If h_θ(x) < 0.5 (i.e. θ^T x < 0), predict "y = 0"

How to choose parameters θ? Maximum Likelihood Estimation (already discussed)
Comparison with logistic regression
• In SVM, we take the output of the linear function and if that output is greater than 1, we identify it with one class, and if the output is less than -1, we identify it with the other class:
If w^T x + w_0 ≥ 1, predict "y = +1"
If w^T x + w_0 ≤ -1, predict "y = -1"
• Since the threshold values are changed to 1 and -1 in SVM, we obtain this reinforcement range of values ([-1, 1]) which acts as the margin.

• g(x) = w^T x + w_0 is a linear discriminant function that divides (categorizes) the input space into two decision regions.

• The generalization of the linear discriminant function to an n-dimensional feature space is straightforward:

g(x) = w_0 + w_1 x_1 + ... + w_n x_n = w^T x + w_0

• The decision surface g(x) = 0 is now a linear surface in the n-dimensional space, called a hyperplane, symbolized as w^T x + w_0 = 0.
• A two-category classifier implements the following decision rule:
Decide Class 1 if g(x) > 0 and Class 2 if g(x) < 0
• Thus, x is assigned to Class 1 if the inner product w^T x exceeds the threshold (bias) -w_0, and to Class 2 otherwise.
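As a small numerical illustration (numbers chosen here, not from the slides): with w = (2, 1)^T and w_0 = -4, the point x = (3, 1)^T gives g(x) = 2·3 + 1·1 - 4 = 3 > 0, so it is assigned to Class 1, while x = (1, 1)^T gives g(x) = 2 + 1 - 4 = -1 < 0, so it is assigned to Class 2.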
• Figure shows the architecture of a typical implementation of the
linear classifier.
• It consists of two computational units: an aggregation unit and an
output unit.

A simple linear classifier


Linear decision boundary between two classes

Algebraic measure of the distance from x to the hyperplane (geometry shown for 3 dimensions, n = 3)
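The result the figure illustrates, stated as the standard formula (the equation image itself is not preserved in this text): the algebraic (signed) distance r from a point x to the hyperplane w^T x + w_0 = 0 is

r = g(x) / ||w|| = (w^T x + w_0) / ||w||

so r > 0 on the Class 1 side and r < 0 on the Class 2 side, and the distance of the hyperplane from the origin is |w_0| / ||w||.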
Linear Maximal Margin Classifier for Linearly
Separable Data
• For linearly separable data, many hyperplanes exist that perform the separation.
• The SVM framework tells us which hyperplane is best: the hyperplane with the largest margin, which minimizes the training error.
• Select the decision boundary that is far away from both the classes.
• Large margin separation is expected to yield good generalization.
• In w^T x + w_0 = 0, w defines a direction perpendicular to the hyperplane.
• w is called the normal vector (or simply normal) of the hyperplane.
• Without changing the normal vector w, varying w0 moves the
hyperplane parallel to itself.

Large margin and small margin separation


Geometric interpretation of algebraic distances of points to a hyperplane, for the two-dimensional case

KKT Condition
Learning problem in SVM

Hard margin SVM vs. soft margin SVM
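The equations for the learning problem are not preserved in this text; for reference, the standard hard-margin formulation for linearly separable data {(x_i, y_i)}, y_i ∈ {-1, +1}, i = 1, ..., m, is:

minimize over (w, w_0):   (1/2) ||w||^2
subject to:   y_i (w^T x_i + w_0) ≥ 1,   i = 1, ..., m

Maximizing the margin 2/||w|| is equivalent to minimizing ||w||^2 / 2, and the KKT conditions of this convex problem identify the support vectors as the points whose constraints are active, i.e. y_i (w^T x_i + w_0) = 1. The soft-margin variant, covered next, relaxes these constraints.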



Linear Soft Margin Classifier for Overlapping
Classes
• To generalize the SVM, we must allow for noise (overlapping classes) in the training data.
• The hard margin linear SVM algorithm will not work in this case.
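The soft-margin objective is likewise not preserved in this text; the standard formulation introduces slack variables ξ_i ≥ 0 and a penalty parameter C > 0:

minimize over (w, w_0, ξ):   (1/2) ||w||^2 + C Σ_{i=1}^{m} ξ_i
subject to:   y_i (w^T x_i + w_0) ≥ 1 - ξ_i,   ξ_i ≥ 0,   i = 1, ..., m

A point with 0 < ξ_i ≤ 1 lies inside the margin but on the correct side of the boundary, while ξ_i > 1 means the point is misclassified; larger C penalizes violations more heavily and pushes the solution towards the hard-margin case.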

Soft decision boundary






Kernel Function: Dealing with Nonlinearity
Non-linear classifiers
• For several real-life datasets, the decision boundaries are nonlinear.
• To deal with the nonlinear case, the formulation and solution methods employed for the linear case are still applicable.
• Only the input data is transformed from its original space into another (higher-dimensional) space, so that a linear decision boundary can separate Class 1 examples from Class 2 examples.
• The transformed space is called the feature space.
• The original data space is known as the input space.
Non-linear classifiers
• Some training examples cannot be linearly separated in the input space.
• In the feature space, obtained through a suitable transformation, they can be separated linearly.

Transformation from input space to feature space
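The defining equations of the kernel approach are not shown in this text; the standard statement is that a kernel function computes the inner product of two points after the mapping φ into the feature space, without ever computing φ explicitly:

K(x, z) = φ(x)^T φ(z)

Because the SVM optimization problem and its decision function depend on the data only through inner products, every occurrence of x_i^T x_j can be replaced by K(x_i, x_j). Mercer's theorem, named in the heading below, gives the condition under which a symmetric function K is a valid kernel, i.e. corresponds to an inner product in some feature space.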





Mercer’s theorem

Polynomial and Radial Basis Kernel

Polynomial Kernel
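The kernel's formula is missing from this text; the standard polynomial kernel of degree d with constant offset c ≥ 0 is:

K(x, z) = (x^T z + c)^d

With c = 0 the kernel is called homogeneous; for d = 2 and c = 0, K(x, z) = (x^T z)^2 corresponds to the explicit quadratic feature map shown earlier.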
• The polynomial kernel represents the similarity of vectors (training
samples) in a feature space over polynomials of the original variables,
allowing learning of non-linear models.
• It looks not only at the given features of input samples to determine their
similarity, but also combinations of these (interaction features).
• Quite popular in natural language processing (NLP).
• The most common degree is d = 2 (quadratic), since larger degrees tend to
overfit on NLP problems.
• One problem with the polynomial kernel is that it may suffer from numerical instability, since the kernel value can range from 0 to infinity depending on the magnitude of x^T z + c and the degree d.
Radial Basis Kernel
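The formula itself is not preserved here; the standard RBF (Gaussian) kernel, written with d₁₂ = ||X₁ - X₂|| for the distance referred to in the bullets below, is:

K(X₁, X₂) = exp(-d₁₂^2 / (2σ^2)) = exp(-γ ||X₁ - X₂||^2),   where γ = 1 / (2σ^2) > 0

The width parameter σ (equivalently γ) controls how quickly the similarity decays with distance.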
• The maximum value that the RBF kernel can take is 1; it occurs when d₁₂ is 0, i.e. when the points are the same, X₁ = X₂.
• When the points are the same, there is no distance between them and therefore they are extremely similar.
• When the points are separated by a large distance, the kernel value is less than 1 and close to 0, which means the points are dissimilar.
• There are no golden rules for determining which admissible kernel will
result in the most accurate SVM.
• In practice, the kernel chosen does not generally make a large difference in
resulting accuracy.
• SVM training always finds a global solution, unlike neural networks (to be
discussed in the next chapter) where many local minima usually exist.
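As a practical aside (not part of the original slides), these kernels are easy to compare empirically with scikit-learn, assuming it is installed; the snippet below is only a minimal sketch on a toy, linearly non-separable dataset.

# Minimal sketch (assumes scikit-learn is available): compare three SVM kernels.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# A small linearly non-separable dataset, like the "jumbled balls" example.
X, y = make_moons(n_samples=300, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ["linear", "poly", "rbf"]:
    # C is the soft-margin penalty; degree is used only by the polynomial kernel.
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0, degree=2))
    clf.fit(X_train, y_train)
    print(kernel, "test accuracy:", clf.score(X_test, y_test))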
