
Support Vector Machines
By Dr. Adven
What is a Support Vector Machine?
• Support Vector Machine (SVM) is a simple yet powerful supervised machine learning algorithm that is generally used for classification purposes.
• The SVM algorithm can perform really well with both linearly separable and non-linearly separable datasets.
• Even with a limited amount of training data, SVM tends to perform remarkably well.
Basic terms used in SVM
A hyperplane in SVM (Support Vector Machine) is like an invisible wall or boundary that separates data into different groups or categories.
1. Hyperplane
A hyperplane is the decision boundary that differentiates the two classes in SVM. A data point falling on either side of the hyperplane is attributed to the corresponding class. The dimension of the hyperplane depends on the number of input features in the dataset: with 2 input features the hyperplane is a line; likewise, with 3 features it becomes a two-dimensional plane.
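In standard notation (an addition; the slide states this only in words), a hyperplane is the set of points x satisfying

    \mathbf{w}^\top \mathbf{x} + b = 0

where w is the weight vector (normal to the hyperplane) and b is the bias; the predicted class of a point is the sign of w^T x + b.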
Basic terms used in SVM (cont.)
2. Support Vectors
• The data points nearest the optimal decision boundary, which determine the maximum margin, are called support vectors.
• Support vectors are simply the coordinates of individual observations.
• We have to select the hyperplane for which the margin, i.e. the distance between the support vectors and the hyperplane, is maximum.
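The slides name no library, but as a minimal hedged sketch, scikit-learn's SVC exposes the fitted support vectors directly (the toy data here is illustrative, not from the slides):

import numpy as np
from sklearn.svm import SVC

# Two tiny illustrative classes
X = np.array([[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.support_vectors_)       # the observations nearest the boundary
print(clf.coef_, clf.intercept_)  # w and b of the hyperplane w.x + b = 0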
Types of Support Vector Machines
There are two types of Support Vector Machines (see the sketch after this list):
• Linear SVM or Simple SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two classes with a single straight line, that data is considered linearly separable, and the classifier is referred to as a linear SVM classifier. It is typically used for classification problems and can also be adapted to regression.
• Nonlinear SVM or Kernel SVM: Nonlinear SVM is used for non-linearly separable data, i.e., a dataset that cannot be classified using a straight line. The classifier used in this case is referred to as a nonlinear SVM classifier. It offers more flexibility for nonlinear data, because additional features can be added so that a separating hyperplane exists in a higher-dimensional space.
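In scikit-learn terms (an assumption; the slides name no library), the two types correspond roughly to the two kernel choices below:

from sklearn.svm import SVC

linear_svm = SVC(kernel="linear")  # Linear / Simple SVM for linearly separable data
kernel_svm = SVC(kernel="rbf")     # Nonlinear / Kernel SVM for non-linearly separable data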
How SVM works?
Now the burning question is: "How can we identify the right hyper-plane?" Let's understand.
Identifying the right hyper-plane (Scenario-1: Maximum classification): Here, we have three hyper-planes (A, B, and C). Identify the best one.
The rule of thumb for identifying the right hyper-plane: "Select the hyper-plane which segregates the two classes better."
In this scenario, hyper-plane "B" has performed this job excellently.
How SVM works? (cont.)
Identifying the right hyper-plane (Scenario-2: Best separation): Here, we again have three hyper-planes (A, B, and C), and all of them segregate the classes well. Now, how can we identify the right hyper-plane?
• Maximizing the distance between the nearest data point (of either class) and the hyper-plane helps decide the right hyper-plane. This distance is called the margin (see the formula below).
• The margin for hyper-plane C is higher than for both A and B. Hence, we name the right hyper-plane C.
• Another compelling reason for selecting the hyper-plane with the higher margin is robustness: if we select a hyper-plane with a low margin, there is a high chance of misclassification.
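For reference (standard SVM theory, not spelled out on the slide): if the hyperplane is scaled so that the support vectors satisfy |w^T x + b| = 1, the margin equals 2/||w||, so finding the right hyperplane amounts to solving

    \max_{\mathbf{w},\, b} \ \frac{2}{\lVert \mathbf{w} \rVert}
    \quad \text{subject to} \quad y_i (\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 \ \text{for all } i.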
How SVM works? (cont.)
Identifying the right hyper-plane (Scenario-3): Hint: use the rules discussed in the previous sections to identify the right hyper-plane.
Some of you may have selected hyper-plane B, as it has a higher margin compared to A. But here is the catch: SVM selects the hyper-plane which classifies the classes accurately prior to maximizing the margin. Here, hyper-plane B has a classification error, while A has classified everything correctly. Therefore, the right hyper-plane is A.
How SVM works? (cont.)
Can we classify the two classes? (Scenario-4): We are unable to segregate the two classes using a straight line, as one of the stars lies in the territory of the other (circle) class as an outlier.
The star at the other end is like an outlier for the star class. The SVM algorithm has a feature to ignore outliers and find the hyper-plane that has the maximum margin. Hence, we can say that SVM classification is robust to outliers (a sketch of the parameter controlling this tolerance follows below).
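How strongly SVM ignores such outliers is governed in practice by the regularization parameter C (the soft margin). A minimal hedged sketch with scikit-learn, on illustrative data:

import numpy as np
from sklearn.svm import SVC

# Illustrative data with one outlier "star" inside the "circle" region
X = np.array([[1, 1], [2, 1], [1, 2], [6, 5], [7, 6], [6, 7], [2, 2]])
y = np.array([0, 0, 0, 1, 1, 1, 1])  # the last point is the outlier

soft = SVC(kernel="linear", C=0.1).fit(X, y)   # small C: wide margin, tolerates the outlier
hard = SVC(kernel="linear", C=1000).fit(X, y)  # large C: tries hard to classify every point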
How SVM works? (cont.)
Find the hyper-plane to segregate two classes (Scenario-5): In this scenario, we can't place a linear hyper-plane between the two classes, so how does SVM classify them?
Until now, we have only looked at linear hyper-planes. Can SVM solve this problem?
How SVM works? (cont.)
• Well, the answer is yes. SVM does this by projecting the data into a higher dimension, as shown in the image. In the first case, the data is not linearly separable; hence, we project it into a higher dimension.
• If we have more complex data, SVM will continue to project the data into higher dimensions until it becomes linearly separable. Once the data becomes linearly separable, we can use SVM to classify it just like in the previous problems.
How SVM works? (cont.)
Projection into a higher dimension (here, a new feature z = x² + y²):
• All values of z are always positive, because z is the sum of the squares of x and y.
• In the original plot, the red circles appear close to the origin of the x and y axes, leading to lower values of z, while the stars lie relatively far from the origin, resulting in higher values of z.
• With this feature, it is easy for the SVM classifier to place a linear hyper-plane between the two classes.
• But another burning question arises: do we need to add this feature manually to obtain a hyper-plane? No, the SVM algorithm has a technique called the kernel trick (a sketch of both routes follows below).
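A minimal hedged sketch of both routes, using illustrative data (an inner cluster and an outer ring, mimicking the slide's figure):

import numpy as np
from sklearn.svm import SVC

X = np.array([[0.5, 0.5], [-0.5, 0.5], [0.5, -0.5], [-0.5, -0.5],  # near origin: low z
              [2, 0], [0, 2], [-2, 0], [0, -2]])                   # far away: high z
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Route 1: add z = x^2 + y^2 manually, then a linear SVM separates the classes.
Z = np.c_[X, (X ** 2).sum(axis=1)]
manual = SVC(kernel="linear").fit(Z, y)

# Route 2: the kernel trick performs the projection implicitly; no manual feature needed.
implicit = SVC(kernel="rbf").fit(X, y)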
SVM Kernel
The SVM kernel is a function that takes a low-dimensional input space and transforms it into a higher-dimensional space, i.e. it converts a non-separable problem into a separable one. It is mostly useful in non-linear separation problems.
Simply put, it performs some extremely complex data transformations, then finds the process to separate the data based on the labels or outputs you have defined.
When we look at the resulting hyper-plane in the original input space, it looks like a circle.
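Formally (standard kernel theory, not on the slide): a kernel computes the inner product of two points in the higher-dimensional space without ever constructing the mapping φ explicitly,

    K(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^\top \phi(\mathbf{x}'),

which is why the projection costs little even when the target space is very high-dimensional.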
Types of kernels available
1. Linear kernel
• With the linear kernel, the decision boundary is a straight line. Unfortunately, most real-world data is not linearly separable; this is the reason the linear kernel is not widely used in SVM.
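Its formula (standard, not on the slide) is simply the inner product of the two inputs:

    K(\mathbf{x}, \mathbf{x}') = \mathbf{x}^\top \mathbf{x}'.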
Types of kernels available (cont.)
2. Gaussian / RBF kernel
• It is the most commonly used kernel.
• It projects the data onto a Gaussian surface, where the red points become the peak of the surface and the green data points become its base, making the data linearly separable.
• But this kernel is prone to overfitting, and it tends to capture noise.
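Its formula (standard, not on the slide) is

    K(\mathbf{x}, \mathbf{x}') = \exp\left(-\gamma \lVert \mathbf{x} - \mathbf{x}' \rVert^2\right),

where the hyperparameter γ controls the width of the Gaussian; larger γ fits the training data more tightly, which is the source of the overfitting noted above.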
Types of kernels available (cont.)
3. Polynomial kernel
• Lastly, we have the polynomial kernel, which is non-uniform in nature due to the polynomial combination of the base features. It often gives good results.
• But the problem with the polynomial kernel is that the number of higher-dimensional features grows very quickly with the degree. As a result, it is computationally more expensive than the RBF or linear kernel.
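Its formula (standard, not on the slide) is

    K(\mathbf{x}, \mathbf{x}') = (\gamma\, \mathbf{x}^\top \mathbf{x}' + r)^d,

where d is the polynomial degree and r a constant offset.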
Advantages of SVM Algorithm
• It has a high level of accuracy.
• It works very well with limited datasets.
• Kernel SVM contains a non-linear transformation function to convert complicated non-linearly separable data into linearly separable data.
• It is effective on datasets that have many features.
• It is effective when the number of features is greater than the number of data points.
• It uses only a subset of the training points, the support vectors, in the decision function, making SVM memory efficient.
• Apart from the common kernels, it is also possible to specify custom kernels for the decision function (see the sketch below).
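As a minimal hedged sketch of the last point, scikit-learn's SVC accepts a callable that returns the Gram matrix (the kernel below is hypothetical, for illustration only):

import numpy as np
from sklearn.svm import SVC

def my_kernel(X, Y):
    # A hypothetical custom kernel: a scaled linear kernel.
    return 0.5 * np.dot(X, Y.T)

X = np.array([[0, 0], [1, 1], [4, 4], [5, 5]])
y = np.array([0, 0, 1, 1])
clf = SVC(kernel=my_kernel).fit(X, y)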
Disadvantages of SVM Algorithm
• It does not work well with larger datasets.
• Training time with SVMs can be high.
• If the number of features is significantly greater than the number of data points, it is crucial to avoid overfitting when choosing kernel functions and regularization terms.
• Probability estimates are not directly provided by SVMs; rather, they are calculated using an expensive five-fold cross-validation (see the sketch below).
• It works best on small sample sets due to its high training time.
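In scikit-learn, for example, probability estimates must be requested explicitly, which triggers that extra cross-validated calibration at fit time (illustrative data):

import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5],
              [4, 4], [5, 4], [4, 5], [5, 5], [4.5, 4.5]])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

clf = SVC(kernel="linear", probability=True).fit(X, y)  # slower fit: internal CV calibration
print(clf.predict_proba([[3, 3]]))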
Applications of SVM
SVM is mainly used to classify unseen data and has various applications in different fields:
• Face detection: classifies the regions of an image containing people's faces versus non-faces, drawing a square box around each face.
• Bioinformatics: support vector machines are used for gene classification, allowing researchers to differentiate between various proteins and to identify biological problems and cancer cells.
• Text categorization: used in training models that classify documents into different categories based on score, type, and other threshold values.
• Generalized Predictive Control (GPC): provides control over different industrial processes with a multivariable version and an interactor matrix. GPC is used in various industries such as cement mills, robotics, and spraying.
Practical Example (Linear SVM)

 X    Y    Class
 1    0    −ve
 0    1    −ve
 3    1    +ve
 0   −1    −ve
−1    0    −ve
 3   −1    +ve
 6    1    +ve
 6   −1    +ve
Example (cont.): the worked solution is shown graphically on the original slides; the figures were not extracted. A numerical verification follows below.
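For this data the maximum-margin boundary is the vertical line x = 2, with support vectors (1, 0), (3, 1), and (3, −1). A hedged sketch to verify this (a large C approximates the hard margin implied by the slides):

import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 0], [0, 1], [0, -1], [-1, 0],    # -ve class
              [3, 1], [3, -1], [6, 1], [6, -1]])   # +ve class
y = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)        # large C ~ hard margin
print(clf.support_vectors_)       # expect (1, 0), (3, 1), (3, -1)
print(clf.coef_, clf.intercept_)  # expect w ~ (1, 0), b ~ -2  ->  boundary x = 2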
Example (Nonlinear SVM)

 X    Y    Class
 2    2    +ve
 2   −2    +ve
−2   −2    +ve
−2    2    +ve
 1    1    −ve
 1   −1    −ve
−1   −1    −ve
−1    1    −ve
Example (cont.): the worked solution is shown graphically on the original slides; the figures were not extracted. A numerical verification follows below.
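With the projection z = x² + y² from earlier, every +ve point has z = 8 and every −ve point has z = 2, so a linear boundary at z = 5, i.e. the circle x² + y² = 5 in the original plane, separates the classes. A hedged sketch to verify this:

import numpy as np
from sklearn.svm import SVC

X = np.array([[2, 2], [2, -2], [-2, -2], [-2, 2],   # +ve: z = 8
              [1, 1], [1, -1], [-1, -1], [-1, 1]])  # -ve: z = 2
y = np.array([1, 1, 1, 1, -1, -1, -1, -1])

Z = np.c_[X, (X ** 2).sum(axis=1)]           # append z = x^2 + y^2
clf = SVC(kernel="linear", C=1e6).fit(Z, y)  # linearly separable in 3-D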