Support Vector Machine (SVM) Algorithm - GeeksforGeeks

The Support Vector Machine (SVM) algorithm is a supervised machine learning technique used for classification and regression tasks, focusing on finding the optimal hyperplane to separate data points into different classes. It employs concepts like support vectors, margins, and kernel functions to handle both linearly and non-linearly separable data. SVM is advantageous for high-dimensional data and is robust to outliers, but it can be slow for large datasets and requires careful parameter tuning.

Uploaded by

Harikrishnan S
Support Vector Machine (SVM) Algorithm

Last Updated : 27 Jan, 2025

Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression tasks. While it can handle regression problems, SVM is particularly well-suited for classification tasks.

SVM aims to find the optimal hyperplane in an N-dimensional space to separate data points into different classes. The algorithm maximizes the margin between the closest points of different classes.

Support Vector Machine (SVM) Terminology


Hyperplane: A decision boundary separating different classes
in feature space, represented by the equation wx + b = 0 in
linear classification.
Support Vectors: The closest data points to the hyperplane,
crucial for determining the hyperplane and margin in SVM.
Margin: The distance between the hyperplane and the support
vectors. SVM aims to maximize this margin for better
classification performance.
Kernel: A function that maps data to a higher-dimensional
space, enabling SVM to handle non-linearly separable data.
Hard Margin: A maximum-margin hyperplane that perfectly
separates the data without misclassifications.
Soft Margin: Allows some misclassifications by introducing
slack variables, balancing margin maximization and
misclassification penalties when data is not perfectly separable.
C: A regularization term balancing margin maximization and
misclassification penalties. A higher C value enforces a stricter
penalty for misclassifications.
Hinge Loss: A loss function penalizing misclassified points or
margin violations, combined with regularization in SVM.
Dual Problem: Involves solving for Lagrange multipliers
associated with support vectors, facilitating the kernel trick and
efficient computation.

How does Support Vector Machine Algorithm Work?

The key idea behind the SVM algorithm is to find the hyperplane
that best separates two classes by maximizing the margin
between them. This margin is the distance from the hyperplane to
the nearest data points (support vectors) on each side.
Multiple hyperplanes separate the data from two classes

The best hyperplane, also known as the maximum-margin hyperplane (or the "hard margin" solution when the classes are perfectly separable), is the one that maximizes the distance between the hyperplane and the nearest data points from both classes. This ensures a clear separation between the classes. So, from the above figure, we choose L2 as the hard-margin hyperplane.

Let’s consider a scenario like shown below:


Selecting hyperplane for data with outlier

Here, we have one blue ball inside the region of the red balls.

How does SVM classify the data?

The blue ball among the red ones is an outlier of the blue class. A soft-margin SVM can treat such a point as a margin violation instead of bending the boundary around it, so it still finds the hyperplane that maximizes the margin for the remaining points. In this sense, SVM is fairly robust to isolated outliers.

Hyperplane which is the most optimized one

A soft margin allows for some misclassifications or violations of the margin to improve generalization. The SVM optimizes the following equation to balance margin maximization and penalty minimization:

$$\text{Objective Function} = \frac{1}{\text{margin}} + \lambda \sum \text{penalty}$$

The penalty used for violations is often hinge loss, which has the following behavior:
If a data point is correctly classified and lies on or outside the margin, there is no penalty (loss = 0).
If a point is incorrectly classified or falls inside the margin, the hinge loss increases in proportion to the size of the violation.
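
The behavior described above can be sketched in a few lines. Hinge loss is commonly written as max(0, 1 - y * f(x)) for a label y in {-1, +1} and a decision value f(x) = w·x + b. The labels and decision values below are hypothetical and chosen only to illustrate the two cases.

```python
import numpy as np

def hinge_loss(y, scores):
    """Per-sample hinge loss max(0, 1 - y * f(x)) for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * scores)

# Hypothetical labels and decision values f(x) = w.x + b (illustration only)
y = np.array([1, 1, -1, -1])
scores = np.array([2.0, 0.5, -3.0, 0.5])

losses = hinge_loss(y, scores)
print(losses)
```

The first and third points are on the correct side with y * f(x) >= 1, so they contribute no loss. The second point is correctly classified but inside the margin, and the fourth is misclassified, so both are penalized.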

Till now, we were talking about linearly separable data (the group of blue balls and red balls are separable by a straight line).

What to do if data are not linearly separable?

When data is not linearly separable (i.e., it can't be divided by a straight line), SVM uses a technique called kernels to map the data into a higher-dimensional space where it becomes separable. This transformation helps SVM find a decision boundary even for non-linear data.

Original 1D dataset for classification

A kernel is a function that maps data points into a higher-dimensional space without explicitly computing the coordinates in that space. This allows SVM to work efficiently with non-linear data by implicitly performing the mapping.
For example, consider data points that are not linearly separable.
By applying a kernel function, SVM transforms the data points into
a higher-dimensional space where they become linearly separable.

Linear Kernel: For linear separability.


Polynomial Kernel: Maps data into a polynomial space.
Radial Basis Function (RBF) Kernel: Transforms data into a
space based on distances between data points.
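
As a rough illustration of the three kernels listed above, the following sketch evaluates each one on a pair of vectors using their common textbook forms. The function names, the default values of degree, coef0 and gamma, and the sample vectors are assumptions made for this example.

```python
import numpy as np

def linear_kernel(x, z):
    # K(x, z) = x . z
    return float(np.dot(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    # K(x, z) = (x . z + coef0) ** degree
    return float((np.dot(x, z) + coef0) ** degree)

def rbf_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2), which depends only on the distance
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

# Hypothetical sample vectors
x = np.array([1.0, 2.0])
z = np.array([3.0, 0.0])

k_lin = linear_kernel(x, z)
k_poly = polynomial_kernel(x, z)
k_rbf = rbf_kernel(x, z)
print(k_lin, k_poly, k_rbf)
```

Each function returns a single similarity value, which is all the SVM needs: the higher-dimensional coordinates themselves are never computed.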


Mapping 1D data to 2D to become able to separate the two classes

In this case, the new variable y is created as a function of distance from the origin.
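
A minimal sketch of this idea, assuming the new variable is the squared distance from the origin (x squared) and using a small hypothetical 1D dataset in which the outer points belong to one class and the inner points to the other:

```python
import numpy as np

# Hypothetical 1D data: class 1 lies far from the origin, class 0 near it,
# so no single threshold on x separates the classes
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
labels = np.array([1, 1, 0, 0, 0, 1, 1])

# New feature: squared distance from the origin (an assumed choice of mapping)
y_new = x ** 2

# In the (x, y_new) plane a horizontal line now separates the classes
threshold = 2.5
predicted = (y_new > threshold).astype(int)
print(predicted)
```

The threshold of 2.5 is picked by hand for this toy data; an SVM would choose the separating line by maximizing the margin in the mapped space.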

Mathematical Computation: SVM


Consider a binary classification problem with two classes, labeled
as +1 and -1. We have a training dataset consisting of input
feature vectors X and their corresponding class labels Y.

The equation for the linear hyperplane can be written as:

$$w^T x + b = 0$$

Where:
w is the normal vector to the hyperplane (the direction
perpendicular to it).
b is the offset or bias term. It shifts the hyperplane away from the origin: the hyperplane lies at a signed distance of -b/||w|| from the origin along the normal vector w.

Distance from a Data Point to the Hyperplane

The distance between a data point x_i and the decision boundary can be calculated as:

$$d_i = \frac{w^T x_i + b}{\|w\|}$$

where ||w|| represents the Euclidean norm of the weight vector w.
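
The distance formula can be checked numerically. The sketch below uses a hypothetical weight vector and bias; the sign of the result indicates on which side of the hyperplane the point lies.

```python
import numpy as np

def signed_distance(w, b, x):
    """Signed distance (w.x + b) / ||w|| from point x to the hyperplane."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5
w = np.array([3.0, 4.0])
b = -5.0

d_point = signed_distance(w, b, np.array([3.0, 4.0]))   # (9 + 16 - 5) / 5
d_origin = signed_distance(w, b, np.array([0.0, 0.0]))  # -5 / 5
print(d_point, d_origin)
```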

Linear SVM Classifier

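The predicted label for a data point is given by:

$$\hat{y} = \begin{cases} 1 & \text{if } w^T x + b \geq 0 \\ 0 & \text{if } w^T x + b < 0 \end{cases}$$

Where $\hat{y}$ is the predicted label of a data point (1 for the positive class and 0 for the negative class, which corresponds to the label -1 in the optimization problem below).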


Optimization Problem for SVM

For a linearly separable dataset, the goal is to find the hyperplane that maximizes the margin between the two classes while ensuring that all data points are correctly classified. This leads to the following optimization problem:

$$\underset{w,b}{\text{minimize}} \quad \frac{1}{2}\|w\|^2$$

Subject to the constraint:

$$y_i (w^T x_i + b) \geq 1 \quad \text{for } i = 1, 2, 3, \dots, m$$

Where:

$y_i$ is the class label (+1 or -1) for each training instance.
$x_i$ is the feature vector for the i-th training instance.
m is the total number of training instances.

The condition $y_i (w^T x_i + b) \geq 1$ ensures that each data point is correctly classified and lies outside the margin.

Soft Margin Linear SVM Classifier

In the presence of outliers or non-separable data, the SVM allows some misclassification by introducing slack variables $\zeta_i$. The optimization problem is modified as:

$$\underset{w,b}{\text{minimize}} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \zeta_i$$

Subject to the constraints:

$$y_i (w^T x_i + b) \geq 1 - \zeta_i \quad \text{and} \quad \zeta_i \geq 0 \quad \text{for } i = 1, 2, \dots, m$$

Where:

C is a regularization parameter that controls the trade-off between margin maximization and penalty for misclassifications.
$\zeta_i$ are slack variables that represent the degree of violation of the margin by each data point.
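
For a fixed w and b, the smallest slack that satisfies both constraints is max(0, 1 - y_i (w^T x_i + b)), so the soft-margin objective can be evaluated directly. The following is a minimal sketch with hypothetical data and parameters; it only evaluates the objective and does not solve the optimization problem.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """Return (objective, slack) for the soft-margin primal at a given (w, b)."""
    margins = y * (X @ w + b)
    # Smallest slack satisfying y_i (w.x_i + b) >= 1 - slack_i and slack_i >= 0
    slack = np.maximum(0.0, 1.0 - margins)
    objective = 0.5 * np.dot(w, w) + C * slack.sum()
    return objective, slack

# Hypothetical data: the last point sits on the wrong side of the boundary
X = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1, 1, -1, -1])
w = np.array([1.0, 0.0])
b = 0.0

objective, slack = soft_margin_objective(w, b, X, y, C=1.0)
print(objective, slack)
```

Raising C makes each unit of slack more expensive, which pushes the optimizer toward fewer margin violations at the cost of a narrower margin.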

Dual Problem for SVM

The dual problem involves maximizing an objective over the Lagrange multipliers associated with the training samples. This transformation allows solving the SVM optimization using kernel functions for non-linear classification.

The dual objective function is given by:

$$\underset{\alpha}{\text{maximize}} \quad \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j t_i t_j K(x_i, x_j)$$

subject to $0 \leq \alpha_i \leq C$ and $\sum_{i=1}^{m} \alpha_i t_i = 0$. Equivalently, one can minimize $\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j t_i t_j K(x_i, x_j) - \sum_{i=1}^{m} \alpha_i$ under the same constraints.

Where:

$\alpha_i$ are the Lagrange multipliers associated with the i-th training sample.
$t_i$ is the class label for the i-th training sample (+1 or -1), written $y_i$ in the primal problem above.
$K(x_i, x_j)$ is the kernel function that computes the similarity between data points $x_i$ and $x_j$. The kernel allows SVM to handle non-linear classification problems by mapping data into a higher-dimensional space.

The dual formulation optimizes the Lagrange multipliers $\alpha_i$, and the support vectors are those training samples where $\alpha_i > 0$.

SVM Decision Boundary


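Once the dual problem is solved, the decision function is given by:

$$f(x) = \sum_{i=1}^{m} \alpha_i t_i K(x_i, x) + b$$

Where x is the test data point and b is the bias term. The predicted class is the sign of f(x), and the decision boundary is the set of points where f(x) = 0. For a linear kernel, this corresponds to the weight vector $w = \sum_{i=1}^{m} \alpha_i t_i x_i$.

Finally, the bias term b is determined by the support vectors that lie exactly on the margin, which satisfy:

$$t_i (w^T x_i + b) = 1 \quad \Rightarrow \quad b = t_i - w^T x_i$$

Where $x_i$ is any such support vector.
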


This completes the mathematical framework of the Support Vector Machine algorithm, which allows for both linear and non-linear classification using the dual problem and kernel trick.

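
To connect the dual formulation to a fitted model, the sketch below rebuilds the decision values of a scikit-learn SVC from its support vectors. It relies on the fitted attributes support_vectors_, dual_coef_ (the products of the multipliers and labels for the support vectors) and intercept_. The synthetic dataset and the parameter values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic binary dataset (hypothetical, for illustration only)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)

# Kernel values between every support vector and every sample: (n_SV, n_samples)
K = rbf_kernel(clf.support_vectors_, X, gamma=gamma)

# Sum over support vectors of (alpha_i * t_i) * K(x_i, x), plus the bias term
manual = (clf.dual_coef_ @ K + clf.intercept_).ravel()

matches = np.allclose(manual, clf.decision_function(X), atol=1e-6)
print(matches, clf.support_vectors_.shape[0])
```

Only the support vectors enter the sum, which is why the remaining training points can be discarded after fitting.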
Types of Support Vector Machine
Based on the nature of the decision boundary, Support Vector Machines (SVM) can be divided into two main types:

Linear SVM: Linear SVMs use a linear decision boundary to separate the data points of different classes. When the data can be precisely linearly separated, linear SVMs are very suitable. This means that a single straight line (in 2D) or a hyperplane (in higher dimensions) can entirely divide the data points into their respective classes. A hyperplane that maximizes the margin between the classes is the decision boundary.

Non-Linear SVM: Non-Linear SVM can be used to classify data when it cannot be separated into two classes by a straight line (in the case of 2D). By using kernel functions, nonlinear SVMs can handle nonlinearly separable data. The original input data is transformed by these kernel functions into a higher-dimensional feature space, where the data points can be linearly separated. A linear SVM is then fitted in this transformed space, which yields a nonlinear decision boundary in the original input space.
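
The difference between the two types can be seen on data that no straight line can separate. The sketch below is an illustrative comparison, assuming scikit-learn's make_circles generator with hand-picked settings; it is not a benchmark.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: a classic non-linearly separable dataset
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Same data, two kernels
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)

print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"RBF kernel accuracy:    {rbf_acc:.2f}")
```

On ring-shaped data like this, the RBF kernel is expected to score well above the linear kernel, since the linear model can only draw a straight line through the rings.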

Implementing SVM Algorithm in Python


Predict whether a tumor is benign or malignant. Historical data about patients diagnosed with cancer, described by a set of independent attributes, enables doctors to differentiate malignant cases from benign ones.

Load the breast cancer dataset from sklearn.datasets.
Separate input features and target variables.
Build and train the SVM classifier using the RBF kernel.
Plot the decision boundary and a scatter plot of the input features.

# Load the required packages
from sklearn.datasets import load_breast_cancer
import matplotlib.pyplot as plt
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.svm import SVC

# Load the dataset and keep only the first two features so that
# the decision boundary can be drawn in 2D
cancer = load_breast_cancer()
X = cancer.data[:, :2]
y = cancer.target

# Build the model
svm = SVC(kernel="rbf", gamma=0.5, C=1.0)
# Train the model
svm.fit(X, y)

# Plot the decision boundary
DecisionBoundaryDisplay.from_estimator(
    svm,
    X,
    response_method="predict",
    cmap=plt.cm.Spectral,
    alpha=0.8,
    xlabel=cancer.feature_names[0],
    ylabel=cancer.feature_names[1],
)

# Scatter plot of the training points on top of the boundary
plt.scatter(X[:, 0], X[:, 1], c=y, s=20, edgecolors="k")
plt.show()

Output:
Breast Cancer Classifications with SVM RBF kernel

Advantages of Support Vector Machine (SVM)


1. High-Dimensional Performance: SVM excels in high-
dimensional spaces, making it suitable for image classification
and gene expression analysis.
2. Nonlinear Capability: Utilizing kernel functions like RBF and
polynomial, SVM effectively handles nonlinear relationships.
3. Outlier Resilience: The soft margin feature allows SVM to
ignore outliers, enhancing robustness in spam detection and
anomaly detection.
4. Binary and Multiclass Support: SVM is effective for both
binary classification and multiclass classification, suitable for
applications in text classification.
5. Memory Efficiency: SVM focuses on support vectors, making it
memory efficient compared to other algorithms.

Disadvantages of Support Vector Machine (SVM)


1. Slow Training: SVM can be slow to train on large datasets, which limits its use in large-scale data mining tasks.
2. Parameter Tuning Difficulty: Selecting the right kernel and adjusting parameters like C requires careful tuning, and poor choices can noticeably degrade model performance.
3. Noise Sensitivity: SVM struggles with noisy datasets and
overlapping classes, limiting effectiveness in real-world
scenarios.
4. Limited Interpretability: The complexity of the hyperplane in
higher dimensions makes SVM less interpretable than other
models.
5. Feature Scaling Sensitivity: Proper feature scaling is essential;
otherwise, SVM models may perform poorly.
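
One common way to handle the feature scaling point is to standardize the features inside a pipeline, so the scaler is fitted on the training split only. The sketch below assumes scikit-learn's StandardScaler and make_pipeline and uses all features of the breast cancer dataset; the split and parameter values are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling and the classifier are fitted together on the training data only
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

scaled_acc = model.score(X_test, y_test)
print(f"test accuracy with scaling: {scaled_acc:.3f}")
```

Wrapping both steps in one pipeline also keeps cross-validation and parameter searches free of leakage from the held-out data.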


Support Vector Machine (SVM) Algorithm- FAQs

How does SVM work in machine learning?

SVM works by finding the maximum-margin hyperplane that best separates the data points of different classes. It uses support vectors, which are the closest data points to the hyperplane, to define this boundary.

What are the key advantages of using SVM in machine learning?

SVMs are effective for high-dimensional data, robust to outliers, and versatile due to kernel functions, allowing them to handle both linear and nonlinear relationships.

What is the difference between hard margin and soft margin SVM?

A hard margin SVM perfectly separates classes without misclassification, while a soft margin SVM allows some misclassifications to better accommodate outliers, balancing the margin and penalties.

What types of kernel functions are used in SVM?

Common kernel functions in SVM include linear, polynomial, radial basis function (RBF), and sigmoid, each mapping input data into higher-dimensional spaces for better separation.

When should I use SVM in data mining?

Use SVM in data mining when dealing with complex datasets, especially when you need to classify data with high dimensions, non-linear boundaries, or when robustness to outliers is important.
