SVM SLIDES
Dr.G.JOHN BABU
November 3, 2024
Optimal separation aims to find the best hyperplane that maximizes the
margin between two classes, improving the generalization of the classifier.
A larger margin reduces the likelihood of misclassification for new
data points.
Support vectors, the training points closest to the hyperplane, are the
only points that determine the optimal hyperplane.
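\[
\begin{aligned}
\text{minimize} \quad & \tfrac{1}{2}\|w\|^2 \\
\text{subject to} \quad & y_i (w \cdot x_i + b) \ge 1, \quad \forall i
\end{aligned}
\]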
where:
$w$ and $b$ define the hyperplane $w \cdot x + b = 0$,
$y_i \in \{-1, +1\}$ denotes the class label of $x_i$, so that
$w \cdot x_i + b$ is positive for points of one class and negative for
points of the other.
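The constraints and objective above can be checked numerically for a candidate hyperplane. The following is a minimal sketch, not a solver: the toy data and the hand-picked values of `w` and `b` are assumptions made for illustration, and the margin width $2/\|w\|$ is the standard result for this formulation rather than something stated on the slide.

```python
import math

def functional_margins(w, b, X, y):
    """Return y_i * (w . x_i + b) for every training point."""
    return [yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            for xi, yi in zip(X, y)]

# Hypothetical toy data with two linearly separable classes
X = [(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -1.5)]
y = [1, 1, -1, -1]

# Hypothetical candidate hyperplane (chosen by hand, not fitted)
w, b = (0.5, 0.5), 0.0

margins = functional_margins(w, b, X, y)

# The candidate is feasible if every constraint y_i (w . x_i + b) >= 1 holds
feasible = all(m >= 1 - 1e-12 for m in margins)

# Points that meet the constraint with equality are the support vectors
support = [xi for xi, m in zip(X, margins) if abs(m - 1) < 1e-12]

objective = 0.5 * sum(wj * wj for wj in w)               # (1/2) ||w||^2
margin_width = 2 / math.sqrt(sum(wj * wj for wj in w))   # 2 / ||w||

print(margins, feasible, support, objective, margin_width)
```

Only the two points nearest the hyperplane satisfy the constraint with equality; moving the other points farther away would not change this candidate's feasibility or objective value.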
\[
L(x, y, \lambda) = xy + \lambda (10 - x - y)
\]
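The Lagrangian above has the form that arises from a constrained problem such as maximizing $xy$ subject to $x + y = 10$. That reading is an assumption, since the problem statement itself is not shown here. Under it, setting the partial derivatives to zero gives the stationary point:

```latex
\[
\begin{aligned}
\frac{\partial L}{\partial x} &= y - \lambda = 0, \\
\frac{\partial L}{\partial y} &= x - \lambda = 0, \\
\frac{\partial L}{\partial \lambda} &= 10 - x - y = 0
\end{aligned}
\quad \Longrightarrow \quad
x = y = \lambda = 5, \qquad xy = 25.
\]
```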
\[
t_i (w^T x_i + b) \ge 1
\]
\[
\frac{\partial L}{\partial b} = -\sum_{i=1}^{n} \lambda_i t_i .
\]
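The derivative above depends on the primal Lagrangian, which is not shown here. As a sketch, assuming the standard form with one multiplier $\lambda_i \ge 0$ per constraint $t_i (w^T x_i + b) \ge 1$, the two stationarity conditions are:

```latex
\[
L(w, b, \lambda) = \frac{1}{2}\|w\|^2
  - \sum_{i=1}^{n} \lambda_i \left[ t_i (w^T x_i + b) - 1 \right]
\]
\[
\frac{\partial L}{\partial w} = w - \sum_{i=1}^{n} \lambda_i t_i x_i = 0
\quad \Longrightarrow \quad
w^* = \sum_{i=1}^{n} \lambda_i t_i x_i
\]
\[
\frac{\partial L}{\partial b} = -\sum_{i=1}^{n} \lambda_i t_i = 0
\quad \Longrightarrow \quad
\sum_{i=1}^{n} \lambda_i t_i = 0
\]
```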
Substituting these values into the Lagrangian function yields the dual
problem, where we aim to maximize the following with respect to $\lambda_i$:
\[
L(w^*, b^*, \lambda) = \sum_{i=1}^{n} \lambda_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \lambda_i \lambda_j t_i t_j x_i^T x_j,
\]
subject to $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i t_i = 0$.
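The dual objective can be evaluated directly on a small case. The sketch below uses a hypothetical two-point dataset and a coarse grid search instead of a quadratic programming solver. It also assumes the standard relation $w = \sum_i \lambda_i t_i x_i$ (obtained by setting the derivative of the Lagrangian with respect to $w$ to zero) to recover the weights.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dual_objective(lam, t, X):
    """Sum_i lam_i - 1/2 Sum_i Sum_j lam_i lam_j t_i t_j x_i^T x_j."""
    n = len(lam)
    quad = sum(lam[i] * lam[j] * t[i] * t[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(lam) - 0.5 * quad

# Hypothetical two-point dataset, one point per class
X = [(1.0, 1.0), (-1.0, -1.0)]
t = [1, -1]

# The constraint sum_i lam_i t_i = 0 forces lam_1 = lam_2 = a here,
# so it is enough to scan a single non-negative value a.
grid = [k / 100 for k in range(0, 101)]
best_a = max(grid, key=lambda a: dual_objective([a, a], t, X))
lam = [best_a, best_a]

# Recover the primal weights from w = sum_i lam_i t_i x_i (assumed relation)
w = [sum(lam[i] * t[i] * X[i][d] for i in range(len(X))) for d in range(2)]

print(best_a, dual_objective(lam, t, X), w)
```

For this dataset the objective reduces to $2a - 4a^2$, so the grid search lands on $a = 0.25$. A real problem with more points needs a proper constrained optimizer.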
The kernel trick in SVM is used to handle data that is not linearly
separable by transforming it into a higher-dimensional space.
Instead of computing this transformation directly, the kernel trick
allows us to compute the inner product of the transformed vectors from
the original inputs, through a kernel function
$K(x, y) = \phi(x) \cdot \phi(y)$.
This avoids the cost of constructing the high-dimensional features, so
training stays efficient even when the mapping is complex.
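A small numerical check makes the idea concrete. The sketch below assumes a 2-D input and an illustrative feature map `phi` (a choice made for this example, not taken from the slides) whose inner product equals the kernel $(x \cdot y)^2$ evaluated in the original space.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def kernel(x, y):
    """The same inner product, computed in the original 2-D space."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, -1.0)

explicit = dot(phi(x), phi(y))   # map first, then take the inner product
implicit = kernel(x, y)          # never builds phi(x) or phi(y)

print(explicit, implicit)
```

The two values agree up to floating-point rounding. The kernel needs one 2-D dot product, while the explicit route builds a 3-D vector for each input, and the gap grows quickly with the input dimension and the degree.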
The polynomial kernel function can map input data into polynomial
feature space:
\[
K(x, y) = (x \cdot y + c)^d
\]
where:
$d$ is the degree of the polynomial.
$c \ge 0$ is a constant that trades off the influence of higher-order
terms against lower-order terms.
By applying a polynomial kernel, we can capture interactions of
features up to the d-th degree, helping classify data that has
non-linear relationships.
Example: A polynomial kernel of degree 2 can separate data that
requires a quadratic boundary.
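The degree-2 case can be sketched on 1-D data that no single threshold on $x$ separates. The data and the decision function below are hypothetical: the coefficients are picked by hand to show the shape of a kernel decision rule and are not the result of SVM training.

```python
def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d."""
    return (sum(a * b for a, b in zip(x, y)) + c) ** d

# Hypothetical 1-D data: the positive class lies on both sides of the
# negative class, so no single threshold on x separates them.
X = [(-2.0,), (-1.5,), (-0.5,), (0.0,), (0.5,), (1.5,), (2.0,)]
labels = [1, 1, -1, -1, -1, 1, 1]

def decision(x):
    # Hand-picked coefficients (not fitted by an SVM):
    # f(x) = K((1,), x) - 1, which equals x^2 - 1 when c = 0 and d = 2.
    return poly_kernel((1.0,), x, c=0.0, d=2) - 1.0

predicted = [1 if decision(x) > 0 else -1 for x in X]
print(predicted == labels)
```

The decision function is quadratic in $x$, so its sign changes at $x = \pm 1$, which is exactly the kind of boundary a linear function of $x$ cannot produce.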
The sigmoid (hyperbolic tangent) kernel:
\[
K(x, y) = \tanh(\alpha (x \cdot y) + c)
\]
The RBF kernel (or Gaussian kernel) is widely used for non-linear
classification:
\[
K(x, y) = \exp\left( -\frac{\|x - y\|^2}{2\sigma^2} \right)
\]
where σ determines the spread of the kernel.
The kernel measures similarity through the distance between points:
closer points have higher similarity.
Example: For data forming concentric circles, the RBF kernel allows
SVM to classify these clusters by mapping them into separable
regions in the transformed space.
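The concentric-circles case can be illustrated with a single kernel feature. This is a simplified sketch: the circle radii, `sigma`, and the threshold are assumptions, and the reference point is the common centre, whereas a trained SVM would combine kernel values against its support vectors.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """RBF kernel K(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def circle(radius, n=8):
    """n evenly spaced points on a circle centred at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Hypothetical classes: an inner circle and an outer circle
inner, outer = circle(1.0), circle(3.0)
center = (0.0, 0.0)

# Similarity to the centre depends only on the distance from it
inner_sim = [rbf_kernel(p, center) for p in inner]
outer_sim = [rbf_kernel(p, center) for p in outer]

# A single threshold on this one kernel feature separates the circles
threshold = 0.1
separable = min(inner_sim) > threshold > max(outer_sim)
print(separable)
```

No straight line in the original plane separates the two circles, but in terms of kernel similarity to the centre the classes fall on opposite sides of one threshold.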