Support Vector Machines
Support Vector Machines (SVMs) are supervised learning algorithms used primarily for
classification tasks, although they can also be applied to regression. The core idea behind SVM is
to find the optimal hyperplane that separates data points of different classes in a
high-dimensional space.
Terminology
Hyperplane: A hyperplane is a decision boundary that separates different classes in the feature
space. In two dimensions it is a line; in three dimensions it is a plane; and in an N-dimensional
feature space it is an (N-1)-dimensional subspace.
Support Vectors: These are the data points that lie closest to the hyperplane and directly
influence its position and orientation. They are crucial for defining the optimal hyperplane.
Margin: The margin is the distance between the hyperplane and the nearest data points from
either class. SVM aims to maximize this margin to improve generalization and minimize
classification errors.
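For reference, a standard result (assumed here, not derived above): if the closest points on either side of the hyperplane satisfy |w⋅x + b| = 1, the margin width is
margin = 2 / ‖w‖
so maximizing the margin amounts to minimizing ‖w‖.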
Support vector classifier
Support Vector Classifier (SVC) is a machine learning algorithm that belongs to the Support
Vector Machines (SVM) family. It is primarily used for classification tasks, aiming to find a
hyperplane that best separates data points of different classes in a high-dimensional space.
SVC is capable of handling both linear and non-linear classification through the use of various
kernel functions, allowing it to project data into higher dimensions to find optimal decision
boundaries.
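As a minimal sketch of SVC in practice, assuming scikit-learn is available (the dataset, kernel, and C value below are illustrative choices, not prescriptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# An RBF kernel lets SVC learn a non-linear decision boundary.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)
print("accuracy:", clf.score(X_test, y_test))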
Training Phase
Finding the Hyperplane: The core task is to find the optimal hyperplane that separates the different
classes. The SVM algorithm:
Maximizes the Margin: It identifies the hyperplane that maximizes the distance (margin) between the
closest points (support vectors) of each class. The support vectors are the data points that lie closest
to the decision boundary.
Optimization Problem: This process involves solving a constrained optimization problem, which can be
formulated mathematically. The SVM tries to minimize a loss function while ensuring that the data
points are classified correctly.
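Concretely, the standard soft-margin formulation (the usual textbook form, assumed here) is:
minimize (1/2)‖w‖² + C · Σ_i ξ_i
subject to y_i (w⋅x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0 for every training point i,
where the slack variables ξ_i measure how far a point strays past the margin and C controls the trade-off between a wide margin and classification errors.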
Predicting Phase
Applying the Hyperplane: The model uses the hyperplane derived during the training phase to make
predictions:
Each new data point is evaluated in relation to the hyperplane. Depending on which side of the
hyperplane the point falls on, it is assigned a class label (for binary classification).
For multi-class classification, SVM combines the decision functions of several hyperplanes (e.g.,
via one-vs-one or one-vs-rest strategies) to determine the final class.
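A small self-contained sketch of this phase with scikit-learn (the data points are made up for illustration):

import numpy as np
from sklearn.svm import SVC

# Two tiny, made-up clusters for a binary problem.
X = np.array([[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear").fit(X, y)

new_point = np.array([[3, 3]])
# Signed distance to the hyperplane; its sign picks the side, hence the class.
print(clf.decision_function(new_point))
print(clf.predict(new_point))

Note that scikit-learn's SVC handles multi-class problems with one-vs-one internally; setting decision_function_shape="ovr" makes decision_function return one-vs-rest-shaped scores.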
Summary
In the training phase, SVM learns the optimal hyperplane and parameters by maximizing the
margin between classes based on the training data.
In the predicting phase, SVM classifies new data points by evaluating their position relative to
the learned hyperplane.
Equation of the Hyperplane: The SVM finds a hyperplane defined by the equation w⋅x + b = 0.
Here, w represents the weights (or coefficients) learned during training, x is the input feature
vector, and b is the bias term.
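A tiny worked example of evaluating this equation (w, b, and x below are made-up numbers, not learned values):

import numpy as np

w = np.array([2.0, -1.0])        # hypothetical learned weights
b = -0.5                         # hypothetical bias term
x = np.array([1.0, 3.0])         # a new input point

score = np.dot(w, x) + b         # 2*1 + (-1)*3 - 0.5 = -1.5
label = 1 if score >= 0 else -1  # the sign of w.x + b assigns the class
print(score, label)              # -1.5 -1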
How does the model know where the output should fall?
Kernel Trick: The overall technique of using kernel functions to enable efficient computation in
high-dimensional spaces without explicitly transforming the data.
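As a sketch of the idea, the RBF kernel below computes a similarity that corresponds to an inner product in an implicit (infinite-dimensional) feature space, without ever constructing that space (gamma and the inputs are illustrative):

import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # Computes exp(-gamma * ||x - z||^2): an inner product in an implicit
    # feature space, evaluated directly on the original vectors.
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([2.0, 0.0])
print(rbf_kernel(x, z))  # similarity without ever building the feature map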
Support vector regressor
Support Vector Regression (SVR) is a type of machine learning algorithm that uses the principles of
Support Vector Machines (SVMs) for regression tasks. Unlike traditional regression methods that
aim to minimize the error for every data point, SVR focuses on fitting a function that predicts the
target values within a specified margin of tolerance (ϵ).
Training Phase
Fit a regression line (or curve for nonlinear SVR) that predicts the output values while keeping
most of the data points within a margin of tolerance (ϵ).
A tolerance band is created around the line where data points are considered "close enough" to the
prediction. Points outside this margin are penalized.
Only the data points that lie on or outside the margin influence the final regression line. These are
called support vectors.
The model learns the best line (or curve) that minimizes errors while being robust to small deviations
within the margin.
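This penalty is usually formalized as the ϵ-insensitive loss (the standard SVR loss, assumed here): deviations inside the tube cost nothing, and deviations beyond it grow linearly:
L_ϵ(y, f(x)) = max(0, |y − f(x)| − ϵ)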
Predicting Phase
The trained model uses the regression line (or curve) to predict target values for new input data;
for linear SVR this is the function f(x) = w⋅x + b.
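A minimal SVR sketch with scikit-learn (the data and hyperparameters are illustrative):

import numpy as np
from sklearn.svm import SVR

# Noisy linear data, made up for illustration.
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, 40)).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=40)

# epsilon is the half-width of the tolerance tube around the fit.
svr = SVR(kernel="linear", C=1.0, epsilon=0.5).fit(X, y)
print(svr.predict(np.array([[5.0]])))  # prediction via f(x) = w.x + b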
How Does ϵ Affect the Model?
Small ϵ:
Tight margin around the regression line.
Only very small deviations from the predicted line are acceptable.
More support vectors are used to define the model.
Risk of overfitting if ϵ is too small.
Large ϵ:
Wide margin around the regression line.
Larger deviations are acceptable, resulting in fewer support vectors.
Risk of underfitting if ϵ is too large.
Optimal ϵ:
The choice of ϵ should balance accuracy (fit the data well) and generalization (perform
well on unseen data).
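One way to see this trade-off empirically (same illustrative setup as the sketch above): fit SVR with several ϵ values and count the support vectors:

import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, 40)).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=40)

for eps in (0.05, 0.5, 2.0):
    svr = SVR(kernel="linear", C=1.0, epsilon=eps).fit(X, y)
    # A tighter tube leaves more points outside it, so more support vectors.
    print(f"epsilon={eps}: {len(svr.support_)} support vectors")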