
Support Vector Machine – Regressor

Unit 4
Debangshu Chatterjee

Support Vector Machine (SVM)
Hyperparameters of the Support Vector Machine

• Kernel: A kernel helps us find a hyperplane in the higher-dimensional space without increasing the computational cost. Usually, the computational cost will increase if the dimension of the data increases. This increase in dimension is required when we are unable to find a separating hyperplane in a given dimension and are required to move to a higher dimension.
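To make the kernel idea concrete, here is a minimal sketch assuming scikit-learn's SVR (the sine-wave data and parameter values are illustrative, not from these slides). A linear kernel cannot follow a curved target, while the RBF kernel fits it by implicitly working in a higher-dimensional feature space without ever computing that mapping:

```python
# Minimal sketch: the effect of the kernel hyperparameter in SVR.
# Assumes scikit-learn; data and parameter values are illustrative.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 80)   # nonlinear target

# The linear kernel cannot follow the sine curve; the RBF kernel can,
# because it works in a higher-dimensional space via the kernel trick.
for kernel in ("linear", "rbf"):
    model = SVR(kernel=kernel, C=1.0, epsilon=0.1).fit(X, y)
    print(kernel, "R^2:", round(model.score(X, y), 3))
```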
Hyperparameters of the Support Vector Machine

• Hyperplane: This is basically the separating line between two data classes in SVM. In Support Vector Regression, however, this is the line that will be used to predict the continuous output.

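A small sketch of this point, again assuming scikit-learn (the data is synthetic): for a linear-kernel SVR, the fitted hyperplane y = wx + b is exactly the line used to predict the continuous output, with w and b exposed as coef_ and intercept_:

```python
# Sketch: in linear SVR, the fitted hyperplane y = wx + b *is* the predictor.
# Assumes scikit-learn's SVR; synthetic data, illustrative parameters.
import numpy as np
from sklearn.svm import SVR

X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 2.0

model = SVR(kernel="linear", C=100.0, epsilon=0.01).fit(X, y)
w, b = model.coef_.ravel()[0], model.intercept_[0]

# predict() and the explicit hyperplane equation give the same output.
manual = w * X.ravel() + b
assert np.allclose(model.predict(X), manual)
print(f"learned hyperplane: y = {w:.2f}x + {b:.2f}")
```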
Hyperparameters of the Support Vector Machine

• Decision Boundary: A decision boundary can be thought of as a demarcation line (for simplification): positive examples lie on one side of it and negative examples on the other. Examples on the line itself may be classified as either positive or negative. This same concept carries over to Support Vector Regression as well.
Support Vector Regression
• The problem of regression is to find a function that approximates the mapping from an input domain to real numbers on the basis of a training sample.

• In the usual SVR illustration, two red lines mark the decision boundary and a green line marks the hyperplane. Our objective in SVR is to consider only the points that lie within the decision boundary. The best-fit line is the hyperplane that contains the maximum number of points within that boundary.
Support Vector Regression
• The first thing to understand is the decision boundary. Consider two lines drawn at some distance, say 'a', from the hyperplane: one at distance '+a' and one at '-a'. This 'a' is what the literature refers to as epsilon.

• y = wx + b (equation of the hyperplane)

• The equations of the decision boundary then become:

• wx + b = +a
• wx + b = -a

• Thus, any hyperplane that satisfies our SVR should satisfy:

• -a < y - (wx + b) < +a

• Hence, we take only those points that are within the decision boundary and have the least error rate, i.e. that lie within the margin of tolerance. This gives us a better-fitting model.
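These boundary conditions are easy to check numerically. A minimal sketch (the values of w, b, a and the data points are illustrative, not from the slides) that tests which points satisfy -a < y - (wx + b) < +a:

```python
# Sketch: which points fall inside the margin of tolerance
# -a < y - (wx + b) < +a?  All values here are illustrative.
import numpy as np

w, b, a = 2.0, 1.0, 0.5           # hyperplane y = wx + b, tolerance 'a' (epsilon)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 3.4, 4.8, 7.1, 8.2])

residual = y - (w * x + b)        # signed deviation from the hyperplane
inside = np.abs(residual) < a     # True for points within the boundary
print("residuals:", residual)     # [ 0.1  0.4 -0.2  0.1 -0.8]
print("inside tube:", inside)     # the last point violates the boundary
```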
Advantages of Support Vector Regression

Although Support Vector Regression is used comparatively rarely, it carries certain advantages, as mentioned below:
• It is robust to outliers.
• The decision model can be easily updated.
• It has excellent generalization capability, with high prediction accuracy.
• Its implementation is easy.

Disadvantages of Support Vector Regression

• SVR is not suitable for large datasets.
• In cases where the number of features for each data point exceeds the number of training samples, the SVM will underperform.
• The decision model does not perform well when the dataset has more noise, i.e. when the target classes overlap.

R-squared score
• R-squared, also known as the coefficient of determination, is a statistical measure that represents the proportion of the variance in the dependent variable that is explained by the independent variable(s). It is commonly used to evaluate the goodness of fit of a regression model.

• An R-squared score of 1 indicates a perfect fit of the model to the data, and 0 indicates that the model explains none of the variance (it does no better than predicting the mean). A negative R-squared score indicates that the model fits worse than simply predicting the mean of the dependent variable.

• The R-squared score is calculated as the ratio of the explained variance to the total variance:

• R-squared = 1 - (SS_res / SS_tot)

• where SS_res is the residual sum of squares (the sum of squared differences between the actual and predicted values) and SS_tot is the total sum of squares (the sum of squared differences between the actual values and the mean of the dependent variable).

• A higher R-squared score indicates a better fit of the model to the data. Note, however, that a high R-squared score does not necessarily mean the model is the best possible model for the data; it should be used in conjunction with other metrics and validation techniques.
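A short sketch computing R-squared directly from this definition and checking it against scikit-learn's r2_score (the actual and predicted values are illustrative):

```python
# Sketch: R^2 = 1 - SS_res / SS_tot, checked against scikit-learn.
# The actual/predicted values are illustrative.
import numpy as np
from sklearn.metrics import r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.3, 6.9, 9.4])

ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot

assert np.isclose(r2, r2_score(y_true, y_pred))
print("R-squared:", round(r2, 4))                # 0.985
```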
Mean squared error
• Mean squared error (MSE) is a statistical metric that measures the average squared difference between the predicted and actual values of a regression model, and it is commonly used to evaluate a regression model's accuracy.

• MSE is calculated by taking the average of the squared differences between the predicted and actual values:

• MSE = 1/n * Σ(yi - ŷi)²

• where yi is the actual value, ŷi is the predicted value, and n is the number of samples.

• MSE is always non-negative, with values closer to zero indicating better model performance. A perfect prediction
would result in an MSE of zero.

• However, MSE is sensitive to outliers, and it is not always easy to interpret the value because it depends on the
scale of the data. Therefore, it is often used in conjunction with other evaluation metrics to get a more complete
picture of the model's performance.

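A short sketch computing MSE from this formula and checking it against scikit-learn's mean_squared_error (the values are illustrative, reusing the numbers from the R-squared example):

```python
# Sketch: MSE = (1/n) * sum((y_i - yhat_i)^2), checked against scikit-learn.
# Values are illustrative.
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.3, 6.9, 9.4])

mse = np.mean((y_true - y_pred) ** 2)   # average squared difference
assert np.isclose(mse, mean_squared_error(y_true, y_pred))
print("MSE:", round(mse, 4))            # 0.075
```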