NITHISHKUMAR. S
RA2332241040023
DEPARTMENT OF MCA
1ST YEAR
TITLE : SUPPORT VECTOR MACHINE (SVM)
CONTENTS :
INTRODUCTION TO SVM
CLASSIFICATION OF SVM
USAGE OF SVM
IMPLEMENTATION
ADVANTAGES
DISADVANTAGES
CONCLUSION
SUPPORT VECTOR MACHINE (SVM) :
Support Vector Machine (SVM) is a supervised machine learning algorithm used for classification and regression tasks. Based on the kind of decision boundary it learns, SVM is classified into two types:
Linear SVM
Non-linear SVM
Linear SVM :
• Linear SVM can be used only when the data is perfectly linearly separable, meaning the data points can be classified into 2 classes by a single straight line (if 2D).
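Below is a minimal sketch of a Linear SVM in R; the e1071 package and the small made-up 2D dataset are assumptions for illustration, not part of the original.
library(e1071)
# Two linearly separable 2D classes: points around (0, 0) and around (3, 3)
set.seed(1)
df <- data.frame(
  x1 = c(rnorm(20, mean = 0), rnorm(20, mean = 3)),
  x2 = c(rnorm(20, mean = 0), rnorm(20, mean = 3)),
  y  = factor(rep(c("A", "B"), each = 20))
)
# A linear kernel separates the two classes with a single straight line
model <- svm(y ~ ., data = df, kernel = "linear")
table(predicted = predict(model, df), actual = df$y)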
Non-Linear SVM :
• Non-Linear SVM is used when the data is not linearly separable, that is, when the data points cannot be separated into 2 classes by a straight line (if 2D). In that case we use advanced techniques like the kernel trick to classify them. In most real-world applications we do not find linearly separable data points, hence we use the kernel trick to solve them.
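Below is a minimal sketch of the kernel trick in R, again assuming the e1071 package and made-up data that no straight line can separate (one class inside a circle, the other outside).
library(e1071)
set.seed(2)
x1 <- runif(200, -2, 2)
x2 <- runif(200, -2, 2)
y  <- factor(ifelse(x1^2 + x2^2 < 1, "inside", "outside"))
df <- data.frame(x1, x2, y)
# The radial (RBF) kernel implicitly maps the points into a higher-dimensional
# space where a linear separator exists
model <- svm(y ~ ., data = df, kernel = "radial")
mean(predict(model, df) == df$y)   # training accuracy, close to 1 on this data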
USAGE OF SVM :
• Handwriting recognition
• Handwriting recognition has been one of the most fascinating and challenging research areas of image processing and pattern recognition in recent years. It contributes enormously to the improvement of automation processes and upgrades the interface between man and machine in numerous applications, which include reading aids for the blind, library cataloguing, ledgering, processing of forms, cheques and faxes, and conversion of any handwritten document into editable text. As a result, off-line handwriting recognition continues to be an active area of research towards exploring innovative techniques to produce adequate accuracy. Even though sufficient studies have been performed on foreign scripts like Chinese, Japanese and Arabic characters, only a few works can be traced for handwritten character recognition of Indian scripts.
• Face detection
• Spam detection
• Spam SMS messages are created in such a way that it is hard for a normal person to detect whether they are real or spam. That's why most people get caught in these fraud schemes and lose their hard-earned money. It's not just about money; there are other types of fraud too that take place just through SMS.
IMPLEMENTATION OF SVM :
• Explore the data to figure out what it looks like
• Pre-process the data
Implementation code (a minimal sketch in R; the e1071 package and the built-in iris dataset are assumed here as example data):
# Load the e1071 package, which provides an SVM implementation
library(e1071)
# Example data (assumed): iris features as X, species labels as y
X <- iris[, 1:4]
y <- iris$Species
# Train-test split
set.seed(42)                                      # for reproducibility
train_indices <- sample(1:nrow(X), 0.7 * nrow(X)) # 70% for training
X_train <- X[train_indices, ]
y_train <- y[train_indices]
X_test <- X[-train_indices, ]
y_test <- y[-train_indices]
# Train the SVM model and predict on the held-out test set
model <- svm(X_train, y_train)
predictions <- predict(model, X_test)
# Calculate accuracy
accuracy <- mean(predictions == y_test)
cat("Accuracy:", accuracy, "\n")
Output :
The output of the code will display the accuracy of the SVM model on the test data. Here's what the output might look like:
Accuracy: 0.85
This indicates that the SVM model achieved an accuracy of 85% on the
test data, meaning it correctly classified 85% of the test samples.
ADVANTAGES OF SVM :
Support Vector Machines (SVMs) offer several advantages, making them
a popular choice for classification and regression tasks in machine
learning:
• Effective in High-Dimensional Spaces:
SVMs perform well even in high-dimensional spaces, making them
suitable for problems with many features, such as text classification,
image recognition, and genomics.
• Memory Efficient:
SVMs use a subset of training points (support vectors) in the decision
function, which makes them memory efficient, particularly when dealing
with large datasets.
• Versatile Kernels:
SVMs can use different kernel functions, such as linear, polynomial, radial basis function (RBF), and sigmoid, allowing them to handle complex decision boundaries and non-linear relationships between features (see the sketch after this list).
• Robust to Overfitting:
SVMs tend to generalize well even in cases where the number of
features exceeds the number of samples. This is because SVMs maximize
the margin between classes, which helps prevent overfitting.
• Effective in Data with Few Samples:
SVMs are effective even when the number of samples is less than the
number of features. This is particularly useful in scenarios where
collecting large amounts of labeled data is difficult or expensive.
• Global Optimum:
The objective function of SVMs leads to a convex optimization
problem, ensuring that the solution is the global optimum rather than a
local one.
• Tolerant to Noise:
Thanks to margin maximization and the soft-margin formulation, SVMs can tolerate a moderate amount of noise in the training data, which helps in better generalization.
• Regularization Parameter:
SVMs have a regularization parameter (C) that helps control the trade-
off between maximizing the margin and minimizing the classification
error, providing flexibility in model tuning.
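As referenced under Versatile Kernels above, here is a minimal sketch of comparing kernel functions and tuning the regularization parameter C (called cost in e1071); the e1071 package and the built-in iris dataset are assumptions for illustration.
library(e1071)
# Compare the four built-in kernel functions on the same data
for (k in c("linear", "polynomial", "radial", "sigmoid")) {
  model <- svm(Species ~ ., data = iris, kernel = k)
  cat(k, "training accuracy:", mean(predict(model, iris) == iris$Species), "\n")
}
# tune.svm() grid-searches the cost (C) parameter with cross-validation,
# trading off a wider margin (small C) against fewer training errors (large C)
tuned <- tune.svm(Species ~ ., data = iris, cost = c(0.1, 1, 10, 100))
summary(tuned)   # reports the best cost found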
DISADVANTAGES OF SVM :
While Support Vector Machines (SVMs) have many advantages, they also
come with certain limitations and disadvantages:
• Difficulty in Choosing Appropriate Kernel:
The performance of SVMs heavily depends on the choice of kernel
function and its parameters. Selecting the right kernel and tuning its
parameters can be challenging and often requires domain expertise or
extensive experimentation.
• Computational Complexity:
Training an SVM can be computationally intensive, especially for large
datasets. The time complexity of SVM training is generally between
O(n^2) and O(n^3), where n is the number of samples. This can make
SVMs less practical for very large datasets or real-time applications.
• Memory Intensive:
SVMs require storing all support vectors in memory, which can
become memory-intensive, especially when dealing with large datasets
with a high number of support vectors.
• Sensitivity to Noise:
SVMs are sensitive to noise in the training data, especially when using
complex kernel functions. Noisy data can lead to overfitting, reducing the
generalization performance of the model.
• Difficulty with Large-Scale and Streaming Data:
While SVMs perform well with small to medium-sized datasets, they
may not scale efficiently to very large datasets or streaming data due to
their computational complexity and memory requirements.
• Binary Classification:
SVMs inherently perform binary classification and need to be extended for multi-class classification tasks using techniques like one-vs-one or one-vs-all, which can increase complexity and computational overhead (see the sketch after this list).
• Interpretability:
SVMs typically provide a black-box model, meaning they offer limited
interpretability compared to simpler models like decision trees or logistic
regression. Understanding the decision boundaries learned by SVMs can
be challenging, especially with complex kernel functions.
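As noted under Binary Classification above, here is a minimal sketch of multi-class SVM in R; the e1071 package (whose svm() is based on libsvm and applies the one-vs-one extension internally) and the 3-class iris dataset are assumptions for illustration.
library(e1071)
# For k = 3 classes, libsvm trains k*(k-1)/2 = 3 one-vs-one binary classifiers
# behind the scenes and combines their votes into a single prediction
model <- svm(Species ~ ., data = iris)
table(predicted = predict(model, iris), actual = iris$Species)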
CONCLUSION :