Understanding One-Class Support Vector Machines
Last Updated: 24 Apr, 2025
Support Vector Machine (SVM) is a popular supervised machine learning algorithm used for both classification and regression. In this article, we will discuss the One-Class Support Vector Machine model.
One-Class Support Vector Machines
One-Class Support Vector Machine is a special variant of Support Vector Machine that is primarily designed for outlier, anomaly, or novelty detection. The objective behind using one-class SVM is to identify instances that deviate significantly from the norm. Unlike other traditional Machine Learning models, one-class SVM is not used to perform binary or multiclass classification tasks but to detect outliers or novelties within the dataset. Some of the key working principles of one-class SVM are discussed below.
- Outlier Boundary: One-Class SVM operates by defining a boundary around the majority class (normal instances) in the feature space. This boundary is constructed to encapsulate the normal data points, effectively creating a region of normalcy.
- Margin Maximization: This algorithm strives to maximize the margin around the normal instances, allowing for a more robust separation between normal and anomalous data points. This margin is crucial for accurately identifying outliers during testing.
- High sensitivity: One-Class SVM has a built-in hyperparameter called "nu," representing an upper bound on the fraction of margin errors (and a lower bound on the fraction of support vectors). Fine-tuning this parameter influences the model's sensitivity to outliers, as illustrated in the sketch after this list.
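To see the effect of nu in practice, here is a minimal hedged sketch: it fits a One-Class SVM on a synthetic "normal" cluster and reports the fraction of training points flagged as outliers for a few nu values (the data and the gamma value are illustrative assumptions, not taken from any real dataset).
Python3
# Illustrative sketch: nu upper-bounds the fraction of training points
# that the model treats as outliers (synthetic data, assumed parameters)
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
X_normal = rng.randn(1000, 2)  # synthetic "normal" cluster

for nu in (0.01, 0.05, 0.20):
    labels = OneClassSVM(kernel="rbf", gamma=0.5, nu=nu).fit_predict(X_normal)
    print(f"nu={nu:.2f} -> fraction flagged as outliers: {(labels == -1).mean():.3f}")
The flagged fraction should stay close to, and bounded near, each nu value, which is exactly the upper-bound behavior described above.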
How does One-Class SVM differ from SVM?
SVM and one-class SVM are like twins, but not identical twins: their usage and principles are different. The three most common differences are discussed below:
| Aspect | Support Vector Machine | One-Class Support Vector Machine |
| --- | --- | --- |
| Single-Class Training | Requires labeled data from both classes for training. | Operates with only the majority (normal) class during training. |
| Imbalance Handling | Struggles with the imbalanced nature of datasets. | Inherently addresses class imbalance, which is prevalent in outlier detection tasks, by concentrating on the majority class during training. |
| Outlier Detection Focus | Aims to find a hyperplane that best separates the classes. | Excels in scenarios where the goal is to uncover instances that deviate from the norm, such as fraud detection or fault monitoring. |
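As a hedged sketch of the single-class training difference from the table above: a standard SVC must be fitted with feature vectors and labels from both classes, while OneClassSVM is fitted on the normal class alone (the synthetic data and parameter values are illustrative assumptions).
Python3
import numpy as np
from sklearn.svm import SVC, OneClassSVM

rng = np.random.RandomState(0)
X_normal = rng.randn(200, 2)           # majority (normal) class
X_anomalous = rng.randn(10, 2) + 5.0   # rare anomalous class

# Standard SVM: requires examples and labels from both classes
X_both = np.vstack([X_normal, X_anomalous])
y_both = np.array([0] * 200 + [1] * 10)
SVC(kernel="rbf").fit(X_both, y_both)

# One-Class SVM: trained on the normal class only, no labels needed
OneClassSVM(kernel="rbf", nu=0.05).fit(X_normal)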
How Does One-Class SVM Work?
One-Class Support Vector Machines (OCSVM) operate on a fascinating principle inspired by the idea of isolating the norm from the abnormal in a dataset. Unlike traditional Support Vector Machines (SVM), which are adept at handling binary and multiclass classification problems, OCSVM specializes in the nuanced task of anomaly detection. The workflow of OCSVM is discussed below:
- Conceptual Foundation: OCSVM establishes itself on the premise that the majority of real-world data is inherently normal. In most scenarios, outliers or anomalies are rare occurrences that deviate significantly from the usual patterns. OCSVM's goal is to define a boundary encapsulating the normal instances in the feature space, thereby creating a region of familiarity.
- Outlier Boundary Definition: The algorithm crafts a boundary around the normal instances, often referred to as the "normalcy region." This boundary is strategically positioned to maximize the margin around the normal data points, allowing for a clear delineation between what is considered ordinary and what may be deemed unusual. Think of it as drawing a protective circle around the typical instances to shield them from the outliers or anomalies.
- Margin Maximization: The heart of OCSVM lies in its commitment to maximizing the margin between the normal instances and the boundary. A larger margin provides a robust separation, enhancing the model's ability to discern anomalies during testing. This emphasis on margin maximization is akin to creating a safety buffer around the normal instances, fortifying the model against the influence of potential outliers or anomalies.
- Training Process: During the training phase, OCSVM exclusively leverages the majority class, i.e., the normal instances. This single-class focus distinguishes it from traditional SVMs, which require examples from both classes for effective training. By concentrating solely on the norm, OCSVM tailors itself to scenarios where anomalies are sparse and labeled instances of anomalies are hard to come by. Its key hyperparameter, 'nu', acts as an upper bound on the fraction of margin errors (and a lower bound on the fraction of support vectors) allowed by the model. Tuning the nu parameter enables practitioners to strike a balance between a stringent model that flags very few normal instances as anomalies and a more lenient model that embraces a higher fraction of anomalies.
- Testing and Anomaly Identification: Armed with the learned normalcy region, OCSVM can swiftly identify anomalies during testing. Instances falling outside the defined boundary are flagged as potential outliers. The model essentially acts as a vigilant guardian, scrutinizing new data points and signaling if they exhibit behavior significantly different from the norm.
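To make this workflow concrete, here is a minimal hedged sketch of the train-on-normal, test-on-new pattern (synthetic data and illustrative parameter values): the model is fitted only on normal points and then queried on unseen observations, returning +1 for points inside the learned boundary and -1 for points outside it.
Python3
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)
X_train = 0.5 * rng.randn(500, 2)  # training data: normal instances only

# New observations: five normal-looking points plus one obvious anomaly
X_test = np.vstack([0.5 * rng.randn(5, 2), [[4.0, 4.0]]])

ocsvm = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.05).fit(X_train)
print(ocsvm.predict(X_test))  # +1 = inlier, -1 = flagged as anomaly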
One-Class SVM in Anomaly Detection
In the domain of anomaly detection, One-Class Support Vector Machines (OCSVM) serve as a robust and versatile tool designed to discern normal patterns from irregular occurrences. Notably, OCSVM takes a distinctive approach by training exclusively on the majority class, which represents normal instances, eliminating the need for labeled anomalous data during training. This is particularly advantageous in real-world scenarios where anomalies are rare, making it challenging to obtain sufficient labeled samples. The core principle of OCSVM involves defining a boundary around the normal instances in the feature space, achieved through a specified kernel function and a parameter termed "nu." This parameter acts as an upper bound on the fraction of margin errors, enabling users to fine-tune the model's sensitivity to outliers. During the testing phase, instances falling outside this learned boundary are flagged as potential outliers, facilitating efficient anomaly identification. OCSVM's adaptability extends to various applications, including fraud detection in financial transactions, fault monitoring in industrial systems, and network intrusion detection. Its innate ability to capture complex, non-linear relationships and its focus on the majority class make it a valuable asset in safeguarding systems against unexpected events and ensuring robust anomaly detection across diverse domains. Later in this article, we will implement credit card anomaly detection using OCSVM.
Use Cases of One-Class SVM
There are several real-world use cases of One-Class SVM, which are listed below:
- Detecting fraud in financial transactions: OCSVM excels at spotting the rare patterns associated with fraudulent activity in financial transactions. Trained only on legitimate transactions, it learns what typical behavior looks like; during testing, deviations from these learned patterns are immediately flagged as potential fraud.
- Fault detection in industrial systems: Companies that rely on complex machinery can benefit from OCSVM's real-time monitoring for defects or anomalies. Applied to sensor data, OCSVM identifies abnormal behavior and flags potential faults. Early detection through OCSVM enables preventive maintenance, reduces downtime, and increases operational efficiency.
- Network Intrusion Detection: OCSVM can play an important role in continuously monitoring computer networks to protect against malicious activity. It helps identify unusual network behaviors that may indicate a possible attack. OCSVM works well in situations where most network traffic is normal and anomalies are very rare.
- Quality Control in Manufacturing: Strict quality control is needed in manufacturing to ensure fault-free products. OCSVM is applied to sensor data or product characteristics to detect deviations from the expected product profile. It helps to detect defects early during production.
One-Class SVM Kernel Trick
Like standard SVM, One-Class SVM supports various kernel options for optimized performance, which are discussed below:
- Linear Kernel: The linear kernel is the simplest form of a kernel and is equivalent to performing a linear transformation. It is suitable when the relationship between the features is approximately linear. The decision boundary in the higher-dimensional space is a hyperplane.
- Polynomial Kernel: The polynomial kernel introduces non-linearity by considering not just the dot product but also higher-order interactions between features. It is characterized by a user-defined degree parameter (degree). A higher degree allows the model to capture more complex relationships, but at the same time it may increase the risk of overfitting.
- Sigmoid Kernel: The sigmoid kernel is particularly suitable for scenarios where the data distribution is not well defined or exhibits sigmoidal patterns. It is often used in neural network-inspired SVMs. The gamma and coef0 parameters govern the shape and position of the decision boundary.
- Radial Basis Function (RBF) or Gaussian Kernel: The RBF kernel is versatile for handling complex, non-linear relationships. It transforms data into a space where intricate decision boundaries can be drawn. It is well-suited when the exact form of the relationships is unknown or intricate.
- Precomputed Kernel: This kernel allows users to provide a precomputed kernel matrix instead of the actual data. Useful when the kernel matrix is computed using a custom kernel function or when using pairwise similarities between instances.
Among the kernels above, RBF is the default for One-Class SVM. In our implementation, we will use the default kernel for credit card anomaly detection.
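Before moving to the implementation, here is a hedged sketch of how the kernel choice affects the model in practice: it fits One-Class SVMs with several kernels on the same synthetic data and compares the fraction of points each flags as outliers (the data and parameter values are illustrative assumptions).
Python3
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(1)
X = rng.randn(800, 2)  # synthetic data, for comparison only

for kernel in ("linear", "poly", "sigmoid", "rbf"):
    model = OneClassSVM(kernel=kernel, degree=3, gamma="scale", nu=0.05)
    flagged = (model.fit_predict(X) == -1).mean()
    print(f"{kernel:>7}: fraction flagged as outliers = {flagged:.3f}")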
Step-by-Step implementation of One-Class Support Vector Machines in Python
Importing required modules
First, we will import all the required Python libraries, such as Pandas, NumPy, Matplotlib, and scikit-learn.
Python3
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import OneClassSVM
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
Dataset loading and preprocessing
Now we will load the famous credit card dataset. For faster implementation, we will use the first 50k rows of the dataset. Then we will use StandardScaler to scale the feature columns (every column except the target 'Class'), and keep the target variable aside for evaluation later.
Python3
# Load the first 50k rows of the credit card dataset
# https://www.kaggle.com/mlg-ulb/creditcardfraud
credit_data = pd.read_csv('creditcard.csv', nrows=50000)

# Scale every feature column (all columns except the target 'Class')
standardized_data_without_class = StandardScaler().fit_transform(
    credit_data.loc[:, credit_data.columns != 'Class'])
data_50k_df = pd.DataFrame(data=standardized_data_without_class)

# Keep the target variable to evaluate the detected outliers later
y = credit_data['Class']
Model training
Now we will train the One-Class SVM with the following hyperparameters:
- kernel: The choice of kernel determines the transformation applied to the input data in a higher-dimensional space. Here we use the default "rbf", which stands for Radial Basis Function, commonly known as the Gaussian kernel. This kernel is suitable for capturing complex, non-linear relationships in the data.
- degree: We keep the default value of 3. It defines the degree of the polynomial function and applies only when the kernel is set to "poly"; all other kernels, including "rbf", ignore it.
- gamma: A crucial parameter that influences the shape of the decision boundary. A smaller gamma value results in a broader decision boundary, making the model less sensitive to individual data points. Conversely, a larger gamma value leads to a more complex decision boundary, potentially capturing intricate patterns in the data. Fine-tuning gamma is essential for achieving optimal model performance.
- nu: An upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors. It lets users control the balance between precision and recall. A smaller nu permits fewer margin errors, so the boundary expands to enclose nearly all training points and fewer points are flagged; a larger nu allows a higher fraction of margin errors and support vectors, which can be useful in scenarios with a considerable number of anomalies.
Python3
# degree is ignored here because the kernel is "rbf"
clf_svm = OneClassSVM(kernel="rbf", degree=3, gamma=0.1, nu=0.01)
# fit_predict returns +1 for inliers and -1 for outliers
y_predict = clf_svm.fit_predict(data_50k_df)
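Besides the hard +1/-1 labels from fit_predict, the fitted model also exposes continuous anomaly scores through decision_function, where negative values lie outside the learned boundary. A short hedged sketch on the same fitted model:
Python3
# Signed distance to the learned boundary: negative = outside (anomalous)
scores = clf_svm.decision_function(data_50k_df)
print("Most anomalous score:", scores.min())
print("Points with negative score:", (scores < 0).sum())
Ranking points by these scores can be more useful than the hard labels when an analyst can only review a fixed number of suspicious transactions.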
Model evaluation
Model evaluation here is a little different from other traditional ML models: accuracy is computed not to measure classification performance, but to measure how well the detected outliers line up with the labeled fraud cases.
Python3
# Map the model output to the dataset's labels:
# -1 (outlier) -> 1 (fraud), +1 (inlier) -> 0 (normal)
svm_predict = pd.Series(y_predict).replace([-1, 1], [1, 0])
svm_anomalies = data_50k_df[svm_predict == 1]  # rows flagged as anomalies
# Calculate accuracy of the flagged outliers against the true labels
accuracy = accuracy_score(y, svm_predict)
print("Accuracy in separating Outlier:", accuracy)
Output:
Accuracy in separating Outlier: 0.9641
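Because the fraud labels are highly imbalanced, accuracy alone can be misleading (a model that flags nothing at all would still score very high on this dataset). As a hedged complement, the sketch below compares the flagged outliers against the true Class labels with per-class precision and recall, reusing the y and svm_predict variables defined above.
Python3
from sklearn.metrics import classification_report

# Per-class precision/recall of the detected outliers vs. true labels
print(classification_report(y, svm_predict, target_names=["normal", "anomaly"]))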
Visualizing detected outliers(anomalies)
Now we will plot the inliers and outliers for pairs of adjacent features. To do this, we will define a small function (plot_OCSVM) that can plot any feature pair together with the detected outliers. By changing the integer value passed in the function call, we can visualize different pairs.
Python3
def plot_OCSVM(i):
    # All points in red, with the detected anomalies overlaid in green
    plt.scatter(data_50k_df.iloc[:, i], data_50k_df.iloc[:, i + 1], c='red', s=40, edgecolor="k", label="normal")
    plt.scatter(svm_anomalies.iloc[:, i], svm_anomalies.iloc[:, i + 1], c='green', s=40, edgecolor="k", label="anomaly")
    plt.title("OC-SVM Outlier detection between Feature Pair: V{} and V{}".format(i, i + 1))
    plt.xlabel("V{}".format(i))
    plt.ylabel("V{}".format(i + 1))
    plt.legend()
    plt.show()

# plot_OCSVM(1)  # change the integer value to visualize different pairs of features
plot_OCSVM(2)
# plot_OCSVM(3)
Output:
[Figure: Detected outliers (anomalies) in green among normal points in red]
From the above plot, we can clearly see that the One-Class SVM has sharply separated the normal occurrences from the anomalies (potential outliers) for this feature pair. We can also visualize other feature pairs by calling the function with different values.