Ex 6,EX 7 AIML
ALGORITHM
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets, one for each possible value of the best attribute.
Step-4: Generate the decision tree node that contains the best attribute.
Step-5: Recursively build new decision trees using the subsets of the dataset created in
Step-3. Continue this process until the nodes cannot be classified further; call each such
final node a leaf node.
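Step-2 can be illustrated with a small sketch. It assumes entropy-based information gain as the Attribute Selection Measure (the Gini index is shown as well). The function names and the toy data are made up for illustration and are not part of the lab program.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on attribute index `attr`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s)
                    for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Index of the attribute with the highest information gain."""
    return max(range(len(rows[0])),
               key=lambda a: information_gain(rows, labels, a))


# Made-up toy data: attribute 0 separates the classes, attribute 1 does not.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["yes", "yes", "no", "no"]
best = best_attribute(rows, labels)
print("Best attribute index:", best)
```

Steps 3 to 5 would then split the rows on the chosen attribute and repeat the same selection on each subset.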
We will be using the Iris dataset to build a decision tree classifier. The dataset contains
information for three classes of the Iris plant, namely Iris Setosa, Iris Versicolour, and Iris
Virginica, with the following attributes: sepal length, sepal width, petal length, and petal width.
PROGRAM:
# Importing the required packages
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import matplotlib.pyplot as plt
return balance_data
Data Splitting:
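The body of splitdataset is not shown in this listing. A minimal sketch is given here, assuming that column 0 holds the class label and columns 1 to 4 hold the features (as in the sample rows under OUTPUT). The 70/30 split and the random seed are assumptions, not values taken from the original program.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def splitdataset(data, test_size=0.3, random_state=100):
    # Assumption: column 0 is the class label, columns 1-4 are the features.
    X = data.values[:, 1:5]
    Y = data.values[:, 0]
    # Hold out a fraction of the rows for testing (size and seed are assumed).
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=test_size, random_state=random_state)
    return X, Y, X_train, X_test, y_train, y_test


# The five sample rows printed under OUTPUT, used only to exercise the function.
demo = pd.DataFrame([["B", 1, 1, 1, 1],
                     ["R", 1, 1, 1, 2],
                     ["R", 1, 1, 1, 3],
                     ["R", 1, 1, 1, 4],
                     ["R", 1, 1, 1, 5]])
X, Y, X_train, X_test, y_train, y_test = splitdataset(demo)
print(X.shape, len(X_train), len(X_test))
```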
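def train_using_gini(X_train, y_train):  # function name assumed; original signature not shown
    # Assumed setup: a decision tree that uses the Gini criterion
    clf_gini = DecisionTreeClassifier(criterion="gini")
    # Performing training
    clf_gini.fit(X_train, y_train)
    return clf_gini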
Training with Entropy:
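def train_using_entropy(X_train, y_train):  # function name assumed; original signature not shown
    # Assumed setup: a decision tree that uses the entropy criterion
    clf_entropy = DecisionTreeClassifier(criterion="entropy")
    # Performing training
    clf_entropy.fit(X_train, y_train)
    return clf_entropy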
if __name__ == "__main__":
    data = importdata()
    X, Y, X_train, X_test, y_train, y_test = splitdataset(data)
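The prediction and evaluation steps are not shown in this listing. The sketch below is one possible end-to-end run and is not the original program. It assumes scikit-learn's bundled copy of the Iris data (load_iris) as the data source, and the split size, seed, and max_depth are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: use scikit-learn's bundled Iris data instead of a data file.
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=100)

results = {}
for criterion in ("gini", "entropy"):
    # Train one tree per criterion (max_depth is an illustrative choice).
    clf = DecisionTreeClassifier(criterion=criterion, random_state=100,
                                 max_depth=3)
    clf.fit(X_train, y_train)
    # Evaluate on the held-out rows.
    y_pred = clf.predict(X_test)
    results[criterion] = accuracy_score(y_test, y_pred)
    print(criterion, "accuracy:", results[criterion])
    print(confusion_matrix(y_test, y_pred))
```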
OUTPUT:
DATA INFO
Dataset Length: 625
Dataset Shape: (625, 5)
Dataset:
   0  1  2  3  4
0  B  1  1  1  1
1  R  1  1  1  2
2  R  1  1  1  3
3  R  1  1  1  4
4  R  1  1  1  5
RESULT:
Thus, the decision tree classifier for the Iris dataset was executed and verified.
Ex.No.7 BUILD AN SVM MODEL
Date:
AIM
To build an SVM model and plot its decision function using Matplotlib.
Support Vector Machine (SVM) is a powerful machine learning algorithm used for linear or
nonlinear classification, regression, and even outlier detection tasks.
1. Hyperplane: The hyperplane is the decision boundary that separates the data points
of different classes in a feature space. For linear classification, it is given by a linear
equation, i.e. w·x + b = 0.
2. Support Vectors: Support vectors are the data points closest to the hyperplane; they
play a critical role in determining the hyperplane and the margin.
3. Margin: The margin is the distance between the support vectors and the hyperplane. The main
objective of the support vector machine algorithm is to maximize the margin. A wider
margin generally indicates better classification performance on unseen data.
4. Kernel: A kernel is a mathematical function used in SVM to map the original
input data points into a high-dimensional feature space, so that a separating hyperplane can
be found even if the data points are not linearly separable in the original input space.
Common kernel functions include linear, polynomial, radial basis function (RBF), and
sigmoid.
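The terms above can be inspected on a fitted model. The sketch below uses made-up, linearly separable points and a linear-kernel SVC with a large C (an assumption that approximates a hard margin). It reads w and b from the fitted estimator and computes the distance from the hyperplane to the support vectors as 1/||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up toy points: class 0 on the left, class 1 on the right.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([0, 0, 1, 1])

# A large C (assumed value) penalizes margin violations heavily.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane w.x + b = 0
b = clf.intercept_[0]   # offset of the hyperplane
margin = 1.0 / np.linalg.norm(w)  # distance from hyperplane to support vectors

print("w =", w, "b =", b)
print("support vectors:\n", clf.support_vectors_)
print("margin =", margin)
```

Replacing kernel="linear" with "poly", "rbf", or "sigmoid" selects the other kernels listed above; coef_ is only available for the linear kernel.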
import numpy as np
import pandas as pd
import sklearn
import sklearn.datasets as ds
import sklearn.model_selection as ms
import sklearn.svm as svm
import matplotlib.pyplot as plt
%matplotlib inline
We generate 2D points and assign a binary label according to a linear operation on the
coordinates:
X = np.random.randn(200, 2)
y = X[:, 0] + X[:, 1] > 1
We now fit a linear Support Vector Classifier (SVC). This classifier tries to separate the two
groups of points with a linear boundary (a line here, but more generally a hyperplane):
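The fitting code and the plot_decision_function helper used in this exercise are not shown in the listing. A minimal sketch of both follows. The fixed random seed, the grid resolution, and the plotting details are assumptions chosen for illustration.

```python
import numpy as np
import sklearn.svm as svm
import matplotlib.pyplot as plt

# Same data recipe as in the text, with a fixed seed (assumed) for repeatability.
rng = np.random.RandomState(0)
X = rng.randn(200, 2)
y = X[:, 0] + X[:, 1] > 1

# Train the linear classifier.
est = svm.LinearSVC()
est.fit(X, y)

# Grid over the square [-3, 3]^2 on which the decision function is evaluated.
xx, yy = np.meshgrid(np.linspace(-3, 3, 200),
                     np.linspace(-3, 3, 200))


def plot_decision_function(est, title):
    # Evaluate the decision function on every grid point.
    Z = est.decision_function(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    fig, ax = plt.subplots(1, 1, figsize=(5, 5))
    # Show the decision function as an image and its zero level as the boundary.
    ax.imshow(Z, extent=(-3, 3, -3, 3), origin="lower",
              aspect="auto", cmap=plt.cm.Blues)
    ax.contour(xx, yy, Z, levels=[0], linewidths=2, colors="k")
    # Overlay the points, colored by their true labels.
    ax.scatter(X[:, 0], X[:, 1], s=30, c=y.astype(float),
               cmap=plt.cm.Blues, edgecolors="k", vmin=0, vmax=1)
    ax.set_title(title)
    return ax
```

The helper reads X and y from the enclosing scope, so it takes only the estimator and a title, matching the calls in this exercise.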
Let's take a look at the classification results with the linear SVC:
ax = plot_decision_function(
    est, "Linearly separable, linear SVC")
We now modify the labels with an XOR function. A point's label is 1 if its coordinates have
different signs. This classification is not linearly separable. Therefore, a linear SVC fails
completely:
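The code for this step is not shown in the listing. One way to reproduce it is sketched here, assuming a fixed random seed and a grid search over C for the linear SVC; the C grid is an illustrative choice.

```python
import numpy as np
import sklearn.model_selection as ms
import sklearn.svm as svm

# Same point cloud recipe as before, with a fixed seed (assumed).
rng = np.random.RandomState(0)
X = rng.randn(200, 2)

# XOR labels: True when the two coordinates have different signs.
y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

# Search over C for a linear SVC, then estimate accuracy by cross-validation.
est = ms.GridSearchCV(svm.LinearSVC(),
                      {"C": np.logspace(-3., 3., 10)})
est.fit(X, y)
linear_score = ms.cross_val_score(est, X, y).mean()
print("Score: {0:.3f}".format(linear_score))
```

A score close to chance level is expected here, since no straight line separates the four quadrants of an XOR pattern.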
The SVC classifier in scikit-learn uses the Radial Basis Function (RBF) kernel by default:
est = ms.GridSearchCV(
    svm.SVC(), {'C': np.logspace(-3., 3., 10),
                'gamma': np.logspace(-3., 3., 10)})
est.fit(X, y)
print("Score: {0:.3f}".format(
    ms.cross_val_score(est, X, y).mean()))
plot_decision_function(
    est.best_estimator_, "XOR, non-linear SVC")
RESULT:
Thus, an SVM model was built and its decision function was plotted using Matplotlib.