MACHINE LEARNING RECORD

MISRIMAL NAVAJEE MUNOTH JAIN ENGINEERING COLLEGE
OWNED AND MANAGED BY TAMILNADU EDUCATIONAL AND MEDICAL FOUNDATION
A JAIN MINORITY INSTITUTION
APPROVED BY AICTE & PROGRAMMES ACCREDITED BY NBA, NEW DELHI (UG PROGRAMMES – MECH, AI&DS, ECE, CSE, IT)
ALL PROGRAMMES RECOGNIZED BY THE GOVERNMENT OF TAMIL NADU AND AFFILIATED TO ANNA UNIVERSITY, CHENNAI
GURU MARUDHARKESARI BUILDING, JYOTHI NAGAR, RAJIV GANDHI SALAI, OMR THORAIPAKKAM, CHENNAI - 600 097.
REGULATION-2021
NAME :
REGISTER NUMBER :
YEAR : II
SEMESTER : IV
VISION
To produce high quality, creative and ethical engineers and technologists
contributing effectively to the ever-advancing Artificial Intelligence and Data
Science field.
MISSION
To educate future software engineers with strong fundamentals by
continuously improving the teaching-learning methodologies using
contemporary aids.
To produce ethical engineers/researchers by instilling the values of
humility, humaneness, honesty and courage to serve the society.
To create a knowledge hub of Artificial Intelligence and Data Science
with an everlasting urge to learn by developing, maintaining and continuously
improving the resources.
Register No:
BONAFIDE CERTIFICATE
DATE:
COURSE OUTCOMES
S.NO.   TOPIC                                                             PAGE NO.   DATE   SIGNATURE
8.      Write a program to implement Decision Tree classification model
SYLLABUS
AD3461 MACHINE LEARNING LABORATORY
COURSE OBJECTIVES
To get practical knowledge on implementing machine learning algorithms in real-time problems for getting solutions.
To implement supervised learning algorithms and their applications.
To understand unsupervised learning techniques like clustering and the EM algorithm.
To understand the theoretical and practical aspects of probabilistic graphical models.
Tools: Python, NumPy, SciPy, Matplotlib, Pandas, statsmodels, seaborn, plotly, bokeh
Suggested Exercises:
1. For a given set of training data examples stored in a .CSV file, implement and demonstrate the
Candidate-Elimination algorithm to output a description of the set of all hypotheses
consistent with the training examples.
2. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets.
3. Write a program to implement the naïve Bayesian classifier for a sample training data set
stored as a .CSV file and compute the accuracy with a few test data sets.
4. Implement naïve Bayesian Classifier model to classify a set of documents and measure
the accuracy, precision, and recall.
5. Write a program to construct a Bayesian network considering medical data. Use this model
to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.
6. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set
for clustering using the k-Means algorithm. Compare the results of these two
algorithms.
7. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set.
Print both correct and wrong predictions.
8. Write a program to implement Decision Tree classification model.
9. Implement Logistic Regression algorithm with a dataset and measure the accuracy score
and confusion matrix.
10. Implement Linear Regression algorithm with a dataset and measure the accuracy score.
Ex.No: 1    For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples
Date :
AIM:
To implement and demonstrate the Candidate-Elimination algorithm to output a
description of the set of all hypotheses consistent with the training examples.
DATASET: trainingdata1.xlsx
ALGORITHM:
PROGRAM/SOURCE CODE:
import numpy as np
import pandas as pd
data = pd.DataFrame(data=pd.read_excel('trainingdata1.xlsx'))
print(data)
Origin Manufacturer color Decade Type Example Type
0 Japan Honda blue 1980 economy positive
1 Japan Toyota green 1970 sports positive
2 Japan Toyota blue 1990 economy negative
3 USA Chrysler red 1980 economy positive
4 Japan Honda white 1980 economy positive
concepts = np.array(data.iloc[:,0:-1])
target = np.array(data.iloc[:,-1])
print("concept:",concepts)
print("target:",target)
print("\nSteps of Candidate Elimination Algorithm: ",i+1)
print("Specific_h: ",i+1)
print(specific_h,"\n")
print("general_h :",
i+1) print(general_h)
indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
print("\nIndices",indices)
for i in indices:
general_h.remove(['?', '?', '?', '?', '?', '?'])
return specific_h, general_h
s_final,g_final = learn(concepts, target) print("\
nFinal Specific_h:", s_final, sep="\n")
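Note: the listing above calls learn(concepts, target), but only the tail of that function is reproduced. A minimal sketch of the complete function, assuming the standard Candidate-Elimination updates and a target column containing 'positive'/'negative' values, could look like this (its print statements match those shown above):

def learn(concepts, target):
    # start with the first training instance as the most specific hypothesis
    specific_h = concepts[0].copy()
    n = len(specific_h)
    # start with the most general hypotheses
    general_h = [['?' for _ in range(n)] for _ in range(n)]
    for i, h in enumerate(concepts):
        if target[i] == "positive":
            # generalise specific_h where it disagrees with a positive example
            for x in range(n):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        if target[i] == "negative":
            # specialise general_h where it disagrees with a negative example
            for x in range(n):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("\nSteps of Candidate Elimination Algorithm: ", i+1)
        print("Specific_h: ", i+1)
        print(specific_h, "\n")
        print("general_h :", i+1)
        print(general_h)
    # drop any fully general rows left in general_h
    indices = [i for i, val in enumerate(general_h) if val == ['?'] * n]
    for i in indices:
        general_h.remove(['?'] * n)
    return specific_h, general_h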
OUTPUT :
general_h : 5
[['?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?'], ['?', '?', '?',
'?', '?'], ['?', '?', '?', '?', '?'], ['?', '?', '?', '?', '?']]
Indices []
Final Specific_h:
['Japan' 'Honda' 'blue' 1980 'economy']
RESULT:
Thus, the program to implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples using Python has been executed successfully.
Ex.No: 2    Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets
Date :
AIM:
To build an Artificial Neural Network by implementing the Backpropagation algorithm and
test the same using appropriate data sets.
ALGORITHM:
PROGRAM/SOURCE CODE:
import numpy as np
X = np.array(([2, 9], [1, 5], [3, 6]), dtype=float)
y = np.array(([92], [86], [89]), dtype=float)
X = X/np.amax(X,axis=0) # maximum of X array longitudinally
y = y/100
#Sigmoid Function
def sigmoid(x):
    return 1/(1 + np.exp(-x))
#Derivative of Sigmoid Function
def derivatives_sigmoid(x):
    return x * (1 - x)
#Variable initialization
epoch=7000 #Setting training iterations
lr=0.1 #Setting learning rate
inputlayer_neurons = 2 #number of features in data set
hiddenlayer_neurons = 3 #number of hidden layer neurons
output_neurons = 1 #number of neurons at output layer
#weight and bias initialization
wh=np.random.uniform(size=(inputlayer_neurons,hiddenlayer_neurons))
bh=np.random.uniform(size=(1,hiddenlayer_neurons))
wout=np.random.uniform(size=(hiddenlayer_neurons,output_neurons))
bout=np.random.uniform(size=(1,output_neurons))
#Backpropagation
EO = y-output
outgrad = derivatives_sigmoid(output)
d_output = EO * outgrad
EH = d_output.dot(wout.T)
hiddengrad = derivatives_sigmoid(hlayer_act)
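Note: the fragment above uses output and hlayer_act, but the forward pass and the training loop that compute them are not reproduced in the listing. A minimal sketch of that missing portion, continuing the listing above and following the usual single-hidden-layer backpropagation scheme, might be:

for i in range(epoch):
    # forward pass
    hinp = np.dot(X, wh) + bh          # hidden layer net input
    hlayer_act = sigmoid(hinp)         # hidden layer activation
    outinp = np.dot(hlayer_act, wout) + bout
    output = sigmoid(outinp)           # network prediction

    # backward pass (the lines shown in the fragment above)
    EO = y - output
    outgrad = derivatives_sigmoid(output)
    d_output = EO * outgrad
    EH = d_output.dot(wout.T)
    hiddengrad = derivatives_sigmoid(hlayer_act)
    d_hidden = EH * hiddengrad

    # weight updates
    wout += hlayer_act.T.dot(d_output) * lr
    wh += X.T.dot(d_hidden) * lr

print("Input: \n" + str(X))
print("Actual Output: \n" + str(y))
print("Predicted Output: \n", output)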
OUTPUT:
Input:
[[0.66666667 1. ]
[0.33333333 0.55555556]
[1. 0.66666667]]
Actual Output:
[[92.]
[86.]
[89.]]
Predicted Output:
[[0.99999908]
[0.99999712]
[0.99999904]]
RESULT:
Thus, the program to build an Artificial Neural Network by implementing the Backpropagation
algorithm using Python has been executed successfully.
Ex.No: 3    Write a program to implement the naïve Bayesian classifier for a sample training dataset stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets
Date :
AIM:
To implement the naïve Bayesian classifier for a sample training dataset stored as a .CSV
file and compute the accuracy with a few test data sets.
DATASET: pim_indian.csv
LINK: https://drive.google.com/file/d/18PcjOtDELvR8wY4-iCiAXm1wNuox67a7/view?usp=share_link
ALGORITHM:
Step 1: Import the required libraries (pandas, scikit-learn).
Step 2: Load the dataset and split it into training and testing datasets using `train_test_split`.
Step 3: Train the Naive Bayes classifier on the training data using
`GaussianNB().fit(xtrain, ytrain.ravel())`.
Step 4: Predict the class labels for the testing data using `clf.predict(xtest)`.
PROGRAM/SOURCE CODE:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn import metrics
# Load the data
df = pd.read_csv("pim_indian.csv")
feature_col_names = ['Pregnancies', 'Glucose', 'BloodPressure', 'BMI',
'DiabetesPedigreeFunction', 'Age']
predicted_class_name = ['Diabetes']
X=df[feature_col_names].values
y=df[predicted_class_name].values
xtrain,xtest,ytrain,ytest=train_test_split(X,y,test_size=0.33)
print('\n the total number of Training Data: ',ytrain.shape)
print('\n the total number of Test Data: ',ytest.shape)
clf = GaussianNB().fit(xtrain, ytrain.ravel())
predicted = clf.predict(xtest)
predicttestdata = clf.predict([[6, 148, 72, 35, 0, 50]])
print('\n confusion matrix')
print(metrics.confusion_matrix(ytest, predicted))
print('\n accuracy of the classifier is', metrics.accuracy_score(ytest, predicted))
print('\n the value of precision', metrics.precision_score(ytest, predicted))
print('\n the value of recall', metrics.recall_score(ytest, predicted))
print("predict value for individual test data:", predicttestdata)
OUTPUT:
the total number of Training Data: (514,
confusion matrix
[[144 28]
 [ 35 47]]
RESULT:
Thus, the program to implement the naïve Bayesian classifier for a sample training dataset stored as a
.CSV file using Python has been executed successfully.
Ex.No: 4    Implement naïve Bayesian Classifier model to classify a set of documents and measure the accuracy, precision, and recall
Date :
AIM:
To implement the naïve Bayesian Classifier model to classify a set of documents and
measure the accuracy, precision, and recall.
DATASET: naivetext.csv
LINK: https://drive.google.com/file/d/1sEpbtiB9qP6DdpqlvbB_6OL8aBsJqe8s/view?usp=share_link
ALGORITHM:
Step 6: Use the trained classifier to make predictions on the test data.
Step 7: Calculate and print the accuracy, confusion matrix, precision, and recall.
PROGRAM/SOURCE CODE:
import pandas as pd
msg=pd.read_csv('NaiveText.csv',names=['message','label'])
print('The dimensions of the dataset',msg.shape)
msg['labelnum']=msg.label.map({'pos':1,'neg':0})
X=msg.message
y=msg.label
print(X)
print(y)
from sklearn.model_selection import train_test_split
xtrain,xtest,ytrain,ytest=train_test_split(X,y)
print('\n The total number of Training Data :',ytrain.shape)
print('\n The total number of Test Data :',ytest.shape)
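Note: the listing stops after splitting the data; the vectorisation, training, and evaluation steps named in the algorithm are not reproduced. A minimal sketch of the remaining steps, assuming scikit-learn's CountVectorizer and MultinomialNB, could be:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn import metrics

# convert the text messages into bag-of-words count vectors
count_vect = CountVectorizer()
xtrain_dtm = count_vect.fit_transform(xtrain)
xtest_dtm = count_vect.transform(xtest)

# train the multinomial naive Bayes classifier and predict on the test split
clf = MultinomialNB().fit(xtrain_dtm, ytrain)
predicted = clf.predict(xtest_dtm)

# accuracy, confusion matrix, and per-class precision/recall
print('Accuracy:', metrics.accuracy_score(ytest, predicted))
print('Confusion matrix:\n', metrics.confusion_matrix(ytest, predicted))
print(metrics.classification_report(ytest, predicted))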
OUTPUT:
The dimensions of the dataset (19, 2)
0 message
1 I love this sandwich
2 This is an amazing place
3 I feel very good about these beers
4 This is my best work
5 What an awesome view
6 I do not like this restaurant
7 I am tired of this stuff
8 I can't deal with this
9 He is my sworn enemy
10 My boss is horrible
11 This is an awesome place
12 I do not like the taste of this juice
13 I love to dance
14 I am sick and tired of this place
15 What a great holiday
16 That is a bad locality to stay
17 We will have good fun tomorrow
18 I went to my enemy's house today
Name: message, dtype: object
0 label
1 1
2 1
3 1
4 1
5 1
6 0
7 0
8 0
9 0
10 0
11 1
12 0
13 1
14 0
15 1
16 0
17 1
18 0
Name: label, dtype: object
Confusion matrix
[[2 0]
 [1 2]]
0.8333333333333333
RESULT:
Thus, the program to implement the naïve Bayesian Classifier model to classify a set of documents and measure the accuracy, precision, and recall using Python has been executed successfully.
Ex.No: 5    Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set
Date :
AIM:
To construct a Bayesian network considering medical data and use this model to
demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set.
DATASET: heart.csv
LINK: https://drive.google.com/file/d/10C80zeowRWEGazpPZw_n0wK4f_rRlRbL/view?usp=share_link
ALGORITHM:
Step 6: Learn CPDs of the model from the dataset using MLE.
Step 7: Perform inference with the Bayesian network using 'VariableElimination' class.
Step 8: Compute and print the probabilities of heart disease given evidence of 'restecg=1'
and 'cp=2' using the 'query' method of 'VariableElimination' object.
PROGRAM/SOURCE CODE:
!pip install pgmpy
import numpy as np
import pandas as pd
import csv
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination
#read Cleveland Heart Disease data
heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?',np.nan)
#display the data
print('Sample instances from the dataset are given below')
print(heartDisease.head())
#display the Attribute names and datatypes
print('\n Attributes and datatypes')
print(heartDisease.dtypes)
#Create Model - Bayesian Network
model =BayesianNetwork([('age','heartdisease'),('sex','heartdisease'),(
'exang','heartdisease'),('cp','heartdisease'),('heartdisease', 'restecg'),
('heartdisease','chol')])
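Note: the listing ends after defining the network structure; the parameter learning and inference steps named in the algorithm are not reproduced. A minimal sketch of those steps, using pgmpy's MaximumLikelihoodEstimator and VariableElimination as imported above, might be:

# Learn the CPDs of the model from the dataset using Maximum Likelihood Estimation
model.fit(heartDisease, estimator=MaximumLikelihoodEstimator)

# Inferencing with the Bayesian Network
HeartDisease_infer = VariableElimination(model)

# 1. Probability of heartdisease given evidence restecg = 1
q1 = HeartDisease_infer.query(variables=['heartdisease'], evidence={'restecg': 1})
print(q1)

# 2. Probability of heartdisease given evidence cp = 2
q2 = HeartDisease_infer.query(variables=['heartdisease'], evidence={'cp': 2})
print(q2)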
[5 rows x 14 columns]

Attributes and datatypes
age             int64
sex             int64
cp              int64
trestbps        int64
chol            int64
fbs             int64
restecg         int64
thalach         int64
exang           int64
oldpeak       float64
slope           int64
ca             object
thal           object
heartdisease    int64
dtype: object
Inferencing with Bayesian Network:

RESULT:
Thus, the program to construct a Bayesian network considering medical data and demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set has been executed successfully.

Ex.No: 6    Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms
Date :
AIM:
To apply the EM algorithm to cluster a set of data stored in a .CSV file, use the same data set
for clustering using the k-Means algorithm, and compare the results of these two algorithms.
DATASET: iris.csv
LINK: https://drive.google.com/file/d/1-lseekjQ6h1xHKETLYlm7a_IfZ-3sAOY/view?usp=share_link
ALGORITHM:
Step 1: Import the required libraries.
Step 2: Read the dataset from the given CSV file into a pandas dataframe.
Step 3: Extract the input features from the dataset and store them in a new dataframe X.
Step 4: Create a KMeans model with three clusters and fit it to the input data X.
Step 5: Create a Gaussian Mixture model with three components and fit it to the input data X.
Step 6: Print the accuracy score and confusion matrix of both models.
PROGRAM/SOURCE CODE:
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.mixture import GaussianMixture
from sklearn.datasets import load_iris
import sklearn.metrics as sm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataset=load_iris()
# print(dataset)
X=pd.DataFrame(dataset.data)
X.columns=['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
y=pd.DataFrame(dataset.target)
y.columns=['Targets']
# print(X)
plt.figure(figsize=(14,7))
colormap=np.array(['red','lime','black'])
# REAL PLOT
plt.subplot(1,3,1)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y.Targets],s=40)
plt.title('Real')
# K-PLOT
plt.subplot(1,3,2)
model=KMeans(n_clusters=3)
model.fit(X)
predY=np.choose(model.labels_,[0,1,2]).astype(np.int64)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[predY],s=40)
plt.title('KMeans')
# GMM PLOT
scaler=preprocessing.StandardScaler()
scaler.fit(X)
xsa=scaler.transform(X)
xs=pd.DataFrame(xsa,columns=X.columns)
gmm=GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm=gmm.predict(xs)
plt.subplot(1,3,3)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y_cluster_gmm],s=40)
plt.title('GMM Classification')
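Note: step 6 of the algorithm asks for the accuracy score and confusion matrix of both models, which the listing does not compute. A small addition such as the following would print them (cluster labels are assigned arbitrarily, so these scores are only meaningful when the labels happen to align with the true classes):

# compare both clusterings against the true labels
print('K-Means accuracy        :', sm.accuracy_score(y.Targets, predY))
print('K-Means confusion matrix:\n', sm.confusion_matrix(y.Targets, predY))
print('GMM accuracy            :', sm.accuracy_score(y.Targets, y_cluster_gmm))
print('GMM confusion matrix    :\n', sm.confusion_matrix(y.Targets, y_cluster_gmm))
plt.show()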
OUTPUT:
[Scatter plots of Petal_Length vs Petal_Width: 'Real', 'KMeans', and 'GMM Classification']
RESULT:
Thus, the program to apply the EM algorithm to cluster a set of data stored in a
.CSV file and to cluster the same data set using the k-Means algorithm has been
executed successfully.
Ex.No: 7    Write a program to implement k-Nearest Neighbour algorithm to classify the iris dataset. Print both correct and wrong predictions
Date :
AIM:
To implement k-Nearest Neighbour algorithm to classify the iris data set. Print both
correct and wrong predictions
DATASET: iris.csv
LINK: https://drive.google.com/file/d/1vVwpEuIyb-r3uVtrvbZxBMDsX5wpq-ml/view?usp=share_link
ALGORITHM:
Step 1: Load the Iris dataset from a CSV file into a pandas dataframe.
Step 2: Split the dataset into the input features X and output class y.
Step 3: Split the data into training and testing sets using train_test_split.
Step 4: Initialize a KNN classifier with n_neighbors set to 1 (as in the program below) and fit it to the training data.
Step 5: Predict the class labels for the testing set using the KNN classifier.
Step 6: Calculate and print the confusion matrix, classification report, and accuracy score
of the KNN classifier.
PROGRAM/SOURCE CODE:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import numpy as np
dataset=load_iris()
#print(dataset)
X_train,X_test,y_train,y_test=train_test_split(dataset["data"],dataset["target"],random_state=0)
kn=KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train,y_train)
for i in range(len(X_test)):
    x = X_test[i]
    x_new = np.array([x])
    prediction = kn.predict(x_new)
    print("TARGET=", y_test[i], dataset["target_names"][y_test[i]], "PREDICTED=", prediction, dataset["target_names"][prediction])
print(kn.score(X_test,y_test))
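Note: step 6 of the algorithm also asks for the confusion matrix and classification report, which the listing does not print. A small addition along these lines would produce them:

from sklearn.metrics import confusion_matrix, classification_report, accuracy_score

# evaluate the classifier on the whole test split
y_pred = kn.predict(X_test)
print('Confusion matrix:\n', confusion_matrix(y_test, y_pred))
print('Classification report:\n', classification_report(y_test, y_pred, target_names=dataset["target_names"]))
print('Accuracy:', accuracy_score(y_test, y_pred))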
OUTPUT:
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
RESULT:
Thus, the program to implement k-Nearest Neighbour algorithm to classify the iris
data set using Python has been executed successfully.
Ex.No: 8    Write a program to implement Decision Tree classification model
Date :
AIM:
To implement Decision Tree classification model using a .CSV file to measure the
accuracy.
DATASET: data_cleaned.csv
LINK: https://drive.google.com/file/d/1VbrVnGcblK7e2PXVQlMkUTtxTkaVVYBW/view?usp=share_link
ALGORITHM:
Step 1: Load cleaned data and split into training and validation sets.
Step 7: Iterate through different values of max_depth to generate train and validation
accuracy scores.
Step 8: Visualize train and validation accuracy scores using a line graph.
Step 9: Create a decision tree classifier with max_depth of 8 and max_leaf_nodes of 25.
Step 10: Fit the classifier to the training set and evaluate accuracy on training and validation
sets.
Step 11: Use graphviz to create a visualization of the decision tree.
PROGRAM/SOURCE CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data=pd.read_csv('data_cleaned.csv')
print(data.shape)
data.isnull().sum()
y = data['Survived']
X = data.drop(['Survived'], axis=1)
from sklearn.model_selection import train_test_split
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state = 101, stratify=y,
test_size=0.25)
y_train.value_counts (normalize=True)
y_valid.value_counts(normalize=True)
X_train.shape, y_train.shape
X_valid.shape, y_valid.shape
from sklearn.tree import DecisionTreeClassifier
dt_model = DecisionTreeClassifier(random_state=10)
dt_model.fit(X_train, y_train)
dt_model.score(X_train, y_train)
dt_model.score(X_valid, y_valid)
dt_model.predict(X_valid)
dt_model.predict_proba(X_valid)
y_pred = dt_model.predict_proba(X_valid)[:,1]
y_new = []
for i in range(len(y_pred)):
    if y_pred[i] <= 0.7:
        y_new.append(0)
    else:
        y_new.append(1)
from sklearn.metrics import accuracy_score
accuracy_score(y_valid, y_new)
train_accuracy = []
validation_accuracy = []
for depth in range(1,30):
    dt_model = DecisionTreeClassifier(max_depth=depth, random_state=10)
    dt_model.fit(X_train, y_train)
    train_accuracy.append(dt_model.score(X_train, y_train))
    validation_accuracy.append(dt_model.score(X_valid, y_valid))
frame = pd.DataFrame({'max_depth':range(1,30), 'train_acc':train_accuracy,
'valid_acc':validation_accuracy})
frame.head(15)
plt.figure(figsize=(12,6))
plt.plot(frame['max_depth'], frame['train_acc'], marker='o')
plt.plot(frame['max_depth'], frame['valid_acc'], marker='o')
plt.xlabel('Depth of tree')
plt.ylabel('performance')
image = plt.imread('tree.png')
plt.figure(figsize=(15,15))
plt.imshow(image)
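Note: steps 9-11 of the algorithm (the constrained tree and its graphviz visualization, which produces the tree.png read above) are not shown in the listing. A minimal sketch, assuming the graphviz package is installed, could be:

from sklearn import tree
import graphviz

# Steps 9-10: decision tree with max_depth=8 and max_leaf_nodes=25
dt_model = DecisionTreeClassifier(max_depth=8, max_leaf_nodes=25, random_state=10)
dt_model.fit(X_train, y_train)
print('Train accuracy     :', dt_model.score(X_train, y_train))
print('Validation accuracy:', dt_model.score(X_valid, y_valid))

# Step 11: export the fitted tree and render it as tree.png (read back with plt.imread above)
dot_data = tree.export_graphviz(dt_model, out_file=None,
                                feature_names=X.columns,
                                filled=True, rounded=True)
graph = graphviz.Source(dot_data)
graph.render('tree', format='png')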
OUTPUT:
(891, 25)
RESULT:
Thus, the program to implement Decision Tree classification model using a .CSV file to
measure the accuracy using Python has been executed successfully.
Ex.No: 9    Implement Logistic Regression algorithm with a dataset and measure the accuracy score and confusion matrix
Date :
AIM:
To implement the Logistic Regression algorithm with a dataset and measure the accuracy
score and confusion matrix.
DATASET: iris.csv
LINK:https://drive.google.com/file/d/1w8C2PmuZkDOuVEhIwTdBb3LJMW7HIJ1R/view?usp=share_link
ALGORITHM:
Step 1: Import necessary libraries and load the dataset using pandas.read_csv().
Step 2: Extract the 'temp' and 'label' columns and reshape them.
Step 3: Create a scatter plot with a logistic regression line using seaborn.regplot().
Step 4: Split the data into training and testing sets using train_test_split().
Step 5: Initialize a LogisticRegression model object and fit the training data.
Step 6: Predict the y values for the testing data and calculate the accuracy score using
accuracy_score().
Step 7: Generate a confusion matrix to evaluate the model's performance on the dataset.
PROGRAM/SOURCE CODE:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt
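Note: the listing omits the loading, splitting, and fitting steps between the imports and the accuracy calculation below. A minimal sketch of that missing portion is given here; the two-class subset and the split parameters are assumptions made so that a 2x2 confusion matrix like the one in the output is possible:

# load iris and keep only classes 0 and 1 so the task is binary (assumption)
iris = load_iris()
mask = iris.target != 2
X = iris.data[mask]
y = iris.target[mask]

# split into training and testing sets (test_size and random_state are illustrative)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# fit the logistic regression model and predict on the test set
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)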
# Calculate the accuracy score
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
# Generate the confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print('Confusion Matrix:')
print(conf_matrix)
OUTPUT:
Accuracy: 1.0
Confusion Matrix:
[[17 0]
[ 0 13]]
RESULT:
Thus, the program to implement the Logistic Regression algorithm with a dataset and
measure the accuracy score and confusion matrix using Python has been executed
successfully.
Ex.No: 10    Implement Linear Regression algorithm with a dataset and measure the accuracy score
Date :
AIM:
To implement the Linear Regression algorithm with a dataset and measure the accuracy score.
LINK: https://drive.google.com/file/d/1zSPRRkSxqLPYWsQTCQHjAqUfNv14zrqU/view?usp=share_link
ALGORITHM:
Step 1: Import necessary libraries and mount Google Drive to access the dataset using
drive.mount().
Step 2: Load the 'linear_data.csv' dataset using pandas.read_csv() function and store it in a
DataFrame variable called df.
Step 3: Extract the 'x' and 'y' columns and reshape them.
Step 4: Split the data into training and testing sets using train_test_split() with a test size of
0.25.
Step 5: Initialize a LinearRegression model object and fit the training data.
Step 6: Predict the y values for the testing data using lr.predict().
Step 7: Create a scatter plot of the 'x' and 'y' data with a linear regression line using
matplotlib.pyplot.scatter() and matplotlib.pyplot.plot() functions.
Step 8: Calculate the R-squared score of the model using r2_score() function by comparing
the predicted y values with the actual y values in the testing set.
PROGRAM/SOURCE CODE:
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import matplotlib.pyplot as plt
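Note: the listing omits the loading, splitting, and fitting steps between the imports and the R^2 calculation below. A minimal sketch of that missing portion, using the California Housing data imported above (the split parameters are assumptions), might be:

# load the California Housing regression data
housing = fetch_california_housing()
X = housing.data
y = housing.target

# split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# fit an ordinary least-squares linear regression model and predict on the test set
lr = LinearRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)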
# Calculate the R^2 score
r2 = r2_score(y_test, y_pred)
print(f'R^2 score: {r2}')
# Plotting the scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(y_test, y_pred, alpha=0.5)
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Actual vs Predicted Values')
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red')
# Diagonal line for reference
plt.show()
OUTPUT:
R^2 score: 0.595770232606166
RESULT:
Thus, the program to implement the Linear Regression algorithm with a dataset and
measure the accuracy score using Python has been executed successfully.