
Practical - 5 - 52

This document discusses applying KNN classification with and without feature scaling on a dataset. It includes the following key steps: 1. The dataset is split into training and test sets. 2. Standard scaling is applied to the training and test sets. 3. KNN models are trained and tested on the original and scaled data, showing that scaling improves performance. 4. The 'elbow method' is used to select the optimal k value, showing best results for k=28 on scaled data.


20BECE30058

from google.colab import drive


drive.mount('/content/drive')

Mounted at /content/drive

import pandas as pd
import numpy as np

df=pd.read_csv('/content/drive/MyDrive/ML Lab/HCP/Classified Data',index_col=0)

print(df.head())

from sklearn.model_selection import train_test_split

X=df.drop('TARGET CLASS',axis=1)
y=df['TARGET CLASS']

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=100)

        WTT       PTI       EQW       SBI       LQE       QWG       FDJ  \
0  0.913917  1.162073  0.567946  0.755464  0.780862  0.352608  0.759697
1  0.635632  1.003722  0.535342  0.825645  0.924109  0.648450  0.675334
2  0.721360  1.201493  0.921990  0.855595  1.526629  0.720781  1.626351
3  1.234204  1.386726  0.653046  0.825624  1.142504  0.875128  1.409708
4  1.279491  0.949750  0.627280  0.668976  1.232537  0.703727  1.115596

        PJF       HQE       NXJ  TARGET CLASS
0  0.643798  0.879422  1.231409             1
1  1.013546  0.621552  1.492702             0
2  1.154483  0.957877  1.285597             0
3  1.380003  1.522692  1.153093             1
4  0.646691  1.463812  1.419167             1

from sklearn.neighbors import KNeighborsClassifier

knn=KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train,y_train)
pred=knn.predict(X_test)

----Arguments----

KNeighborsClassifier(
    n_neighbors=5,
    weights='uniform'    (--- 'uniform', 'distance' or a callable),
    algorithm='auto'     ({'auto', 'ball_tree', 'kd_tree', 'brute'}, algorithm used to compute the nearest neighbors),
    leaf_size=30,
    p=2                  (--- power parameter; p = 1 is equivalent to using manhattan_distance (l1), and euclidean_distance (l2) for p = 2),
    metric='minkowski'   (--- the distance metric to use for the tree),
    metric_params=None,
    n_jobs=None,
)
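To make the p parameter concrete, here is a minimal sketch (not part of the original practical; the points a and b are made up) of the two distances it switches between:

import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

manhattan = np.sum(np.abs(a - b))          # p = 1 -> |1-4| + |2-6| = 7.0
euclidean = np.sqrt(np.sum((a - b) ** 2))  # p = 2 -> sqrt(9 + 16) = 5.0

print(manhattan, euclidean)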

from sklearn.metrics import classification_report,confusion_matrix

print(confusion_matrix(y_test,pred))
print(classification_report(y_test,pred))

[[98 12]
[ 8 82]]
              precision    recall  f1-score   support

           0       0.92      0.89      0.91       110
           1       0.87      0.91      0.89        90

    accuracy                           0.90       200
   macro avg       0.90      0.90      0.90       200
weighted avg       0.90      0.90      0.90       200
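As a quick check (a small sketch, not part of the original notebook), the misclassification count that the later comparison refers to can be read off the confusion matrix directly:

cm = confusion_matrix(y_test, pred)
misclassified = cm[0, 1] + cm[1, 0]   # off-diagonal entries: 12 + 8 = 20
accuracy = cm.trace() / cm.sum()      # (98 + 82) / 200 = 0.90
print(misclassified, accuracy)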

KNN using Standard Scaler

1) Split the Dataset

#------Here we do not know what the features represent, so how should the data points be grouped?
#------If the values of some features are much larger than the others, feature scaling is required; otherwise such features will dominate
#------and have a much larger effect on the distance between the data points (see the small sketch below)
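To illustrate the point above, here is a minimal sketch (not from the original notebook, using made-up numbers): when one feature is on a much larger scale, the Euclidean distance between two points is almost entirely determined by that feature.

import numpy as np

p1 = np.array([0.5, 1000.0])   # [small-range feature, large-range feature]
p2 = np.array([0.9, 1250.0])

dist = np.sqrt(np.sum((p1 - p2) ** 2))
print(dist)   # ~250.0 -- driven almost entirely by the large-range feature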

import pandas as pd
import numpy as np

df=pd.read_csv('/content/drive/MyDrive/ML Lab/HCP/Classified Data',index_col=0)

print(df.head())

from sklearn.model_selection import train_test_split

X=df.drop('TARGET CLASS',axis=1)

y=df['TARGET CLASS']

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=100)

        WTT       PTI       EQW       SBI       LQE       QWG       FDJ  \
0  0.913917  1.162073  0.567946  0.755464  0.780862  0.352608  0.759697
1  0.635632  1.003722  0.535342  0.825645  0.924109  0.648450  0.675334
2  0.721360  1.201493  0.921990  0.855595  1.526629  0.720781  1.626351
3  1.234204  1.386726  0.653046  0.825624  1.142504  0.875128  1.409708
4  1.279491  0.949750  0.627280  0.668976  1.232537  0.703727  1.115596

        PJF       HQE       NXJ  TARGET CLASS
0  0.643798  0.879422  1.231409             1
1  1.013546  0.621552  1.492702             0
2  1.154483  0.957877  1.285597             0
3  1.380003  1.522692  1.153093             1
4  0.646691  1.463812  1.419167             1

2) Scale the split dataset


1st Fit the data

2nd Transform the data
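The two steps above can also be collapsed into one call on the training data. This is a small sketch (not from the original notebook; scaled_train and scaled_test are illustrative names), equivalent to the fit/transform calls below; the test set is only transformed, using the statistics learned from the training set.

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaled_train = scaler.fit_transform(X_train)   # fit (learn mean/std of training features) + transform in one call
scaled_test = scaler.transform(X_test)         # test data reuses the training statistics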

from sklearn.preprocessing import StandardScaler

scaler=StandardScaler()
scaler.fit(X_train) #--- fit learns the mean and standard deviation of each training feature; the target class was already dropped, since we don't want to scale the labels

StandardScaler()

scaled_features_X_train=scaler.transform(X_train)
scaled_features_X_test=scaler.transform(X_test)
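As a quick sanity check (a small sketch, not part of the original practical), the transformed training features should now have roughly zero mean and unit standard deviation in every column:

print(scaled_features_X_train.mean(axis=0).round(3))   # ~0 for every column
print(scaled_features_X_train.std(axis=0).round(3))    # ~1 for every column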

3) Apply KNN Model on the scaled dataset

from sklearn.neighbors import KNeighborsClassifier

knn=KNeighborsClassifier(n_neighbors=1) #---means k=1


knn.fit(scaled_features_X_train,y_train)
pred_1=knn.predict(scaled_features_X_test)
pred_1

array([0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0,
1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1,
1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0,
0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0,
0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1,
1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0,
0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0,
1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1,
0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0,
0, 1])

4) Find the Classification Report for KNN (k=1) using scaled data

from sklearn.metrics import classification_report,confusion_matrix

print(confusion_matrix(y_test,pred_1))
print(classification_report(y_test,pred_1))

#---Here you can see that the number of misclassifications on the scaled dataset (17) is lower than on the unscaled dataset (20).

[[98 12]
 [ 5 85]]

              precision    recall  f1-score   support

           0       0.95      0.89      0.92       110
           1       0.88      0.94      0.91        90

    accuracy                           0.92       200
   macro avg       0.91      0.92      0.91       200
weighted avg       0.92      0.92      0.92       200

'Elbow' method to find the correct value of 'k'

#------Use elbow method to choose correct value of k


#------Use the model with different values of 'k' and plot the error rate
#-----and observe which one has minimum error rate

error_rate=[] #-----empty list

for i in range(1,40):
    knn=KNeighborsClassifier(n_neighbors=i)
    knn.fit(scaled_features_X_train,y_train)
    pred_i=knn.predict(scaled_features_X_test)
    error_rate.append(np.mean(pred_i != y_test))
    #----fraction of predictions that do not match the actual labels, i.e. the error rate for this k

print(error_rate)

[0.085, 0.09, 0.09, 0.08, 0.09, 0.075, 0.09, 0.075, 0.095, 0.075, 0.095, 0.075, 0.08, 0.08, 0.075, 0.085, 0.085, 0.085, 0.08, 0.085

import matplotlib.pyplot as plt

plt.figure(figsize=(10,6))
plt.plot(range(1,40),error_rate,color='blue',linestyle='--',marker='o')
plt.title('Error Rate vs K value (1 to 40)')
plt.xlabel('K value')
plt.ylabel('Error rate')
plt.grid()
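Reading the best k off the plot by eye works, but a small sketch (not in the original notebook) can also pick it programmatically from the same error_rate list:

best_k = int(np.argmin(error_rate)) + 1   # +1 because the list starts at k=1
print(best_k, error_rate[best_k - 1])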

knn=KNeighborsClassifier(n_neighbors=28)

knn.fit(scaled_features_X_train,y_train)
pred_28=knn.predict(scaled_features_X_test)

print(confusion_matrix(y_test,pred_28))
print('\n')
print(classification_report(y_test,pred_28))

#---Compare the confusion matrix for k=1 and for k=28; the larger k gives better classification

#---Misclassifications without scaling (k=1) : 20
#---Misclassifications with scaling (k=1) : 17
#---Misclassifications with scaling and the better 'k' value : 14

[[99 11]
[ 3 87]]

              precision    recall  f1-score   support

           0       0.97      0.90      0.93       110
           1       0.89      0.97      0.93        90

    accuracy                           0.93       200
   macro avg       0.93      0.93      0.93       200
weighted avg       0.93      0.93      0.93       200
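A natural follow-up (a sketch under assumptions, not part of the original practical) is to wrap the scaler and the classifier in a Pipeline and cross-validate the chosen k on the training data, so the scaler is re-fit inside each fold and the test set is never used for model selection:

from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

pipe = Pipeline([('scaler', StandardScaler()),
                 ('knn', KNeighborsClassifier(n_neighbors=28))])

scores = cross_val_score(pipe, X_train, y_train, cv=5)   # 5-fold cross-validated accuracy
print(scores.mean())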
