
Python Code Demonstration

The document provides instructions for using Google Colab to run machine learning algorithms under the supervision of Prof. Manoj Kumar Tiwari. It outlines how to create and connect a notebook, load datasets, and run classification, regression, and clustering algorithms like logistic regression, decision trees, k-nearest neighbors on various datasets including wine reviews, car sales, mobile phone prices, and customer segmentation data. Evaluation metrics like ROC curves and confusion matrices are used to analyze model performance.

Uploaded by

shreyash kanhed

ML TECHNIQUES
UNDER SUPERVISION OF
PROF. MANOJ KUMAR TIWARI,
DIRECTOR, NITIE
INSTRUCTIONS TO USE GOOGLE COLAB

1. CREATING A NOTEBOOK
Open the Google Colab website or click here. To start coding, go to the File menu in the top-left corner and create a new notebook.

2. AUTHENTICATION
If Colab isn't linked with your Google account, you will receive a pop-up asking you to sign in with your Google account.

3. CONNECTING TO NOTEBOOK
Connect the notebook in order to run your code.

4. RUN CODE
Finally, type your code in the cell and press the play button to run it.

INSTRUCTIONS TO USE GOOGLE COLAB TO RUN THE ALGORITHMS

1. CONNECT YOUR GOOGLE ACCOUNT
After opening our notebook by clicking here, click the Connect button in the top-right corner and sign in to your Google account. After authentication, the notebook will connect to your account.

2. LOAD DATASET
Next, press the play button on the Dataset cell.

3. RUN ALGORITHMS
Finally, press the play button on the last cell.

DATA SET INFORMATION

CLASSIFICATION
Wine-analytics Classification (Existing)
Wine-analytics Classification (Logistic Regression) (Existing)
SUV_purchase (New)
Mobile Price Classification (New)
Car Ownership Classification (Existing)

REGRESSION
Wine-analytics
Advertisement (Existing)
CarSales (New)
Metal Sales (New)

CLUSTERING
Public Utilities (Existing)
Customer Segmentation (New)

CLASSIFICATION

MODELS USED FOR CLASSIFICATION
Logistic Regression
Decision Tree
K-NN

DATASET USED FOR CLASSIFICATION

SUV Purchase (New Data)
Forecasting the purchasing ability of SUVs. The goal is to fit a classifier to the data and provide predictions for future customers.
Feature: Gender, Age, EstimatedSalary
Target: Purchased

Mobile Price (New Data)
Predict the price range indicating how high the price is.
Feature: battery_power, mobile_wt, px_height, px_width, ram
Target: price_range [0 (low cost), 1 (medium cost), 2 (high cost) and 3 (very high cost)]

PERFORMING CLASSIFICATION
Simply enter the option number to perform classification. Two further options are then displayed so that the user can choose the dataset and the model.

1. LOGISTIC REGRESSION
Dataset Description (wine_alytics(logistic))

1. LOGISTIC REGRESSION
Data Visualization: wine_alytics(logistic), SUV_Purchase, mobile_price

1. LOGISTIC REGRESSION
For wine_alytics(logistic) Dataset
Figures: ROC-AUC CURVE; Confusion Matrix (entropy) for train set and test set

1. LOGISTIC REGRESSION
For SUV_Purchase Dataset
Figures: ROC-AUC CURVE; Confusion Matrix (entropy) for train set and test set

1. LOGISTIC REGRESSION
For mobile_price Dataset
Figures: ROC-AUC CURVE; Confusion Matrix (entropy) for train set and test set
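The ROC-AUC and confusion-matrix outputs for logistic regression could be reproduced roughly as follows. This is a minimal sketch, assuming scikit-learn (which Colab normally ships with) and using synthetic data in place of SUV_Purchase; the data-generating rule and all variable names are illustrative and are not taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for SUV_Purchase: Age and EstimatedSalary -> Purchased.
rng = np.random.default_rng(0)
n = 400
age = rng.uniform(18, 60, n)
salary = rng.uniform(15_000, 150_000, n)
# Hypothetical rule (an assumption): older, better-paid customers buy more often.
score = 0.1 * (age - 40) + (salary - 80_000) / 30_000 + rng.normal(0, 1, n)
purchased = (score > 0).astype(int)
X = np.column_stack([age, salary])

X_train, X_test, y_train, y_test = train_test_split(
    X, purchased, test_size=0.2, random_state=0, stratify=purchased
)

# Fit the scaler on the training split only, then fit the classifier.
scaler = StandardScaler().fit(X_train)
clf = LogisticRegression().fit(scaler.transform(X_train), y_train)

# Confusion matrices (rows: true class, columns: predicted class).
cm_train = confusion_matrix(y_train, clf.predict(scaler.transform(X_train)))
cm_test = confusion_matrix(y_test, clf.predict(scaler.transform(X_test)))

# Area under the ROC curve, from the predicted probability of class 1.
auc_test = roc_auc_score(y_test, clf.predict_proba(scaler.transform(X_test))[:, 1])

print(cm_train, cm_test, round(auc_test, 3), sep="\n")
```

To draw the curve itself rather than only its area, `sklearn.metrics.roc_curve` returns the false-positive and true-positive rates that can be plotted.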

2. DECISION TREE CLASSIFIER
Dataset Description (wine_alytics)

2. DECISION TREE CLASSIFIER
Dataset - (wine_alytics)

2. DECISION TREE CLASSIFIER
For SUV_Purchase Dataset
Figures: Confusion Matrix (entropy) for train set and test set; ROC-AUC CURVE (entropy); Confusion Matrix (gini) for train set and test set; ROC-AUC CURVE (gini)

ACCURACY        gini      entropy
Training set    0.9187    0.9187
Test set        0.9125    0.9125

2. DECISION TREE CLASSIFIER
For mobile_data Dataset
Figures: Confusion Matrix (entropy) for train set and test set; ROC-AUC CURVE (entropy); Confusion Matrix (gini) for train set and test set; ROC-AUC CURVE (gini)

ACCURACY        gini      entropy
Training set    0.7806    0.7681
Test set        0.7400    0.7550
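A gini-versus-entropy accuracy comparison of this kind could be produced along the following lines. This is a sketch under stated assumptions: scikit-learn is used, the data are synthetic four-class data standing in for mobile_data, and `max_depth=3` is an arbitrary choice; the numbers it prints will not match the accuracies reported in the slides.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 4-class data standing in for mobile_data (5 features, price_range 0-3).
X, y = make_classification(
    n_samples=2000, n_features=5, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Train one tree per split criterion and record train/test accuracy.
accuracy = {}
for criterion in ("gini", "entropy"):
    tree = DecisionTreeClassifier(criterion=criterion, max_depth=3, random_state=0)
    tree.fit(X_train, y_train)
    accuracy[criterion] = {
        "train": accuracy_score(y_train, tree.predict(X_train)),
        "test": accuracy_score(y_test, tree.predict(X_test)),
    }

for criterion, scores in accuracy.items():
    print(f"{criterion}: train={scores['train']:.4f} test={scores['test']:.4f}")
```

The two criteria often choose similar splits, which is consistent with the identical SUV_Purchase accuracies and the close mobile_data accuracies in the slides.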

2. DECISION TREE CLASSIFIER
Data Visualization: SUV_Purchase, mobile_price

2. DECISION TREE CLASSIFIER
For wine_alytics Dataset

2. DECISION TREE CLASSIFIER
For SUV_Purchase Dataset

2. DECISION TREE CLASSIFIER
For mobile_data Dataset
Depth of tree = 2
Criterion = entropy
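A depth-2 tree trained with the entropy criterion is small enough to read as text. The sketch below assumes scikit-learn and synthetic data; the feature names are borrowed from the mobile price slide only as labels, and the split thresholds it prints are not the notebook's.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Labels borrowed from the mobile price slide; the data themselves are synthetic.
feature_names = ["battery_power", "mobile_wt", "px_height", "px_width", "ram"]
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Depth 2 means at most two splits on any path, hence at most four leaves.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=0)
tree.fit(X, y)

# Text rendering of the learned splits and the class predicted at each leaf.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

For a graphical version, `sklearn.tree.plot_tree` draws the same structure with matplotlib.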

3. K-NEAREST NEIGHBORS CLASSIFICATION
Each feature is normalised so that the features become range independent.
Figures: SUV_PURCHASE; MOBILE PRICE CLASSIFICATION
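Normalisation matters for K-NN because distances would otherwise be dominated by the feature with the largest range. The sketch below assumes scikit-learn's `MinMaxScaler` as the normaliser (the slides do not say which scaling was used) and a handful of made-up rows shaped like Age and EstimatedSalary.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Made-up rows: [Age, EstimatedSalary]; the two columns differ in scale by ~1000x.
X = np.array([
    [19, 19_000], [35, 20_000], [26, 43_000], [27, 57_000],
    [47, 25_000], [45, 26_000], [46, 28_000], [48, 141_000],
], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # hypothetical Purchased labels

# After min-max scaling every column lies in [0, 1].
X_scaled = MinMaxScaler().fit_transform(X)

# Putting the scaler in a pipeline applies the same scaling to new points.
knn = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=3)).fit(X, y)
prediction = knn.predict([[46, 27_000]])
print(X_scaled.round(2), prediction, sep="\n")
```

Without scaling, a salary difference of a few thousand would outweigh any age difference, so the neighbours would be chosen almost entirely by salary.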

3. K-NEAREST NEIGHBORS CLASSIFICATION
Visualization of target class: SUV_Purchase, mobile_price

3. K-NEAREST NEIGHBORS CLASSIFICATION
PERFORMING KNN
Enter the first option to choose a specific number of neighbors for the KNN algorithm, or the second option to search for the best k.
For the search, enter the max k and the number of cross-validation sets to find the best k within that range.

3. K-NEAREST NEIGHBORS CLASSIFICATION
Best K
Max k = 50, cross-validation sets = 10
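A search for the best k with max k = 50 and 10 cross-validation sets could look roughly like this. It is a sketch, assuming scikit-learn's `cross_val_score` and synthetic data; how the notebook actually scores each k is not shown in the slides.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic binary data standing in for one of the classification datasets.
X, y = make_classification(
    n_samples=400, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

max_k, n_folds = 50, 10  # values taken from the slide

# Mean cross-validated accuracy for every k from 1 to max_k.
mean_accuracy = {}
for k in range(1, max_k + 1):
    model = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=k))
    mean_accuracy[k] = cross_val_score(model, X, y, cv=n_folds).mean()

best_k = max(mean_accuracy, key=mean_accuracy.get)
print(f"best k = {best_k}, mean CV accuracy = {mean_accuracy[best_k]:.4f}")
```

Keeping the scaler inside the pipeline means it is refitted on each training fold, so the held-out fold does not leak into the normalisation.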

3. K-NEAREST NEIGHBORS CLASSIFICATION
SUV_Purchase Dataset

3. K-NEAREST NEIGHBORS CLASSIFICATION
Mobile_data Dataset

3. K-NEAREST NEIGHBORS CLASSIFICATION
CAR OWNERSHIP CLASSIFICATION
CAR OWNERSHIP CLASSIFICATION AFTER LABELING AND SCALING

3. K-NEAREST NEIGHBORS CLASSIFICATION
CHOOSING MAX K AND NUMBER OF CROSS-VALIDATION SETS TO BE CREATED TO FIND BEST K
Best K

REGRESSION

MODELS USED FOR REGRESSION
Linear Regression
Decision Tree

DATASET USED FOR REGRESSION

Advertising Sales (Old Data)
Predicting first year sales from advertisement
Single feature regression or multivariate
Feature: Cost of Advertisements, Promotion expenditure and Competitors' sales
Target: Sales

Car Sales (New Data)
Predicting sales of cars
Multi feature regression
Feature: Supplier name, Car model, etc.
Target: Sales

Monthly Steel Consumption (New Data)
Forecasting monthly steel consumption
Time-series forecasting
Feature: Month number, steel ids
Target: Monthly consumption

PERFORMING REGRESSION
Simply enter the option number to perform regression. Two further options are then displayed so that the user can choose the dataset and the model.

COMPARISON OF RMSE VALUES
Figures: bar charts of RMSE for Linear Regressor and Decision Tree on the Advertisement Dataset, Car Sales Dataset and Steel Consumption Dataset

Advertisement Dataset: Since the features are already highly correlated with the target value, a simple linear regression model can easily fit a line on both the single-feature and the multivariate dataset.
Car Sales Dataset: The dataset contains many features, including discrete categorical variables. Because of this, the Decision Tree filters out the best predictions through its tree-like structure.
Steel Consumption Dataset: This dataset is slightly complicated and the target values cannot be fitted by a single hyperplane. Hence the Decision Tree outperforms the Linear Regressor.
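The pattern described here, with a linear model winning on a linearly related target and a tree winning on a non-linear one, can be illustrated on synthetic data. The sketch below assumes scikit-learn; the two targets are invented for illustration and are unrelated to the Advertisement, Car Sales or Steel Consumption data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 1))

# Two invented targets: one linear in X, one strongly non-linear.
targets = {
    "linear target": 3.0 * X[:, 0] + rng.normal(0, 0.5, 300),
    "non-linear target": 10.0 * np.sin(X[:, 0]) + rng.normal(0, 0.5, 300),
}

# RMSE on a held-out test split for each (target, model) pair.
rmse = {}
for name, y in targets.items():
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    for model in (LinearRegression(), DecisionTreeRegressor(max_depth=3, random_state=0)):
        pred = model.fit(X_train, y_train).predict(X_test)
        rmse[(name, type(model).__name__)] = float(
            np.sqrt(mean_squared_error(y_test, pred))
        )

for key, value in rmse.items():
    print(key, round(value, 3))
```

RMSE is in the units of the target, which is why the three bar charts in the slides have very different axis ranges.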

ANALYSIS OF CAR SALES DATA

Data Description

Displays Data

Encoding non-numerical values with the mean of the target values for each category
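Encoding a categorical column by the mean of the target for each category (often called target or mean encoding) can be sketched with pandas as follows. The table and the column names `supplier` and `sales` are hypothetical; the notebook's actual columns are not shown in the slides.

```python
import pandas as pd

# Tiny made-up table; column names are hypothetical, not the notebook's.
cars = pd.DataFrame({
    "supplier": ["A", "A", "B", "B", "C"],
    "sales": [10.0, 20.0, 30.0, 50.0, 5.0],
})

# Mean of the target for each category, then map it back onto the rows.
supplier_means = cars.groupby("supplier")["sales"].mean()
cars["supplier_encoded"] = cars["supplier"].map(supplier_means)

print(cars)
```

In a train/test setting the category means are usually computed on the training rows only, so that test targets do not leak into the features.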

Correlation Matrix

Plot of Correlation Matrix
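A correlation matrix like the one referred to here can be computed directly from a DataFrame. This sketch assumes pandas and uses synthetic columns whose names are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
advertising = rng.uniform(0, 100, 200)

# Synthetic columns: sales depends on advertising, promotion is independent noise.
data = pd.DataFrame({
    "advertising": advertising,
    "promotion": rng.uniform(0, 50, 200),
    "sales": 2.0 * advertising + rng.normal(0, 10, 200),
})

# Pearson correlation between every pair of numeric columns.
corr = data.corr()
print(corr.round(2))
```

The resulting square table can be passed to a heatmap function (for example matplotlib's `imshow`) to obtain a plot of the matrix.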

RESULTS FOR CAR SALES USING LINEAR REGRESSION

CLUSTERING

K-MEANS CLUSTERING

DATASET USED FOR CLUSTERING

Public Utilities (Old Data)
Data of 22 firms with 8 variables are given.
We have to find clusters of similar public utilities.

Customer Segmentation (New Data)
Dataset consisting of 5 features.
Divide the customers up based on common characteristics such as demographics or behaviors.

Public Utilities
PCA has been used for dimensionality reduction.
Clustering visualizations are shown for k=2 and k=3.
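K-means followed by a PCA projection for plotting could be sketched as below. Assumptions: scikit-learn is used, the data are 22 synthetic rows with 8 variables (matching only the shape of Public Utilities), and clustering is done on the standardised features with PCA applied afterwards for the 2-D view; the slides do not state whether PCA was applied before or after clustering.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# 22 synthetic rows with 8 variables, mirroring the shape of Public Utilities.
X, _ = make_blobs(n_samples=22, n_features=8, centers=3, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Cluster labels for k=2 and k=3.
labels = {}
for k in (2, 3):
    labels[k] = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)

# Project to two principal components so the clusters can be drawn in 2-D.
X_2d = PCA(n_components=2).fit_transform(X_scaled)
print(X_2d.shape, labels[2], labels[3], sep="\n")
```

Each row of `X_2d` can then be scattered with its cluster label as the colour to obtain one panel per value of k.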

Changes in clustering of Public Utilities Dataset
Since the final cluster centroids in K-means depend on their initial positions, we obtained a different result from the one shown in the ppt.

Our Result
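The dependence on initial centroid positions can be observed by running K-means with a single random initialisation under different seeds. This is a sketch on synthetic data, assuming scikit-learn; the seeds and blob parameters are arbitrary, and the runs may or may not differ on any particular dataset.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Overlapping synthetic blobs, where local optima are more likely.
X, _ = make_blobs(n_samples=200, centers=5, cluster_std=2.5, random_state=0)

# One random initialisation per run; each seed may settle in a different optimum.
inertias = []
for seed in range(10):
    km = KMeans(n_clusters=5, init="random", n_init=1, random_state=seed).fit(X)
    inertias.append(km.inertia_)

# Differences in inertia (within-cluster sum of squares) indicate different solutions.
print(sorted(round(value, 1) for value in inertias))
```

Fixing `random_state`, or raising `n_init` so that the best of several initialisations is kept, makes results easier to reproduce across runs.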

Mall Customer Segmentation
PCA has been used for dimensionality reduction.
Clustering visualizations are shown for k=3 and k=4.

THANK YOU

THE DATASETS CAN BE VIEWED HERE
THE NOTEBOOK CAN BE VIEWED HERE
