Hasnain Saeed Lab Task # 11

Uploaded by

hasnainsaeed478

DEPARTMENT OF COMPUTER & SOFTWARE ENGINEERING

COLLEGE OF E&ME, NUST, RAWALPINDI

CS-117 Applications of ICT

LAB – 11
Machine Learning - Classification

Course Instructor: Asst. Prof. Jahan Zeb

Lab Instructor: Engr. Ayesha Khanam

Student Name: HASNAIN SAEED

Degree/Syndicate: MECHANICAL ENGINEERING (B)

CMS ID: 00000512244


LAB # 11
Machine Learning - Classification
LAB TASK NO # 01:
You are working for an HR department tasked with predicting whether employees will stay with the company or
leave within the next year. Build a classification model using a dataset that includes features such as Age, Years of
service, Job satisfaction level, and Performance rating. Your target label is Employee Retention (1 for staying, 0
for leaving). Download a relevant dataset or create one that suits this problem. Train a Decision Tree Classifier to
make predictions and evaluate the model's accuracy.
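
Where no suitable dataset is available, one option is to generate a small synthetic one. The following is a minimal sketch, not the dataset used in this report: the feature column names (Age, Years_of_Service, Job_Satisfaction, Performance_Rating), the value ranges, and the rule that produces the label are all assumptions made for illustration. Only the Employee_Retention column name comes from the task code.

```python
import numpy as np
import pandas as pd


def make_employee_data(n_rows=200, seed=42):
    """Generate a small synthetic HR dataset (illustrative only)."""
    rng = np.random.default_rng(seed)
    age = rng.integers(22, 60, size=n_rows)
    # Keep years of service plausible relative to age (assumed career start at 22)
    years = np.minimum(rng.integers(0, 30, size=n_rows), age - 22)
    satisfaction = rng.integers(1, 6, size=n_rows)  # 1 (low) to 5 (high)
    performance = rng.integers(1, 6, size=n_rows)   # 1 (low) to 5 (high)

    # Hypothetical rule: satisfied, well-rated, long-serving staff tend to stay.
    score = (0.6 * satisfaction + 0.3 * performance + 0.05 * years
             + rng.normal(0, 0.5, size=n_rows))
    retention = (score > np.median(score)).astype(int)  # 1 = stay, 0 = leave

    return pd.DataFrame({
        "Age": age,
        "Years_of_Service": years,
        "Job_Satisfaction": satisfaction,
        "Performance_Rating": performance,
        "Employee_Retention": retention,
    })


df = make_employee_data()
print(df.head())
```

Calling `df.to_csv("employees.csv", index=False)` afterwards would write the data to a file that the loading step can read. Because the label is split at the median score, the two classes come out roughly balanced.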

INPUT:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
from google.colab import files

# Upload the CSV file from the local machine (Google Colab only)
uploaded = files.upload()

# Load the dataset into a DataFrame
df = pd.read_csv('employees.csv')

OUTPUT:

INPUT:
X = df.drop('Employee_Retention', axis=1)
y = df['Employee_Retention']
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
# Train the classifier on the training data
clf.fit(X_train, y_train)
# Predict the target variable for the test set
y_pred = clf.predict(X_test)
# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
# Print detailed classification metrics
report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification Report:\n", report)
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
# Plot the decision tree (feature names passed as a plain list)
plt.figure(figsize=(12, 8))
plot_tree(clf, filled=True, feature_names=list(X.columns), class_names=["Leave", "Stay"],
          rounded=True, fontsize=12)
plt.show()

OUTPUT:


LAB TASK NO # 02:

A university admissions office wants to predict whether a student will be admitted based on their grades in
previous degrees, entrance test score and recommendation score. The target label is Admitted (1 for
yes, 0 for no). Download or create a dataset with relevant features, preprocess it appropriately, and train a Decision
Tree Classifier. Test the model’s performance.
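
The preprocessing step mentioned in the task could look roughly like the sketch below. It is an assumption about what "appropriate" preprocessing means here (dropping duplicate rows and unlabeled rows, filling missing numeric values with the column median), not the procedure used in this report. The feature column names and the sample values are hypothetical; only the Admitted column name comes from the task.

```python
import numpy as np
import pandas as pd


def preprocess(df, target="Admitted"):
    """Basic cleaning: drop duplicates and unlabeled rows, impute numeric gaps."""
    df = df.drop_duplicates().dropna(subset=[target]).copy()
    feature_cols = [c for c in df.columns if c != target]
    # Fill missing feature values with each column's median
    df[feature_cols] = df[feature_cols].fillna(df[feature_cols].median())
    # Make sure the label is stored as an integer (1 = admitted, 0 = not admitted)
    df[target] = df[target].astype(int)
    return df


# Hypothetical sample data with two gaps and one duplicated row
raw = pd.DataFrame({
    "Previous_Grades": [3.5, 2.8, np.nan, 3.9, 3.5],
    "Entrance_Test_Score": [82, 60, 71, np.nan, 82],
    "Recommendation_Score": [4, 2, 3, 5, 4],
    "Admitted": [1, 0, 0, 1, 1],
})

clean = preprocess(raw)
print(clean)
```

Decision trees do not require feature scaling, so the sketch stops at cleaning and imputation.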

INPUT:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
from google.colab import files

# Upload the CSV file from the local machine (Google Colab only)
uploaded = files.upload()

# Load the dataset into a DataFrame
df = pd.read_csv('test_models.csv')
# Split dataset into features and target
X = df.drop('Admitted', axis=1)
y = df['Admitted']

# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)

# Train the classifier on the training data
clf.fit(X_train, y_train)

# Predict the target variable for the test set
y_pred = clf.predict(X_test)

# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)

# Print detailed classification metrics
report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification Report:\n", report)

from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

# Plot the decision tree (feature names passed as a plain list)
plt.figure(figsize=(12, 8))
plot_tree(clf, filled=True, feature_names=list(X.columns), class_names=["Not Admitted", "Admitted"],
          rounded=True, fontsize=12)
plt.show()
OUTPUT:

LAB TASK NO # 03:


An e-commerce platform wants to predict whether a customer will purchase a product during a sale. Use a dataset
that includes features such as browsing time, pages viewed, number of items in cart, and discount offered. The
target label is Purchased (1 for purchased, 0 for not purchased). Download a relevant dataset for this scenario.
Preprocess the data if required, train a Decision Tree Classifier, and evaluate its performance. Additionally,
determine which feature most significantly influences the purchasing decision.
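
One way to cross-check which feature matters most is permutation importance, which measures how much test accuracy drops when a single column is shuffled. The sketch below is an illustration on toy data, not the method or dataset used in this report: the column names and value ranges are hypothetical, and the label is deliberately made to depend only on the number of items in the cart so that the expected ranking is known in advance.

```python
import numpy as np
import pandas as pd
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400

# Hypothetical feature columns for the e-commerce scenario
X = pd.DataFrame({
    "Browsing_Time": rng.uniform(1, 60, n),
    "Pages_Viewed": rng.integers(1, 30, n),
    "Items_In_Cart": rng.integers(0, 10, n),
    "Discount_Offered": rng.uniform(0, 50, n),
})
# Toy label: a purchase happens only when the cart holds 5 or more items
y = (X["Items_In_Cart"] >= 5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# Shuffle each column in turn and record the mean drop in test accuracy
result = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=42)
ranking = pd.Series(result.importances_mean, index=X.columns).sort_values(ascending=False)
top_feature = ranking.index[0]

print(ranking)
print("Most influential feature:", top_feature)
```

On real data the ranking will be less clear-cut, and it is worth comparing it with the tree's own `feature_importances_` values, since the two measures can disagree when features are correlated.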

INPUT:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report
from google.colab import files

# Upload the CSV file from the local machine (Google Colab only)
uploaded = files.upload()

# Load the dataset into a DataFrame
df = pd.read_csv('test_models.csv')
# Split dataset into features and target
X = df.drop('Purchased', axis=1)
y = df['Purchased']

# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)

# Train the classifier on the training data
clf.fit(X_train, y_train)

# Predict the target variable for the test set
y_pred = clf.predict(X_test)

# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)

# Print detailed classification metrics
report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification Report:\n", report)

# Get feature importances
feature_importances = clf.feature_importances_

# Create a DataFrame to display feature importances
importance_df = pd.DataFrame({
    'Feature': X.columns,
    'Importance': feature_importances
})

# Sort by importance, most influential feature first
importance_df = importance_df.sort_values(by='Importance', ascending=False)

print(importance_df)
OUTPUT:
