Hasnain Saeed Lab Task # 11
LAB – 11
Machine Learning - Classification
CMS ID 00000512244
DEPARTMENT OF COMPUTER & SOFTWARE ENGINEERING
COLLEGE OF E&ME, NUST, RAWALPINDI
LAB TASK NO # 01:
You are working for an HR department tasked with predicting whether employees will stay with the company or
leave within the next year. Build a classification model using a dataset that includes features such as Age, Years of
service, Job satisfaction level, and Performance rating. Your target label is Employee Retention (1 for staying, 0
for leaving). Download a relevant dataset or create one that suits this problem. Train a Decision Tree Classifier to
make predictions and evaluate the model's accuracy.
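If no suitable public dataset is at hand, one option is to generate a small synthetic one. The sketch below is only an illustration: the column names, value ranges, and the rule that produces the label are all assumptions made for this example, not part of the lab hand-out. It writes a file named employees.csv with the target column Employee_Retention, matching the names used in the code that follows.

```python
import numpy as np
import pandas as pd

def make_employee_data(n_rows=500, seed=42):
    """Generate a hypothetical employee dataset (all values are synthetic)."""
    rng = np.random.default_rng(seed)
    age = rng.integers(22, 60, size=n_rows)              # ages 22..59
    # Assumption: nobody has served longer than the years since age 21
    years = np.minimum(rng.integers(0, 30, size=n_rows), age - 21)
    satisfaction = rng.integers(1, 6, size=n_rows)       # 1 (low) to 5 (high)
    performance = rng.integers(1, 6, size=n_rows)        # 1 (low) to 5 (high)
    # Assumed labelling rule: satisfied, well-rated, long-serving staff tend to stay
    score = (0.6 * satisfaction + 0.3 * performance + 0.05 * years
             + rng.normal(0, 0.8, size=n_rows))
    retention = (score > 3.0).astype(int)                # 1 = stays, 0 = leaves
    return pd.DataFrame({
        'Age': age,
        'Years_of_Service': years,
        'Job_Satisfaction': satisfaction,
        'Performance_Rating': performance,
        'Employee_Retention': retention,
    })

df = make_employee_data()
df.to_csv('employees.csv', index=False)
print(df.head())
print(df['Employee_Retention'].value_counts())
```

Because the label comes from a made-up rule plus noise, the accuracy obtained on this data says nothing about real employee behaviour; it only lets the rest of the pipeline run end to end.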
INPUT:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
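from sklearn.metrics import accuracy_score, classification_report
from google.colab import files
# Upload employees.csv from the local machine (Google Colab only)
uploaded = files.upload()
# Load the dataset into the DataFrame used in the rest of the task
df = pd.read_csv('employees.csv')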
OUTPUT:
INPUT:
X = df.drop('Employee_Retention', axis=1)
y = df['Employee_Retention']
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
# Train the classifier on the training data
clf.fit(X_train, y_train)
# Predict the target variable for the test set
y_pred = clf.predict(X_test)
# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
# Print detailed classification metrics
report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification Report:\n", report)
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
# Plot the decision tree
plt.figure(figsize=(12, 8))
plot_tree(clf, filled=True, feature_names=list(X.columns), class_names=["Leave", "Stay"],
          rounded=True, fontsize=12)
plt.show()
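Large trees can be hard to read in a figure. As a complement, scikit-learn's export_text prints the learned rules as plain text. The sketch below is self-contained and uses a tiny made-up dataset; the values, the toy_ names, and the max_depth=2 limit are assumptions for illustration. With the lab's data, the fitted clf and list(X.columns) would be passed instead.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: only Job_Satisfaction separates the two classes cleanly
toy = pd.DataFrame({
    'Job_Satisfaction': [1, 2, 2, 3, 4, 4, 5, 5],
    'Years_of_Service': [1, 6, 2, 5, 4, 8, 6, 10],
    'Employee_Retention': [0, 0, 0, 1, 1, 1, 1, 1],
})
X_toy = toy.drop('Employee_Retention', axis=1)
y_toy = toy['Employee_Retention']

# A shallow tree keeps the printed rules short
toy_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
toy_clf.fit(X_toy, y_toy)

# Render the fitted tree as indented if/else style rules
rules = export_text(toy_clf, feature_names=list(X_toy.columns))
print(rules)
```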
OUTPUT:
LAB TASK NO # 02:
A university admissions office wants to predict whether a student will be admitted based on their grades in
previous degrees, entrance test score, and recommendation score. The target label is Admitted (1 for
yes, 0 for no). Download or create a dataset with relevant features, preprocess it appropriately, and train a Decision
Tree Classifier. Test the model's performance.
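The task asks for appropriate preprocessing, which the listing below does not show explicitly. A minimal sketch of one possible approach follows, assuming hypothetical column names (Previous_Grade, Entrance_Score, Recommendation_Score), a made-up labelling rule, and artificially injected gaps and duplicates. It removes duplicate rows and fills missing numeric values with the column median. Feature scaling is left out because decision trees split on thresholds and are not affected by the scale of a feature.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 300

# Hypothetical admissions data; every value here is synthetic
admissions = pd.DataFrame({
    'Previous_Grade': rng.uniform(2.0, 4.0, n).round(2),         # assumed 4.0 scale
    'Entrance_Score': rng.integers(40, 101, n).astype(float),    # assumed 0-100 scale
    'Recommendation_Score': rng.integers(1, 6, n).astype(float), # assumed 1-5 scale
})
# Assumed labelling rule: a weighted average of the normalised scores
weighted = (0.4 * admissions['Previous_Grade'] / 4.0
            + 0.4 * admissions['Entrance_Score'] / 100.0
            + 0.2 * admissions['Recommendation_Score'] / 5.0)
admissions['Admitted'] = (weighted > 0.7).astype(int)

# Simulate a messy file: blank out some scores and repeat a few rows
missing_rows = rng.choice(n, size=15, replace=False)
admissions.loc[missing_rows, 'Entrance_Score'] = np.nan
admissions = pd.concat([admissions, admissions.iloc[:5]], ignore_index=True)

# Preprocessing step 1: drop exact duplicate rows
clean = admissions.drop_duplicates().copy()

# Preprocessing step 2: fill missing feature values with each column's median
feature_cols = ['Previous_Grade', 'Entrance_Score', 'Recommendation_Score']
clean[feature_cols] = clean[feature_cols].fillna(clean[feature_cols].median())

print(clean.isna().sum())
print(clean.shape)
```

Median imputation is only one choice; dropping incomplete rows is a reasonable alternative when few values are missing.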
INPUT:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
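from sklearn.metrics import accuracy_score, classification_report
from google.colab import files
# Upload test_models.csv from the local machine (Google Colab only)
uploaded = files.upload()
# Load the dataset into the DataFrame used in the rest of the task
df = pd.read_csv('test_models.csv')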
# Split dataset into features and target
X = df.drop('Admitted', axis=1)
y = df['Admitted']
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
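# Train the classifier on the training data
clf.fit(X_train, y_train)
# Predict the target variable for the test set
y_pred = clf.predict(X_test)
# Evaluate the model's accuracy
accuracy = accuracy_score(y_test, y_pred)
# Build the detailed classification report
report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification Report:\n", report)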
INPUT:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
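from sklearn.metrics import accuracy_score, classification_report
from google.colab import files
# Upload test_models.csv from the local machine (Google Colab only)
uploaded = files.upload()
# Load the dataset into the DataFrame used in the rest of the task
df = pd.read_csv('test_models.csv')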
# Split dataset into features and target
X = df.drop('Purchased', axis=1)
y = df['Purchased']
# Split data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the Decision Tree Classifier
clf = DecisionTreeClassifier(random_state=42)
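# Train the classifier on the training data
clf.fit(X_train, y_train)
# Predict the target variable for the test set
y_pred = clf.predict(X_test)
# Evaluate accuracy and build the classification report
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("\nClassification Report:\n", report)
# Get feature importances
feature_importances = clf.feature_importances_
# Pair each feature name with its importance score
importance_df = pd.DataFrame({'Feature': X.columns, 'Importance': feature_importances})
# Sort by importance
importance_df = importance_df.sort_values(by='Importance', ascending=False)
print(importance_df)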
OUTPUT: