Assignment 2
Assignment Overview
In this question, you will work with a dataset that contains two features and a binary class
label. Your task is to utilize Python to develop a classification model that predicts the class
of an instance based on the provided features.
Dataset Description
A dataset named Assignment Data – Spiral.csv is provided. It consists of 500 rows and
three columns, where each row represents an instance and each column represents an
attribute of that instance. The target variable indicates whether the instance is true or false.
• Target Variable: class - Indicates whether the item is true (1) or false (0).
• Features: X1 and X2.
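As a rough illustration of the description above, the sketch below loads the file with pandas and checks that the three columns are present. It assumes the CSV header uses exactly the names X1, X2, and class. The helper load_spiral and the in-memory sample rows are hypothetical and only stand in for the real file.

```python
import io

import pandas as pd

# Column names taken from the dataset description; the exact header
# spelling in the CSV is an assumption.
EXPECTED_COLUMNS = ["X1", "X2", "class"]


def load_spiral(source):
    """Load the dataset and check that it has the expected columns.

    `source` may be a file path or a file-like object.
    """
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Tiny in-memory stand-in for the real file (values are made up).
sample = io.StringIO("X1,X2,class\n0.1,0.2,0\n-0.3,0.4,1\n0.5,-0.6,1\n")
df = load_spiral(sample)

print(df.shape)                    # rows and columns
print(df["class"].value_counts())  # class balance
```

With the real data, the call would presumably be load_spiral("Assignment Data – Spiral.csv"), after which df.shape can be compared against the 500 rows and three columns stated above.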
Tasks
1. Data Loading and Exploration
o Load the dataset into a pandas DataFrame.
2. Data Preprocessing
o Split the dataset into training and test sets (e.g., 80% train, 20% test).
3. Model Building
o Experiment with at least two different classification algorithms (e.g., Logistic
Regression, Decision Trees, k-Nearest Neighbors, etc.).
o Train the models using the training dataset.
o Evaluate the models using the test dataset. Use metrics like accuracy,
specificity, sensitivity, precision, recall, and F1-score.
4. Model Evaluation
o Compare the performance of the different models.
o Plot the ROC curve and calculate the AUC for each model.
5. Model Selection
o Select the best-performing model based on your evaluation metrics.
o Provide a justification for your choice.
o Report accuracy, specificity, precision, recall, and sensitivity.
6. Documentation and Reporting
o Create your final Jupyter notebook file (.ipynb).
o Write a report summarizing your approach, naming the models you built, and
highlighting the best model's performance: accuracy, specificity, precision,
recall, and sensitivity.
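The tasks above can be sketched roughly as follows, using scikit-learn. This is one possible layout under stated assumptions, not a model answer: make_spiral is a hypothetical helper that generates synthetic two-arm spiral data in place of the real CSV, and the two models, the stratified split, and random_state=42 are arbitrary illustrative choices. Note that sensitivity and recall are the same quantity, and that specificity is derived from the confusion matrix because scikit-learn has no dedicated function for it.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def make_spiral(n_per_class=250, noise=0.2, seed=0):
    """Hypothetical stand-in data: two interleaved spiral arms."""
    rng = np.random.default_rng(seed)
    t = np.sqrt(rng.uniform(0, 1, n_per_class)) * 3 * np.pi
    arm = np.column_stack([t * np.cos(t), t * np.sin(t)])
    X = np.vstack([arm, -arm]) + rng.normal(0, noise, (2 * n_per_class, 2))
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return X, y


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit `model` on the training set and score it on the test set."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1

    # With labels=[0, 1], ravel() yields tn, fp, fn, tp in that order.
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr, tpr, _ = roc_curve(y_test, y_score)  # points for the ROC plot
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": recall,  # sensitivity is another name for recall
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "auc": roc_auc_score(y_test, y_score),
        "roc": (fpr, tpr),
    }


X, y = make_spiral()
# 80% train, 20% test; stratify keeps the class balance in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "logistic_regression": LogisticRegression(),
    "knn": KNeighborsClassifier(n_neighbors=5),
}
results = {name: evaluate(model, X_train, X_test, y_train, y_test)
           for name, model in models.items()}

for name, r in results.items():
    print(name, {k: round(float(v), 3) for k, v in r.items() if k != "roc"})
```

The fpr and tpr arrays stored under "roc" can be passed to a plotting library to draw each model's ROC curve. On spiral-shaped data a linear model such as logistic regression would be expected to lag behind a neighbor-based one, but that is something to confirm on the real dataset rather than assume.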
Submission Requirements
• Code: .ipynb file.
• Report: brief report as outlined above.
Deadline
By the end of Week 10.
This assignment will help you practice essential classification techniques and get familiar
with handling real-world datasets. If you have any questions, feel free to ask during class or
reach out via email.
Rubric:
This assignment is marked relative to others:
1- Completing a model: 2 marks.
2- Relative to other students: 6 marks.
Assignment Overview
In this question, you will work with a dataset that contains some employee features. Your
task is to encrypt the Email column.
Dataset Description
A dataset, named Office Data, is provided. Encrypt only the Email column.
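The brief does not say which encryption scheme to use. As one possible approach, the sketch below uses Fernet (symmetric, authenticated encryption) from the third-party cryptography package, which is an assumption and not a requirement of the assignment. The helpers encrypt_column and decrypt_column, the Name column, and the sample rows are hypothetical; only the Email column name comes from the description above.

```python
import pandas as pd
from cryptography.fernet import Fernet


def encrypt_column(df, column, fernet):
    """Return a copy of `df` with `column` replaced by Fernet tokens (text)."""
    out = df.copy()
    out[column] = out[column].map(
        lambda value: fernet.encrypt(str(value).encode("utf-8")).decode("ascii")
    )
    return out


def decrypt_column(df, column, fernet):
    """Reverse `encrypt_column`, given the same key."""
    out = df.copy()
    out[column] = out[column].map(
        lambda token: fernet.decrypt(token.encode("ascii")).decode("utf-8")
    )
    return out


# Made-up rows standing in for the real Office Data file.
office = pd.DataFrame({
    "Name": ["Ana", "Ben"],
    "Email": ["ana@example.com", "ben@example.com"],
})

key = Fernet.generate_key()  # keep this safe: it is needed to decrypt
fernet = Fernet(key)

encrypted = encrypt_column(office, "Email", fernet)
restored = decrypt_column(encrypted, "Email", fernet)

print(encrypted)
print(restored)
```

Every other column is left unchanged, and the original addresses can be recovered only with the same key. Missing values are not handled specially here (they would be encrypted as the text "nan"), so the real data may need a check for them first.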