CA One
1. Analyze a dataset from a problem domain in depth, and select appropriate statistical models,
tools, and techniques to derive insights regarding the dataset and domain.
3. Construct, refine, interpret, and critically evaluate predictive analytical and machine learning
models.
4. Critically evaluate and utilize hyperparameter search strategies for optimizing machine
learning models.
Supervised Machine Learning – Classification (100 Marks)
Dataset
Target Variable:
Attrition_Flag – two labels: ‘Existing Customer’ and ‘Attrited Customer’ (a customer who has
churned)
Task
The bank wants to use a classification model that can predict customer churn. Construct a
suitable classification model for the bank by implementing both random forest and support
vector classification algorithms in Python.
In addition to the Python code file, you are required to provide a critical analysis of
your approach and results in a PDF report.
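As a starting point only, the two algorithms named in the task can be fitted side by side as in the minimal sketch below. It assumes scikit-learn is available and uses synthetic data from make_classification as a stand-in for the bank dataset; the class balance, feature count, and model settings are arbitrary assumptions, not part of this brief.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the bank data (assumption): 1 = attrited, 0 = existing.
# The 80/20 class balance is arbitrary and only mimics an imbalanced target.
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.8, 0.2], random_state=0
)

# Stratify so both splits keep the same share of churned customers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # SVC is sensitive to feature scale, so standardise inside the pipeline.
    "svc": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(name, round(test_accuracy[name], 3))
```

Accuracy is printed here only because it is the default score method; it is not necessarily a suitable metric for an imbalanced target.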
Your code and analysis should cover the following points:
1. Data Preparation (What steps would you take to prepare your data? Discuss your approach)
[20]
2. Model Hyperparameter Tuning (Which hyperparameters would you tune and why? How
would you tune them?) [20]
3. Choice of Evaluation Metric (Which metric would be suitable for model evaluation and
why?) [20]
5. Results analysis
a) Which of the two models (random forest or support vector classifier) would you
recommend for deployment in the real world?
b) Is any model underfitting? If yes, what could be the possible reasons?
[20]
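One possible way to connect points 1 to 3 and 5(b) in code is sketched below, assuming scikit-learn and pandas. The column names (age, monthly_spend, card_type), the synthetic labels, the parameter grids, and the choice of F1 as the search metric are all hypothetical illustrations, not the expected answer. Preprocessing sits inside the pipeline so that it is refitted on each cross-validation fold, and train and test scores are reported together: scores that are low on both suggest underfitting, while a high train score with a much lower test score suggests overfitting.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC

# Hypothetical toy frame; the real dataset's columns will differ.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "age": rng.integers(20, 70, n),
    "monthly_spend": rng.normal(500, 150, n),
    "card_type": rng.choice(["blue", "silver", "gold"], n),
})
# Made-up labelling rule with 5% label noise; 1 = attrited, 0 = existing.
signal = (df["monthly_spend"] < 420) & (df["age"] > 40)
y = (signal ^ (rng.random(n) < 0.05)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, stratify=y, random_state=0
)

# Point 1: scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "monthly_spend"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["card_type"]),
])

# Points 2 and 3: grid search scored on F1 of the churn class (an assumption).
searches = {
    "random_forest": GridSearchCV(
        Pipeline([("prep", preprocess),
                  ("clf", RandomForestClassifier(random_state=0))]),
        param_grid={"clf__n_estimators": [50, 100],
                    "clf__max_depth": [3, None]},
        scoring="f1", cv=3,
    ),
    "svc": GridSearchCV(
        Pipeline([("prep", preprocess), ("clf", SVC())]),
        param_grid={"clf__C": [0.1, 1, 10],
                    "clf__gamma": ["scale", 0.1]},
        scoring="f1", cv=3,
    ),
}

# Point 5(b): compare train and test scores of each tuned model.
results = {}
for name, search in searches.items():
    search.fit(X_train, y_train)
    best = search.best_estimator_
    test_pred = best.predict(X_test)
    results[name] = {
        "best_params": search.best_params_,
        "train_f1": f1_score(y_train, best.predict(X_train)),
        "test_f1": f1_score(y_test, test_pred),
        "test_recall": recall_score(y_test, test_pred),
    }
    print(name, results[name])
```

Other search strategies (for example RandomizedSearchCV) and other metrics can be swapped in through the same pipeline structure; the report should justify whichever are chosen.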
Naming convention:
Report should be named as –
Report_Firstname_Surname.pdf
Code should be named as –
Code_Firstname_Surname.py
Zipped folder should be named as –
Firstname_Surname.zip
There is no prescribed word count for the report. It will be assessed on the quality, not the
quantity, of its content.
Assessment Criteria
Each part will be graded according to the following criteria:
1. Quality of code (correctness and completeness) [Weightage – 40%]
2. Quality of analysis in report (critical analysis of approach, presentation and interpretation
of results, conclusion) [Weightage – 60%]