ITNPBD6 Assignment 2018-2 PDF

A bank is having issues with loan defaults and wants to predict who will repay loans. A student is given loan data for 2000 customers to build predictive models using data mining techniques. The student must create a report covering the data, data preparation, two models built (decision tree and neural network), hyperparameters, results, and errors to analyze how well each model can predict loan repayment. The goal is to help the bank assess new customers' likelihood of repaying loans.

Uploaded by

Vaibhav Jain

ITNPBD6 Data Mining Assignment

2018
Computing and Maths
University of Stirling

The banks are having a bit of trouble with debt at the moment. They have lent lots of
money to people who promised to pay it back, and then didn’t. In the future, they
would like to avoid lending to the kind of person who won’t pay back the loan, and
that is where you come in. We have got some data from a bank describing 2000 of its
loan customers. The data also tells us whether or not each customer repaid the loan.

The question is simple – Can we predict who will repay the loans and who won’t?
Your assignment is to answer that question using data mining techniques and produce
a system that would be able to tell the bank how likely it is that a new customer would
pay back a loan.

You can use any software of your choice (for example, Weka or scikit-learn in
Python) and you will not be required to submit any code, just a report. You should
employ best practice for both the project management and the machine learning
aspects of the project. The data you need for the project is also available on the course
Canvas page.

You must hand in a report covering the following:

Introduction 10 Marks
Describe the task you were given, the data you received and the requirements of the
finished system. Define any terminology that you will use in the report (for example,
model, variable, task, etc.). Describe the project methodology you will use.

Data Summary 10 Marks

List the variables that you found in the file provided by the company. For each one,
say whether it is nominal or numeric, continuous or discrete and whether or not it
should be considered for building the solution. Explain your decisions.
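One way to start the variable listing is to let a short script make a first guess at each column's type, which you then check by hand. The sketch below uses only the Python standard library. The column names and the four sample rows are invented for illustration and are not taken from the bank's file.

```python
import csv
import io

# Hypothetical sample: these column names and values are invented,
# not taken from the assignment data file.
SAMPLE = """customer_id,age,income,home_owner,repaid
1001,34,28500.50,yes,yes
1002,51,41200.00,no,no
1003,27,19999.99,no,yes
1004,45,35000.00,yes,yes
"""


def is_number(text):
    """Return True if the text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarise_columns(csv_text):
    """Guess a coarse type for each column from its values."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        if all(is_number(v) for v in values):
            # Whole-number columns are treated as discrete, others as continuous.
            if all(float(v).is_integer() for v in values):
                summary[name] = "numeric, discrete"
            else:
                summary[name] = "numeric, continuous"
        else:
            summary[name] = "nominal"
    return summary


column_types = summarise_columns(SAMPLE)
for name, kind in column_types.items():
    print(f"{name}: {kind}")
```

Note that the script labels the invented customer_id column as numeric even though an identifier carries no predictive information. Deciding which variables to keep, and explaining why, still has to be done by you.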

Data Preparation 10 Marks

Describe what you did with the data prior to the modelling process. Show histograms
of one example variable before and after any pre-processing that you carried out.
If you corrected any mis-typed entries in the data, report what you changed.
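As an illustration of a before and after histogram, the sketch below bins a skewed variable in its raw form and again after a log transform. The income values are invented, and the log transform is only one possible pre-processing step, chosen here as an assumption. A plotting library would normally draw the histograms; plain text is used to keep the example self-contained.

```python
import math

# Hypothetical income values, invented for illustration; a few large
# values give the right-skewed shape that often motivates a log transform.
incomes = [12000, 15000, 18000, 21000, 22000, 25000, 30000, 45000, 90000, 250000]


def histogram(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than overflowing.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


def show(title, counts):
    """Print a simple text histogram, one row of '#' per bin."""
    print(title)
    for i, count in enumerate(counts):
        print(f"  bin {i}: {'#' * count}")


before = histogram(incomes, 5)
after = histogram([math.log10(v) for v in incomes], 5)

show("income (raw)", before)
show("income (log10)", after)
```

In the raw histogram most of the values pile up in the first bin, while the transformed values are spread more evenly across the bins.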

Modelling 50 Marks
You must use two different techniques and build models with both: pick a suitable
tree building algorithm and also use a multi-layer perceptron. Describe the different
methods you used and the results that you got. Give a detailed technical description of
the techniques and the way the models are represented. Include one diagram showing
the structure of each type of model that you build. In this section, it is particularly
important that the description and the diagrams are your own work. Do not copy (or
even paraphrase) from other sources. You must avoid plagiarism.
Describe what hyperparameters may be changed and what effect this has. If you
varied the hyperparameters of a model, show how this impacted on the results.
Describe how you split the data for training, validation and testing purposes. Be
methodical and record each result. This stage is a little like scientific research – you
are carrying out experiments in your search for the best solution. Once you have a
solution, show how you verified its robustness. For the two different techniques, report
on their comparative ability to predict a defaulted loan, and also on how easy it would
be for the bank to understand the model and the reasons behind each
prediction it makes.
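The experimental loop described above could be organised along the following lines if you choose scikit-learn. This is a minimal sketch, not a prescribed method: the data is a synthetic stand-in generated with make_classification, and the 60/20/20 split, the hyperparameter values and the single hidden layer are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 2000 loan customers (assumption: the real
# file has to be loaded and prepared separately).
X, y = make_classification(n_samples=2000, n_features=8, n_informative=4,
                           random_state=0)

# 60% training, 20% validation, 20% test, stratified on the class label.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0)

results = []

# Decision tree: vary the maximum depth and record validation accuracy.
for depth in (2, 4, 8):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    results.append(("tree", f"max_depth={depth}", tree.score(X_val, y_val)))

# Multi-layer perceptron: inputs are standardised first, and the size of
# the single hidden layer is varied.
for hidden in (4, 16):
    mlp = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=1000,
                      random_state=0))
    mlp.fit(X_train, y_train)
    results.append(("mlp", f"hidden={hidden}", mlp.score(X_val, y_val)))

for model, setting, accuracy in results:
    print(f"{model:5s} {setting:12s} validation accuracy = {accuracy:.3f}")
```

The test portion is deliberately left untouched in this sketch. It would be used once, after the hyperparameters have been chosen on the validation data, to report the final performance.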

Results and Errors 20 Marks

Analyse and describe the level of accuracy the model achieves and the errors your
model makes. Show a confusion matrix for each model. Are there any areas of the
data where it performs worse than in others? Show an ROC curve for the decision as
to whether or not a loan will be repaid and describe what the curve shows.
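To make the two requested outputs concrete, the sketch below computes a confusion matrix at one threshold and the points of an ROC curve from a model's scores, using only the Python standard library. The ten labels and scores are invented for illustration, and the 0.5 threshold is an assumption. Libraries such as scikit-learn provide equivalent functions.

```python
# Hypothetical labels and model scores, invented for illustration.
# 1 = loan repaid, 0 = loan defaulted; score = predicted probability of repayment.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.65, 0.4, 0.35, 0.2, 0.1]


def confusion_matrix(actual, scores, threshold):
    """Return (tp, fn, fp, tn) for predictions made at the given threshold."""
    tp = fn = fp = tn = 0
    for label, score in zip(actual, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def roc_points(actual, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fn, fp, tn = confusion_matrix(actual, scores, threshold)
        points.append((fp / (fp + tn), tp / (tp + fn)))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


tp, fn, fp, tn = confusion_matrix(actual, scores, threshold=0.5)
print("              predicted repaid  predicted default")
print(f"actual repaid   {tp:^16d}  {fn:^17d}")
print(f"actual default  {fp:^16d}  {tn:^17d}")

curve = roc_points(actual, scores)
auc = area_under_curve(curve)
print(f"AUC = {auc:.2f}")
```

Each point on the curve corresponds to one threshold, so the curve shows the trade-off between catching repaid loans (true positive rate) and wrongly accepting defaulted ones (false positive rate). An area of 0.5 corresponds to random guessing and 1.0 to perfect separation.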

Submission
Check the course website on Canvas for the submission deadline. Upload your report
via Canvas by the deadline. There is an 8000 word limit on the report and marks will
be deducted at a rate of 10 for every 1000 words by which you exceed it.

You do not need to submit the models that you built, just the report.

You can assume that the client has a good technical understanding of data mining and
statistics, so do not shy away from technical terms in your report. Where you use
them, however, explain what they mean in plain language too. To maximise your
mark, make sure you follow the instructions above and include everything that is
asked for in the report.

Plagiarism
Work which is submitted for assessment must be your own work. All students should
note that the University has a formal policy on plagiarism which can be found at
http://www.quality.stir.ac.uk/ac-policy/assessment.php.

This assignment is worth 50% of the overall grade for the course, and is subject to the
usual grade penalties for late submission. This assignment is set by Kevin Swingler.
You can email questions about it to [email protected].
