Make Up Assignment - Data Science

Uploaded by

lorrainencube175

Assignment Rules

Please ensure that you strictly adhere to the following rules while completing this assignment. Any violation of
these guidelines will result in significant penalties or disqualification.

1. Plagiarism and Cheating

• Plagiarism Check: This assignment will be submitted through Turnitin, and any submission that
shows a similarity index of more than 50% will automatically receive a zero mark. Ensure that you
submit your own work.

• Originality: You are required to produce original work. Copying or paraphrasing substantial portions
of code or content from online sources or other students without appropriate acknowledgment will be
classified as plagiarism.

• Consequences: Plagiarism or cheating in any form, including the sharing or collaboration on this
individual assignment, will result in an automatic zero for the assignment. Further disciplinary action
may follow in accordance with the academic integrity policies of the institution.

2. Submission Guidelines

• No Code Submissions: You must not submit any code as part of your submission. This assignment is
based on your analysis and understanding of the concepts, so I will only be marking your content and
explanations. Focus on demonstrating a clear and deep understanding of the material, with well-
articulated answers.

• Clarity of Submission: Ensure that your work is well-organized, with answers clearly labelled
according to the question number.

3. Referencing

• Harvard Referencing Style: You must use the Harvard referencing style for all citations, including
inline citations. Failure to do so will result in up to 30% off your final mark. All external sources,
including datasets, academic papers, or online resources, must be cited properly.

• Incorrect Referencing: Improper or missing citations will result in up to a 30% deduction from your
overall grade, depending on the severity of the issues.

4. Assessment Criteria

• Content Evaluation: I will be marking you based on the content and depth of your answers. Make
sure your responses demonstrate critical thinking, the ability to apply theoretical concepts to practical
problems, and a comprehensive understanding of the dataset and the questions posed.

5. Code of Conduct

• Deadlines: Late submissions will not be accepted unless prior approval for an extension is granted
based on valid circumstances.

• Honesty and Integrity: Academic integrity is of paramount importance. Ensure that your work reflects
your individual effort and understanding.

By submitting this assignment, you confirm that you have read and understood these rules and agree to comply
with them. Non-compliance will result in academic penalties as specified.

Introduction to Data Science Assignment (100 Marks)

Dataset: Titanic: Machine Learning from Disaster
Time: 36 hours. Due at 12pm on Friday the 18th. No late submissions will be accepted.
Instructions: Use the Titanic dataset from Kaggle to complete the following tasks. Submit a
zipped folder containing your code (in Python), dataset, and a report with answers,
visualizations, and interpretations.
Dataset Link: Titanic - Machine Learning from Disaster (https://www.kaggle.com/c/titanic/overview)

Question 1: k-Nearest Neighbours (k-NN) (12 Marks)

You are tasked with using the k-NN algorithm to predict whether a passenger survived the
Titanic disaster based on features like age, fare, and class.
1.1 Explain the k-NN algorithm and how it can be used to classify passengers. (2)
1.2 How does class imbalance between the number of passengers who survived and those
who did not affect the performance of the k-NN model? (3)
1.3 Explain how varying the value of k could impact the effect of class imbalance. (3)
1.4 Propose a method to address the class imbalance in k-NN. Explain your choice. (4)
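
As an illustration of the ideas behind Question 1 (not a model answer), the following is a minimal pure-Python sketch of k-NN with an optional distance-weighted vote, which is one possible way to soften the pull of the majority class. The rows in `TRAIN`, the helper names, and the choice of min-max scaling are all assumptions made for this example; the values are invented and are not Titanic records.

```python
import math
from collections import Counter

# Hypothetical toy rows: (age, fare, pclass) -> survived (1) or not (0).
# Invented for illustration only; deliberately imbalanced (5 zeros, 3 ones).
TRAIN = [
    ((22.0, 7.25, 3), 0),
    ((38.0, 71.28, 1), 1),
    ((26.0, 7.92, 3), 1),
    ((35.0, 53.10, 1), 1),
    ((35.0, 8.05, 3), 0),
    ((54.0, 51.86, 1), 0),
    ((2.0, 21.07, 3), 0),
    ((27.0, 11.13, 3), 0),
]


def min_max_scaler(rows):
    """Return a function that rescales each feature to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return lambda row: tuple((v - lo) / s for v, lo, s in zip(row, lows, spans))


def knn_predict(train, query, k=3, weighted=False):
    """Classify `query` by a vote among its k nearest training rows."""
    scale = min_max_scaler([x for x, _ in train])
    q = scale(query)
    nearest = sorted((math.dist(scale(x), q), y) for x, y in train)[:k]
    votes = Counter()
    for distance, label in nearest:
        # Plain k-NN gives each neighbour one vote; the weighted variant
        # lets closer neighbours count for more.
        votes[label] += 1.0 / (distance + 1e-9) if weighted else 1.0
    return votes.most_common(1)[0][0]


query = (38.0, 71.28, 1)
plain = knn_predict(TRAIN, query, k=len(TRAIN))
weighted = knn_predict(TRAIN, query, k=len(TRAIN), weighted=True)
```

With k equal to the size of the training set, the unweighted vote always returns the majority class, whereas the distance-weighted vote can still follow the closest neighbours.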

Question 2: Decision Trees (12 Marks)

A decision tree will be used to predict passenger survival using the Titanic dataset.
2.1 Explain how a decision tree works and how it can be used to predict survival. (2)
2.2 Give an example where the depth of the decision tree is less important in decision-
making. (2)
2.3 Provide a scenario where the error rate is less important than the simplicity or
interpretability of the model. (2)
2.4 Discuss the characteristics of the Titanic dataset that make decision trees a suitable model.
(4)
2.5 What is one major challenge in decision tree modelling, and how can it be addressed? (2)
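
To make the splitting idea behind Question 2 concrete, here is a small sketch that scores two candidate splits by weighted Gini impurity, one common criterion for choosing a split. The passengers in `ROWS`, the thresholds, and the function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, feature, threshold):
    """Weighted Gini impurity after splitting on rows[feature] <= threshold."""
    left = [r["survived"] for r in rows if r[feature] <= threshold]
    right = [r["survived"] for r in rows if r[feature] > threshold]
    n = len(rows)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Hypothetical toy passengers (invented values, not real Titanic records).
ROWS = [
    {"pclass": 1, "age": 38, "survived": 1},
    {"pclass": 1, "age": 35, "survived": 1},
    {"pclass": 2, "age": 27, "survived": 1},
    {"pclass": 3, "age": 22, "survived": 0},
    {"pclass": 3, "age": 35, "survived": 0},
    {"pclass": 3, "age": 4, "survived": 0},
]

parent = gini([r["survived"] for r in ROWS])   # impurity before any split
by_class = split_impurity(ROWS, "pclass", 2)   # split on pclass <= 2
by_age = split_impurity(ROWS, "age", 30)       # split on age <= 30
```

In this toy data the class split separates the labels perfectly, so a tree builder using this criterion would prefer it over the age split.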

Question 3: Ensemble Learning (12 Marks)

You will explore how ensemble learning techniques, like Random Forests, can improve
predictions on the Titanic dataset.
3.1 Can bagging and feature selection be applied to k-NN classifiers? Discuss any challenges.
(4)
3.2 Discuss the interpretability of Random Forest models. Can decision rules be extracted
from them? (3)
3.3 How is majority voting typically used in Random Forests, and how can it be adapted for
regression tasks? (3)
3.4 If some trees in a random forest are less accurate than others, how can you ensure that
majority voting remains fair? (2)
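
The aggregation step discussed in Question 3 can be sketched in a few lines. The per-tree outputs and the weights below are hypothetical, and using validation accuracy as the weight is just one assumption about how uneven tree quality might be handled.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Classification: return the label predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]


def average_prediction(predictions):
    """Regression: replace the vote with the mean of the trees' outputs."""
    return mean(predictions)


def weighted_vote(predictions, weights):
    """Let each tree's vote count in proportion to its weight."""
    totals = Counter()
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return totals.most_common(1)[0][0]


# Hypothetical outputs from five trees for one passenger.
tree_labels = [1, 0, 1, 1, 0]
# Hypothetical per-tree validation accuracies used as vote weights.
tree_weights = [0.55, 0.90, 0.52, 0.51, 0.88]
# Hypothetical fare predictions from three regression trees.
tree_fares = [10.0, 12.0, 14.0]

label_plain = majority_vote(tree_labels)
label_weighted = weighted_vote(tree_labels, tree_weights)
fare_estimate = average_prediction(tree_fares)
```

Here the plain vote and the weighted vote disagree, because the two trees voting 0 carry more weight than the three voting 1.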

Question 4: Neural Networks and Perceptrons (14 Marks)

Explore how neural networks can be applied to predict passenger survival.
4.1 What is a perceptron, and how could it be used to classify passengers? (3)
4.2 Compare step functions with smooth activation functions like sigmoid. What are the
advantages of smooth activation functions? (3)
4.3 What is the purpose of hidden layers in a neural network? (2)
4.4 Neural networks are often referred to as “black boxes.” Why is that? Does this apply to
perceptrons? (3)
4.5 Compare the perceptron and Support Vector Machine (SVM) in terms of classification
tasks. (3)
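
For Question 4, a perceptron with a step activation and the classic error-driven update rule can be sketched as follows. The two binary features (`is_female`, `is_first_class`), the four samples, and the learning rate are assumptions made for this example; the labels were constructed to be linearly separable so that training converges.

```python
def step(z):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, x):
    """Weighted sum of the inputs plus bias, passed through the step function."""
    return step(sum(w * v for w, v in zip(weights, x)) + bias)


def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron learning rule: nudge weights by lr * error * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * v for w, v in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Hypothetical samples: (is_female, is_first_class) -> survived.
SAMPLES = [
    ((1, 1), 1),
    ((1, 0), 1),
    ((0, 1), 0),
    ((0, 0), 0),
]

weights, bias = train_perceptron(SAMPLES)
predictions = [predict(weights, bias, x) for x, _ in SAMPLES]
```

Because the learned model is a single weighted sum, the final `weights` and `bias` can be read directly, which is relevant to the "black box" discussion in 4.4.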

Question 5: Regression Analysis (20 Marks)

Perform regression analysis using the Titanic dataset to predict the fare a passenger paid.
5.1 What is regression analysis, and how does it differ from classification? (3)
5.2 Perform a linear regression to predict the fare using features like age, class, and
embarkation point. Interpret the results. (6)
5.3 Explain the concept of a correlation matrix. What relationships can you observe between
variables in the dataset? (3)
5.4 Interpret the regression output (coefficients, R²). What does it tell you about the model's
effectiveness? (4)
5.5 Discuss the problem of overfitting in regression models and how it can be avoided. (4)
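
The quantities named in Question 5 (coefficients and R²) can be computed by hand for a single predictor. This is a deliberately reduced sketch: it regresses fare on passenger class only, with invented values, whereas the task itself asks for several features and would normally be done with a statistics or machine-learning library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data (invented, not Titanic records).
pclass = [1, 1, 2, 2, 3, 3]
fare = [80.0, 60.0, 30.0, 20.0, 10.0, 8.0]

slope, intercept = fit_line(pclass, fare)
r2 = r_squared(pclass, fare, slope, intercept)
```

A negative slope here means the fitted fare falls as the class number rises, and R² close to 1 would indicate that class alone explains most of the variation in this toy sample.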

Question 6: Clustering (20 Marks)

Clustering will be used to group passengers based on similar characteristics.
6.1 Explain k-means clustering and how you could group passengers based on features like
age, class, and fare. (3)
6.2 Perform k-means clustering on the Titanic dataset. Determine the number of clusters and
describe them. (5)
6.3 Visualize the centroids of the clusters and explain what patterns you observe. (4)
6.4 How would you evaluate the quality of your clusters? Discuss methods for cluster
validation. (4)
6.5 Provide a real-world application of clustering in the context of the Titanic dataset, and
explain its potential use. (4)
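
The k-means loop referred to in Question 6 alternates between assigning points to the nearest centroid and moving each centroid to the mean of its members. The sketch below assumes features already scaled to [0, 1], uses invented points, and initialises the centroids from the first k points purely to keep the example deterministic; real analyses usually use a randomised initialisation.

```python
import math


def kmeans(points, k, iterations=100):
    """Basic k-means: returns (centroids, assignments)."""
    centroids = [points[i] for i in range(k)]  # naive deterministic init
    assignments = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignments


def inertia(points, centroids, assignments):
    """Within-cluster sum of squared distances (lower means tighter clusters)."""
    return sum(math.dist(p, centroids[a]) ** 2 for p, a in zip(points, assignments))


# Hypothetical (scaled_age, scaled_fare) pairs, invented for illustration.
POINTS = [
    (0.10, 0.10), (0.90, 0.90), (0.15, 0.20),
    (0.20, 0.10), (0.85, 0.80), (0.95, 0.85),
]

centroids, assignments = kmeans(POINTS, k=2)
spread = inertia(POINTS, centroids, assignments)
```

Computing `inertia` for several values of k is the basis of the elbow method, one of the validation approaches relevant to 6.4.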

Question 7: Model Comparison and Overfitting (10 Marks)

Compare the models you've used in this assignment in terms of performance and overfitting.
7.1 Compare the strengths and weaknesses of k-NN, Decision Trees, and Neural Networks
for the Titanic dataset. (5)
7.2 Discuss how each model can be prone to overfitting. What techniques would you use to
address overfitting for each model? (5)

Good luck
