Applied Modelling and Visualisation
Please use this document as the cover sheet for the 1st page of your assessment.
I hereby declare that I have read and understood BPP’s regulations on plagiarism and that this is my
original work, researched, undertaken, completed and submitted in accordance with the requirements
of BPP School of Business and Technology.
The word count, excluding contents table, bibliography and appendices, is ______ words.
By submitting this coursework you agree to all rules and regulations of BPP regarding assessments
and awards for programmes.
Please note that by submitting this assessment you are declaring that you are fit to sit this
assessment.
BPP University reserves the right to use all submitted work for educational purposes and may
request that work be published for a wider audience.
• Your summative assessment for this module is made up of this 2,500-word submission, which
accounts for 100% of the marks.
• Please note late submissions will not be marked.
• You are required to submit all elements of your assessment via Turnitin online access. Only
submissions made via the specified mode will be accepted; hard copies and any other digital
form of submission (such as email or pen drive) will not be accepted.
• For coursework, the submission word limit is 2,500 words. You must comply with the word
count guidelines. You may submit LESS than 2,500 words but not more. Word Count guidelines
can be found on your programme home page and the coursework submission page.
• Do not put your name or contact details anywhere on your submission. You should only put
your student registration number (SRN) which will ensure your submission is recognised in the
marking process.
• A total of 100 marks are available for this module assessment, and you are required to achieve
a minimum of 50% to pass this module.
• You are required to use only the Harvard Referencing System in your submission. Any content which
has already been published by other author(s) and is not referenced will be considered a case of
plagiarism.
You can find further information on Harvard Referencing in the online library on the VLE. You can
use the following link to access this information: http://bpp.libguides.com/Home/StudySupport
• BPP University has a strict policy regarding authenticity of assessments. In proven instances of
plagiarism or collusion, severe punishment will be imposed on offenders. You are advised to
read the rules and regulations regarding plagiarism and collusion in the GARs and MOPP which
are available on VLE in the Academic registry section.
• You should include a completed copy of the Assignment Cover sheet. Any submission without
this completed Assignment Cover sheet may be considered invalid and not marked.
Source: https://stock.adobe.com/uk/images/landing-at-sunset/82605693
For this assignment you are working as a Data Analytics Consultant for Marjanta Airlines and
have been asked to prepare a Consultancy Report based on the airline’s passenger ‘satisfaction’ Data
Set. This report and your findings will be used in a ‘visually appealing’ presentation to the CEO,
Senior Flight personnel and Cabin Crew at the Annual Staff Conference, and it has been proposed
that some interactive elements be placed securely on the company intranet.
Summative Submission
You are provided with a set of data MARJANTA_DATA_CW3.csv that summarises the levels of
passenger ‘satisfaction’. The file contains over 103,000 rows of information from the UK National
Airlines database system for the current calendar year. Your objective is to use machine learning
principles to model and visualise key data with a view to helping staff better understand what
factors impacted levels of ‘satisfaction’ for passengers using the airline. Each feature is listed below:
Satisfied: Y = Satisfied; N = Unsatisfied
Age: Number
Type of Travel
Class
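As an illustration of the extract and transform steps for the features listed above, the following is a minimal sketch only. It assumes the column names match the feature list; the sample rows, the category values (such as "Business travel" and "Eco") and the helper name load_and_clean are hypothetical and stand in for MARJANTA_DATA_CW3.csv. Median imputation of Age is one possible choice, not a required one.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for MARJANTA_DATA_CW3.csv.
# Column names follow the feature list; all values are invented.
SAMPLE_CSV = """Satisfied,Age,Type of Travel,Class
Y,34,Business travel,Business
N,52,Personal Travel,Eco
N,52,Personal Travel,Eco
Y,,Business travel,Eco Plus
N,27,Personal Travel,Eco
"""


def load_and_clean(source):
    """Load the CSV, drop exact duplicate rows and fill missing ages."""
    df = pd.read_csv(source)
    # Remove rows that are exact copies of an earlier row.
    df = df.drop_duplicates().reset_index(drop=True)
    # Assumption: fill missing ages with the median age.
    df["Age"] = df["Age"].fillna(df["Age"].median())
    # Encode the target as 1 (satisfied) / 0 (unsatisfied) for modelling.
    df["Satisfied"] = df["Satisfied"].map({"Y": 1, "N": 0})
    return df


df = load_and_clean(io.StringIO(SAMPLE_CSV))
# Descriptive statistics for every column, numeric or categorical.
print(df.describe(include="all"))
```

With the real file, the same function could be called with the path to the CSV in place of the in-memory sample.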
Your summative submission should be a written report in MS Word format (NOT a PDF file) and
should be at most 2,500 words. It should describe how applied modelling and visualisation can be
used to present summaries of passenger data. Your report will inform a corporate presentation so
should be appropriately tailored to a rich and varied audience consisting of CEO, Senior Flight
personnel and Cabin Crew. You are also required to carry out independent research into the
different categories of ‘satisfaction’ and techniques used to analyse and forecast data in your report.
The solution must use two analytical models to predict the scale and accuracy of the airline’s data
using the Python programming language and relevant Python libraries taking into consideration the
following guidance notes.
• Logistic regression
• Decision Tree
• Bagging
• Random Forest
• AdaBoost
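As a rough sketch of how two models from the list above might be fitted and compared with scikit-learn, the example below uses Logistic regression and Random Forest. The choice of these two, the synthetic data from make_classification (standing in for the encoded passenger features) and all parameter values are assumptions for illustration, not a prescribed solution.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded passenger features and the Y/N target.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=0
)

# Hold out 25% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Two candidate models from the list; the settings are illustrative only.
models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Accuracy on the held-out rows, not on the training data.
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Accuracy is only one metric; precision, recall and the confusion matrix may give a fuller picture when the two classes are unbalanced.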
(ILO2 – Critically evaluate the use of algorithms and models when developing analytical solutions)
Task 2: Critically analyse the two models chosen for your solution in Task 1 (ILO2)
Critically analyse the two models chosen for your solution in Task 1, and in particular, the strengths
and limitations of each model using the guidance notes provided below with references to the
relevant literature.
(ILO3 – Critically appraise the concepts, tools and techniques for data visualisation)
Task 3: Communicate your findings supported by several outputs from Task 1 (ILO3)
Communicate your findings supported by several outputs from Task 1, including graphical outputs
such as correlation matrix, heat map, and confusion matrix using the guidance notes provided
below.
✓ An analysis of how the Exploratory Data Analysis (EDA) output guided your selection of the
analytical models
✓ An explanation of the justification for performing EDA and the use of appropriate descriptive
statistics and visualisations to understand the results of that analysis
✓ A recommendation of the use of one model for sustaining or increasing the rate of ‘satisfaction’
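The confusion matrix and correlation matrix mentioned in Task 3 can be computed as plain tables before they are plotted. The sketch below assumes invented labels and feature values purely for illustration; the variable names are hypothetical.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Invented actual and predicted labels (1 = satisfied, 0 = unsatisfied).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes,
# both in the order given by `labels`.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
cm_table = pd.DataFrame(
    cm,
    index=["Actual N", "Actual Y"],
    columns=["Predicted N", "Predicted Y"],
)
print(cm_table)

# Invented numeric features for a Pearson correlation matrix.
features = pd.DataFrame(
    {"Age": [23, 35, 47, 51, 62], "Satisfied": [0, 1, 1, 0, 1]}
)
corr = features.corr()
print(corr.round(2))
```

Either table could then be passed to a plotting function such as seaborn's heatmap to produce the graphical version for the presentation.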
Your report should include a list of references used to develop the report and research to support
the suggested approach. The list should use only the Harvard Referencing System as highlighted in
the General Assessment Guidance section of this document. All the figures/tables used in the report
must have captions and, wherever needed, properly referenced, and explained in your submission.
Table of Contents
Recommendations
Next steps
References
Appendix
Locate the report file and embed your Pre-run Python notebook. If you are unable to
embed your Python notebook in your MS Word document for any reason, you must provide
a shared link to the file. This is easily done within Google Colab by selecting the ‘Share’
button in the top right-hand corner of the screen.
IMPORTANT: If you do not embed your notebook or provide a link, you will lose marks.
Modelling and Visualisation

30%: Formulate data-driven solutions (ILO1)

Guidelines:
• Adopt an appropriate management framework (e.g. PPDAC or CRISP-DM or SDLC)
• Perform an Extract, Transform, and Load (ETL) process
• Perform Exploratory Data Analysis (EDA)
• Use TWO analytical models for analysis
• Produce appropriate visualisations of results

Fail (0-39%)
Notebook fails to execute, fails to display the options, or halts during execution.
Inadequate and often implicit knowledge base with some omissions and/or lack of theory relating to the use of ETL processes. No discussion of ambiguities, assumptions or anomalies.
Notebook fails to produce any outputs which can be used to communicate your findings.

Marginal Fail (40-49%)
Notebook correctly loads the input data file into a Python data structure. No comments are given on the method used.
Notebook uses a package to conduct EDA, as well as comparisons of the outputs of the appropriate model outcomes and metrics but with no explanation or comments.
Weak and often implicit knowledge base with some omissions and/or lack of theory of the use of modelling and visualisation for a data project (and relevant code libraries).
Notebook correctly uses a package to produce communication tools but does not contain any explanation or commentary.

Pass (50-59%)
Notebook correctly loads the input data file into a Python data structure. Comments are given on the approach taken.
Notebook correctly handles duplicate values as well as EDA. Comments are given.
The script achieves prediction for the ‘satisfaction’ likelihood and also correctly outputs appropriate model outcomes and metrics with reasonable level of commentary and explanation.
Notebook correctly uses a package to produce communication tools with reasonable explanations and comments.

Merit (60-69%)
Notebook correctly loads the input data file into a Python data structure. Comments and explanations are given with detail on the extract phase of the project.
Notebook handles duplicate values, missing values as well as descriptive statistics explaining the steps taken to reach the results.
Notebook also achieves prediction for the ‘satisfaction’ Likelihood with good explanation and comments about the method used. There are model evaluation metrices outputted alongside predictions.
Notebook correctly uses a package to produce communication tools with good explanation and comments about the method used.

Distinction (70-79%)
Notebook correctly loads the input data file into a Python data structure. The comments provided cover technical details of the extract phase of the project, demonstrating extensive knowledge on dataframe imports.
Notebook handles duplicate values, missing values and explains in detail the steps taken to reach the results.
Correctly uses a package to achieve prediction for the ‘satisfaction’ likelihood and outputs the appropriate model outcomes and metrics. Explanations are detailed and profound.
Notebook correctly uses a package to produce communication tools, with very detailed explanation and comments about the model output and your chosen method of communication conveys this.

High Distinction (80-100%)
Notebook correctly loads the input data file into a Python data structure in a modular fashion. The comments provided cover exceptional technical details of the extract phase of the project, demonstrating extensive knowledge on dataframe imports and their peculiarities.
Notebook handles duplicate values, handles missing values, correctly uses a package to achieve prediction for the future trends and outputs the appropriate model outcomes, metrics as well as an example of the prediction in action for a new mock entries and scenarios. Comments provided are profound in detail.
Explain in detail the steps taken to reach the results with further explanation of methods to expand the steps taken or process followed. Also explains rationale behind the methods used.
Notebook correctly uses a package to produce communication tools with very detailed explanation and comments about the method used including examples of similar practices and suggestions to further enhance the communication of results.