Wia1007 - Group Assignment
Data science is the domain of study that deals with vast volumes of data using modern
tools and techniques to find unseen patterns, derive meaningful information, and make
business decisions. The first task of a data scientist is to understand the objectives and
requirements of the project. It is important to determine the business objectives and define
the business success criteria. In other words, what should the project try to achieve? The
following diagram shows a data science process flow using the OSEMN (Obtain, Scrub,
Explore, Model, iNterpret) framework.
As a data scientist, you may do the following tasks: i) discover patterns and trends in
datasets to gain insights; ii) create forecasting algorithms and data models; iii) improve
the quality of data or product offerings by utilising suitable data science (DS) tools and
machine learning techniques, and ultimately become a leader in the field of data science
innovation.
Domain areas that you can work on are as follows:
- Education (Higher Education)
- Government
- Health Care
- Tourism
- Environmental
- E-Commerce
- Human Resources
- Transportation
- Financial
- Others (please specify)
This proposal will lead to the development of a model and a data product. You may
choose a case study (scenario) from your reading, or an actual scenario or event based on
your working experience, before you come up with your proposal. A case study will also
help you to write your problem statement clearly, as a case study discusses the
problems faced by a business. In addition, a case study will help you to identify why a
business needs to analyze the dataset and gain insight from it.
The tasks for Group Assignment 1 (GA1) should include the first three stages of the OSEMN
framework:
1. Project Background - Description of the data science project (suitable for which
organization, target users, potential benefits, etc.)
2. Problem Statement
3. Project Objectives
4. Project Scope
5. Literature Study / Information Gathering Analysis
6. Description of Methodology
   a. Obtain – Types of data collected, sources, reliability
   b. Scrub – Processes used to clean the dataset, types of imputation
   used, etc.
   c. Explore – Exploratory data analysis (EDA) to investigate the data for
   anomalies and to check assumptions using statistics and graphical
   representations
7. Impact of the Project on Society
8. References
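As an illustration of item 6, the Obtain, Scrub and Explore stages can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the dataset, the column names (`student_id`, `study_hours`, `exam_score`) and the helper functions `obtain`, `scrub` and `explore` are all made up for this example, median imputation is just one of several possible imputation methods, and the 1.5 x IQR rule is one common convention for flagging possible anomalies. Only the Python standard library is used; a real project would typically use a library such as pandas and real data sources.

```python
import csv
import io
import statistics

# Hypothetical raw data (made up for illustration). In a real project this
# would come from a file, an API, or a database.
RAW_CSV = """student_id,study_hours,exam_score
S01,5,62
S02,,70
S03,8,81
S04,6,
S05,7,75
S06,40,78
"""


def obtain(text):
    """Obtain: read the raw records into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def scrub(rows, columns):
    """Scrub: convert values to numbers and impute missing ones with the column median."""
    cleaned = [dict(row) for row in rows]
    for col in columns:
        present = [float(r[col]) for r in cleaned if r[col] != ""]
        median = statistics.median(present)
        for r in cleaned:
            r[col] = float(r[col]) if r[col] != "" else median
    return cleaned


def explore(rows, col):
    """Explore: summary statistics plus IQR-based flags for possible anomalies."""
    values = [r[col] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "outliers": [r["student_id"] for r in rows if not low <= r[col] <= high],
    }


rows = scrub(obtain(RAW_CSV), ["study_hours", "exam_score"])
summary = explore(rows, "study_hours")
print(summary)
```

In this made-up dataset, the two missing values are filled with the column medians, and the record with 40 study hours is flagged as a possible anomaly. Whether such a record is an error or a genuine extreme value is a judgement the report should explain.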
Method & Submission for GA1:
- Submit the report in week 8 (before the lecture hour) in softcopy form. Prepare
PowerPoint slides for a 7-minute presentation.
- The content should include a basic explanation of each task and any related points that
are suitable.
- You may include related diagrams, charts, and any supporting material in your report.
- Report format:
o 1.5 spacing, Arial 11, maximum of 15 pages excluding the cover page and
attachments
- Your report should include a cover page with your group details (name, matric no., and
topic).
No.  Criteria               Marks
2    Literature Analysis    20
3    3 Stages of OSEMN      30
5    Group Commitment       10
     Total                  100
Group Assignment 2 (20%)
Following on from Assignment 1, the second part of the assignment covers modelling and
interpreting the data. In the fourth step (Model), we use analytic techniques to help
make sense of the data and acquire important insights for data-driven
decision-making. This phase is, as many people would call it, “where the magic
happens”.
For instance, regression is used to predict or forecast future values,
and classification identifies and groups the values obtained from the dataset.
- Submit the report in week 14 (during the lecture hour) in softcopy form. Prepare
PowerPoint slides for a 7-minute presentation.
- The content should include a basic explanation of each task and any related points that
are suitable.
- You may include related diagrams, charts, and any supporting material in your report.
- Format:
o 1.5 spacing, Arial 11, maximum of 15 pages excluding the cover page and
attachments
- Your report should include a cover page with your group details (name, matric no., and
topic).
No.  Criteria               Marks
1    Data Modelling         20
2    Data Interpretation    20
3    Data Product           20
6    Group Commitment       10
     Total                  100