IDS Unit - 5
Uploaded by Vrindapareek

Unit – 5

Q. 1 What is a Framework in Data Science?


A framework in data science is a set of software tools that helps you execute data
science techniques on your business data to get the insights that drive your
decisions.
Data science is the art of turning data into actions, and the overall framework
consists of the following 7 high-level steps:

Ask > Acquire > Assimilate > Analyze > Answer > Advise > Act

1. Asking Questions : Data science begins by asking questions, the answers to
which are then used to advise and act.
2. Acquiring and Assimilating Data : Once a series of questions has been asked,
a data scientist will try to acquire the required data and assimilate it into a
usable form.
3. Analyzing Data : Once the data has been collected and cleaned, we are ready
to start the analysis or conduct data mining.
4. Answering Questions with Data : Once we are able to build a model with the
desired performance, we can answer the questions with data using that model.
5. Advise and Act : Another very important part of data science is the advise
stage. After understanding the data, a data scientist has to provide
actionable advice.
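The seven steps above can be sketched as a simple pipeline of functions. This is only an illustrative sketch: the question, the sample records and the function names are invented for the example.

```python
# A minimal sketch of the Ask > Acquire > Assimilate > Analyze > Answer >
# Advise > Act framework. All data here is hard-coded for illustration.

def ask():
    # Step 1: frame the business question
    return "Which product line drives the most revenue?"

def acquire():
    # Step 2: pull raw records (here, an invented sample of sales)
    return [("A", 120), ("B", 300), ("A", 80), ("B", 150)]

def assimilate(raw):
    # Step 3: reshape the raw records into a usable form (totals per line)
    totals = {}
    for product, revenue in raw:
        totals[product] = totals.get(product, 0) + revenue
    return totals

def analyze(totals):
    # Step 4: analysis, here just finding the top product line
    return max(totals, key=totals.get)

def answer(question, best):
    # Step 5: answer the original question with data
    return f"{question} -> {best}"

def advise_and_act(best):
    # Steps 6 and 7: turn the finding into actionable advice
    return f"Increase marketing focus on product line {best}"

question = ask()
totals = assimilate(acquire())
best = analyze(totals)
print(answer(question, best))
print(advise_and_act(best))
```

In a real project each step would of course be far richer, but the shape of the flow is the same: each stage consumes the previous stage's output.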

Q. 2 What are the various components of a Data Science Project?

1. Purpose : Just as in the classic approach to project management, a goal or
purpose should always be formulated.
2. People : Various types of people with different skill sets play an important
role within a data science project. In order to work successfully with data,
developers, testers, data scientists and domain experts are essential.
3. Processes : There are two main types of processes within data science
projects: organizational and technical processes.
4. Platforms : Besides the above-mentioned factors, fundamental and strategic
questions about which platforms you will use for your analytics and products
are also critical for successfully managing a data science project.
5. Programmability : Finally, you want to think about which tools and
programming languages you want to use.

Q. 3 What is Process Evaluation?


1. Process evaluations can help decision-makers understand if their program
is running as expected.
2. They help to examine whether planned program operations and
achievements have taken place in accordance with an organization’s
expectations.
3. We would deploy a process evaluation to investigate whether aspects of a
program are working as planned.
4. It helps clients to identify which areas need close monitoring.
5. It also helps to anticipate whether a program will run smoothly when
scaled up.
6. They help teams identify where decisions or actions need to be updated
to achieve their analysis goals.

Q. 4 What methodology is used in a Data Science Project?


1. Business Understanding : Before solving any problem in the business domain,
the problem needs to be understood properly. Business understanding forms a
concrete base, which in turn leads to easy resolution of queries.

2. Analytic Understanding : Based on the business understanding, one should
decide the analytical approach to follow. The approaches can be of 4 types:
the descriptive, diagnostic, predictive and prescriptive approaches.

3. Data Requirements : The chosen analytical method indicates the necessary
data content, formats and sources to be gathered.

4. Data Collection : Collected data can arrive in any random format, so
according to the approach chosen and the output to be obtained, the collected
data should be validated.

5. Data Understanding : Data understanding answers the question "Is the data
collected representative of the problem to be solved?". Descriptive statistics
are applied over the data to assess its content and quality.

6. Data Preparation : Here, noise removal is done; if we do not need specific
data, we should not carry it into further processing. This process includes
transformation, normalization, etc.

7. Modelling : Modelling decides whether the data prepared for processing is
appropriate or requires further refinement.

8. Evaluation : Model evaluation is done during model development. It checks
the quality of the model and whether it meets the business requirements.

9. Deployment : The deployment phase checks how well the model withstands the
external environment and whether it performs better than alternatives.

10. Feedback : Feedback serves the necessary purpose of refining the model and
assessing its performance and impact.
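Two of the steps above, data understanding via descriptive statistics and data preparation via noise removal and normalization, can be illustrated with a short sketch using only Python's standard library. The sample readings are invented for the example.

```python
# A hedged sketch of "Data Understanding" and "Data Preparation":
# descriptive statistics to assess content and quality, then cleaning
# and min-max normalization. The values below are invented sample data.
import statistics

raw = [12.0, 15.5, 14.2, None, 13.8, 15.1]  # None marks a missing reading

# Data preparation: noise removal (drop the missing value)
clean = [x for x in raw if x is not None]

# Data understanding: descriptive statistics over the cleaned data
mean = statistics.mean(clean)
stdev = statistics.stdev(clean)
print(f"mean={mean:.2f}, stdev={stdev:.2f}")

# Data preparation: min-max normalization into the range [0, 1]
lo, hi = min(clean), max(clean)
normalized = [(x - lo) / (hi - lo) for x in clean]
print(normalized)
```

In practice a library such as pandas or scikit-learn would be used for these steps, but the underlying operations are the same.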
Q. 5 Case Study of Industry Use of Data Science.

1. Data Science Case Study – Spotify


Music plays an important role in the lives of people of almost all age
groups. We frequently listen to our favorite songs in our daily routine such
as while traveling, in leisure time, etc. to release our stress and relax.
Today, there are many music playing applications in the market.

You all might have heard the name “Spotify” at least once and most
probably, you might have even used it. So you must have observed that as
soon as we start using it on a regular basis, it starts giving us personalized
music recommendations and options to create customized playlists. This is
what people like about it.
But how does Spotify do all this? The answer is “data”.

At the core of these personalized services lies a large amount of user data
that Spotify, and in fact most music-playing applications, are using.

Spotify is using this data for optimizing their algorithms, improving user
music experience, providing targeted ads, and making some good business
strategies.

Spotify's main goal is to provide every user an experience so good that they
continue listening for hours. To achieve this, it uses many advanced data
science and machine learning techniques to extract insights from user data and
match the music taste of each individual customer.
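As a toy illustration of the idea (not Spotify's actual algorithm), a recommender can suggest tracks by finding the user with the most similar listening history and proposing what that user listens to. All user and track names here are hypothetical.

```python
# Toy collaborative-filtering sketch: recommend tracks from the most
# similar user's listening history. Users and tracks are invented.

listens = {
    "alice": {"song_a", "song_b", "song_c"},
    "bob":   {"song_b", "song_c", "song_d"},
    "carol": {"song_x", "song_y"},
}

def recommend(user):
    mine = listens[user]

    def jaccard(other):
        # Similarity of two users: shared listens over combined listens
        theirs = listens[other]
        return len(mine & theirs) / len(mine | theirs)

    # Find the most similar other user...
    best = max((u for u in listens if u != user), key=jaccard)
    # ...and recommend their tracks that this user has not heard yet
    return sorted(listens[best] - mine)

print(recommend("alice"))
```

Real systems work at a vastly larger scale and combine many signals (audio features, playlists, context), but similarity over listening histories is one of the core intuitions.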

2. Data Science Case Study – LinkedIn


LinkedIn is one of the most successful social media platforms, connecting
professionals across the globe. It also uses customer data to provide better
services and a customized user experience.

LinkedIn stores a large amount of user data, including details such as contact
information, previous history, interests, activities on different social
networking sites, etc., in its data warehouse in order to stay aware of trends
and patterns.
Using the insights gained from this user data, LinkedIn connects individual
users with their friends and with people related to their areas of interest.
These insights also inform decisions regarding the business.

According to the different trends, LinkedIn provides various articles and
other services that might match user interests. LinkedIn also enables users
to promote their business to the right people by making use of targeting.

Also, while using customers' data, LinkedIn makes sure that the data is
secure and that no scraping of data takes place from its site.

Q. 6 Different Data Science Project Methods

Method 1 : Scrum
This approach is the most widely used process framework for agile
development processes. Scrum emphasizes daily communication and
flexible reassessment of plans that are executed in a short period of time.

Method 2 : Kanban
This approach features managing/improving products with a focus on
continuous delivery without overloading the development team.

Method 3 : BEAM
BEAM (Business Event Analysis and Modelling) is an approach to agile
dimensional modelling, with the goal of aligning requirements analysis with
business processes rather than reports.

Q. 7 Challenges in Data Science Project Management

1. Data Quality : Assessing the quality of data is a crucial and fundamental
task in a data-driven project. Approaches to data quality can be chosen based
on certain requirements, such as user-centered and other organizational
frameworks.

2. Data Integration : In general, the method of combining data from various
sources and storing it together to get a unified view is known as data
integration. An organization with inconsistent data is likely to face data
integration issues.
3. Dirty Data : Data which contains inaccurate information is said to be dirty
data. Removing all dirty data from a dataset is virtually impossible, so
depending on the severity of the errors, strategies to work with dirty data
need to be implemented.

4. Data Uncertainty : Reasons for data uncertainty can range from measurement
errors to processing errors. Known and unknown errors, as well as
uncertainties, should be expected when using real-world data.

5. Data Transformation : Although the whole of the data can be transformed
into a usable form, some things can still go wrong in an ETL project, such as
an increase in data velocity or the time cost of fixing broken data
connections.
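One common strategy for working with dirty data is to validate each record against simple rules and quarantine the failures, rather than trying to remove every error from the dataset. A minimal sketch, with invented field names and validation rules:

```python
# Validate records and separate clean rows from dirty ones.
# The records, field names and rules below are illustrative only.

records = [
    {"name": "Ana", "age": "34"},
    {"name": "",    "age": "29"},   # dirty: missing name
    {"name": "Raj", "age": "abc"},  # dirty: non-numeric age
    {"name": "Mei", "age": "27"},
]

def is_valid(rec):
    # Rule 1: name must be non-empty; Rule 2: age must be numeric
    return bool(rec["name"]) and rec["age"].isdigit()

clean = [r for r in records if is_valid(r)]
dirty = [r for r in records if not is_valid(r)]
print(len(clean), len(dirty))  # prints "2 2"
```

Quarantined rows can then be repaired, re-collected or discarded depending on the severity of the errors, as the section above suggests.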
