Session 2 - Foundations of Data and Information - 2024


PROBLEM SOLVING IN BUSINESS MANAGEMENT
Foundations of Data and Information
M.Sc. Thien Nguyen
Email: [email protected]
Phone: 0949088908
Agenda
1. Types of data
2. Data quality and integrity
3. Data collection
4. Data Analytics
5. Problem Solving Using Data
What is Data Analytics?

How many steps are there in a Data Analytics project?
Review: What is Data Analytics

Data analytics is the process of analysing raw data in order to draw out meaningful, actionable insights.
(Source: https://careerfoundry.com/en/blog/data-analytics/what-is-data-analytics/)

Typical Steps in a Data Analytics Project

Source: https://medium.com/codex/life-cycle-of-a-data-analytics-project-954d0e6926fe
Review: What is Data Analytics

Extracting value from data requires an analytical mindset and some technical skills.

Source: https://vitalflux.com/what-are-actionable-insights-examples-concepts/
Source: https://www.softwaretestinghelp.com/data-analytics-companies/
1. Understanding different types of data
6 Types of Data in Statistics & Research: Key in Data Science

Qualitative:
● Nominal (naming)
● Binary (naming, True/False)
● Ordinal (order)

Quantitative:
● Discrete (separate values)
● Continuous (unbroken values)
● Interval (range)

Source: https://www.intellspot.com/data-types/
Terminology: common data-related terms used in class, for students to read at home
● Data is most straightforward to analyse if it forms a single data table.
● A data table consists of observations and variables.
● Observations are also known as cases.
● Variables are also called features.
● A dataset is a broader concept that potentially includes multiple data tables with different kinds of information to be used in the same analysis.
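
To make these terms concrete, here is a minimal Python sketch of a data table. The column names, values, and the `stores` table are assumptions chosen only for illustration.

```python
# A data table: each dict is one observation (case),
# and each key is one variable (feature).
data_table = [
    {"City": "Ha Noi", "CustomerType": "Member", "UnitPrice": 74.69},
    {"City": "HCM City", "CustomerType": "Normal", "UnitPrice": 15.28},
    {"City": "Ha Noi", "CustomerType": "Normal", "UnitPrice": 46.33},
]

observations = len(data_table)            # number of rows (cases)
variables = list(data_table[0].keys())    # column names (features)

# A dataset may bundle several data tables used in the same analysis.
# "stores" is a hypothetical second table, left empty here.
dataset = {"sales": data_table, "stores": []}

print(observations, variables)
```

Each row answers "who or what was observed", and each column answers "what was recorded about it".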
Observations and variables

Source: https://www.statology.org/observation-in-statistics/
Best practices

Best practices in working with data:

1. Right-align numbers, left-align text

City      Customer Type  Gender  Product Line  Unit Price
Ha Noi    Member         Female             4       74.69
HCM City  Normal         Female             1       15.28
Ha Noi    Normal         Male               5       46.33
Ha Noi    Member         Male               4       58.22
Ha Noi    Normal         Male               6       86.31
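
The alignment rule can be sketched with Python's string formatting. The function name `format_row` and the column widths are assumptions for illustration, not part of the course material.

```python
rows = [
    ("Ha Noi", "Member", 74.69),
    ("HCM City", "Normal", 15.28),
    ("Ha Noi", "Normal", 46.33),
]

def format_row(city: str, customer_type: str, unit_price: float) -> str:
    # "<" left-aligns text; ">" right-aligns numbers so that
    # the decimal points line up down the column.
    return f"{city:<10}{customer_type:<15}{unit_price:>10.2f}"

for row in rows:
    print(format_row(*row))
```

Right-aligned numbers are easier to compare at a glance because digits of the same magnitude sit in the same position.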


Best practices

Best practices in working with data:

2. Naming conventions
a. E.g. 1: SalesOrders, ProductID, CountOfOrders
i. Capitalize each word
ii. No spaces (an underscore "_" can be used)
b. E.g. 2: avoid version suffixes such as SalesOrders_v1, SalesOrders_v2; use descriptive suffixes instead
i. SalesOrders_Raw / SalesOrders_Cleaned / SalesOrders_Temp
ii. SalesOrders_Dao / SalesOrders_Minh
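
A naming convention like this can be checked automatically. The sketch below is one possible reading of the rules above; the regular expressions and the function name `is_good_name` are assumptions, not an official standard.

```python
import re

# Assumed rule: one or more capitalized words, optionally followed by
# "_" and a capitalized descriptive suffix (e.g. _Raw, _Cleaned).
NAME_PATTERN = re.compile(r"^(?:[A-Z][a-z0-9]*)+(?:_(?:[A-Z][a-z0-9]*)+)*$")
# Version suffixes such as _v1 or _V2 are the style to avoid.
VERSION_SUFFIX = re.compile(r"_[vV]\d+$")

def is_good_name(name: str) -> bool:
    """Return True if the name follows the assumed convention."""
    return bool(NAME_PATTERN.match(name)) and not VERSION_SUFFIX.search(name)

for candidate in ["SalesOrders", "ProductID", "SalesOrders_Raw",
                  "SalesOrders_v1", "Sales Orders"]:
    print(candidate, is_good_name(candidate))
```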
Best practices

Best practices in working with data:

3. Standard format for Date columns
a. E.g. YYYY-MM-DD or MM/DD/YYYY; use one format consistently within a column
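
As a small sketch, dates stored as MM/DD/YYYY can be converted to YYYY-MM-DD with Python's standard `datetime` module. The function name `to_iso_date` and the sample values are assumptions for illustration.

```python
from datetime import datetime

def to_iso_date(text: str, source_format: str = "%m/%d/%Y") -> str:
    """Parse a date string in source_format and return it as YYYY-MM-DD."""
    return datetime.strptime(text, source_format).strftime("%Y-%m-%d")

raw_dates = ["09/02/2024", "12/31/2023"]   # made-up sample values
print([to_iso_date(d) for d in raw_dates])
```

`strptime` raises a `ValueError` for text that does not match the expected format, which is a convenient way to find inconsistent entries in a date column.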
What exactly is big data?

The definition of big data is data that contains greater variety, arriving in increasing volumes and with more velocity.

Volume: The amount of data matters. For some organizations, this might be tens of terabytes of data. For others, it may be hundreds of petabytes. Example: 1 GB = 1024 MB, 1 TB = 1024 GB, 1 PB = 1024 TB.

Velocity: Velocity is the fast rate at which data is received and (perhaps) acted on. Some internet-enabled smart products operate in real time or near real time and will require real-time evaluation and action.

Variety: Variety refers to the many types of data that are available. Traditional data types were structured and fit neatly in a relational database. With the rise of big data, data comes in new unstructured data types. Unstructured and semi-structured data types, such as text, audio, and video, require additional preprocessing to derive meaning and support metadata.
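
The unit relationships in the Volume example (each unit is 1024 of the previous one) can be worked through in a few lines of Python. The helper name `convert` is an assumption for illustration.

```python
UNITS = ["MB", "GB", "TB", "PB"]   # each unit is 1024 of the one before it
STEP = 1024

def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between MB, GB, TB and PB using steps of 1024."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * STEP ** steps

print(convert(1, "PB", "GB"))    # one petabyte expressed in gigabytes
print(convert(10, "TB", "GB"))   # ten terabytes expressed in gigabytes
```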
2. Importance of data quality and integrity
Data Integrity & Data Quality

Data integrity refers to the accuracy and consistency of data over its lifecycle. When data has integrity, it means it wasn’t altered during storage, retrieval, or processing without authorization.

It’s like making sure a letter gets from point A to point B without any of its content being changed.

Data quality is about how well the data serves its intended purpose. This involves several elements, including accuracy, completeness, consistency, timeliness, and relevance.

If we extend the mail carrier analogy, data quality doesn’t just mean the letter gets to point B, it goes a few steps further. It checks that it’s the right letter, it’s clear and understandable, arrives exactly when it’s needed, and follows a consistent format.

Source: https://www.montecarlodata.com/blog-data-integrity-vs-data-quality/
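
The difference between the two ideas can be sketched in Python. This is a simplified illustration under stated assumptions: integrity is approximated with a SHA-256 checksum, quality with a single completeness score, and the records and field names are made up.

```python
import hashlib
import json

def checksum(records: list[dict]) -> str:
    """Fingerprint the data; any change to the content changes the hash."""
    payload = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def completeness(records: list[dict], required: list[str]) -> float:
    """Share of records in which every required field is filled in."""
    if not records:
        return 0.0
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required)
        for rec in records
    )
    return complete / len(records)

records = [
    {"customer": "Dao", "city": "Ha Noi", "order_date": "2024-09-02"},
    {"customer": "Minh", "city": "", "order_date": "2024-09-03"},
]

stored = checksum(records)
# Integrity: data read back later should hash to the same value.
assert checksum(records) == stored
# Quality: one of the two records is missing its city.
print(completeness(records, ["customer", "city", "order_date"]))
```

Data can pass the integrity check (nothing was altered) and still score poorly on quality (a field was never filled in), which is why the two are treated separately.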
3. Data Collection and Storage
Key Steps in the Data Collection Process

Step 1: Defining the Goal of Research
- What types of products do customers prefer?
- Which colors do customers prefer?
- Are there specific features that customers would like to see included in future products?

Step 2: Choosing Data Collection Method
- Observations
- Interviews and Focus Groups
- Transactional Tracking
- Social Media Monitoring
- Online Tracking
- Surveys
- Forms

Step 3: Planning Data Collection Procedures
Planning includes deciding how you’ll collect the data, who will manage it, when you’ll collect it, and where you’ll collect it.

Step 4: Collecting Data
If you’re conducting a survey, you’ll need to administer the survey to your participants. If you’re doing a case study, you’ll need to observe and interview your participants.

Step 5: Cleaning and Organizing the Data
This step is critical since it will improve the accuracy of your data and make it easier to evaluate. Data will be analyzed and used to discover any patterns and relationships in the data using an algorithm.

Source: https://safetyculture.com/topics/data-collection/
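
The cleaning and organizing step can be sketched as follows for survey answers about preferred colors. The field names, cleaning rules, and sample responses are assumptions for illustration; real projects choose rules that fit their own data.

```python
def clean(responses: list[dict]) -> list[dict]:
    """Trim whitespace, normalise case, drop incomplete rows and duplicates."""
    seen = set()
    cleaned = []
    for row in responses:
        name = (row.get("name") or "").strip().title()
        color = (row.get("color") or "").strip().lower()
        if not name or not color:
            continue                  # incomplete answer
        key = (name, color)
        if key in seen:
            continue                  # duplicate submission
        seen.add(key)
        cleaned.append({"name": name, "color": color})
    return cleaned

raw = [
    {"name": " dao ", "color": "Blue"},
    {"name": "Dao", "color": "blue "},   # same answer, different spelling
    {"name": "Minh", "color": ""},       # missing color
    {"name": "lan", "color": "RED"},
]
print(clean(raw))
```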
4. Data Analytics
II. Introduction to DA
2. Types of DA

Four Types (Levels) of Data Analytics

Descriptive Analytics
- Is a simple, surface-level type of analysis based on historical data to examine, understand, and describe what happened
- Uses BI and visualization tools to summarize the data, or discover trends and patterns
- E.g.: Has the number of customers gone up? Are sales better this month than last?

Diagnostic Analytics
- Tries to uncover causal relationships
- May involve seeking to identify anomalies within the data
- E.g.: Did the latest marketing campaign impact sales?

Predictive Analytics
- Is based on historical data, past trends, and assumptions to predict future outcomes
- Uses machine learning models

Prescriptive Analytics
- Tries to find out and suggest what individuals or organizations should do to obtain future targets/goals
- Uses predictive analytics to show results of different scenarios

Others: cognitive analytics, behavioral analytics, risk analytics...
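
To contrast the descriptive and predictive levels, here is a minimal Python sketch on made-up monthly sales figures. A straight-line trend fitted by ordinary least squares stands in for the machine learning models mentioned above; it is a deliberately simple substitute.

```python
monthly_sales = [120, 135, 150, 160]  # made-up figures, oldest month first

# Descriptive: what happened? Compare the latest month with the previous one.
change = monthly_sales[-1] - monthly_sales[-2]
sales_improved = change > 0

# Predictive: fit a straight line (ordinary least squares) through the
# history and extend it one month ahead.
n = len(monthly_sales)
xs = range(n)
mean_x = sum(xs) / n
mean_y = sum(monthly_sales) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_sales))
    / sum((x - mean_x) ** 2 for x in xs)
)
intercept = mean_y - slope * mean_x
forecast = intercept + slope * n      # estimate for the next month

print(sales_improved, change, forecast)
```

The descriptive part only summarizes the past; the predictive part adds an assumption (that the trend continues) in order to say something about the future.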


II. Introduction to DA
2. Types of DA

Three types of knowledge from data (not only insights):

➤ Hindsight: the ability to learn from the past
➤ Insight: the ability to understand and respond to what is happening at present
➤ Foresight: the ability to predict/forecast and prepare for the future

Source: https://www.linkedin.com/pulse/hindsight-insight-foresight-key-ingredients-effective-woods
To succeed, we need all three!

Source: https://www.linkedin.com/pulse/hindsight-insight-foresight-patrick-mcdonald
5. Problem Solving Using Data
WHY Solve Problems Using DATA?

Organizational leaders frequently make business decisions based on personal expertise and instinct (e.g., over 50% of Americans usually make decisions based on their gut; The Conversation, 2015).

Business data analytics removes cognitive and personal biases from the decision-making process by using data as the primary input for decision-making.

When performed well, business data analytics can create a competitive advantage for the organization.
WHAT Problems to Solve Using DATA?

1. DESCRIBING & PREDICTING
An advertising company wants to know the best method to acquire new customers.

2. DETECTING ABNORMALITIES
A fitness tracker company wants to help customers spot something unusual with their health, e.g., resting heart rates.

3. CATEGORIZING
A retailer wants to have customized promotions for different customers.
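
As an illustration of detecting abnormalities, the sketch below flags resting heart rate readings that sit far from the average. The z-score rule, the threshold of 2, and the readings are assumptions for illustration; a real fitness tracker would use more careful methods.

```python
from statistics import mean, stdev

def find_unusual(values: list[float], threshold: float = 2.0) -> list[float]:
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []                     # all readings identical, nothing unusual
    return [v for v in values if abs(v - mu) / sigma > threshold]

resting_heart_rates = [61, 62, 60, 63, 61, 62, 60, 95]  # made-up readings (bpm)
print(find_unusual(resting_heart_rates))
```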

WHAT Problems to Solve Using DATA?

4. IDENTIFYING TRENDS & THEMES
A social media company wants to detect and promote trending content on its platform.

5. DISCOVERING CONNECTIONS
A beauty company wants to cross-sell its products based on past customer purchases.

6. FINDING PATTERNS
A supermarket wants to learn more about its customers' shopping frequency and purchasing behaviors throughout the week.
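
Discovering connections for cross-selling can start with simply counting which products are bought together. The sketch below is a minimal version of that idea; the product names, baskets, and the function name `pair_counts` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets: list[set[str]]) -> Counter:
    """Count how often each pair of products appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

baskets = [
    {"cleanser", "toner"},
    {"cleanser", "toner", "sunscreen"},
    {"toner", "sunscreen"},
    {"cleanser", "toner"},
]
print(pair_counts(baskets).most_common(1))
```

Pairs that appear together most often are natural candidates for a "customers also bought" suggestion.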

How to SOLVE PROBLEMS Using DATA?

STEP 1 – PROBLEM STATEMENT
STEP 2 – DATA WRANGLING
STEP 3 – DATA ANALYSIS
STEP 4 – DATA VISUALIZATION
STEP 5 – COMMUNICATION
Group Activities
THANK YOU
