
Descriptive Analytics

Uploaded by

Micah Guinucud
Copyright
© © All Rights Reserved

Descriptive Analytics

Dianna Jean A. Gayo, LPT, MST-Math
Instructor
LEARNING OBJECTIVES:

• Explain the key principles of Descriptive Analytics; and
• Apply descriptive analytics techniques to a given dataset using MS Excel
DATA ANALYTICS
• Data analytics focuses on processing and performing statistical
analysis of existing datasets. Analysts concentrate on creating
methods to capture, process, and organize data to uncover
actionable insights for current problems, and establishing the
best way to present this data.
• The field of data and analytics is directed toward solving
problems for questions we know we don’t know the answers
to. It’s based on producing results that can lead to immediate
improvements.
UNIVARIATE ANALYSIS:

Univariate analysis involves the examination of cases of one variable at
a time. There are three major characteristics of a single variable that
we tend to look at:
• the distribution
• the central tendency
• the dispersion
1. DESCRIPTIVE ANALYTICS
Descriptive analytics are used to describe the basic
features of the data in a study.
They provide simple summaries of the sample and
the measures. Together with simple graphics
analysis, they form the basis of virtually every
quantitative analysis of data.
Examples:
• The number of students enrolled in the Education department
• The mean age of students in BTLED program
Commonly used descriptive analytics measures:

Measure              Description
Frequency            Number of items
Percentage           A portion of a whole
Range                Distance between the highest and lowest data values
Mean                 Average
Mode                 Most frequent data value
Median               Middlemost data value
Standard deviation   Dispersion of a dataset relative to its mean
Descriptive research questions:

Example 1: What are the demographic characteristics of voters in Santiago City
in terms of:
1.1 Sex
1.2 Marital status
1.3 Age
1.4 Monthly income

Sex and marital status are categorical variables. Summarize them with:
• Frequency (e.g., number of male and female voters)
• Percentage (e.g., % of male and female voters)

Age and monthly income are numerical (ratio) variables. Summarize them with:
• Mean (average)
• Median (middlemost score)
• Range
• Standard deviation
Frequency Table

EXAMPLE: (POPULATION OF SANTIAGO CITY FROM 2018-2022)


The city of Santiago has witnessed dynamic changes in its
population over the past five years. In 2018, Santiago began the
period with approximately 700,000 residents, reflecting diverse
cultures, occupations, and lifestyles. By 2019, the population had
grown to approximately 721,000. The momentum continued,
with the population surging to around 750,000 in 2020. Even
amidst global adversities in 2021, Santiago marked a population
milestone of 772,500. Steadfast growth persisted, reaching an
estimated 800,000 residents by 2022.
In tabular form…

Year Population
2018 700,000
2019 721,000
2020 750,000
2021 772,500
2022 800,000
TRY THIS! SCORES OF 15 STUDENTS DURING
A QUIZ…

8 9 10 5 6
7 4 5 4 7
8 9 6 6 5

Organize the scores in a table


SCORES:
8, 9, 10, 5, 6, 7, 4, 5, 4, 7, 8, 9, 6, 6, 5

Scores Frequency
4 2
5 3
6 3
7 2
8 2
9 2
10 1
Total: 15
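
The tally above can also be checked programmatically. The following is a minimal sketch in Python (the lesson itself uses MS Excel, so this is only an illustration); the variable names are assumptions made for the example.

```python
from collections import Counter

# Quiz scores of the 15 students, as listed above
scores = [8, 9, 10, 5, 6, 7, 4, 5, 4, 7, 8, 9, 6, 6, 5]

# Count how many times each score occurs
frequency = Counter(scores)

# Print the frequency table in ascending order of score
print("Scores  Frequency")
for score in sorted(frequency):
    print(f"{score:<7} {frequency[score]}")
print(f"Total:  {sum(frequency.values())}")
```

Sorting the keys before printing reproduces the ascending order used in the table.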
MEASURES OF CENTRAL TENDENCY

1. Mean - the computed average score of a distribution.
2. Median - the center, or the middle score, within a distribution.
3. Mode - the most frequent score within a distribution.
1. MEAN

It is the most common measurement of average.
Also called the arithmetic mean or the computed average.
Properties of the Mean
1. The most common measure of central
tendency.
2. Its value is dependent upon every item in
a set of data.
3. It is sensitive to (affected by) extreme
values. (e.g., 1, 1, 3, 2, 1, 4, 3, 12)
4. It is computed for data measured on
interval and ratio scales.
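
To illustrate how an extreme value pulls the mean, the sketch below compares the mean of the example data with and without the value 12. It is an illustration in Python, not part of the original lesson; the variable names are assumptions.

```python
from statistics import mean

# Example data from the property above; 12 is the extreme value
data = [1, 1, 3, 2, 1, 4, 3, 12]
without_extreme = [x for x in data if x != 12]

mean_all = mean(data)                 # 27 / 8
mean_trimmed = mean(without_extreme)  # 15 / 7

print(f"Mean with 12:    {mean_all:.3f}")
print(f"Mean without 12: {mean_trimmed:.3f}")
```

A single extreme value raises the mean from about 2.14 to 3.375, even though seven of the eight values are 4 or less.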
1. MEAN
The sum of all items, divided by the
number of items.
Formula:
$$\bar{x} = \frac{\Sigma x}{n}$$
where Σx is the sum of cases/data
n is the number of cases/data

Σ: read as sigma/"the summation of"
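
As a worked example, the formula can be applied to the 15 quiz scores from the frequency-table exercise. This is a minimal Python sketch for illustration only; the names `total`, `n`, and `x_bar` are assumptions standing in for Σx, n, and the mean.

```python
# Quiz scores of the 15 students from the earlier exercise
scores = [8, 9, 10, 5, 6, 7, 4, 5, 4, 7, 8, 9, 6, 6, 5]

total = sum(scores)  # Σx, the sum of all items (99)
n = len(scores)      # n, the number of items (15)
x_bar = total / n    # mean = Σx / n

print(f"Σx = {total}, n = {n}, mean = {x_bar:.1f}")
```

Here Σx = 99 and n = 15, so the mean score is 99 / 15 = 6.6.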


2. MEDIAN

It is the middlemost value in a list of items arranged in increasing or
decreasing order.
When the data set is ordered, it is
called a data array.
Properties of the Median
1. It is the middle value in a rank-ordered
distribution: half of the values lie at or
above it and half at or below it.
2. It is not sensitive to the size of the
extreme values. It is affected by the
number of cases. (e.g., 1, 1, 2, 3, 4, 5, 20)
3. The items must be arranged
according to size before the median
can be computed.
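
The second property can be seen by making the extreme value in the example even larger. The sketch below is an illustration in Python using the standard `statistics` module; the second list, with 200 in place of 20, is an assumption added for comparison.

```python
from statistics import mean, median

# Example data from the property above
data = [1, 1, 2, 3, 4, 5, 20]
# Hypothetical variant: the extreme value is ten times larger
data_larger_extreme = [1, 1, 2, 3, 4, 5, 200]

median_original = median(data)
median_variant = median(data_larger_extreme)
mean_original = mean(data)
mean_variant = mean(data_larger_extreme)

print(f"Median: {median_original} vs {median_variant}")
print(f"Mean:   {mean_original:.2f} vs {mean_variant:.2f}")
```

The median stays at 3 in both lists, while the mean jumps from about 5.14 to about 30.86.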
Computation of the Median
1. Arrange the items either in
ascending order or in descending
order.
2. Odd number of cases: get the
middlemost item to determine the
median.
3. Even number of cases, get the two
middlemost items & compute their
mean.
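Computation of the Median
Formula (odd number of cases):
$$\tilde{x} = X_{\frac{n+1}{2}}$$
Formula (even number of cases):
$$\tilde{x} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2}$$

Where X = the item at the given position in the ordered set
n = the number of cases.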
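
The computation steps for the median (arrange the items, then take the middlemost item for an odd number of cases or the mean of the two middlemost items for an even number) can be sketched as follows. This is an illustrative Python function; the name `median_of` is an assumption, not something from the lesson.

```python
def median_of(values):
    """Return the median following the steps described above."""
    ordered = sorted(values)  # Step 1: arrange in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    middle = n // 2  # 0-based index of the item at 1-based position n/2 + 1
    if n % 2 == 1:
        # Odd number of cases: the middlemost item, position (n + 1) / 2
        return ordered[middle]
    # Even number of cases: mean of the two middlemost items
    return (ordered[middle - 1] + ordered[middle]) / 2


quiz_scores = [8, 9, 10, 5, 6, 7, 4, 5, 4, 7, 8, 9, 6, 6, 5]
print(median_of(quiz_scores))   # odd number of cases (15)
print(median_of([4, 1, 3, 2]))  # even number of cases (4)
```

For the 15 quiz scores the median is the 8th ordered item, which is 6. For the four-item list it is the mean of 2 and 3, which is 2.5.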
3. MODE
It is the value that occurs most often
in the data set.
Unimodal (only one value occurs with
the greatest frequency)
Bimodal (2 modes)
Multimodal (more than 2 modes)
No mode (no data value occurs more than
once)
3. MODE
Used when data are nominal or
categorical (gender, political
affiliation, ethnicity)
Used when the most typical case is
desired.
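
The cases above (unimodal, bimodal, multimodal, no mode) can be distinguished by counting how often each value occurs. The following Python sketch is an illustration only; the function name `find_modes` is an assumption.

```python
from collections import Counter


def find_modes(values):
    """Return every value that shares the highest frequency.

    An empty list means there is no mode (no value occurs more than once).
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


quiz_scores = [8, 9, 10, 5, 6, 7, 4, 5, 4, 7, 8, 9, 6, 6, 5]
modes = find_modes(quiz_scores)
print(modes, "-> number of modes:", len(modes))
```

For the quiz scores, 5 and 6 each occur three times, so the data set is bimodal. The function also works for nominal data such as a list of category labels.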
MEASURES OF VARIATION

A measure of variation describes the dispersion or variability of data.
Common measures: range, mean deviation,
variance, and standard deviation.
1. RANGE

The range is the distance (difference) between the lowest
and the highest data point.
R = H – L
where H is the highest value and L is the lowest value.
Used for rough or quick
comparison.
Small measure of variability would indicate that the data are:

1. Clustered around the mean

2. More homogeneous
3. Less varied

4. More uniformly distributed


1. RANGE

• Example:
• The following are the test scores of Julia
in all of her subjects during the prelims:
89, 73, 84, 91, 87, 77, 94
Highest - lowest = 94 – 73 = 21
The range of Julia’s test scores is 21
points.
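
The same computation, R = H – L, can be sketched in Python. This is an illustration only; the variable names are assumptions.

```python
# Julia's prelim test scores from the example above
julia_scores = [89, 73, 84, 91, 87, 77, 94]

highest = max(julia_scores)    # H
lowest = min(julia_scores)     # L
score_range = highest - lowest # R = H - L

print(f"{highest} - {lowest} = {score_range}")
```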
2. MEAN DEVIATION

Also called the mean absolute deviation.
Useful in describing how much a set
of data varies from the mean.
2. MEAN DEVIATION

$$MD = \frac{\Sigma |X - \bar{X}|}{N}$$
• Where X = a value in the data
• X̄ = mean of the data
• N = number of cases
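
As a worked example, the mean deviation can be computed for Julia's test scores from the range example (89, 73, 84, 91, 87, 77, 94). The Python sketch below is an illustration only; the variable names are assumptions.

```python
# Julia's prelim test scores from the range example
julia_scores = [89, 73, 84, 91, 87, 77, 94]

n = len(julia_scores)
mean_score = sum(julia_scores) / n  # 595 / 7 = 85

# Absolute distance of each score from the mean
absolute_deviations = [abs(x - mean_score) for x in julia_scores]

# Mean deviation: sum of absolute deviations divided by N
mean_deviation = sum(absolute_deviations) / n

print(f"Mean = {mean_score}, MD = {mean_deviation}")
```

The absolute deviations are 4, 12, 1, 6, 2, 8, and 9, which sum to 42, so MD = 42 / 7 = 6. On average, Julia's scores are 6 points away from her mean of 85.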
3. VARIANCE

• It is the average of the square of the
distance of each value from the mean.

Population variance:
$$\sigma^2 = \frac{\Sigma (x - \mu)^2}{N}$$
Sample variance:
$$s^2 = \frac{\Sigma (x - \bar{x})^2}{n - 1}$$

Where x = a value in the data
μ or x̄ = mean of the data
N or n = number of cases
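
The two formulas differ only in the divisor (N for a population, n - 1 for a sample). The Python sketch below applies both to Julia's test scores from the range example and cross-checks them against the standard `statistics` module. It is an illustration only; whether the scores should be treated as a population or a sample is an assumption the analyst has to make.

```python
from statistics import pvariance, variance

# Julia's prelim test scores from the range example
julia_scores = [89, 73, 84, 91, 87, 77, 94]

n = len(julia_scores)
mean_score = sum(julia_scores) / n  # 85

# Sum of squared distances from the mean
sum_of_squares = sum((x - mean_score) ** 2 for x in julia_scores)

population_variance = sum_of_squares / n     # divide by N
sample_variance = sum_of_squares / (n - 1)   # divide by n - 1

print(f"Population variance: {population_variance:.2f}")
print(f"Sample variance:     {sample_variance:.2f}")

# Cross-check with the standard library
print(f"{pvariance(julia_scores):.2f}, {variance(julia_scores):.2f}")
```

The squared deviations sum to 346, giving a population variance of about 49.43 and a sample variance of about 57.67.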
4. STANDARD DEVIATION (SD)

It is the square root of the variance.

SD of a Population:
$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\Sigma (x - \mu)^2}{N}}$$
SD of a Sample:
$$s = \sqrt{s^2} = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}}$$

Where x = a value in the data
μ or x̄ = mean of the data
N or n = number of cases
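
Because the standard deviation is the square root of the variance, it can be computed in one more step. The Python sketch below uses Julia's test scores from the range example for illustration; the variable names are assumptions.

```python
import math
from statistics import pstdev, stdev

# Julia's prelim test scores from the range example
julia_scores = [89, 73, 84, 91, 87, 77, 94]

n = len(julia_scores)
mean_score = sum(julia_scores) / n
sum_of_squares = sum((x - mean_score) ** 2 for x in julia_scores)

# Square root of the population and sample variances
population_sd = math.sqrt(sum_of_squares / n)
sample_sd = math.sqrt(sum_of_squares / (n - 1))

print(f"Population SD: {population_sd:.2f}")
print(f"Sample SD:     {sample_sd:.2f}")

# Cross-check with the standard library
print(f"{pstdev(julia_scores):.2f}, {stdev(julia_scores):.2f}")
```

Unlike the variance, the standard deviation is expressed in the same unit as the data (here, points), which makes it easier to interpret.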
GRAPHS
● A graph is any pictorial device used to display or
present numerical relationships between
variables.
1. Bar graph
• A bar graph is used to compare things
between different groups or
categories using their frequencies.
• It uses parallel bars, either horizontal or
vertical, to represent counts for several
categories.
Data on the population of a Province for the last 5 years...

Year Population

2018 700,000
2019 721,000
2020 750,000
2021 772,500
2022 800,000
[Bar graph: Population of a Province, 2018-2022. Horizontal axis: year (2018 to 2022); vertical axis: population (640,000 to 820,000).]
2. Time series/Line graph
• A line graph displays data collected
over a short or long period of time
to show how the data change at
regular intervals.
[Line graph: Population of a Province from 2018-2022. Data labels: 700,000 (2018), 721,000 (2019), 750,000 (2020), 772,500 (2021), 800,000 (2022).]
3. Pie graph
• A Pie graph is a circle divided into
sections according to the percentage
of frequencies in each category of the
distribution. It shows how a part of
something relates to the whole
Example: The frequency table shows the number of
male and female students in your class.

Sex Frequency
Male 33
Female 19
Total 52
[Pie graph: Sex Distribution of Students. Male 63%, Female 37%.]
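
The percentages shown in the pie graph follow from the frequency table: each frequency is divided by the total and multiplied by 100. The Python sketch below is an illustration only; the dictionary layout and names are assumptions.

```python
# Frequencies from the table above
sex_frequency = {"Male": 33, "Female": 19}

total = sum(sex_frequency.values())  # 52 students

# Percentage = (frequency / total) * 100
percentages = {
    sex: count / total * 100 for sex, count in sex_frequency.items()
}

for sex, percent in percentages.items():
    print(f"{sex}: {sex_frequency[sex]} of {total} = {percent:.0f}%")
```

Rounded to whole numbers, 33 of 52 is 63% and 19 of 52 is 37%, which matches the two slices of the pie graph.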
THANK YOU!
Dianna Jean A. Gayo, LPT, MST-Math
Instructor

Email Address: [email protected]
