
Midterms-Day-3

The document outlines the objectives and steps for conducting statistical analysis using software applications, focusing on constructing frequency distributions and calculating measures of central tendency. It includes detailed instructions for creating frequency distribution tables, histograms, and other statistical graphs, while emphasizing ethical use of statistics. Additionally, it discusses the importance of understanding grouped and ungrouped data and provides definitions for mean, median, and mode.

Uploaded by

Avelle Smith
Copyright
© © All Rights Reserved

AE 9

STATISTICAL ANALYSIS
with Software Applications
OBJECTIVES:
After the class, the students are expected to:
- Construct a frequency distribution using the given data and present it visually;
- Calculate measures of central tendency; and
- Locate the measures of central tendency for symmetric and skewed distributions.
[Diagram: Data, Sorted, Arranged, Presented Visually, Explained w/ Story, Actionable (useful)]
CONSTRUCTING
FREQUENCY DISTRIBUTION
(Absolute Frequency Distribution)
Frequency distributions are visual displays
that organize and present frequency
counts so that the information can be
interpreted more easily.

-Australian Bureau of Statistics


In constructing a frequency distribution, we
group data into mutually exclusive classes
and indicate the number of observations in
each (Lind et al. 2006, 25)
Expressed as a table, a frequency distribution separates data into classes so that each data point falls into exactly one class, and tabulates how many data points fall into each class (Brase and Brase 2010, 36).
Steps in Constructing Frequency Distribution Table
1. Find the appropriate number of classes

The goal of organizing disorganized, ungrouped, or raw data is to employ an appropriate number of classes to depict the distribution’s shape. This is because too few or too many classes may not be appropriate to show the data’s fundamental shape.
Use the $2^k$ rule: choose $k$ such that $2^k > n$, where
k = number of classes, and
n = number of observations in the sample.
Example: n = 90, so we need $2^k > 90$.
If we use k = 6, $2^6 = 64 < 90$, so 6 is inadequate.
Now try k = 7: $2^7 = 128 > 90$, so 7 is recommended.
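The trial-and-error search above can be sketched in Python. This is a minimal illustration, not part of the original slides; the function name `number_of_classes` is hypothetical.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest k such that 2**k > n (the 2^k rule)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 1
    # Keep raising k while 2**k is still not strictly greater than n.
    while 2 ** k <= n:
        k += 1
    return k


print(number_of_classes(90))  # 7, since 2**6 = 64 < 90 and 2**7 = 128 > 90
```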


2. Determine the class interval
The width of each class, which should be constant across all classes, is referred to as the class interval (i). Together, the classes must span at least the range between the minimum and maximum values of the ungrouped data.
That is, $i = \frac{H - L}{k}$, where
i is the class interval,
H is the maximum data point,
L is the minimum data point, and
k is the number of classes.
Example: H = 35,925, L = 15,546, k = 7
$i = \frac{H - L}{k} = \frac{35{,}925 - 15{,}546}{7} = \frac{20{,}379}{7} \approx 2{,}911 \approx 3{,}000$
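A short Python sketch of the same computation follows. The function name `class_interval` and the `round_to` parameter are assumptions made for illustration; the slides only show the raw width being rounded up to 3,000.

```python
import math


def class_interval(high, low, k, round_to=1000):
    """Return (raw width, width rounded up to a multiple of round_to)."""
    raw = (high - low) / k
    # Rounding up keeps the k classes wide enough to cover the whole range.
    rounded = math.ceil(raw / round_to) * round_to
    return raw, rounded


raw, i = class_interval(35_925, 15_546, 7)
print(round(raw), i)  # 2911 3000
```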
3. Set individual class limits.

To ensure that each observation may be assigned to only one class, it is necessary to indicate the lower and upper bounds of the class (i.e., the lower and upper values of your class interval). That is, class limits should not overlap.
For example, we refrain from using class limits like 1,000 to 1,500 and 1,500 to 2,000, since it is unclear how 1,500 would be classified. We should also avoid class limits like 1,000 to 1,500 and 1,600 to 1,650, which leave a gap: a data point valued at 1,550 would belong to no class.
For continuous variables, it is advised to use 1,000 up to 1,500 and 1,500 up to 2,000, and so on; for discrete variables, it is advised to use 1,000 to 1,500 and 1,501 to 2,000, and so on. We select the former since our variable is continuous, making it evident that 1,499 belongs to the first class and 1,500 to the second.
Table 2: Class Limit

Class (k=7) Class Interval ( i = 3,000)

1 15,000 up to 18,000
2 18,000 up to 21,000
3 21,000 up to 24,000
4 24,000 up to 27,000
5 27,000 up to 30,000
6 30,000 up to 33,000
7 33,000 up to 36,000
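The class limits in Table 2 can be generated programmatically. The sketch below assumes a starting lower limit of 15,000 (as in the table) and uses a hypothetical helper name, `class_limits`; each pair is read as "lower up to upper", so the upper value itself belongs to the next class.

```python
def class_limits(start, width, k):
    """Return k (lower, upper) pairs of equal width, read as 'lower up to upper'."""
    return [(start + j * width, start + (j + 1) * width) for j in range(k)]


limits = class_limits(15_000, 3_000, 7)
for lower, upper in limits:
    print(f"{lower:,} up to {upper:,}")
```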
4. Tally the data points that fall in each class and record how frequently they occur.
Table 3: Frequency Distribution Table (Absolute)

Class Interval (i = 3,000)   Frequency
15,000 up to 18,000          8
18,000 up to 21,000          25
21,000 up to 24,000          18
24,000 up to 27,000          22
27,000 up to 30,000          9
30,000 up to 33,000          5
33,000 up to 36,000          3
Total                        90
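The tallying step can be sketched in Python as follows. The 90 raw selling prices are not reproduced in the slides, so the `sample` list below is hypothetical and only demonstrates how values on a class boundary are assigned; `frequency_table` is likewise an illustrative name.

```python
def frequency_table(data, start, width, k):
    """Count how many values fall in each 'lower up to upper' class."""
    counts = [0] * k
    for x in data:
        # Floor division puts a boundary value such as 18,000 in the higher class.
        j = int((x - start) // width)
        if not 0 <= j < k:
            raise ValueError(f"{x} falls outside the classes")
        counts[j] += 1
    return counts


# Hypothetical prices per square meter, not the actual data set.
sample = [15_546, 17_999, 18_000, 20_500, 23_100, 26_999, 27_000, 35_925]
counts = frequency_table(sample, 15_000, 3_000, 7)
print(counts)  # [2, 2, 1, 1, 1, 0, 1]
```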
Observations
1. Lot selling prices in Laguna range from about PHP 15,000 to PHP 36,000 per square meter.
2. The lot selling prices are concentrated between PHP 18,000 and PHP 27,000 per square meter. A total of 65 lots, or 72.2% of the lots, are sold within this range.
3. The largest number of data values falls in the PHP 18,000 up to PHP 21,000 class. The middle value of this class, called the class midpoint, is PHP 19,500. We can claim that the usual selling price is PHP 19,500 per square meter. Note that the class midpoint is the halfway value between the lower limits of two consecutive classes [(18,000 + 21,000)/2 = 19,500]. The class midpoint represents the typical value of its class.
4. There are three lots sold for PHP 33,000 per square meter or more, and eight lots are sold for less than PHP 18,000 per square meter.
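The figures quoted in these observations can be checked with a few lines of Python. This is a sketch using the frequencies from Table 3; the variable names are illustrative.

```python
freq = [8, 25, 18, 22, 9, 5, 3]          # frequencies from Table 3
width = 3_000
lowers = [15_000 + j * width for j in range(len(freq))]

# Class with the highest frequency and its midpoint.
modal = freq.index(max(freq))
midpoint = (lowers[modal] + lowers[modal] + width) / 2

# Share of lots sold from 18,000 up to 27,000 (the second to fourth classes).
share = sum(freq[1:4]) / sum(freq)

print(midpoint)        # 19500.0
print(f"{share:.1%}")  # 72.2%
```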
Cross Tabulation
So far, we have built a one-dimensional frequency distribution table. Frequency tables can also be built in several dimensions using cross tabulation.
Cross tabulation presents the results for the entire group of observations alongside the results for subgroups. It is a table of two (or more) dimensions that counts the number of observations having the particular traits listed in the table’s rows and columns.
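A two-way cross tabulation can be sketched with Python's `collections.Counter`. The records below are hypothetical (the slides do not give the subgroup data); each record pairs a location with a price class.

```python
from collections import Counter

# Hypothetical records: (location, price class) for each lot sold.
records = [
    ("Calamba", "below 24,000"),
    ("Calamba", "24,000 and up"),
    ("Calamba", "below 24,000"),
    ("San Pablo", "below 24,000"),
    ("San Pablo", "24,000 and up"),
    ("San Pablo", "24,000 and up"),
]

counts = Counter(records)                  # one count per (row, column) cell
rows = sorted({loc for loc, _ in records})
cols = ["below 24,000", "24,000 and up"]

for loc in rows:
    cells = [counts[(loc, c)] for c in cols]
    print(loc, cells, "row total:", sum(cells))
print("Column totals:", [sum(counts[(loc, c)] for loc in rows) for c in cols])
```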
Relative Frequency Distribution Table
Table 5: Relative Frequency Distribution Table
Class Interval (i = 3,000)   Frequency   Total   Frequency/Total   Relative Frequency
15,000 up to 18,000          8           90      8/90              8.89%
18,000 up to 21,000          25          90      25/90             27.78%
21,000 up to 24,000          18          90      18/90             20.00%
24,000 up to 27,000          22          90      22/90             24.44%
27,000 up to 30,000          9           90      9/90              10.00%
30,000 up to 33,000          5           90      5/90              5.56%
33,000 up to 36,000          3           90      3/90              3.33%
Total                        90          90      90/90             100.00%
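The relative frequencies are each class frequency divided by the total. A minimal Python sketch using the frequencies from the table:

```python
freq = [8, 25, 18, 22, 9, 5, 3]
total = sum(freq)                      # 90

relative = [f / total for f in freq]   # frequency / total for each class
formatted = [f"{r:.2%}" for r in relative]
print(formatted)
# ['8.89%', '27.78%', '20.00%', '24.44%', '10.00%', '5.56%', '3.33%']
```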
Cumulative Frequency Distribution Table
Table 6: Cumulative Frequency Distribution Table
Class Interval (i = 3,000)   Frequency   Total   Frequency/Total   Relative Frequency   Cumulative Relative Frequency
15,000 up to 18,000          8           90      8/90              8.89%                8.89%
18,000 up to 21,000          25          90      25/90             27.78%               36.67%
21,000 up to 24,000          18          90      18/90             20.00%               56.67%
24,000 up to 27,000          22          90      22/90             24.44%               81.11%
27,000 up to 30,000          9           90      9/90              10.00%               91.11%
30,000 up to 33,000          5           90      5/90              5.56%                96.67%
33,000 up to 36,000          3           90      3/90              3.33%                100.00%
Total                        90
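The cumulative column is a running total of the frequencies, divided by the overall total. A short sketch using `itertools.accumulate`:

```python
from itertools import accumulate

freq = [8, 25, 18, 22, 9, 5, 3]
total = sum(freq)

cum_counts = list(accumulate(freq))             # running totals of the counts
cum_relative = [c / total for c in cum_counts]  # running share of all lots

print(cum_counts)  # [8, 33, 51, 73, 82, 87, 90]
print([f"{c:.2%}" for c in cum_relative])
# ['8.89%', '36.67%', '56.67%', '81.11%', '91.11%', '96.67%', '100.00%']
```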
GRAPHING FREQUENCY
DISTRIBUTION
HISTOGRAM
A histogram is a graphical representation of a grouped frequency distribution with continuous classes. It is an area diagram and can be defined as a set of rectangles whose bases span the intervals between class boundaries and whose areas are proportional to the frequencies in the corresponding classes.
It allows us to assess where the values are concentrated, what the
extremes are, and whether there are any gaps or anomalous
values.
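To see the idea without charting software, the frequencies from Table 3 can be drawn as a rough text histogram in Python. This is only a sketch (the helper name `text_histogram` is hypothetical); because all classes have the same width, bar length alone is proportional to frequency.

```python
def text_histogram(labels, freq, symbol="#"):
    """Return one text bar per class; bar length equals the class frequency."""
    width = max(len(label) for label in labels)
    return [f"{label:<{width}} | {symbol * f} {f}" for label, f in zip(labels, freq)]


labels = [f"{lo:,} up to {lo + 3_000:,}" for lo in range(15_000, 36_000, 3_000)]
freq = [8, 25, 18, 22, 9, 5, 3]

for line in text_histogram(labels, freq):
    print(line)
```

The printed bars make the concentration between 18,000 and 27,000 and the thin upper tail easy to spot.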
HISTOGRAM
Steps in Constructing a Histogram (MS Excel)

1. Make sure the Analysis ToolPak is enabled.
2. Type the class limits.
3. Go to the Data tab and select Data Analysis, then select Histogram.
4. Fill in the Histogram dialog:
Input Range: select your data
Bin Range: select your upper limits
Check Labels (if the selected ranges include a header row)
Select Output Range (click where you want the output to appear)
Select Chart Output
5. Format your histogram.
Types of HISTOGRAM

Mound-shaped Symmetrical

It has perfect symmetry when divided vertically down the center, with both sides matching each other in size and shape. The balance reflects a steady distribution pattern. It is also called a bell-shaped or normal distribution.
Uniform (Rectangular)

A uniform histogram shows that the data are spread evenly among the classes, with each class having about the same number of elements. With real data, it may display several small peaks rather than perfectly level bars, reflecting small variations in frequency.
Bimodal

A histogram is called bimodal if it has two distinct peaks, and the peaks are separated by at least one class. This implies that the data consist of observations from two distinct groups or categories, with notable variations between them.
Right-Skewed Histogram

A right-skewed histogram has a tail that stretches towards the right side. This signifies that the majority of the data points are on the left side, with a few unusually high values reaching to the right.
Left-Skewed Histogram

A left-skewed histogram has a tail that stretches towards the left side. This means that the majority of the data points are on the right side, with a few exceptionally low values extending to the left.
Other Charts and Graphs

A statistical graph or chart is defined as the pictorial representation of statistical data in graphical form. Statistical graphs are used to represent a set of data to make it easier to understand and interpret statistical information.
Bar Graph
Bar graphs are the pictorial
representation of grouped data in
vertical or horizontal rectangular bars,
where the length of bars is proportional
to the measure of data. The chart’s
horizontal axis represents categorical
data, whereas the chart’s vertical axis
defines discrete data. It can be used to
illustrate any of the four levels of
measurement.
Bar graphs are useful when we want to compare the different parts with one another, not necessarily the parts to the whole.
Simple Bar Graph
The simple bar chart is used for the case of one variable only.
Multiple Bar Graph (Grouped Column Chart)
The multiple bar chart is an extension of the simple bar chart for when there are quantities of several variables to be displayed. The bars representing the quantities for the different variables are placed next to one another for each attribute.
Line Graph

A graph that utilizes points and lines to represent change over time is defined as a line graph. In other words, it is a chart that shows a line joining several points or a line that shows the relation between the points.
Simple Line Graph
The simplest of line graphs is the single line graph, so called because it displays information concerning one variable only, in terms of its frequencies.
Multiple Line Graph
Multiple line graphs illustrate information on several variables so that comparison is possible between them.
Pie Chart
A pie chart is used to represent the numerical proportions of a dataset. This graph involves dividing a circle into various sectors, where each sector represents the proportion of a particular element relative to the whole. This is also called a circle chart or circle graph.
✦ Pie charts are useful for showing the division of all possible values of a qualitative variable into its parts.
Ethical Use of Statistics

As ethical producers of information using statistics, we should not use the tools of statistics to generate and spread “fake news” (Watson, 2021).
(Misleading Charts and Graphs)
Do not make statements in the analysis, conclusions, and/or recommendations that are beyond the scope of the statistical analysis.
Do not manipulate the data to get the results you want or to promote a story, statement, or point of view. This is called “fake news”.
Activity
Measures of Central Tendency
Grouped and Ungrouped Data
Ungrouped Data

Ungrouped data, which is also known as raw data, is data that has not been placed in any group or category after collection.
Grouped Data

Grouped data is
the type of data
which is classified
into groups after
collection.
Measures of Central Tendency

Mean
Median

Mode
Mean
• It is the sum of the data values divided by the number of data
values.
• It is also called the average.
• It is appropriate only for data under interval and ratio scale
measurement.
• If you are interested in the “center of gravity” of your data, then use the mean.
Advantages of Mean

✦ Simple to understand and easy to calculate.
✦ It is rigidly defined.
✦ It is the least affected by fluctuations of sampling.
✦ It takes into account all the values in the series.
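Population Mean
$\mu = \frac{\sum X}{N}$
Where:
$\mu$ is the population mean
$\sum X$ is the sum of the values of X in the population
N is the number of data values in the population

Sample Mean
$\bar{X} = \frac{\sum X}{n}$
Where:
$\bar{X}$ is the sample mean
$\sum X$ is the sum of the values of X in the sample
n is the number of data values in the sample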
Sample Mean
$\bar{X} = \frac{\sum X}{n} = \frac{15{,}546 + 15{,}795 + \dots + 35{,}925}{90} = \frac{2{,}113{,}902}{90} = 23{,}487.80$
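The division above can be checked in Python. Only the total (2,113,902) and the sample size (90) are given in the slides, so the sketch uses those two figures; the short list passed to `statistics.mean` is a separate, hypothetical data set showing the same computation on raw values.

```python
from statistics import mean

total, n = 2_113_902, 90
sample_mean = total / n
print(sample_mean)  # 23487.8

# Hypothetical raw values: mean() sums the data and divides by the count.
print(mean([30, 80, 40]))  # 50
```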
Sample Mean
Using MS Excel formula: =average()
Arithmetic Mean
• All interval- or ratio-level data have a mean.
• All data values are included in computing the mean.
• A data set has only one mean.
• The sum of the deviations of each value from the mean is zero (0). Mathematically, $\sum (X - \bar{X}) = 0$.

Ex. Given 30, 80, and 40, $\bar{X} = 50$:

$\sum (X - \bar{X}) = (30 - 50) + (80 - 50) + (40 - 50) = (-20) + 30 + (-10) = 0$
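The zero-sum property of deviations can be verified with the same three values in a short Python sketch:

```python
values = [30, 80, 40]
x_bar = sum(values) / len(values)          # 50.0

deviations = [x - x_bar for x in values]   # distance of each value from the mean
print(deviations)       # [-20.0, 30.0, -10.0]
print(sum(deviations))  # 0.0
```

With values that do not divide evenly, floating-point rounding can leave a tiny nonzero remainder, so comparing against zero with a tolerance is the safer check in general.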


Weighted Mean
A special case of the arithmetic mean that can be used when there are several data points or observations of the same value.
$\bar{X}_w = \frac{\sum (wX)}{\sum w}$
Where:
$\bar{X}_w$ is the weighted mean
$\sum (wX)$ is the sum of the products of the w and X values in the sample
$\sum w$ is the sum of the weights
Example:

A construction company pays its workers in the construction of a condominium project in San Pablo City on a daily basis. It pays laborers P537.00, masons P729.00, and carpenters P729.00. There are 30 laborers, 30 masons, and 90 carpenters. Determine the weighted mean daily rate paid to the 150 workers.

$\bar{X}_w = \frac{\sum (wX)}{\sum w} = \frac{30(537) + 30(729) + 90(729)}{30 + 30 + 90} = \frac{16{,}110 + 21{,}870 + 65{,}610}{150} = \frac{103{,}590}{150} = 690.60$
Weighted Mean
Using MS Excel: =sumproduct() / n
Type of Worker   # of Workers   Daily Rate (in P)   # of Workers x Daily Rate (in P)
Laborers         30             537                 16110
Masons           30             729                 21870
Carpenters       90             729                 65610
Total (n)        150            Sum Product         103590
Manual computing: divide the sum product by n = 150 to get 690.6
Using Excel formula: =sumproduct() gives 103590
=sumproduct() / n gives 690.6
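The same sum-product calculation can be sketched in Python. The function name `weighted_mean` is illustrative; the rates and head counts come from the table.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


rates = [537, 729, 729]    # daily rate for laborers, masons, carpenters
workers = [30, 30, 90]     # number of workers in each group (the weights)

print(sum(w * x for x, w in zip(rates, workers)))  # 103590
print(weighted_mean(rates, workers))               # 690.6
```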
Median
It is the “middle observation” when the data set is sorted (in
either increasing or decreasing order).

The median divides the distribution into two equal parts.

Using MS Excel: =median()
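A Python counterpart uses `statistics.median`, which sorts the data internally. The small lists below are hypothetical; with an even number of values, the function returns the average of the two middle observations.

```python
from statistics import median

print(median([30, 80, 40]))      # 40   (sorted: 30, 40, 80)
print(median([30, 80, 40, 60]))  # 50.0 (average of 40 and 60)
```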


Advantages of Median
✦ The median is not affected by the size of extreme values but by the number of observations.
✦ The median can be calculated even when the frequency distribution contains “open-ended” intervals.
✦ It can also be used to define the middle of a number of objects, properties, or quantities which are not really quantitative in nature.
✦ It can be easily interpreted.
Mode
The mode is simply the most frequently occurring data value (or values) in the data set. Therefore, it is mainly useful for the nominal level of measurement. Both the median and the mean are useful when the variable being measured can be quantified. Also, both data sets have no mode, which is why the mode is not an appropriate measure to use for these data sets.

Using MS Excel: =mode.mult()
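In Python, `statistics.multimode` (available from Python 3.8) returns every value tied for the highest count. The lists below are hypothetical. Note one difference from the spreadsheet function: when no value repeats, `multimode` returns all the values, so a result as long as the input is a sign that the data have no meaningful mode.

```python
from statistics import multimode

print(multimode([537, 729, 729]))   # [729]
print(multimode([1, 1, 2, 2, 3]))   # [1, 2]  (two modes)
print(multimode([1, 2, 3]))         # [1, 2, 3]  (no value repeats)
```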


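[Figure: Mound-shaped symmetrical histogram with the Mean, Median, and Mode marked]
[Figure: Right-skewed histogram with the Mode, Median, and Mean marked]
[Figure: Left-skewed histogram with the Mode, Median, and Mean marked]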
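The usual positions of the three measures can be illustrated with small hypothetical data sets (they are not from the slides). In a right-skewed set the few high values pull the mean above the median, and in a left-skewed set the few low values pull it below; this ordering is typical rather than guaranteed for every data set.

```python
from statistics import mean, median, mode

right_skewed = [1, 2, 2, 2, 3, 3, 4, 9, 15]       # a few high values on the right
left_skewed = [1, 7, 12, 13, 13, 14, 14, 14, 15]  # a few low values on the left

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(name, "mode:", mode(data), "median:", median(data),
          "mean:", round(mean(data), 2))
# right-skewed mode: 2 median: 3 mean: 4.56
# left-skewed mode: 14 median: 13 mean: 11.44
```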
