Module 1: Nature of Statistics
Introduction
Statistical Thinking will one day be as necessary for efficient citizenship as the ability to read
and write (H. G. Wells).
In 2017, The Economist reported a striking change in the world economy: the world’s most
valuable resource is no longer oil, but data. The five biggest tech giants –
Google, Amazon, Apple, Facebook, and Microsoft – had been profiting from
the use of consumer data. This phenomenon prompted professionals to turn to
statistics and later popularized the concept of data science.
To date, many companies all over the world are hiring statisticians and data
scientists to strengthen their competitive advantage. Since data is everywhere and everyone needs it
analyzed, you need to learn the essential knowledge and skills of statistics. As Florence Nightingale
put it,
“Statistics… is the most important science in the whole world: for upon it depends the practical
application of every other science and of every art; the one science essential to all political and social
administration, all education, all organization based upon experience, for it only gives the results of
our experience.”
Levels of Measurement
Measurement levels refer to the different types of variables, which determine how the variables should be analyzed.
1. Nominal. It is a variable whose values have no undisputed order. It may have two or
more exhaustive, non-overlapping categories, but there is no intrinsic ordering of the
categories. Examples: sex, socioeconomic status, civil status, school division, religious
affiliation, mother tongue
2. Ordinal. It is a variable whose values have an undisputed order but no fixed unit of measurement. An
ordinal variable is similar to a nominal variable except that there is a clear ordering of
the categories. However, the difference between adjacent categories cannot be stated with
certainty. Examples: rating scales (Likert scales), shoe/shirt sizes, ranking, monthly
income (range)
3. Interval. An interval variable is similar to an ordinal variable except that the intervals between values are equally
spaced. It has a fixed unit of measurement, but zero is arbitrary and does not mean the absence of the quantity.
Examples: temperature (in °C or °F), pressure, IQ score, mental ability ratings
4. Ratio. A ratio variable is an interval variable with a true zero. It has a fixed unit of
measurement, and zero means the absence of the quantity. Examples: weight, height, age, income
Introduction
After identifying your research problem, the next step is to collect appropriate and relevant
data. Data collection is crucial to the success of any investigation or study. If the investigator is not
able to collect enough relevant data, the findings and results of the study will be affected; thus, the
conclusions, generalizations, or implications derived from the available data may not be reliable or
valid. Becoming an expert in data collection methods and techniques requires time and effort.
Guidance from an experienced researcher or statistician may help you in working out your data collection
and sampling design.
Data collection is a methodical process of gathering and analyzing specific information to give
solutions to relevant research questions.
Types of Data
1. Primary Data. These are data collected by the investigator himself/herself for a
specific purpose. For instance, the data collected by an investigator for their own research
project is an example of primary data.
2. Secondary Data. These are data collected by someone else for some other purpose, but
are being utilized by the current investigator for another purpose. For instance,
census data used to analyze the impact of education on career choice and earnings
is an example of secondary data.
Sampling Techniques
1. Probability Sampling. It is a sampling technique wherein the members of the population are
given an (almost) equal chance to be included in the sample.
Simple Random Sampling. All members of the population have an equal chance of being included in the sample.
Example: lottery method, random numbers
Systematic Random Sampling (with a random start). It selects every kth member of the
population with a starting point determined at random. Example: Selecting every 5th member
of N = 1000, to get a sample of 200. For instance, starting at the 7th member, we have the 12th, 17th,
22nd, and so on.
Stratified Random Sampling. This is used when the population can be divided into several smaller non-
overlapping groups (strata); the sample is then randomly selected from each group.
Cluster Sampling. Also called area sampling, in which groups or clusters, instead of individuals,
are selected randomly as the sample.
Multi-stage Sampling. If the population is too big, two or more sampling techniques may be
used until the desired sample is obtained.
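The first three probability sampling techniques can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `seed` parameter, and the numbered population are hypothetical, and the systematic sample uses the common convention of picking the random start among the first k members.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members so that every member has the same chance of selection."""
    return random.Random(seed).sample(population, n)


def systematic_sample(population, k, seed=None):
    """Take every k-th member, starting from a random point among the first k."""
    start = random.Random(seed).randrange(k)  # 0-based index of the random start
    return population[start::k]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical population: members numbered 1 to 1000.
population = list(range(1, 1001))
every_fifth = systematic_sample(population, k=5, seed=1)
print(len(every_fifth))  # 200 members, each 5 positions apart
```

Fixing the seed makes a draw repeatable, which is convenient when a sampling design has to be documented or audited.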
2. Non-probability Sampling. It is a sampling technique wherein the sample is determined by set
criteria, purpose, or personal
1. Purposive or Judgment Sampling. The sample is selected based on predetermined criteria set by
the researcher. Example: To determine the difficulties encountered by students in the
2017 national achievement test, only the Grade 6 pupils of the said school
will be included as a sample.
2. Convenience or Accidental Sampling. It relies on data collection from population members who
are conveniently available to participate in the study. Facebook polls or questions are
a popular example of convenience sampling.
3. Quota Sampling. It is a non-probability sampling technique in which researchers look for a
specific characteristic in their respondents, and then take a tailored sample that is in
proportion to a population of
4. Snowball Sampling. The samples are determined by referrals made by previous members of the sample.
MODULE 3: DATA PRESENTATION AND VISUALIZATION
Introduction
Data visualization is a graphical representation of information and data. The different data
visualization tools provide an accessible way to see and understand trends, outliers, and patterns in
data. Being another form of visual art, data visualization grabs the interest and attention of the
audience on the message. It helps to tell the important stories by curating data into a form easier to
understand, highlighting the most important aspect of the data set. However, data presentation and
visualization are not as simple as creating graphs and tables. Effective presentation and visualization
of data involve a balance between form (aesthetics) and function.
A statistical graph (or chart) is a tool that helps readers to understand the characteristics of the
distribution of a sample or a population. Effective data presentation follows these principles.
Five Essential Elements of Data Visualization (Data Craze, 2020)
1. Consistent Style and Colors. Carefully choose and maintain the same style across your
visualizations. Remember that the true meaning and value of data are not just in
2. Select Right Visualization. A bar or pie chart is not the only visualization method in your
arsenal. Adjust what you want to present based on the purpose and type of data you
3. Less is More. Focus on the quality of what you want to present. The excessive number of charts
or indicators is distracting. Simplicity comes at a price – the less information to analyze the
4. Effective Visualization. The difference between effective and impressive visualization can be
huge. The data presented in the application should foremost give a value – effect in the form of
specific
5. Data Quality. The trust of users is difficult to build, but it is easy to lose. Unexpected
information is desirable, errors are not. Try to detect errors at an early
Other useful graphs and charts, with their description, use, and other important features may be
found at The Data Visualization Catalogue via datavizcatalogue.com
Here are some tips on improving your charts and graphs (Visme, 2020).
1. Our eyes do not follow a specific order, so you need to create that order. Create a visualization
that deliberately takes viewers on a predefined visual
2. Our eyes first focus on what stands out, so be intentional with your focal point. Create charts
and graphs with one clear message that can be effortlessly
3. Our eyes can only handle a few things at once, so do not overcrowd your design. Simplify your
charts so that they highlight one main point you want you
4. Our brains are designed to immediately look for connections and try to find meaning in the data.
Assign colors deliberately to improve the functionality of your
5. We are guided by cultural
Lesson 2: Tabular Presentation of Data
Almost all research and technical reports use tables to present data. Tabular presentation of
data is a systematic and logical arrangement of data into rows and columns with respect to the
characteristics of data.
Components of Tables
1. Table Number and Title. It is included for easy reference and identification. It should indicate
the nature of the information that is included in the
2. Stub (Row Labels). It is placed on the left side of the tabular form indicating specific issues in the
3. Captions (Column Headings). It is placed at the top of the columns of a table to explain figures of
the
4. Body. It is the most important part of the table, which comprises the numerical contents and reveals the
whole story of investigated
5. Footnote. It provides further explanation that may be needed for any item that is included in a
6. Source note. It is placed at the bottom of the table to indicate the sources of
Tabular Presentation of Nominal and Ordinal Data
Nominal or ordinal data are presented using a frequency table or frequency distribution
table. The table displays frequency count and percentages for each value of a variable.
Example: Suppose your research objective is to determine the profile of the respondents. The data
may be presented as follows.
A contingency table or crosstabulation can also be used to display the relationship between
categorical variables. This type of presentation allows us to examine a hypothesis regarding the
independence or dependence between variables.
Example: Suppose your research objective is to determine the profile of the respondents. The data
may be presented in crosstabulation as follows.
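As an illustration of both table types, the sketch below builds a frequency table (count and percentage) and a simple crosstabulation with the standard library. The respondent records and the function names are hypothetical and serve only to show the mechanics.

```python
from collections import Counter

# Hypothetical respondent profile (illustrative data, not from the module):
# each record is (sex, civil status).
respondents = [
    ("Female", "Single"), ("Male", "Married"), ("Female", "Married"),
    ("Female", "Single"), ("Male", "Single"), ("Male", "Married"),
    ("Female", "Married"), ("Female", "Single"),
]


def frequency_table(values):
    """Return {category: (frequency, percentage)} for one categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: (count, 100 * count / total)
            for category, count in counts.items()}


def crosstab(pairs):
    """Return {(row category, column category): frequency} for two variables."""
    return Counter(pairs)


sex_table = frequency_table(sex for sex, _ in respondents)
sex_by_status = crosstab(respondents)

for category, (count, percent) in sex_table.items():
    print(f"{category:<8}{count:>3}{percent:>8.1f}%")
for (sex, status), count in sorted(sex_by_status.items()):
    print(f"{sex:<8}{status:<9}{count:>3}")
```

The cell counts of the crosstabulation are the raw material for a test of independence between the two variables.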
Tabular Presentation of Interval and Ratio Data
The data on the interval or ratio scale are organized using a frequency distribution table.
These are the steps in constructing a frequency distribution table.
1. Determine the number of class intervals, $k = 1 + 3.322 \log n$ (common logarithm), the range, $R = \text{maximum} - \text{minimum}$, and the class size, $c = R/k$.
2. Construct the class intervals based on the class size. The first and last class intervals should contain
the minimum and maximum value, respectively. It is advisable to start the first class interval
with the minimum value.
3. Arrange the data in either ascending or descending order. Then tally the scores based on the
class intervals in step 2.
4. Add columns for class boundaries, class mark or class midpoint, relative frequency, and
cumulative frequencies.
The class interval contains the lower (L) and upper (U) limits. (e.g., in the class interval 46
– 65, the lower limit is 46 and the upper limit is 65)
The class mark or class midpoint (X) is the value in the middle of the class interval. (e.g., in the
class interval 46 – 65, the class mark is 55.5; that is, $X = (46 + 65)/2 = 55.5$)
The class boundaries are the true class limits of the class intervals. They lie half a unit below the
lower limit and half a unit above the upper limit. (e.g., in the class interval 46 – 65, the class
boundaries are 45.5 – 65.5)
The relative frequency (also known as percentage frequency) is computed using the formula $rf = \frac{f}{n} \times 100\%$,
where f is the frequency of the class interval and n is the total of the frequencies.
The less than cumulative frequency (<cf) and greater than cumulative frequency (>cf) are
obtained by adding the frequencies from top to bottom and from bottom to top,
respectively.
Example: Using the scores of 50 students in a 55-item Mathematics test, construct a frequency
distribution table.
43 30 35 37 42 19 26 48 34 15
35 18 46 41 27 18 13 40 29 14
40 17 10 21 28 13 14 39 30 5
19 50 36 20 31 28 48 32 20 38
25 12 33 31 28 16 40 32 26 35
Solution:
Step 1: Determine the number of class intervals, the range, and the class size.
$k = 1 + 3.322 \log n = 1 + 3.322 \log(50) = 6.643978 \approx 7$
$R = \text{maximum} - \text{minimum} = 50 - 5 = 45$
$c = R/k = 45/7 = 6.43 \approx 7$
Step 2: Construct the class intervals based on the class size.
Since our minimum value is 5 and the class size is 7, the first class interval is 5 – 11. Note
that this class interval contains 7 values – 5, 6, 7, 8, 9, 10, 11.
To construct the succeeding intervals, add the class size to the lower and upper limits.
Class Intervals
5 – 11
12 – 18
19 – 25
26 – 32
33 – 39
40 – 46
47 – 53
Step 3: Arrange the data in either ascending or descending order. Then tally the scores based on the
class intervals in step 2.
5 14 18 20 27 30 32 35 40 43
10 14 18 21 28 30 33 36 40 46
12 15 19 25 28 31 34 37 40 48
13 16 19 26 28 31 35 38 41 48
13 17 20 26 29 32 35 39 42 50
This data set can be organized or sorted using stem-and-leaf plot. A stem-and-leaf plot is a special
table where each data value is split into a stem, first digit or digits, and a leaf, last digit.
Stem Leaf
0 5
1 0233445678899
2 00156678889
3 001122345556789
4 000123688
5 0
Step 4. Add columns for class boundaries, class mark or class midpoint, relative frequency, and
cumulative frequencies.
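The four steps can be sketched in Python as follows. This is a minimal illustration, not the module's own procedure in code: the function names are hypothetical, and rounding k and c up with `math.ceil` is an assumption that mirrors the worked example (6.64 becomes 7 and 6.43 becomes 7).

```python
import math

scores = [
    43, 30, 35, 37, 42, 19, 26, 48, 34, 15,
    35, 18, 46, 41, 27, 18, 13, 40, 29, 14,
    40, 17, 10, 21, 28, 13, 14, 39, 30, 5,
    19, 50, 36, 20, 31, 28, 48, 32, 20, 38,
    25, 12, 33, 31, 28, 16, 40, 32, 26, 35,
]


def class_intervals(data):
    """Steps 1 and 2: class count k, class size c, then the (lower, upper) limits."""
    k = math.ceil(1 + 3.322 * math.log10(len(data)))  # number of classes, rounded up
    size = math.ceil((max(data) - min(data)) / k)      # class size, rounded up
    lower, intervals = min(data), []
    while lower <= max(data):
        intervals.append((lower, lower + size - 1))
        lower += size
    return intervals


def frequency_distribution(data):
    """Steps 3 and 4: tally each class and add the derived columns."""
    n, cumulative, rows = len(data), 0, []
    for lower, upper in class_intervals(data):
        f = sum(lower <= x <= upper for x in data)
        cumulative += f
        rows.append({
            "interval": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "f": f,
            "midpoint": (lower + upper) / 2,
            "rf": 100 * f / n,            # relative frequency in percent
            "<cf": cumulative,            # added from top to bottom
            ">cf": n - cumulative + f,    # added from bottom to top
        })
    return rows


for row in frequency_distribution(scores):
    print(row)
```

For the 50 scores this produces seven classes, from 5 to 11 up to 47 to 53, with frequencies 2, 10, 6, 13, 9, 7, and 3.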
10/15/21, 5:06 PM STAT.APP(BSA3-4_ 7:00-10:00 SATURDAY) - https://ubian.ub.edu.ph/student_lesson/show/2750742?from=%2Fstudent_lesson…
A. MEAN - It is the balance point of a data set and the most reliable and most sensitive measure of average. It is always unique and is affected by extreme values. It is used if
data are interval or ratio, and when the data are normally distributed.
For ungrouped data, the mean can be computed as follows: $\bar{x} = \frac{\sum x}{n}$, where $\sum x$ is the sum of the values and $n$ is the number of values.
Example: The following are the scores of 10 students in a 50-item Qualifying Examination.
45 44 42 40 45 48 49 50 50 47
The mean score of 10 students in a 50-item Qualifying Examination is 46. In general, the group of students did well in the examination.
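As a quick check of this result, the mean is the sum of the scores divided by the number of scores. The variable names below are illustrative.

```python
scores = [45, 44, 42, 40, 45, 48, 49, 50, 50, 47]

# Mean of ungrouped data: sum of the values divided by the number of values.
mean_score = sum(scores) / len(scores)
print(mean_score)  # 46.0
```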
Example: A high school teacher conducts a semester evaluation of the Math textbook used in Calculus. The following are data collected among the 40 students.
Table 1 shows the textbook evaluation results for Calculus. The students strongly agreed that the Calculus book is organized appropriately for the users (M
= 3.63). The students also agreed that the Calculus book has relevant content (M = 3.38), developmentally appropriate supplementary activities (M = 3.25), and
other features that promote higher order thinking skills (M = 3.08).
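For grouped data (data arranged in a frequency distribution table), we compute the mean using the formula $\bar{x} = \frac{\sum fx}{n}$, where f is the class frequency, x is the class mark, and n is the total of the frequencies.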
Example: Compute and interpret the mean of the data set presented in the following frequency distribution table.
The mean score of 50 students who took the 55-item Mathematics Test is 29. The mean
score suggests that the students did not perform satisfactorily in the mathematics test.
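The grouped mean can be reproduced with a short sketch. It assumes the frequency distribution is the one obtained by tallying the 50 Mathematics scores listed earlier into the classes 5 to 11 through 47 to 53; the function name is hypothetical.

```python
# (class limits, frequency), tallied from the 50 Mathematics test scores.
classes = [
    ((5, 11), 2), ((12, 18), 10), ((19, 25), 6), ((26, 32), 13),
    ((33, 39), 9), ((40, 46), 7), ((47, 53), 3),
]


def grouped_mean(classes):
    """Mean of grouped data: sum of f * x divided by n, where x is the class mark."""
    n = sum(f for _, f in classes)
    sum_fx = sum(f * (lower + upper) / 2 for (lower, upper), f in classes)
    return sum_fx / n


print(grouped_mean(classes))  # 29.0
```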
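B. MEDIAN. It is the middle value in an ordered set of data; hence, it is not affected by outliers and is unique. It is used if data are ordinal, when there are a few extreme values
in the data set, when some values are missing or underestimated, or when there are open-ended distributions.
For ungrouped data, the median can be determined as follows: arrange the values in ascending order. If n is odd, the median is the $\frac{n+1}{2}$th value; if n is even, the median is the mean of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th values.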
Example: Consider the following data set. Compute the median values for each group.
Group A 22 33 21 18 19 15 16 18 16
Group B 15 15 15 16 17 20 21 20 20 28 25 27
Arranged in ascending order:
Group A 15 16 16 18 18 19 21 22 33
Group B 15 15 15 16 17 20 20 20 21 25 27 28
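The medians of the two groups can be checked with Python's `statistics.median`, which sorts the data and takes the middle value (odd n) or the mean of the two middle values (even n).

```python
from statistics import median

group_a = [22, 33, 21, 18, 19, 15, 16, 18, 16]              # n = 9 (odd)
group_b = [15, 15, 15, 16, 17, 20, 21, 20, 20, 28, 25, 27]  # n = 12 (even)

median_a = median(group_a)  # 5th value of the sorted data
median_b = median(group_b)  # mean of the 6th and 7th sorted values
print(median_a, median_b)   # 18 20.0
```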
Example: Compute and interpret the median of the data set presented in the following frequency distribution table.
The median, 29.27, means that 50% of the students have scores less than 29.27 and 50% have scores greater than 29.27.
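The value 29.27 can be reproduced with the usual interpolation formula for grouped data, $\tilde{x} = LB + \frac{n/2 - cf}{f} \times c$, where LB is the lower boundary of the median class, cf is the cumulative frequency before it, f is its frequency, and c is the class size. The sketch assumes the frequency distribution of the 50 Mathematics scores; the function name is hypothetical.

```python
# (class limits, frequency), tallied from the 50 Mathematics test scores.
classes = [
    ((5, 11), 2), ((12, 18), 10), ((19, 25), 6), ((26, 32), 13),
    ((33, 39), 9), ((40, 46), 7), ((47, 53), 3),
]


def grouped_median(classes):
    """Interpolate inside the first class whose cumulative frequency reaches n/2."""
    n = sum(f for _, f in classes)
    cumulative = 0  # cumulative frequency before the current class
    for (lower, upper), f in classes:
        if cumulative + f >= n / 2:
            lower_boundary = lower - 0.5
            size = upper - lower + 1
            return lower_boundary + (n / 2 - cumulative) / f * size
        cumulative += f


print(round(grouped_median(classes), 2))  # 29.27
```

Here the median class is 26 to 32, so the computation is 25.5 + (25 − 18)/13 × 7.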
C. MODE. It is the most frequently occurring value in the data set. It is not necessarily unique; that is, a data set may have no mode, one mode, or multiple modes. For this
reason, it is the least reliable among the three averages. It is used when data are in nominal scale.
Example: Consider the following data set. Determine the mode for each group.
Group A 22 33 21 18 19 15 16 18 16
Group B 15 15 15 16 17 20 21 20 20 28 25 27
Solution: By inspection, the modes of Group A are 16 and 18; hence, it is a bimodal distribution. The modes of Group B are 15 and 20, which is also a bimodal
distribution.
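The inspection can be confirmed with `statistics.multimode` (Python 3.8 or later), which returns every value tied for the highest frequency.

```python
from statistics import multimode

group_a = [22, 33, 21, 18, 19, 15, 16, 18, 16]
group_b = [15, 15, 15, 16, 17, 20, 21, 20, 20, 28, 25, 27]

# Sort the result so the modes are listed in ascending order.
modes_a = sorted(multimode(group_a))
modes_b = sorted(multimode(group_b))
print(modes_a, modes_b)  # [16, 18] [15, 20]
```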
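For grouped data, we can compute the mode using the formula $\hat{x} = LB + \frac{d_1}{d_1 + d_2} \times c$, where LB is the lower boundary of the modal class, $d_1$ is the difference between the frequency of the modal class and that of the class before it, $d_2$ is the difference between the frequency of the modal class and that of the class after it, and c is the class size.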
Example: Compute and interpret the mode of the data set presented in the following frequency distribution table.
The mode, approximately 30, means that the most frequently occurring score among the students who took the math test is about 30.
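A sketch of the grouped-mode computation follows, using the common formula $LB + \frac{d_1}{d_1 + d_2} \times c$ (lower boundary of the modal class, frequency differences with the neighboring classes, and class size). It assumes the frequency distribution of the 50 Mathematics scores and a single modal class; the function name is hypothetical.

```python
# (class limits, frequency), tallied from the 50 Mathematics test scores.
classes = [
    ((5, 11), 2), ((12, 18), 10), ((19, 25), 6), ((26, 32), 13),
    ((33, 39), 9), ((40, 46), 7), ((47, 53), 3),
]


def grouped_mode(classes):
    """Mode of grouped data, interpolated inside the class with the highest frequency."""
    freqs = [f for _, f in classes]
    i = freqs.index(max(freqs))                       # position of the modal class
    (lower, upper), f = classes[i]
    before = freqs[i - 1] if i > 0 else 0             # frequency of the class before
    after = freqs[i + 1] if i < len(freqs) - 1 else 0  # frequency of the class after
    d1, d2 = f - before, f - after
    return (lower - 0.5) + d1 / (d1 + d2) * (upper - lower + 1)


print(round(grouped_mode(classes), 2))  # 29.95, about 30
```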
Watch the following videos on measures of central tendency to understand more about these concepts.
Khan Academy. (14 November 2011). Finding mean, median, and mode [Video clip]. Retrieved 25 July 2020 from
The Organic Chemistry Tutor. (26 January 2019). Mean, median, and mode of grouped data & frequency distribution tables statistics [Video clip]. Retrieved 25
July 2020 from
Emmanuel, E. (11 February 2019). Mean, Median, and Mode (grouped data) [Video clip].
Quartiles
Example: Determine the first and third quartile of the following data sets.
Group A: 85, 88, 87, 89, 86, 86, 85, 87, 89, 88
Group B: 94, 80, 79, 88, 96, 86, 83, 81, 85, 99, 92
Solution
For Group A, we arrange the values in an array: 85, 85, 86, 86, 87, 87, 88, 88, 89, 89
The median is $\tilde{x} = \frac{87 + 87}{2} = 87$.
The lower 50% includes 85, 85, 86, 86, 87. The middle value, 86, is the first quartile.
The upper 50% includes 87, 88, 88, 89, 89. The middle value, 88, is the third quartile.
For Group B, we arrange the values in an array: 79, 80, 81, 83, 85, 86, 88, 92, 94, 96, 99. The median is 86.
The lower 50% includes 79, 80, 81, 83, 85. The middle value, 81, is the first quartile.
The upper 50% includes 88, 92, 94, 96, 99. The middle value, 94, is the third quartile.
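This "median of each half" approach can be sketched as follows. The function name is hypothetical, and excluding the overall median from both halves when n is odd is an assumption that matches the Group B solution.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) as the medians of the lower half, all data, and upper half."""
    data = sorted(values)
    half = len(data) // 2       # for odd n, the middle value is left out of both halves
    lower, upper = data[:half], data[-half:]
    return median(lower), median(data), median(upper)


group_a = [85, 88, 87, 89, 86, 86, 85, 87, 89, 88]
group_b = [94, 80, 79, 88, 96, 86, 83, 81, 85, 99, 92]

print(quartiles_by_halves(group_a))  # (86, 87.0, 88)
print(quartiles_by_halves(group_b))  # (81, 86, 94)
```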
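For the first quartile,
1. Compute $k = \frac{n+1}{4}$.
2. If k is an integer, the value of the first quartile is $x_k$. If k is not an integer, the value of the first quartile is $x_{\lfloor k \rfloor} + (x_{\lfloor k \rfloor + 1} - x_{\lfloor k \rfloor}) \times (\text{decimal part of } k)$.
For the third quartile,
1. Compute $t = \frac{3(n+1)}{4}$.
2. If t is an integer, the value of the third quartile is $x_t$. If t is not an integer, the value of the third quartile is $x_{\lfloor t \rfloor} + (x_{\lfloor t \rfloor + 1} - x_{\lfloor t \rfloor}) \times (\text{decimal part of } t)$.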
Example: Determine the first and third quartile of the following data sets.
Group A: 85, 88, 87, 89, 86, 86, 85, 87, 89, 88
Group B: 94, 80, 79, 88, 96, 86, 83, 81, 85, 99, 92
Solution:
For Group A, we arrange the values in an array: 85, 85, 86, 86, 87, 87, 88, 88, 89, 89
For the first quartile, we compute $k = \frac{n+1}{4} = \frac{10+1}{4} = 2.75$. Since k is not an integer, the first quartile is computed as follows.
Let $x_2 = 85$, $x_3 = 86$, and the decimal part of k be 0.75. Thus, $Q_1 = 85 + (86 - 85)(0.75) = 85.75$.
For the third quartile, we compute $t = \frac{3(n+1)}{4} = \frac{3(10+1)}{4} = 8.25$. Since t is not an integer, the third quartile is computed as follows.
Let $x_8 = 88$, $x_9 = 89$, and the decimal part of t be 0.25. Thus, $Q_3 = 88 + (89 - 88)(0.25) = 88.25$.
Using MS Excel
To compute the values of the quartiles using Excel, we use the formula
=QUARTILE.EXC(array, quart). The “quart” in the formula may be 1, 2, or 3 for the first, second, or third quartile, respectively.
For Group B, we arrange the values in an array: 79, 80, 81, 83, 85, 86, 88, 92, 94, 96, 99
For the first quartile, we compute $k = \frac{n+1}{4} = \frac{11+1}{4} = 3$. Since k is an integer, the first quartile is $Q_1 = x_3 = 81$.
For the third quartile, we compute $t = \frac{3(n+1)}{4} = \frac{3(11+1)}{4} = 9$. Since t is an integer, the third quartile is $Q_3 = x_9 = 94$.
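The position-based procedure can be written as a short Python sketch. The function names are hypothetical; positions are 1-based, as in the solution above, and the positions (n + 1)/4 and 3(n + 1)/4 are exact in floating point because they are multiples of one quarter.

```python
import math


def value_at_position(sorted_data, position):
    """Value at a 1-based position, interpolating linearly for a fractional position."""
    whole = math.floor(position)
    fraction = position - whole
    if fraction == 0:
        return float(sorted_data[whole - 1])
    below, above = sorted_data[whole - 1], sorted_data[whole]
    return below + (above - below) * fraction


def quartiles_by_position(values):
    """Return (Q1, Q3) using the positions k = (n + 1)/4 and t = 3(n + 1)/4."""
    data = sorted(values)
    n = len(data)
    return (value_at_position(data, (n + 1) / 4),
            value_at_position(data, 3 * (n + 1) / 4))


group_a = [85, 88, 87, 89, 86, 86, 85, 87, 89, 88]
group_b = [94, 80, 79, 88, 96, 86, 83, 81, 85, 99, 92]

print(quartiles_by_position(group_a))  # (85.75, 88.25)
print(quartiles_by_position(group_b))  # (81.0, 94.0)
```

Python's `statistics.quantiles(data, n=4, method="exclusive")` follows the same (n + 1) positioning and should return matching first and third quartiles for these two groups.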
The computation for deciles and percentiles follows the same procedures as the computation of quartiles using the formula.
Example: Determine the 10th, 80th, and 90th percentiles of the following data set.
Group A: 85, 88, 87, 89, 86, 86, 85, 87, 89, 88
Solution:
For Group A, we arrange the values in an array: 85, 85, 86, 86, 87, 87, 88, 88, 89, 89
Figure 8 shows the MS Excel output using the formula =PERCENTILE.EXC(array, k), where k has values between 0 and 1. That is, for the 80th percentile, k = 0.8.
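For readers without Excel, the sketch below computes percentiles by hand. It assumes the percentile position is p(n + 1)/100, which generalizes the quartile positions (n + 1)/4 and 3(n + 1)/4 used earlier; the function name is hypothetical and p is expected to be a whole-number percentage.

```python
def percentile_by_position(values, p):
    """p-th percentile (p an integer from 1 to 99) at the 1-based position p(n + 1)/100."""
    data = sorted(values)
    n = len(data)
    # Integer arithmetic keeps the whole and fractional parts of the position exact.
    whole, remainder = divmod(p * (n + 1), 100)
    if whole < 1 or whole > n or (whole == n and remainder):
        raise ValueError("position falls outside the data; this method cannot interpolate")
    if remainder == 0:
        return float(data[whole - 1])
    below, above = data[whole - 1], data[whole]
    return below + (above - below) * remainder / 100


group_a = [85, 88, 87, 89, 86, 86, 85, 87, 89, 88]

for p in (10, 80, 90):
    print(p, round(percentile_by_position(group_a, p), 2))
```

For Group A the positions are 1.1, 8.8, and 9.9, which give the 10th, 80th, and 90th percentiles of 85, 88.8, and 89, respectively.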
Note that these formulas are very similar to the formula of the median.
Example: Compute and interpret the first and third quartile, 3rd decile, and 10th and 90th percentile of the data set presented in the following frequency
distribution table.
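Because the grouped formulas mirror the grouped median, one function can cover quartiles, deciles, and percentiles. The sketch below is written under two assumptions: the table is the frequency distribution of the 50 Mathematics scores, and the formula is $LB + \frac{kn/m - cf}{f} \times c$, with m = 4, 10, or 100 for quartiles, deciles, or percentiles. The function name is hypothetical.

```python
# (class limits, frequency), tallied from the 50 Mathematics test scores.
classes = [
    ((5, 11), 2), ((12, 18), 10), ((19, 25), 6), ((26, 32), 13),
    ((33, 39), 9), ((40, 46), 7), ((47, 53), 3),
]


def grouped_quantile(classes, k, parts):
    """k-th of `parts` quantiles, e.g. k=1, parts=4 for Q1 or k=90, parts=100 for P90."""
    n = sum(f for _, f in classes)
    target = k * n / parts  # how many observations lie below the quantile
    cumulative = 0          # cumulative frequency before the current class
    for (lower, upper), f in classes:
        if f > 0 and cumulative + f >= target:
            return (lower - 0.5) + (target - cumulative) / f * (upper - lower + 1)
        cumulative += f
    raise ValueError("k must not exceed parts")


for label, k, parts in [("Q1", 1, 4), ("Q3", 3, 4), ("D3", 3, 10),
                        ("P10", 10, 100), ("P90", 90, 100)]:
    print(label, round(grouped_quantile(classes, k, parts), 2))
```

Under these assumptions, about 25% of the students scored below 19.08 and about 75% scored below 37.56; the second quartile from the same function reproduces the grouped median of 29.27.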
MEASURES OF VARIABILITY
Lesson 3: Measures of Variability
To get a complete description of the data set, it is not enough to compute averages. We may also consider other statistics called measures of variability to
determine homogeneity or heterogeneity of the data in a distribution. Measures of variability describe how spread out or scattered a set of data is.
As shown in figure 6, the data sets may have the same average but may have a totally different interpretation due to the spread (variation or differences) of
data values.
A. Absolute Variability
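Range. This refers to the difference between the maximum and minimum values: $R = \text{maximum} - \text{minimum}$.
Mean Absolute Deviation. This refers to the average of the absolute deviations of the values from the mean.
Ungrouped data: $MAD = \frac{\sum |x - \bar{x}|}{n}$
Grouped data: $MAD = \frac{\sum f|x - \bar{x}|}{n}$, where x is the class mark and f is the class frequency
Variance and Standard Deviation. The variance refers to the average of the squared deviations of the values from the mean. The standard deviation refers to the square root
of the variance.
Ungrouped data: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ and $s = \sqrt{s^2}$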
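Interquartile Range and Quartile Deviation. The interquartile range is the measure of dispersion of the middle 50% of the data, $IQR = Q_3 - Q_1$. The quartile deviation is half of the interquartile range, $QD = \frac{Q_3 - Q_1}{2}$.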
Example: Consider the following raw data on performance rating set taken from a sample of 10 respondents.
i 1 2 3 4 5 6 7 8 9 10
xi 84 86 83 81 87 90 93 85 80 94
Compute the range, mean absolute deviation, variance, standard deviation, interquartile range, and quartile deviation.
Solution:
a. Range = 94 − 80 = 14
The ratings span 14 points from the minimum to the maximum value.
To compute the mean absolute deviation, variance, and standard deviation, the mean must be determined first. Then, the column for $|x - \bar{x}|$ is computed
by getting the absolute value of the difference between the ratings and the mean. The column for $(x - \bar{x})^2$ is computed by squaring the values in the $|x - \bar{x}|$ column.
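The requested measures for the ten ratings can be sketched as follows. Two assumptions are made here: the variance uses the sample divisor n − 1, and the quartiles use the (n + 1) position method, which is what `statistics.quantiles` with `method="exclusive"` implements (Python 3.8 or later).

```python
from statistics import mean, quantiles, stdev, variance

ratings = [84, 86, 83, 81, 87, 90, 93, 85, 80, 94]

x_bar = mean(ratings)                     # the mean comes first
data_range = max(ratings) - min(ratings)  # maximum minus minimum

# Mean absolute deviation: average of |x - mean|.
mad = sum(abs(x - x_bar) for x in ratings) / len(ratings)

# Sample variance (divisor n - 1) and its square root.
s2 = variance(ratings)
s = stdev(ratings)

# Quartiles by the (n + 1) position method, then IQR and quartile deviation.
q1, _, q3 = quantiles(ratings, n=4, method="exclusive")
iqr = q3 - q1
qd = iqr / 2

print(f"mean={x_bar:.1f} range={data_range} MAD={mad:.2f}")
print(f"variance={s2:.4f} sd={s:.4f}")
print(f"Q1={q1} Q3={q3} IQR={iqr} QD={qd}")
```

With these conventions the mean is 86.3, the mean absolute deviation is 3.76, the variance is about 22.68, the standard deviation is about 4.76, the interquartile range is 8.25, and the quartile deviation is 4.125.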
CLASS INTERVALS    CLASS BOUNDARIES    f    x     fx     $fx^2$
47 – 53            46.5 – 53.5         3    50    150    7500
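The f, x, fx, and $fx^2$ columns feed the computational formula for the variance of grouped data. The module's own formula did not survive in this copy, so the sketch below assumes the common sample version, $s^2 = \frac{n\sum fx^2 - (\sum fx)^2}{n(n-1)}$, applied to the frequency distribution of the 50 Mathematics scores (the surviving row, 47 – 53 with f = 3 and x = 50, is its last class). The function name is hypothetical.

```python
import math

# (class limits, frequency), tallied from the 50 Mathematics test scores.
classes = [
    ((5, 11), 2), ((12, 18), 10), ((19, 25), 6), ((26, 32), 13),
    ((33, 39), 9), ((40, 46), 7), ((47, 53), 3),
]


def grouped_variance(classes):
    """Sample variance of grouped data from the f, x, fx, and fx^2 columns."""
    n = sum(f for _, f in classes)
    marks = [((lower + upper) / 2, f) for (lower, upper), f in classes]
    sum_fx = sum(f * x for x, f in marks)
    sum_fx2 = sum(f * x ** 2 for x, f in marks)
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))


s2 = grouped_variance(classes)
print(s2, round(math.sqrt(s2), 2))  # 128.0 11.31
```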
Example. Which of the following groups is most varied in terms of performance ratings? Compute their coefficients of variation.
Group 1: 85, 88, 87, 89, 86, 86, 85, 87, 89, 88
Group 2: 94, 80, 79, 88, 96, 85, 83, 81, 85, 99
Group 3: 70, 80, 79, 88, 89, 89, 85, 100, 90, 100
Solution: Notice that the groups have the same mean value of 87. Hence, we need another measure to compare them. First, compute the standard deviation.
You can verify the following values using MS Excel or your calculator.
Group    Mean    Standard Deviation    n
1        87      1.490712              10
2        87      7.055337              10
3        87      9.201449              10
Notice that Group 3 is more varied than the other two groups. Furthermore, the coefficient of variation leads to the same observation.
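The coefficient of variation expresses the standard deviation as a percentage of the mean, $CV = \frac{s}{\bar{x}} \times 100\%$. The sketch below uses the ten-value version of each group that reproduces the tabulated mean of 87 and the standard deviations above; treat those lists as an assumption if your copy of the data differs. The function name is hypothetical.

```python
from statistics import mean, stdev

groups = {
    1: [85, 88, 87, 89, 86, 86, 85, 87, 89, 88],
    2: [94, 80, 79, 88, 96, 85, 83, 81, 85, 99],
    3: [70, 80, 79, 88, 89, 89, 85, 100, 90, 100],
}


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


for name, values in groups.items():
    print(name, mean(values), round(stdev(values), 6),
          round(coefficient_of_variation(values), 2))
```

The coefficients come out to roughly 1.71%, 8.11%, and 10.58%, so Group 3 has the largest relative variation.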
Example. Which of the following groups is most varied in terms of performance ratings? Compute their quartile deviations.
Group 1: 85, 88, 87, 89, 86, 86, 85, 87, 89, 88
Group 2: 94, 80, 79, 88, 96, 85, 83, 81, 85, 99
Group 3: 70, 80, 79, 88, 89, 89, 85, 100, 90, 100
Solution: After arranging the terms in an array, we get the following values for the first and third quartiles.
Group    Q1       Q3
1        85.75    88.25
2        80.75    94.5
3        79.75    92.5
Since some groups have outliers, let us focus on the middle 50% of the distribution.
In terms of the variation in the middle 50%, Group 2 is more varied than the other two groups.
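The quartile deviations behind this comparison can be computed with `statistics.quantiles` using `method="exclusive"`, which applies the same (n + 1) position rule as the quartile procedure (Python 3.8 or later). The group lists are the ten-value sets that reproduce the tabulated quartiles; treat them as an assumption if your copy of the data differs.

```python
from statistics import quantiles

groups = {
    1: [85, 88, 87, 89, 86, 86, 85, 87, 89, 88],
    2: [94, 80, 79, 88, 96, 85, 83, 81, 85, 99],
    3: [70, 80, 79, 88, 89, 89, 85, 100, 90, 100],
}


def quartile_deviation(values):
    """Half the distance between the third and first quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="exclusive")
    return (q3 - q1) / 2


for name, values in groups.items():
    print(name, quartile_deviation(values))
```

The quartile deviations are 1.25, 6.875, and 6.375 for Groups 1, 2, and 3, which is why Group 2 shows the most variation in the middle 50%.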