Module 1 - Statistics For Remedial
Statistics
This module deals with the definition of statistics and terms used in the
study of statistics. It will also discuss the history and importance of the study of
statistics, summation rules, sampling techniques, collection of statistical data and
organizing collected data in a table, constructing frequency distribution tables,
and finding the measures of central tendency for ungrouped data. As you go over
the discussion and exercises, you will appreciate more the importance of
statistics in daily life. Enjoy learning this module and go over the discussion and
examples if you have not yet mastered a concept.
mean
median
mode
How much do you know
a. Sampling c. Organizing
b. Drawing d. Collecting
a. Graph c. Drawing
b. Table d. Sampling
4. Which of the following means $\sum_{i=1}^{5} x_i$?
a. 1 + 2+ 3 + 4 + 5
b. x + 2x + 3x + 4x + 5x
c. x1 + x2 + x3 + x4 + x5
d. none of the above
6. The frequency distribution below shows the scores
obtained by 300 students in a Mathematics test of 50 items.
Score      Number of Students
45-49      15
40-44      32
35-39      42
30-34      108
25-29      67
20-24      21
15-19      10
10-14      5
Total      300
47 45 35 44 48 39 37 29 28 50
a. 28 c. 40.2
b. 35 d. 45
a. 39 c. 43.5
b. 41.5 d. 44
What you will do
Lesson 1
Two important terms that you should understand in studying statistics are
population and sample.
Study the following situations. Identify the phrase which represents the
sample and which phrase shows the population.
1. Mrs. Jara wants to know the nutritional status of the first year students in
her school so she got 150 first year students to represent the year level.
2. When Sandra bought a sack of rice, she examined a handful from the
sack to check if it is the variety she wants.
3. A doctor wants to know what causes the infection in a patient, so he
requested a blood examination for the patient. The medical technologist
extracted only 10 cubic centimeters of blood from the patient for
examination.
4. The chef wants to check if the food being cooked tastes as he wants it to
be so he tasted a spoonful of it.
5. The school guidance counselor would like to know the course preference
of the graduating students in their school so she interviewed 100 of the
fourth year students.
Lesson 2
Records also show that the Roman Empire was the first government to
gather extensive data about the population, area, and wealth of the territories
that it controlled. In Europe, few comprehensive censuses were made during the
Middle Ages. In the early 16th century, registration of deaths and births began in
England. Then in 1662 the first noteworthy statistical study of population was
made. In 1691, a similar study of mortality made in Breslau, Germany was used
by the English astronomer Edmond Halley as a basis for the earliest mortality
table. In the 19th century, investigators recognized the need to reduce
information to numerical values to avoid the ambiguity of verbal description.
Lesson 3
In the given data, there are 10 observations denoted as $x_1, x_2, x_3, x_4, x_5,
x_6, x_7, x_8, x_9, x_{10}$.
Hence, $\sum_{i=1}^{10} x_i = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 + x_{10}$.
The symbol $\sum_{i=1}^{10} x_i$ is read as “the sum of 10 observations $x_1$ to $x_{10}$”.
$\sum_{i=1}^{10} x_i = 35 + 40 + 29 + 37 + 25 + 33 + 49 + 47 + 28 + 42$
$\sum_{i=1}^{10} x_i = 365$
For a large number of observations, say 50, the summation will be expressed as:
$\sum_{i=1}^{50} x_i = x_1 + x_2 + x_3 + \dots + x_{50}$.
In general, $\sum_{i=1}^{n} x_i = x_1 + x_2 + x_3 + \dots + x_n$.
If all the given values of a variable are to be used in finding the sum, the
limits of the summation are usually omitted, as
$\sum_{i=1}^{10} x_i = \sum x$
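If you have access to a computer, the summation can be checked with a short Python sketch. It uses the ten observations from the discussion above; the helper name `summation` and its 1-based `lower` and `upper` arguments are assumptions made for illustration only.

```python
# The ten observations x_1 ... x_10 from the discussion.
observations = [35, 40, 29, 37, 25, 33, 49, 47, 28, 42]


def summation(values, lower=1, upper=None):
    """Add x_lower through x_upper, counting from 1 as in sigma notation."""
    if upper is None:
        upper = len(values)  # limits omitted: use every value
    return sum(values[lower - 1:upper])


total = summation(observations)             # sum of all ten observations
first_four = summation(observations, 1, 4)  # x_1 + x_2 + x_3 + x_4
print(total, first_four)
```

Leaving out the limits in the call mirrors writing the summation without limits: every value is included.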
Example 2: Given are the ages of the first 4 shoppers at a newly opened
convenience store in the neighborhood – 12, 24, 30, 45.
Answers:
1. x will represent the ages of the first 4 shoppers in the newly opened
convenience store.
2. i will represent the position of each of the first 4 shoppers in the newly
opened convenience store (i = 1, 2, 3, 4).
3. $\sum_{i=1}^{4} x_i$ is the expression for the summation.
4. The lower limit is 1 and the upper limit is 4.
5. $\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4$
= 12 + 24 + 30 + 45
= 111
$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5$;
the sum of the squares of the five observations is represented as:
$\sum_{i=1}^{5} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2$;
$\sum_{i=1}^{5} a_i x_i = a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4 + a_5 x_5$
Solutions:
1. $\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4$
= 2 + 4 + 6 + 8
= 20
2. $\sum_{i=1}^{4} x_i^2 = x_1^2 + x_2^2 + x_3^2 + x_4^2$
= $2^2 + 4^2 + 6^2 + 8^2$
= 4 + 16 + 36 + 64
= 120
3. $\sum_{i=1}^{4} a_i x_i = a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4$
= 1(2) + 2(4) + 3(6) + 4(8)
= 2 + 8 + 18 + 32
= 60
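The three solutions above can be checked with a short Python sketch. The values of x (2, 4, 6, 8) and a (1, 2, 3, 4) are read off the worked solutions; the variable names are assumptions made for illustration.

```python
# Values taken from the worked solutions above.
x = [2, 4, 6, 8]
a = [1, 2, 3, 4]

sum_x = sum(x)                                  # x_1 + x_2 + x_3 + x_4
sum_x_squared = sum(xi ** 2 for xi in x)        # square each value, then add
sum_ax = sum(ai * xi for ai, xi in zip(a, x))   # multiply matching pairs, then add

print(sum_x, sum_x_squared, sum_ax)
```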
Example 4: Find 1. $\sum_{i=1}^{6} 3$     2. $\sum_{i=1}^{6} (-3)$
Solutions:
1. $\sum_{i=1}^{6} 3$ = 3 + 3 + 3 + 3 + 3 + 3 = 6(3) = 18
2. $\sum_{i=1}^{6} (-3)$ = (-3) + (-3) + (-3) + (-3) + (-3) + (-3) = 6(-3) = -18
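Example 4 shows that adding a constant n times gives n times the constant. A minimal Python sketch of that check follows; the function name `sum_of_constant` is an assumption made for illustration.

```python
def sum_of_constant(constant, n):
    """Add the same constant n times, term by term."""
    return sum(constant for _ in range(n))


positive_case = sum_of_constant(3, 6)    # 3 added six times
negative_case = sum_of_constant(-3, 6)   # -3 added six times

# Term-by-term addition agrees with the shortcut n * constant.
print(positive_case == 6 * 3, negative_case == 6 * -3)
```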
1. $\sum_{i=1}^{7} x_i$
2. $\sum_{j=1} z_j$
3. $\sum_{k=1}^{3} y_k$
4. $\sum_{j=1} p$
5. $\sum_{k=1}^{3} a_k y_k$
Compute:
16. $\sum_{i=1}^{4} 5$
17. $\sum_{i=1}^{10} (-2)$
18. $\sum_{i=1}^{8} 4$
Lesson 4
Sampling Techniques
Another method is stratified random sampling. This is used when the
population can be naturally classified into groups or strata.
Example: The clinic teacher wants to determine the average height and
weight of the first year students. How can she randomly select 50
students from a population of 250 male students and 300 female
students using (a) the simple random technique? (b) the
systematic random technique? (c) the stratified random
technique?
Answers:
The clinic teacher can randomly select the sample using simple random
sampling by following these simple steps:
2. Write the student number with his/her corresponding height in uniform
size slips of paper.
3. Roll the pieces of paper uniformly and place them in a box.
4. Draw a slip of paper at a time, shaking the box after each draw until 50
samples are taken.
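Drawing slips from a box can be imitated on a computer. The sketch below assumes the 550 students are numbered 1 to 550; the seed value is an arbitrary assumption used only so the draw can be repeated.

```python
import random

population_size = 550   # 250 male + 300 female students, from the example
sample_size = 50

# Each student number plays the role of one slip of paper in the box.
student_numbers = list(range(1, population_size + 1))

rng = random.Random(2024)  # arbitrary seed (assumption), for a repeatable draw
simple_random_sample = rng.sample(student_numbers, sample_size)  # no repeats

print(sorted(simple_random_sample))
```

Like the box of slips, `sample` never draws the same student twice, and every student has the same chance of being picked.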
The clinic teacher can select the sample by systematic random
sampling using the following steps:
1. Using the same data, with the students assigned numbers
and arranged in numerical order, the clinic teacher, with eyes closed,
points to a number. If the number pointed to is, let us say, 7, student
number 7 becomes part of the sample (sample number 1). This is a
“random start”.
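A possible way to continue from the random start is sketched below in Python. The remaining steps are not shown in this part of the module, so the rule used here is an assumption: take every k-th student after the start, where k is the population size divided by the sample size (550 / 50 = 11), and wrap around to the beginning of the list if the end is reached.

```python
def systematic_sample(population_size, sample_size, start):
    """Pick every k-th student number beginning at `start` (numbers run from 1)."""
    interval = population_size // sample_size   # k = 550 // 50 = 11 (assumed rule)
    return [
        ((start - 1 + step * interval) % population_size) + 1
        for step in range(sample_size)
    ]


# Random start of 7, as in the example above.
systematic = systematic_sample(population_size=550, sample_size=50, start=7)
print(systematic[:5], "...", systematic[-1])
```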
The clinic teacher can select the sample using the stratified random sampling
by using the following procedures:
1. The data should be classified into two groups, male and female.
2. Get a proportional number of samples from each group or strata. The
number of samples from the males will be 250/550 or 5/11 of 50, which is
23, and from the females, 27.
3. Place the slips of paper, properly filled out, in separate boxes for each
group.
4. Draw, one at a time, the required number for each group.
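The proportional allocation in step 2 can be sketched in Python as follows. The function name, the student numbering (1 to 250 for males, 251 to 550 for females), and the seed are assumptions made for illustration.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Share the sample among the strata in proportion to their sizes.

    Rounding can, for other data, give a total that is off by one
    from the intended sample size.
    """
    total = sum(strata_sizes.values())
    return {
        name: round(size * sample_size / total)
        for name, size in strata_sizes.items()
    }


strata_sizes = {"male": 250, "female": 300}
allocation = proportional_allocation(strata_sizes, 50)

# One "box" of student numbers per stratum (numbering is an assumption).
boxes = {"male": list(range(1, 251)), "female": list(range(251, 551))}
rng = random.Random(7)  # arbitrary seed (assumption), for a repeatable draw
stratified_sample = {
    name: rng.sample(boxes[name], allocation[name]) for name in boxes
}
print(allocation)
```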
1. Mrs. Lucas is studying the heights and weights of the students in her class.
Which of the following samples is most likely to be a good representation of
the whole class? Justify your answer.
2. For his report in Social Studies, Dennis wishes to interview a sample
of Metro Manila residents to determine their opinion regarding the economic
status of the country today. Tell whether he could find a sample that reflects
the entire population being studied at
1) a depressed area in Payatas.
2) a shopping mall in Makati.
3) the Starbucks coffee shop.
Lesson 5
The tabular and graphical forms are used when more detailed information
about the data is to be presented.
Example 1:
Mahusay National High School
Enrolment, SY 2005-2006
Year Level    Male    Female
First         216     267
Second        197     216
Third         187     227
Fourth        176     215
Total         776     925
You will observe that the table above shows clearly the enrolment data in
Mahusay National High School for the school year 2005-2006.
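The totals in the enrolment table can be verified with a short Python sketch. The dictionary layout is an assumption made for illustration; the figures are those in the table.

```python
# Enrolment per year level as (male, female), copied from the table.
enrolment = {
    "First": (216, 267),
    "Second": (197, 216),
    "Third": (187, 227),
    "Fourth": (176, 215),
}

male_total = sum(male for male, _ in enrolment.values())
female_total = sum(female for _, female in enrolment.values())
school_total = male_total + female_total

print(male_total, female_total, school_total)
```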
17 20 15 18 19 16 11 10 15 16
12 12 13 14 11 10 14 13 12 11
13 15 14 10 15 16 17 17 18 20
20 18 19 19 18 17 16 15 12 12
13 14 15 19 20
Solution: To prepare a frequency table for the given set of scores, the scores are
listed from highest to lowest, tally marks are made and counted. The
counted tally marks will then be recorded under the column frequency.
Notice that every 5th tally crosses the first four tallies. This is done to
make counting of marks easier especially if the number of cases is
rather big.
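The tallying described in the solution can be imitated in Python, as sketched below with the 45 scores listed above. The sketch prints plain tally marks and does not cross every fifth mark; the variable names are assumptions made for illustration.

```python
from collections import Counter

scores = [
    17, 20, 15, 18, 19, 16, 11, 10, 15, 16,
    12, 12, 13, 14, 11, 10, 14, 13, 12, 11,
    13, 15, 14, 10, 15, 16, 17, 17, 18, 20,
    20, 18, 19, 19, 18, 17, 16, 15, 12, 12,
    13, 14, 15, 19, 20,
]

frequency = Counter(scores)   # counts how many times each score occurs

# List the scores from highest to lowest with their tallies and frequencies.
for score in sorted(frequency, reverse=True):
    tally = "|" * frequency[score]
    print(f"{score:>5}  {tally:<8}  {frequency[score]}")
print("Total", sum(frequency.values()))
```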
Try this out
87 90 89 92 94
88 90 91 88 87
90 94 92 91 90
1.36 1.51 1.61 1.61 1.62 1.62 1.62 1.59 1.58 1.61
1.38 1.49 1.65 1.63 1.58 1.57 1.61 1.62 1.63 1.65
1.44 1.59 1.57 1.57 1.58 1.60 1.61 1.63 1.64 1.64
1.55 1.58 1.59 1.65 1.66 1.72 1.56 1.68 1.69 1.63
Lesson 6
A frequency distribution is a distribution of the total number of measures or
frequencies over arbitrarily defined categories or classes. The number of
measures falling under a class is called class frequency.
Example 1.
Score      Number of Students
45-49      15
40-44      32
35-39      42
30-34      108
25-29      67
20-24      21
15-19      10
10-14      5
Total      300
In the example above, the symbol 45-49 and the other symbols which
follow up to 10-14 are called class intervals. The end numbers are called class
limits. For instance in the class interval 45-49, 45 is called the lower limit while 49
is called the upper limit.
Each class interval also has a lower boundary and an upper boundary. For
the class interval 45-49, the lower boundary is 44.5 while the upper boundary is
49.5. Hence, for the class interval 45-49, 44.5 and 49.5 are called the class
boundaries.
The size of the class interval, also called the class size, is the difference
between the upper boundary and the lower boundary. Hence, the class size in
the given example is 49.5 - 44.5 = 5.
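The boundaries and class size can be computed with a short Python sketch. It assumes the scores are whole numbers, so each boundary lies half a unit beyond the class limit; the function names and the `unit` parameter are assumptions made for illustration.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) for one class interval."""
    half_unit = unit / 2   # 0.5 when scores are whole numbers
    return lower_limit - half_unit, upper_limit + half_unit


def class_size(lower_limit, upper_limit, unit=1):
    """Class size = upper boundary minus lower boundary."""
    lower_boundary, upper_boundary = class_boundaries(lower_limit, upper_limit, unit)
    return upper_boundary - lower_boundary


print(class_boundaries(45, 49), class_size(45, 49))
```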
The following are the rules in determining the size of the class interval:
1. The class intervals must cover the total range of the observations,
where the range R = H - L (H = highest and L = lowest). The number
of intervals is usually between 10 and 20.
2. Select class intervals with a size of 1, 3, 5, 10, or 20 points since
these will meet the requirements of most sets of data.
3. Start the class interval at a value which is a multiple of the size of
the interval. For example, with a class interval of 3, the intervals
should start with the values 3, 6, 9, etc.
39 93 80 49 41 85 75 59 62 68
34 49 50 46 69 72 73 76 77 54
95 63 66 64 88 90 51 53 56 79
70 70 78 85 86 59 66 72 77 76
71 79 70 65 40 57 82 75 89 82
Solution:
Since the size of the class interval is already given, and the lowest score is 34,
the class interval containing the lowest score should be 30-34, since the rule
states that the class interval should start with a number which is divisible by the
class size. After arranging the class intervals, tally the scores to determine the
frequency. Look at the obtained frequency distribution below.
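The grouping can be sketched in Python using the 50 scores listed above. A class size of 5 is an assumption inferred from the interval 30-34 named in the solution; the function and variable names are also assumptions made for illustration.

```python
from collections import Counter

scores = [
    39, 93, 80, 49, 41, 85, 75, 59, 62, 68,
    34, 49, 50, 46, 69, 72, 73, 76, 77, 54,
    95, 63, 66, 64, 88, 90, 51, 53, 56, 79,
    70, 70, 78, 85, 86, 59, 66, 72, 77, 76,
    71, 79, 70, 65, 40, 57, 82, 75, 89, 82,
]

CLASS_SIZE = 5  # assumption: inferred from the interval 30-34


def class_interval(score, size=CLASS_SIZE):
    """Return the (lower limit, upper limit) of the class containing score."""
    lower = (score // size) * size   # lower limit is a multiple of the class size
    return lower, lower + size - 1


distribution = Counter(class_interval(score) for score in scores)

# List the class intervals from highest to lowest with their frequencies.
for lower, upper in sorted(distribution, reverse=True):
    print(f"{lower}-{upper}  {distribution[(lower, upper)]}")
print("Total", sum(distribution.values()))
```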
Try this out
12 20 17 19 23 32 15 45 60 65
18 22 27 35 37 57 47 38 40 28
13 10 19 24 29 28 38 47 48 57
27 29 33 34 49 76 55 65 37 39
40 14 17 20 32 33 60 65 62 57
37 35 40 42 36 57 38 44 60 45
52 64 38 39 40 42 50 56 45 43
38 39 50 41 42 56 57 54 55 60
35 38 40 40 42 53 47 48 39 50
35 37 39 39 50
Lesson 7
Aside from tables and graphs, another way of describing a set of data is
by stating a single numerical value associated with it. This value is where all the
other values in a distribution tend to cluster. It is called the average or measure
of central tendency. There are three kinds of average: the mean, the median, and
the mode.
The Mean
The mean (also known as the arithmetic mean) is the most commonly
used measure of central position. It is the sum of measures divided by the
number of measures in a variable. It is symbolized as $\bar{x}$ (read as x bar).
The mean is used to describe a set of data where the measures cluster or
concentrate at a point. As the measures cluster around each other, a single value
appears to represent distinctively the total measures. It is, however, affected by
extreme measures, that is, very high or very low measures can easily change the
value of the mean.
$\bar{x} = \frac{\sum x}{N}$, where $\sum x$ = sum of the values of x and
N = number of values of x
Example: The grades in Chemistry of 10 students are 87, 84, 85, 85, 86, 90, 79,
82, 78, 76. What is the average grade of the 10 students?
Solution:
Mean = $\frac{87 + 84 + 85 + 85 + 86 + 90 + 79 + 82 + 78 + 76}{10} = \frac{832}{10} = 83.2$
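The computation of the mean can be checked with a short Python sketch using the ten Chemistry grades from the example. The function name `mean` is an assumption made for illustration.

```python
grades = [87, 84, 85, 85, 86, 90, 79, 82, 78, 76]


def mean(values):
    """Sum of the measures divided by the number of measures."""
    return sum(values) / len(values)


mean_grade = mean(grades)
print(sum(grades), len(grades), mean_grade)
```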
The Median
The median is the middle entry or term in a set of data arranged in either
increasing or decreasing order.
To find the median of a given set of data, take note of the following:
Example 1: The numbers of books borrowed in the library from Monday to Friday
last week were 58, 60, 54, 35, and 97, respectively. Find the
median.
Arranged in increasing order, the data are 35, 54, 58, 60, 97.
The median is 58.
Example 2: Cora’s quizzes for the second quarter are 8, 7, 6, 10, 9, 5, 9, 6, 10,
and 7. Find the median.
5, 6, 6, 7, 7, 8, 9, 9, 10, 10
Md = $\frac{7 + 8}{2} = 7.5$
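Both median examples can be checked with a short Python sketch. The function name `median` is an assumption made for illustration; it arranges the data first, then takes the middle entry, or the mean of the two middle entries when the number of measures is even.

```python
def median(values):
    """Middle entry of the arranged data (mean of the two middle entries if even)."""
    ordered = sorted(values)
    count = len(ordered)
    middle = count // 2
    if count % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


books_borrowed = [58, 60, 54, 35, 97]             # Example 1
quizzes = [8, 7, 6, 10, 9, 5, 9, 6, 10, 7]        # Example 2

print(median(books_borrowed), median(quizzes))
```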
The Mode
2. if two or more measures appear the same number of times, and the
frequency they appear is greater than any other measures, then each
of these values is a mode;
3. if every measure appears the same number of times, then the set of
data has no mode.
Answer: The mode is 6 since it is the shoe size that occurred most often.
Example 2: The sizes of 9 classes in a certain school are 50, 52, 55, 50, 51, 54,
55, 53 and 54.
Answer: The modes are 50, 54, and 55 since each of these measures occurred
twice, more often than any other measure. The distribution is trimodal.
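The rules for finding the mode can be sketched in Python as follows. The function name `modes` is an assumption made for illustration. It is tried on the ten Chemistry grades from the lesson on the mean and on two short made-up lists that are not from the module.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means there is no mode."""
    counts = Counter(values)
    highest = max(counts.values())
    # If several different measures all appear equally often, there is no mode.
    if len(counts) > 1 and all(count == highest for count in counts.values()):
        return []
    return sorted(value for value, count in counts.items() if count == highest)


grades = [87, 84, 85, 85, 86, 90, 79, 82, 78, 76]   # Chemistry grades
grade_modes = modes(grades)
tied_modes = modes([1, 1, 2, 2, 3])   # made-up list with two modes
no_mode = modes([1, 2, 3])            # made-up list with no mode

print(grade_modes, tied_modes, no_mode)
```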
Try this out
Find the mean, median, and mode (modes) of each of the following sets of
data.
1. 29, 34, 37, 22, 15, 38, 40
25 33 35 45 34
26 29 35 38 40
45 38 28 29 25
39 32 37 47 45
Let’s Summarize
A frequency distribution is a distribution of the total number of measures or
frequencies over arbitrarily defined categories or classes. The number of
measures falling under a class is called class frequency.
The mean (also known as the arithmetic mean) is the most commonly
used measure of central position. It is the sum of measures divided by the
number of measures in a variable. It is symbolized as $\bar{x}$ (read as x bar). It is
used to describe a set of data where the measures cluster or concentrate at a
point.
The median is the middle entry or term in a set of data arranged in either
increasing or decreasing order. It is a positional measure. The values of the
individual measures in a set of data do not affect the median. It is affected by the
number of measures and not by the size of the extreme values.
The mode is the value which occurs most frequently in a set of data.
a. Sampling
b. Simple random sampling
c. Systematic random sampling
d. Stratified random sampling
3. In the expression $\sum_{i=1}^{10} x_i$, what is the upper limit?
a. i c. 1
b. x d. 10
Enrolment, SY 2005-2006
Year Level    Male    Female
First         456     497
Second        427     456
Third         487     467
Fourth        356     425
a. 953 c. 1845
b. 1726 d. 3571
a. $\sum_{i=1}^{20} x_i$        c. $\sum_{i=1}^{20} x$
Score      Number of Students
80-89      4
70-79      6
60-69      12
50-59      10
40-49      8
30-39      10
20-29      7
10-19      3
Total      60
a. 10-19
b. 20-29
c. 30-39
d. 45-49
a. 5
b. 8
c. 9
d. 10
a. mean
b. mode
c. median
d. class size
7 5 5 4 8 9 7 9 8 5
a. 8 c. 9
b. 5 d. 7
a. 5 c. 7
b. 9 d. 8