Lesson 23
Introduction to Statistics
GENERAL INSTRUCTIONS: Write all your answers in the answer sheet provided.
COLLECTION OF DATA
Data may be gathered by the following:
1. Interview – This method is referred to as the direct method of gathering data
because this requires a face-to-face inquiry with the respondent.
2. Questionnaire – This method is referred to as the indirect method of gathering
data because this makes use of written questions to be answered by the
respondent.
3. Observation – This method makes use of the different human senses in gathering
information.
4. Registration or Census – This method requires the enactment of law to take
effect because it needs the participation of a large, if not the entire, population.
5. Experimentation – This method is usually conducted in laboratories where
specimens are subjected to some aspects of control to find out cause and effect
relationships.
Data gathered may be classified as primary or secondary.
Primary data are information gathered directly from the source.
Secondary data are gathered from secondary sources, such as books, journals,
magazines, or theses of other researchers.
In data gathering, information is usually taken from a sample. The sample size is
determined by using Slovin’s formula.
Slovin’s Formula
$n = \frac{N}{1 + N e^{2}}$
where n = sample size (number of samples)
N = population size
e = margin of error
Example 1: What is the sample size if the population is 3,000 (N) and the margin of error
is set at:
a. 5 %
SOLUTION: The margin of error is 5% or 0.05 (just divide 5 by 100 to get 0.05). Thus,
$n = \frac{N}{1 + N e^{2}}$
$n = \frac{3000}{1 + (3000)(0.05)^{2}}$   * Substitute the values N = 3000 and e = 0.05.
$n = \frac{3000}{1 + (3000)(0.0025)}$   * Multiply 0.05 by itself to get 0.0025.
$n = \frac{3000}{1 + 7.5}$   * Multiply 0.0025 by 3000 to get 7.5.
$n = \frac{3000}{8.5}$   * Add 7.5 and 1 to get 8.5, and divide 3000 by 8.5.
$n \approx 352.94$
$n = 353$   * Round up the answer.
Note: This module is intended for 2 weeks! 174
Unit 9
Math 7 Introduction to Statistics
b. 3 %
SOLUTION: The margin of error is 3% or 0.03 (just divide 3 by 100 to get 0.03). Thus,
$n = \frac{N}{1 + N e^{2}}$
$n = \frac{3000}{1 + (3000)(0.03)^{2}}$   * Substitute the values N = 3000 and e = 0.03.
$n = \frac{3000}{1 + (3000)(0.0009)}$   * Multiply 0.03 by itself to get 0.0009.
$n = \frac{3000}{1 + 2.7}$   * Multiply 0.0009 by 3000 to get 2.7.
$n = \frac{3000}{3.7}$   * Add 2.7 and 1 to get 3.7, and divide 3000 by 3.7.
$n \approx 810.81$
$n = 811$   * Round up the answer.
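The two computations in Example 1 can be checked with a short Python sketch. The function name slovin_sample_size is only illustrative. It assumes that the margin of error is entered as a decimal (0.05 for 5%) and that the result is always rounded up to the next whole number, as in the solutions above.

```python
import math


def slovin_sample_size(population, margin_of_error):
    """Return n = N / (1 + N * e^2), rounded up to a whole number.

    population      -- N, the size of the population
    margin_of_error -- e, written as a decimal (0.05 means 5%)
    """
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)  # round up, as in Example 1


print(slovin_sample_size(3000, 0.05))  # 353
print(slovin_sample_size(3000, 0.03))  # 811
```

Notice that a smaller margin of error calls for a larger sample from the same population.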
After determining the number of samples, the next thing to do is to know how
these samples will be gathered and what factors are to be considered in getting these
samples. There are different sampling techniques, the purposes of which may vary from
one another.
1. Probability Sampling – It is a sampling procedure where every element of a
population is given an equal chance of being selected as a member of the
sample.
a. Random sampling – This basic sampling procedure may be done by
lottery or with the aid of a Table of Random Numbers, or the random
function of a scientific calculator.
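b. Systematic sampling – This is an alternative to simple random
sampling, especially when the population is so big that random
sampling becomes tedious. Elements are drawn from a list at a fixed
interval (for example, every 10th name), starting from a randomly
chosen element.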
c. Stratified random sampling – This is done by creating different classes
or strata within the population. The grouping may be done based on
grade level, income groupings, and gender, among others.
d. Cluster sampling – If the population is too big, a sampling method may
be applied to a smaller area. The population may be divided
geographically into regions, divisions, or districts. To these smaller
areas, other probability sampling procedures can be applied.
2. Nonprobability sampling – This is a sampling procedure in which not every
element of the population is given an equal chance of being selected as a member
of the sample. The drawing of samples is based purely on the researcher’s objectives.
a. Convenience sampling – The researcher’s convenience is the primary
concern in using this method.
b. Quota sampling – This is similar to stratified sampling but the drawing
of samples in quota sampling is not done randomly.
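Three of the probability sampling procedures above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names are made up for this sketch, the population is assumed to be a list of ID numbers, and the stratified version assumes that the same number of members is drawn from every stratum (the lesson does not say how many to take from each).

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members by lottery: every member has the same chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Pick a random starting point, then take every k-th member."""
    rng = random.Random(seed)
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a random sample separately from each class (stratum).

    Assumption: the same number is drawn from every stratum.
    """
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


# Hypothetical population: students numbered 1 to 100.
students = list(range(1, 101))
print(simple_random_sample(students, 10, seed=1))
print(systematic_sample(students, 10, seed=1))
print(stratified_sample({"Grade 7": students[:60], "Grade 8": students[60:]},
                        5, seed=1))
```

Fixing the seed only makes the draw repeatable for checking; leave it out for an actual lottery.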
The Mean
The mean (in symbol, $\bar{x}$) or arithmetic average is the most important, the most
useful, and the most widely used measure of central tendency. It is obtained by adding all
the scores/values in a set of data and dividing this sum by the total number of
scores/values. This is also called the computed average.
$\text{Mean} = \frac{\text{sum of data}}{\text{number of data}}$
Example 2: The grades of Khares in five major subjects are 88, 82, 95, 90, and 85. Find
the mean.
SOLUTION:
$\text{Mean} = \frac{\text{sum of data}}{\text{number of data}}$   * Add all the given data or numbers.
$= \frac{88 + 82 + 95 + 90 + 85}{5}$   * Since there are 5 data, which are 88, 82, 95, 90, and 85, divide the sum of data by the number of data, which is 5.
$= \frac{440}{5}$
$\bar{x} = 88$   * Thus, the mean is 88.
Example 3: The following temperature readings were recorded in Tokyo, Japan on one
winter day.
Time          Temperature
6:00 A.M.     −1.9 °C
9:00 A.M.     3.2 °C
12:00 noon    8.7 °C
3:00 P.M.     5.4 °C
6:00 P.M.     2.0 °C
9:00 P.M.     −1.2 °C
Find the mean temperature of these data.
SOLUTION:
$\text{Mean} = \frac{\text{sum of data}}{\text{number of data}}$   * Add all the given data or numbers.
$= \frac{-1.9 + 3.2 + 8.7 + 5.4 + 2.0 + (-1.2)}{6}$   * Since there are 6 data, which are −1.9 °C, 3.2 °C, 8.7 °C, 5.4 °C, 2.0 °C, and −1.2 °C, divide the sum of data by the number of data, which is 6.
$= \frac{16.2}{6}$
$\bar{x} = 2.7\ ^{\circ}\text{C}$   * Thus, the mean temperature is 2.7 °C.
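Examples 2 and 3 can be checked with a few lines of Python. This is only a sketch: the function name mean and the variable names are chosen for illustration, and the temperature result is rounded to one decimal place because computers store decimals only approximately.

```python
def mean(data):
    """Add all the values, then divide by how many values there are."""
    return sum(data) / len(data)


grades = [88, 82, 95, 90, 85]                    # Example 2
temperatures = [-1.9, 3.2, 8.7, 5.4, 2.0, -1.2]  # Example 3, in degrees Celsius

print(mean(grades))                  # 88.0
print(round(mean(temperatures), 1))  # 2.7
```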
The Median
The median is the value in the middle position of a given set of data, which is
arranged in descending or ascending order. The median is denoted by Md or $\tilde{x}$.
To find the median of ungrouped data, follow these steps:
1. Arrange the quantities either in ascending or descending order.
2. Number the quantities consecutively from 1 to n.
3. If n is odd, the median is the $\left(\frac{n+1}{2}\right)^{\text{th}}$ quantity.
If n is even, the median is the mean of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2}+1\right)^{\text{th}}$ quantities.
SOLUTION:
First, arrange each score in ascending or descending order.
Number Score
1 30
2 27
3 23
4 18
5 15
6 14
7 13
8 10
9 9
Since n = 9 and is odd, the median is the $\left(\frac{9+1}{2}\right)^{\text{th}}$ or the 5th score. Therefore, the
median is $\tilde{x} = 15$.
76 80 88 89 95 98
Since n = 6 and is even, the median is the mean of the $\left(\frac{n}{2}\right)^{\text{th}}$ and $\left(\frac{n}{2}+1\right)^{\text{th}}$ scores. The
$\left(\frac{6}{2}\right)^{\text{th}}$ score or the 3rd score is 88. The $\left(\frac{6}{2}+1\right)^{\text{th}}$ or the 4th score is 89.
Simply, you have to get the mean of the two middle grades, which are 88 and 89.
Therefore, the median is $\tilde{x} = \frac{88+89}{2} = 88.5$.
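The steps for finding the median can be sketched in Python as follows. The function name median is illustrative. Python counts positions starting from 0, so the position numbers from the steps are reduced by 1 inside the code.

```python
def median(data):
    """Find the middle value of ungrouped data."""
    ordered = sorted(data)   # Step 1: arrange in ascending order
    n = len(ordered)         # Step 2: count the quantities
    if n % 2 == 1:
        # n is odd: take the ((n + 1) / 2)th quantity
        return ordered[(n + 1) // 2 - 1]
    # n is even: take the mean of the (n / 2)th and (n / 2 + 1)th quantities
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median([30, 27, 23, 18, 15, 14, 13, 10, 9]))  # 15
print(median([76, 80, 88, 89, 95, 98]))             # 88.5
```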
The Mode
The mode is the most observed data in an experiment. It is denoted by Mo. In a set
of data, it is the most frequently occurring number.
A set of data is a unimodal distribution if it contains only one mode. For
instance, the set 11, 15, 13, 15, 14, 13, 15 is unimodal. The mode is 15, with a
frequency of 3.
A set is a bimodal distribution if it contains two modes. For example, the sets 88,
89, 82, 89, 88 and 63, 55, 57, 60, 60, 66, 56, 58, 57 are bimodal. The modes are 88
and 89, and 57 and 60, respectively.
A set of data with three modes is trimodal.
Example 6: The scores of Yolanda in her periodic tests are 15, 20, 23, 20, 18, 20, and 25.
What is the mode of her scores?
SOLUTION: The most observed data from the set of scores is 20. Therefore, the mode
is 20.
a. 9, 7, 8, 9, 7, 4, 6, 5, 5, 7 - The mode is 7.
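Counting how often each value occurs can also be done in Python. The sketch below uses Counter from the standard library; the function name modes is illustrative, and it returns a list so that unimodal, bimodal, and trimodal sets are all handled. Note one assumption: if every value occurs equally often, this sketch lists all of them.

```python
from collections import Counter


def modes(data):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(data)           # how many times each value occurs
    highest = max(counts.values())   # the largest frequency
    return sorted(value for value, count in counts.items()
                  if count == highest)


print(modes([15, 20, 23, 20, 18, 20, 25]))          # [20]
print(modes([63, 55, 57, 60, 60, 66, 56, 58, 57]))  # [57, 60]
```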
MEASURES OF VARIABILITY
Measures of dispersion or variability refer to the spread of the values about the
mean. There are at least three measures of dispersion, namely range, average deviation,
and standard deviation.
The Range
Range is simply the difference between the highest (H) and lowest (L) scores in a set of
data under consideration.
Range = H − L
where: H = highest score
L = lowest score
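As a quick illustration, the range of the grades in Example 2 can be computed in Python. The function name data_range is made up for this sketch.

```python
def data_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


print(data_range([88, 82, 95, 90, 85]))  # 13
```

Here the highest grade is 95 and the lowest is 82, so the range is 13.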
The Average Deviation
The average deviation is the average distance of the scores from the mean.
Because you are computing the distances of each score from the mean, you will be
dealing with absolute values. Remember that distances are always positive. The average
deviation is computed using the formula:
$AD = \frac{\sum |x - \bar{x}|}{n}$
where: x = individual score
$\bar{x}$ = mean
n = number of scores
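The average deviation of the grades in Example 2 can be worked out with the sketch below. The function name average_deviation is illustrative. The distances of 88, 82, 95, 90, and 85 from the mean of 88 are 0, 6, 7, 2, and 3, which add up to 18.

```python
def average_deviation(scores):
    """Average of the absolute distances of the scores from the mean."""
    mean = sum(scores) / len(scores)
    distances = [abs(x - mean) for x in scores]  # distances are never negative
    return sum(distances) / len(scores)


print(average_deviation([88, 82, 95, 90, 85]))  # 3.6
```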
Variance
The variance is the average squared distance of the scores from the mean. It is
denoted by $\sigma^{2}$, where $\sigma$ is the Greek letter sigma (lowercase).
$\sigma^{2} = \frac{\sum (x - \bar{x})^{2}}{n}$
where: $\sigma^{2}$ = variance
x = values in the set of data
$\bar{x}$ = mean
n = total number of values
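Standard Deviation
The standard deviation is the square root of the variance.
$SD = \sqrt{\frac{\sum (x - \bar{x})^{2}}{n}}$
where: SD = standard deviation
x = individual score
$\bar{x}$ = mean
n = number of scores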
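The variance and standard deviation formulas can be tried on the grades in Example 2 with this Python sketch. The function names are illustrative. Both functions divide by n, as in the formulas above; Python's built-in statistics.variance divides by n − 1 instead, so it gives a different answer (statistics.pvariance is the one that divides by n).

```python
import math


def variance(scores):
    """Average of the squared distances of the scores from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))


grades = [88, 82, 95, 90, 85]                # mean is 88
print(variance(grades))                      # 19.6
print(round(standard_deviation(grades), 2))  # 4.43
```

The squared distances are 0, 36, 49, 4, and 9, which add up to 98, and 98 divided by 5 is 19.6.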
For items 13-22: Identify the most appropriate method of collecting data to be used in
each of the following research topics. Choose your answer from the choices. Write only
the letter of your answer.
For items 23-29: Choose your answer from the choices. Write only the letter of your
answer.
23. Which of the following best defines the measure of central tendency?
A. It is a single value that tells how the scores are related in a set of data.
B. It is a single value that describes the spread of the scores in a set of data.
C. It is a single value that shows variation of the scores in a set of data.
D. It is a single value that describes and represents a set of data.
24. Which is not a measure of central tendency?
A. Mean C. Average deviation
B. Median D. Mode
For items 33-36: Define the following measures of variability. (2 pts. each)
33. Range
34. Average deviation
35. Variance
36. Standard Deviation
Answer Key
1. A
2. B
3. A
4. C
5. A
6. D
7. B
8. A
9. A
10. B
11. B
12. C
13. B
14. C
15. A
16. B
17. D
18. D
19. C
20. A
21. C
22. B
23. D
24. C
25. B
26. A
27. B
28. C
29. C
30. B
31. B
32. D