Types of Data
Discrete
Continuous
Continuous data can take any value within a continuous range; that is, it is not restricted
to separate, defined values.
Ungrouped data is simply the data obtained in original form, that is, raw data.
Grouped data is raw data organized in ‘intervals’ and presented in a frequency table.
A class interval is defined as a grouping of statistical data. Class intervals allow for the data
to be presented in a simpler way.
Class intervals have what are termed class limits. Each class interval has two class
limits, which are the end values of the interval.
NOTE: Instructor will use the table below to explain class intervals and limits further.
Example: The masses of 100 students, correct to the nearest kilogram, are shown in the
frequency table:
≈ - Approximation symbol
34.2 ≈ 34
34.4 ≈ 34
34.5 ≈ 35
44.6 ≈ 45
By referring to the table above, students will observe which class interval each of the
above-mentioned numbers falls into, based on its approximation.
Since 34.5 ≈ 35, a student of mass 34.5 kg would be found in the first class interval. We
say that the lower class boundary for the first class interval is 34.5 kg and the upper class
boundary is 39.5 kg.
Similarly, the second class interval, (40 – 44) kg, includes students with masses from
39.5 kg up to, but not including, 44.5 kg. Its lower class boundary is 39.5 kg and its upper
class boundary is 44.5 kg.
$$\text{class boundary} = \frac{\text{upper class limit of lower rank class} + \text{lower class limit of upper rank class}}{2}$$
Therefore, the class boundary is the average of the class limits involved.
Note: The lower class boundary of a class interval is equal to the upper class boundary of
the preceding class interval.
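The averaging rule above can be sketched in a few lines of Python. This is only an illustration: the function name `class_boundaries` is made up for this sketch, and it assumes the gap between adjacent classes is the same throughout, so that the outermost boundaries can be found the same way as the interior ones.

```python
def class_boundaries(class_limits):
    """Return (lower boundary, upper boundary) for each class interval.

    class_limits is a list of (lower limit, upper limit) pairs in ascending
    order. Assumption: the gap between adjacent classes is constant, so each
    boundary lies half a gap beyond the class limit. For interior boundaries
    this equals the average of the two neighbouring class limits.
    """
    # Gap between the first two classes, e.g. 40 - 39 = 1.
    gap = class_limits[1][0] - class_limits[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in class_limits]


# The first two class intervals discussed above, in kg.
limits = [(35, 39), (40, 44)]
for (lo, hi), (lb, ub) in zip(limits, class_boundaries(limits)):
    print(f"{lo}-{hi} kg: boundaries {lb} kg to {ub} kg")
```

Note that the upper boundary of the first class and the lower boundary of the second come out as the same value, which is what the note above states.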
1 6 17 7 19 10
16 16 22 1 27 2
21 22 18 14 15 15
5 2 23 18 27 20
22 21 26 4 24 21
17 18 19 19 29 17
20 23 24 24 2 28
18 25 26 20 28 3
3 19 4 15 16 29
15 17 3 23 20 16
(a) Draw up a tally chart for the classes 0-4, 5-9, 10-14, 15-19, 20-24 and 25-29
(b) Construct a table showing the class intervals and the theoretical class intervals
representing the income earned by the families.
Solution:
(a)
(b)
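As a rough check on a hand-drawn tally chart, the counting can be sketched in Python. The function name `tally` is hypothetical, and the sketch assumes that the "theoretical class intervals" in part (b) are the class boundaries, found by moving half a unit beyond each class limit.

```python
from collections import Counter

# The 60 observations from the table above, read row by row.
data = [
    1, 6, 17, 7, 19, 10,
    16, 16, 22, 1, 27, 2,
    21, 22, 18, 14, 15, 15,
    5, 2, 23, 18, 27, 20,
    22, 21, 26, 4, 24, 21,
    17, 18, 19, 19, 29, 17,
    20, 23, 24, 24, 2, 28,
    18, 25, 26, 20, 28, 3,
    3, 19, 4, 15, 16, 29,
    15, 17, 3, 23, 20, 16,
]
classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]


def tally(values, class_limits):
    """Count how many values fall inside each (lower, upper) class."""
    counts = Counter()
    for v in values:
        for lower, upper in class_limits:
            if lower <= v <= upper:
                counts[(lower, upper)] += 1
                break
    return counts


frequencies = tally(data, classes)
for lower, upper in classes:
    f = frequencies[(lower, upper)]
    # Assumed class boundaries: half a unit beyond each class limit.
    print(f"{lower:>2}-{upper:<2}  {'|' * f:<20}  {f:>2}  "
          f"boundaries {lower - 0.5} to {upper + 0.5}")
```

The frequencies should add up to the number of observations, which is a quick way to confirm that no value was missed or counted twice.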
Mean
$$\bar{x} = \frac{\sum x}{\sum f} = \frac{\sum x}{n}$$
Where,
$\bar{x}$ is the mean
Solution:
$$\bar{x} = \frac{\sum x}{\sum f}$$
$$\bar{x} = \frac{1 + 3 + 5 + 7 + 11 + 12 + 13 + 15 + 16 + 17}{10}$$
$$\bar{x} = \frac{100}{10}$$
$$\bar{x} = 10$$
$$\bar{x} = \frac{\sum fx}{\sum f}$$
Where,
$fx$ is the product of the frequency and the value of the corresponding observation
And,
Example: The marks obtained by 100 students in a test in which the maximum possible
mark was 10 are shown in the table below.
Marks Frequency
0 2
1 5
2 8
3 17
4 23
5 0
6 15
7 12
8 9
9 6
10 3
Solution:
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{494}{100}$$
$$\bar{x} = 4.94 \text{ marks}$$
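The calculation above can be checked with a short Python sketch. The names `marks_frequency` and `frequency_mean` are chosen here for illustration only; the data are the marks and frequencies from the table in this example.

```python
# Mark -> frequency, copied from the table in this example.
marks_frequency = {0: 2, 1: 5, 2: 8, 3: 17, 4: 23, 5: 0,
                   6: 15, 7: 12, 8: 9, 9: 6, 10: 3}


def frequency_mean(table):
    """Mean of a frequency distribution: sum of f*x divided by sum of f."""
    total_fx = sum(x * f for x, f in table.items())
    total_f = sum(table.values())
    return total_fx / total_f


mean_mark = frequency_mean(marks_frequency)
print(f"sum of f = {sum(marks_frequency.values())}, mean = {mean_mark} marks")
```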
Median
The median is defined as the ‘middle’ or central value in a set of observations arranged in
ascending or descending order. The median is represented by the symbol $Q_2$.
Note: The median has the same number of values above it as there are values below it.
Solution:
8, 9, 10, 11, 12
Median value = 10
Example: Find the median of the following heights which are stated in cm.
$$\tfrac{1}{2}(n + 1)\text{th rank}$$
Example: The masses of 100 pupils in a school are shown in the table below
Solution:
*Students will be required to find the cumulative frequency in order to find the median
from the frequency distribution
Now, the median will be found from the $\tfrac{1}{2}(n + 1)$th rank
$$\tfrac{1}{2}(100 + 1)\text{th rank}$$
$$= 50.5\text{th rank}$$
The 50.5th rank is located between the 50th and the 51st ranks.
Median = 55.5 kg
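The cumulative-frequency method can be sketched in Python as follows. As an assumption for this sketch, it reuses the marks table from the Mean section as illustrative data rather than the masses in this example, and the helper names `value_at_rank` and `frequency_median` are made up. When the rank falls between two whole-number ranks, the two neighbouring observations are averaged.

```python
from itertools import accumulate

# Illustrative data: the marks table from the Mean section (n = 100).
marks = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [2, 5, 8, 17, 23, 0, 15, 12, 9, 6, 3]


def value_at_rank(values, cumulative, rank):
    """Return the observation at a whole-number rank (counting from 1)."""
    for value, cum in zip(values, cumulative):
        if rank <= cum:
            return value
    raise ValueError("rank exceeds the number of observations")


def frequency_median(values, frequencies):
    """Median of a frequency distribution via the (n + 1)/2 th rank."""
    cumulative = list(accumulate(frequencies))
    n = cumulative[-1]
    rank = (n + 1) / 2
    lower_rank = int(rank)  # e.g. 50 when the rank is 50.5
    upper_rank = lower_rank if rank == lower_rank else lower_rank + 1
    lower = value_at_rank(values, cumulative, lower_rank)
    upper = value_at_rank(values, cumulative, upper_rank)
    return (lower + upper) / 2


print("cumulative frequencies:", list(accumulate(freqs)))
print("median mark:", frequency_median(marks, freqs))
```

For this illustrative table the 50th and 51st observations both fall in the same row, so averaging them leaves the value unchanged.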
Mode
The mode of a distribution is defined as the observation with the highest frequency.
Example: Determine the mode of the basic wages in the following distribution
Solution
Range
=$47
To calculate the range from a frequency distribution with ungrouped data we use the
formula
Example: The masses of 50 lambs were estimated to the nearest kilogram. The results can
be seen below.
A quartile is one of three values that divide an ordered set of data into four equal parts.
The lower quartile $Q_1$ is the value below which one-quarter of the data lies.
The middle quartile $Q_2$ is the value below which one-half of the data lies. This quartile is
known as the median.
The upper quartile $Q_3$ is the value below which three-quarters of the data lies.
$$\text{Interquartile range} = Q_3 - Q_1$$
$$\text{Semi-interquartile range} = \frac{Q_3 - Q_1}{2}$$
The Interquartile and Semi-interquartile Range from Raw Data
Example: Calculate the interquartile range and the semi-interquartile range of the
following heights, stated in cm:
163, 158, 154, 161, 156, 159, 155
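Arranged in ascending order, the heights are 154, 155, 156, 158, 159, 161, 163.
Thus $Q_2 = 158$
$Q_1 = 155$
$Q_3 = 161$
$$IQR = Q_3 - Q_1 = 161 - 155 = 6 \text{ cm}$$
$$SIQR = \frac{Q_3 - Q_1}{2} = \frac{161 - 155}{2} = \frac{6}{2} = 3 \text{ cm}$$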
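The working above can be reproduced with a short Python sketch. The function names `value_at_rank` and `quartiles` are hypothetical. The sketch assumes the quartiles sit at the $\tfrac{1}{4}(n + 1)$th, $\tfrac{1}{2}(n + 1)$th and $\tfrac{3}{4}(n + 1)$th ranks, and that a fractional rank is handled by averaging the two neighbouring observations. Other textbooks and software libraries use different conventions and may give slightly different quartiles.

```python
def value_at_rank(ordered, rank):
    """Value at a rank counted from 1 in an ordered list.

    Assumption: a fractional rank is taken as the average of the two
    neighbouring observations.
    """
    lower = int(rank)
    if rank == lower:
        return ordered[lower - 1]
    return (ordered[lower - 1] + ordered[lower]) / 2


def quartiles(data):
    """Return (Q1, Q2, Q3) using the k(n + 1)/4 th rank for k = 1, 2, 3."""
    ordered = sorted(data)
    n = len(ordered)
    return tuple(value_at_rank(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


heights = [163, 158, 154, 161, 156, 159, 155]  # cm, from the example above
q1, q2, q3 = quartiles(heights)
iqr = q3 - q1
siqr = iqr / 2
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr} cm, SIQR = {siqr} cm")
```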
Recall that when the observations are given as a frequency distribution with ungrouped
data, $Q_2$ is given by the $\tfrac{1}{2}(n + 1)$th rank and the median is the value
corresponding to this rank.
Therefore, $Q_1$ and $Q_3$ are the values corresponding to the $\tfrac{1}{4}(n + 1)$th rank and the
$\tfrac{3}{4}(n + 1)$th rank respectively.
Example: The masses of 100 pupils in a school are shown in the table below
(i) the position of $Q_1 = \tfrac{1}{4}(n + 1)$th rank
$$= \tfrac{1}{4}(100 + 1)\text{th rank}$$
$$= \tfrac{1}{4}(101)\text{th rank}$$
$$= 25.25\text{th rank}$$
This implies that $Q_1$ is the average of the 25th and 26th observations
$$\text{Therefore } Q_1 = \frac{53 + 54}{2} = 53.5 \text{ kg}$$
(ii) the position of $Q_3 = \tfrac{3}{4}(n + 1)$th rank
$$= \tfrac{3}{4}(100 + 1)\text{th rank}$$
$$= \tfrac{3}{4}(101)\text{th rank}$$
$$= 75.75\text{th rank}$$
This implies that $Q_3$ is the average of the 75th and 76th observations
$$\text{Therefore } Q_3 = \frac{57 + 57}{2} = 57 \text{ kg}$$
(iii) $$SIQR = \frac{Q_3 - Q_1}{2} = \frac{57 - 53.5}{2} = 1.75 \text{ kg}$$