Types of Data

Important

Uploaded by

nn9b8s2rpr
Copyright
© © All Rights Reserved

Types of Data

• Discrete
• Continuous

Discrete data is counted. It can be numeric or categorical.

For example: (i) The number of students in a class

(ii) Favourite colour of students, red or blue

Continuous Data can take any value, that is, it is not restricted to defined separate values
but can occupy any value over a continuous range.

For example: (i) A person’s height

(ii) Time clocked by an athlete in a race

Grouped vs Ungrouped Data

Ungrouped data is simply the data obtained in original form, that is, raw data.

Grouped data is raw data organized in ‘intervals’ and presented in a frequency table.

These intervals are referred to as CLASS INTERVALS.

A class interval is defined as a grouping of statistical data. Class intervals allow for the data
to be presented in a simpler way.

Class intervals have what are termed CLASS LIMITS. Each class interval has two class
limits. The class limits are the end values of a class interval.

There is a lower class limit and an upper class limit.

NOTE: Instructor will use the table below to explain class intervals and limits further.

Example: The masses of 100 students, correct to the nearest kilogram, are shown in the
frequency table below:

Class [Masses in kg]   Frequency

35 – 39   12
40 – 44   8
45 – 49   10
50 – 54   7
55 – 59   12
60 – 64   15
65 – 69   12
70 – 74   8
75 – 79   7
80 – 84   9

• Class Boundaries

≈ is the approximation symbol, read as "is approximately equal to".

NOTE: Tutor will begin with an introduction to approximations as follows:

34.2 ≈ 34

34.4 ≈ 34

34.5 ≈ 35

44.6 ≈ 45

By referring to the earlier table, students will observe which class interval each of the
above numbers falls into, based on its approximation.

The values at which the approximation moves from one class interval to the next are what
will be referred to as class boundaries.

Observing that 34.5 ≈ 35, it follows that a student of mass 34.5 kg would be found in the
first class interval. We say that the lower class boundary for the first class interval is
34.5 kg and the upper class boundary is 39.5 kg.

Similarly, the second class interval, (40 – 44) kg, includes students with masses from
39.5 kg up to, but not including, 44.5 kg. Its lower class boundary is 39.5 kg and its upper
class boundary is 44.5 kg.

Finding the class boundaries:

Upper class boundary = (upper class limit of lower rank class + lower class limit of upper rank class) / 2

Therefore, the class boundary is the average of the class limits involved. For example, the
boundary between the classes 35 – 39 and 40 – 44 is (39 + 40) / 2 = 39.5 kg.

Note: The lower class boundary of a class interval is equal to the upper class boundary of
the preceding class interval.

Class midpoint = (lower class boundary + upper class boundary) / 2

Width of a class interval = upper class boundary − lower class boundary

The width of a class interval is also called the class size.
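The boundary, midpoint and width formulas can be checked against the masses table with a short Python sketch. The function name `class_boundaries` is illustrative only, and the sketch assumes the classes are equally spaced, so the gap between neighbouring class limits is shared equally.

```python
# Class limits (lower, upper) in kg, taken from the masses table.
class_limits = [(35, 39), (40, 44), (45, 49), (50, 54), (55, 59),
                (60, 64), (65, 69), (70, 74), (75, 79), (80, 84)]


def class_boundaries(limits):
    """Return (lower boundary, upper boundary) for each class.

    Assumption: classes are equally spaced, so half of the gap between
    adjacent class limits is added to each side of a class.
    """
    gap = limits[1][0] - limits[0][1]  # e.g. 40 - 39 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


boundaries = class_boundaries(class_limits)
# Midpoint: average of the two boundaries of a class.
midpoints = [(lower + upper) / 2 for lower, upper in boundaries]
# Width (class size): upper boundary minus lower boundary.
widths = [upper - lower for lower, upper in boundaries]

for limits, bounds, mid, width in zip(class_limits, boundaries, midpoints, widths):
    print(limits, bounds, mid, width)
```

For the first class this gives boundaries of 34.5 kg and 39.5 kg, a midpoint of 37 kg and a width of 5 kg.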


Example: The results of a survey of the income earned per hour, in dollars, for a sample of
60 families are given below.

1 6 17 7 19 10
16 16 22 1 27 2
21 22 18 14 15 15
5 2 23 18 27 20
22 21 26 4 24 21
17 18 19 19 29 17
20 23 24 24 2 28
18 25 26 20 28 3
3 19 4 15 16 29
15 17 3 23 20 16

(a) Draw up a tally chart for the classes 0-4, 5-9, 10-14, 15-19, 20-24 and 25-29
(b) Construct a table showing the class intervals and the theoretical class intervals
representing the income earned by the families.

Solution:
(a)

Income Earned ($)   Tally                      Frequency

0-4                 ||||| |||||                10
5-9                 |||                        3
10-14               ||                         2
15-19               ||||| ||||| ||||| |||||    20
20-24               ||||| ||||| ||||| |        16
25-29               ||||| ||||                 9

Total frequency = 60

(b)

Income Earned ($)   Theoretical Class Interval

0-4                 0 ≤ x < 4.5
5-9                 4.5 ≤ x < 9.5
10-14               9.5 ≤ x < 14.5
15-19               14.5 ≤ x < 19.5
20-24               19.5 ≤ x < 24.5
25-29               24.5 ≤ x < 29.5
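The frequencies in part (a) can be double-checked by letting a program do the tallying. This is a minimal Python sketch; the helper name `class_of` is made up for illustration, and the data are the 60 incomes listed in the example.

```python
from collections import Counter

# The 60 hourly incomes from the survey, row by row.
incomes = [
    1, 6, 17, 7, 19, 10,
    16, 16, 22, 1, 27, 2,
    21, 22, 18, 14, 15, 15,
    5, 2, 23, 18, 27, 20,
    22, 21, 26, 4, 24, 21,
    17, 18, 19, 19, 29, 17,
    20, 23, 24, 24, 2, 28,
    18, 25, 26, 20, 28, 3,
    3, 19, 4, 15, 16, 29,
    15, 17, 3, 23, 20, 16,
]

# Class limits (lower, upper) given in part (a).
classes = [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29)]


def class_of(value):
    """Return the class interval whose limits contain the value."""
    for lower, upper in classes:
        if lower <= value <= upper:
            return (lower, upper)
    raise ValueError(f"{value} does not fall in any class")


# Count how many incomes fall in each class.
frequency = Counter(class_of(value) for value in incomes)

for lower, upper in classes:
    print(f"{lower}-{upper}: {frequency[(lower, upper)]}")
print("Total frequency =", sum(frequency.values()))
```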

Mean

The mean for a given set of data is the average.

To calculate the mean from raw data we use

x̄ = ∑x / ∑f = ∑x / n

where

x̄ is the mean

x is the value of an observation

f is the frequency

∑x is the sum of the observations

n or ∑f is the sum of the frequencies, that is, the number of observations

Example: Calculate the mean of the following numbers:

1, 3, 5, 7, 11, 12, 13, 15, 16, 17

Solution:

x̄ = ∑x / ∑f

x̄ = (1 + 3 + 5 + 7 + 11 + 12 + 13 + 15 + 16 + 17) / 10

x̄ = 100 / 10

x̄ = 10
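As a quick check of the arithmetic, the same calculation can be sketched in Python. The variable names are illustrative; `statistics.mean` from the standard library is used only to confirm the hand calculation.

```python
from statistics import mean

numbers = [1, 3, 5, 7, 11, 12, 13, 15, 16, 17]

sum_x = sum(numbers)   # sum of the observations
n = len(numbers)       # number of observations
x_bar = sum_x / n      # mean = sum of observations / number of observations

print(sum_x, n, x_bar)
print(mean(numbers) == x_bar)  # the library function agrees with the formula
```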

• Calculating the Mean from a Frequency Distribution

x̄ = ∑fx / ∑f

where

fx is the product of the frequency and the value of the corresponding observation

and

∑fx is the sum of the products fx

Example: The marks obtained by 100 students in a test in which the maximum possible
mark was 10 are shown in the table below.

Marks Frequency
0 2
1 5
2 8
3 17
4 23
5 0
6 15
7 12
8 9
9 6
10 3

Calculate the mean mark of the distribution.

Solution:

To calculate the mean mark, we first need fx for each mark:

Marks (x)   Frequency (f)   fx

0           2               0
1           5               5
2           8               16
3           17              51
4           23              92
5           0               0
6           15              90
7           12              84
8           9               72
9           6               54
10          3               30

∑f = 100    ∑fx = 494

Now, x̄ = ∑fx / ∑f

x̄ = 494 / 100

x̄ = 4.94 marks
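The fx column and the two totals can be reproduced with a few lines of Python. This is a sketch only; the dictionary `marks_frequency` simply restates the table, with each mark as a key and its frequency as the value.

```python
# Mark -> frequency, copied from the table in the example.
marks_frequency = {0: 2, 1: 5, 2: 8, 3: 17, 4: 23, 5: 0,
                   6: 15, 7: 12, 8: 9, 9: 6, 10: 3}

# fx for each mark: frequency multiplied by the mark.
fx = {mark: mark * freq for mark, freq in marks_frequency.items()}

sum_f = sum(marks_frequency.values())  # total number of students
sum_fx = sum(fx.values())              # sum of the products fx
x_bar = sum_fx / sum_f                 # mean mark

print(sum_f, sum_fx, x_bar)
```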

Median

The median is defined as the ‘middle’ or central value in a set of observations arranged in
ascending or descending order. The median is represented by the symbol Q2.

Note: The median has the same number of values above it as there are values below it.

Example: Find the median of the following numbers:

10, 12, 8, 11, 9

Solution:

Arranging the numbers in ascending order:

8, 9, 10, 11, 12

The median is the middle value, 10.

Example: Find the median of the following heights which are stated in cm.

(a) 163, 158, 154, 161, 156, 159, 155


(b) 158, 163, 154, 161, 157, 156, 159, 155
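The two cases in this example (an odd and an even number of observations) can be worked through with a small Python sketch. The function `median` below is written out for illustration rather than taken from a library, so that each step is visible.

```python
def median(values):
    """Return the middle value of the data once sorted.

    With an even number of values there is no single middle value,
    so the two central values are averaged.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


heights_a = [163, 158, 154, 161, 156, 159, 155]       # 7 values
heights_b = [158, 163, 154, 161, 157, 156, 159, 155]  # 8 values

print(median(heights_a))  # 158
print(median(heights_b))  # 157.5
```

In (a) the 4th of the 7 ordered heights is the median; in (b) the median lies halfway between the 4th and 5th ordered heights.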
• Finding the Median for a Frequency Distribution with Ungrouped Data

Given a set of ungrouped data with n observations, the position of the median is given by the

½(n + 1)th rank

Example: The masses of 100 pupils in a school are shown in the table below

Masses (kg) Number of Pupils


51 7
52 8
53 10
54 12
55 13
56 15
57 12
58 9
59 8
60 6
Find the median of the masses shown in the frequency distribution given.

Solution:

*Students will be required to find the cumulative frequency in order to find the median
from the frequency distribution.

Masses (kg)   Cumulative Frequency

≤51           7
≤52           8 + 7 = 15
≤53           15 + 10 = 25
≤54           25 + 12 = 37
≤55           37 + 13 = 50     (contains the 50th rank)
≤56           50 + 15 = 65     (contains the 51st rank)
≤57           65 + 12 = 77
≤58           77 + 9 = 86
≤59           86 + 8 = 94
≤60           94 + 6 = 100

Now, the median will be found from the ½(n + 1)th rank:

½(100 + 1)th rank

= 50.5th rank

The 50.5th rank is located between the 50th and the 51st ranks.

Taking the average of the observations at these two ranks:

Median = (55 + 56) / 2

= 55.5 kg
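The cumulative frequency method can be sketched in Python as follows. The helper `value_at_rank` is a hypothetical name; it walks down the cumulative frequency column until it reaches the first row that contains the requested rank.

```python
import math
from itertools import accumulate

masses = [51, 52, 53, 54, 55, 56, 57, 58, 59, 60]  # kg
pupils = [7, 8, 10, 12, 13, 15, 12, 9, 8, 6]       # frequencies

# Running total of the frequencies: 7, 15, 25, ...
cumulative = list(accumulate(pupils))
n = cumulative[-1]


def value_at_rank(rank):
    """Return the mass of the observation at a whole-number rank (1-based)."""
    for mass, running_total in zip(masses, cumulative):
        if rank <= running_total:
            return mass
    raise ValueError("rank is larger than the number of observations")


median_rank = (n + 1) / 2                       # 50.5
lower = value_at_rank(math.floor(median_rank))  # 50th observation
upper = value_at_rank(math.ceil(median_rank))   # 51st observation
median = (lower + upper) / 2

print(cumulative)
print(median_rank, lower, upper, median)
```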

Mode

The mode of a distribution is defined as the observation with the highest frequency.

Example: Determine the mode of the basic wages in the following distribution

$125, $175, $195, $175, $205, $125, $175, $210

Solution

Mode = $175 as it occurs most often
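Counting how often each wage occurs makes the choice of mode explicit. The sketch below uses `collections.Counter` from the Python standard library; it collects every value that reaches the highest count, since a data set can have more than one mode.

```python
from collections import Counter

wages = [125, 175, 195, 175, 205, 125, 175, 210]  # dollars

counts = Counter(wages)                  # wage -> number of occurrences
highest_count = max(counts.values())
modes = [wage for wage, count in counts.items() if count == highest_count]

print(counts)
print(modes, highest_count)
```

Here $175 occurs three times, more often than any other wage, so it is the only mode.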


Example: The table shows the number of children per family in the families of the pupils in
a class.

No. of children per family   1   2   3   4   5   6   7
Frequency                    2   3   9   5   6   4   1

Determine the mode.

Solution: The mode is 3, since it has the highest frequency, 9.

• Range

To calculate the range from raw data we use the formula

Range = the largest observation − the smallest observation

Example: The basic wages of the workers in a factory are:

$175, $160, $195, $149, $185, $167, $148

Calculate the range.

Range = $195 − $148

= $47

To calculate the range from a frequency distribution with ungrouped data we use the
formula

Range = the upper boundary limit of the largest observation − the lower boundary limit of the smallest observation

Example: The masses of 50 lambs were estimated to the nearest kilogram. The results can
be seen below.

Mass (kg)   Frequency

27          4
28          9
29          16
30          13
31          5
32          2
33          1

What is the range of these estimates?

Solution:
Lower boundary limit of smallest observation = 26.5 kg
Upper boundary limit of largest observation = 33.5 kg

Therefore, range = 33.5 kg − 26.5 kg

= 7 kg
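Both range calculations can be sketched in Python. The variable `precision` is an assumption introduced here to stand for the rounding unit (1 kg, because the masses are correct to the nearest kilogram); half of it is added to the largest value and subtracted from the smallest to obtain the boundaries.

```python
# Range from raw data: largest observation minus smallest observation.
wages = [175, 160, 195, 149, 185, 167, 148]  # dollars
raw_range = max(wages) - min(wages)

# Range from a frequency distribution of rounded values.
lamb_masses = [27, 28, 29, 30, 31, 32, 33]   # kg, the values in the table
precision = 1                                # masses given to the nearest 1 kg

upper_boundary = max(lamb_masses) + precision / 2  # 33.5 kg
lower_boundary = min(lamb_masses) - precision / 2  # 26.5 kg
boundary_range = upper_boundary - lower_boundary

print(raw_range)
print(lower_boundary, upper_boundary, boundary_range)
```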

A quartile is one of three values that divide an ordered set of data into four equal parts.

The lower quartile 𝑄1 is the value below which one-quarter of the data lies.
The middle quartile 𝑄2 is the value below which one-half of the data lies. This quartile is
known as the median.
The upper quartile Q3 is the value below which three-quarters of the data lies.

Interquartile range = Q3 − Q1

Semi-interquartile range = (Q3 − Q1) / 2

• The Interquartile and Semi-interquartile Range from Raw Data
Example: Calculate the interquartile range and the semi-interquartile range of the
following heights, stated in cm:
163, 158, 154, 161, 156, 159, 155

Solution: First write the heights in ascending order:

154, 155, 156, 158, 159, 161, 163

Thus Q2 = 158
Q1 = 155
Q3 = 161

IQR = Q3 − Q1
= 161 − 155
= 6 cm

SIQR = (Q3 − Q1) / 2
= (161 − 155) / 2
= 6 / 2
= 3 cm
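The quartiles in this example can be located by rank in Python. The sketch assumes the rank convention used in these notes: Q1 at the ¼(n + 1)th rank, Q3 at the ¾(n + 1)th rank, and a fractional rank taken as the average of the two neighbouring observations. Other textbooks and software use slightly different conventions, so results may differ elsewhere. The function name is illustrative.

```python
import math


def value_at_fractional_rank(ordered, rank):
    """Return the value at a 1-based rank in sorted data.

    Assumption: a fractional rank is taken as the average of the
    observations on either side of it.
    """
    lower = ordered[math.floor(rank) - 1]
    upper = ordered[math.ceil(rank) - 1]
    return (lower + upper) / 2


heights = [163, 158, 154, 161, 156, 159, 155]  # cm
ordered = sorted(heights)
n = len(ordered)

q1 = value_at_fractional_rank(ordered, (n + 1) / 4)      # 2nd value
q2 = value_at_fractional_rank(ordered, (n + 1) / 2)      # 4th value
q3 = value_at_fractional_rank(ordered, 3 * (n + 1) / 4)  # 6th value

iqr = q3 - q1
siqr = iqr / 2

print(q1, q2, q3)
print(iqr, siqr)
```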

• The Interquartile and Semi-interquartile Range from a Frequency Distribution

Recall that when the observations are given as a frequency distribution with ungrouped
data, the position of Q2 is given by the ½(n + 1)th rank and the median is the value
corresponding to this rank.

Similarly, Q1 and Q3 are the values corresponding to the ¼(n + 1)th rank and the
¾(n + 1)th rank respectively.

Example: The masses of 100 pupils in a school are shown in the table below

Masses (kg) Number of Pupils


51 7
52 8
53 10
54 12
55 13
56 15
57 12
58 9
59 8
60 6

(a) Determine for the distribution given


(i) its lower quartile
(ii) its upper quartile
(b) Hence find the value of
(i) the interquartile range of the masses
(ii) the semi-interquartile range of the masses

Solution: First construct the cumulative frequency table.


(a)

Masses (kg) Cumulative Frequency


≤51 7
≤52 8 +7 = 15
≤53 15+10=25
≤54 25+12=37
≤55 37+13=50
≤56 50+15=65
≤57 65+12=77
≤58 77+9=86
≤59 86+8=94
≤60 94+6=100

(i) The position of Q1 = ¼(n + 1)th rank
= ¼(100 + 1)th rank
= ¼(101)th rank
= 25.25th rank

This implies that Q1 is taken as the average of the 25th and 26th observations.

Therefore Q1 = (53 + 54) / 2
= 53.5 kg

(ii) The position of Q3 = ¾(n + 1)th rank
= ¾(100 + 1)th rank
= ¾(101)th rank
= 75.75th rank

This implies that Q3 is taken as the average of the 75th and 76th observations.

Therefore Q3 = (57 + 57) / 2
= 57 kg

(b) (i) IQR = Q3 − Q1
= 57 − 53.5
= 3.5 kg

(ii) SIQR = (Q3 − Q1) / 2
= (57 − 53.5) / 2
= 1.75 kg
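The whole solution can be reproduced from the frequency table with a short Python sketch. As in the worked solution, a fractional rank is taken as the average of the two neighbouring observations; this is the convention of these notes, not the only one in use. The helper names `value_at_rank` and `quartile` are hypothetical.

```python
import math
from itertools import accumulate

masses = [51, 52, 53, 54, 55, 56, 57, 58, 59, 60]  # kg
pupils = [7, 8, 10, 12, 13, 15, 12, 9, 8, 6]       # frequencies

cumulative = list(accumulate(pupils))  # 7, 15, 25, ..., 100
n = cumulative[-1]


def value_at_rank(rank):
    """Return the mass of the observation at a whole-number rank (1-based)."""
    for mass, running_total in zip(masses, cumulative):
        if rank <= running_total:
            return mass
    raise ValueError("rank is larger than the number of observations")


def quartile(fraction):
    """Value at the fraction * (n + 1)th rank, averaging the neighbouring
    observations when the rank is not a whole number."""
    rank = fraction * (n + 1)
    lower = value_at_rank(math.floor(rank))
    upper = value_at_rank(math.ceil(rank))
    return (lower + upper) / 2


q1 = quartile(0.25)  # 25.25th rank: 25th and 26th observations
q3 = quartile(0.75)  # 75.75th rank: 75th and 76th observations
iqr = q3 - q1
siqr = iqr / 2

print(q1, q3, iqr, siqr)
```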
