Chapter 2 Summarising Data
3.1 Introduction
Raw data - Data recorded in the sequence in which they are collected,
before they are processed or ranked.
Example 1:
Here is a list of questions asked in a large statistics class and the “raw data” given by
one of the students:
Example 2:
Qualitative raw data
A frequency distribution for qualitative data lists all categories and the
number of elements that belong to each of the categories.
It shows how the frequencies are distributed over the various categories.
It is also called a frequency distribution table or simply a frequency table.
The number of students who belong to a certain category is called the
frequency of that category.
3.2.2 Relative Frequency and Percentage Distribution
Example 3:
A sample of UUM staff-owned vehicles produced by Proton was identified and the
make of each noted. The resulting sample follows (W = Wira, Is = Iswara, Wj =
Waja, St = Satria, P = Perdana, Sv = Savvy):
W W P Is Is P Is W St Wj
Is W W Wj Is W W Is W Wj
Wj Is Wj Sv W W W Wj St W
Wj Sv W Is P Sv Wj Wj W W
St W W W W St St P Wj Sv
Construct a frequency distribution table for these data with their relative frequency
and percentage.
Solution:
Category   Frequency   Relative Frequency   Percentage (%)
Wira          19       19/50 = 0.38         0.38 × 100 = 38
Iswara         8       0.16                 16
Perdana        4       0.08                  8
Waja          10       0.20                 20
Satria         5       0.10                 10
Savvy          4       0.08                  8
Total         50       1.00                100
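The table above can be checked with a short Python sketch. It is only an illustration: the names `sample`, `names` and `frequency_table` are made up for this example and are not part of the original notes. The data are the 50 observations listed in Example 3.

```python
from collections import Counter

# The 50 observations of Example 3, typed in row by row.
sample = """
W W P Is Is P Is W St Wj
Is W W Wj Is W W Is W Wj
Wj Is Wj Sv W W W Wj St W
Wj Sv W Is P Sv Wj Wj W W
St W W W W St St P Wj Sv
""".split()

# Abbreviations as given in the example.
names = {"W": "Wira", "Is": "Iswara", "P": "Perdana",
         "Wj": "Waja", "St": "Satria", "Sv": "Savvy"}


def frequency_table(values):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(values)
    n = len(values)
    return {cat: (f, f / n, 100 * f / n) for cat, f in counts.items()}


table = frequency_table(sample)
for code, name in names.items():
    f, rel, pct = table[code]
    print(f"{name:<8} {f:>3} {rel:>6.2f} {pct:>5.0f}")
print(f"{'Total':<8} {len(sample):>3} {1:>6.2f} {100:>5.0f}")
```

The relative frequency is the frequency divided by the sample size, and the percentage is the relative frequency multiplied by 100, so the relative frequencies sum to 1 and the percentages to 100.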
1. Bar Graphs
Figure 3.1
[Figure: Horizontal Bar Chart. Vertical axis: Types of Vehicle (Wira, Iswara, Perdana, Waja, Satria, Savvy). Horizontal axis: Frequency, 0 to 20.]
Figure 3.2
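As a rough stand-in for a plotted bar graph, the sketch below prints a text-only horizontal bar chart in which each bar has as many symbols as the category's frequency. The frequencies are those of the Example 3 solution; the function name `text_bar_chart` and the choice of `#` as the bar symbol are assumptions made for this illustration.

```python
# Frequencies taken from the solution table of Example 3.
frequencies = {"Wira": 19, "Iswara": 8, "Perdana": 4,
               "Waja": 10, "Satria": 5, "Savvy": 4}


def text_bar_chart(freqs, symbol="#"):
    """Return one line per category; bar length equals the frequency."""
    width = max(len(name) for name in freqs)  # pad labels to equal width
    return [f"{name:<{width}} | {symbol * f} {f}" for name, f in freqs.items()]


for line in text_bar_chart(frequencies):
    print(line)
```

A plotting library would normally be used for a real chart; the text version only shows how the bar lengths follow directly from the frequency table.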
2. Pie Chart
Figure 3.6
In a stem-and-leaf display of quantitative data, each value is divided into two
portions: a stem and a leaf. The leaves for each stem are then shown
separately in a display.
It shows the pattern (shape) of the data.
It reveals which values are repeated most frequently.
Example 10:
25 12 9 10 5 12 23 7
36 3 11 12 31 28 37 6
14 41 38 44 13 22 18 19
Solution:
0 | 3 5 6 7 9
1 | 0 1 2 2 2 3 4 8 9
2 | 2 3 5 8
3 | 1 6 7 8
4 | 1 4
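The display for Example 10 can be rebuilt with a few lines of Python. This is a minimal sketch that assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; the function name `stem_and_leaf` is illustrative only.

```python
from collections import defaultdict

# The 24 values of Example 10.
data = [25, 12, 9, 10, 5, 12, 23, 7,
        36, 3, 11, 12, 31, 28, 37, 6,
        14, 41, 38, 44, 13, 22, 18, 19]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    display = defaultdict(list)
    for v in sorted(values):          # sorting first keeps the leaves ranked
        stem, leaf = divmod(v, 10)    # e.g. 25 -> stem 2, leaf 5
        display[stem].append(leaf)
    return dict(display)


display = stem_and_leaf(data)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

The repeated leaf 2 on stem 1 shows at a glance that the value 12 occurs three times.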
A frequency distribution for quantitative data lists all the classes and the
number of values that belong to each class.
Data presented in the form of a frequency distribution are called grouped data.
The class boundary is given by the midpoint of the upper limit of one class
and the lower limit of the next class. It is also called the real class limit.
To find the midpoint of the upper limit of the first class and the lower limit of
the second class, we divide the sum of these two limits by 2.
e.g.: $\frac{400 + 401}{2} = 400.5$
Class Width (class size)
e.g.: Midpoint of the 1st class $= \frac{401 + 600}{2} = 500.5$
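Both worked values above are simple averages of two class limits. The sketch below reproduces them; the function names `class_boundary` and `class_midpoint` are hypothetical helpers introduced for this illustration.

```python
def class_boundary(upper_limit, next_lower_limit):
    """Boundary between two adjacent classes: midpoint of the upper limit
    of one class and the lower limit of the next class."""
    return (upper_limit + next_lower_limit) / 2


def class_midpoint(lower_limit, upper_limit):
    """Midpoint of a single class: average of its two limits."""
    return (lower_limit + upper_limit) / 2


print(class_boundary(400, 401))   # 400.5
print(class_midpoint(401, 600))   # 500.5
```

The two functions do the same arithmetic; they differ only in which pair of limits is averaged (limits of neighbouring classes versus limits of the same class).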
c = 1 + 3.3 log n
2. Class width,
$i = \frac{\text{Largest value} - \text{Smallest value}}{\text{Number of classes}} = \frac{\text{Range}}{c}$
Example 11:
The following data give the total home runs hit by all players of each of the 30 Major
League Baseball teams during the 2004 season.
Solution:
ii) Class width,
$i = \frac{242 - 135}{6} = 17.8 \approx 18$
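The calculation for Example 11 can be sketched in Python as follows. Several things here are assumptions rather than part of the notes: the function names are illustrative, the logarithm in c = 1 + 3.3 log n is taken as base 10, both c and i are rounded up to whole numbers, the first class is started at the smallest value, and the data are treated as whole numbers. Only n = 30, the largest value 242 and the smallest value 135 are taken from the example.

```python
import math


def number_of_classes(n):
    """c = 1 + 3.3 log n (base-10 log), rounded up (rounding is an assumption)."""
    return math.ceil(1 + 3.3 * math.log10(n))


def class_width(largest, smallest, c):
    """i = range / c, rounded up so that c classes cover every value."""
    return math.ceil((largest - smallest) / c)


def class_limits(smallest, width, c):
    """Lower and upper limits of c classes for whole-number data,
    starting the first class at the smallest value (an assumption)."""
    return [(smallest + k * width, smallest + (k + 1) * width - 1)
            for k in range(c)]


c = number_of_classes(30)          # 1 + 3.3 * log10(30) is about 5.87
i = class_width(242, 135, c)       # 107 / 6 is about 17.8
print(c, i)                        # 6 18
for lower, upper in class_limits(135, i, c):
    print(lower, "-", upper)
```

Under these assumptions the classes run from 135 - 152 up to 225 - 242, which is consistent with the class boundaries 134.5 to 242.5 shown on the histogram axis in Section 3.3.4.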
3.3.4 Graphing Grouped Data
1. Histograms
[Figure: Histogram. Vertical axis: Frequency. Horizontal axis: class boundaries 134.5, 152.5, 170.5, 188.5, 206.5, 224.5, 242.5.]
2. Polygon
Example 13
[Figure: Frequency polygon. Vertical axis: Frequency. Horizontal axis: Total home runs, with class boundaries 134.5, 152.5, 170.5, 188.5, 206.5, 224.5, 242.5.]