
Chapter 2 Summarising Data

Chapter 2 discusses the organization and summarization of both qualitative and quantitative data, introducing concepts such as raw data, frequency distributions, and graphical presentations. It explains how to create frequency tables, relative frequencies, and visual representations like bar graphs and pie charts. The chapter also covers stem-and-leaf displays and histograms for quantitative data, providing examples and formulas for constructing these representations.

Uploaded by

damia
Copyright
© All Rights Reserved

Chapter 2:

Summarizing Data

3.1 Introduction

 Raw data - Data recorded in the sequence in which they are collected,
before they are processed or ranked.

 Array data - Raw data that is arranged in ascending or descending order.

Example 1:

Here is a list of questions asked in a large statistics class and the “raw data” given by
one of the students:

a) What is your sex (m=male, f=female)?


Answer (raw data): m

b) How many hours did you sleep last night?


Answer: 5 hours

c) What is your height in inches?


Answer: 67 inches

d) What’s the fastest you’ve ever driven a car (mph)?


Answer: 110 mph

Example 2:

Quantitative raw data

Qualitative raw data

 These data are also called ungrouped data.

3.2 Organizing and Graphing Qualitative Data

3.2.1 Frequency Distributions/ Table


3.2.2 Relative Frequency and Percentage Distribution
3.2.3 Graphical Presentation of Qualitative Data

3.2.1 Frequency Distributions / Table

 A frequency distribution for qualitative data lists all categories and the
number of elements that belong to each of the categories.
 It shows how the frequencies are distributed over the various categories.
 It is also called a frequency distribution table or simply a frequency table.
 The number of elements (for example, students) that belong to a certain category
is called the frequency of that category.

3.2.2 Relative Frequency and Percentage Distribution

 A relative frequency distribution is a listing of all categories along with their
relative frequencies (given as proportions or percentages).
 It is commonplace to give the frequency and relative frequency distributions
together.
 Calculating the relative frequency and percentage of a category:

Relative frequency of a category = (Frequency of that category) / (Sum of all frequencies)

Percentage = (Relative frequency) × 100

Example 3:

A sample of UUM staff-owned vehicles produced by Proton was identified and the
make of each noted. The resulting sample follows (W = Wira, Is = Iswara, Wj =
Waja, St = Satria, P = Perdana, Sv = Savvy):

W W P Is Is P Is W St Wj
Is W W Wj Is W W Is W Wj
Wj Is Wj Sv W W W Wj St W
Wj Sv W Is P Sv Wj Wj W W
St W W W W St St P Wj Sv

Construct a frequency distribution table for these data with their relative frequency
and percentage.

Solution:

Category   Frequency   Relative Frequency   Percentage (%)
Wira       19          19/50 = 0.38         0.38 × 100 = 38
Iswara      8          0.16                 16
Perdana     4          0.08                  8
Waja       10          0.20                 20
Satria      5          0.10                 10
Savvy       4          0.08                  8
Total      50          1.00                100
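
The table above can also be built programmatically. The following is a minimal Python sketch, not part of the original notes: the function name frequency_table and the names dictionary are illustrative assumptions, and the data are the 50 observations listed in Example 3.

```python
from collections import Counter

# The 50 observations from Example 3
# (W = Wira, Is = Iswara, Wj = Waja, St = Satria, P = Perdana, Sv = Savvy).
sample = """
W W P Is Is P Is W St Wj
Is W W Wj Is W W Is W Wj
Wj Is Wj Sv W W W Wj St W
Wj Sv W Is P Sv Wj Wj W W
St W W W W St St P Wj Sv
""".split()

# Hypothetical lookup from the abbreviations to the full category names.
names = {"W": "Wira", "Is": "Iswara", "P": "Perdana",
         "Wj": "Waja", "St": "Satria", "Sv": "Savvy"}


def frequency_table(values):
    """Map each category to (frequency, relative frequency, percentage)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (f, f / total, 100 * f / total) for cat, f in counts.items()}


table = frequency_table(sample)
for code, name in names.items():
    f, rel, pct = table[code]
    print(f"{name:<8} {f:>3} {rel:>6.2f} {pct:>5.0f}")
```

The relative frequencies sum to 1 and the percentages sum to 100, which is a quick check that no observation was missed.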

3.2.3 Graphical Presentation of Qualitative Data

1. Bar Graphs

 A graph made of bars whose heights represent the frequencies of the respective
categories.
 Such a graph is most helpful when you have many categories to represent.
 Notice that a gap is inserted between each of the bars.
 Types of bar chart include the simple (vertical) bar chart, horizontal bar chart,
component bar chart and multiple bar chart.

Simple/ Vertical Bar Chart

 To construct a vertical bar chart, mark the various categories on the
horizontal axis and mark the frequencies on the vertical axis.

Figure 3.1

Horizontal Bar Chart

 To construct a horizontal bar chart, mark the various categories on the
vertical axis and mark the frequencies on the horizontal axis.

Example 4 (refer to Example 3):

Figure 3.2: UUM staff-owned vehicles produced by Proton (horizontal bar chart;
vertical axis: types of vehicle; horizontal axis: frequency)

 Another example of a horizontal bar chart is shown in Figure 3.3.

Figure 3.3: Number of students at Diversity College who are
immigrants, by last country of permanent residence

2. Pie Chart

 A circle divided into portions that represent the relative frequencies or
percentages of a population or a sample belonging to different categories.
 An alternative to the bar chart, useful for summarizing a single categorical
variable if there are not too many categories.
 The chart makes it easy to compare the relative sizes of each class/category.
 The whole pie represents the total sample or population. The pie is divided
into different portions that represent the different categories.
 To construct a pie chart, we multiply 360° by the relative frequency of each
category to obtain the degree measure, or size of the angle, for the
corresponding category.

Example 7 (Table 3.4 and Figure 3.5):

Table 3.4 Figure 3.5

Example 8 (Figure 3.6):

Movie Genres      Frequency   Relative Frequency   Angle Size
Comedy            54          0.27                 360 × 0.27 = 97.2°
Action            36          0.18                 360 × 0.18 = 64.8°
Romance           28          0.14                 360 × 0.14 = 50.4°
Drama             28          0.14                 360 × 0.14 = 50.4°
Horror            22          0.11                 360 × 0.11 = 39.6°
Foreign           16          0.08                 360 × 0.08 = 28.8°
Science Fiction   16          0.08                 360 × 0.08 = 28.8°
Total             200         1.00                 360°

Figure 3.6
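
The angle calculation in Example 8 can be reproduced with a short Python sketch. It is an illustration rather than part of the original notes: the function name pie_angles is an assumption, and the frequencies are those listed in the Example 8 table.

```python
def pie_angles(frequencies):
    """Return the pie-chart angle (in degrees) for each category.

    Each angle is 360 multiplied by the category's relative frequency.
    """
    total = sum(frequencies.values())
    return {cat: 360 * f / total for cat, f in frequencies.items()}


# Frequencies from Example 8.
genres = {
    "Comedy": 54, "Action": 36, "Romance": 28, "Drama": 28,
    "Horror": 22, "Foreign": 16, "Science Fiction": 16,
}

angles = pie_angles(genres)
for genre, angle in angles.items():
    print(f"{genre:<16} {angle:6.1f} degrees")
```

The angles should add up to 360°, just as the relative frequencies add up to 1.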

3.3 Organizing and Graphing Quantitative Data

3.3.1 Stem and Leaf Display


3.3.2 Frequency Distribution
3.3.3 Relative Frequency and Percentage Distributions.
3.3.4 Graphing Grouped Data

3.3.1 Stem-and-Leaf Display

 In a stem-and-leaf display of quantitative data, each value is divided into two
portions: a stem and a leaf. The leaves for each stem are then shown
separately in a display.
 It shows the pattern of the data.
 It reveals which values are repeated frequently.

Example 10:

25 12 9 10 5 12 23 7
36 3 11 12 31 28 37 6
14 41 38 44 13 22 18 19

Solution:

Stem | Leaf
0    | 3 5 6 7 9
1    | 0 1 2 2 2 3 4 8 9
2    | 2 3 5 8
3    | 1 6 7 8
4    | 1 4
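
The display for Example 10 can be generated with the following Python sketch. It is not part of the original notes and assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; the function name stem_and_leaf is illustrative.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group sorted values by stem (tens) and collect their leaves (units)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


# Data from Example 10.
data = [25, 12, 9, 10, 5, 12, 23, 7,
        36, 3, 11, 12, 31, 28, 37, 6,
        14, 41, 38, 44, 13, 22, 18, 19]

display = stem_and_leaf(data)
# Print every stem between the smallest and largest, even if it has no leaves.
for stem in range(min(display), max(display) + 1):
    leaves = " ".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem} | {leaves}")
```

Sorting before splitting keeps the leaves of each stem in ascending order, which makes repeated values such as 12 easy to spot.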

3.3.2 Frequency Distributions

 A frequency distribution for quantitative data lists all the classes and the
number of values that belong to each class.
 Data presented in the form of a frequency distribution are called grouped data.

 The class boundary is the midpoint between the upper limit of one class
and the lower limit of the next class. It is also called the real class limit.
 To find this midpoint, we divide the sum of these two limits by 2.

e.g.: (400 + 401) / 2 = 400.5

 Class Width (class size)

Class width = Upper boundary – Lower boundary

e.g.: Width of the first class = 600.5 – 400.5 = 200

 Class Midpoint or Mark

Class midpoint or mark = (Lower limit + Upper limit) / 2

e.g.: Midpoint of the 1st class = (401 + 600) / 2 = 500.5
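
The three calculations above (class boundary, class width, class midpoint) can be checked with a small Python sketch. This is an illustration, not part of the original notes. The function names are assumptions, and the lower limit 601 of the following class is inferred from the boundary 600.5 used in the width example rather than stated in the text.

```python
def class_boundary(upper_limit, next_lower_limit):
    """Midpoint between one class's upper limit and the next class's lower limit."""
    return (upper_limit + next_lower_limit) / 2


def class_midpoint(lower_limit, upper_limit):
    """Midpoint (mark) of a class, computed from its limits."""
    return (lower_limit + upper_limit) / 2


# First class has limits 401 to 600; the preceding limit is 400.
lower_boundary = class_boundary(400, 401)   # 400.5
# ASSUMPTION: the next class starts at 601 (inferred from the boundary 600.5).
upper_boundary = class_boundary(600, 601)   # 600.5

width = upper_boundary - lower_boundary     # 200.0
midpoint = class_midpoint(401, 600)         # 500.5

print(lower_boundary, upper_boundary, width, midpoint)
```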

Constructing Frequency Distribution Tables

1. To decide the number of classes, we use Sturges' formula:

c = 1 + 3.3 log n

where c is the number of classes, n is the number of observations in the data set,
and log is the base-10 logarithm.

2. Class width,

i = (Largest value – Smallest value) / (Number of classes) = Range / c

This class width is rounded to a convenient number.

3. Lower Limit of the First Class or the Starting Point

 Use the smallest value in the data set.

Example 11:

The following data give the total home runs hit by all players of each of the 30 Major
League Baseball teams during the 2004 season.

Solution:

i) Number of classes, c = 1 + 3.3 log 30
= 1 + 3.3(1.48)
= 5.88 ≈ 6 classes

ii) Class width, i = (242 – 135) / 6 = 17.8 ≈ 18

iii) Starting Point = 135
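
The three steps of this solution can be sketched in Python. This is a minimal illustration, not part of the original notes, and it rests on stated assumptions: the number of classes is rounded to the nearest whole number (as in the worked example), the class width is rounded up so the classes cover the whole range, and the data are whole numbers so that consecutive class limits differ by 1. All function names are illustrative.

```python
import math


def sturges_classes(n):
    """Number of classes c = 1 + 3.3 log10(n), rounded to the nearest integer."""
    return round(1 + 3.3 * math.log10(n))


def class_width(smallest, largest, c):
    """Range divided by the number of classes, rounded up (an assumption)."""
    return math.ceil((largest - smallest) / c)


def class_limits(start, width, c):
    """Lower and upper limits of c classes of whole-number data."""
    return [(start + k * width, start + (k + 1) * width - 1) for k in range(c)]


n, smallest, largest = 30, 135, 242      # values from Example 11
c = sturges_classes(n)                   # 6
i = class_width(smallest, largest, c)    # 18
limits = class_limits(smallest, i, c)

for lower, upper in limits:
    print(f"{lower} - {upper}")
```

With these inputs the classes run from 135 - 152 up to 225 - 242, so the largest value, 242, falls in the last class.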

Table 2.10 Frequency Distribution for Data of Table 2.9

Total Home Runs   Tally           f
135 – 152         ||||| |||||     10
153 – 170         ||               2
171 – 188         |||||            5
189 – 206         ||||| |          6
207 – 224         |||              3
225 – 242         ||||             4
                                  Σf = 30

3.3.3 Relative Frequency and Percentage Distributions

Relative frequency of a class = (Frequency of that class) / (Sum of all frequencies) = f / Σf

Percentage = (Relative frequency) × 100

Example 12 (refer to Example 11)

Table 2.11: Relative Frequency and Percentage Distributions

Total Home Runs   Class Boundaries           Relative Frequency   %
135 – 152         134.5 to less than 152.5   0.3333               33.33
153 – 170         152.5 to less than 170.5   0.0667                6.67
171 – 188         170.5 to less than 188.5   0.1667               16.67
189 – 206         188.5 to less than 206.5   0.2                  20
207 – 224         206.5 to less than 224.5   0.1                  10
225 – 242         224.5 to less than 242.5   0.1333               13.33
Sum                                          1.0                  100
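
The columns of Table 2.11 follow directly from the class limits and frequencies of Example 11. The Python sketch below is an illustration, not part of the original notes; it assumes whole-number data, so each class boundary lies 0.5 below the lower limit or 0.5 above the upper limit.

```python
# Class limits and frequencies from Example 11.
classes = [(135, 152), (153, 170), (171, 188),
           (189, 206), (207, 224), (225, 242)]
frequencies = [10, 2, 5, 6, 3, 4]

total = sum(frequencies)  # sum of all frequencies

rows = []
for (lower, upper), f in zip(classes, frequencies):
    rel = f / total
    # ASSUMPTION: whole-number data, so boundaries are limits -/+ 0.5.
    rows.append((lower - 0.5, upper + 0.5, round(rel, 4), round(100 * rel, 2)))

for low_b, up_b, rel, pct in rows:
    print(f"{low_b} to less than {up_b}  {rel:.4f}  {pct:.2f}")
```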

3.3.4 Graphing Grouped Data

1. Histograms

 A histogram is a graph in which the class boundaries are marked on the
horizontal axis and either the frequencies, relative frequencies, or
percentages are marked on the vertical axis. The frequencies, relative
frequencies or percentages are represented by the heights of the bars.
 In a histogram, the bars are drawn adjacent to each other and there is a space
between the y-axis and the first bar.

Example 13 (refer to Example 11)

Figure 3.7: Frequency histogram (horizontal axis: total home runs, with class
boundaries 134.5, 152.5, 170.5, 188.5, 206.5, 224.5 and 242.5; vertical axis: frequency)

2. Polygon

 A graph formed by joining the midpoints of the tops of successive bars in a
histogram with straight lines is called a polygon.

Example 13

Figure 3.8: Frequency polygon (horizontal axis: total home runs, with class
boundaries 134.5, 152.5, 170.5, 188.5, 206.5, 224.5 and 242.5; vertical axis: frequency)
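
The points joined by the polygon are the class midpoints paired with the class frequencies. The Python sketch below is an illustration, not part of the original notes; the function name polygon_points is an assumption, and the classes and frequencies are those of Example 11.

```python
# Class limits and frequencies from Example 11.
classes = [(135, 152), (153, 170), (171, 188),
           (189, 206), (207, 224), (225, 242)]
frequencies = [10, 2, 5, 6, 3, 4]


def polygon_points(classes, frequencies):
    """Pair each class midpoint (x) with its frequency (y)."""
    return [((lower + upper) / 2, f)
            for (lower, upper), f in zip(classes, frequencies)]


points = polygon_points(classes, frequencies)
for x, y in points:
    print(f"midpoint {x}: frequency {y}")
```

Joining these points in order with straight lines gives the frequency polygon. The midpoint of the class limits equals the midpoint of the class boundaries, so each point sits at the centre of the top of its histogram bar.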
