Lecture 2_Table and Chart (1)
Tables and Charts
Reading materials: Chap 2, 3 (Keller)
• Recap
– In the previous chapter you learned how to collect data. Collected data are not yet information, so they need to be summarised before being presented to an audience.
• Requirement
– A data summary clears away details but should still show the overall pattern.
– Summarised information is concise but should accurately reflect the original data.
• Methods to summarise and present data
– Tables
– Charts
– Numerical summaries (measures of location and dispersion)
Outline
Univariate distribution
Frequency tables
• Frequency is the number of times a certain value occurs
• A frequency distribution records the number of times each value occurs and is presented in the form of a table
• Types of frequency distribution:
– Simple frequency distribution
– Grouped frequency distribution
Simple frequency distribution
• What is a simple frequency table?
– Consider each observed value as a class (group)
• Applications:
– Qualitative data
Simple frequency table: example 1
• You are given the raw midterm marks of 20 students as follows: 7, 7, 10, 8, 5, 4, 5, 6, 4, 9, 8, 7, 6, 4, 8, 5, 7, 10, 10, 9
Marks    Number of students (frequency)
4        3
5        3
6        2
7        4
8        3
9        2
10       3
Total    20
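The tally in example 1 can be reproduced with a short script. This is a minimal sketch using Python's standard `collections.Counter`; the variable names `marks` and `frequency` are chosen here for illustration and do not come from the lecture.

```python
from collections import Counter

# Raw midterm marks of the 20 students from example 1.
marks = [7, 7, 10, 8, 5, 4, 5, 6, 4, 9, 8, 7, 6, 4, 8, 5, 7, 10, 10, 9]

# Each distinct mark is treated as its own class; Counter tallies occurrences.
frequency = Counter(marks)

# Print the table sorted by mark, followed by the total.
print(f"{'Marks':<8}{'Frequency':>10}")
for mark in sorted(frequency):
    print(f"{mark:<8}{frequency[mark]:>10}")
print(f"{'Total':<8}{sum(frequency.values()):>10}")
```

Because every observed value is its own class, this approach only stays readable when the variable takes few distinct values.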
Grouped frequency table (FT) with equal class interval: discrete variable with many values (cont.)
Marks (class interval)    Number of candidates (frequency)
21 – 30                   2
31 – 40                   11
41 – 50                   17
51 – 60                   20
61 – 70                   5
71 – 80                   2
81 – 90                   1
Total                     58
Note: The decision on the number of classes and the class intervals is subjective and depends on the study objective.
Grouped frequency table (FT) with equal class interval: continuous variable
Example 4: draw a frequency table of wages (in USD) paid to 30 people as follows:
202  277  554  145  361
457   87   94  240  144
310  391  362  437  429
176  325  221  374  216
480  120  274  398  282
153  470  303  338  209
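One way to work example 4 is to assign each wage to a class by integer division. The sketch below assumes a class width of $100 and classes starting at $0; as the note above says, this choice is subjective, and the names `CLASS_WIDTH`, `NUM_CLASSES`, and `counts` are illustrative only.

```python
# Wages (USD) of the 30 people from example 4.
wages = [
    202, 277, 554, 145, 361,
    457, 87, 94, 240, 144,
    310, 391, 362, 437, 429,
    176, 325, 221, 374, 216,
    480, 120, 274, 398, 282,
    153, 470, 303, 338, 209,
]

CLASS_WIDTH = 100  # assumed class width; other widths are equally valid
NUM_CLASSES = 6    # assumed: classes cover $0 up to (but not including) $600

# counts[i] holds the frequency of the class [i * 100, (i + 1) * 100).
counts = [0] * NUM_CLASSES
for wage in wages:
    counts[wage // CLASS_WIDTH] += 1

for i, count in enumerate(counts):
    lower, upper = i * CLASS_WIDTH, (i + 1) * CLASS_WIDTH
    print(f"${lower} - <${upper}: {count}")
print(f"Total: {sum(counts)}")
```

Using half-open classes (lower limit included, upper limit excluded) guarantees that a continuous value such as 300 falls into exactly one class.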
Class midpoint, cumulative, percentage, and cumulative percentage frequency distribution
Wages (class interval)   Class midpoint   Number of people (frequency)   Cumulative frequency   Percentage frequency   Cumulative percentage frequency
< $100                   50               2                              2                      6.7                    6.7
$100 – < $200            150              5                              7                      16.7                   23.3
$200 – < $300            250              8                              15                     26.7                   50.0
$300 – < $400            350              9                              24                     30.0                   80.0
$400 – < $500            450              5                              29                     16.7                   96.7
$500 – < $600            550              1                              30                     3.3                    100.0
Total                                     30
• Class midpoint: the average of the lower and upper limits of a class
• Cumulative frequency: the running total of frequencies through the classes of a frequency table (FT)
• Percentage (relative) frequency: the frequency of a class as a proportion of the total frequency
• Cumulative percentage frequency: the same as cumulative frequency, but expressed as a percentage of the total
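The four derived columns can be computed directly from the class limits and frequencies. The following is a minimal sketch; it assumes the lowest class "< $100" starts at $0 (which is what gives the midpoint of 50), and all variable names are illustrative.

```python
from itertools import accumulate

# Class limits (USD) and frequencies from the wages table. The lowest class
# "< $100" is assumed to start at 0 so that its midpoint is 50.
classes = [(0, 100), (100, 200), (200, 300), (300, 400), (400, 500), (500, 600)]
frequencies = [2, 5, 8, 9, 5, 1]

total = sum(frequencies)

# Class midpoint: average of the lower and upper class limits.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))
# Percentage frequency: class frequency as a share of the total.
percentage = [100 * f / total for f in frequencies]
# Cumulative percentage frequency: cumulative frequency as a share of the total.
cumulative_percentage = [100 * c / total for c in cumulative]

for row in zip(classes, midpoints, frequencies, cumulative, percentage, cumulative_percentage):
    (lower, upper), mid, freq, cum, pct, cum_pct = row
    print(f"{lower:>3} - <{upper:<4}{mid:>6.0f}{freq:>4}{cum:>4}{pct:>7.1f}{cum_pct:>7.1f}")
```

Computing the cumulative percentage from the cumulative frequency, rather than by summing rounded percentages, avoids accumulating rounding error in the last column.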
Bar charts: example of UNSW
[Figure: bar chart; categories: Australia & NZ, China, South East Asia, India, USA & Canada, UK & Ireland, Other Europe, Rest of the world; vertical axis from 0 to 150]
Pie charts: example of UNSW
[Figure: pie chart of the same categories; legible slice labels: 1.60%, 17.49%, 33.09%]
More on bar and pie charts
- Should we use bar or pie?
[Figure: the same bar chart and pie chart, repeated for comparison]
Notes
• Choose charts that present information most effectively ('Learning by doing')
• Practice with SPSS
Histograms
• Raw data => frequency table => histogram
• A histogram looks like a bar chart except that the bars are joined together
• Two types of histograms:
– Equal-width histogram
– Unequal-width histogram
Equal-width histogram
• All bars have the same width (the class intervals are equal)
• The height of each bar represents the frequency or percentage frequency of its class interval
• Using the raw data in example 4, draw a histogram representing wages
Equal-width histogram with normal curve
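As a rough illustration of the exercise above, the wages from example 4 can be rendered as a sideways text histogram using only the Python standard library. This is a sketch, not a substitute for a proper chart in SPSS or a plotting library: the bars run horizontally, the class width of $100 is an assumption, and the names are illustrative.

```python
# Wages (USD) of the 30 people from example 4.
wages = [
    202, 277, 554, 145, 361,
    457, 87, 94, 240, 144,
    310, 391, 362, 437, 429,
    176, 325, 221, 374, 216,
    480, 120, 274, 398, 282,
    153, 470, 303, 338, 209,
]

CLASS_WIDTH = 100  # assumed equal class width

# Tally frequencies per class, keyed by the lower limit of the class.
lowest = (min(wages) // CLASS_WIDTH) * CLASS_WIDTH
highest = (max(wages) // CLASS_WIDTH) * CLASS_WIDTH
counts = {lower: 0 for lower in range(lowest, highest + CLASS_WIDTH, CLASS_WIDTH)}
for wage in wages:
    counts[(wage // CLASS_WIDTH) * CLASS_WIDTH] += 1

# One row per class; the bar length equals the class frequency.
histogram_rows = [
    f"{lower:>3}-<{lower + CLASS_WIDTH:<3} | {'#' * count} ({count})"
    for lower, count in counts.items()
]
print("\n".join(histogram_rows))
```

Because every class has the same width, the bar length alone conveys the frequency; with unequal widths the bar area, not the length, would have to carry that information.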
[Figure: four example histograms with frequency on the vertical axis: Symmetric, Positive skew, Negative skew, Bimodal]
Histogram terms
• Modal class: the class with the highest number of observations
• Uni-modal, bi-modal, tri-modal, multi-modal
• Skewness, symmetry
• Relative frequency histogram: replace the frequency of each class by class frequency / total number of observations
Histograms of COVID19 in the world
• https://covid19.who.int/?gclid=EAIaIQobChMI8a-unPCH6wIVix0rCh3tQAogEAAYASAAEgJb5_D_BwE
• Data accessed: 7/8/2020
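Two of the histogram terms, the modal class and relative frequency, can be illustrated with the class frequencies from the wages example. The sketch below is only an illustration; the class labels and variable names are chosen here and are not part of the lecture.

```python
# Class frequencies from the wages example (classes of width $100 from $0).
class_labels = ["<100", "100-<200", "200-<300", "300-<400", "400-<500", "500-<600"]
frequencies = [2, 5, 8, 9, 5, 1]

total = sum(frequencies)

# Modal class: the class with the highest number of observations.
modal_index = max(range(len(frequencies)), key=lambda i: frequencies[i])
modal_class = class_labels[modal_index]

# Relative frequency: class frequency / total number of observations.
relative = [f / total for f in frequencies]

print(f"Modal class: {modal_class}")
for label, rel in zip(class_labels, relative):
    print(f"{label:<10}{rel:.3f}")
```

The relative frequencies sum to 1, so a relative frequency histogram has the same shape as the frequency histogram with only the vertical scale changed.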
Distribution of national HS exam scores 2018
Investigating the relationship between variables
• Methods:
– Table: Cross-table
– Charts:
o Multiple bar chart
o Scatterplot (mentioned in lecture 8)
Cross-table
• A cross-table is used to investigate the relationship between two categorical variables, or discrete variables with few values.
• Note:
– Need to identify the dependent and independent variables.
– Know how to calculate row and column percentages
– Rule of thumb: independent variable in rows and dependent variable in columns
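Row and column percentages in a cross-table can be sketched as follows. The data are hypothetical and invented purely for illustration (tutorial attendance as the independent variable, exam result as the dependent variable); none of the names or numbers come from the lecture.

```python
from collections import Counter

# Hypothetical observations (not from the lecture): each pair is
# (independent variable, dependent variable).
observations = (
    [("tutorial", "pass")] * 30 + [("tutorial", "fail")] * 10
    + [("no tutorial", "pass")] * 15 + [("no tutorial", "fail")] * 25
)

cell_counts = Counter(observations)
rows = ["tutorial", "no tutorial"]  # independent variable in rows
cols = ["pass", "fail"]             # dependent variable in columns

row_totals = {r: sum(cell_counts[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(cell_counts[(r, c)] for r in rows) for c in cols}

# Row percentage: cell count as a share of its row total.
row_pct = {(r, c): 100 * cell_counts[(r, c)] / row_totals[r] for r in rows for c in cols}
# Column percentage: cell count as a share of its column total.
col_pct = {(r, c): 100 * cell_counts[(r, c)] / col_totals[c] for r in rows for c in cols}

for r in rows:
    cells = "  ".join(
        f"{c}: {cell_counts[(r, c)]} ({row_pct[(r, c)]:.1f}% of row)" for c in cols
    )
    print(f"{r:<12}{cells}")
```

With the independent variable in rows, the row percentages are the ones to compare: they show how the distribution of the dependent variable changes from one category of the independent variable to the next.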