02.2 Graphical Summary Techniques

This document discusses graphical summary techniques, including stem and leaf plots, boxplots, histograms, and scatterplots. It provides examples of stem and leaf plots using test score and long jump distance data to demonstrate how to construct and interpret them. Guidelines and key aspects of boxplots such as interquartile range, median, and outliers are also outlined.

Graphical Summary Techniques
Glyzel Grace M. Francisco
STAT2205 – Introduction to Biostatistics
2nd Semester, 2020-2021

CENTRAL LUZON STATE UNIVERSITY


DEPARTMENT of STATISTICS

Graphical Summary Techniques

1. Stem-and-Leaf Display

2. Boxplots

3. Histogram

4. Scatterplot

GGMFRANCISCO STAT2205 – INTRODUCTION TO BIOSTATISTICS | 2


Stem and Leaf Plot

- A special table in which each data value is split into a “stem” (the first digit or digits) and a “leaf” (usually the last digit).
- A way to organize data
- Sorts data quickly
- Gives an overview of the spread of the data
- Shows the minimum value, maximum value, and range at a glance.

Example 1:

The test scores of a class of 20 students are as follows:

83, 72, 73, 65, 65, 95, 70, 50, 100, 88,
87, 13, 92, 35, 56, 8, 40, 23, 39, 45

The leaf will be the rightmost digit of each score; the stem will be the other digits, listed in ascending order.

Stem | Leaf
   0 | 8
   1 | 3
   2 | 3
   3 | 59
   4 | 05
   5 | 06
   6 | 55
   7 | 230
   8 | 387
   9 | 52
  10 | 0

Key: “1 | 3” means “13”
Stem | Leaf | Data set
   0 | 8    | 08
   1 | 3    | 13
   2 | 3    | 23
   3 | 59   | 35, 39
   4 | 05   | 40, 45
   5 | 06   | 50, 56
   6 | 55   | 65, 65
   7 | 230  | 72, 73, 70
   8 | 387  | 83, 88, 87
   9 | 52   | 95, 92
  10 | 0    | 100
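
The construction above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stem_and_leaf` is an assumption, not something from the slides, and it assumes non-negative whole-number scores whose leaf is the last digit. Leaves are kept in the order the scores appear, as in the table above.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    groups = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)   # 83 -> stem 8, leaf 3; 100 -> stem 10, leaf 0
        groups[stem].append(leaf)
    # List every stem from smallest to largest so that empty stems stay visible.
    return {s: groups.get(s, []) for s in range(min(groups), max(groups) + 1)}

scores = [83, 72, 73, 65, 65, 95, 70, 50, 100, 88,
          87, 13, 92, 35, 56, 8, 40, 23, 39, 45]

plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {''.join(str(d) for d in leaves)}")
```

Sorting each list of leaves before printing would give an ordered stem-and-leaf plot, which makes the median easier to read off.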

Example 2:

The long jump distances (measured in meters) in a P.E. class of 25 students are as follows:
2.3, 2.5, 3.5, 6.5, 2.0, 2.3, 3.7, 4.5, 3.8, 4.0,
3.5, 6.1, 4.3, 3.8, 5.0, 3.5, 4.5, 4.7, 4.5, 2.8,
6.3, 2.2, 3.2, 3.4, 4.5
The leaf will be the tenths digit of each reading; the stem will be the ones digit, listed in ascending order.

Stem | Leaf
   2 | 350382
   3 | 57858524
   4 | 5035755
   5 | 0
   6 | 513

Key: “2 | 3” means “2.3”


Stem | Leaf     | Data set
   2 | 350382   | 2.3, 2.5, 2.0, 2.3, 2.8, 2.2
   3 | 57858524 | 3.5, 3.7, 3.8, 3.5, 3.8, 3.5, 3.2, 3.4
   4 | 5035755  | 4.5, 4.0, 4.3, 4.5, 4.7, 4.5, 4.5
   5 | 0        | 5.0
   6 | 513      | 6.5, 6.1, 6.3

Key: “2 | 3” means “2.3”
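
For decimal readings such as these, one possible approach (a sketch, not taken from the slides) is to scale each reading to whole tenths and then split off the last digit. Counting the leaves afterwards is a quick check that all 25 readings made it into the plot. The variable names are illustrative.

```python
jumps = [2.3, 2.5, 3.5, 6.5, 2.0, 2.3, 3.7, 4.5, 3.8, 4.0,
         3.5, 6.1, 4.3, 3.8, 5.0, 3.5, 4.5, 4.7, 4.5, 2.8,
         6.3, 2.2, 3.2, 3.4, 4.5]

plot = {}
for x in jumps:
    tenths = round(x * 10)            # 2.3 -> 23; round() guards against float error
    stem, leaf = divmod(tenths, 10)   # stem = ones digit, leaf = tenths digit
    plot.setdefault(stem, []).append(leaf)

for stem in sorted(plot):
    print(f"{stem} | {''.join(map(str, plot[stem]))}")

# Sanity check: the number of leaves should equal the number of readings.
total_leaves = sum(len(leaves) for leaves in plot.values())
print("leaves:", total_leaves, "readings:", len(jumps))
```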


- Can also be used to compare two classes of scores.

- We can draw the two plots back to back, sharing one column of stems.


Example 3:

Female scores: 76, 45, 58, 79, 82, 63, 76, 72, 58, 13, 45, 90, 72, 65, 70

Male scores: 56, 34, 72, 89, 15, 97, 45, 34, 72, 65, 98, 12, 26, 64, 54

The leaf will be the ones digit of each score; the stem will be the tens digit, listed in ascending order.

Leaf (F) | Stem | Leaf (M)
       3 |    1 | 52
         |    2 | 6
         |    3 | 44
      55 |    4 | 5
      88 |    5 | 64
      35 |    6 | 54
  696220 |    7 | 22
       2 |    8 | 9
       0 |    9 | 78

Key: “3 | 1 | 5 2” means Female: “13”; Male: “15” and “12”
Female                 | Leaf (F) | Stem | Leaf (M) | Male
13                     |        3 |    1 | 52       | 15, 12
                       |          |    2 | 6        | 26
                       |          |    3 | 44       | 34, 34
45, 45                 |       55 |    4 | 5        | 45
58, 58                 |       88 |    5 | 64       | 56, 54
63, 65                 |       35 |    6 | 54       | 65, 64
76, 79, 76, 72, 72, 70 |   696220 |    7 | 22       | 72, 72
82                     |        2 |    8 | 9        | 89
90                     |        0 |    9 | 78       | 97, 98
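
A back-to-back plot like the one above could be produced with a short script along these lines. It is a sketch under the same assumptions as the example (two-digit whole-number scores, leaves in order of appearance); the helper name `leaves_by_stem` is hypothetical.

```python
def leaves_by_stem(values):
    """Map each tens digit (stem) to the list of ones digits (leaves)."""
    groups = {}
    for v in values:
        stem, leaf = divmod(v, 10)
        groups.setdefault(stem, []).append(leaf)
    return groups

female = [76, 45, 58, 79, 82, 63, 76, 72, 58, 13, 45, 90, 72, 65, 70]
male = [56, 34, 72, 89, 15, 97, 45, 34, 72, 65, 98, 12, 26, 64, 54]

f, m = leaves_by_stem(female), leaves_by_stem(male)
lowest = min(min(f), min(m))
highest = max(max(f), max(m))

rows = []
for stem in range(lowest, highest + 1):
    left = "".join(map(str, f.get(stem, [])))    # female leaves, left of the stem
    right = "".join(map(str, m.get(stem, [])))   # male leaves, right of the stem
    rows.append(f"{left:>6} | {stem} | {right}")

print("\n".join(rows))
```

Sharing one stem column makes it easy to see, for instance, that the female scores cluster in the 70s while the male scores are spread more evenly.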
Boxplot
• Also called a box-and-whisker plot
• A graph that is very useful for displaying the following features of data:
▪ Location
▪ Spread
▪ Symmetry
▪ Extremes/outliers
• Useful for identifying outliers
• Used in comparing distributions


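[Figure: anatomy of a boxplot. The box runs from $Q_1$ (25th percentile) to $Q_3$ (75th percentile), with the median marked inside; the width of the box is the interquartile range (IQR). One whisker extends toward the minimum / lower fence ($Q_1 - 1.5 \times \mathrm{IQR}$) and the other toward the maximum / upper fence ($Q_3 + 1.5 \times \mathrm{IQR}$). Points beyond the fences are outliers.]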

Steps in Constructing a Boxplot
1. Compute the following values:
1. 1st quartile
$Q_1 = P_{25}$
$L = \frac{25}{100} \times N = \frac{25}{100} \times 46 = 11.5 \approx 12$, so $Q_1$ is the 12th data value: $Q_1 = 17$

1 5 5 6 8 10 12 13 14 15
15 [17] 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50
(The bracketed value is the 12th data value.)

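2. 2nd quartile or the median
$Q_2 = P_{50}$
$L = \frac{50}{100} \times N = \frac{50}{100} \times 46 = 23$, so $Q_2$ is the average of the 23rd and 24th data values:
$Q_2 = \frac{26 + 26}{2} = 26$

1 5 5 6 8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 [26] [26] 27 28 31 31 34 37
37 37 39 40 40 41 41 42 43 44
44 46 46 47 49 50
(The bracketed values are the 23rd and 24th data values.)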

3. 3rd quartile
$Q_3 = P_{75}$
$L = \frac{75}{100} \times N = \frac{75}{100} \times 46 = 34.5 \approx 35$, so $Q_3$ is the 35th data value: $Q_3 = 40$

1 5 5 6 8 10 12 13 14 15
15 17 18 18 18 18 18 20 24 25
26 26 26 26 27 28 31 31 34 37
37 37 39 40 [40] 41 41 42 43 44
44 46 46 47 49 50
(The bracketed value is the 35th data value.)
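
The position rule used for the three quartiles can be written as a small function. This is a sketch of the rule as it appears in the worked example (round $L$ up when it is fractional; average the $L$-th and $(L+1)$-th values when it is whole), and the function name `percentile` is an assumption. Statistical software often uses other interpolation conventions, so library results may differ slightly.

```python
import math

def percentile(sorted_data, k):
    """k-th percentile via the position rule L = (k / 100) * N, for 0 < k < 100."""
    n = len(sorted_data)
    L = k * n / 100
    if L == int(L):
        # Whole number: average the L-th and (L+1)-th values (1-based positions).
        i = int(L)
        return (sorted_data[i - 1] + sorted_data[i]) / 2
    # Fractional: round up and take the value at that position.
    return sorted_data[math.ceil(L) - 1]

data = [1, 5, 5, 6, 8, 10, 12, 13, 14, 15,
        15, 17, 18, 18, 18, 18, 18, 20, 24, 25,
        26, 26, 26, 26, 27, 28, 31, 31, 34, 37,
        37, 37, 39, 40, 40, 41, 41, 42, 43, 44,
        44, 46, 46, 47, 49, 50]

q1 = percentile(data, 25)
q2 = percentile(data, 50)
q3 = percentile(data, 75)
print(q1, q2, q3)
```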

4. Interquartile range
$\mathrm{IQR} = Q_3 - Q_1 = 40 - 17 = 23$

5. Lower and Upper Fence
$F_L = Q_1 - 1.5 \times \mathrm{IQR} = 17 - 1.5 \times 23 = -17.5$
$F_U = Q_3 + 1.5 \times \mathrm{IQR} = 40 + 1.5 \times 23 = 74.5$
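
The same arithmetic can be checked quickly in Python. The variable names below are illustrative, and the quartiles are taken from the worked example rather than recomputed.

```python
q1, q3 = 17, 40                  # quartiles from the worked example

iqr = q3 - q1                    # interquartile range
lower_fence = q1 - 1.5 * iqr     # F_L
upper_fence = q3 + 1.5 * iqr     # F_U

print(iqr, lower_fence, upper_fence)
```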
Computed values: $Q_1 = 17$, $Q_2 = 26$, $Q_3 = 40$, $F_L = -17.5$, $F_U = 74.5$

2. Construct a rectangle with one end at the first quartile and the other end at the third quartile.

[Figure: number line from -20 to 110 with 17 and 40 marked.]
3. Put a line across the interior of the rectangle at the median.

[Figure: number line from -20 to 110 with 17, 26, and 40 marked.]
4. Locate the smallest value/observation in the interval $[F_L, Q_1]$. Draw a line from this value to $Q_1$.

[Figure: number line from -20 to 110 with -17.5, 17, 26, and 40 marked.]
5. Locate the largest value/observation in the interval $[Q_3, F_U]$. Draw a line from this value to $Q_3$.

[Figure: number line from -20 to 110 with -17.5, 17, 26, 40, and 74.5 marked.]
6. Values falling outside the fences are considered outliers and are usually denoted by “x”.

[Figure: number line from -20 to 110 with -17.5, 17, 26, 40, and 74.5 marked.]
[Figure: completed boxplot on a number line from -20 to 110, with -17.5, 17, 26, 40, and 74.5 marked.]

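Observations:
• The median is not exactly at the middle of the rectangle; it lies closer to $Q_1$ than to $Q_3$.

• There are no possible outliers: every observation (from 1 to 50) lies between the fences.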
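
Both observations can be confirmed numerically. The sketch below reuses the quartiles and fences from the worked example and finds where the whiskers end (steps 4 and 5) and whether any value falls outside the fences (step 6). The variable names are illustrative.

```python
data = [1, 5, 5, 6, 8, 10, 12, 13, 14, 15,
        15, 17, 18, 18, 18, 18, 18, 20, 24, 25,
        26, 26, 26, 26, 27, 28, 31, 31, 34, 37,
        37, 37, 39, 40, 40, 41, 41, 42, 43, 44,
        44, 46, 46, 47, 49, 50]

q1, q2, q3 = 17, 26, 40                   # from the worked example
lower_fence, upper_fence = -17.5, 74.5

# Whiskers end at the most extreme observations that are still inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Anything beyond a fence would be flagged as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

# Distance from the median to each end of the box.
left_part, right_part = q2 - q1, q3 - q2

print(whisker_low, whisker_high, outliers, left_part, right_part)
```

Here the left part of the box (9 units) is shorter than the right part (14 units), which is why the median line sits off-center.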

Histogram

• The word histogram comes from the Greek histos, meaning pole or mast, and gram, which means chart or graph.

• The literal meaning of “histogram” is “pole chart.”

• A histogram is a graphical display of data using bars of different heights.

• It is used to display the distribution of data values along the real number line.

• It is created by dividing up the range of the data into a small number of intervals or bins.

• The number of observations falling in each interval is counted. This gives a frequency distribution.
- A histogram is a graph of the frequency distribution in which the vertical axis represents the count (frequency) and the horizontal axis represents the possible range of the data values.

➢ A histogram visually represents the distribution of a continuous variable.

Example
Customer Wait Time in minutes (n=20)

13.1 42.2 43.5 45.2
25.6 15.5 40.3 54.1
37.6 30.3 10.2 45.6
36.5 21.4 37.3 36.5
45.3 35.6 31.2 43.1

[Figure: histogram of customer wait time in minutes. Vertical axis: Frequency (0 to 8). Horizontal axis bins: 0-10.0, 10.1-20.0, 20.1-30.0, 30.1-40.0, 40.1-50.0, 50.1-60.0.]
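
The bar heights follow from counting how many wait times fall in each bin. A minimal sketch of that counting step is shown below; it assumes each bin includes its upper limit (so 10.1-20.0 is treated as "greater than 10 and at most 20"), and the function name `bin_counts` is hypothetical. The printed rows of `#` marks give a rough text version of the histogram.

```python
waits = [13.1, 42.2, 43.5, 45.2,
         25.6, 15.5, 40.3, 54.1,
         37.6, 30.3, 10.2, 45.6,
         36.5, 21.4, 37.3, 36.5,
         45.3, 35.6, 31.2, 43.1]

# (lower, upper) limits of the six bins on the horizontal axis
bins = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]

def bin_counts(values, bins):
    """Count the values in each bin; bins are (lo, hi], the first one also includes lo."""
    counts = []
    for i, (lo, hi) in enumerate(bins):
        if i == 0:
            n = sum(1 for v in values if lo <= v <= hi)
        else:
            n = sum(1 for v in values if lo < v <= hi)
        counts.append(n)
    return counts

counts = bin_counts(waits, bins)
for (lo, hi), c in zip(bins, counts):
    print(f"{lo:>2}-{hi:<2} | {'#' * c} ({c})")
```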

Scatterplot
- Shows the relationship between two sets of data
- The data are plotted on the graph as Cartesian (x, y) coordinates

In this example, each dot shows one person's weight versus their height.

Example:
Temperature Data and Cricket Chirps (Excerpt)

Temperature (Fahrenheit)         | 57 | 60 | 64 | 65 | 68 | 71 | 74 | 77
Number of Chirps (in 15 seconds) | 18 | 20 | 21 | 23 | 27 | 30 | 34 | 39

Y = Temperature (Fahrenheit) | X = No. of Chirps (in 15 seconds)
57 | 18
60 | 20
64 | 21
65 | 23
68 | 27
71 | 30
74 | 34
As the temperature increases, the number of chirps increases.
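
This pattern can be checked directly from the excerpt. The sketch below pairs each chirp count with its temperature, confirms that the chirp counts rise as the temperature rises, and then computes Pearson's correlation coefficient as one common way to quantify the strength of a linear relationship. The correlation step and the function name `pearson_r` are additions for illustration and are not part of the slides.

```python
import math

temperature = [57, 60, 64, 65, 68, 71, 74, 77]   # Fahrenheit
chirps = [18, 20, 21, 23, 27, 30, 34, 39]        # chirps in 15 seconds

# The points as (x, y) = (chirps, temperature), sorted by temperature.
points = sorted(zip(chirps, temperature), key=lambda p: p[1])

# Does every step up in temperature come with a step up in chirps?
always_increasing = all(a[0] < b[0] for a, b in zip(points, points[1:]))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(chirps, temperature)
print(always_increasing, round(r, 3))
```

A value of r close to 1 indicates that the points lie close to an upward-sloping straight line.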
