
Chapter 2:

Frequency Distributions and Graphs

By
Dr. Abdelfattah Mustafa
Associate Professor of Mathematical Statistics

Mathematics Department, Faculty of Science,
Islamic University of Madinah, KSA

Reference: Allan G. Bluman, Elementary Statistics: A Step by Step Approach, 8th edition,
McGraw-Hill, 2012.

February 8, 2024

Dr. Abdelfattah Mustafa STAT 3111: General Statistics February 8, 2024 1 / 43
Contents

1 Introduction

2 Organizing Data

3 Histograms, Frequency Polygons, and Ogives

4 Other Types of Graphs


Introduction

When conducting a statistical study, the researcher must gather data for the
particular variable under study.

To describe situations, draw conclusions, or make inferences about events, the
researcher must organize the data. The most convenient way to organize data is to
construct a frequency distribution.

After organizing the data, the researcher must present them so they can be
understood.

The most useful method of presenting the data is by constructing statistical
charts and graphs. There are many different types of charts and graphs, and
each one has a specific purpose.


2-1 Organizing Data

Suppose a researcher wished to do a study on the ages of the top 50 wealthiest
people in the world.

When the data are in original form, they are called raw data and are listed
next.
69 65 76 59 74 81 73 38 57 49
61 69 49 85 65 78 68 69 56 54
67 64 43 82 78 43 37 68 81 48
80 59 85 40 85 79 77 81 56 52
74 87 90 83 61 69 61 57 71 60

Since little information can be obtained from looking at raw data,
the researcher organizes the data into what is called a frequency distribution.


A frequency distribution is the organization of raw data in table form, using
classes and frequencies.

Two types of frequency distributions that are most often used are

1 the categorical frequency distribution

2 the grouped frequency distribution.


Categorical Frequency Distributions:

The categorical frequency distribution is used for data that can be placed in
specific categories, such as nominal- or ordinal-level data.

Example 1 (Example 2-1: Distribution of Blood Types)

Twenty-five army inductees were given a blood test to determine their blood type.
The data set is
A B B AB O O O B AB
B B B O A O A O O
O AB AB A O B A

Construct a frequency distribution for the data.

Solution:


There are four blood types: A, B, O, and AB.

These types will be used as the classes for the distribution.

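Make a table as shown:

Class   Frequency   Percent
A       5           20
B       7           28
O       9           36
AB      4           16
Total   25          100

where

\[ \text{Percent} = \frac{\text{frequency}}{\text{Total}} \times 100 \qquad (1) \]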
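The tally and Equation (1) can be checked with a short script. This is a minimal sketch in Python (the slides prescribe no language); the names `blood_types`, `freq`, and `percent` are illustrative choices, not part of the course material.

```python
from collections import Counter

# Blood types of the 25 inductees, row by row as listed in Example 1.
blood_types = (
    "A B B AB O O O B AB "
    "B B B O A O A O O "
    "O AB AB A O B A"
).split()

freq = Counter(blood_types)       # class -> frequency
total = sum(freq.values())        # n = 25

# Equation (1): Percent = frequency / Total * 100
percent = {c: 100 * f / total for c, f in freq.items()}

for c in ("A", "B", "O", "AB"):
    print(f"{c:<3}{freq[c]:>3}{percent[c]:>6.0f}%")
```

The percentages add to 100 because every inductee falls in exactly one class.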


Grouped Frequency Distributions:

For quantitative data, when the range of the raw data is large, the data must
be grouped into classes that are more than one unit in width, in what is called a
grouped frequency distribution.

The researcher must decide how many classes to use and the width of each class.
To construct a frequency distribution, follow these rules:

1 There should be between 5 and 20 classes.

2 It is preferable but not absolutely necessary that the class width be an odd
number.

3 The classes must be mutually exclusive.

4 The classes must be continuous.

5 The classes must be exhaustive.



6 The classes must be equal in width. One exception occurs when a distribution
has a class that is open-ended.

A frequency distribution with an open-ended class is called an open-ended
distribution. Here are two examples of distributions with open-ended classes.

Example 2–2 shows the procedure for constructing a grouped frequency
distribution, i.e., when the classes contain more than one data value.


Example 2 (Example 2-2: Record High Temperatures)

The following data represent the record high temperatures in degrees Fahrenheit (°F)
for each of the 50 states.

112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114

Construct a grouped frequency distribution for these data.

Solution:

The procedure for constructing a grouped frequency distribution for numerical data
follows.

1 Determine the classes:


Find the Range:

Range = highest value − lowest value = 134 − 100 = 34

Select the number of classes desired (usually between 5 and 20). In this case, 7 is
arbitrarily chosen. The number of classes can be calculated by:

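\[ k = 1 + 3.322 \log_{10}(n) \]

For n = 50 this gives k ≈ 6.64, which rounds up to 7.

Find the class width:

\[ \text{width} = \frac{R}{\text{number of classes}} = \frac{34}{7} \approx 4.9 \cong 5 \]

A number is rounded up if there is any decimal remainder when dividing. For
example, $85 \div 6 = 14.167 \cong 15$.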
Select a starting point for the lowest class limit: This can be the smallest data
value or any convenient number less than the smallest data value. In this case, 100
is used.


Lower limit           Upper limit
100                   105 − one unit = 104
100 + width = 105     104 + width = 109
110                   114
115                   119
120                   124
125                   129
130                   134

2 Tally the data.

3 Find the numerical frequencies from the tallies.

4 Find the class boundaries by

lower boundary = lower class limit − half of a unit

upper boundary = upper class limit + half of a unit
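Steps 1 to 4 can be sketched in Python as follows. This is an illustrative sketch, not the textbook's procedure verbatim: the variable names are assumptions, the starting point 100 and k = 7 are the values chosen in the example, and the unit is taken as 1 because the temperatures are whole degrees.

```python
import math

temps = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]

# Step 1: range, number of classes, class width.
data_range = max(temps) - min(temps)            # 134 - 100 = 34
k_rule = 1 + 3.322 * math.log10(len(temps))     # about 6.64 for n = 50
k = 7                                           # value chosen in the example
width = math.ceil(data_range / k)               # 34 / 7 = 4.857..., rounded up to 5

# Class limits: lower limits step by the width; each upper limit is
# one unit below the next lower limit.
start = 100
limits = [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]

# Steps 2-3: tally each value into the one class that contains it.
freq = [sum(lo <= t <= hi for t in temps) for lo, hi in limits]

# Step 4: boundaries lie half a unit outside the limits.
boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in limits]

for (lo, hi), (lb, ub), f in zip(limits, boundaries, freq):
    print(f"{lo}-{hi}  {lb}-{ub}  {f}")
```

Using `math.ceil` mirrors the rule that the width is rounded up whenever the division leaves a decimal remainder.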


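The completed frequency distribution is

Class limits   Class boundaries   Frequency
100 – 104      99.5 – 104.5       2
105 – 109      104.5 – 109.5      8
110 – 114      109.5 – 114.5      18
115 – 119      114.5 – 119.5      13
120 – 124      119.5 – 124.5      7
125 – 129      124.5 – 129.5      1
130 – 134      129.5 – 134.5      1
Total                             50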

The frequency distribution shows that the class 109.5–114.5 contains the largest
number of temperatures (18) followed by the class 114.5–119.5 with 13
temperatures.

Hence, most of the temperatures (31) fall between 109.5 and 119.5°F.


Sometimes it is necessary to use a cumulative frequency distribution.

A cumulative frequency distribution is a distribution that shows the
number of data values less than or equal to a specific value (usually an upper
boundary).

The values are found by adding the frequencies of the classes less than or equal
to the upper class boundary of a specific class.

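The cumulative frequency distribution for the data in this example is as follows:

                  Cumulative frequency
Less than 99.5    0
Less than 104.5   2
Less than 109.5   10
Less than 114.5   28
Less than 119.5   41
Less than 124.5   48
Less than 129.5   49
Less than 134.5   50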


Cumulative frequencies are used to show how many data values are accumulated
up to and including a specific class.
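As a hedged illustration, the running totals for the record high temperatures can be computed from the class frequencies tallied earlier (2, 8, 18, 13, 7, 1, 1). The list names below are illustrative.

```python
from itertools import accumulate

upper_boundaries = [104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
freq = [2, 8, 18, 13, 7, 1, 1]

# Each cumulative frequency is the sum of all class frequencies
# up to and including the class that ends at that upper boundary.
cum_freq = list(accumulate(freq))

for ub, cf in zip(upper_boundaries, cum_freq):
    print(f"Less than {ub}: {cf}")
```

The last cumulative frequency always equals the total number of data values, here 50.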

After the raw data have been organized into a frequency distribution, they can be
analyzed by looking for peaks and extreme values.

The peaks show which class or classes have the most data values compared to
the other classes.

Extreme values, called outliers, are data values that are extremely large or
extremely small relative to the other data values.

When the range of the data values is relatively small, a frequency distribution
can be constructed using single data values for each class.

This type of distribution is called an ungrouped frequency distribution and is
shown next.


Example 3 (Example 2-3: MPGs for SUVs)

The data shown here represent the number of miles per gallon (mpg) that 30 selected
four-wheel-drive sports utility vehicles obtained in city driving.

12 17 12 14 16 18 16 18 12 16
17 15 15 16 12 15 16 16 12 14
15 12 15 15 19 13 16 18 16 14

Construct a frequency distribution, and analyze the distribution.

Solution:

The range of the data set is

R = largest value − smallest value = 19 − 12 = 7

Taking the number of classes as k = 7, the width of the classes is

\[ \text{width} = \frac{R}{k} = \frac{7}{7} = 1 \]

Therefore, classes consisting of a single data value can be used.

The classes: 12, 13, 14, 15, 16, 17, 18, 19.

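Therefore, the ungrouped frequency distribution is

Class limits   Class boundaries   Frequency
12             11.5 – 12.5        6
13             12.5 – 13.5        1
14             13.5 – 14.5        3
15             14.5 – 15.5        6
16             15.5 – 16.5        8
17             16.5 – 17.5        2
18             17.5 – 18.5        3
19             18.5 – 19.5        1
Total                             30

In this case, almost one-half (14) of the vehicles get 15 or 16 miles per gallon.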
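The ungrouped distribution for the SUV data can be reproduced with a counter. This is a minimal sketch; `mpg`, `classes`, `freq`, and `cum_freq` are illustrative names.

```python
from collections import Counter
from itertools import accumulate

mpg = [
    12, 17, 12, 14, 16, 18, 16, 18, 12, 16,
    17, 15, 15, 16, 12, 15, 16, 16, 12, 14,
    15, 12, 15, 15, 19, 13, 16, 18, 16, 14,
]

counts = Counter(mpg)
classes = list(range(min(mpg), max(mpg) + 1))   # 12, 13, ..., 19
freq = [counts[c] for c in classes]             # a Counter gives 0 for an absent class
cum_freq = list(accumulate(freq))               # running totals

for c, f, cf in zip(classes, freq, cum_freq):
    print(f"{c}  {f:>2}  {cf:>2}")
```

Building `classes` from the full range, rather than from the observed values only, keeps a class in the table even when its frequency is zero.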


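The cumulative frequencies are

                 Cumulative frequency
Less than 11.5   0
Less than 12.5   6
Less than 13.5   7
Less than 14.5   10
Less than 15.5   16
Less than 16.5   24
Less than 17.5   26
Less than 18.5   29
Less than 19.5   30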

2-2 Histograms, Frequency Polygons, and Ogives


The three most commonly used graphs in research are

1 The histogram.

2 The frequency polygon.

3 The cumulative frequency graph, or ogive (pronounced o-jive).


Frequency Histogram:
The histogram is a graph that displays the data by using contiguous vertical
bars of various heights to represent the frequencies of the classes.

Example 4 (Example 2-4: Record High Temperatures)

Construct a histogram to represent the data shown for the record high temperatures
for each of the 50 states.
Class boundaries Frequency
99.5 – 104.5 2
104.5 – 109.5 8
109.5 – 114.5 18
114.5 – 119.5 13
119.5 – 124.5 7
124.5 – 129.5 1
129.5 – 134.5 1


Solution:
1 Draw and label the x and y axes.

2 Represent the frequency on the y axis and the class boundaries on the x axis.

3 Using the frequencies as the heights, draw vertical bars for each class.

Figure 1: The frequency Histogram for Record High Temperatures.
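For readers without the figure, the histogram can be approximated in plain text. The sketch below is an assumption-laden stand-in, not the figure itself: the bars run horizontally rather than vertically, and the helper `text_histogram` is a hypothetical name. A plotting library would normally be used instead.

```python
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
freq = [2, 8, 18, 13, 7, 1, 1]

def text_histogram(edges, counts):
    """Return one line per class; the bar length equals the class frequency."""
    lines = []
    for lo, hi, f in zip(edges, edges[1:], counts):
        lines.append(f"{lo:>5.1f}-{hi:<5.1f} | {'#' * f} {f}")
    return lines

print("\n".join(text_histogram(boundaries, freq)))
```

Because consecutive classes share a boundary, the bars of a histogram touch, which is what distinguishes it from a bar graph.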


Frequency Polygon:
The frequency polygon is a graph that displays the data by using lines that
connect points plotted for the frequencies at the midpoints of the classes. The
frequencies are represented by the heights of the points.

Example 5 (Example 2-5: Record High Temperatures)

Using the frequency distribution given in Example 4, construct a frequency polygon.

Solution:
Find the midpoints of each class.

\[ \text{midpoint} = \frac{\text{upper boundary} + \text{lower boundary}}{2} \]

Also, class limits can be used to calculate the midpoints.
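A short check, under the assumption that the class boundaries and limits are those of the record high temperature distribution, shows that both routes give the same midpoints. The names `points` and `midpoints_from_limits` are illustrative.

```python
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
limits = [(100, 104), (105, 109), (110, 114), (115, 119),
          (120, 124), (125, 129), (130, 134)]
freq = [2, 8, 18, 13, 7, 1, 1]

# Midpoint from boundaries: (upper boundary + lower boundary) / 2
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]

# The same midpoints computed from the class limits.
midpoints_from_limits = [(lo + hi) / 2 for lo, hi in limits]

# Points to plot for the frequency polygon: (midpoint, frequency).
points = list(zip(midpoints, freq))
print(points)
```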


Plot the frequency at each midpoint as shown in the following figure

Figure 2: The Frequency Polygon for Record High Temperatures.


The ogive:
The ogive is a graph that represents the cumulative frequencies for the classes in
a frequency distribution.

Example 6 (Example 2-6: Record High Temperatures)

Construct an ogive for the frequency distribution described in Example 4.

Solution:

1 Find the cumulative frequency for each class.


2 Plot the cumulative frequency at each upper class boundary, as shown in the
following figure.

3 Connect adjacent points with line segments. Then extend the graph to the first
lower class boundary, 99.5, on the x axis.

Figure 3: The Ogive for Record High Temperatures.


Cumulative frequency graphs are used to visually represent how many values are
below a certain upper class boundary.

For example, Figure 3 shows that 28 record high temperatures are less than
114.5°F.
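Reading a value off the ogive can be mimicked in code. The sketch below is illustrative: `read_ogive` is a hypothetical helper, and between boundaries it interpolates linearly along the line segments, which assumes the values are spread evenly within each class. Only the values at the class boundaries are exact.

```python
from bisect import bisect_right
from itertools import accumulate

boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
freq = [2, 8, 18, 13, 7, 1, 1]

# The ogive starts at 0 at the first lower boundary, then rises
# to each cumulative frequency at the upper boundaries.
cum = [0] + list(accumulate(freq))

def read_ogive(x):
    """Cumulative frequency at x, read along the ogive's line segments."""
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return float(cum[-1])
    i = bisect_right(boundaries, x) - 1      # segment that contains x
    x0, x1 = boundaries[i], boundaries[i + 1]
    y0, y1 = cum[i], cum[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

print(read_ogive(114.5))
```

At 114.5 the function returns the cumulative frequency of the first three classes, 2 + 8 + 18 = 28.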


Relative Frequency Graphs:

The histogram, the frequency polygon, and the ogive shown previously were
constructed by using frequencies in terms of the raw data.

These distributions can be converted to distributions using proportions instead
of raw data as frequencies. These types of graphs are called relative frequency
graphs.

Example 7 (Example 2-7: Miles Run per Week)


Construct a histogram, frequency polygon, and ogive using relative frequencies for
the distribution (shown here) of the miles that 20 randomly selected runners ran
during a given week.
Class boundaries   Frequency
5.5 – 10.5         1
10.5 – 15.5        2
15.5 – 20.5        3
20.5 – 25.5        5
25.5 – 30.5        4
30.5 – 35.5        3
35.5 – 40.5        2


Solution:

Step 1: Convert each frequency to a proportion or relative frequency.

Step 2: Find the cumulative relative frequencies.


Step 3: Draw each graph as follows.
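Steps 1 and 2 for the miles-run data can be sketched as below. The names are illustrative, and the cumulative values are obtained by dividing cumulative counts by n, a design choice that avoids summing already-rounded proportions.

```python
from itertools import accumulate

boundaries = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5, 40.5]
freq = [1, 2, 3, 5, 4, 3, 2]
n = sum(freq)                                   # 20 runners

# Step 1: relative frequency = class frequency / total
rel_freq = [f / n for f in freq]

# Step 2: cumulative relative frequency = cumulative frequency / total
cum_rel_freq = [c / n for c in accumulate(freq)]

for lo, hi, r, c in zip(boundaries, boundaries[1:], rel_freq, cum_rel_freq):
    print(f"{lo}-{hi}  {r:.2f}  {c:.2f}")
```

The relative frequencies sum to 1, so the relative frequency ogive always ends at 1.00.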


Distribution Shapes:

A distribution can have many shapes, and one method of analyzing a
distribution is to draw a histogram or frequency polygon for the distribution.

Several of the most common shapes are shown in the following figures.


Distributions can have other shapes in addition to the ones shown here.


2-3 Other Types of Graphs

In addition to the histogram, the frequency polygon, and the ogive, several other
types of graphs are often used in statistics. They are the bar graph, Pareto chart,
time series graph, and pie graph.

Bar Graphs: When the data are qualitative or categorical, bar graphs can be
used to represent the data.

A bar graph represents the data by using vertical or horizontal bars whose
heights or lengths represent the frequencies of the data.

Example 8 (College Spending for First-Year Students)

The table shows the average amount of money spent by first-year college students. Draw a
horizontal and vertical bar graph for the data.

Category       Average amount spent
Electronics    $728
Dorm decor     $344
Clothing       $141
Shoes          $72

Solution:
Draw the bars corresponding to the frequencies.

Figure 4: The bar charts for College Spending for First-Year Students.


The Time Series Graph: When data are collected over a period of time, they
can be represented by a time series graph.

A time series graph represents data that occur over a specific period of time.

Example 9 (Example 2-10: Workplace Homicides)

The number of homicides that occurred in the workplace for the years 2003 to 2008 is
shown. Draw and analyze a time series graph for the data.

Year     2003   2004   2005   2006   2007   2008
Number   632    559    567    540    628    517

Solution:

1 Draw and label the x (Years) and y (Numbers) axes.

2 Plot each point according to the table.



3 Draw line segments connecting adjacent points.

Figure 5: The Time Series Graph for Workplace Homicides


A Pie Graph: Pie graphs are used extensively in statistics. The purpose of the
pie graph is to show the relationship of the parts to the whole by visually
comparing the sizes of the sections. Percentages or proportions can be used. The
variable is nominal or categorical.

A pie graph is a circle that is divided into sections or wedges according to the
percentage of frequencies in each category of the distribution.

Example 10 (Example 2-11: Super Bowl Snack Foods)

This frequency distribution shows the number of pounds of each snack food
eaten during the Super Bowl. Construct a pie graph for the data.


Solution

Since there are 360° in a circle, the frequency for each class must be converted
into a proportional part of the circle. This conversion is done by using the
following formulas:

\[ \% = \frac{f}{n} \times 100, \qquad \text{Degrees} = \frac{f}{n} \times 360 \]

Therefore,

Snack            Pounds (frequency)   Percent   Degrees
Potato chips     11.2                 37.3      134
Tortilla chips   8.2                  27.3      98
Pretzels         4.3                  14.3      52
Popcorn          3.8                  12.7      46
Snack nuts       2.5                  8.3       30
Total            30.0                 ≅ 100     360
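The percent and degree columns can be recomputed from the pounds column. This is a sketch under the assumption that percents are rounded to one decimal place and degrees to whole numbers, as the table suggests; the dictionary names are illustrative.

```python
pounds = {
    "Potato chips": 11.2,
    "Tortilla chips": 8.2,
    "Pretzels": 4.3,
    "Popcorn": 3.8,
    "Snack nuts": 2.5,
}
n = sum(pounds.values())                      # total, 30.0 up to float rounding

# % = f / n * 100 and Degrees = f / n * 360, rounded as in the table.
percent = {s: round(100 * f / n, 1) for s, f in pounds.items()}
degrees = {s: round(360 * f / n) for s, f in pounds.items()}

for s in pounds:
    print(f"{s:<15}{pounds[s]:>6}{percent[s]:>7}{degrees[s]:>5}")
```

The rounded percents add to 99.9 rather than 100, which is why the table shows ≅ 100. The rounded degrees happen to add to exactly 360 here; that is not guaranteed for other data.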


Next, using a protractor and a compass, draw the graph using the appropriate
degree measures, and label each section with the name and percentages, as
shown in the following Figure

Figure 6: Pie Graph for Super Bowl Snack Foods.


Example 11 (Example 2-12: Distribution of Blood Types)

Construct a pie graph showing the blood types of the army inductees described in
Example 1. The frequency distribution is repeated here.

Solution:
The number of degrees can be calculated as in the following table
Class Frequency Percent Degree
A 5 20 72
B 7 28 100.8
O 9 36 129.6
AB 4 16 57.6
Total 25 100 360

Figure 7: The Pie Graphs for Distribution of Blood Types.

Stem and Leaf Plots:


The stem and leaf plot is a method of organizing data and is a combination of
sorting and graphing.


A stem and leaf plot is a data plot that uses part of the data value as the
stem and part of the data value as the leaf to form groups or classes.

Example 12 (Example 2-13)

At an outpatient testing center, the number of cardiograms performed each day for
20 days is shown. Construct a stem and leaf plot for the data.

25 31 20 32 13 14 43 02 57 23
36 32 33 32 44 32 52 44 51 45

Solution:

1 Arrange the data in order:

02, 13, 14, 20, 23, 25, 31, 32, 32, 32, 32, 33, 36, 43, 44, 44, 45, 51, 52, 57

2 Separate the data according to the first digit, as shown.

02    13, 14    20, 23, 25    31, 32, 32, 32, 32, 33, 36    43, 44, 44, 45    51, 52, 57

3 A display can be made by using the leading digit as the stem and the trailing
digit as the leaf

Leading digit (stem)   Trailing digit (leaf)
0                      2
1                      3 4
2                      0 3 5
3                      1 2 2 2 2 3 6
4                      3 4 4 5
5                      1 2 7
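The three steps above can be sketched in Python. The helper name `stem_and_leaf` is hypothetical, and the value 02 is written as 2 because Python integers cannot carry a leading zero.

```python
from collections import defaultdict

cardiograms = [
    25, 31, 20, 32, 13, 14, 43, 2, 57, 23,
    36, 32, 33, 32, 44, 32, 52, 44, 51, 45,
]

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):             # step 1: arrange in order
        stem, leaf = divmod(v, 10)       # step 2: split off the trailing digit
        plot[stem].append(leaf)
    return dict(plot)

# Step 3: display one row per stem.
for stem, leaves in sorted(stem_and_leaf(cardiograms).items()):
    print(stem, "|", " ".join(map(str, leaves)))
```

Because `divmod(v, 10)` keeps every digit except the last as the stem, the same function gives stem 32 and leaf 5 for a value such as 325.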

Example 13 (Example 2-14)

An insurance company researcher conducted a survey on the number of car thefts in
a large city for a period of 30 days last summer. The raw data are shown. Construct
a stem and leaf plot by using classes 50–54, 55–59, 60–64, 65–69, 70–74, and 75–79.


52 62 51 50 69 58 77 66 53 57
75 56 55 67 73 79 59 68 65 72
57 51 63 69 75 65 53 78 66 55

Solution:

1 Arrange the data in order.

50, 51, 51, 52, 53, 53, 55, 55, 56, 57, 57, 58, 59, 62, 63, 65, 65, 66, 66, 67, 68, 69,
69, 72, 73, 75, 75, 77, 78, 79

2 Separate the data according to the classes.

50, 51, 51, 52, 53, 53    55, 55, 56, 57, 57, 58, 59    62, 63
65, 65, 66, 66, 67, 68, 69, 69    72, 73    75, 75, 77, 78, 79

3 Plot the data as shown here.


Leading digit (stem)   Trailing digit (leaf)
5                      0 1 1 2 3 3
5                      5 5 6 7 7 8 9
6                      2 3
6                      5 5 6 6 7 8 9 9
7                      2 3
7                      5 5 7 8 9
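When each stem is split into two classes (leaves 0 to 4 and leaves 5 to 9), the grouping can be sketched as follows. The function name `split_stem_plot` is hypothetical, and this sketch omits a class that contains no data values.

```python
thefts = [
    52, 62, 51, 50, 69, 58, 77, 66, 53, 57,
    75, 56, 55, 67, 73, 79, 59, 68, 65, 72,
    57, 51, 63, 69, 75, 65, 53, 78, 66, 55,
]

def split_stem_plot(values):
    """Return (stem, leaves) rows; each stem appears once per half (0-4, then 5-9)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1         # which of the two classes for this stem
        rows.setdefault((stem, half), []).append(leaf)
    return [(stem, leaves) for (stem, _half), leaves in sorted(rows.items())]

for stem, leaves in split_stem_plot(thefts):
    print(stem, "|", " ".join(map(str, leaves)))
```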

When the data values are in the hundreds, such as 325, the stem is 32 and the
leaf is 5. For example, the stem and leaf plot for the data values 325, 327, 330,
332, 335, 341, 345, and 347 looks like this.
32 5 7
33 0 2 5
34 1 5 7

Exercises 2
Page 95: 1 - 6. Page 98: 17, 18, 20, 21