TPS5e Lecture PPT Ch1.1

This document is a chapter from a statistics textbook about analyzing categorical data. It discusses displaying categorical data using bar graphs and frequency tables, calculating and interpreting marginal and conditional distributions from two-way tables, and describing associations between categorical variables. The chapter cautions that some graphs can be deceptive and associations can be influenced by other variables.

Uploaded by Stephen Zhou

CHAPTER 1

Exploring Data
1.1 Analyzing Categorical Data
The Practice of Statistics, 5th Edition
Starnes, Tabor, Yates, Moore

Bedford Freeman Worth Publishers

Analyzing Categorical Data


Learning Objectives
After this section, you should be able to:
DISPLAY categorical data with a bar graph
IDENTIFY what makes some graphs of categorical data deceptive
CALCULATE and DISPLAY the marginal distribution of a categorical variable from a two-way table
CALCULATE and DISPLAY the conditional distribution of a categorical variable for a particular value of the other categorical variable in a two-way table
DESCRIBE the association between two categorical variables

Categorical Variables
Categorical variables place individuals into one of several
groups or categories.
Frequency Table

Format              Count of Stations
Adult Contemporary  1556
Adult Standards     1196
Contemporary Hit    569
Country             2066
News/Talk           2179
Oldies              1060
Religious           2014
Rock                869
Spanish Language    750
Other Formats       1579
Total               13838

Relative Frequency Table

Format              Percent of Stations
Adult Contemporary  11.2
Adult Standards     8.6
Contemporary Hit    4.1
Country             14.9
News/Talk           15.7
Oldies              7.7
Religious           14.6
Rock                6.3
Spanish Language    5.4
Other Formats       11.4
Total               99.9

In both tables, Format is the variable and the format names are its values. The second column gives the count or the percent for each value.
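
As a minimal sketch of how the relative frequency table follows from the frequency table, the Python below divides each count by the total and rounds to one decimal place. The counts are copied from the slide. The names `counts`, `total`, and `percents` are illustrative choices, not part of the textbook.

```python
# Counts of radio stations by format, copied from the frequency table.
counts = {
    "Adult Contemporary": 1556,
    "Adult Standards": 1196,
    "Contemporary Hit": 569,
    "Country": 2066,
    "News/Talk": 2179,
    "Oldies": 1060,
    "Religious": 2014,
    "Rock": 869,
    "Spanish Language": 750,
    "Other Formats": 1579,
}

# Total number of stations across all formats.
total = sum(counts.values())

# Relative frequency (percent) = 100 * count / total, rounded to one decimal.
percents = {fmt: round(100 * n / total, 1) for fmt, n in counts.items()}

for fmt, n in counts.items():
    print(f"{fmt:<20}{n:>6}{percents[fmt]:>7.1f}")
print(f"{'Total':<20}{total:>6}{sum(percents.values()):>7.1f}")
```

Because each percent is rounded separately, the rounded percents add to 99.9 rather than exactly 100, which matches the Total row of the relative frequency table.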

Displaying Categorical Data


Frequency tables can be difficult to read.
Sometimes it is easier to analyze a distribution by displaying it with a
bar graph or pie chart.
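
To illustrate the idea, here is a rough sketch that draws a text-only bar graph from the percents in the relative frequency table. The function name `text_bar_graph` and the `scale` parameter (characters per percentage point) are assumptions made for this example. A plotting library would normally be used instead.

```python
# Percent of stations by format, copied from the relative frequency table.
percents = {
    "Adult Contemporary": 11.2,
    "Adult Standards": 8.6,
    "Contemporary Hit": 4.1,
    "Country": 14.9,
    "News/Talk": 15.7,
    "Oldies": 7.7,
    "Religious": 14.6,
    "Rock": 6.3,
    "Spanish Language": 5.4,
    "Other Formats": 11.4,
}


def text_bar_graph(data, scale=2):
    """Return one line per category; bar length is proportional to the value."""
    lines = []
    for label, value in data.items():
        bar = "#" * round(value * scale)  # every bar is one row tall, so widths are equal
        lines.append(f"{label:<20}|{bar} {value}")
    return lines


for line in text_bar_graph(percents):
    print(line)
```

Each bar starts at zero and has the same thickness, so only its length carries information.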

Graphs: Good and Bad


Bar graphs compare several quantities by comparing the heights of
bars that represent those quantities. Our eyes, however, react to the
area of the bars as well as to their height.
When you draw a bar graph, make the bars equally wide.
It is tempting to replace the bars with pictures for greater eye appeal.
Don't do it!
There are two important lessons to keep in mind:
(1) beware the pictograph, and
(2) watch those scales.
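
The two lessons above can be made concrete with a little arithmetic. The sketch below uses the Country (14.9%) and News/Talk (15.7%) percents from the radio-format table. The axis baseline of 14 is a hypothetical choice made only for illustration, and the function names are assumptions rather than anything defined in the textbook.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar heights (b to a) when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)


def pictograph_area_ratio(height_ratio):
    """A picture enlarged in both height and width grows in area by the square."""
    return height_ratio ** 2


country, news_talk = 14.9, 15.7  # percents of stations, from the slides

# Scale starting at zero: bar heights keep the true ratio of the two percents.
honest = apparent_ratio(country, news_talk)

# Scale starting at 14 (hypothetical): the same data look far more different.
truncated = apparent_ratio(country, news_talk, baseline=14.0)

print(f"axis from 0:  News/Talk bar is {honest:.2f} times as tall as Country")
print(f"axis from 14: News/Talk bar is {truncated:.2f} times as tall as Country")
print(f"a picture twice as tall covers {pictograph_area_ratio(2)} times the area")
```

With the axis starting at zero, the News/Talk bar is about 1.05 times as tall as the Country bar. With the axis starting at 14, it is about 1.89 times as tall, even though the data have not changed. Likewise, a pictograph drawn twice as tall is also twice as wide, so it covers four times the area.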

Two-Way Tables and Marginal Distributions


When a dataset involves two categorical variables, we begin by
examining the counts or percents in various categories for one of the
variables.
A two-way table describes two categorical variables,
organizing counts according to a row variable and a
column variable.

What are the variables described by this two-way table?
How many young adults were surveyed?

Two-Way Tables and Marginal Distributions


The marginal distribution of one of the categorical variables in a two-way
table of counts is the distribution of values of that variable among
all individuals described by the table.
Note: Percents are often more informative than counts, especially
when comparing groups of different sizes.

How to examine a marginal distribution:
1) Use the data in the table to calculate the marginal distribution (in percents) of the row or column totals.
2) Make a graph to display the marginal distribution.

Two-Way Tables and Marginal Distributions


Examine the marginal distribution of chance of getting rich.

Response          Percent
Almost no chance  194/4826 = 4.0%
Some chance       712/4826 = 14.8%
A 50-50 chance    1416/4826 = 29.3%
A good chance     1421/4826 = 29.4%
Almost certain    1083/4826 = 22.4%
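
The calculation above can be sketched in Python. The male and female counts are the numerators from this deck's table of conditional distributions by gender. The variable names (`table`, `row_totals`, `marginal`) are illustrative assumptions.

```python
# Two-way table of counts: response -> (male count, female count).
table = {
    "Almost no chance": (98, 96),
    "Some chance": (286, 426),
    "A 50-50 chance": (720, 696),
    "A good chance": (758, 663),
    "Almost certain": (597, 486),
}

# Row totals ignore gender: they count everyone who gave each response.
row_totals = {resp: sum(cells) for resp, cells in table.items()}

# Grand total = number of young adults surveyed.
grand_total = sum(row_totals.values())

# Marginal distribution of opinion, in percents rounded to one decimal.
marginal = {resp: round(100 * t / grand_total, 1) for resp, t in row_totals.items()}

for resp, t in row_totals.items():
    print(f"{resp:<18}{t}/{grand_total} = {marginal[resp]}%")
```

The row totals (194, 712, 1416, 1421, 1083) and the grand total (4826) reproduce the fractions shown in the table above.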

Relationships Between Categorical Variables


A conditional distribution of a variable describes the values of that
variable among individuals who have a specific value of another
variable.

How to examine or compare conditional distributions:
1) Select the row(s) or column(s) of interest.
2) Use the data in the table to calculate the conditional distribution (in percents) of the row(s) or column(s).
3) Make a graph to display the conditional distribution. Use a side-by-side bar graph or segmented bar graph to compare distributions.

Relationships Between Categorical Variables


Calculate the conditional distribution of opinion among males. Examine the relationship between gender and opinion.

Response          Male               Female
Almost no chance  98/2459 = 4.0%     96/2367 = 4.1%
Some chance       286/2459 = 11.6%   426/2367 = 18.0%
A 50-50 chance    720/2459 = 29.3%   696/2367 = 29.4%
A good chance     758/2459 = 30.8%   663/2367 = 28.0%
Almost certain    597/2459 = 24.3%   486/2367 = 20.5%
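
As a sketch of the same calculation in code, the Python below divides each count by its own column total rather than by the grand total, which is what makes the distribution conditional on gender. The counts come from the table above. The helper name `conditional_distribution` is a hypothetical choice for this example.

```python
responses = [
    "Almost no chance",
    "Some chance",
    "A 50-50 chance",
    "A good chance",
    "Almost certain",
]

# Counts for each gender, listed in the same order as `responses`.
counts = {
    "Male": [98, 286, 720, 758, 597],
    "Female": [96, 426, 696, 663, 486],
}


def conditional_distribution(column):
    """Percent of each response within one gender (divide by that column's total)."""
    total = sum(column)
    return [round(100 * n / total, 1) for n in column]


conditional = {gender: conditional_distribution(col) for gender, col in counts.items()}

# Print the two conditional distributions next to each other for comparison.
print(f"{'Response':<18}{'Male':>8}{'Female':>8}")
for i, resp in enumerate(responses):
    print(f"{resp:<18}{conditional['Male'][i]:>7.1f}%{conditional['Female'][i]:>7.1f}%")
```

Printing the two columns side by side mirrors a side-by-side bar graph. The largest gap is in "Some chance", at 11.6% of males versus 18.0% of females.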

Relationships Between Categorical Variables


Can we say there is an association between
gender and opinion in the population of young
adults?
Making this determination requires formal
inference, which will have to wait a few
chapters.

Caution!
Even a strong association between two categorical variables can
be influenced by other variables lurking in the background.

Data Analysis: Making Sense of Data


Section Summary
In this section, we learned how to
DISPLAY categorical data with a bar graph
IDENTIFY what makes some graphs of categorical data deceptive
CALCULATE and DISPLAY the marginal distribution of a categorical variable from a two-way table
CALCULATE and DISPLAY the conditional distribution of a categorical variable for a particular value of the other categorical variable in a two-way table
DESCRIBE the association between two categorical variables
