Sfs5e PPT ch01

Uploaded by

mai nguyễn


STATISTICS

INFORMED DECISIONS USING DATA

Fifth Edition

Chapter 1
Data Collection

Copyright © 2018, 2014, 2011 Pearson Education, Inc. All Rights Reserved
1.1 Introduction to the Practice of Statistics
Learning Objectives
1. Define statistics and statistical thinking
2. Explain the process of statistics
3. Distinguish between qualitative and quantitative variables
4. Distinguish between discrete and continuous variables
5. Determine the level of measurement of a variable

1.1.1 Define Statistics and Statistical Thinking
Statistics is the science of collecting, organizing, summarizing, and
analyzing information to draw conclusions or answer questions. In addition,
statistics is about providing a measure of confidence in any conclusions.
The information referred to in the definition is data. Data are a “fact or
proposition used to draw a conclusion or make a decision.” Data describe
characteristics of an individual.
A key aspect of data is that they vary. Is everyone in your class the same
height? No! Does everyone have the same hair color? No! So, among
individuals there is variability.
In fact, data vary when measured on ourselves as well. Do you sleep the
same number of hours every night? No! Do you consume the same number
of calories every day? No!
One goal of statistics is to describe and understand sources of variability.

1.1.2 Explain the Process of Statistics (1 of 7)
The entire group of individuals to be
studied is called the population. An
individual is a person or object that
is a member of the population being
studied. A sample is a subset of the
population that is being studied.

1.1.2 Explain the Process of Statistics (2 of 7)
Descriptive statistics consist of organizing and
summarizing data. Descriptive statistics describe data
through numerical summaries, tables, and graphs. A
statistic is a numerical summary based on a sample.
Inferential statistics uses methods that take results from a
sample, extend them to the population, and measure the
reliability of the result.
A parameter is a numerical summary of a population.

1.1.2 Explain the Process of Statistics (3 of 7)
EXAMPLE Parameter versus Statistic
Suppose the percentage of all students on your campus who
have a job is 84.9%. This value represents a parameter
because it is a numerical summary of a population.
Suppose a sample of 250 students is obtained, and from this
sample we find that 86.4% have a job. This value represents
a statistic because it is a numerical summary based on a
sample.
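The distinction can be sketched in a few lines of Python. This is only an illustration: the campus size of 10,000 students and the seed are assumptions, chosen so that exactly 84.9% of the population has a job.

```python
import random

# Hypothetical campus of 10,000 students, 8,490 of whom have a job (84.9%).
# The population size is an assumption made for this illustration.
population = [1] * 8490 + [0] * 1510   # 1 = has a job, 0 = does not

# Parameter: a numerical summary of the entire population.
parameter = sum(population) / len(population)

# Statistic: a numerical summary based on a sample of 250 students.
rng = random.Random(1)                 # arbitrary seed, for reproducibility
sample = rng.sample(population, 250)
statistic = sum(sample) / len(sample)

print(f"parameter = {parameter:.1%}")
print(f"statistic = {statistic:.1%}")  # changes from sample to sample
```

The parameter is fixed, while the statistic depends on which 250 students happen to be selected.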

1.1.2 Explain the Process of Statistics (4 of 7)
The Process of Statistics
1. Identify the research objective. A researcher must determine the question(s) he or she
wants answered. The question(s) must clearly identify the population that is to be
studied.

2. Collect the data needed to answer the question(s) posed in (1). Conducting research
on an entire population is often difficult and expensive, so we typically look at a sample.
This step is vital to the statistical process, because if the data are not collected correctly,
the conclusions drawn are meaningless. Do not overlook the importance of appropriate
data collection. We discuss this step in detail in Sections 1.2 through 1.6.

3. Describe the data. Descriptive statistics allow the researcher to obtain an overview of
the data and can help determine the type of statistical methods the researcher should
use. We discuss this step in detail in Chapters 2 through 4.

4. Perform inference. Apply the appropriate techniques to extend the results obtained from
the sample to the population and report a level of reliability of the results. We discuss
techniques for measuring reliability in Chapters 5 through 8 and inferential techniques in
Chapters 9 through 15.
1.1.2 Explain the Process of Statistics (5 of 7)
EXAMPLE Illustrating the Process of Statistics
Many studies evaluate batterer treatment programs, but there are few
experiments designed to compare batterer treatment programs to non-
therapeutic treatments, such as community service. Researchers designed
an experiment in which 376 male criminal court defendants who were
accused of assaulting their intimate female partners were randomly
assigned into either a treatment group or a control group. The subjects in
the treatment group entered a 40-hour batterer treatment program while
the subjects in the control group received 40 hours of community service.
After 6 months, it was reported that 21% of the males in the control group
had further battering incidents, while 10% of the males in the treatment
group had further battering incidents. The researchers concluded that the
treatment was effective in reducing repeat battering offenses.
Source: Bruce G. Taylor et al., “The Effects of a Group Batterer Treatment Program: A Randomized
Experiment in Brooklyn,” Justice Quarterly, Vol. 18, No. 1, March 2001.
1.1.2 Explain the Process of Statistics (6 of 7)
Step 1: Identify the research objective.
To determine whether males accused of battering their intimate female
partners who were assigned to a 40-hour batterer treatment program are
less likely to batter again than those assigned to 40 hours of
community service.
Step 2: Collect the information needed to answer the question.
The researchers randomly divided the subjects into two groups. Group 1
participants received the 40-hour batterer program, while group 2
participants received 40 hours of community service. Six months after the
program ended, the percentage of males who had battered their intimate
female partner was determined.
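Random assignment of this kind might be sketched as follows. The slide does not say how the researchers randomized or how large each group was, so the shuffle, the seed, and the equal split are all assumptions.

```python
import random

# The 376 subjects, numbered 1 to 376 (the numbering is hypothetical).
subject_ids = list(range(1, 377))

rng = random.Random(2001)              # arbitrary seed, for reproducibility
rng.shuffle(subject_ids)               # chance, not the researcher, decides the order

# Assumption: an equal split. The source does not report the group sizes.
half = len(subject_ids) // 2
treatment_group = sorted(subject_ids[:half])   # 40-hour batterer treatment program
control_group = sorted(subject_ids[half:])     # 40 hours of community service

print(len(treatment_group), len(control_group))
```

Because chance alone decides who lands in each group, the two groups should be similar on average before treatment begins.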

1.1.2 Explain the Process of Statistics (7 of 7)
Step 3: Describe the data. Organize and summarize the information.
The demographic characteristics of the subjects in the treatment and
control groups were similar. After six months, 21% of the males
in the control group had further battering incidents, while 10% of the
males in the treatment group had further battering incidents.
Step 4: Draw conclusions from the data.
We extend the results of the 376 males in the study to all males who batter
their intimate female partner. That is, males who batter their female partner
and participate in a batterer treatment program are less likely to batter again.

1.1.3 Distinguish between Qualitative and Quantitative Variables (1 of 3)

Variables are the characteristics of the individuals within the
population.
Key Point: Variables vary. Consider the variable height. If all
individuals had the same height, then obtaining the height of
one individual would be sufficient in knowing the heights of
all individuals. Of course, this is not the case. As
researchers, we wish to identify the factors that influence
variability.

1.1.3 Distinguish between Qualitative and Quantitative Variables (2 of 3)

Qualitative, or categorical, variables allow for classification
of individuals based on some attribute or characteristic.
Quantitative variables provide numerical measures of
individuals. The values of a quantitative variable can be
added or subtracted and provide meaningful results.

1.1.3 Distinguish between Qualitative and Quantitative Variables (3 of 3)
EXAMPLE Distinguishing between Qualitative and Quantitative Variables
Researcher Elisabeth Kvaavik and others studied factors that affect the eating
habits of adults in their mid-thirties. (Source: Kvaavik E, et al. (2005) Psychosocial predictors of
eating habits among adults in their mid-30s. International Journal of Behavioral Nutrition and
Physical Activity (2)9.)

Classify each of the following variables considered in the study as qualitative or
quantitative.
a. Nationality: Qualitative
b. Number of children: Quantitative
c. Household income in the previous year: Quantitative
d. Level of education: Qualitative
e. Daily intake of whole grains (measured in grams per day): Quantitative

1.1.4 Distinguish between Discrete and Continuous Variables (1 of 3)

A discrete variable is a quantitative variable that has either a finite
number of possible values or a countable number of possible values. The
term countable means the values result from counting, such as 0, 1, 2, 3,
and so on. A discrete variable cannot take on every possible value
between any two possible values.
A continuous variable is a quantitative variable that has an infinite
number of possible values that are not countable. It can take on every
possible value between any two possible values and can be measured to any
desired level of accuracy.

1.1.4 Distinguish between Discrete and Continuous Variables (2 of 3)

EXAMPLE Distinguishing between Discrete and Continuous Variables

Researcher Elisabeth Kvaavik and others studied factors that affect the eating
habits of adults in their mid-thirties. (Source: Kvaavik E, et al. (2005) Psychosocial predictors of
eating habits among adults in their mid-30s. International Journal of Behavioral Nutrition and
Physical Activity (2)9.)

Classify each of the following quantitative variables considered in the study as
discrete or continuous.
a. Number of children: Discrete
b. Household income in the previous year: Continuous
c. Daily intake of whole grains (measured in grams per day): Continuous

1.1.4 Distinguish between Discrete and Continuous Variables (3 of 3)

The list of observations a variable assumes is called data.
While gender is a variable, the observations, male or female, are
data.
Qualitative data are observations corresponding to a qualitative
variable.
Quantitative data are observations corresponding to a
quantitative variable.
• Discrete data are observations corresponding to a discrete
variable.
• Continuous data are observations corresponding to a continuous
variable.

1.1.5 Determine the Level of Measurement of a Variable (1 of 3)

A variable is at the nominal level of measurement if the values
of the variable name, label, or categorize. In addition, the naming
scheme does not allow for the values of the variable to be
arranged in a ranked, or specific, order.
A variable is at the ordinal level of measurement if it has the
properties of the nominal level of measurement and the naming
scheme allows for the values of the variable to be arranged in a
ranked, or specific, order.

1.1.5 Determine the Level of Measurement of a Variable (2 of 3)

A variable is at the interval level of measurement if it has the
properties of the ordinal level of measurement and the differences
in the values of the variable have meaning. A value of zero in the
interval level of measurement does not mean the absence of the
quantity. Arithmetic operations such as addition and subtraction
can be performed on values of the variable.
A variable is at the ratio level of measurement if it has the
properties of the interval level of measurement and the ratios of
the values of the variable have meaning. A value of zero in the
ratio level of measurement means the absence of the quantity.
Arithmetic operations such as multiplication and division can be
performed on the values of the variable.
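A short worked example may help separate the interval and ratio levels. Temperature is not discussed on the slide; it is used here as an assumed illustration because Celsius has an arbitrary zero while Kelvin has a true zero.

```python
# Degrees Celsius are interval-level: 0 °C does not mean "no temperature".
c_today, c_yesterday = 20.0, 10.0

difference = c_today - c_yesterday      # 10.0 degrees warmer: meaningful
naive_ratio = c_today / c_yesterday     # 2.0, but today is NOT "twice as hot"

# Kelvin has a true zero, so it is ratio-level and ratios are meaningful.
k_today = c_today + 273.15
k_yesterday = c_yesterday + 273.15
true_ratio = k_today / k_yesterday      # roughly 1.035

print(difference, naive_ratio, round(true_ratio, 3))
```

Subtraction gives a sensible answer on both scales, but division only has meaning once zero represents the absence of the quantity.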

1.1.5 Determine the Level of Measurement of a Variable (3 of 3)

EXAMPLE Determining the Level of Measurement of a Variable

A study was conducted to assess school eating patterns in high schools in the
United States. The study analyzed the impact of vending machines and school
policies on student food consumption. A total of 1088 students in 20 schools were
surveyed. (Source: Neumark-Sztainer D, French SA, Hannan PJ, Story M and Fulkerson JA (2005)
School lunch and snacking patterns among high school students: associations with school food
environment and policies. International Journal of Behavioral Nutrition and Physical Activity 2005, (2)14.)

Determine the level of measurement of the following variables considered in the
study.
a. Number of snack and soft drink vending machines in the school: Ratio
b. Whether or not the school has a closed campus policy during lunch: Nominal
c. Class rank (Freshman, Sophomore, Junior, Senior): Ordinal
d. Number of days per week a student eats school lunch: Ratio

1.2 Observational Studies Versus Designed Experiments
Learning Objectives

1. Distinguish between an observational study and an experiment

2. Explain the various types of observational studies

1.2.1 Distinguish between an observational study and an experiment (1 of 10)

EXAMPLE Cellular Phones and Brain Tumors

Researchers Joachim Schüz and associates wanted “to
investigate cancer risk among Danish cellular phone users who
were followed for up to 21 years.” To do so, they kept track of
420,095 people whose first cellular telephone subscription was
between 1982 and 1995. In 2002, they recorded the number of
people out of the 420,095 people who had a brain tumor and
compared the rate of brain tumors in this group to the rate of brain
tumors in the general population.

They found no significant difference in the rate of brain tumors
between the two groups. The researchers concluded “cellular
telephone use was not associated with increased risk for brain tumors.”
(Source: Joachim Schüz et al. “Cellular Telephone Use and Cancer Risk: Update
of a Nationwide Danish Cohort,” Journal of the National Cancer Institute 98(23):
1707-1713, 2006)

EXAMPLE Cellular Phones and Brain Tumors

Researchers Joseph L. Roti Roti and associates examined whether
chronic exposure to radio frequency (RF) radiation at two common
cell phone signals–835.62 megahertz, a frequency used by
analogue cell phones, and 847.74 megahertz, a frequency used
by digital cell phones–caused brain tumors in rats. The rats in
group 1 were exposed to the analogue cell phone frequency; the
rats in group 2 were exposed to the digital frequency; the rats in
group 3 served as controls and received no radiation. The
exposure was done for 4 hours a day, 5 days a week for 2 years.
The rats in all three groups were treated the same, except for the
RF exposure.

After 505 days of exposure, the researchers reported the following
after analyzing the data. “We found no statistically significant
increases in any tumor type, including brain, liver, lung or kidney,
compared to the control group.” (Source: M. La Regina, E. Moros, W.
Pickard, W. Straube, J. L. Roti Roti. “The Effect of Chronic Exposure to 835.62
MHz FMCW or 847.7 MHz CDMA on the incidence of Spontaneous Tumors in
Rats.” Bioelectromagnetic Society Conference, June 25, 2002.)

In both studies, the goal of the research was to determine if radio
frequencies from cell phones increase the risk of contracting
brain tumors. Whether or not brain cancer was contracted is the
response variable. The level of cell phone usage is the
explanatory variable.
In research, we wish to determine how varying the amount of an
explanatory variable affects the value of a response variable.

An observational study measures the value of the response
variable without attempting to influence the value of either the
response or explanatory variables. That is, in an observational
study, the researcher observes the behavior of the individuals (in
the study) without trying to influence the outcome of the study.
If a researcher assigns the individuals in a study to a certain
group, intentionally changes the value of the explanatory variable,
and then records the value of the response variable for each
group, the researcher is conducting a designed experiment.

EXAMPLE Observational Study or Designed Experiment? Do Flu Shots Benefit Seniors?
Researchers wanted to determine the long-term benefits of the influenza
vaccine on seniors aged 65 years and older. The researchers looked at
records of over 36,000 seniors for 10 years. The seniors were divided
into two groups. Group 1 were seniors who chose to get a flu
vaccination shot, and group 2 were seniors who chose not to get a flu
vaccination shot. After observing the seniors for 10 years, it was
determined that seniors who get flu shots are 27% less likely to be
hospitalized for pneumonia or influenza and 48% less likely to die from
pneumonia or influenza. (Source: Kristin L. Nichol, MD, MPH, MBA, James D.
Nordin, MD, MPH, David B. Nelson, PhD, John P. Mullooly, PhD, Eelko Hak, PhD.
“Effectiveness of Influenza Vaccine in the Community-Dwelling Elderly,” New England
Journal of Medicine 357:1373–1381, 2007)

Based on the results of this study, would you recommend that all
seniors go out and get a flu shot?
The study may have flaws! Namely, confounding.
Confounding in a study occurs when the effects of two or more
explanatory variables are not separated. Therefore, any relation
that may exist between an explanatory variable and the response
variable may be due to some other variable or variables not
accounted for in the study.
A lurking variable is an explanatory variable that was not
considered in a study, but that affects the value of the response
variable in the study. In addition, lurking variables are typically
related to any explanatory variables considered in the study.
1.2.1 Distinguish between an observational study and an experiment (9 of 10)

Some lurking variables in the influenza study: age, health status, or
mobility of the senior.
Even after accounting for potential lurking variables, the
authors of the study concluded that getting an influenza shot
is associated with a lower risk of being hospitalized or dying
from influenza.

Observational studies do not allow a researcher to claim
causation, only association.
A confounding variable is an explanatory variable that was
considered in a study whose effect cannot be distinguished
from a second explanatory variable in the study.

1.2.2 Explain the Various Types of Observational Studies (1 of 6)

Cross-sectional Studies: Observational studies that collect
information about individuals at a specific point in time, or over a
very short period of time.
Case-control Studies: These studies are retrospective, meaning
that they require individuals to look back in time or require the
researcher to look at existing records. In case-control studies,
individuals who have certain characteristics are matched with
those who do not.
Cohort Studies: A cohort study first identifies a group of
individuals to participate in the study (the cohort). The cohort is
then observed over a long period of time. Over this time period,
characteristics about the individuals are recorded. Because the
data are collected over time, cohort studies are prospective.
1.2.2 Explain the Various Types of Observational Studies (2 of 6)

EXAMPLE Observational Study or Designed Experiment?

Determine whether each of the following studies depicts an
observational study or an experiment. If the researchers conducted
an observational study, determine the type of the observational study.
a. Researchers wanted to assess the long-term psychological effects
on children evacuated during World War II. They obtained a sample
of 169 former evacuees and a control group of 43 people who were
children during the war but were not evacuated. The subjects’ mental
states were evaluated using questionnaires. It was determined that
the psychological well-being of the individuals was adversely affected
by evacuation. (Source: Foster D, Davies S, and Steele H (2003) The evacuation of British
children during World War II: a preliminary investigation into the long-term psychological effects.
Aging & Mental Health (7)5.)

Observational study; Case-control

1.2.2 Explain the Various Types of Observational Studies (3 of 6)

EXAMPLE Observational Study or Designed Experiment?

b. Xylitol has proven effective in preventing dental caries (cavities)
when included in food or gum. A total of 75 Peruvian children were
given milk with and without xylitol and were asked to evaluate the
taste of each. Overall, the children preferred the milk flavored with
xylitol. (Source: Castillo JL, et al (2005) Children's acceptance of milk with xylitol or
sorbitol for dental caries prevention. BMC Oral Health (5)6.)

Designed experiment

1.2.2 Explain the Various Types of Observational Studies (4 of 6)

EXAMPLE Observational Study or Designed Experiment?

c. A total of 974 homeless women in the Los Angeles area were
surveyed to determine their level of satisfaction with the healthcare
provided by shelter clinics versus the healthcare provided by
government clinics. The women reported greater quality satisfaction
with the shelter and outreach clinics compared to the government
clinics. (Source: Swanson KA, Andersen R, Gelberg L (2003) Patient satisfaction for
homeless women. Journal of Women’s Health (12)7.)

Observational study; Cross-sectional

1.2.2 Explain the Various Types of Observational Studies (5 of 6)

EXAMPLE Observational Study or Designed Experiment?

d. The Cancer Prevention Study II (CPS-II) is funded and conducted by
the American Cancer Society. Its goal is to examine the relationship
between environmental and lifestyle factors and cancer cases by tracking
approximately 1.2 million men and women. Study participants
completed an initial study questionnaire in 1982 providing information
on a range of lifestyle factors such as diet, alcohol and tobacco use,
occupation, medical history, and family cancer history. These data
have been examined extensively in relation to cancer mortality. Vital
status of study participants is updated biennially. Cause of death has
been documented for over 98% of all deaths that have occurred.
Mortality follow-up of the CPS-II participants is complete through 2002
and is expected to continue for many years. (Source: American Cancer Society)
Observational study; Cohort
1.2.2 Explain the Various Types of Observational Studies (6 of 6)

A census is a list of all individuals in a population along with
certain characteristics of each individual.

1.3 Simple Random Sampling
Learning Objectives

1. Obtain a simple random sample

1.3.1 Obtain a simple random sample (1 of 8)

Random sampling is the process of using chance to select
individuals from a population to be included in the sample.
If convenience is used to obtain a sample, the results of the
survey are meaningless.

1.3.1 Obtain a simple random sample (2 of 8)

A sample of size n from a population of size N is obtained
through simple random sampling if every possible sample
of size n is equally likely to occur. The
sample is then called a simple random sample.

1.3.1 Obtain a simple random sample (3 of 8)

EXAMPLE Illustrating Simple Random Sampling

Suppose a study group consists of 5 students:
Bob, Patricia, Mike, Jan, and Maria
2 of the students must go to the board to demonstrate a
homework problem. List all possible samples of size 2 (without
replacement).
• Bob, Patricia
• Bob, Mike
• Bob, Jan
• Bob, Maria
• Patricia, Mike
• Patricia, Jan
• Patricia, Maria
• Mike, Jan
• Mike, Maria
• Jan, Maria
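The same list can be generated mechanically. The sketch below uses Python's standard itertools.combinations; the variable names are illustrative.

```python
from itertools import combinations

students = ["Bob", "Patricia", "Mike", "Jan", "Maria"]

# Every possible sample of size 2, chosen without replacement.
samples = list(combinations(students, 2))

for pair in samples:
    print(", ".join(pair))

print(len(samples))   # 10 possible samples
```

Under simple random sampling, each of these 10 samples has the same chance, 1 in 10, of being the one selected.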

1.3.1 Obtain a simple random sample (5 of 8)

Steps for Obtaining a Simple Random Sample

1) Obtain a frame that lists all the individuals in the population of
interest. Number the individuals in the frame 1 to N.
2) Use a random number table, graphing calculator, or statistical
software to randomly generate n numbers where n is the
desired sample size.
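The two steps above can be sketched with Python's standard random module standing in for the random number table or calculator. The function name simple_random_sample and the five-person frame are hypothetical.

```python
import random

def simple_random_sample(frame, n, rng=random):
    """Draw n individuals from the frame without replacement."""
    N = len(frame)
    # Step 1: number the individuals in the frame 1 to N.
    numbered = {i + 1: individual for i, individual in enumerate(frame)}
    # Step 2: randomly generate n distinct numbers between 1 and N.
    chosen_numbers = rng.sample(range(1, N + 1), n)
    return [numbered[i] for i in chosen_numbers]

frame = ["Bob", "Patricia", "Mike", "Jan", "Maria"]   # illustrative frame
sample = simple_random_sample(frame, 2)
print(sample)
```

random.sample draws without replacement, so no individual can be selected twice.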

1.3.1 Obtain a simple random sample (6 of 8)

EXAMPLE Obtaining a Simple Random Sample

1.3.1 Obtain a simple random sample (7 of 8)

EXAMPLE Obtaining a Simple Random Sample

The 112th Congress of the United States had 435 members in the
House of Representatives. Explain how to conduct a simple
random sample of 5 members to attend a Presidential luncheon.
Then obtain the sample.
Step 2 Randomly select five numbers using a random number
generator. First, set the seed. The seed is an initial point
for the generator to start creating random numbers—like
selecting the initial point in the table of random numbers.
The seed can be any nonzero number. Then generate the
random numbers.
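As a rough sketch of what setting a seed does, the snippet below uses Python's generator rather than the calculator or software the slide has in mind, so the numbers it produces will differ from the textbook's. The seed value 34 is arbitrary.

```python
import random

N, n = 435, 5      # 435 House members, sample of 5
seed = 34          # arbitrary choice of seed

# The same seed gives the same starting point, and therefore the same sample.
first = random.Random(seed).sample(range(1, N + 1), n)
again = random.Random(seed).sample(range(1, N + 1), n)

print(first)
print(first == again)   # True: the draw is reproducible
```

Each number in the result identifies the member holding that position in the numbered frame.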

1.3.1 Obtain a simple random sample (8 of 8)

EXAMPLE Obtaining a Simple Random Sample

1.4 Other Effective Sampling Methods
Learning Objectives

1. Obtain a stratified sample

2. Obtain a systematic sample
3. Obtain a cluster sample

1.4.1 Obtain a Stratified Sample (1 of 2)

A stratified sample is obtained by separating the population into
nonoverlapping groups called strata and then obtaining a simple
random sample from each stratum. The individuals within each
stratum should be homogeneous (or similar) in some way.

1.4.1 Obtain a Stratified Sample (2 of 2)

EXAMPLE Obtaining a Stratified Sample

In 2008, the United States Senate had 47 Republicans, 51
Democrats, and 2 Independents. The president wants to have a
luncheon with 4 Republicans, 4 Democrats and 1 Other. Obtain a
stratified sample in order to select members who will attend the
luncheon.
To obtain the stratified sample, conduct a simple random sample
within each group. That is, obtain a simple random sample of 4
Republicans (from the 47), a simple random sample of 4
Democrats (from the 51), and a simple random sample of 1 Other
(from the 2 Independents). Be sure to use a different seed for each stratum.
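One way this stratified draw might look in Python is sketched below. The member labels (R1, D1, I1, and so on) are placeholders because the slide lists no names, and the seed values are arbitrary.

```python
import random

# Strata: placeholder labels stand in for the actual senators.
strata = {
    "Republican": [f"R{i}" for i in range(1, 48)],   # 47 members
    "Democrat":   [f"D{i}" for i in range(1, 52)],   # 51 members
    "Other":      [f"I{i}" for i in range(1, 3)],    # 2 members
}
sample_sizes = {"Republican": 4, "Democrat": 4, "Other": 1}
seeds = {"Republican": 11, "Democrat": 22, "Other": 33}   # a different seed per stratum

# A simple random sample is drawn separately within each stratum.
stratified_sample = {
    party: random.Random(seeds[party]).sample(members, sample_sizes[party])
    for party, members in strata.items()
}

for party, chosen in stratified_sample.items():
    print(party, chosen)
```

Each stratum is sampled only from its own members, which guarantees the 4, 4, and 1 split.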

1.4.2 Obtain a Systematic Sample (1 of 3)

A systematic sample is obtained by selecting every kth
individual from the population. The first individual selected
corresponds to a random number between 1 and k.

1.4.2 Obtain a Systematic Sample (2 of 3)

EXAMPLE Obtaining a Systematic Sample

A quality control engineer wants to obtain a systematic sample of
25 bottles coming off a filling machine to verify the machine is
working properly. Design a sampling technique that can be used
to obtain a sample of 25 bottles.

1.4.2 Obtain a Systematic Sample (3 of 3)

Step 4: Randomly select a number between 1 and k. Call this
number p.
Step 5: The sample will consist of the following individuals:
p, p + k, p + 2k,…, p + (n − 1)k
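Steps 4 and 5 can be sketched as follows. The earlier steps that determine k are not shown here, so the value k = 8 in the usage line is purely an assumption; n = 25 matches the bottle example.

```python
import random

def systematic_sample(k, n, rng=random):
    """Return the positions p, p + k, p + 2k, ..., p + (n - 1)k."""
    p = rng.randint(1, k)                    # Step 4: random start between 1 and k
    return [p + i * k for i in range(n)]     # Step 5: every kth individual after p

# Assumed k = 8 for illustration; n = 25 bottles.
positions = systematic_sample(k=8, n=25)
print(positions)
```

random.randint includes both endpoints, so p can be any whole number from 1 to k.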

1.4.3 Obtain a Cluster Sample (1 of 7)

A cluster sample is obtained by selecting all individuals
within a randomly selected collection or group of individuals.

1.4.3 Obtain a Cluster Sample (2 of 7)

EXAMPLE Obtaining a Cluster Sample

A school administrator wants to obtain a sample of students in
order to conduct a survey.
She randomly selects 10 classes and administers the survey to all
the students in those classes.
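A minimal sketch of this cluster sample follows. The school's size (40 classes of 25 students) and the seed are assumptions, since the slide gives only the number of classes selected.

```python
import random

# Hypothetical school: 40 classes with 25 students each.
classes = {
    f"class_{c}": [f"student_{c}_{s}" for s in range(1, 26)]
    for c in range(1, 41)
}

rng = random.Random(7)                          # arbitrary seed
selected = rng.sample(sorted(classes), 10)      # randomly select 10 clusters (classes)

# Every student in each selected class is surveyed.
surveyed = [student for c in selected for student in classes[c]]

print(selected)
print(len(surveyed))
```

The randomness is applied to whole classes, not to individual students.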

1.4.3 Obtain a Cluster Sample (3 of 7)

1.4.3 Obtain a Cluster Sample (4 of 7)

Stratified and cluster samples are different. In a stratified
sample, we divide the population into two or more
homogeneous groups. Then we obtain a simple random
sample from each group. In a cluster sample, we divide the
population into groups, obtain a simple random sample of
some of the groups, and survey all individuals in the
selected groups.

1.4.3 Obtain a Cluster Sample (5 of 7)

A convenience sample is one in which the individuals in the
sample are easily obtained.

Any studies that use this type of sampling generally have
results that are suspect. Results should be looked upon with
extreme skepticism.

1.4 Other Effective Sampling Methods
1.4.3 Obtain a Cluster Sample (6 of 7)

EXAMPLE Multistage Sampling

In practice, most large-scale surveys obtain samples using a
combination of the techniques just presented.
As an example of multistage sampling, consider Nielsen Media
Research. Nielsen randomly selects households and monitors the
television programs these households are watching through a
People Meter. The meter is an electronic box placed on each TV
within the household. The People Meter measures what program
is being watched and who is watching it. Nielsen selects the
households with the use of a two-stage sampling process.

1.4 Other Effective Sampling Methods
1.4.3 Obtain a Cluster Sample (7 of 7)

EXAMPLE Multistage Sampling

Stage 1 Using U.S. Census data, Nielsen divides the country into
geographic areas (strata). The strata are typically city
blocks in urban areas and geographic regions in rural
areas. About 6000 strata are randomly selected.
Stage 2 Nielsen sends representatives to the selected strata and
lists the households within the strata. The households are
then randomly selected through a simple random sample.
Nielsen sells the information obtained to television stations and
companies. These results are used to help determine prices for
commercials.
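
A two-stage process of this general shape might be sketched as follows. This is not Nielsen's actual procedure: the area names, household lists, sample sizes, and the function two_stage_sample are all invented for illustration.

```python
import random

def two_stage_sample(areas, n_areas, households_per_area, rng):
    """Stage 1: randomly select areas. Stage 2: SRS of households in each."""
    selected_areas = rng.sample(sorted(areas), n_areas)
    result = {}
    for area in selected_areas:
        households = areas[area]
        # Never ask for more households than the area contains.
        size = min(households_per_area, len(households))
        result[area] = rng.sample(households, size)
    return result

rng = random.Random(2018)
# Hypothetical frame: 20 areas, each listing 30 households.
areas = {f"area_{a:02d}": [f"hh_{a:02d}_{h:02d}" for h in range(30)]
         for a in range(20)}

panel = two_stage_sample(areas, n_areas=5, households_per_area=4, rng=rng)
for area, households in panel.items():
    print(area, households)
```

Only the households in the selected areas need to be listed, which is what makes the second stage practical in the field.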

1.5 Bias in Sampling
Learning Objectives
1. Explain the sources of bias in sampling

1.5 Bias in Sampling
1.5.1 Explain the sources of bias in sampling (1 of 6)

If the results of the sample are not representative of the
population, then the sample has bias.
Three Sources of bias
1. Sampling bias
2. Nonresponse bias
3. Response bias

1.5 Bias in Sampling
1.5.1 Explain the sources of bias in sampling (2 of 6)

Sampling bias means that the technique used to obtain the
individuals to be in the sample tends to favor one part of the
population over another.
Undercoverage results in sampling bias. Undercoverage
occurs when the proportion of one segment of the population
is lower in a sample than it is in the population.
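
One way to look for undercoverage is to compare each segment's share of the sample with its share of the population. The sketch below uses made-up segment labels and counts, and the helper names are hypothetical. Note that in any random sample some segment will usually fall slightly short by chance, so a small gap is not by itself evidence of a biased technique.

```python
from collections import Counter

def segment_proportions(labels):
    """Share of each segment among the given labels."""
    counts = Counter(labels)
    total = len(labels)
    return {segment: counts[segment] / total for segment in counts}

def undercovered(population_labels, sample_labels):
    """Segments whose sample proportion is below their population proportion."""
    pop = segment_proportions(population_labels)
    samp = segment_proportions(sample_labels)
    return sorted(seg for seg in pop if samp.get(seg, 0.0) < pop[seg])

# Hypothetical data: rural residents are 40% of the population
# but only 10% of the sample.
population = ["urban"] * 60 + ["rural"] * 40
sample = ["urban"] * 18 + ["rural"] * 2

print(undercovered(population, sample))
```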

1.5 Bias in Sampling
1.5.1 Explain the sources of bias in sampling (3 of 6)

Nonresponse bias exists when individuals selected to be in
the sample who do not respond to the survey have different
opinions from those who do.
Nonresponse can be reduced through the use of callbacks
or rewards/incentives.

1.5 Bias in Sampling
1.5.1 Explain the sources of bias in sampling (4 of 6)

Response bias exists when the answers on a survey do not
reflect the true feelings of the respondent.
Types of Response Bias
1. Interviewer error
2. Misrepresented answers
3. Wording of questions
4. Order of questions or words

1.5 Bias in Sampling
1.5.1 Explain the sources of bias in sampling (5 of 6)

Data-entry Error
Although not technically a result of response bias, data-entry error
will lead to results that are not representative of the population.
Once data are collected, the results may need to be entered into a
computer, which could result in input errors. Or, a respondent may
make a data entry error. For example, 39 may be entered as 93. It
is imperative that data be checked for accuracy. In this text, we
present some suggestions for checking for data error.
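
One possible accuracy check, offered here as an assumption rather than as the text's own suggestion, is double entry: the same values are keyed in twice and the two passes are compared. The sketch below also flags mismatches that look like swapped adjacent digits, such as 39 entered as 93. The data and function names are hypothetical.

```python
def is_transposition(a, b):
    """True if b equals a with two adjacent digits swapped (e.g. 39 vs 93)."""
    s, t = str(a), str(b)
    if len(s) != len(t) or s == t:
        return False
    diffs = [i for i in range(len(s)) if s[i] != t[i]]
    return (len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and s[diffs[0]] == t[diffs[1]]
            and s[diffs[1]] == t[diffs[0]])

def compare_entries(first_pass, second_pass):
    """List (index, first, second, looks_like_transposition) for mismatches."""
    return [(i, a, b, is_transposition(a, b))
            for i, (a, b) in enumerate(zip(first_pass, second_pass))
            if a != b]

# Hypothetical ages keyed in twice by different people.
first_pass = [39, 42, 57, 61]
second_pass = [93, 42, 57, 16]

mismatches = compare_entries(first_pass, second_pass)
for row in mismatches:
    print(row)
```

A mismatch only shows that at least one pass is wrong; the original form still has to be consulted to decide which value is correct.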

1.5 Bias in Sampling
1.5.1 Explain the sources of bias in sampling (6 of 6)

Nonsampling errors are errors that result from sampling
bias, nonresponse bias, response bias, or data-entry error.
Such errors could also be present in a complete census of
the population.
Sampling error is an error that results from using a sample
to estimate information about a population. This type of error
occurs because a sample gives incomplete information
about a population.
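
Sampling error can be illustrated with a small simulation. Everything here is an assumption made for the sketch: a simulated population of 5,000 heights, the sample sizes, and the helper names. The point is only that a sample mean generally differs from the population mean, and that a complete census has no sampling error.

```python
import random
import statistics

rng = random.Random(42)
# Hypothetical population: 5,000 simulated heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(5_000)]
mu = statistics.mean(population)

def sampling_error(sample_size):
    """Sample mean minus population mean for one simple random sample."""
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample) - mu

def mean_abs_error(sample_size, reps=100):
    """Average size of the sampling error over repeated samples."""
    errors = [abs(sampling_error(sample_size)) for _ in range(reps)]
    return statistics.mean(errors)

small = mean_abs_error(10)
large = mean_abs_error(1_000)
census_error = statistics.mean(population) - mu  # a census uses everyone

print(f"typical error, n = 10:   {small:.3f}")
print(f"typical error, n = 1000: {large:.3f}")
print(f"census error:            {census_error:.3f}")
```

Nonsampling errors are not captured by this simulation; a census with poorly worded questions or data-entry mistakes would still be wrong.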

1.6 The Design of Experiments
Learning Objectives
1. Describe the characteristics of an experiment
2. Explain the steps in designing an experiment
3. Explain the completely randomized design
4. Explain the matched-pairs design

An experiment is a controlled study conducted to determine the
effect that varying one or more explanatory variables or factors has
on a response variable. Any combination of the values of the
factors is called a treatment.
The experimental unit (or subject) is a person, object or some
other well-defined item upon which a treatment is applied.
A control group serves as a baseline treatment that can be used
to compare to other treatments.
A placebo is an innocuous medication, such as a sugar tablet,
that looks, tastes, and smells like the experimental medication.

Blinding refers to nondisclosure of the treatment an experimental
unit is receiving.
A single-blind experiment is one in which the experimental unit
(or subject) does not know which treatment he or she is receiving.
A double-blind experiment is one in which neither the
experimental unit nor the researcher in contact with the
experimental unit knows which treatment the experimental unit is
receiving.

EXAMPLE The Characteristics of an Experiment

The English Department of a community college is considering adopting
an online version of the freshman English course. To compare the new
online course to the traditional course, an English Department faculty
member randomly splits a section of her course. Half of the students
receive the traditional course and the other half is given an online
version. At the end of the semester, both groups will be given a test to
determine which performed better.
(a) Who are the experimental units?
The students in the class

(b) What is the population for which this study applies?

The students in the class

(c) What are the treatments?

Traditional vs. online instruction

(d) What is the response variable?

Exam score

(e) Why can’t this experiment be conducted with blinding?

Both the students and the instructor know which treatment each
student is receiving.

To design an experiment means to describe the overall plan
in conducting the experiment.

Steps in Conducting an Experiment

Step 1: Identify the problem to be solved.
• Should be explicit
• Should provide the experimenter direction
• Should identify the response variable and the
population to be studied.
• Often referred to as the claim.

Steps in Conducting an Experiment

Step 2: Determine the factors that affect the response variable.
• Once the factors are identified, it must be determined
which factors are to be fixed at some predetermined
level (the control), which factors will be manipulated
and which factors will be uncontrolled.

Steps in Conducting an Experiment

Step 3: Determine the number of experimental units.
• As a general rule, choose as many experimental units
as time and money allow. Techniques exist for
determining sample size, provided certain information
is available.

Steps in Conducting an Experiment

Step 4: Determine the level of the predictor variables
1. Control: There are two ways to control the factors.
a) Set the level of a factor at one value throughout the
experiment (if you are not interested in its effect on the
response variable).
b) Set the level of a factor at various levels (if you are
interested in its effect on the response variable). The
combinations of the levels of all varied factors constitute
the treatments in the experiment.

Steps in Conducting an Experiment

Step 4: Determine the level of the predictor variables
2. Randomize: Randomize the experimental units to
various treatment groups so that the effects of variables
whose level cannot be controlled are minimized. The idea
is that randomization “averages out” the effect of
uncontrolled predictor variables.

Steps in Conducting an Experiment

Step 5: Conduct the Experiment
a) Replication occurs when each treatment is applied to
more than one experimental unit. This helps to assure
that the effect of a treatment is not due to some
characteristic of a single experimental unit. It is
recommended that each treatment group have the same
number of experimental units.

Steps in Conducting an Experiment

Step 5: Conduct the Experiment
b) Collect and process the data by measuring the value of
the response variable for each replication. Any difference
in the value of the response variable can then be attributed to
differences in the level of the treatment.

Steps in Conducting an Experiment

Step 6: Test the claim.
• This is the subject of inferential statistics.
• Inferential statistics is a process in which generalizations
about a population are made on the basis of results
obtained from a sample. Provide a statement regarding
the level of confidence in the generalization. Methods of
inferential statistics are presented later in the text.

A completely randomized design is one in which each
experimental unit is randomly assigned to a treatment.

EXAMPLE Designing an Experiment

The octane of fuel is a measure of its resistance to detonation, with
a higher number indicating higher resistance. An engineer wants
to know whether the level of octane in gasoline affects the gas
mileage of an automobile. Assist the engineer in designing an
experiment.
Step 1: The response variable is miles per gallon.
Step 2: Factors that affect miles per gallon:
Engine size, outside temperature, driving style, driving
conditions, characteristics of car

Step 3: We will use 12 cars all of the same model and year.
Step 4: We list the variables and their level.
• Octane level - manipulated at 3 levels. Treatment A: 87 octane,
Treatment B: 89 octane, Treatment C: 92 octane
• Engine size - fixed
• Temperature - uncontrolled, but will be the same for all 12 cars.
• Driving style/conditions - all 12 cars will be driven under the
same conditions on a closed track - fixed.
• Other characteristics of car - all 12 cars will be the same model
year; however, there is probably variation from car to car. To
account for this, we randomly assign the cars to the octane
levels.
1.6 The Design of Experiments
1.6.3 Explain the Completely Randomized Design (5 of 6)

Step 5: Randomly assign 4 cars to the 87 octane, 4 cars to the 89
octane, and 4 cars to the 92 octane. Give each car 3
gallons of gasoline. Drive the cars until they run out of
gas. Compute the miles per gallon.
Step 6: Determine whether any differences exist in miles per
gallon.
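
The random assignment in Step 5 could be carried out as in the sketch below. The car labels, the seed, and the function names are hypothetical; the 12 cars, the three octane levels, and the 3 gallons per car come from the example.

```python
import random

def completely_randomized(units, treatments, rng):
    """Randomly assign units to treatments in equal-sized groups."""
    if len(units) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    shuffled = list(units)
    rng.shuffle(shuffled)  # a random order gives a random assignment
    size = len(units) // len(treatments)
    return {t: shuffled[i * size:(i + 1) * size]
            for i, t in enumerate(treatments)}

def miles_per_gallon(miles_driven, gallons=3):
    """Response variable: distance covered per gallon of fuel used."""
    return miles_driven / gallons

cars = [f"car_{i}" for i in range(1, 13)]
groups = completely_randomized(cars, [87, 89, 92], random.Random(7))

for octane, assigned in groups.items():
    print(octane, assigned)
```

Equal group sizes give each treatment four replications, in line with the recommendation that treatment groups have the same number of experimental units.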

Completely Randomized Design

A matched-pairs design is an experimental design in which
the experimental units are paired up. The pairs are matched
up so that they are somehow related (that is, the same
person before and after a treatment, twins, husband and
wife, same geographical location, and so on). There are only
two levels of treatment in a matched-pairs design.

EXAMPLE A Matched-Pairs Design

Xylitol has proven effective in preventing dental caries (cavities)
when included in food or gum. A total of 75 Peruvian children were
given milk with and without Xylitol and were asked to evaluate the
taste of each. The researchers measured the children’s ratings of
the two types of milk. (Source: Castillo JL, et al (2005) Children's acceptance of
milk with Xylitol or Sorbitol for dental caries prevention. BMC Oral Health (5)6.)

a) What is the response variable in this experiment? Rating

b) Think of some of the factors in the study. Which are
controlled? Which factor is manipulated?
Age and gender of the children are controlled; whether the milk
contains Xylitol is the factor that was manipulated.

c) What are the treatments? How many treatments are there?

Milk with Xylitol and milk without Xylitol; 2

d) What type of experimental design is this?

Matched-pairs design

e) Identify the experimental units. 75 Peruvian children

f) Why would it be a good idea to randomly assign whether the
child drinks the milk with Xylitol first or second?
Remove any effect due to order in which milk is drunk.

g) Do you think it would be a good idea to double-blind this
experiment? Yes!
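
The order randomization discussed in part (f) could be done as in the sketch below. The child labels, the seed, and the function name are hypothetical; only the 75 children and the two types of milk come from the example.

```python
import random

def randomize_order(subjects, treatments, rng):
    """For each subject, put the two treatments in a random order."""
    order = {}
    for subject in subjects:
        sequence = list(treatments)
        rng.shuffle(sequence)  # each ordering is equally likely
        order[subject] = tuple(sequence)
    return order

children = [f"child_{i}" for i in range(1, 76)]
treatments = ("with Xylitol", "without Xylitol")
order = randomize_order(children, treatments, random.Random(3))

xylitol_first = sum(1 for seq in order.values() if seq[0] == "with Xylitol")
print(f"{xylitol_first} of {len(children)} children taste the Xylitol milk first")
```

Every child still receives both treatments, so the design remains matched-pairs; only the order in which the two are tasted varies from child to child.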
