Unit 03 - Producing Data - 4 Per Page

The document outlines key concepts for collecting data, including: 1. It distinguishes between association and causation, noting that association does not necessarily imply causation due to potential lurking/confounding variables. 2. It presents a hierarchy of data sources from anecdotal to experiments, with experiments seen as most reliable since they can control for lurking variables through randomization. 3. It discusses observational studies like sample surveys versus controlled experiments, noting experiments can impose treatments to directly observe effects while surveys only collect information.

Uploaded by

Kase1

7/2/2012

Unit 3
Collecting Data
IPS 2.6, 3.1 - 3.3

Unit 3 Outline: Collecting Data

• Association vs. Causation
• Hierarchy of data sources
• Design of experiments
  – Principles of experimental design
  – Control, Randomization, Replication
  – Blocking
  – Placebos
• Sample surveys
  – Simple random sampling
  – Stratified and multistage sampling
  – Sampling Bias
• Introduction to statistical inference

Association versus causation

A study shows that there is a positive correlation between hospital size (number of beds, X) and median number of days, Y, patients stay in hospital. Does this mean that you can shorten a hospital stay by choosing a small hospital?

Common relationships between X and Y

(a) Association between X and Y (partially) due to “X causes Y”
(b) Association between X and Y (partially) explained by a “lurking variable” (Z)
(c) Association between X and Y is mixed up with and cannot be distinguished from the effect of an additional variable (Z)
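Case (b) can be illustrated with a small simulation. This is only a sketch with made-up numbers: the names `correlation` and `simulate_lurking_variable` are hypothetical, and the model (Z plus independent noise) is an assumption chosen so that X has no direct effect on Y at all.

```python
import random
import statistics


def correlation(xs, ys):
    """Pearson correlation coefficient r of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_lurking_variable(n=5000, seed=1):
    """Z drives both X and Y; X never enters the formula for Y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + rng.gauss(0, 1) for zi in z]  # X = Z + noise
    y = [zi + rng.gauss(0, 1) for zi in z]  # Y = Z + noise (no X term)
    return correlation(x, y)


r = simulate_lurking_variable()
print(f"r between X and Y = {r:.2f}")
```

Under these assumptions X and Y show a clear positive correlation (around 0.5 in theory) even though changing X would not change Y, which is the situation the hospital-size question is probing.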

Examples: Association versus causation

• Hospital size and length of stay example
• Polio vaccine and the prevention of polio
• Internet access and cell phone use in 190 countries
• Mother’s body mass index (BMI) and daughter’s body mass index (BMI) (r = 0.51, r² = 0.26)
• A student’s SAT score and first year college GPA
• This classification concept has only limited usefulness. Most important: “causation” versus “other association”. With causation, if we change X we then also change Y

Examples: Association versus causation

• People with a healthy diet (low in fat) have a lower incidence of cancer (nearly all types of cancer)
  – Low fat diet may help prevent some types of cancer
  – People with low fat diet may have other important healthy lifestyles (e.g. low alcohol intake, no tobacco habit) (confounding variables)
• Kids who eat a good breakfast do better in school
  – What are the potential confounders?

In the absence of a controlled experiment, the relationship between X and Y may be confounded by their association with a third variable Z (a confounder)

Establishing causation

• The best (and only?) method of clearly establishing causation is to conduct a carefully-designed randomized experiment that changes X, the explanatory variable, and observes the effect on Y
• Any potential effect of lurking variables is thus controlled (balanced) by randomization

Hierarchy of Data

• Data can be produced in many ways:
  1. Anecdotal information
  2. Available data
  3. Observational studies
  4. Controlled experiments
  5. Randomized controlled experiments
• Major differences in quality of information produced and ultimately the reliability of conclusions that can be drawn (lower on list is better)
• Randomized controlled experiments provide by far the most reliable information

Anecdotal evidence

• Anecdotal evidence is based on haphazardly selected individual cases, that often come to our attention because they are striking (probably not representative)
• Example: Politicians often cite the case of a single individual to invoke a public response consistent with the politicians’ desire (a sample of size n = 1)
• “Ask for averages, not testimonials” – Joe Hallinan

Available data

• Available data are data that were produced in the past for some other purpose but may help answer a present question
• Many use available data because producing new data is expensive (nearly always the most costly part of research)
• Data reported by government agencies is often aggregate data – recall the ecological fallacy problem
• The manner in which available data were collected may not be consistent with good study design to answer the present question

Observational studies (e.g. sample surveys) versus experiments

• An observational study collects information from individuals making no attempt to influence the responses
• An experiment imposes an intervention (e.g. treatment) on individuals in order to observe their responses
• Sample surveys are a type of observational study
  Example: Opinion polls often survey 1,000-1,500 people
• Clinical trials are a type of experiment
  Example: A comparison of different drugs for women with breast cancer, often with just a few hundred people

Experiments: terminology and concepts

• Experimental units (e.g., individual subjects) are the objects of the study
• A specific experimental condition (intervention) is called a treatment
• An experiment imposes some “treatment” on individuals in order to observe their responses
• An experiment allows us to control lurking variables
• In principle, randomized controlled experiments are the “gold-standard” of evidence to support “causation”
• Experiments may not always be ethical or practical

Designing controlled experiments

• Sir Ronald Fisher (the “father of statistics”) was sent to the Rothamsted Agricultural Station (UK) in 1919 to evaluate the success of various fertilizer treatments
• He found the data from years of experiments to be basically worthless because of poor experimental design
• Fertilizer had been applied to a field one year and not the next year in order to compare the yield of grain produced in the two years - but it may have rained more in one year, the seeds may have been different, etc., etc.
• Too many factors affecting the results were “uncontrolled”

Example of a controlled experiment: Physicians’ Health Study


Physicians’ Health Study
• A key study in preventive medicine
• Organized by C. Hennekens et al. at Harvard Medical
School - started in 1982
• Subjects were 22,071 male physicians
• Treatments involved 2 factors
• Aspirin versus placebo
• Beta carotene versus placebo
• Response: heart disease and cancer
• Subjects were randomized to 1 of 4 treatment groups
• Major result: daily low-dose aspirin decreased risk of first myocardial infarction by 44%

Principles of experimental design

• 1st Control – directly compare two or more treatments – helps control effects of lurking variables
• 2nd Randomization – use randomization to assign individuals (experimental units) to treatments
• 3rd Replication – replicate each treatment on many individuals to reduce effect of chance variation (Also called repetition)

Control group

• Control - 1st principle of experimental design
• In a “controlled experiment”, two or more groups of individuals (subjects, experimental units) are compared
• Treatment group: subjects receive a specific intervention
• Control group (aka, comparison group): subjects do not receive the specific intervention and are compared to the treatment group
• Controlled comparisons allow us to eliminate (or reduce) effects of selection of subjects, placebo effects and potential biases (systematic favoring of a certain outcome)
• If studies are uncontrolled, results may be meaningless

Earliest controlled medical experiment - 1747

• James Lind studied six strategies for treating scurvy on HMS Salisbury attempting to control for other factors
• “I took 12 patients in the scurvy on board the Salisbury at sea. The cases were as similar as I could have them… They lay together in one place and had one diet common to them all… To two of them was given a quart of cider a day, to two an elixir of vitriol, to two vinegar, to two sea water, to two oranges and lemons, and to the remaining two an electuary recommended by a ship’s surgeon. The most sudden and visible good effects were perceived from the use of oranges and lemons, one of those who had taken them being at the end of six days fit for duty… The other was appointed nurse for the sick.”

What’s a control group?

Example: gastric freezing studies

• Treatment for ulcers: patients swallow a deflated balloon, then refrigerated liquid is pumped into the balloon for 1 hour
• A single “arm” treatment study: gastric freezing reduced acid production and relieved ulcer pain
• Result was not considered reliable (uncontrolled study)
• A subsequent controlled study was conducted
  – 34% of 82 subjects receiving gastric freezing improved
  – 38% of 78 subjects in the control group (liquid in balloon at body temperature) improved

Random Assignment of treatments

• The 2nd principle of experimental design concerns assignment of subjects to treatments
• We want the treatment groups to be alike as much as possible in every way (except for the treatment) for a fair comparison
• We could do it by matching (e.g. by subject’s age, sex, smoking), but matching is not enough (unknown lurking variables cannot be matched)
• Instead, use chance to decide - randomization
• Assignment of treatments using randomization helps ensure balance of known and unknown factors in the treatment groups

Example: A comparison of two diets for rats

• An experiment is designed to compare a new diet (“Wonder Rat Feed”) to standard diet (control)
• Subjects: 30 rats
• Treatments: new diet and standard diet
• Response: weight gain after 28 days

Table B (IPS)

A table of random digits (Table B in IPS) - use for designing randomized studies

How to randomize to 2 treatments

• Usually accomplished by software; can use tables of random digits
• Example: diet study for 30 rats comparing T1 vs. T2
• Pick a spot to begin in Table B (e.g. line 102, second block – 47150 – for no particular reason)
• For subject 1, assign T1 if digit in range 0 - 4, assign T2 if digit in range 5 - 9
• Use next digit for next subject, etc.
• Assignments (using starting point above) will be: T1, T2, T1, T2, T1, . . .
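The digit rule for two treatments can be sketched in a few lines of Python. The function names `assign_by_digits` and `randomize_equal_groups` are hypothetical, and the second function is an assumed software alternative (shuffle, then split in half), not a procedure taken from the slides.

```python
import random


def assign_by_digits(digits, first="T1", second="T2"):
    """Assign one treatment per random digit: 0-4 -> first, 5-9 -> second."""
    return [first if int(d) <= 4 else second for d in digits if d.isdigit()]


def randomize_equal_groups(subjects, seed=None):
    """Software alternative: shuffle the subjects and split them into two halves."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"T1": sorted(shuffled[:half]), "T2": sorted(shuffled[half:])}


# The five digits quoted for Table B, line 102, second block.
assignments = assign_by_digits("47150")
print(assignments)  # ['T1', 'T2', 'T1', 'T2', 'T1']

# 30 rats labeled 1..30; the seed is arbitrary and only makes the run repeatable.
groups = randomize_equal_groups(range(1, 31), seed=0)
print(len(groups["T1"]), len(groups["T2"]))  # 15 15
```

One design point: assigning digit by digit does not guarantee 15 rats per diet, whereas shuffling and splitting always gives equal group sizes.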

How to randomly select individuals from a group

• Label each of the N individuals in the group from 1 to N
• If N is 2 digits in length (for example), select a sequence of 2 digits from Table B, and select individuals by number
• Example: Select 5 individuals at random from a group of 50
• Start at row 103 – for no particular reason
• Select individuals 45, 46, 17, 9 and 32 (skip numbers outside 01-50 and ignore duplicates)

45 46 77 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56

How not to randomize
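The selection rule for row 103 can be checked with a short sketch. `select_from_digits` is a hypothetical helper name; the digit pairs are the ones quoted from the table.

```python
def select_from_digits(two_digit_labels, population_size, sample_size):
    """Take labels in order, skipping values outside 1..population_size and repeats."""
    chosen = []
    for label in two_digit_labels:
        number = int(label)
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen


row_103 = "45 46 77 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56".split()
selected = select_from_digits(row_103, population_size=50, sample_size=5)
print(selected)  # [45, 46, 17, 9, 32]
```

The pairs 77, 55, 80, 00 and 95 are passed over because no individual carries those labels when N = 50.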

Replication

• Randomization produces treatment groups that are similar in all respects except treatment received
• Therefore differences in the response must be due to either the treatments or the play of chance
• Replication of treatments on many subjects (large sample size) reduces the role of chance variation
• Replication gives the experiment the power to detect differences between the treatments
• A treatment difference so large it would rarely occur by chance is said to be “statistically significant”

Improving the precision of a design

• Not all randomized controlled experiments are equal
• Example: Compare two different synthetic material soles for the amount of wear on 10 boys’ shoes
• A (new material) and B (standard material)
• A possible design: a controlled randomized design
  [Diagram: randomization (R) splits the boys into A shoes (new) and B shoes (standard)]
• Assign each boy to receive shoes with either A or B
• Compare the average wear after 2 months: boys assigned to A shoes vs. boys assigned to B shoes
• Weakness: large variation in levels of wear among the boys (indiv. to indiv.) may cloud the comparison

Improving the precision of a design

• This design can be improved
• Have each boy wear a special pair of shoes: sole of one shoe made with A and sole of the other made with B
• Decision whether the left or right shoe was made with A or B is determined by the flip of a coin (random)
• Compare average wear after 2 months (ave. A’s vs. ave. B’s)
• Design is better (less impact of variation in wear among boys), but still not optimal
• Haven’t yet taken full advantage of left vs. right foot pairing

Improving the precision of a design

• One more design modification – very beneficial
• Same randomization – left vs. right shoe, A or B
• Now instead of comparing the average of the A’s vs. the average of the B’s - consider the average of the pair-wise differences (A minus B) for each boy
• (Instead of ave. of 10 A values vs. ave. of 10 B values, look at the ave. of 10 [A minus B] diff's)
• This uses each boy as their own control – dramatically reducing the variation
• Called a matched-pairs design
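A rough simulation can show why the matched-pairs design is more precise. Everything numeric here is an assumption made for illustration (boy-to-boy standard deviation 2.0, within-boy noise 0.3, true A minus B difference 0.5), and `simulate_designs` is a hypothetical name. The sketch compares the first design (each boy gets A or B) with the matched-pairs design, and does not model a left/right foot effect.

```python
import random
import statistics


def simulate_designs(reps=2000, boys=10, true_diff=0.5, seed=3):
    """Return (sd of the two-group estimate, sd of the matched-pairs estimate)."""
    rng = random.Random(seed)
    two_group, paired = [], []
    for _ in range(reps):
        # Each boy has his own typical wear level (large boy-to-boy variation).
        boy_level = [rng.gauss(10, 2.0) for _ in range(boys)]

        # Design 1: half the boys are randomized to A, the other half to B.
        order = list(range(boys))
        rng.shuffle(order)
        a_boys, b_boys = order[: boys // 2], order[boys // 2:]
        wear_a = [boy_level[i] + true_diff + rng.gauss(0, 0.3) for i in a_boys]
        wear_b = [boy_level[i] + rng.gauss(0, 0.3) for i in b_boys]
        two_group.append(statistics.fmean(wear_a) - statistics.fmean(wear_b))

        # Design 2: every boy wears one A shoe and one B shoe;
        # the estimate is the average of the within-boy differences (A minus B).
        diffs = [
            (level + true_diff + rng.gauss(0, 0.3)) - (level + rng.gauss(0, 0.3))
            for level in boy_level
        ]
        paired.append(statistics.fmean(diffs))
    return statistics.stdev(two_group), statistics.stdev(paired)


sd_two_group, sd_paired = simulate_designs()
print(f"two-group sd: {sd_two_group:.2f}, matched-pairs sd: {sd_paired:.2f}")
```

Both designs estimate the same average difference, but in the paired version each boy's own wear level cancels out of the A minus B difference, so the estimate varies much less from one simulated experiment to the next.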

Reducing variability with a matched-pairs design

[Figure: two panels plotting Wear (vertical axis from 8 to 14). Left panel, by Material (A, B): wear differences computed between groups. Right panel, by boy: wear differences computed within boys.]

Average difference same in two designs, but variance is much smaller in matched-pairs design – more precision

Blocking in experimental designs

• A block is a group of individuals known to be similar in some way that is thought likely to influence the response variable
• Block designs can have blocks of any size
• In a “randomized block design,” randomization is carried out separately within each block
• By confining treatment comparisons to within such blocks, greater precision can be obtained (i.e. smaller σ for comparisons within each block)

Matched-pairs design as a special case of a block design

• For our shoe wear example, the block size was two and we compared two treatments A and B
• We expected the 2 shoes of each boy to be more homogeneous than the aggregate (all shoes of all boys)
• Matched-pairs designs often use identical twins
  – A very efficient and effective research design
  – However, too few identical twins
• Matched-pairs designs are not often feasible – but blocking is nearly always possible and very valuable

A randomized block design to study cell phone use and poor driving

• Consider an experiment designed to determine if cell phone use while driving leads to poor driving
• Note: some drivers are naturally better than others
• Major elements of the study design
  – Subjects: 60 drivers in their 20s from Boston
  – Design: A randomized controlled experiment
    • Factor (intervention): cell phone use or not while driving on a closed standardized driving course (30 drivers assigned to each group)
    • Block: driving record (0 versus ≥ 1 accidents in previous 5 years) [Blocking is also called stratification]
  – Response: driver rating score by a police officer

The block design schema

Subjects
  – Good driving records → R → Cell phone use / No cell phone → Rating score
  – Poor driving records → R → Cell phone use / No cell phone → Rating score

R denotes randomization

Choice of blocks

• Blocks should be chosen on the basis of the most important (known) unavoidable source of variation among the individuals (experimental units)
• Randomization then averages out the remaining sources of variability to allow “unbiased” (i.e., un-confounded) estimation of treatment effects
• Blocks allow greater precision, because a source of systematic variation is removed (reduced variability) from the experimental comparison
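Randomizing separately within each block, as in the driving example, might look like the following sketch. The helper name `randomize_within_blocks`, the driver labels, and the 40/20 split between good and poor records are all assumptions; the slides give only the total of 60 drivers.

```python
import random
from collections import Counter


def randomize_within_blocks(blocks, treatments, seed=None):
    """Shuffle each block on its own, then deal its members evenly across treatments."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        for position, subject in enumerate(shuffled):
            treatment = treatments[position % len(treatments)]
            assignment[subject] = (block_name, treatment)
    return assignment


# Hypothetical block sizes: 40 drivers with good records, 20 with poor records.
blocks = {
    "good record": [f"driver{i:02d}" for i in range(1, 41)],
    "poor record": [f"driver{i:02d}" for i in range(41, 61)],
}
plan = randomize_within_blocks(blocks, ["cell phone", "no cell phone"], seed=7)

# Count drivers in each (block, treatment) cell.
cell_counts = Counter(plan.values())
print(cell_counts)
```

Because the shuffle happens inside each block, both treatments get the same number of good-record drivers and the same number of poor-record drivers, so driving record cannot be confounded with cell phone use.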

Placebo effects

• A placebo is a medically inert substance, such as a sugar pill, used to replace medication in a clinical research trial
• The placebo effect is a measurable, observable, or felt improvement not attributable to a treatment
• Example: patients suffering pain after wisdom-tooth extraction got as much relief from a fake application of ultrasound as from a real application (provided both the patient and therapist thought the machine was on)
• In psychiatric studies, up to 20% of patients respond to placebo drugs (JAMA 1968; 203: 418-419)

Placebos and blinding

46 patients with chronic severe itching were randomly given one of four treatments

High itching score = more itching

Treatment                Itching score
Cyproheptadine HCl       27.6
Trimeprazine tartrate    34.6
Placebo                  30.4
Nothing                  49.6

Blinding

• Blinding: comparison of treatments can be distorted if subjects or persons administering or evaluating treatment know which treatment is being allocated – especially for subjective endpoints
• Doctors want new treatments to work
• Patients want to please their doctors
• Blinding avoids many sources of unconscious biases
• Single-blind: subjects do not know which treatment they have received
• Double-blind: neither subjects nor experimenters know which treatments have been received

In-class exercise on study design

What would be a feasible, effective and ethical study design to answer the following questions?

• Do consumers prefer the taste of Coke Classic or Coke Zero?
• Will people spend less on health care if their health insurance requires them to pay part of the cost?
• Should people convicted of drunken driving be given lenient, moderate or harsh sentences?
• Does access to the internet improve the quality of life for people in developing countries?
• What proportion of net income should automotive companies invest in research and development?

Population and sample

• Population: entire group of individuals on which we desire information
• Sample: part of population on which we actually collect data
• Sampling design: method used to choose sample from population
• Census: survey of an entire population
• Why sample, instead of taking a census?
  – Time, expense, and sometimes sampling units are changed by being measured

Simple random sample (SRS)

• In a SRS of size n:
  – each individual in the population has an equal chance of being chosen
  – every set of n individuals has an equal chance of being the sample chosen
• Example: selection of a 3-member advisory committee at random from the 21 faculty members of the Statistics Department
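The committee example can be made concrete with Python's standard library. The faculty labels and the seed are placeholders; `random.sample` draws without replacement, so every 3-person subset is equally likely.

```python
import math
import random

# 21 hypothetical labels standing in for the faculty members.
faculty = [f"faculty{i:02d}" for i in range(1, 22)]

# Number of distinct 3-member committees; an SRS gives each the same chance.
n_committees = math.comb(len(faculty), 3)
print(n_committees)  # 1330

# Draw one committee; the seed only makes the example repeatable.
committee = random.Random(2012).sample(faculty, 3)
print(committee)
```

Each possible committee therefore has probability 1/1330 of being the one chosen.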

From the Gallup Poll website

http://www.gallup.com/

Americans are sharply divided over Thursday's Supreme Court decision on the 2010 healthcare law.

[Bar chart: Opinion on Healthcare Ruling; bars for Agree, Disagree, No Opinion; vertical axis from 0 to 50]

“As you may know, the U.S. Supreme Court upheld the entire 2010 healthcare law declaring it constitutional. Do you agree with this decision?”

±4 pct. pt. margin of error
June 28, 2012
Sample size = 1,012 national adults

Drawback of simple random sampling

• Weakness of SRS: it does not use information about the population structure (such as a small group of people known to be poorer than the others) to ensure proper balance that pure random sampling may miss
• Stratified random sampling uses information about the population structure – to improve the estimate
• National surveys can be complicated, using multistage sampling

Stratified random samples

Basic idea: sample important groups separately, then combine these samples

1) Divide population into groups of similar individuals, called strata
2) Choose a separate simple random sample within each stratum
3) Combine these simple random samples together to form the full sample (in the correct proportions)

Stratified Random Sample: An Example

• Basic Research Question: what is the average income of adults in Massachusetts?
• Problem with SRS: may not sample enough minorities to get a good estimate of average within these groups. Example: American Indians
• Create strata based on racial groups (Key: we know the breakdown of racial groups in Mass). For this example, let's say there are 5 racial groups
• Sample a specific number of individuals, say 50, within each racial group, rather than 250 overall
• When calculating the overall average, weight the group averages appropriately.
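Weighting the group averages can be sketched as follows. The stratum labels, population shares, and mean incomes are invented for illustration only, and `stratified_mean` is a hypothetical helper; the one idea taken from the slides is that each stratum's sample mean is weighted by that stratum's known share of the population.

```python
def stratified_mean(stratum_means, population_shares):
    """Weight each stratum's sample mean by its share of the population."""
    if abs(sum(population_shares.values()) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")
    return sum(stratum_means[s] * population_shares[s] for s in stratum_means)


# Made-up values for five hypothetical strata, with 50 people sampled in each.
shares = {"A": 0.70, "B": 0.12, "C": 0.10, "D": 0.07, "E": 0.01}
means = {"A": 52_000, "B": 41_000, "C": 47_000, "D": 60_000, "E": 38_000}

weighted = stratified_mean(means, shares)
unweighted = sum(means.values()) / len(means)
print(round(weighted), round(unweighted))  # 50600 47600
```

With equal sample sizes per stratum, the plain average of the five group means treats a 1% group like a 70% group, which is why the weighted and unweighted figures differ.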

Multistage Sampling

• In multistage sampling, units are randomly selected at different levels of what you define as an individual.
• Example: in national surveys, individuals are often chosen by:
  – first randomly selecting states,
  – then randomly selecting counties within those states,
  – then randomly selecting neighborhoods within those counties
• Reason: it's much easier to survey groups of people rather than just one person/unit that could be very far away.
• Works best if the higher levels that are sampled (states, counties, etc.) are good representations of the entire population themselves

Bias: cautions for sample surveys

1) Selection bias: some groups in population are over- or under-represented in sample (the sampling frame is limited)
2) Non-response bias: non-respondents may differ in important ways from respondents (individuals choose not to respond)
3) Response bias: e.g., wording of questions, telescoping in the recall of events

1936 Literary Digest Poll

• Literary Digest had predicted the winner of every US presidential election since 1916
• In 1936, Literary Digest mailed questionnaires to 10 million people
• 2.4 million people responded - the largest number of people ever replying to a poll
• When publishing the 1936 results, the Digest wrote: “We make no claim to infallibility. We did not coin the phrase ‘uncanny accuracy’ which has been so freely applied to our polls.”
• Prediction: Roosevelt 43%, Landon 57%
• Actual result: Roosevelt 62%, Landon 38%

What went wrong with the Digest’s Poll?
Selection bias and non-response bias

• Selection bias: people surveyed came from telephone books, club memberships, mail order lists, automobile ownership lists (more affluent households during depression year)
• Non-response bias: 76% did not respond
• The Gallup Poll correctly predicted Roosevelt's victory with a sample of 50,000 people (1/50th size of Digest’s Poll)

Response bias

Wording of questions can deliberately bias results:

• Do you favor Gestapo-style police tactics to prevent smoking in public buildings?
• Do you think smokers have the right to impose their filthy habits on the rest of us, polluting our precious air?

Parameters and statistics

Parameter: number that describes the population

Statistic: number that describes a sample

Statistical inference: use information from a sample (a statistic) to make an inference about the larger population (a population parameter)

Sample (partial information) → Population

Sampling distribution

• What would happen if a sample (or an experiment) were repeated many times? (a “thought experiment”)
• Take repeated samples of the same size from the same population:
  – 1st sample, calculate the statistic of interest
  – 2nd sample, calculate the statistic of interest
  and so on . . .
• The statistic will vary from sample to sample
• The theoretical sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population
• The sampling distribution often has a predictable pattern

Simulating an opinion poll

• In the 1984 presidential election, Ronald Reagan received about 60% of the popular vote
• Simulate a survey (repeatedly) of 100 people (just 2 days before the election)
• First sample 56/100 = 56%
• Second sample 46/100 = 46%
• Third sample 61/100 = 61%
  and so on . . .
• Continue this for many, many surveys of size 100
• Graph the results in a histogram

Results of repeated surveys of size n = 100

Note: the variability of results

[Figure: results of repeated samples drawn from the entire population. From IPS p 215 [5 234]]

Repeat the process for surveys of size n = 2,500

Note: the variability decreases as the sample size increases

[Figure: results of repeated samples drawn from the entire population]

The major concept of statistical inference

• A sampling distribution characterizes the behavior of a statistic
• A sampling distribution is inherently unobservable, because there will (in almost all cases) be only one survey, or one experiment, or one observational study
• Probability theory provides tools for calculating the theoretical form of a sampling distribution
• Understanding the behavior of a statistic under (hypothetical) repeated samplings (the sampling distribution) helps understand the precision and reliability of the statistic
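The repeated-survey thought experiment is easy to run in software. This sketch assumes a population in which exactly 60% favor the candidate and simulates 1,000 polls at each sample size; `simulate_polls`, the seed, and the number of polls are illustrative choices, not values from the slides.

```python
import random
import statistics


def simulate_polls(n, p=0.60, surveys=1000, seed=1984):
    """Return the sample proportion from each of `surveys` simulated polls of size n."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(surveys)]


# Center and spread of the simulated sampling distribution for each sample size.
summary = {}
for n in (100, 2500):
    proportions = simulate_polls(n)
    summary[n] = (statistics.fmean(proportions), statistics.stdev(proportions))
    print(f"n = {n}: mean = {summary[n][0]:.3f}, sd = {summary[n][1]:.3f}")
```

Both sets of polls center near 0.60, but the spread for n = 2,500 is roughly one fifth of the spread for n = 100, in line with the theoretical standard deviation sqrt(p(1 - p)/n).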

Bias and variability

• Two measures of the reliability of a statistic
  – Bias: the distance of the center of the sampling distribution from the true parameter
  – Variability: the variance of the sampling distribution
• Bias is often thought of as a measure of validity of a study design (e.g. reduced by using random sampling)
• Variability captures the spread in the sampling distribution (e.g. reduced by increasing sample size)
• Survey results come with a “margin of error” (± 3%)
• If bias = 0 and variability is small, the values of a statistic will be tightly clustered around the “truth”

Size (of the population) doesn’t matter

• Population size doesn’t matter
• The variability of a statistic from a random sample doesn’t depend on the size of the population (provided the population is substantially larger than sample)
• Important consequences for surveys: A SRS of 2,500 from the more than 230 million adults in US gives results as precise as a SRS of 2,500 from the 460,000 adult inhabitants of Boston

Size (of the population) doesn’t matter

Intuition: Imagine you are a chef, tasting soup

As long as the soup is well-mixed (ensuring a random sample), the variability of the results depends only on the size of the spoon (sample) and not on the size of the pot (population)
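The US versus Boston comparison can be checked numerically with the standard formula for the standard deviation of a sample proportion under sampling without replacement, which includes a finite population correction. The formula and the choice p = 0.5 are not stated in the slides; they are assumptions used here to put numbers on the claim, and `sd_of_proportion` is a hypothetical name.

```python
import math


def sd_of_proportion(n, population_size, p=0.5):
    """SD of a sample proportion from an SRS of size n drawn without replacement."""
    # Finite population correction: close to 1 when the population dwarfs the sample.
    fpc = (population_size - n) / (population_size - 1)
    return math.sqrt(p * (1 - p) / n * fpc)


sd_us = sd_of_proportion(2500, 230_000_000)
sd_boston = sd_of_proportion(2500, 460_000)
print(f"US: {sd_us:.5f}  Boston: {sd_boston:.5f}")
```

Under these assumptions the two standard deviations agree to within a fraction of a percent of each other, so the spoon (n = 2,500) matters and the pot barely does.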

What did you learn today?

Summary of Unit 3

• Method of data collection (or data sources) influences the type of conclusion that can be drawn (association vs. causation) and how accurately the data reflect a population
• Experiments are used to infer causality
  – If an intervention is associated with a change in response in a controlled setting, it can be thought to have caused the change
• Experimental designs can be refined to increase precision in measuring an intervention effect
• Sample surveys are valuable tools for learning about populations, and are a useful place to begin the study of inference