L11 PHPC2017 EM, Bias, Causal Inference 2020

This document discusses effect modification, bias, and causal inference in epidemiology. It begins by defining effect modification and distinguishing it from confounding. It then reviews sources of selection bias and information bias in different study designs. Finally, it discusses the basic principles of making causal inferences in epidemiology.


PHPC2017

Effect Modification, Bias, Causal Inference

Prof. Joey Yang
Division of Epidemiology
JC School of Public Health and Primary Care
The Chinese University of Hong Kong

Contents and learning objectives

Effect modification
● Understand effect modification
● Distinguish between confounding and effect modification
Error & bias
● Understand selection bias and information bias
● Identify sources of bias in different study designs
Causal inference
● Understand the basic principles of making causal inference in epidemiology

Review: definition of confounding
A situation in which the effect of an exposure, or the association between exposure and outcome, is distorted by the presence of another variable (the confounder)
The observed effect of exposure on the outcome is not purely the effect of the exposure itself, but a mixture of the effects of the exposure and other factors

Review: definition of confounder

A) Associated with the disease
● Cause or a proxy for a cause (determinant of the disease)
● Not an effect of the disease
B) Not part of the causal pathway of the exposure
● Must not be an effect of the exposure
C) Associated with the exposure
● Imbalance in the comparison groups (exposed vs. non-exposed)

(A) and (B) are judged by prior knowledge, common sense or biological reasoning; they are not testable with the data from your own study. (C) can be tested with your study data.

Review: definition of confounder

[Figure: the relationship between exposure, outcome and confounder. The exposure (e.g., smoking) and the outcome (e.g., lung cancer) are both linked to the third variable (confounder, e.g., age), which is A) a determinant of the disease, B) not part of the causal pathway of the exposure, and C) associated with the exposure.]

Review: result of confounding

Young + Old     Cancer +   Cancer -   Total
Smokers              31       2150     2181
Non-smokers         118       3209     3327
RRcrude = 0.4

Old people      Cancer +   Cancer -   Total
Smokers              15        150      165
Non-smokers         110       1705     1815
RRold = 1.5

Young people    Cancer +   Cancer -   Total
Smokers              16       2000     2016
Non-smokers           8       1504     1512
RRyoung = 1.5

RRadjusted (1.5) ≠ RRcrude (0.4)
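
The arithmetic behind the table can be checked with a short Python sketch. The counts are the ones in the table; the helper name `risk_ratio` and the tuple layout are illustrative choices, not part of the lecture.

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


# Each stratum: (smoker cases, smoker total, non-smoker cases, non-smoker total)
strata = {
    "old": (15, 165, 110, 1815),
    "young": (16, 2016, 8, 1512),
}

# The crude table is the cell-by-cell sum of the strata.
crude = tuple(sum(cells) for cells in zip(*strata.values()))
rr_crude = risk_ratio(*crude)
rr_by_age = {age: risk_ratio(*cells) for age, cells in strata.items()}

# Criterion C (testable with study data): is age associated with smoking?
smoking_prevalence = {
    age: exp_total / (exp_total + unexp_total)
    for age, (_, exp_total, _, unexp_total) in strata.items()
}

print(f"crude RR = {rr_crude:.2f}")
for age, rr in rr_by_age.items():
    print(f"{age}: RR = {rr:.2f}, smokers = {smoking_prevalence[age]:.0%}")
```

In these data smoking is far more common among the young, who also have a much lower cancer risk. That imbalance is what pulls the crude RR below 1 even though both age strata give RR = 1.5.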


Review: methods for controlling confounding

In the design stage
1) Restriction
2) Matching
3) Randomization (random allocation)

In the analysis stage
1) Stratified analysis
2) Standardization
3) Multivariable regression analyses
Review: stratified analysis for controlling confounding

1. Calculate the crude measure of association (e.g. crude RR)
2. Divide the data into strata according to categories of a third factor (e.g., gender, age).
3. Within each stratum, calculate a stratum-specific RR (e.g. gender-specific RRs)
4. If the stratum-specific RRs are similar to each other, pool them over all strata to calculate a weighted average (i.e. the adjusted RR) using the Mantel-Haenszel method
5. Compare the adjusted RR with the crude one. If the two differ meaningfully, confounding exists.
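
Steps 4 and 5 can be sketched in Python with the standard Mantel-Haenszel risk ratio, RR_MH = Σ(a_i·n0_i/N_i) / Σ(c_i·n1_i/N_i), applied to the smoking and cancer counts from the confounding review. The function name `mantel_haenszel_rr` is a hypothetical helper, and the lecture does not state a cut-off for how large the crude-versus-adjusted difference must be, so the sketch only reports the relative change.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio.

    Each stratum is (exposed cases, exposed total,
                     unexposed cases, unexposed total).
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0  # stratum size
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


# Old and young strata from the smoking / cancer example.
strata = [(15, 165, 110, 1815), (16, 2016, 8, 1512)]

crude_rr = (31 / 2181) / (118 / 3327)
adjusted_rr = mantel_haenszel_rr(strata)
relative_change = abs(crude_rr - adjusted_rr) / adjusted_rr

print(f"crude RR = {crude_rr:.2f}, adjusted RR = {adjusted_rr:.2f}")
print(f"relative change = {relative_change:.0%}")
```

Because both stratum-specific RRs equal 1.5, any weighted average of them is also 1.5, so the large gap to the crude RR of 0.4 is attributable to confounding by age.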
Hypothetical data of effect modification

All people     Disease +   Disease -   Total
Exposed            400         800      1200
Non-exposed        350        1250      1600
RR = 1.52

Male           Disease +   Disease -   Total
Exposed            150         650       800
Non-exposed        200         600       800
RR = 0.75

Female         Disease +   Disease -   Total
Exposed            250         150       400
Non-exposed        150         650       800
RR = 3.33

Stratum-specific effect estimates are different

Effect modification: definition
Effect modification, or interaction, can be defined as:
A change in the magnitude of an effect measure according to the value of some third variable (in addition to exposure and disease), which is called an effect modifier.

Rothman K. Modern Epidemiology

Effect modification: definition

The effect of an exposure on the outcome is modified by a third factor
The effect of an exposure on the outcome varies with the status/level of a third factor
The effect of an exposure is different in different subgroups defined by a third factor

Hypothetical data of effect modification

All people     Disease +   Disease -   Total
Exposed            400         800      1200
Non-exposed        350        1250      1600
RR = 1.52

Male           Disease +   Disease -   Total
Exposed            200         600       800
Non-exposed        200         600       800
RR = 1.00

Female         Disease +   Disease -   Total
Exposed            200         200       400
Non-exposed        150         650       800
RR = 2.67

Stratum-specific effect estimates are different, suggesting effect modification

Hypothetical data of effect modification

All people     Disease +   Disease -   Total
Exposed            400         800      1200
Non-exposed        350        1250      1600
RR = 1.52

Male           Disease +   Disease -   Total
Exposed            200         600       800
Non-exposed        150         650       800
RR = 1.33

Female         Disease +   Disease -   Total
Exposed            200         200       400
Non-exposed        200         600       800
RR = 2.00

Stratum-specific effect estimates are different, suggesting effect modification

Effect modification: examples
Would this drug be more effective in men than in women?
Could this drug work in Caucasians but not in Chinese?
Could these two drugs, if used in combination, be more effective than the sum of the effects of the two used separately?
Could these two drugs, if used in combination, cause severe harmful effects?
Effect modification: implication
May be used to inform clinical or public health practice:
RR for treatment efficacy of drug A vs drug B: 2.67 in people with a mutant EGFR gene, 0.48 in those without → people with the mutation should receive drug A, while others should receive drug B
A virus causes an infectious disease in Asians (RR>1), but not in Caucasians (RR=1) → Asian people should avoid exposure to the virus to prevent the disease, while Caucasians do not need to worry about it
Effect modification: detection

● Stratified analysis
● Multiple regression (NOT required at this stage)
● Need a clinical/biological argument for effect modification, rather than relying solely on numbers/statistics

Effect modification: stratified analysis

1. Stratify data by a 3rd factor (potential effect modifier)
2. Estimate the stratum-specific effects (RR, OR, etc.)
3. Test for heterogeneity (difference)
4. Significant heterogeneity suggests interaction
5. Present stratum-specific effects rather than a pooled overall one
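
The steps do not name a particular heterogeneity test. As one simple option (an assumption here, not the lecture's prescribed method), the sketch below applies a Wald-type z-test to the difference between two stratum-specific log RRs, using the male and female counts from the first hypothetical effect modification table (RR = 0.75 and RR = 3.33). The function names are hypothetical.

```python
import math


def log_rr_and_variance(a, n1, c, n0):
    """Log risk ratio and its large-sample variance for one stratum.

    a, n1: cases and total among the exposed
    c, n0: cases and total among the unexposed
    """
    log_rr = math.log((a / n1) / (c / n0))
    variance = 1 / a - 1 / n1 + 1 / c - 1 / n0
    return log_rr, variance


def heterogeneity_z_test(stratum_1, stratum_2):
    """Two-sided z-test for a difference between two log RRs."""
    log_rr_1, var_1 = log_rr_and_variance(*stratum_1)
    log_rr_2, var_2 = log_rr_and_variance(*stratum_2)
    z = (log_rr_1 - log_rr_2) / math.sqrt(var_1 + var_2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value


male = (150, 800, 200, 800)
female = (250, 400, 150, 800)
z, p_value = heterogeneity_z_test(male, female)
print(f"z = {z:.2f}, p = {p_value:.3g}")
```

With more than two strata a chi-square test of homogeneity would be the usual generalisation. Either way, the statistical result still needs the clinical or biological argument that the lecture asks for.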
Stratified analysis for EM vs confounding

1. Calculate the crude OR
2. Stratify by the 3rd factor
3. Calculate the stratum-specific ORs (OR1, OR2) and test whether they are similar
4. If different: evidence for effect modification; report the stratum-specific ORs
   If similar: calculate the adjusted OR
5. Compare the adjusted OR with the crude OR; if different, confounding
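
The decision flow can be written as a small Python function. This is only a sketch: the function name, the significance level of 0.05, and the 10% change-in-estimate cut-off are assumptions chosen for illustration, since the lecture does not specify thresholds.

```python
def interpret_stratified_analysis(crude, adjusted, heterogeneity_p,
                                  alpha=0.05, change_threshold=0.10):
    """Follow the flow: heterogeneity first, then crude vs adjusted.

    alpha and change_threshold are assumed rule-of-thumb values.
    """
    if heterogeneity_p < alpha:
        # Stratum-specific estimates differ: do not pool them.
        return "effect modification: report stratum-specific estimates"
    relative_change = abs(crude - adjusted) / adjusted
    if relative_change > change_threshold:
        return "confounding: report the adjusted estimate"
    return "no evidence of either: the crude estimate is adequate"


# Smoking example: identical stratum RRs (p = 1), crude 0.4 vs adjusted 1.5.
print(interpret_stratified_analysis(0.40, 1.50, heterogeneity_p=1.0))
# Strongly heterogeneous strata: pooling is skipped.
print(interpret_stratified_analysis(1.52, 1.50, heterogeneity_p=1e-9))
```

Checking heterogeneity before pooling mirrors the flow: an adjusted estimate is only meaningful when the stratum-specific estimates are similar.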
Confounding vs. effect modification

Confounding
● 3 factors involved: exposure, outcome, confounder
● Homogeneous stratum-specific RRs, and adjusted RR ≠ crude RR
● A spurious relationship that we don't want
● Could be minimized with proper study design (e.g. matching, restriction, randomization)
● Analysis method: stratified analysis, standardization, multiple regression

Interaction/Effect modification
● 3 factors involved: exposure, outcome, effect modifier
● Heterogeneous stratum-specific RRs
● Useful knowledge for understanding the mechanism (e.g. a drug is more effective in males than in females)
● Cannot be removed (as a fact of nature)
● Analysis method: stratification, multiple regression

Contents

Confounding & effect modification


Error & bias
Causal inference
Error
The basic goals of epidemiologic studies are: 1) to measure a disease frequency or 2) to evaluate whether and to what extent two factors (e.g., exposure & outcome) are associated with each other, which will serve as the basis for causal inference (see below).
The objective of a specific study is to know the truth about something, e.g. the true prevalence of depression, the true 5-year incidence of stroke, the true effect of smoking on lung cancer (i.e., the true RR/OR), the true treatment efficacy of a drug, etc.
However, in most cases, it is impossible to get the true values, for various reasons.

Error

Error: the difference between the observed value obtained from a specific study and the true value
Types
1. Random error
2. Systematic error

Error

Random error occurs because the estimates we produce are based on samples, which may not accurately reflect what is really going on in the population at large
Can be reduced substantially by increasing the sample size
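
To illustrate how sample size controls random error, the Python sketch below computes the standard error of an estimated prevalence, sqrt(p(1 - p)/n), for increasing n. The prevalence of 10% and the sample sizes are hypothetical values chosen for the illustration.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion p estimated from n subjects."""
    return math.sqrt(p * (1 - p) / n)


true_prevalence = 0.10  # hypothetical value, for illustration only

for n in (100, 400, 1600, 6400):
    se = standard_error(true_prevalence, n)
    print(f"n = {n:>4}: standard error = {se:.4f}")
```

Each fourfold increase in sample size halves the standard error. A larger sample does nothing, however, for systematic error.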

Bias
A systematic error that occurs in the design or implementation of an epidemiological study and results in an incorrect estimate of the association between exposure and outcome
Types:
1. Selection bias
2. Information bias
3. Confounding bias

Selection bias
Any error that arises in the process of identifying (selecting) the population for study or analysis
Leads to systematic differences in characteristics between those selected for study (or analysis) and those not
The prevalence of a health status, or the relation between exposure and disease, is different for those who participate and those who should be theoretically eligible for study, including those who do not participate

Selection bias: examples

Researchers used a postal questionnaire survey to investigate the career progression of NHS doctors. The questionnaire included details about past and current employment, future career plans, and when career milestones were reached. Analysis was confined to respondents working in the UK NHS.
The participants were all those who graduated from UK medical schools in 1977, 1988, and 1993. The questionnaire was sent to 10 344 graduates, of whom 7012 replied, giving a response rate of 68%.

Selection bias: examples

The respondents to the survey were self-selected and not a random sample from the three cohorts of graduates. This would have introduced non-response bias. The respondents would have differed from the non-responders in some way, not least in their motivation to complete the questionnaire. This would have ultimately affected the results of the survey.
For example, doctors may have been less likely to return the questionnaire if they had become disillusioned with their career because its progression was not as fast as expected. This would have resulted in the survey underestimating the length of time doctors took to achieve their career milestones.

Selection bias: examples
o Healthy worker effect
Initially observed in studies of occupational diseases: workers usually exhibit lower overall death rates than the general population
Should we encourage people to do the job to prolong survival? No, because the severely ill and chronically disabled are ordinarily excluded from employment, or quit the job soon after starting.
The workers are healthier than the general population. Their death rate does not reflect the death rate that ordinary people would have if they were hired to do this work

Selection bias: examples
o Survival/survivorship bias
Only the subjects that survived a process are included in the analysis.
Example: a case-control study investigating stroke enrolled only prevalent stroke cases (who were diagnosed years ago). Those who died shortly after diagnosis were excluded. The selected sample, therefore, included less severe cases but no fatal cases, and was thus not representative of the target population (all stroke cases).
Also called prevalence-incidence bias (Neyman bias)
May lead to a wrong conclusion, e.g., about the relation between age and stroke

Selection bias: examples
o Loss to follow-up: people who stay in a cohort study / clinical trial are not the same as those who leave (migration, refusal to continue because of worsening health, death)
If you are looking at the death rate in this group of people and include in the analysis only those staying in the cohort (who may not be representative of the original target population), then the death rate you obtain could be biased.
Similar in a way to the healthy worker effect and non-response bias
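
A small worked example in Python makes the direction of this bias concrete. All numbers are hypothetical: a cohort of 1000 in which the 200 people lost to follow-up have a higher death rate than the 800 who stay.

```python
# Hypothetical cohort, for illustration only.
enrolled = 1000
stayed = {"n": 800, "deaths": 40}   # 5% die among those who stay
lost = {"n": 200, "deaths": 40}     # 20% die among those lost to follow-up

true_rate = (stayed["deaths"] + lost["deaths"]) / enrolled
observed_rate = stayed["deaths"] / stayed["n"]

print(f"true death rate     = {true_rate:.1%}")
print(f"observed death rate = {observed_rate:.1%}")
```

Analysing only those who remain underestimates the death rate here because the people who left were sicker. If drop-out were unrelated to health, the two rates would agree apart from random error.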
Selection bias: how to reduce it?

In cross-sectional surveys, probability sampling (e.g., simple random sampling) is preferable to non-probability methods (e.g., convenience sampling)
In case-control studies, use incident/new cases instead of prevalent/old cases
In all studies, take as many measures as possible to improve the compliance / response / follow-up rate, e.g. by providing incentives

Information bias (measurement error)

The bias arising from measurement error
Also referred to as misclassification (of exposure and/or outcome status)

Information bias: examples
o Interviewer bias
● Problem: difference in soliciting, recording, interpreting information from subjects
● e.g., in a case-control study, interviewers ask the case group about previous exposure history in a leading or more in-depth way, as compared with the control group
● Possible solutions
1) Blind exposure status of subjects in prospective studies, including both cohort studies and RCTs, when collecting outcome data
2) Blind case/control status of subjects when measuring exposure history in case-control studies
3) Mask study hypothesis
4) Use standardized, closed-form questionnaires
Information bias: examples
o Recall bias
The accuracy of information depends on respondents' memory
E.g., in a case-control study, cases recall and/or report previous exposure differently from controls (differential)
Possible solutions
If the better recall of cases is due to their search for a cause of their disease, then use diseased (hospital) controls with similar potential for recall
Rely on historical/objective records rather than asking subjects to recall
Information bias: examples

o Detection bias
Cohort studies and RCTs
Systematic difference in following, obtaining, and ascertaining outcome information between comparison (exposed and non-exposed) groups
e.g., potential outcomes are identified through thorough medical examination in the exposed group, but by self-reporting in the control group
Possible solution: use standardized procedures for outcome ascertainment for all subjects
Information bias: consequence

● By definition, information bias leads to misclassification
● Non-differential misclassification
● Degree of misclassification is the same in the study and comparison groups
● E.g., the blood pressure readings of all subjects are lower than their true values because of a faulty blood pressure gauge
● Tends to under-estimate the association (OR & RR biased towards 1)

Information bias: consequence

● Differential misclassification
● Degree of misclassification is NOT the same in the study and comparison groups
● E.g., more in-depth and detailed interview in the cancer cases group to obtain their exposure history than in the control group
● The association can be biased in any direction, over-estimated or under-estimated
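
The contrast between non-differential and differential misclassification can be worked through numerically. The Python sketch below starts from a hypothetical "true" case-control table (OR = 3.5) and applies imperfect exposure measurement with assumed sensitivity and specificity; every count and percentage is invented for illustration, and the function names are hypothetical.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio from a case-control 2x2 table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected (exposed, unexposed) counts after imperfect measurement."""
    seen_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    return seen_exposed, exposed + unexposed - seen_exposed


# Hypothetical true exposure status: (exposed, unexposed)
true_cases = (120, 80)
true_controls = (60, 140)
true_or = odds_ratio(*true_cases, *true_controls)


def observed_or(case_sensitivity, control_sensitivity, specificity=0.9):
    """OR computed from the misclassified tables."""
    cases = observed_counts(*true_cases, case_sensitivity, specificity)
    controls = observed_counts(*true_controls, control_sensitivity, specificity)
    return odds_ratio(*cases, *controls)


non_differential = observed_or(0.80, 0.80)        # same error in both groups
cases_recall_better = observed_or(0.95, 0.70)     # differential
controls_recall_better = observed_or(0.70, 0.95)  # differential, other way

print(f"true OR                 = {true_or:.2f}")
print(f"non-differential        = {non_differential:.2f}")
print(f"cases recall better     = {cases_recall_better:.2f}")
print(f"controls recall better  = {controls_recall_better:.2f}")
```

With equal error in both groups the observed OR moves towards 1. When cases report exposure more completely than controls the OR is inflated, and when the reverse holds it is deflated, which matches the "any direction" statement for differential misclassification.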

Bias: difficulty in dealing with it

Bias should be anticipated and controlled as far as possible during the study design, data collection and analysis phases.
In practice, bias can be minimized, yet never completely eliminated
Always discuss the likely impact of potential bias on your results in the discussion section of your paper

Contents

Confounding & effect modification


Error & bias
Causal inference
Causal inference
Causal inference is the process of drawing a
conclusion about a causal relation between two
things (e.g., a disease and a suspected cause) based
on existing evidence.
Causal relation: examples

Risk factor and disease


Treatment and its efficacy
Treatment and its adverse effects
Causal relation: criteria

1. Temporal order
1) the cause must be present or occur before its effects (the outcome)
2) an inarguable requirement for a causal relationship
3) could be difficult to establish in cross-sectional studies

[Figure: timeline from 2015 to 2018 marking the events "Diagnosed with lung cancer" and "Started smoking"]

Could smoking be the cause of the person's lung cancer?

Causal relation: criteria

A cross-sectional study shows that girls who eat breakfast every day weigh less than girls who don't, and concludes that breakfast helps to lose weight. But is it true?
1) Eating breakfast makes girls slim?
2) Slim girls are more likely to eat breakfast? (i.e., reversed)
Causal relation: criteria

1. Temporal order
2. Association: RR/OR ≠ 1
However, associated factors may not be causally related!
Causal relation: criteria

Two students, who do not know each other, always enter and leave a lecture hall at the same time: (1) They are associated; (2) But neither is the cause of the other's presence; (3) It is because there is a regular class in the lecture hall.

In year 2001, a small tree in front of Jack's house was 50 cm tall, and little Jack was 80 cm tall. In year 2018, the tree was 15 m tall, and Jack was 1.8 m tall. Jack grew as the tree grew. Was the tree a cause of Jack's growth?

Causal relation: criteria

1. Temporal order
2. Association: RR/OR ≠ 1
3. Consequential change: the association between the exposure and outcome is really caused by the exposure, rather than by chance (random error) or bias (systematic error)
Causal relation: criteria

Chance (random error): can be easily reduced by increasing sample size

Bias (systematic error):
1) The more measures are taken to reduce bias, the more likely that the bias is small (and the association is a result of the causal relation)
2) Refer to and compare with other studies/evidence
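
One common way to judge the role of chance is a confidence interval for the RR. The Python sketch below uses the standard large-sample interval on the log scale and applies it to the "All people" counts from the hypothetical effect modification table (400/1200 exposed vs 350/1600 non-exposed). The tenfold-smaller study is an added assumption to show the effect of sample size, and the function name is hypothetical.

```python
import math


def rr_confidence_interval(a, n1, c, n0, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale.

    a, n1: cases and total among the exposed
    c, n0: cases and total among the unexposed
    """
    rr = (a / n1) / (c / n0)
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


full_study = rr_confidence_interval(400, 1200, 350, 1600)
# Same risks in a study one tenth the size (hypothetical).
small_study = rr_confidence_interval(40, 120, 35, 160)

for label, (rr, lower, upper) in (("full", full_study), ("small", small_study)):
    print(f"{label}: RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The point estimate is identical in both studies, but the smaller study gives a much wider interval. A narrow interval only limits random error; it says nothing about bias.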
[Figure: hierarchy of evidence pyramid]

Hill's Criteria for Causal Inference
1. Temporality (temporal order)
2. Strength of association (RR, OR, RD, etc)
3. Biological gradient (dose-response relationship)
4. Consistency (repeatability)
5. Experimental evidence (RCT evidence)
6. Plausibility (explicable by known knowledge)
7. Coherence (biological/theoretical/factual/statistical)
8. Specificity (in the cause/effect, suitable for infectious disease)
9. Analogy (experience of similar events)
Strength of association

Is there an association?
How strong is the association? Stronger associations are less likely to be entirely due to bias

Relative Risk (RR)          Strength of association
RR < 1       RR > 1
0.9-1.0      1.0-1.2        None/Marginal
0.7-0.8      1.2-1.5        Weak
0.4-0.6      1.5-3.0        Moderate
0.1-0.3      3.0-10.0       Strong
<0.1         >10.0          Very strong
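
The table can be turned into a small lookup in Python. This is a sketch under two labelled assumptions that the table itself does not settle: a value on a shared boundary (e.g. RR = 1.5) is assigned to the weaker band, and values below 1 are rounded to one decimal place before lookup because the left-hand column is given to one decimal.

```python
def strength_of_association(rr):
    """Classify an RR using the bands in the strength-of-association table."""
    if rr <= 0:
        raise ValueError("RR must be positive")
    if rr >= 1:
        # Assumption: shared boundaries go to the weaker band.
        for upper, label in ((1.2, "None/Marginal"), (1.5, "Weak"),
                             (3.0, "Moderate"), (10.0, "Strong")):
            if rr <= upper:
                return label
        return "Very strong"
    if rr < 0.1:
        return "Very strong"
    # Assumption: round to one decimal to match the table's left column.
    rounded = round(rr, 1)
    for upper, label in ((0.3, "Strong"), (0.6, "Moderate"), (0.8, "Weak")):
        if rounded <= upper:
            return label
    return "None/Marginal"


for value in (0.05, 0.4, 0.75, 1.0, 2.67, 3.33, 12):
    print(value, "->", strength_of_association(value))
```

Such labels are only a rough guide. A "strong" RR produced by a biased or confounded comparison is still not evidence of causation.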
Biological gradient

Dose-response relationship
Cigarette smoking and risk of cancer
Blood pressure and risk of stroke
[Figure: incidence of stroke (per 100,000)]

Biological gradient

A biological gradient is a strong indication of a causal relationship, but it is not a necessary condition
[Figure: risk of outcome plotted against level of exposure]

Beware of confounding, e.g. birth order & maternal age vs. Down's syndrome

Consistency
Consistent findings observed by different persons in different places with different samples strengthen the likelihood of an effect
There may be no chance for a second study, e.g., atomic bomb vs. cancer in Nagasaki and Hiroshima, Japan
Other studies could be severely flawed and thus not reliable

Experimental evidence

Here, experiment = RCT, not laboratory experiment on animals/cells
Not ethical to test potentially harmful factors
For harmful factors, cessation of exposure should reduce the risks
e.g., if smoking causes cancer, smoking cessation should decrease the risk of cancer

Causal inference is a subjective process

Causal inference is at best tentative and still a subjective process.
Rothman 1986

All scientific work is incomplete whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action that it appears to demand at a given time.
Bradford Hill 1965