bias and confounding

The document discusses the importance of evaluating epidemiological studies for validity by assessing the roles of chance, bias, and confounding. It details types of bias, including selection and information bias, and their potential impact on study results. Additionally, it explains confounding effects and offers strategies for controlling bias and confounding in study design and analysis.

Uploaded by

dirshu

Role of chance, bias and confounding in evaluation of an epidemiological study
Validity of a study
• An epidemiological study is intended to
answer a study question.

• The next step in evaluating a study result is validation of the finding.

• An observed association or finding is checked for chance, bias and confounding.
To show a Valid Statistical Association
We should assess:
1. Role of chance: how likely is it that what we found is a true (real) finding?

2. Bias: whether systematic error has been introduced into the study design

3. Confounding: whether an extraneous factor is related to both the disease and the exposure
1. Bias
• Describes error arising from the design or
execution of a study

• It is undesirable

• It can’t be ‘adjusted for’

• Useful to consider in any study design

• Essential to consider in critical appraisal


Bias
It is a systematic error introduced into the
study design.

Two major forms

1. Selection Bias: refers to any error that arises in the process of identifying the study subjects.

2. Information Bias: includes any systematic error in the measurement of information on either the exposure or the outcome variable.
1. Selection bias
• Selection bias occurs when identification of
subjects for inclusion into a study depends on
the interest of the data collector or investigator

• If cases and controls (e.g. in a case-control study) are selected using different criteria, bias can occur.

• Selection bias can arise in many circumstances, but there are two major known forms
Types of Selection Bias
1. Response Bias:
• Those who agree to be in a study may be in some
way different from those who refuse to participate
– Volunteers may be different from those who are enlisted.

2. Berksonian Bias (Berkson's bias):
• Bias introduced by differences in the criteria or probabilities of admission to a hospital between those with and those without the disease of interest.
– Admission criteria of the hospital
‘Selection’
SOURCE POPULATION
↓
SAMPLING FRAME (e.g. electoral roll, household enumeration)
↓
APPROACHED TO PARTICIPATE
↓
WILLING TO PARTICIPATE

• WHO IS MISSING?
• DOES IT MATTER?
‘Selection bias’ and cross-sectional surveys

• ‘Non-response bias’ is the major concern

• Unrepresentative sampling

• Some listed people may not be reached

• Replacement of selected study participant

• It might affect observed prevalence rates


Selection Bias in case-control studies

Source population
↓ ↓
Cases Controls

Criteria for selection of cases and controls should be similar except for the outcome variable.
Selection Bias
• Controls should be selected to be as
representative as possible of the
population from which the cases are
drawn - but should be free from disease

• Selection bias occurs if selection of cases or controls is dependent on their exposure status
Selection Bias - an example

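• Association between violence against women and depression

• Depressed women (cases) from a hospital series

• Non-depressed women (controls) from the community

• Cases and controls come from different source populations, so selection into the hospital series may itself depend on exposure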
Selection Bias: solution
• Cross-sectional surveys
– Maximise response rates

– Think about sampling techniques

– Gather information about ‘missing’ groups; don’t replace study subjects

– Be careful about inferences

• Case control studies

– As above

– Select control groups carefully, using criteria similar to those used to select cases
2. Information Bias
• In analytical studies, usually one factor is
known and another is measured

• e.g. in case control studies, the ‘outcome’ is known and the ‘exposure’ is measured

• e.g. in cohort studies, the exposure is known and the outcome is measured
Information bias
• Error in the measurements / information
obtained in the study could be:
– Error due to participants

– Error due to ‘observers’

– Differential (Non-random)

– Non-differential (Random)
• (i.e. does the error affect the comparison groups equally?)
Types of Information Bias
1. Interviewer Bias: an interviewer’s knowledge on
the exposure and outcome may influence the
structure of questions and the manner of
presentation which may influence the response

2. Recall bias: those with a particular outcome or exposure may remember events more clearly or amplify their memories.
Cont…
3. Observer Bias: Observers may have
preconceived expectations of what they
should find in an examination

4. Loss to follow-up: Those who are lost to follow-up or who withdraw from the study may be different from those who are followed for the entire study
Cont…
5. Hawthorne effect: An effect first documented
at Hawthorne manufacturing plant; People act
differently if they know they are being watched

6. Surveillance Bias: The group with the known exposure or outcome may be followed more closely or longer than the comparison group.

7. Misclassification bias: Errors are made in classifying either the disease or exposure status
Types of misclassification Bias
1. Differential Misclassification- Errors in
measurement are one way only

Example:
– Measurement bias: Instrument may not be
accurate, such as using only one size BP cuff
to take measurements on both adults and
children
Cont…
2. Non-Differential (Random) Misclassification
– Errors in assignment of group happen in more than one direction

– This usually dilutes the study findings (bias towards the NULL)
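The dilution described above can be sketched numerically. This is a minimal illustration, not part of the original slides: the 2x2 counts, the sensitivity and the specificity are hypothetical values chosen for the example, and the same error rates are applied to cases and controls (which is what makes the error non-differential).

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def misclassify_exposure(exposed, unexposed, sensitivity, specificity):
    """Expected counts recorded as exposed / unexposed after measurement error."""
    recorded_exposed = exposed * sensitivity + unexposed * (1 - specificity)
    recorded_unexposed = exposed * (1 - sensitivity) + unexposed * specificity
    return recorded_exposed, recorded_unexposed


# Hypothetical true exposure counts (assumed for illustration only)
cases_true = (60, 40)      # exposed, unexposed
controls_true = (30, 70)   # exposed, unexposed
true_or = odds_ratio(*cases_true, *controls_true)

# Hypothetical error rates, identical in cases and controls (non-differential)
sensitivity, specificity = 0.8, 0.9
cases_obs = misclassify_exposure(*cases_true, sensitivity, specificity)
controls_obs = misclassify_exposure(*controls_true, sensitivity, specificity)
observed_or = odds_ratio(*cases_obs, *controls_obs)

print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

With these assumed numbers the observed odds ratio lies between 1 and the true odds ratio, i.e. it is pulled towards the null.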
Information bias: Cross-sectional

• The results of a survey are entirely dependent on the measurement applied

• Best considered as a ‘measurement issue’ rather than ‘bias’

• e.g. measurement of depression status using tools or using a psychiatrist (clinician)
Information bias: case control studies
• ‘Recall bias’
– Cases selectively more likely to remember and
disclose their exposure status than controls

• ‘Observer bias’
– Misclassification problems introduced by
“observers” collecting the data.

– If observers know about the caseness or non-caseness, their measurement of the exposure could be biased
Recall bias- an example
• Food poisoning and salmonella
– Case informant more likely to recall
ingestion of eggs (effort after meaning)

• Mothers of a child with diarrhoea are more likely to remember (recall) their child’s diet in the last three days than mothers of children without diarrhoea
Observer bias- an example
• Smoking and lung cancer

• Interviewer 'tries harder' to elicit smoking exposure among patients with lung cancer (cases) than among patients without lung cancer
Information bias: cohort studies
and trials
• Focus is on outcome rather than exposure

• Problems when outcome is ambiguous (i.e. in a lot of psychiatric / behavioural measurement)

• e.g. Knowledge of intervention group in a trial may influence outcome
Information bias: solution
• Maximise accuracy of measurements

• Minimise ambiguity of measurements (i.e. their susceptibility to influence)
– ‘Biological’ vs. questionnaire / interviewer rating

– Historical measures of exposure (e.g. old notes)

– Blinding interviewers to case status

– Blinding participants (and/or interviewers) to study hypothesis
Information bias: inference
• All measurements subject to error

• Most measurement error is ‘non-differential’, i.e. not influenced by the comparison of interest

• Non-differential error biases the result ‘to the null’ (because of dilution)
Information bias: inference

• If the error in measurement is influenced by belonging to one comparison group, it is ‘differential’
• (e.g. exposure rating influenced by case or control status)

• Differential error may bias the result in either direction
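To see why differential error can push the estimate either way, the sketch below applies different recall accuracy to cases and controls. It is only an illustration: the true counts and the recall sensitivities are hypothetical, and specificity is assumed to be perfect to keep the arithmetic short.

```python
# Hypothetical true exposure counts (assumed for illustration only)
CASES_EXPOSED, CASES_UNEXPOSED = 60, 40
CONTROLS_EXPOSED, CONTROLS_UNEXPOSED = 30, 70


def observed_or(case_sensitivity, control_sensitivity):
    """Odds ratio when exposed subjects are only partly recognised as exposed.

    Each group has its own sensitivity (probability that a truly exposed
    subject is recorded as exposed); unexposed subjects are never
    misrecorded (specificity assumed to be 1).
    """
    case_exp = CASES_EXPOSED * case_sensitivity
    case_unexp = CASES_UNEXPOSED + CASES_EXPOSED * (1 - case_sensitivity)
    ctrl_exp = CONTROLS_EXPOSED * control_sensitivity
    ctrl_unexp = CONTROLS_UNEXPOSED + CONTROLS_EXPOSED * (1 - control_sensitivity)
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


true_or = observed_or(1.0, 1.0)
# Cases recall exposure better than controls (classic recall bias)
or_cases_recall_better = observed_or(0.95, 0.60)
# Cases under-report exposure relative to controls
or_cases_recall_worse = observed_or(0.60, 0.95)

print(f"true OR                = {true_or:.2f}")
print(f"cases recall better    = {or_cases_recall_better:.2f}")
print(f"cases recall worse     = {or_cases_recall_worse:.2f}")
```

Under these assumptions the first scenario overestimates the true odds ratio and the second underestimates it.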
Controlling for Bias
• Be purposeful in the study design to minimize the
chance for bias

• Define a priori who is a case and what constitutes exposure so that there is no overlap
– Define categories within groups clearly (age groups, aggregates of person years)

• Set up strict guidelines for data collection

– Train observers or interviewers to obtain data in the same fashion
– It is preferable to use more than one observer or interviewer, but not too many, since they cannot all be trained in an identical manner
Controlling for Bias (cont)
• Randomly allocate interviewers to data collection assignments

• Institute a masking process if appropriate

– Single masked study – subjects are unaware of whether they are in the experimental or control group

– Double masked study – the subject and the observer are unaware of the subject’s group allocation

• Build in methods to minimize loss to follow-up

2. Confounding effect
• The word came from Latin, ‘confundere’
meaning ‘to mix up’

• The measured effect of an exposure is distorted because the exposure is associated with another factor (the confounder) that influences the outcome.

[Diagram: Exposure → Outcome, with the Confounder linked to both]
Confounding
• A problem resulting from the fact that one
feature of study subjects has not been separated
from second feature, and has thus been
confounded with it, producing a spurious result

• The spuriousness arises from the effect of the first feature being mistakenly attributed to the second feature.

• Confounding can produce either a type I or a type II error, but we usually focus on type I errors
Cont…

At the simplest level, confounding can be thought of as a confusion of effects.

The apparent effect of the exposure of interest is distorted because the effect of an extraneous third factor is mixed with the actual effect
Cont…
• Difference from bias:
– Bias creates an association that is not true, but confounding describes an association that is true, but potentially misleading

• A key principle of confounding is that a confounder should be associated with both the independent and dependent variables (i.e. with the exposure and the disease)

• Association of the confounder with just one of the two variables is not enough to produce a spurious result.
Cont…
[Diagram: Exposure of Interest → Health Outcome, with Confounders linked to both]

Epidemiologic Studies are Primarily Exercises in Measurement and Estimation
[Diagram labels: Confounders, Exposure of Interest, Mechanisms, measure, Health Outcome]
Confounding Example
[Diagram: Grey hair → Death, with Age linked to both]
Age (the confounder) is strongly and independently associated both with the outcome (dying) and with the exposure (grey hair)

If left uncontrolled, the confounder would produce a spurious association between exposure and disease
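A small numerical sketch of the grey hair example may help. The counts below are hypothetical, invented only to show the mechanism: within each age group grey hair carries no extra risk of death, yet the crude comparison suggests a strong association because older people are both greyer and more likely to die.

```python
def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed."""
    return (exposed_events / exposed_total) / (unexposed_events / unexposed_total)


# Hypothetical counts per age stratum (assumed for illustration only):
# (deaths among grey-haired, grey-haired total, deaths among not grey, not grey total)
strata = {
    "young": (1, 100, 9, 900),
    "old": (80, 800, 20, 200),
}

# Risk ratio within each age group
stratum_rr = {age: risk_ratio(*counts) for age, counts in strata.items()}

# Crude risk ratio: add the strata together and ignore age
crude_counts = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*crude_counts)

for age, rr in stratum_rr.items():
    print(f"RR in {age} stratum = {rr:.2f}")
print(f"crude RR (age ignored) = {crude_rr:.2f}")
```

In this constructed data set the stratum-specific risk ratios are 1 while the crude risk ratio is well above 1, which is the spurious association the slide describes.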
Effect of a confounder

• could be large
• may produce an over- or underestimate of the true effect
• may change the apparent direction of the effect

Confounding
Two forms
– Positive confounding = the risk ratio or risk difference is increased from the true value by the effect of the confounding variable

– Negative confounding = the risk ratio or risk difference is brought closer to the null by the effect of the confounding variable
Mediator and Confounding
• Not every factor that is associated with both the
exposure and the disease is a confounding variable.

• It could be a mediating variable

• A mediator is also associated with both the dependent and independent variables, but is part of the causal chain between the independent and dependent variables.

• A mediator is difficult to distinguish from a confounder statistically; the two are differentiated based on biological knowledge of the process of action.
Mediation
• Like a confounder, a mediator is associated with both the exposure and the outcome, but it lies on the path of action.

• It is distinguished by careful consideration of causal pathways.

• Knowledge of biological plausibility about the mediator is necessary

Cigarette (exposure) → Fibrinogen (mediator) → Atherosclerosis (outcome)
Example
• How to control for confounding

• Observed statistics: There is an association between an exposure (coffee drinking) and a disease (myocardial infarction)

A → B
Coffee drinking → Myocardial infarction
Steps
Step 1. Is there an association?
Heavy coffee drinking is statistically significantly associated with higher rates of MI. Is coffee then a cause of MI?

Step 2. Identify potential confounders
Could cigarette smoking be a confounder?

Step 3. Is the potential confounder associated with the exposure?
Heavy coffee drinking is associated with higher rates of smoking. Thus, smoking fulfils one criterion for potential confounding
Cont…
Step 4. Is the potential confounder
associated with the disease of interest?
• Smoking is associated with higher rates of MI.
Smoking fulfils the second criterion for
potential confounding.

Step 5. What happens when we control for cigarette smoking?
• Adjustment for cigarette smoking eliminates the association of heavy coffee drinking and MI. The association is explained by the fact that more coffee drinkers are also smokers.
• Conclusion:

• Does the association seem true (causal)? ……No

Coffee drinking ← Cigarette smoking → MI
Coffee drinking is not a cause of myocardial infarction
Example II
• Observed association

• There is an association between an exposure (obesity) and a disease (MI)

A → B
Obesity → MI
Steps
Step 1. Is there an association?
Obesity is statistically significantly associated with higher rates of MI. Is obesity then a cause of MI?

Step 2. Identify potential confounders
Could cholesterol level be a confounder?

Step 3. Is the potential confounder associated with the exposure?
Obesity is associated with higher levels of cholesterol
Cont…
Step 4. Is the potential confounder associated
with the disease of interest?
• Cholesterol level is associated with higher
rates of MI.

Step 5. What happens when we control for cholesterol level?
• Adjustment for cholesterol level eliminates the association of obesity and MI.
Cont…
• Conclusion

Obesity → Cholesterol → MI

• Does the association seem true (causal)? ……Yes

• We should not conclude that obesity is not a real cause of myocardial infarction, because cholesterol level may be part of the pathway (mediator) from obesity to MI.
Interaction (effect modification)
• Two or more factors acting together to
cause, prevent or control a disease

• The effect of two or more causes acting together is often greater than would be expected on the basis of summing the individual effects.

• Example
– Smoking and asbestos dust vs. lung cancer.
Interaction
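• Factor A alone has RR = 2.0 for developing disease D

• Factor B alone has RR = 1.7 for developing disease D

• The combination of factors A and B has RR = 4.5 for developing disease D

• Simply summing the two RRs would give 3.7 (counting only the excess risks, 2.0 + 1.7 − 1 = 2.7); the observed 4.5 exceeds both, which indicates interaction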
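The comparison on this slide can be written out as a short calculation. The three relative risks are the ones quoted above; the two "expected" formulas are common textbook conventions that the slides do not spell out, so treat them as assumptions: the additive expectation counts only excess risks (RR_A + RR_B − 1), and the multiplicative expectation is the product of the two RRs. The plain sum of the two RRs, as used on the slide, is shown alongside.

```python
# Relative risks quoted on the slide
rr_a = 2.0               # factor A alone
rr_b = 1.7               # factor B alone
rr_joint_observed = 4.5  # A and B together

# Plain sum of the two relative risks (the slide's "summative" figure)
plain_sum = rr_a + rr_b

# Additive expectation on the excess-risk scale (assumed convention)
expected_additive = 1 + (rr_a - 1) + (rr_b - 1)

# Multiplicative expectation (assumed convention)
expected_multiplicative = rr_a * rr_b

# The observed joint RR is compared against every benchmark
exceeds_all = rr_joint_observed > max(plain_sum, expected_additive, expected_multiplicative)

print(f"plain sum               = {plain_sum:.1f}")
print(f"expected, additive      = {expected_additive:.1f}")
print(f"expected, multiplicative = {expected_multiplicative:.1f}")
print(f"observed joint RR       = {rr_joint_observed:.1f}")
print(f"observed exceeds all benchmarks: {exceeds_all}")
```

Whichever benchmark is used, the observed joint RR is larger, which is consistent with the slide's reading that the two factors interact.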


In the Study Design:

• Randomization

• Restriction

• Matching

In the Analysis:

• Stratification

• Multivariable Adjustment

• Propensity Scores
In the Study Design:

1. Randomization - rarely possible except in an RCT

2. Restriction
- Restricts the study to a certain population (e.g. one gender, a certain age group)
- Reduces the eligible subject pool
- Requires a narrow range on the restriction variables
- Some restriction variables may actually be of scientific interest, e.g. gender
3. Matching
- A technique that selects subjects so that the distribution of potential confounders is similar in both groups
- Can be used in any design but is most often used in case-control studies, where ‘n’ is smaller
- Matching can be expensive and time consuming
- Can limit the ability of the study to investigate the matching factors themselves
- Only controls confounding by the matching factors
In the Analysis:
1. Stratification

Controls confounding and helps describe effect modification

[Diagram: the B → D association examined separately in stratum A+ and stratum A−]

• Observing RR and RD at stratified levels of a third variable (A) describes joint effects and controls confounding
Steps in stratification
• Step 1: Do analysis crudely

• Step 2: Do analysis after stratifying (in the presence and absence of the potential confounder)

• Step 3: Compare the stratified and combined results
• Interpretation of result:
– If there is little or no difference between the stratified and combined results, the potential confounder is not a real confounder.

– If the combined effect is higher than their additive effect, there is an interaction
Example
• Association thought: Alcohol vs. MI
• Possible confounder: cigarette smoking

– Step 1: Do analysis crudely

Combined
              Outcome +   Outcome -
Exposed +        100          50
Exposed -        700        1000

OR = (100 × 1000) / (50 × 700) = 2.9
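The crude odds ratio can be checked in a few lines. The counts are those of the combined table; the 95% confidence interval is an addition not shown on the slide, computed here with the log-based (Woolf) approximation as one common way of judging the role of chance.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    The interval uses the standard error of log(OR) (Woolf approximation).
    """
    estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Combined (crude) table from the alcohol vs. MI example
crude_or, ci_lower, ci_upper = odds_ratio_with_ci(100, 50, 700, 1000)
print(f"crude OR = {crude_or:.1f} (95% CI {ci_lower:.1f} to {ci_upper:.1f})")
```

An interval that excludes 1 suggests the crude association is unlikely to be due to chance alone; it says nothing yet about bias or confounding.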
Example
• Association thought: Alcohol vs. MI
• Possible confounder: cigarette smoking

• Step 2: Do analysis after stratifying (in the presence and absence of the potential confounder)

Stratified
                   Smokers                 Non-smokers
              Outcome +  Outcome -    Outcome +  Outcome -
Exposed +         70         20           30         30
Exposed -        200        500          500        500

              OR = 8.7                OR = 1.0


Example
• Association thought: Alcohol vs. MI
• Possible confounder: cigarette smoking

– Step 3: Compare the stratified and combined results
– Interpretation of result:
• If there is little or no difference between the stratified and combined results, the potential confounder is not a real confounder.

• If the combined effect is higher than their additive effect, there is an interaction

Combined: OR = 2.9
Stratified:
Presence of confounder (smokers): OR = 8.7
Absence of confounder (non-smokers): OR = 1.0
Conclusion
The stratified results differ considerably from the combined result, so smoking distorts the crude association. Because the two stratum-specific ORs also differ markedly from each other, smoking additionally modifies the effect (interaction).
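The three steps can be reproduced with the counts from the example tables. The Mantel-Haenszel pooled odds ratio at the end is not part of the slides; it is added as one standard way of summarising strata, and it assumes a common odds ratio across strata, which these data do not really support.

```python
def odds_ratio(a, b, c, d):
    """a, b: exposed with/without outcome; c, d: unexposed with/without outcome."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio over a list of 2x2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Stratified tables from the alcohol vs. MI example
strata = {
    "smokers": (70, 20, 200, 500),
    "non_smokers": (30, 30, 500, 500),
}

# Step 1: crude analysis (strata added together)
crude_table = [sum(column) for column in zip(*strata.values())]
crude_or = odds_ratio(*crude_table)

# Step 2: stratum-specific analysis
stratum_or = {name: odds_ratio(*table) for name, table in strata.items()}

# Step 3: compare crude, stratum-specific and pooled estimates
pooled_or = mantel_haenszel_or(list(strata.values()))

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_or.items():
    print(f"OR in {name} = {value:.2f}")
print(f"Mantel-Haenszel pooled OR = {pooled_or:.2f}")
```

Because the two stratum-specific odds ratios are so different from each other, reporting them separately is usually more informative here than any single pooled figure.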
In the Study Design:
• Randomization ✓
• Restriction ✓
• Matching ✓
In the Analysis:
• Stratification ✓
• Multivariable Adjustment
• Propensity Scores
2. Multivariable Adjustment

• Linear regression model:

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$

assumes an ADDITIVE RISK association E → D

• Logistic regression model:

$\log\left(\frac{Y}{1 - Y}\right) = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$

assumes a MULTIPLICATIVE RISK association E → D

• Proportional Hazards - Survival regression model:

$\log \text{Incidence Rate}(t) = \alpha(t) + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$

assumes a MULTIPLICATIVE RISK association E → D at time (t)
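As a rough sketch of multivariable adjustment, the logistic model above can be fitted to the alcohol, smoking and MI counts from the stratification example. Everything about the fitting routine is an assumption made for illustration: it is a hand-written Newton-Raphson fit on grouped data rather than a call to any statistics package, and the model contains main effects only, so it assumes no alcohol-by-smoking interaction.

```python
import math

# Grouped data from the stratified example: (alcohol, smoker, n_total, n_with_mi)
cells = [
    (1, 1, 90, 70),
    (0, 1, 700, 200),
    (1, 0, 60, 30),
    (0, 0, 1000, 500),
]


def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        residual = aug[i][n] - sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = residual / aug[i][i]
    return solution


def fit_logistic(cells, iterations=25):
    """Maximum-likelihood fit of logit(p) = b0 + b1*alcohol + b2*smoker."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        gradient = [0.0] * 3
        information = [[0.0] * 3 for _ in range(3)]
        for alcohol, smoker, n_total, n_events in cells:
            x = [1.0, float(alcohol), float(smoker)]
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(3):
                gradient[i] += x[i] * (n_events - n_total * p)
                for j in range(3):
                    information[i][j] += n_total * p * (1 - p) * x[i] * x[j]
        step = solve_linear(information, gradient)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


beta = fit_logistic(cells)
adjusted_or_alcohol = math.exp(beta[1])  # OR for alcohol, adjusted for smoking
print(f"smoking-adjusted OR for alcohol = {adjusted_or_alcohol:.2f}")
```

Exponentiating a coefficient gives the odds ratio for that variable with the others held fixed, which is the multiplicative interpretation noted on the slide. In practice an established statistics package would be used instead of a hand-written fit.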


In the Study Design:
• Randomization ✓
• Restriction ✓
• Matching ✓
In the Analysis:
• Stratification ✓
• Multivariable Adjustment ✓
• Propensity Scores
Propensity scores

• Can be used when there are a large number of confounding factors.

• Adding confounders singly to the models is the usual method.

• A propensity score is a regression-based summary of the confounders (the predicted probability of exposure given the confounders) on which the analysis of the primary variable can be stratified.
Thank you
