Epidemiologic Study Designs

Department of Public Health


Gulu University
23/05/2023

Dr. Akera Peter


Introduction
• Choosing the appropriate study design is a crucial step in an
epidemiological investigation.

• Each study design has strengths and weaknesses.

• Epidemiologists must consider all sources of bias and confounding,
and strive to reduce them.

• Ethical issues are important in epidemiology, as in other sciences.


Introduction
• Epidemiological studies can be classified as either observational or
experimental.

• Studies may have one or more names (alternate names)

• Each has a unit of study: population, individual, group or community


Observational studies
• Observational studies allow
nature to take its course: the
investigator measures but
does not intervene.

• They include studies that
can be called descriptive or
analytical
Descriptive study
A descriptive study
• is limited to a description of the occurrence of a disease in a
population and is often the first step in an epidemiological
investigation.
• It provides the What, Who, When and Where of health-related events

Example survey questions:
• Who (Please name all the people in the household, starting with the
oldest. Don’t forget to include yourself.)
• What (What type of insecticide-treated mosquito net do people use?)
• When (How long ago was the net last soaked or dipped?)
• How many (Was (name) given medicine for malaria by the health worker
or at the health facility?)
• Where (Name of health facility)
Analytical studies
• An analytical study goes further by analysing relationships
between health status and other variables.
• Used to search for causes and other factors that influence the
occurrence of health-related events.

[Diagram: ITN use, education, net soaked, age and type of ITN shown as
factors related to malaria]
Note !
• Apart from the simplest descriptive studies, almost all epidemiological
studies are analytical in character. WHY?
• Pure descriptive studies are rare, but descriptive data in reports of health
statistics are a useful source of ideas for epidemiological studies.
• Limited descriptive information (such as that provided in a case series)
often stimulates the initiation of a more detailed epidemiological study
• E.g. the description in 1981 of four young men with a previously rare
form of pneumonia was the first in a wide range of epidemiological
studies on the condition that became known as AIDS
• Nodding disease??????
Experimental studies
• Experimental or intervention studies involve an active attempt to
change a disease determinant – such as an exposure or a behaviour –
or the progress of a disease through treatment
• Similar in design to experiments in other sciences.
• However, they are subject to extra constraints, since the health of the
people in the study group may be at stake. Major experimental study
designs include the following:
1. Randomized controlled trials (RCTs) using patients as subjects (clinical trials),
2. Field trials in which the participants are healthy people, and
3. Community trials in which the participants are the communities themselves.
Note!
• In all epidemiological studies it is essential to have a clear definition of
a case of the disease being investigated by delineating the symptoms,
signs or other characteristics indicating that a person has the disease.
• A clear definition of an exposed person is also necessary.
• This definition must include all the characteristics that identify a
person as being exposed to the factor in question.
• In the absence of clear definitions of disease and exposure, it is very
difficult to interpret the data from an epidemiological study.
Epidemiological studies

Observational studies
• Descriptive studies
Ecological/correlational
Cross-sectional/prevalence
• Analytical studies
Case control/case reference
Cohort/follow-up

Experimental studies
Randomized controlled trials/clinical trials
Cluster randomized controlled trials
Field trials
Community trials/community intervention studies
Descriptive studies

• A simple description of the health status of a community, based on
routinely available data or on data obtained in special surveys, e.g. the UDHS
• Often the first step in an epidemiological investigation. E.g. HIV prevalence
figures UG?
• Pure descriptive studies make no attempt to analyse the links
between exposure and effect.
• They are usually based on mortality statistics and may examine
patterns of death by age, sex or ethnicity during specified time
periods or in various countries/communities.
Ecological studies
• Ecological (or correlational) studies are useful for generating hypotheses.
• The units of analysis are groups of people rather than individuals.
• Can also be done by comparing populations in different places at the
same time or, in a time series, by comparing the same population in one
place at different times.
• What hypothesis from Table 1?

Table 1: Percentage of children age 6-59 months with Hb lower than 8.0 g/dl, UG 2011
Region          Hemoglobin <8.0 g/dl (%)
Kampala         1.4
Central 1       6.2
Central 2       3.3
East Central    8.9
Eastern         7.9
Karamoja        6.4
North           0.4
West Nile       5.2
Western         3.0
South West      0.4
Ecological studies

• An observation as seen above would need to be tested by controlling
for all the potential confounders to exclude the possibility that other
characteristics of the different populations, such as poverty or other
diseases, accounted for the relationship.
• Simple to conduct
• Rely on data collected for other purposes
• Difficult to interpret since it is seldom possible to examine directly the
various potential explanations for findings
• Since the unit of analysis is a group, the link between exposure and
effect at the individual level cannot be made
Cross-sectional studies

• Measure the prevalence of disease and thus are often called prevalence
studies.
• In a cross-sectional study the measurements of exposure and effect are
made at the same time.
• It is not easy to assess the reasons for associations shown in cross-sectional
studies.
• The key question to be asked is whether the exposure precedes or follows
the effect. If the exposure data are known to represent exposure before
any effect occurred, the data from a cross-sectional study can be treated
like data generated from a cohort study.
• Poverty vs disease ??? Discuss
Cross-sectional studies
• Relatively easy and inexpensive to conduct
• Useful for investigating exposures that are fixed characteristics of
individuals, such as ethnicity or blood group.
• In sudden outbreaks of disease, a cross-sectional study measuring several
exposures can be the most convenient first step in investigating the cause.
• Helpful in assessing the health care needs of populations.
• Data from repeated cross-sectional surveys provide useful indications
of trends, e.g. sleeping under a net in 1990, 1995, 2000 etc.
Cross-sectional studies
• Each survey should have a clear purpose.
• Valid surveys need:
• well-designed questionnaires
• an appropriate sample of sufficient size
• a good response rate.
• The frequency of disease and risk factors can then be examined
in relation to age, sex, education level, poverty, ethnicity etc.
Analytical studies
• The outcome of any analytical study is usually the conclusion that
“a disease and its suspected cause are, or are not, associated”.
• Attempts to provide the Why and How of such events by comparing
groups with different rates of disease occurrence and with differences
in demographic characteristics, genetic or immunologic make-up,
behaviors, environmental exposures and other risk factors
Case-control studies

• Provide a relatively simple way to investigate causes of diseases,
especially rare diseases.
• Include people with a disease (or other outcome variable) of interest
and a suitable control (comparison or reference) group of people
unaffected by the disease or outcome variable.
• The study compares the occurrence of the possible cause in cases and
in controls.
• The investigators collect data on disease occurrence at one point in
time and exposures at a previous point in time.
Case-control studies
• Case-control studies are longitudinal.
• They have been called retrospective studies, since the investigator is
looking backward from the disease to a possible cause.
• This can be confusing because the terms retrospective and
prospective are also used to describe the timing of data collection in
relation to the current date.
• In this sense a case-control study may be either retrospective, when
all the data deal with the past, or prospective, in which data collection
continues with the passage of time.
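Case-control studies
[Figure: design of a case-control study. The direction of inquiry runs
backward in time: start with cases and controls drawn from the population,
then determine who in each group was exposed and who was not exposed.]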
Selection of cases and controls

• Cases should represent all the cases in a specified population
group.
• Cases are selected on the basis of disease, not exposure.
• Controls are people without the disease.
• The controls should represent people who would have been
designated study cases if they had developed the disease.
• Case control studies can estimate relative risk of disease
Exposure

• An important aspect of case-control studies is the determination of
the start and duration of exposure for cases and controls.
• Exposure status of the cases is usually determined after the
development of the disease (retrospective data)
• Obtained by direct questioning of the affected person or a relative or friend
• The informant’s answers may be influenced by knowledge about the
hypothesis under investigation or the disease experience itself.
Exposure
• Exposure is sometimes determined by biochemical measurements (e.g.
lead in blood or cadmium in urine), which may not accurately reflect
the relevant past exposure.
• For example, lead in blood at age 6 years is not a good indicator of
exposure at age 1 to 2 years, which is the age of greatest sensitivity to
lead.
• This problem can be avoided if exposure can be estimated from an
established recording system (e.g. stored results of routine blood
testing or employment records) or if the case-control study is carried
out prospectively so that exposure data are collected before the disease
develops.
Example of a case-control study
• Researchers in Papua New Guinea compared the history of meat
consumption in people who had enteritis necroticans, with people
who did not have the disease.

• Proportionately more people who had the disease (50 of 61 cases)
reported prior meat consumption than those who were not affected
(16 of 57).
• What would the 2 x 2 table look like?
• What is the odds ratio?
                                   Exposure (recent meat ingestion)
Disease (enteritis necroticans)    Present    Absent    Total
Present                            50         11        61
Absent                             16         41        57
Total                              66         52        118
Odds Ratio
• The association of an exposure and a disease (relative risk) in a case-
control study is measured by calculating the odds ratio (OR), which is
the ratio of the odds of exposure among the cases to the odds of
exposure among the controls.
• Odds of exposure among cases = 50/11
• Odds of exposure among controls= 16/41
• OR = (50/11) / (16/41) = (50 x 41) / (11 x 16) = 11.6
• Meaning: the odds of having recently eaten meat were 11.6 times higher
among the cases than among the controls
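
The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the confidence interval uses the common log (Woolf) approximation, which is an assumption on our part since the slides do not say which method to use.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower limit, upper limit).

    The interval is the log-based (Woolf) approximation; z=1.96
    corresponds to an approximate 95% confidence level.
    """
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    or_ = odds_cases / odds_controls
    # Standard error of ln(OR): square root of the summed reciprocals of the cells
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


# Counts from the enteritis necroticans example
or_, lower, upper = odds_ratio_with_ci(50, 11, 16, 41)
print(f"OR = {or_:.1f}, approximate 95% CI {lower:.1f} to {upper:.1f}")
```

An interval that excludes 1 suggests the association is unlikely to be due to chance alone, although it says nothing about bias or confounding.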
Odds Ratio
• The odds ratio is very similar to the risk ratio, particularly if a disease
is rare.
• For the odds ratio to be a good approximation, the cases and controls
must be representative of the general population with respect to
exposure.
• However, because the incidence of disease is unknown, the absolute
risk can not be calculated.
• An odds ratio should be accompanied by the confidence interval
observed around the point estimate
Cohort studies

• Also called follow-up or incidence studies


• Begin with a group of people who are free of disease, and who are
classified into subgroups according to exposure to a potential cause of
disease or outcome
• Variables of interest are specified and measured and the whole
cohort is followed up to see how the subsequent development of new
cases of the disease (or other outcome) differs between the groups
with and without exposure.
• Because the data on exposure and disease refer to different points in
time, cohort studies are longitudinal, like case control studies
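Cohort studies
[Figure: design of a cohort study. The direction of inquiry runs forward
in time: people without disease are drawn from the population, classified
as exposed or not exposed, and followed up to see who develops disease and
who does not.]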
Cohort studies
• Cohort studies have been called prospective studies, but this
terminology is confusing and should be avoided.

• As mentioned previously, the term “prospective” refers to the timing
of data collection and not to the relationship between exposure and
effect.

• Thus there can be both prospective and retrospective cohort studies.


Cohort studies
• Provide the best information about the causation of disease and the
most direct measurement of the risk of developing disease.
• Although conceptually simple, cohort studies are major undertakings
and may require long periods of follow-up since disease may occur a
long time after exposure
• For example, the induction period for leukaemia caused by radiation
(i.e. the time required for the specific cause to produce an outcome)
is many years and it is necessary to follow up study participants for a
long time.
Cohort studies
• Many exposures investigated are long-term in nature and accurate
information about them requires data collection over long periods.
• As cohort studies start with exposed and unexposed people, the
difficulty of measuring or finding existing data on individual exposures
largely determines the feasibility of doing one of these studies.
• If the disease is rare in the exposed group as well as the unexposed
group there may also be problems in obtaining a large enough study
group.
Cohort studies
• The expense of a cohort study can be reduced by using routine
sources of information about mortality or morbidity, such as disease
registers or national registers of deaths, as part of the follow-up.
• Example:
• In 1976, 121 700 married female nurses aged 30–55 years completed
the initial Nurses’ Health Study questionnaire.
• Every two years, self-administered questionnaires were sent to these
nurses, who supplied information on their health behaviours and
reproductive and medical histories.
Cohort studies
• The initial cohort was enrolled with the objective of evaluating the
health effects of oral contraceptive use.
• Investigators tested their methods on small subgroups of the larger
cohort, and obtained information on disease outcomes from routine
data sources.
• In addition to studying the relationship between oral contraceptive
use and the risk of ovarian and breast cancer, they were also able to
evaluate other diseases in this cohort – such as heart disease and
stroke, and the relationship between smoking and the risk of stroke
Historical cohort studies
• Costs can occasionally be reduced by using a historical cohort
(identified on the basis of records of previous exposure).

• This type of investigation is called a historical cohort study, because
all the exposure and effect (disease) data have been collected before
the actual study begins.
Historical cohort studies
• For example, records of military personnel exposure to radioactive
fall-out at nuclear bomb testing sites have been used to examine the
possible causal role of fall-out in the development of cancer over the
past 30 years.

• This sort of design is relatively common for studies of cancer related
to occupational exposures.
Nested case-control studies
• The nested case-control design makes cohort studies less expensive.

• The cases and controls are both chosen from a defined cohort, for
which some information on exposures and risk factors is already
available
• Additional information on new cases and controls, particularly
selected for the study, is collected and analysed.
• This design is particularly useful when measurement of exposure is
expensive.
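Nested case-control studies
[Figure: design of a nested case-control study. People without disease are
drawn from the population and followed up; those who develop disease become
the cases, and a sample of those with no disease become the controls.]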


Nested case-control studies
• To determine if infection with Helicobacter pylori was associated with
gastric cancer, investigators used a cohort of 128 992 people that had
been established in the mid-1960s.

• By 1991, 186 people in the original cohort had developed gastric
cancer.

• The investigators then did a nested case-control study by selecting the
186 people with gastric cancer as cases and another 186 cancer-free
individuals from the same cohort as controls.
Nested case-control studies
• H. pylori infection status was determined retrospectively from serum
samples that had been stored since the 1960s.

• 84% of people with gastric cancer, and only 61% of the controls, had
been infected previously with H. pylori, suggesting a positive
association between H. pylori infection and gastric cancer risk
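
As a rough illustration, the two percentages above can be turned into a crude odds ratio. This is a sketch under assumptions: the slides report only the percentages, not this odds ratio, and a real nested case-control analysis would normally account for any matching of cases and controls. The function name is hypothetical.

```python
def odds(proportion):
    """Convert a proportion (0 to 1) into odds."""
    return proportion / (1 - proportion)


exposed_among_cases = 0.84      # previously infected, gastric cancer cases
exposed_among_controls = 0.61   # previously infected, cancer-free controls

# Crude (unmatched) odds ratio of prior H. pylori infection
crude_or = odds(exposed_among_cases) / odds(exposed_among_controls)
print(f"Crude OR = {crude_or:.2f}")
```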
General principles
• The outcome of any analytical study is usually the conclusion that
“a disease and its suspected cause are, or are not, associated”.

• The simplest method of presenting an association is by means of a
contingency (2 x 2) table
General principles
• 2 x 2 tables are concise & easy to analyse statistically

• NB Only appropriate if the disease can be recorded as present or
absent (i.e. dichotomous categorical variables)

• Associations involving continuous data that cannot be grouped are
analysed using other statistical indices (correlation, ANOVA, etc)
Contingency table for case-control & cohort studies

                      Disease
Exposure    Present    Absent    Total
Present     a          b         a+b
Absent      c          d         c+d
Total       a+c        b+d       a+b+c+d


What do the cells mean?
a = subjects with both the risk factor & the disease
b = subjects with the risk factor, but not the disease
c = subjects with the disease but not the risk factor
d = subjects with neither the risk factor nor the disease
What do the cells mean?
a + b = all subjects with the risk factor
c + d = all subjects without the risk factor
a + c = all subjects with the disease
b + d = all subjects without the disease

a + b + c + d = all study subjects
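
The cell definitions above map directly onto the two usual measures of association. The sketch below is illustrative only: the class name and the cohort counts are hypothetical, not taken from any study. The risk ratio is meaningful for cohort data, where a + b and c + d are true denominators; the odds ratio can be computed for both designs.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    a: int  # risk factor present, disease present
    b: int  # risk factor present, disease absent
    c: int  # risk factor absent, disease present
    d: int  # risk factor absent, disease absent

    def risk_ratio(self) -> float:
        """Risk in the exposed divided by risk in the unexposed (cohort studies)."""
        return (self.a / (self.a + self.b)) / (self.c / (self.c + self.d))

    def odds_ratio(self) -> float:
        """Cross-product ratio ad/bc (case-control and cohort studies)."""
        return (self.a * self.d) / (self.b * self.c)


# Hypothetical cohort: 100 exposed and 100 unexposed people followed up
hypothetical_cohort = TwoByTwo(a=30, b=70, c=10, d=90)
print(f"Risk ratio = {hypothetical_cohort.risk_ratio():.2f}")
print(f"Odds ratio = {hypothetical_cohort.odds_ratio():.2f}")
```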


Experimental epidemiology

• Intervention or experimentation involves attempting to change a
variable in one or more groups of people.
• This could mean the elimination of a dietary factor thought to cause
allergy, or testing a new treatment on a selected group of patients
• The effects of an intervention are measured by comparing the
outcome in the experimental group with that in a control group.
• An interventional study is usually designed as a randomized
controlled trial, a field trial, or a community trial.
Randomized controlled trials

• A randomized controlled trial is an epidemiological experiment
designed to study the effects of a particular intervention, usually a
treatment for a specific disease (clinical trial).
• Subjects in the study population are randomly allocated (by chance)
to intervention and control groups, and the results are assessed by
comparing outcomes.
• If the initial selection and randomization is done properly, the control
and treatment groups will be comparable at the start of the
investigation
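
Random allocation can be illustrated with a minimal sketch of simple (unrestricted) randomization into two equal groups. The function name, the participant IDs and the seed are hypothetical; real trials typically use concealed allocation and often block or stratified schemes, which are not shown here.

```python
import random


def randomize(participant_ids, seed=None):
    """Shuffle participants by chance and split them into two groups."""
    rng = random.Random(seed)       # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Twenty hypothetical participants numbered 1 to 20
groups = randomize(range(1, 21), seed=2023)
print(groups["intervention"])
print(groups["control"])
```

Because chance alone decides the allocation, known and unknown risk factors tend to be balanced between the groups as the trial gets larger.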
Field trials

• In contrast to clinical trials, field trials involve people who are healthy
but presumed to be at risk; data collection takes place “in the field,”
usually among non-institutionalized people in the general population
• Since the subjects are disease-free and the purpose is to prevent
diseases that may occur with relatively low frequency, field trials are
often logistically complicated and expensive endeavours.
• Field trials can be used to evaluate interventions aimed at reducing
exposure without necessarily measuring the occurrence of health
effects
Field trials
• One of the largest field trials was that testing the Salk vaccine for the
prevention of poliomyelitis, which involved over one million children
• Used to evaluate interventions aimed at reducing exposure without
necessarily measuring the occurrence of health effects.
• For instance, measurement of blood lead levels in children has shown
the protection provided by elimination of lead paint in the home
environment.
• Such intervention studies can be done on a smaller scale, and at lower
cost, as they do not involve lengthy follow-up or measurement of
disease outcomes.
Community trials

• In this form of experiment, the treatment groups are communities
rather than individuals.
• This is particularly appropriate for diseases that are influenced by
social conditions, and for which prevention efforts target group
behaviour.
• Prevention efforts/interventions like mass media, health policy,
environment, community sports facilities
• Cardiovascular disease is a good example of a condition appropriate
for community trials
Limitations of community trials

• A limitation of such studies is that only a small number of
communities can be included
• Random allocation of communities is usually not practicable;
other methods are required to ensure that any differences found at
the end of the study can be attributed to the intervention rather than
to inherent differences between communities
• It is difficult to isolate the communities where intervention is taking
place from general social changes that may be occurring.
Limitations of community trials
• Design limitations, especially in the face of unexpectedly large,
favourable risk factor changes in control sites, are difficult to
overcome.

• As a result, definitive conclusions about the overall effectiveness of
the community-wide efforts are not always possible.
Potential errors in epidemiological studies
• Epidemiological investigations aim to provide accurate measures of
disease occurrence (or other outcomes).

• However, there are many possibilities for errors in measurement. Such
as ...?

• It is necessary to minimize errors and to assess the impact of errors that
cannot be eliminated.

• Sources of error can be random or systematic.


Random error
• Random error is when a value of the sample measurement diverges
due to chance alone from that of the true population value.
• Random error causes inaccurate measures of association.
• There are three major sources of random error:
• individual biological variation;
• sampling error; and
• measurement error.

• Random error can never be completely eliminated.


Random error
• Individual biological variation
• Individual variation always occurs.
• Sampling error
• caused by the fact that a small sample is not representative of all the
population’s variables.
• The best way to reduce sampling error is to increase the size of the study.
• Measurement error
• No measurement is perfectly accurate
• Can be reduced by stringent protocols, and by making individual measurements
as precise as possible.
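
The effect of study size on sampling error can be seen in a small simulation. This is a sketch with made-up numbers: the true prevalence of 10%, the sample sizes and the number of repeated surveys are all assumptions chosen for illustration.

```python
import random
import statistics


def spread_of_sample_prevalence(true_prevalence, sample_size,
                                n_surveys=500, seed=1):
    """Standard deviation of the prevalence estimated in repeated surveys."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_surveys):
        # Each sampled person has the disease with probability true_prevalence
        cases = sum(rng.random() < true_prevalence for _ in range(sample_size))
        estimates.append(cases / sample_size)
    return statistics.stdev(estimates)


small_study = spread_of_sample_prevalence(0.10, sample_size=50)
large_study = spread_of_sample_prevalence(0.10, sample_size=2000)
print(f"Spread with n=50:   {small_study:.3f}")
print(f"Spread with n=2000: {large_study:.3f}")
```

The estimates from the larger surveys cluster much more tightly around the true value, which is what "reducing sampling error" means in practice.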
Sample size
• The sample size must be large enough for the study to have sufficient
statistical power to detect the differences deemed important.
• Sample size calculations can be done with standard formulae
• Need the following information:
• required level of statistical significance for detecting a difference
• acceptable error, or chance of missing a real effect
• magnitude of the effect under investigation
• amount of disease in the population
• relative sizes of the groups being compared.
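
One of the standard formulae, sketched here for comparing two proportions, shows how these inputs combine. The assumptions are ours, not the slides': a two-sided test, two groups of equal size, and the simple normal-approximation formula; the function name and the example proportions are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of subjects needed in each of two equal groups.

    p1, p2 : expected proportions with the outcome in the two groups
    alpha  : significance level (two-sided)
    power  : 1 minus the chance of missing a real effect
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical example: outcome expected in 20% of one group and 10% of the other
n = sample_size_per_group(0.20, 0.10)
print(n)
```

Smaller expected differences between the groups, or a higher required power, increase the sample size sharply.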
Sample size
• In reality, sample size is often determined by logistical and financial
considerations, and a compromise must always be made between
sample size and costs.

• The precision of a study can also be improved by ensuring that the
groups are of appropriate relative size.

• This is often an issue of concern in case-control studies when a
decision is required on the number of controls to be chosen for each
case.
Systematic error
• Systematic error (or bias) occurs in epidemiology when results differ
in a systematic manner from the true values.
• A study with a small systematic error is said to have a high accuracy.
• Accuracy is not affected by sample size.
• Sources of systematic error in epidemiology are many (over 30 have
been identified).
• The principal biases are:
• selection bias
• measurement (or classification) bias.
Systematic error
• Selection bias occurs when there is a systematic difference between
the characteristics of the people selected for a study and the
characteristics of those who are not.

• An obvious source of selection bias occurs when participants select
themselves for a study, either because they are unwell or because
they are particularly worried about an exposure e.g. helmet use
Systematic error
• An important selection bias is introduced when the disease or factor
under investigation itself makes people unavailable for study. E.g. the
factory workers most affected by exposure to toxic chemicals are the most
likely to have left because of illness
• Similarly, if a study is based on examinations done in a health centre
and there is no follow-up of participants who do not turn up, biased
results may be produced: unwell patients may be in bed either at
home or in hospital.
• All epidemiological study designs need to account for selection bias.
Systematic error
• Measurement bias occurs when the individual measurements or
classifications of disease or exposure are inaccurate i.e., they do not
measure correctly what they are supposed to measure.
• There are many sources of measurement bias, and their effects are of
varying importance.
• E.g. biochemical or physiological measurements are never completely
accurate
• Different laboratories often produce different results on the same
specimen.
• Minimized by samples being analyzed randomly by different laboratories
Systematic error
• A form of measurement bias of particular importance in retrospective
case-control studies is known as recall bias.
• There is a differential recall of information by cases and controls
• e.g. cases may be more likely to recall past exposure, especially if it is widely
known to be associated with the disease under study – for example, lack of
exercise and heart disease.
• Recall bias can either exaggerate or underestimate the degree of effect
associated with the exposure.
Systematic error
• If the investigator, laboratory technician or the participant knows the
exposure status, this knowledge can influence measurements and
cause observer bias.
• To avoid this bias, measurements can be made in a blind or double-
blind fashion.
• A blind study means that the investigators do not know how participants are
classified.
• A double-blind study means that neither the investigators, nor the
participants, know how the latter are classified.
Confounding

• Major issue
• In a study of the association between
exposure to a cause (or risk factor) and
the occurrence of disease, confounding
can occur when another exposure
exists in the study population and is
associated both with the disease and
the exposure being studied.
• Age and social class are often confounders
Confounding
• Confounding is a problem if the risk factor is unequally distributed
between the exposure subgroups.

• Confounding occurs when the effects of two exposures (risk factors)
have not been separated and the analysis concludes that the effect is
due to one variable rather than the other.

• Several methods are available to control confounding, either through
study design or during the analysis of results.
Confounding
• Methods to control confounding in the design of a study are:
• Randomization: equal distribution of risk factors
• Restriction: e.g. in a study on the effects of coffee on coronary heart disease,
participation in the study could be restricted to nonsmokers
• Matching: same age and sex.
• At the analysis stage, confounding can be controlled by:
• Stratification – 10 year age groups, gender, social status etc
• statistical modeling –multivariate statistical models e.g. logistic regression etc.
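
Stratification can be sketched with made-up numbers based on the coffee and heart disease example, with smoking as the confounder. All counts below are hypothetical and were chosen so that coffee has no effect within either stratum. The pooled estimate uses the Mantel-Haenszel odds ratio, a technique the slides do not name; it is used here only as one common way to combine strata.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio for one 2 x 2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata (Mantel-Haenszel estimator)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Each tuple is (a, b, c, d): coffee & disease, coffee & no disease,
# no coffee & disease, no coffee & no disease. Hypothetical counts.
smokers = (80, 320, 20, 80)
nonsmokers = (5, 95, 20, 380)

# Crude table: ignore smoking by adding the two strata cell by cell
crude = tuple(s + n for s, n in zip(smokers, nonsmokers))

print(f"Crude OR       = {odds_ratio(*crude):.2f}")
print(f"OR, smokers    = {odds_ratio(*smokers):.2f}")
print(f"OR, nonsmokers = {odds_ratio(*nonsmokers):.2f}")
print(f"Pooled (M-H)   = {mantel_haenszel_or([smokers, nonsmokers]):.2f}")
```

In this constructed data set the crude odds ratio is well above 1 even though it is exactly 1 within each smoking stratum: the apparent effect of coffee comes entirely from smokers drinking more coffee and having more disease.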
Validity
• Validity is an expression of the degree to which a test is capable of
measuring what it is intended to measure.

• A study is valid if its results correspond to the truth; there should be
no systematic error and the random error should be as small as
possible.

• There are two types of validity: internal and external.


Validity
• Internal validity is the degree to which the results of an observation
are correct for the particular group of people being studied.

• For example, measurements of blood haemoglobin must distinguish
accurately participants with anaemia as defined in the study.

• Internal validity can be threatened by all sources of systematic error
but can be improved by good design and attention to detail.
Validity
• External validity or generalizability is the extent to which the results
of a study apply to people not in it.
• External validity requires external quality control of the
measurements and judgements about the degree to which the results
of a study can be extrapolated.
• This does not require that the study sample be representative of a
reference population.
• For example, evidence that the effect of lowering blood cholesterol in men is
also relevant to women requires a judgment about the external validity of
studies in men.
Validity
• External validity is assisted by study designs that examine clearly-
stated hypotheses in well defined populations.

• The external validity of a study is supported if similar results are found
in studies in other populations

• Ethical issues are another potential source of error in epidemiological
studies; they will be covered in research methods.
Study questions
1. What are the applications and disadvantages of the major
epidemiological study designs?
2. Outline the design of a case-control study and a cohort study to
examine the association of a high-fat diet with hypertension.
