Biostatics and Epidemiology 2022 1

Uploaded by

mohammad418151
Copyright
© © All Rights Reserved

IMG STUDY GROUP

Epidemiology & Biostatistics


Dr. Shakeel Ahmed

Statistics
The science of collecting, monitoring, analyzing, summarizing, and interpreting data.
Biostatistics
Statistics applied to biological (life) problems, including public health, medicine,
ecology, and environmental science.
Populations and Samples
Population – a group of individuals that we would like to know something about.
Studying an entire population is usually too expensive and time-consuming, and thus impractical.
Population parameters are often denoted with Greek letters (μ, σ, ρ).
Sample – a subset of a population; by observing the sample we can learn
something about the population. Sample statistics are often denoted with Roman letters (m, SD).
Incidence and Prevalence
Prevalence = Incidence x Duration

Lead Time Bias


Overestimation of survival time caused by earlier detection (screening program),
not by any effect of treatment
Age 1-30 years: Healthy person
Age 40 years: Onset of disease
Age 55 years: Screening performed, and disease detected
Age 65 years: Overt disease, clinically detected
Age 70 years: Death from disease
Survival with screening: 70 – 55 = 15 years
Survival without screening: 70 – 65 = 5 years
Lead time: 15 – 5 = 10 years
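
The timeline above can be checked with a few lines of Python. This is only a sketch: the function name lead_time and its parameter names are made up for illustration, and it assumes the age at death is the same with or without screening, as in the example.

```python
def lead_time(age_screen_detected, age_clinically_detected, age_at_death):
    """Lead time = apparent survival with screening minus survival without it."""
    survival_with_screening = age_at_death - age_screen_detected
    survival_without_screening = age_at_death - age_clinically_detected
    return survival_with_screening - survival_without_screening

# Ages taken from the timeline: detected at 55 by screening, at 65 clinically, death at 70.
example_lead_time = lead_time(55, 65, 70)
print(example_lead_time)
```

The patient dies at 70 in both scenarios, so the extra "survival" is entirely an artifact of the earlier diagnosis.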

Incidence Rate
Incidence (New Cases) / Population exposed to the Risk
Example: respiratory infections among exposed children / total exposed children

Relative Risk
RR = Incidence Rate in Exposed / Incidence Rate in Non-Exposed
60 / 24 = 2.5
300 / 120 = 2.5
RR = 1: No association
RR > 1: Positive association, Risk Factor
RR < 1: Negative association, Protective Factor
Statistical significance is judged from the confidence interval: an RR is
statistically significant when its 95% CI does not include 1.
The risk of disease is 2.5 times as high in the exposed population as in the non-exposed population.

Attributable Risk
= Incidence Rate in Exposed - Incidence Rate in Non-Exposed
= 60 – 24 = 36

Question No 1
A new blood test has been devised and used in 50 people to look for diabetes
The results are as follows.
True Positive 4
False Positive 16
False Negative 6
True Negative 24
What are the sensitivity and specificity of the test?
What are the positive and negative predictive values?
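
One way to work through this question is sketched below. It uses the same formulas that appear later in these notes (Question No. 16); the function name diagnostic_metrics and the dictionary keys are illustrative choices, not part of the original handout.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased people who test positive
        "specificity": tn / (tn + fp),  # healthy people who test negative
        "ppv": tp / (tp + fp),          # positive results that are true positives
        "npv": tn / (tn + fn),          # negative results that are true negatives
    }

# Counts from Question No 1: TP = 4, FP = 16, FN = 6, TN = 24.
q1 = diagnostic_metrics(tp=4, fp=16, fn=6, tn=24)
for name, value in q1.items():
    print(f"{name}: {value:.0%}")
```

With these counts the sketch gives sensitivity 4/10, specificity 24/40, PPV 4/20 and NPV 24/30.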

Likelihood Ratio (LR)


LR >10 High
LR 5-10 Moderate
LR 2-5 Small
LR 1-2 Rarely Used
If a test has an LR of 1, it does not change the pretest probability of disease.
If the LR is 10, it makes disease more likely.
If the LR is less than 1 (for example 0.1), disease is less likely.
LR+ indicates how much the probability of disease increases if the test is positive.
LR- indicates how much the probability of disease decreases if the test is negative.

Strength of the Test by Likelihood Ratio


Qualitative Strength    LR (+)                             LR (-)
                        Sensitivity / (1 - Specificity)    (1 - Sensitivity) / Specificity
Excellent               10                                 0.1
Good                    5                                  0.5
Useless                 1                                  1
FOBT                    7.33                               0.37
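
The two formulas in the table can be sketched in Python as follows. The function name likelihood_ratios is hypothetical, and the counts are borrowed from Question No 1 earlier in these notes purely as a worked example. The sketch does not guard against a specificity of exactly 1 or 0, which would cause a division by zero.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (LR+, LR-) computed from 2 x 2 table counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative

# Counts from Question No 1: TP = 4, FP = 16, FN = 6, TN = 24.
lr_pos, lr_neg = likelihood_ratios(tp=4, fp=16, fn=6, tn=24)
print(round(lr_pos, 2), round(lr_neg, 2))
```

For that blood test both ratios come out at about 1, which places it in the "Useless" row of the table: a result barely changes the pretest probability.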
Does this dyspneic patient in the emergency department have congestive heart
failure?
A Validated Clinical and Biochemical Score for the Diagnosis of Acute Heart
Failure: the ProBNP Investigation of Dyspnea in the Emergency Department
(PRIDE) Acute Heart Failure Score
Am Heart J 2006;151:48-54

Conclusions
For dyspneic adult emergency patients, a directed history, physical examination,
chest X-ray, and ECG should be performed.
If the suspicion of heart failure remains, obtaining a serum BNP level may be
helpful, especially for excluding heart failure.
A low serum BNP (<100 pg/mL) proved to be the most useful test for excluding heart failure:
LR- = 0.11 (95% CI, 0.07-0.16)

Probability
Chance of something happening, or risk. A statistical way of quantifying
uncertainty
0 - Never
1 - Certain
• Values always lie between 0 and 1
• Chance of heads 50% or 0.5
Risk = Number of Events / Number of People at Risk

Example: Two out of 100 patients in a clinical trial develop a side effect of a drug.
Calculate the risk of the side effect.
Risk = 2/100 (0.02 or 2%)

Basic Statistical Measures


Measures of Central Tendency
Mean
Median
Mode
Measures of Dispersion / Variability
Range
Inter Quartile Range (IQR) = 75th percentile - 25th percentile = 4.0 - 2.5 = 1.5 kg
Confidence Interval (95% CI) = 3 (2.5, 4.5)
Standard Deviation (SD)
Mean, Standard Error of Mean (SEM)
50th Percentile = 3 kg
25th Percentile = 2.5 kg
75th Percentile = 4.0 kg
Example: values 16, 20, 18
Sum = 16 + 20 + 18 = 54
Mean = 54/3 = 18
Deviations from the mean:
18 - 16 = +2
18 - 20 = -2
18 - 18 = 0
Sum of deviations = 0 (deviations from the mean always sum to zero)

Mean
The average value: the sum of the values divided by their number.
Example:
Monthly income of 5 employees is:
1, 3, 4, 2, 5 dollars
Calculate their mean:
Arithmetic mean = sum of values / n
= (1 + 3 + 4 + 2 + 5) / 5
Mean = 15 / 5 = 3
Median
The value that divides the data into two equal sets after arrangement in descending
or ascending order.
The Middle Value
To calculate the median you need to:
1. Arrange the values in ascending or descending order.
2. Determine the location of the median:
(n+1)/2
Odd number of values: the location is a single value
Even number of values: the location is the midpoint between the two middle values
3. Determine the value of the median

Median
The middle value
Example
Number of children in n = 9 families:
6, 4, 5, 0, 1, 3, 5, 2, 2
Calculate the median
▪ Arrange in an ascending order
▪ Pick-up the middle value
0, 1, 2, 2, 3, 4, 5, 5, 6
Median = 3 (the 5th value, since (9 + 1)/2 = 5)
Mode
The most frequent value, least affected by skewness of bell curve
Example
Number of children in nine families
n= 9
3,5,2,4,3,0,1,3,2
0,1,2,2,3,3,3,4,5
Mode= 3
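
The three worked examples above can be reproduced with Python's standard statistics module. This is a quick check rather than part of the original notes; the variable names are illustrative.

```python
from statistics import mean, median, mode

incomes = [1, 3, 4, 2, 5]                      # monthly incomes from the mean example
children_median = [6, 4, 5, 0, 1, 3, 5, 2, 2]  # families from the median example
children_mode = [3, 5, 2, 4, 3, 0, 1, 3, 2]    # families from the mode example

print("mean:", mean(incomes))
print("median:", median(children_median))  # median() sorts the values itself
print("mode:", mode(children_mode))
```

All three calls should give 3 for these data sets.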

Standard Deviation
It is a measure of spread around the arithmetic mean
It extends on either side of the mean value, minus or plus
Narrow SD: The data is squeezed and close to the mean
Wide SD: The data is spread out and away from the mean
Example
US average men’s height = 178 cm, SD = 8 cm
Most men (68%): height = 170 to 186 cm (mean ± one SD = 8 cm)
Nearly all men (95%): height = 162 to 194 cm (mean ± two SD = 16 cm)

Measures of Dispersion:
Range
• Used with ordinal, interval & ratio
• Difference between largest & smallest value
• Entirely dependent on the most extreme scores
• Outlier may exaggerate range
Example (n = 11): 8 3 6 5 4 11 2 9 4 10 11
Range: 11 - 2 = 9
Example: 8 3 6 4 11 2 9 4 10 4 11
Range: 11 - 2 = 9
Example: 8 3 6 4 11 2 9 4 10 4 19
Range: 19 - 2 = 17
The last two data sets are identical except for one point
Outliers/extreme scores have a large effect on the range.

Measures of Dispersion:
Interquartile Range (mid-spread)
• used for ordinal, interval & ratio data
• comprises the middle 50% of the data
• difference between the 75th and 25th percentile
• less influenced by extreme scores than the range
• disregards half of the data (lower quarter and upper quarter)
• IQR for weight of 1-year-old babies: 11.5 - 9.5 = 2.0 kg
• Median (50th percentile) weight of 1-year-old babies = 10.5 kg

Measures of Dispersion:
Interquartile Range
Example: 42 43 45 47 48 49 51 53 53 54
(N=10)
step 1: calculate Q1 (median of lower half)
step 2: calculate Q3 (median of upper half)
step 3: IQR = Q3-Q1
Q1= 45 (42 43 45 47 48)
Q3= 53 (49 51 53 53 54)
IQR= 8 (53-45)

Measures of Dispersion:
Comparing Range & Interquartile Range
Example: 42 43 45 47 48 49 51 53 53 54
• Interquartile range: 53 - 45 = 8
• Range: 54 - 42 = 12
Example: 42 43 45 47 48 49 51 53 53 64
• Interquartile range: 53 - 45 = 8
• Range: 64 - 42 = 22
• Interquartile range NOT affected by extreme score
• Range is affected by extreme score
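
The comparison above can be reproduced with a short Python sketch. It follows the "median of the lower half / median of the upper half" steps used in these notes; the function names are hypothetical, and for an odd number of values the sketch assumes the middle value is left out of both halves (other quartile conventions exist and can give slightly different answers).

```python
from statistics import median

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def iqr_by_halves(values):
    """IQR as median of the upper half minus median of the lower half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # lower half of the sorted data
    upper = ordered[-half:]  # upper half (middle value excluded when n is odd)
    return median(upper) - median(lower)

data = [42, 43, 45, 47, 48, 49, 51, 53, 53, 54]
with_outlier = [42, 43, 45, 47, 48, 49, 51, 53, 53, 64]

print(value_range(data), iqr_by_halves(data))
print(value_range(with_outlier), iqr_by_halves(with_outlier))
```

Changing the largest value from 54 to 64 moves the range from 12 to 22 while the interquartile range stays at 8.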
Measures of Dispersion:
Standard Deviation
• take the square root of the variance
• returns value to the original unit of measurement
• Easier to interpret
• Average deviation from the mean
• How much scores vary on average
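
A minimal sketch of the "square root of the variance" step, using the values 16, 20, 18 from the earlier mean example. The notes do not say whether to divide by n or by n - 1, so the sample flag below is an assumption: sample=True divides by n - 1 (sample SD), sample=False divides by n (population SD). Function names are illustrative.

```python
from math import sqrt

def variance(values, sample=True):
    """Mean of squared deviations from the mean (n - 1 divisor if sample=True)."""
    m = sum(values) / len(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    divisor = len(values) - 1 if sample else len(values)
    return sum(squared_deviations) / divisor

def standard_deviation(values, sample=True):
    """Square root of the variance, in the original unit of measurement."""
    return sqrt(variance(values, sample))

values = [16, 20, 18]
print(variance(values), standard_deviation(values))
```

For these values the squared deviations are 4, 4 and 0, so the sample variance is 8/2 = 4 and the sample SD is 2, back in the same unit as the data.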

Question No 2
You are director of occupational health for a corporation that has many
employees aged over 45 who smoke one or more packs of cigarettes daily
and are at increased risk for lung cancer.
What strategy for the early detection of lung cancer in asymptomatic
individuals would you recommend?
a) No strategy has been shown to be effective in reducing mortality
b) Chest x-ray and sputum cytology every 6 months for high-risk employees
c) Annual chest x-ray and sputum cytology for high-risk employees
d) Annual chest x-ray and sputum cytology for all employees
e) Annual chest x-ray for all employees

Question No 3
A double-blind trial is planned to compare the utility of glyburide and
metformin in the treatment of diabetes mellitus. The main reasons for
randomizing patients are:
a) So that the number of subjects in each group will be identical
b) So that the two patient groups will have similar prognostic features
c) So that the statistician will not analyze the data in a biased fashion
d) So that the investigator does not know in advance which therapy each patient
will receive
e) To prevent the clinician knowing which drug the patient is taking

Question No 4
A type of gynecological cancer has the same incidence rate in white women and
African American women in the US, but the prevalence rate of this type of cancer
is lower in African American than in white women. The most likely explanation
for this difference in prevalence rates is that when compared to white women,
African American women are more likely to:
A. Recover from this type of cancer
B. Have natural immunity to this type of cancer
C. Have increased access to treatment for this type of cancer
D. Be resistant to this type of cancer
E. Live longer with this type of cancer

Question No 5
A research study is done to determine if a new drug (Drug A) will prevent stroke in
men aged 55–65 years who have hypertension. A total of 4,000 hypertensive men in this age
group are randomly assigned to Drug A (n = 2,000) or placebo (n = 2,000).
Over 10 years, there were 400 strokes in the placebo group and 200
strokes in the Drug A group.
Based on these data, how many men would have to be treated with Drug A to
prevent one case of stroke?
a) 10
b) 20
c) 40
d) 80

The absolute risk of stroke with placebo = 400 / 2,000 = 20%.
The absolute risk of stroke with Drug A = 200 / 2,000 = 10%.
The Absolute Risk Reduction (ARR) = 20% − 10% = 10%.
The Number Needed to Treat (NNT) = 1 / ARR = 1 / 0.1 = 10
Drug A prevented stroke in 10% of the hypertensive men treated (1 in 10).
Therefore, 10 men would have to be treated with Drug A to prevent one case of
stroke.
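
The ARR and NNT steps above can be sketched as a small Python function. The name number_needed_to_treat and its parameters are hypothetical; the sketch simply raises an error when the treatment shows no risk reduction, since NNT is not meaningful in that case.

```python
def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """Return (ARR, NNT) from event counts in the control and treated groups."""
    control_risk = control_events / control_n
    treated_risk = treated_events / treated_n
    arr = control_risk - treated_risk  # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is undefined")
    return arr, 1 / arr

# Question No 5: 400 strokes in 2,000 on placebo, 200 strokes in 2,000 on Drug A.
arr, nnt = number_needed_to_treat(400, 2000, 200, 2000)
print(f"ARR = {arr:.0%}, NNT = {round(nnt)}")
```

In practice the NNT is usually reported as a whole number of patients.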

Question No 6
A cohort study is conducted to evaluate the relationship between calcium
supplements and the occurrence of hip fractures in post-menopausal women. The
study examines the hip fracture rate in 100 women taking calcium supplements and
100 women taking placebo over 3 years. Five women have hip fractures in the calcium
group and 10 women have fractures in the placebo group.
What is the risk of hip fracture in the group treated by calcium supplements?
a) 1%
b) 5%
c) 10%
d) 20%
e) 50 %
Risk of fracture in calcium group (experimental event rate) EER = 5 / 100 = 0.05 = 5%
Risk of fracture in placebo group (control event rate) CER = 10 / 100 = 0.1 = 10%
Relative Risk Reduction (by calcium) RRR = (CER − EER) / CER = (0.10 − 0.05) / 0.10 = 0.5 = 50%

Null Hypothesis
Type I (α) error:
The probability of saying that there is a difference in treatment effects between
groups while in fact there is none (a falsely rejected null hypothesis)
Conventional threshold: p-value < 5% or p < 0.05
The p-value is the probability of observing a difference at least as large as the one
seen in the study by chance alone, that is, if the null hypothesis were true.
Classically, differences associated with a p < 0.05 are considered statistically significant.
Type II (β) error:
The probability of saying that there is no difference in treatment effects
while in fact there is a difference (a falsely accepted null hypothesis).
Power > 80 %
The probability that a study will find a statistically significant difference when one
is truly there.
It relates directly to the number of subjects. Power = 1 − β (1 − probability of a type II error).

Question No 7
A randomized trial comparing the efficacy of two drugs showed a difference
between the two with a p value of < 0.05. In reality, however, the two drugs do not differ.
This is an example of:

a. Type I error (α error)
b. Type II error (β error)
c. 1- α
d. 1- β
e. A statistically significant trial
Type I (α) error:
The probability of saying that there is a difference in treatment effects between
groups while in fact there is none (a falsely rejected Null Hypothesis)

Question No 8
To study the psychiatric morbidity of Schizophrenia, all patients with the disorder
being treated in the city, as identified by medical records, were interviewed by a
research team. A control group matched for age, sex and socio-economic status
was also interviewed. This is an example of which one of the following.

1) Prospective cohort study


2) Case control study
3) Cross-sectional prevalence survey
4) Lifetime prevalence study
5) Census survey
Question No 9
In a trial of 1000 children with Kawasaki, high dose of IVIG was compared to low
dose IVIG. The risk of developing coronary aneurysm was 3.2% in low dose IVIG
group and 1.2% in high dose IVIG group.
How many children would you need to treat with high dose IVIG in order to
prevent one child from having a coronary aneurysm?
a) 25
b) 200
c) 20
d) 100
e) 50
NNT = 1/ ARR
ARR = 3.2% - 1.2% = 2%
NNT = 1/0.02 or 100/2
NNT = 50

Question No 10
In a classroom of 25 students (15 males and 10 females), 5 males develop
hepatitis A over a 2-week period. During the next 6 weeks, an additional 3 males
and 2 females develop the infection.
Calculate the secondary attack rate of hepatitis A in this class.
a) 5%
b) 10%
c) 20%
d) 25%
e) 50%

Primary Attack Rate = 5/25 x 100 = 20%
Students still susceptible after the primary cases = 25 - 5 = 20
Secondary Attack Rate = 5/20 x 100 = 25%
Overall Attack Rate = 10/25 x 100 = 40%
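
The attack-rate arithmetic above can be sketched as follows. The function name attack_rates is hypothetical; the key step is that the secondary attack rate uses only the students still susceptible after the primary cases as its denominator.

```python
def attack_rates(group_size, primary_cases, secondary_cases):
    """Return (primary, secondary, overall) attack rates as fractions."""
    primary = primary_cases / group_size
    susceptible = group_size - primary_cases  # primary cases are no longer at risk
    secondary = secondary_cases / susceptible
    overall = (primary_cases + secondary_cases) / group_size
    return primary, secondary, overall

# Question No 10: 25 students, 5 primary cases, then 3 males + 2 females.
primary, secondary, overall = attack_rates(25, 5, 3 + 2)
print(f"{primary:.0%} {secondary:.0%} {overall:.0%}")
```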

Question No 11
Several studies have shown that 85% of lung cancers are due to cigarette smoking.
This measure is an example of
a. Incidence Rate
b. Attributable Risk
c. Relative Risk
d. Prevalence
e. Mortality Ratio

Question No. 12
40,000 students take MCC-QE1 each year.
The mean score is 222 with a Standard Deviation of 16.
If the passing score is 226, how many students scored above 254?
(A) 10,000
(B) 6,400
(C) 1,000
(D) 600
Answer: (C) About 1,000 students scored above 254 on MCC-QE1
• Total students = 40,000
• 254 is 2 SD above the mean (222 + 16 + 16); about 95% of scores lie within 2 SD of the mean
• The remaining 5% is split between the two tails, so about 2.5% scored above 254
• 2.5 percent of 40,000 is 1,000 students (2.5 / 100 x 40,000)
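
The answer above relies on the 68-95-99.7 rule of thumb. As a cross-check, the sketch below compares it with the tail area of a normal distribution using Python's statistics.NormalDist. It assumes the scores are exactly normally distributed, which the question implies but real exam scores may not satisfy; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=222, sigma=16)  # mean and SD given in the question
students = 40_000

rule_of_thumb = 0.025 * students                 # 2.5% above +2 SD
exact_tail = (1 - scores.cdf(254)) * students    # area above 254 under the normal curve

print(round(rule_of_thumb), round(exact_tail))
```

The exact tail is a little smaller than the rule-of-thumb figure (roughly 910 rather than 1,000 students), but 1,000 is still the closest of the answer options.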

STANDARD NORMAL DISTRIBUTION

Question No. 13
During a hospitalization, a patient’s serum Na+ value follows a normal distribution
with a mean of 140 and an SD of 2.5. During his stay, what percentage of his Na+
values will be greater than 145?
Answer: 2.5%.
145 is 2 SD above the mean (140 + 2.5 + 2.5 = 145).
About 5% of his values will be more than 2 SD from the mean; 2.5% will be >145 and 2.5% will
be <135
Mean serum Na+ = 140
❑ ± 1 SD contains 68% of values (137.5 to 142.5)
❑ ± 2 SD contains 95% of values (135 to 145)
❑ ± 3 SD contains 99.7% of values (132.5 to 147.5)

Question No. 14
In a cohort study of coronary artery disease (CAD) in smokers versus
non-smokers, the following results were obtained:
Calculate the Relative Risk of CAD in smokers. What does it mean?

Relative Risk
= Incidence rate in smokers (80%) / Incidence rate in non-smokers (50%) = 1.6
It means that smokers are 1.6 times as likely as non-smokers to develop coronary artery
disease.

Question No. 15
200 hypertensive males are randomly allocated to treatment A (100 patients) and
control (100 patients). After 3 months of treatment, 70 of the treated group and
50 of the placebo group showed a drop in their blood pressure.
Taking no drop in blood pressure as the adverse outcome, calculate
• The incidence among treated = (100 − 70)/100 = 30%
• The incidence among controls = (100 − 50)/100 = 50%
• The Relative Risk = 30/50 = 0.6
• Absolute Risk Reduction (ARR) = 50% - 30% = 20% or 0.2
• Number Needed to Treat (NNT) = 1/ARR = 1/0.2 = 5
It means that you need to treat 5 patients to have ONE additional favorable outcome
(corrected blood pressure). As the RR is 0.6 (less than 1), the treatment reduces
the risk of persistent hypertension.

In one study, the ARR of statin therapy is calculated at 4%. What is the NNT?
Number Needed to Treat (NNT) = 1/ARR = 1/0.04 = 25
It means that 25 patients would need to be treated with statins to prevent one MI.

Question No. 16
A new test for tuberculosis finds that 195 out of 200 people with tuberculosis test
positive. A total of 35 people out of 150 without the disease also tested positive.
Before calculation, decide whether this test is sensitive or specific.
Draw a 2 × 2 table and calculate the sensitivity, specificity, PPV, and NPV.

Test        Disease Present    Disease Absent
Positive    TP = 195           FP = 35
Negative    FN = 5             TN = 115

Sensitivity = TP / (TP + FN) = 195 / (195 + 5) = 97.5%
Specificity = TN / (TN + FP) = 115 / (115 + 35) = 77%
PPV = TP / (TP + FP) = 195 / (195 + 35) = 85%
NPV = TN / (TN + FN) = 115 / (115 + 5) = 96%
This test appears to be very good at identifying sick people (screening) but not
good at identifying healthy people. It is therefore sensitive but not specific.

Question No. 17
A case-control study is designed to study risk factors for developing Buerger’s
disease. Ten people are selected with Buerger’s disease, and 10 subjects are
selected as controls. Nine patients who developed Buerger’s disease were heavy
smokers, and two people without Buerger’s disease smoked.
Construct a 2 × 2 table and calculate the odds ratio.

Risk factor (smoking)    Disease Present (cases)    Disease Absent (controls)
Present                  a = 9                      b = 2
Absent                   c = 1                      d = 8

Using these data, can we calculate the prevalence of Buerger’s disease?
Why should we calculate odds ratio and not relative risk?
OR = (a/b)/(c/d) = ad/bc = (9 × 8)/(2 × 1) = 36
The odds of Buerger’s disease in smokers are 36 times those in non-smokers.
We cannot calculate prevalence using these data because we intentionally selected
10 patients with the disease and 10 without. As this is a case-control study, we
calculate odds ratio, not relative risk.
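
The odds ratio for this question can be sketched in Python from the counts given in the question text (9 of 10 cases and 2 of 10 controls were smokers). The function name odds_ratio and the variable names are illustrative; the sketch does not handle a zero in cell b or c.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls."""
    return (a * d) / (b * c)

cases, controls = 10, 10
smoking_cases, smoking_controls = 9, 2

a, b = smoking_cases, smoking_controls                     # smokers
c, d = cases - smoking_cases, controls - smoking_controls  # non-smokers

buerger_or = odds_ratio(a, b, c, d)
print(buerger_or)
```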
Risk Factors for Osteoporosis in Postmenopausal Women

The Odds Ratio for Osteoporosis
OR increased by 8% (p=0.001) with each year of age.
The prevalence increased with age from 24.9% at 60-64 years to 37.4% at 70-75
years.
In non-smokers, the Odds Ratio for osteoporosis was 0.424, which was statistically
significant (p<0.05).
BMI <18.5: OR 1.86, 95% CI (0.35-9.8); the interval includes 1, so this increase is not statistically significant
BMI >30-35: OR 0.19, 95% CI (0.13-0.28)
Age 60-75 years: Risk of fractures increased with increasing age and observed
height loss (p<0.001).
Hormone Therapy (HT) decreased the prevalence of osteoporosis by 25% in
comparison with non-users.

Question No. 18
Investigators conduct a randomized control trial to study the benefits of a new
asthma medication. Of 200 people on the medication, only 10 had asthma attacks.
A total of 30 people out of 200 in the control group developed an asthma attack.
Construct a 2 × 2 table and calculate the incidence in the exposed and unexposed.

Exposure    Disease Present    Disease Absent
Drug        a = 10             b = 190
Control     c = 30             d = 170

Calculate ARR, NNT, RR, and RRR for this medication.
Risk in exposed = a / (a + b) = 10 / (10 + 190) = 0.05 = 5%.
Risk in unexposed = c / (c + d) = 30 / (30 + 170) = 0.15 = 15%.
ARR = 15% − 5% = 10%.
NNT = 1/ARR = 1/0.1 = 10.
RR = risk in exposed / risk in unexposed = 5% / 15% = 0.33
RRR = (risk in unexposed − risk in exposed) / risk in unexposed
= (15% − 5%) / 15% = 67%.
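
All four measures in this question come from the same 2 × 2 table, so they can be computed together. The sketch below is illustrative only: the function name trial_measures and the dictionary keys are made up, and it assumes the treatment lowers the risk (ARR greater than zero).

```python
def trial_measures(a, b, c, d):
    """a/b: treated with/without event, c/d: control with/without event."""
    risk_treated = a / (a + b)
    risk_control = c / (c + d)
    arr = risk_control - risk_treated  # absolute risk reduction
    return {
        "risk_treated": risk_treated,
        "risk_control": risk_control,
        "rr": risk_treated / risk_control,  # relative risk
        "rrr": arr / risk_control,          # relative risk reduction
        "arr": arr,
        "nnt": 1 / arr,                     # number needed to treat
    }

# Question No. 18: drug group 10/200 attacks, control group 30/200 attacks.
asthma = trial_measures(a=10, b=190, c=30, d=170)
for name, value in asthma.items():
    print(f"{name}: {value:.2f}")
```

Note that RR and RRR add up to 1 (0.33 + 0.67), which is a quick way to check the hand calculation.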

Hierarchy of Evidence

CRP Levels are associated with physiological measures of Disease Severity
(Hypoxemic Respiratory Failure)

Scatterplot & Correlation Coefficients


At admission, CRP was clinically correlated with measured respiratory function.
CRP levels showed a strong positive association with SOFA score.

CRP levels showed an inverse correlation with P/F ratio on admission, demonstrating
an association of these markers with the severity of Acute Hypoxemic Respiratory
Failure.
A scatterplot displays the strength, direction, and form of the relationship between
two quantitative variables. A correlation coefficient measures the strength of that
relationship.
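
As an illustration of what a correlation coefficient measures, the sketch below computes Pearson's r by hand. Pearson's r is assumed here because the notes do not say which coefficient the study used, and the CRP and P/F values are hypothetical numbers invented for the example, not data from the study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical values for illustration only (not taken from the study).
crp = [20, 60, 110, 150, 210]
pf_ratio = [380, 310, 250, 190, 120]

r = pearson_r(crp, pf_ratio)
print(round(r, 3))
```

A value near -1 indicates a strong inverse relationship (higher CRP with lower P/F ratio), a value near +1 a strong positive one, and a value near 0 little linear relationship.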

The P/F Ratio


It is the arterial pO2 (“P”) from the ABG divided by the FIO2 (“F”).
FIO2 is the fraction of inspired oxygen that the patient is receiving, expressed as
a decimal (40% oxygen = FIO2 of 0.40).
A P/F Ratio less than 300 indicates acute respiratory failure.

Meta-Analysis
A systematic review that combines statistical data from similar quantitative
studies to find a common overall effect.
The power of a meta-analysis comes from its ability to statistically combine dozens
of different studies and emerge with a single pooled assessment.

Five Odds Ratios used in Meta-Analysis with the summary measure (centre line of
diamond) and associated Confidence Intervals (lateral tips of diamond),
and solid vertical line of No Effect

https://www.facebook.com/imgstudygroup
