Phase 3 Statistics Record


DESCRIPTIVE STATISTICS

Q1. The weights (in kg) of 9 males were 53, 59, 45, 50, 80, 67, 59, 74, 62. Calculate the mean, median, mode and SD.
Write down the formula, substitute the values and then write the answer.
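As a cross-check for Q1, the four measures can be computed with Python's standard `statistics` module. This is only a minimal sketch: the question does not say whether SD should use the n − 1 (sample) or n (population) divisor, so both are shown, and the variable names are illustrative.

```python
import statistics

# Weights (kg) of the 9 males from Q1
weights_kg = [53, 59, 45, 50, 80, 67, 59, 74, 62]

mean_w = statistics.mean(weights_kg)           # sum of values / n
median_w = statistics.median(weights_kg)       # middle value after sorting
mode_w = statistics.mode(weights_kg)           # most frequent value
sd_sample = statistics.stdev(weights_kg)       # divisor n - 1 (assumed convention)
sd_population = statistics.pstdev(weights_kg)  # divisor n (alternative convention)

print(f"mean = {mean_w} kg, median = {median_w} kg, mode = {mode_w} kg")
print(f"SD (n - 1) = {sd_sample:.2f} kg, SD (n) = {sd_population:.2f} kg")
```

The written answer should still show the formula and the substituted values; the code only confirms the arithmetic.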

Q2. The respiratory rates (per min) of 10 adults were 13, 12, 20, 14, 16, 21, 15, 18, 17, 19. Calculate the mean, median, mode, SD and variance.
Coefficient of variation

Q1. In a sample of adults aged 21 years and children aged 3 months, the following data were obtained for height. Find which series shows greater variation.

            Mean height   SD
Adults      160 cm        10 cm
Children    60 cm         5 cm
Coeff of Variation for adults = 10/160*100 = 6.25%

Coeff of Variation for children = 5/60*100 = 8.33%

Height of children shows greater variation
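The comparison above can be reproduced with a small helper. This is a sketch only; the function name `coefficient_of_variation` is an illustrative choice, not something defined in the record.

```python
def coefficient_of_variation(sd, mean):
    """Return the coefficient of variation as a percentage: SD / mean * 100."""
    return sd / mean * 100

# Height data from the table: adults (mean 160 cm, SD 10 cm), children (mean 60 cm, SD 5 cm)
cv_adults = coefficient_of_variation(sd=10, mean=160)
cv_children = coefficient_of_variation(sd=5, mean=60)

print(f"CV adults = {cv_adults:.2f}%, CV children = {cv_children:.2f}%")
print("Greater variation:", "children" if cv_children > cv_adults else "adults")
```

Because the coefficient of variation is unit-free, it allows two series with very different means to be compared directly.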


NORMAL (GAUSSIAN) DISTRIBUTION
• Curve is bell shaped and symmetrical about the mean
• Curve is asymptotic (does not touch the baseline)
• Mean = median = mode
• 68.2% of observations will be within mean ± 1SD
• 95.4% within mean ± 2SD
• 99.7% within mean ± 3SD
• Area under the curve is equal to 1
• Two parameters:
  Mean (µ) – determines the position of the curve
  SD (σ) – determines the shape (spread) of the curve
STANDARD NORMAL DISTRIBUTION (Z)

Mean = 0 and SD = 1

68.2% between -1 and +1

95.4% between -2 and +2

99.7% between -3 and +3

Standard normal variate, $z = \dfrac{x - \text{mean}}{\text{SD}} = \dfrac{x - \mu}{\sigma}$
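The areas quoted for the standard normal curve can be checked numerically with `statistics.NormalDist` from the Python standard library. The sketch below also converts one observation to a z value; the observation x = 170 is a hypothetical figure chosen for illustration, used with the adult height mean (160 cm) and SD (10 cm) from the earlier table.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, SD 1

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def z_score(x, mean, sd):
    """Standard normal variate z = (x - mean) / SD."""
    return (x - mean) / sd

for k in (1, 2, 3):
    print(f"Area within ±{k} SD: {area_within(k) * 100:.2f}%")

# Hypothetical observation (not from the record): height 170 cm
z_example = z_score(170, mean=160, sd=10)
print("z =", z_example)
```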
INFERENTIAL STATISTICS
95% confidence interval (CI)

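1. 95% CI for population mean:

Sample mean ± $z_{\alpha/2}$ × SE(mean)
$= \bar{x} \pm 1.96 \times \dfrac{s}{\sqrt{n}}$

2. 95% CI for population proportion:

Sample proportion ± $z_{\alpha/2}$ × SE(proportion)
$= p \pm 1.96 \times \sqrt{\dfrac{pq}{n}}$, where q = 100 − p when p is expressed as a percentage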
Q1. The mean weight of 50 boys aged 5 years was 18.6 kg with SD 1.65 kg. Find the 95% confidence interval for the mean weight of 5-year-old boys.

95% CI for population mean (mean weight of 5-year-old boys)

$= \bar{x} \pm 1.96 \times \dfrac{s}{\sqrt{n}}$

$= 18.6 \pm 1.96 \times \dfrac{1.65}{\sqrt{50}}$ = 18.14 – 19.06

95% CI for the mean weight of 5-year-old boys is 18.14 kg – 19.06 kg

Q2. The mean pulse rate of 100 MBBS students of GMCT was 74 with SD 3. Find the 95% confidence interval for the mean pulse rate of all MBBS students of GMCT.

95% CI for mean pulse rate

$= \bar{x} \pm 1.96 \times \dfrac{s}{\sqrt{n}}$

$= 74 \pm 1.96 \times \dfrac{3}{\sqrt{100}}$ = 73.4 – 74.6

95% CI for the mean pulse rate of MBBS students of GMCT is 73.4/min – 74.6/min
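The two confidence intervals for a mean can be verified with a short function. This is a sketch assuming the large-sample z value of 1.96 used in the record; the function name `ci_mean_95` is illustrative.

```python
import math

def ci_mean_95(sample_mean, sd, n, z=1.96):
    """95% CI for a population mean: sample mean ± z * s / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Q1: weight of 50 boys, mean 18.6 kg, SD 1.65 kg
weight_low, weight_high = ci_mean_95(18.6, 1.65, 50)
# Q2: pulse rate of 100 students, mean 74/min, SD 3
pulse_low, pulse_high = ci_mean_95(74, 3, 100)

print(f"Weight: {weight_low:.2f} kg to {weight_high:.2f} kg")
print(f"Pulse rate: {pulse_low:.1f}/min to {pulse_high:.1f}/min")
```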
Q3. In a study to find the prevalence of refractive errors among adolescents, out of 250 adolescents selected 45 had refractive errors. Using these data, estimate a 95% CI for the prevalence of refractive errors among adolescents.

95% CI for population proportion (prevalence of refractive errors)

$= p \pm 1.96 \times \sqrt{\dfrac{pq}{n}}$

$p = \dfrac{45}{250} \times 100 = 18\%$, so q = 100 − 18 = 82%

$= 18 \pm 1.96 \times \sqrt{\dfrac{18 \times 82}{250}}$ = 13.24 – 22.76

95% CI for the prevalence of refractive errors is 13.2% – 22.8%

Q4. In a study to find the prevalence of anemia among antenatal women, 200 antenatal patients were selected and anemia was present in 40. Find the 95% CI for the prevalence of anemia in antenatal patients.

95% CI for prevalence of anemia

$= p \pm 1.96 \times \sqrt{\dfrac{pq}{n}}$

$p = \dfrac{40}{200} \times 100 = 20\%$, so q = 100 − 20 = 80%

$= 20 \pm 1.96 \times \sqrt{\dfrac{20 \times 80}{200}}$ = 14.46 – 25.54

95% CI for the prevalence of anemia in antenatal women is 14.5% – 25.5%
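The same check for proportions, working in percentages as the record does. This is a minimal sketch; `ci_proportion_95` is an illustrative name.

```python
import math

def ci_proportion_95(cases, n, z=1.96):
    """95% CI (in %) for a population proportion: p ± z * sqrt(p*q/n)."""
    p = cases / n * 100          # sample proportion as a percentage
    q = 100 - p                  # complement of p
    standard_error = math.sqrt(p * q / n)
    margin = z * standard_error
    return p - margin, p + margin

# Q3: 45 of 250 adolescents had refractive errors
refractive_low, refractive_high = ci_proportion_95(45, 250)
# Q4: 40 of 200 antenatal patients had anemia
anemia_low, anemia_high = ci_proportion_95(40, 200)

print(f"Refractive errors: {refractive_low:.2f}% to {refractive_high:.2f}%")
print(f"Anemia: {anemia_low:.2f}% to {anemia_high:.2f}%")
```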
Testing of Hypothesis
1. One sample Z test for population mean
Q1. A random sample of 50 newborn babies had a mean birth weight of 2.95 kg with SD 0.75 kg. Test whether these data justify the statement that the mean birth weight of newborn babies is 2.8 kg at the 5% level of significance.

Null hypothesis is H0: μ = 2.8

Alternative hypothesis is H1: μ ≠ 2.8

Test statistic is $z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} = \dfrac{2.95 - 2.8}{0.75/\sqrt{50}} = 1.41$

At 5% level of significance, $z_{\alpha/2} = 1.96$

Since $|z| = 1.41 < 1.96$, we accept H0 and reject H1: μ ≠ 2.8

Hence the data justify the statement that the mean birth weight of newborn babies is 2.8 kg.
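The test statistic and decision for this problem can be reproduced as follows. This is a sketch assuming the two-sided critical value 1.96 used in the record; the function name is illustrative.

```python
import math

def one_sample_z(sample_mean, mu0, sd, n):
    """z = (sample mean - hypothesised mean) / (s / sqrt(n))."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))

CRITICAL_Z = 1.96  # two-sided critical value at the 5% level

z_birth_weight = one_sample_z(sample_mean=2.95, mu0=2.8, sd=0.75, n=50)
reject_h0 = abs(z_birth_weight) > CRITICAL_Z

print(f"z = {z_birth_weight:.2f}, reject H0: {reject_h0}")
```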


2. Two samples Z test for equality of two population means

Q2. The following data on blood sugar were obtained from a study conducted for testing the effect of two drugs A and B. Test whether the effects of the drugs are different in reducing blood sugar at the 5% level of significance.

Drug   n     Mean   SD
A      100   115    20
B      100   140    30

H0: There is no difference in the effect of the drugs (H0: μ1 = μ2)

H1: There is a difference in the effect of the drugs (H1: μ1 ≠ μ2)

Test statistic is $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

$z = \dfrac{115 - 140}{\sqrt{\dfrac{20^2}{100} + \dfrac{30^2}{100}}}$

z = -6.93

At 5% level of significance, $z_{\alpha/2} = 1.96$

Since $|z| = 6.93 > 1.96$, we reject H0 and accept H1: μ1 ≠ μ2

Hence there is a significant difference in the effect of the drugs.
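A short sketch of the two-sample calculation, using the drug A and drug B figures from the table. The function name `two_sample_z` is illustrative, and the critical value 1.96 is the one used in the record.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

# Drug A: n = 100, mean 115, SD 20; Drug B: n = 100, mean 140, SD 30
z_drugs = two_sample_z(115, 20, 100, 140, 30, 100)
reject_h0_drugs = abs(z_drugs) > 1.96

print(f"z = {z_drugs:.2f}, reject H0: {reject_h0_drugs}")
```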
3. One sample Z test for population proportion

Q3. In an otological examination of school children, out of 150 children examined 21 were found to have some type of otological abnormality. Does this agree with the statement that the prevalence of otological abnormalities among school children is 16%?

Null hypothesis is H0: P = 16%

Alternative hypothesis is H1: P ≠ 16%

Sample proportion $p = \dfrac{21}{150} \times 100 = 14\%$

Test statistic is $z = \dfrac{p - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \dfrac{14 - 16}{\sqrt{\dfrac{16 \times 84}{150}}} = \dfrac{-2}{2.99} = -0.669$

At 5% level of significance, $z_{\alpha/2} = 1.96$

Since $|z| = 0.669 < 1.96$, we accept H0: P = 16% and reject H1

Hence the data agree with the statement that the prevalence of otological abnormalities among school children is 16%.
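The proportion test can be checked in the same way, working in percentages. This is only a sketch with an illustrative function name. Note that with the unrounded standard error the statistic comes out near 0.668 in absolute value; the 0.669 in the worked answer results from rounding the standard error to 2.99 first.

```python
import math

def one_sample_proportion_z(cases, n, p0):
    """z = (p - p0) / sqrt(p0 * q0 / n), with p and p0 in percent."""
    p = cases / n * 100
    q0 = 100 - p0
    return (p - p0) / math.sqrt(p0 * q0 / n)

# 21 of 150 children with otological abnormalities, hypothesised prevalence 16%
z_otological = one_sample_proportion_z(cases=21, n=150, p0=16)
reject_h0_otological = abs(z_otological) > 1.96

print(f"z = {z_otological:.3f}, reject H0: {reject_h0_otological}")
```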
4. Two samples Z test for equality of two population proportions

Q4. To compare the prevalence of dyslipidaemia between males and females, random samples of 400 males and 400 females were selected. 120 males and 80 females were found to have dyslipidaemia. Test whether the prevalence of dyslipidaemia between males and females is significantly different at the 5% level of significance.

Null hypothesis is H0: P1 = P2

Alternative hypothesis is H1: P1 ≠ P2

Test statistic is $z = \dfrac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

where $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$, $q = 100 - p$

Sample proportions: $p_1 = \dfrac{120}{400} \times 100 = 30\%$, $p_2 = \dfrac{80}{400} \times 100 = 20\%$

$p = \dfrac{400 \times 30 + 400 \times 20}{400 + 400} = 25$

q = 100 − 25 = 75

$z = \dfrac{30 - 20}{\sqrt{25 \times 75\left(\dfrac{1}{400} + \dfrac{1}{400}\right)}} = \dfrac{10}{\sqrt{9.375}}$

z = 3.27

At 5% level of significance, $z_{\alpha/2} = 1.96$

Since $|z| = 3.27 > 1.96$, we reject H0 and accept H1: P1 ≠ P2

Hence there is a significant difference in the prevalence of dyslipidaemia between males and females.
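A sketch of the pooled two-proportion calculation for the dyslipidaemia data, again in percentages. The function name is illustrative and the 1.96 cut-off is the one used in the record.

```python
import math

def two_sample_proportion_z(cases1, n1, cases2, n2):
    """z = (p1 - p2) / sqrt(p*q*(1/n1 + 1/n2)), with pooled p in percent."""
    p1 = cases1 / n1 * 100
    p2 = cases2 / n2 * 100
    p_pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    q_pooled = 100 - p_pooled
    standard_error = math.sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

# 120 of 400 males and 80 of 400 females had dyslipidaemia
z_dyslipidaemia = two_sample_proportion_z(120, 400, 80, 400)
reject_h0_dyslipidaemia = abs(z_dyslipidaemia) > 1.96

print(f"z = {z_dyslipidaemia:.2f}, reject H0: {reject_h0_dyslipidaemia}")
```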
5. Chi square test

Used to test the association between two attributes

Null hypothesis is H0: There is no association between the attributes

Alternative hypothesis is H1: There is association between the attributes

Test statistic used is $\chi^2 = \sum \dfrac{(O - E)^2}{E}$

where O is the observed frequency and E is the expected frequency of the cells.

$E = \dfrac{\text{row total} \times \text{column total}}{\text{grand total}}$

If there are r rows and c columns in the frequency table, (r−1) × (c−1) gives the degrees of freedom.

Critical region (rejection region) is $\chi^2 > \chi^2_{\alpha,k}$, where α is the significance level and k is the df

If $\chi^2 > \chi^2_{\alpha,k}$ we reject the null hypothesis and accept the alternative hypothesis

If $\chi^2 < \chi^2_{\alpha,k}$ we accept the null hypothesis and reject the alternative hypothesis
Q5. To study the association between smoking and lung cancer, 80 patients and 120 controls were selected. Among patients 53 were smokers and among controls 37 were smokers. Using the data, test whether there is any association between smoking and lung cancer.

Null hypothesis is
H0: There is no association between smoking and lung cancer

Alternative hypothesis is
H1: There is association between smoking and lung cancer

Smoking status   Patients   Controls   Total
Smokers          53         37         90
Non-smokers      27         83         110
Total            80         120        200

Expected frequencies are

E(53) = 90*80/200 = 36

E(37) = 90*120/200 = 54

E(27) = 110*80/200 = 44

E(83) = 110*120/200 = 66

$\chi^2 = \sum \dfrac{(O - E)^2}{E} = \dfrac{(53 - 36)^2}{36} + \dfrac{(37 - 54)^2}{54} + \dfrac{(27 - 44)^2}{44} + \dfrac{(83 - 66)^2}{66}$

= 24.3

df = (r-1)*(c-1) = (2-1)*(2-1) = 1

$\chi^2_{\alpha,k} = 3.84$ for α = 5% and df = 1

Since $\chi^2 = 24.3 > 3.84$, we reject the null hypothesis and accept the alternative hypothesis.

Conclusion: There is association between smoking and lung cancer.
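The expected frequencies and the chi square value for the smoking table can be recomputed from the observed counts alone. This is a plain-Python sketch with illustrative variable names; the critical value 3.84 is taken from the record rather than computed.

```python
# Observed frequencies: rows = smokers / non-smokers, columns = patients / controls
observed = [[53, 37],
            [27, 83]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = row total * column total / grand total, for every cell
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi square = sum over all cells of (O - E)^2 / E
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_VALUE = 3.84  # chi square table value for alpha = 5%, df = 1 (from the record)

print("Expected:", expected)
print(f"chi square = {chi_square:.1f}, df = {df}, reject H0: {chi_square > CRITICAL_VALUE}")
```

The same code works for larger tables if more rows or columns are added to `observed`, but the critical value then has to be looked up for the new df.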
When there is a 2x2 contingency table

                      Attribute 2 Present   Attribute 2 Absent   Total
Attribute 1 Present   a                     b                    a+b
Attribute 1 Absent    c                     d                    c+d
Total                 a+c                   b+d                  a+b+c+d (N)

the above formula is reduced to

$\chi^2 = \dfrac{(ad - bc)^2 \times (a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$

and df = 1
Q6. A study was conducted to determine the association between vitamin D deficiency and duration of exposure to sunlight. 35 vitamin D deficient subjects and 165 controls were selected. 22 vitamin D deficient subjects and 58 controls had a history of < 30 min exposure to sunlight. Test whether there is association between vitamin D deficiency and duration of exposure to sunlight.

Null hypothesis is
H0: There is no association between vitamin D deficiency and duration of exposure to sunlight

Alternative hypothesis is
H1: There is association between vitamin D deficiency and duration of exposure to sunlight

                     Vit D deficient   Control   Total
< 30 min exposure    22 (a)            58 (b)    80
> 30 min exposure    13 (c)            107 (d)   120
Total                35                165       200

$\chi^2 = \dfrac{(ad - bc)^2 \times N}{(a + b)(c + d)(a + c)(b + d)}$

$\chi^2 = \dfrac{(22 \times 107 - 13 \times 58)^2 \times 200}{(22 + 58)(13 + 107)(22 + 13)(58 + 107)}$

$\chi^2 = 9.24$

$\chi^2_{\alpha,k} = 3.84$ for α = 5% and df = 1

Since $\chi^2 = 9.24 > 3.84$, we reject the null hypothesis and accept the alternative hypothesis.

Conclusion: There is association between vitamin D deficiency and duration of exposure to sunlight.
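The 2x2 shortcut can be written as a small function and applied to both worked tables. This is a sketch; `chi_square_2x2` is an illustrative name, and running the smoking data through it shows that the shortcut reproduces the value obtained from the general formula.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for a 2x2 table: (ad - bc)^2 * N / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    numerator = (a * d - b * c) ** 2 * n
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Q6: vitamin D deficiency vs duration of sunlight exposure
chi_sq_vitamin_d = chi_square_2x2(a=22, b=58, c=13, d=107)
# Q5: smoking vs lung cancer, as a consistency check against the general formula
chi_sq_smoking = chi_square_2x2(a=53, b=37, c=27, d=83)

print(f"Vitamin D table: chi square = {chi_sq_vitamin_d:.2f}")
print(f"Smoking table:   chi square = {chi_sq_smoking:.1f}")
```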
VITAL STATISTICS

Important vital rates and ratios

• Crude Birth Rate (CBR) $= \dfrac{\text{Total number of live births during the year}}{\text{Mid-year population of the same year}} \times 1000$

• Crude Death Rate (CDR) $= \dfrac{\text{Total number of deaths during the year}}{\text{Mid-year population of the same year}} \times 1000$

• Natural Growth Rate (NGR) $= \dfrac{\text{CBR} - \text{CDR}}{1000} \times 100$
• Age specific death rate $= \dfrac{\text{Total number of deaths in the specific age group during the year}}{\text{Mid-year population of the same age group}} \times 1000$

• Sex specific death rate $= \dfrac{\text{Total number of deaths in the specific sex during the year}}{\text{Mid-year population of the same sex}} \times 1000$

• Disease specific death rate $= \dfrac{\text{Total number of deaths from the specific disease during the year}}{\text{Mid-year population of the same year}} \times 1000$

• Still Birth Rate (SBR) $= \dfrac{\text{Total number of still births}}{\text{Number of live births + still births}} \times 1000$

• Perinatal Mortality Rate (PMR) $= \dfrac{\text{Total number of still births + deaths under 1 week of birth}}{\text{Number of live births + still births}} \times 1000$

• Neonatal Mortality Rate (NMR) $= \dfrac{\text{Total number of deaths up to 28 days of life}}{\text{Number of live births}} \times 1000$

• Post Neonatal Mortality Rate (PNMR) $= \dfrac{\text{Total number of deaths after 28 days of birth up to 1 year}}{\text{Number of live births}} \times 1000$

• Infant Mortality Rate (IMR) $= \dfrac{\text{Total number of deaths within 1 year of birth}}{\text{Number of live births}} \times 1000$

• Maternal Mortality Ratio (MMR) $= \dfrac{\text{Total number of maternal deaths}}{\text{Number of live births}} \times 100000$
• Proportional mortality rate from a specific disease $= \dfrac{\text{Total number of deaths due to a specific disease in the year}}{\text{Total number of deaths due to all causes in the same year}} \times 100$

• Proportional mortality rate aged 50 yrs and above $= \dfrac{\text{Total number of deaths of persons aged 50 yrs and above in the year}}{\text{Total number of deaths of all age groups in the same year}} \times 100$
• Case Fatality Rate (CFR) $= \dfrac{\text{Total number of deaths due to a specific disease in a year}}{\text{Total number of cases of the disease in the same year}} \times 100$

• Survival rate (5 years) $= \dfrac{\text{Total number of patients alive after 5 years}}{\text{Total number of patients diagnosed or treated}} \times 100$
MEASUREMENTS OF MORBIDITY

• Incidence $= \dfrac{\text{Number of new cases of a disease during a given time period}}{\text{Population at risk during that period}} \times 100$

• Point prevalence $= \dfrac{\text{Number of all current cases (old and new) of a disease at a given point in time}}{\text{Estimated population at the same point in time}} \times 100$

• Period prevalence $= \dfrac{\text{Number of all existing cases (old and new) of a disease during a given period of time}}{\text{Estimated mid-interval population at risk}} \times 100$
Q1. The census populations of an area in 2001 and 2011 were 231500 and 246500 respectively. During 2004 there were 7500 live births and 2500 deaths. Of these, 50 deaths were within 1 year of birth and 15 deaths were under 28 days of life. There were 2 maternal deaths and 1750 deaths of persons above 50 years. Calculate the possible vital rates.

Write down the formula, substitute the values and then write the answer with its unit.
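The arithmetic for Q1 can be sketched as follows. Two points are assumptions and not stated in the question: the mid-year population for 2004 is estimated by straight-line interpolation between the two censuses (taking 2004 as 3 years after 2001), and the 15 neonatal deaths are taken to be included among the 50 infant deaths. If a different population estimate is intended, only CBR, CDR and NGR change.

```python
# Figures given in Q1 (year 2004)
live_births = 7500
total_deaths = 2500
infant_deaths = 50       # deaths within 1 year of birth
neonatal_deaths = 15     # deaths under 28 days (ASSUMED to be part of the 50)
maternal_deaths = 2
deaths_above_50 = 1750

# ASSUMPTION: linear growth between the 2001 and 2011 censuses
pop_2001, pop_2011 = 231500, 246500
mid_year_pop = pop_2001 + (pop_2011 - pop_2001) * 3 / 10

cbr = live_births / mid_year_pop * 1000           # per 1000 population
cdr = total_deaths / mid_year_pop * 1000          # per 1000 population
ngr = (cbr - cdr) / 1000 * 100                    # percent
imr = infant_deaths / live_births * 1000          # per 1000 live births
nmr = neonatal_deaths / live_births * 1000        # per 1000 live births
pnmr = (infant_deaths - neonatal_deaths) / live_births * 1000
mmr = maternal_deaths / live_births * 100000      # per 100000 live births
pmr_above_50 = deaths_above_50 / total_deaths * 100  # percent of all deaths

print(f"CBR = {cbr:.2f}, CDR = {cdr:.2f}, NGR = {ngr:.2f}%")
print(f"IMR = {imr:.2f}, NMR = {nmr:.2f}, PNMR = {pnmr:.2f}")
print(f"MMR = {mmr:.2f}, proportional mortality (above 50 yrs) = {pmr_above_50:.1f}%")
```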
Q2. The census population of a town during 2001 was 47000 and in 2011 was
65000. There were 2000 live births and 700 deaths in the year 2002. Of the 20
infant deaths, 8 infants died in the first 28 days of life and 4 of them died in the
first week itself. There were 10 still births in the same year. Calculate all
possible vital rates.
Q3. Census population of an urban area during 2001 and 2011 were 5,10,000
and 5,31,000. During 2003 there were 12500 live births and 4000 deaths. Out
of the total deaths 150 deaths were of children below one year. There were 25
maternal deaths, 5 deaths due to TB and 2800 deaths of persons above 50
years. 200 cases of TB were reported during that year. Calculate various vital
rates.
