
Biostatistic Chpt.3

This document discusses measures of dispersion in statistics, including range, quartile deviation, mean deviation, variance, and standard deviation. It provides definitions and examples of calculating each measure. The key points are: - Range is the simplest measure of dispersion, defined as the difference between the largest and smallest values. - Quartile deviation is half the difference between the first (Q1) and third (Q3) quartiles. - Mean deviation is the average of the absolute deviations from the mean. - Variance is the average of the squared deviations from the mean. Standard deviation is the positive square root of the variance, providing a more accurate measure of dispersion than variance alone. - Examples

Uploaded by

Mohamed Ghareba

Faculty of medicine and Pharmacy / Al-mergib University - Academic year 2022/2023

Biostatistics for Premedical students - Dr. A. A. Aziz - Dr. M.O. Alshrani

Chapter III

MEASURES OF DISPERSION

3.1- Introduction

Measures of central tendency locate the center of a distribution, but
they do not reveal how the items are spread out on either side of the
center. This characteristic of a frequency distribution is commonly
referred to as dispersion. The items in a series are not all equal; there
is variation among the values, and the degree of variation is evaluated
by various measures of dispersion. Small dispersion indicates high
uniformity of the items, while large dispersion indicates less
uniformity. For example, consider the following marks of two students.

Student I    Student II
   68            85
   75            90
   65            80
   67            25
   70            65
Both students have a total of 345 and an average of 69. However, the
second student failed one paper (a mark of 25). When the averages alone
are considered, the two students are equal, but the first student's marks
vary less than the second's. Less variation is a desirable characteristic.
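
The comparison above can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `summarize` is an assumption of this example, and the spread is measured simply as the largest mark minus the smallest.

```python
# Marks taken from the table above.
student_1 = [68, 75, 65, 67, 70]
student_2 = [85, 90, 80, 25, 65]


def summarize(marks):
    """Return the total, the average and the spread (largest minus smallest)."""
    total = sum(marks)
    return {
        "total": total,
        "average": total / len(marks),
        "spread": max(marks) - min(marks),
    }


summary_1 = summarize(student_1)
summary_2 = summarize(student_2)
print("Student I :", summary_1)
print("Student II:", summary_2)
```

Both summaries share the same total and average, while the spread of the second student's marks is much larger.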

3.2- Range:

This is the simplest possible measure of dispersion and is defined as the
difference between the largest and smallest values of the variable.

In symbols, Range = L – S,

where L = largest value and S = smallest value.

In individual observations and discrete series, L and S are easily
identified. In continuous series (frequency tables), either of the
following two methods is used.


Method 1:

L = Upper boundary of the highest class

S = Lower boundary of the lowest class.

Method 2:

L = Mid value of the highest class.

S = Mid value of the lowest class.

Example:

Find the value of range for the following data.

7, 9, 6, 8, 11, 10, 4

Solution:

Range = L – S = 11- 4 = 7

Example:

Calculate range from the following distribution.

Intervals      60-63   63-66   66-69   69-72   72-75
Frequencies      5       1      42      27       8
Solution:

L = Upper boundary of the highest class =75

S = Lower boundary of the lowest class = 60

Range = L – S = 75 – 60 = 15
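
As a rough sketch, both range examples can be reproduced in Python. The function names `range_raw` and `range_grouped` and the `method` argument are assumptions made for this illustration; the two methods correspond to Method 1 (class boundaries) and Method 2 (mid values) described above.

```python
def range_raw(values):
    """Range of individual observations: largest minus smallest."""
    return max(values) - min(values)


def range_grouped(classes, method="boundaries"):
    """Range of a frequency table.

    classes: list of (lower, upper) pairs ordered from lowest to highest class.
    method:  "boundaries" (Method 1) or "midpoints" (Method 2).
    """
    lowest, highest = classes[0], classes[-1]
    if method == "boundaries":
        return highest[1] - lowest[0]
    if method == "midpoints":
        return (highest[0] + highest[1]) / 2 - (lowest[0] + lowest[1]) / 2
    raise ValueError("method must be 'boundaries' or 'midpoints'")


raw_range = range_raw([7, 9, 6, 8, 11, 10, 4])
classes = [(60, 63), (63, 66), (66, 69), (69, 72), (72, 75)]
boundary_range = range_grouped(classes, "boundaries")
midpoint_range = range_grouped(classes, "midpoints")
print(raw_range, boundary_range, midpoint_range)
```

Note that the frequencies are not needed: the range of a frequency table depends only on the lowest and highest classes.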

Merits and Demerits of Range :

Merits:

1. It is simple to understand.

2. It is easy to calculate.

3. In certain types of problems, such as quality control, weather
forecasts and share price analysis, the range is widely used.


Demerits:
1. It is very much affected by the extreme items.
2. It is based on only two extreme observations.
3. It cannot be calculated from open-end class intervals.
4. It is not suitable for mathematical treatment.
5. It is a very rarely used measure.

3.3- Quartile Deviation ( Q.D) :

Definition: Quartile Deviation is half of the difference between the third
and first quartiles.

$$Q.D = \frac{Q_3 - Q_1}{2}$$

Example :
Find the Quartile Deviation for the following data:
391, 384, 591, 407, 672, 522, 777, 733, 1490, 2488
Solution:
Arrange the given values in ascending order.
384, 391, 407, 522, 591, 672, 733, 777, 1490, 2488.
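Number of items, n = 10

Lower quartile, $Q_1 = \left(\frac{n+1}{4}\right)^{\text{th}}$ item

$$Q_1 = \left(\frac{10+1}{4}\right)^{\text{th}} = \left(\frac{11}{4}\right)^{\text{th}} = 2.75^{\text{th}} \text{ term}$$

From the quartile formula we can write:
Q1 = 2nd term + 0.75 (3rd term – 2nd term)
Q1 = 391 + 0.75 (407 – 391) = 391 + 0.75 × 16 = 391 + 12 = 403

Upper quartile, $Q_3 = \left(\frac{3(n+1)}{4}\right)^{\text{th}}$ item

$$Q_3 = \left(\frac{3(10+1)}{4}\right)^{\text{th}} = \left(\frac{33}{4}\right)^{\text{th}} = 8.25^{\text{th}} \text{ term}$$

Q3 = 8th term + 0.25 (9th term – 8th term)
Q3 = 777 + 0.25 (1490 – 777) = 777 + 0.25 × 713
   = 777 + 178.25 = 955.25

$$Q.D = \frac{Q_3 - Q_1}{2} = \frac{955.25 - 403}{2} = 276.125$$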
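
The same steps can be sketched in Python. The helper `quantile_n_plus_1` is a name invented for this illustration; it locates the p(n + 1)-th term and interpolates linearly between the neighbouring ordered values, which is the rule used in the solution above.

```python
def quantile_n_plus_1(values, fraction):
    """Value at position fraction * (n + 1) in the ordered data (1-based),
    with linear interpolation between neighbouring terms."""
    data = sorted(values)
    n = len(data)
    position = fraction * (n + 1)
    lower = int(position)        # whole part: index of the lower term
    weight = position - lower    # fractional part: interpolation weight
    if lower < 1:                # position falls before the first term
        return data[0]
    if lower >= n:               # position falls at or after the last term
        return data[-1]
    return data[lower - 1] + weight * (data[lower] - data[lower - 1])


data = [391, 384, 591, 407, 672, 522, 777, 733, 1490, 2488]
q1 = quantile_n_plus_1(data, 0.25)
q3 = quantile_n_plus_1(data, 0.75)
quartile_deviation = (q3 - q1) / 2
print(q1, q3, quartile_deviation)
```

Other software may use a different interpolation rule for quartiles, so results from statistical packages can differ slightly from this hand method.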


3.4- Mean Deviation (M.D) :

1- For simple data:

It can be obtained by:

$$M.D = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$$
Example:

Calculate mean deviation (M.D) for the following data

3 , 4, 2 , 6 , 5

Solution:

$$\bar{X} = \frac{\sum_{i=1}^{5} X_i}{n} = \frac{20}{5} = 4$$

then

$$M.D = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n} = \frac{|3-4| + |4-4| + |2-4| + |6-4| + |5-4|}{5} = \frac{1+0+2+2+1}{5} = \frac{6}{5} = 1.2$$
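
A minimal Python sketch of the same calculation follows; the function name `mean_deviation` is chosen for this illustration only.

```python
def mean_deviation(values):
    """Average absolute deviation of the values from their arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


md = mean_deviation([3, 4, 2, 6, 5])
print(md)
```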

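2- For grouped data (frequency table):

It can be obtained by:

$$M.D = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{\sum_{i=1}^{k} f_i}$$

where $f_i$ is the frequency of the value (or class midpoint) $X_i$ and k
is the number of distinct values or classes.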
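
The frequency-weighted formula can be sketched as follows. The small frequency table used here is hypothetical, invented only to show the mechanics, and the function name `grouped_mean_deviation` is likewise an assumption of this example.

```python
def grouped_mean_deviation(values, frequencies):
    """Mean deviation from a frequency table (value or midpoint, frequency)."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    return sum(f * abs(x - mean) for x, f in zip(values, frequencies)) / total


# Hypothetical table for illustration only: values 1, 2, 3 with frequencies 1, 2, 1.
values = [1, 2, 3]
frequencies = [1, 2, 1]
grouped_md = grouped_mean_deviation(values, frequencies)

# Cross-check against the simple-data formula on the expanded list 1, 2, 2, 3.
expanded = [1, 2, 2, 3]
expanded_mean = sum(expanded) / len(expanded)
simple_md = sum(abs(x - expanded_mean) for x in expanded) / len(expanded)
print(grouped_md, simple_md)
```

The two results agree, since a frequency table is just a compact way of writing the repeated observations.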

3.5 Variance and Standard Deviation :

Karl Pearson introduced the concept of standard deviation in 1893. It is
the most important measure of dispersion and is widely used in many
statistical formulae. Standard deviation is also called Root-Mean-Square
Deviation, because it is the square root of the mean of the squared
deviations from the arithmetic mean. Because every observation enters the
calculation, it reflects the spread of the whole data set. The square of
the standard deviation is called the variance.


Definition:

It is defined as the positive square root of the arithmetic mean of the
squares of the deviations of the given observations from their arithmetic
mean.

1- Variance and Standard Deviation for Ungrouped or Raw data:

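The sample variance of $x_1, x_2, x_3, \ldots, x_n$ is denoted by $S^2$:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

or

$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}$$

The sample standard deviation is denoted by $S = \sqrt{S^2}$, which is

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$

or

$$S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}}$$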

Example:

Calculate the standard deviation and variance from the following data.

14, 22, 9, 15, 20, 17, 12, 11

Solution:

$$\bar{x} = \frac{14 + 22 + 9 + 15 + 20 + 17 + 12 + 11}{8} = \frac{120}{8} = 15$$


Values (x)    x_i − x̄         (x_i − x̄)²    x²
14            14 – 15 = -1    1             196
22            22 – 15 = 7     49            484
9             -6              36            81
15            0               0             225
20            5               25            400
17            2               4             289
12            -3              9             144
11            -4              16            121
Total 120     0               140           1940

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} = \sqrt{\frac{140}{8-1}} = \sqrt{\frac{140}{7}} = 4.47$$

or

$$S = \sqrt{\frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}} = \sqrt{\frac{1940 - [8 \times (15)^2]}{8-1}} = \sqrt{\frac{1940 - (8 \times 225)}{7}} = 4.47$$

Variance = $S^2 = \frac{140}{7} = 20$
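
Both forms of the formula can be checked with a short Python sketch. The function names are assumptions made for this illustration; both divide by n − 1, as in the sample formulas above.

```python
from math import sqrt


def sample_variance(values):
    """Definition form: sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Shortcut form: (sum of x^2 - n * mean^2) divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return (sum(x * x for x in values) - n * mean ** 2) / (n - 1)


data = [14, 22, 9, 15, 20, 17, 12, 11]
variance = sample_variance(data)
variance_shortcut = sample_variance_shortcut(data)
standard_deviation = sqrt(variance)
print(variance, variance_shortcut, round(standard_deviation, 2))
```

The two forms are algebraically identical; the shortcut form is convenient by hand because it avoids computing each deviation.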

2- Variance and Standard Deviation for Grouped data:


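The following basic formula is used to calculate the variance for grouped data:

$$S^2 = \frac{\sum_{i=1}^{n} f_i (X_i - \bar{X})^2}{\left(\sum_{i=1}^{n} f_i\right) - 1}$$

The standard deviation is denoted by $S = \sqrt{S^2}$, which is

$$S = \sqrt{\frac{\sum_{i=1}^{n} f_i (X_i - \bar{X})^2}{\left(\sum_{i=1}^{n} f_i\right) - 1}}$$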

Example 1:

Thirty farmers were asked how many farm workers they hire during a
typical harvest season. Their responses were:

Workers   0   1   2   3   4   5   6   7   8   9
Farmers   1   1   2   3   6   5   4   3   3   2

Find the variance.

Solution:

x_i     f_i    f_i x_i    x_i − x̄    (x_i − x̄)²    f_i (x_i − x̄)²
0       1      0          -5         25            25
1       1      1          -4         16            16
2       2      4          -3         9             18
3       3      9          -2         4             12
4       6      24         -1         1             6
5       5      25         0          0             0
6       4      24         1          1             4
7       3      21         2          4             12
8       3      24         3          9             27
9       2      18         4          16            32
Total   30     150                                 152

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{150}{30} = 5$$

$$S^2 = \frac{\sum_{i=1}^{n} f_i (X_i - \bar{X})^2}{\left(\sum_{i=1}^{n} f_i\right) - 1} = \frac{152}{30-1} = \frac{152}{29} = 5.241$$
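
A possible Python sketch of this frequency-table calculation is given below; the function name `grouped_sample_variance` is an assumption of this example.

```python
def grouped_sample_variance(values, frequencies):
    """Return (mean, sample variance) computed from a frequency table."""
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    squared = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies))
    return mean, squared / (total - 1)


workers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
farmers = [1, 1, 2, 3, 6, 5, 4, 3, 3, 2]
farm_mean, farm_variance = grouped_sample_variance(workers, farmers)
print(farm_mean, round(farm_variance, 3))
```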


Example 2:

Calculate variance and standard deviation for the following:

Intervals 𝑓𝑖
10-14 2
15-19 12
20-24 23
25-29 60
30-34 77
35-39 38
40-44 8
Solution:

Intervals   f_i    Midpoint x_i   f_i x_i   x_i − x̄    (x_i − x̄)²   f_i (x_i − x̄)²
10-14       2      12             24        -17.82     317.6        635.2
15-19       12     17             204       -12.82     164.4        1972.8
20-24       23     22             506       -7.82      61.2         1407.6
25-29       60     27             1620      -2.82      8            480
30-34       77     32             2464      2.18       4.8          369.6
35-39       38     37             1406      7.18       51.6         1960.8
40-44       8      42             336       12.18      148.4        1187.2
Total       220                   6560                              8013.2

$$\bar{x} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{6560}{220} = 29.82$$

$$S^2 = \frac{\sum_{i=1}^{n} f_i (X_i - \bar{X})^2}{\left(\sum_{i=1}^{n} f_i\right) - 1} = \frac{8013.2}{220-1} = \frac{8013.2}{219} = 36.59$$

$$\rightarrow S = \sqrt{S^2} = \sqrt{36.59} = 6.05$$
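
For class intervals, each class is represented by its midpoint. The sketch below, with illustrative variable names, repeats the calculation without rounding the intermediate values. Because the table rounds the mean and the squared deviations, the unrounded variance (about 36.54) differs slightly from the hand-computed figure.

```python
from math import sqrt

# Class limits and frequencies from the example above.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44)]
frequencies = [2, 12, 23, 60, 77, 38, 8]

# Midpoint of each class = (lower limit + upper limit) / 2.
midpoints = [(low + high) / 2 for low, high in classes]

total = sum(frequencies)
mean = sum(f * x for x, f in zip(midpoints, frequencies)) / total
squared = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies))
variance = squared / (total - 1)
standard_deviation = sqrt(variance)
print(round(mean, 2), round(variance, 2), round(standard_deviation, 2))
```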


Merits and Demerits of Standard Deviation:

Merits:
1. It is the most important and widely used measure of dispersion.
2. It is suitable for further algebraic treatment.
3. If a constant a is added to or subtracted from each observation, the
standard deviation does not change.
4. If each observation is multiplied or divided by a non-zero number a,
the standard deviation is multiplied or divided by |a|.
Demerits:
1. It is not easy to understand and it is difficult to calculate.
2. As an absolute measure expressed in the units of the data, it cannot
be used directly to compare series with different units or means.
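
Merits 3 and 4 can be checked numerically. The sketch below uses `statistics.stdev` from the Python standard library (the sample standard deviation, with divisor n − 1) on the raw-data example from Section 3.5; the shift of 5 and the factor of −2 are arbitrary choices made for this illustration.

```python
from math import isclose
from statistics import stdev

data = [14, 22, 9, 15, 20, 17, 12, 11]
original_sd = stdev(data)

# Adding a constant to every observation leaves the standard deviation unchanged.
shifted_sd = stdev([x + 5 for x in data])

# Multiplying every observation by a scales the standard deviation by |a|.
scaled_sd = stdev([-2 * x for x in data])

print(original_sd, shifted_sd, scaled_sd)
```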

3.6 Coefficient of Variation (C.V) :

The coefficient of variation is obtained by dividing the standard deviation
by the mean and multiplying the result by 100.

$$C.V = \frac{S}{\bar{X}} \times 100$$

If we want to compare the variability of two or more series, we can use
the C.V. The series or group of data with the greater C.V. is more
variable, that is, less stable, less uniform, less consistent or less
homogeneous. The series with the smaller C.V. is less variable, that is,
more stable, more uniform, more consistent or more homogeneous.

Example :

In two factories A and B located in the same industrial area, the average
weekly wages and the standard deviations are as follows:

Factory   Average   Standard Deviation   No. of workers
A         34.5      5                    476
B         28.5      4.5                  524

Which factory, A or B, has greater homogeneity in individual wages?


Solution:

$$C.V(A) = \frac{S_A}{\bar{X}_A} \times 100 = \frac{5}{34.5} \times 100 = 14.49\%$$

$$C.V(B) = \frac{S_B}{\bar{X}_B} \times 100 = \frac{4.5}{28.5} \times 100 = 15.79\%$$

Factory A has greater homogeneity in individual wages, since the C.V. of
factory A is smaller than the C.V. of factory B.
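
The comparison can be reproduced with a small Python sketch; the function name `coefficient_of_variation` and the dictionary layout are assumptions of this example.

```python
def coefficient_of_variation(standard_deviation, mean):
    """C.V. as a percentage: standard deviation divided by mean, times 100."""
    return standard_deviation / mean * 100


factories = {"A": (34.5, 5), "B": (28.5, 4.5)}  # name: (average wage, S.D.)
cv = {
    name: coefficient_of_variation(sd, mean)
    for name, (mean, sd) in factories.items()
}
most_homogeneous = min(cv, key=cv.get)  # smallest C.V. = most homogeneous
print({name: round(value, 2) for name, value in cv.items()}, most_homogeneous)
```

The number of workers is not needed for the C.V.; only the mean and the standard deviation enter the formula.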


3.7 Skewness:
Skewness means ‘lack of symmetry’. We study skewness to have an idea
about the shape of the curve which we can draw with the help of the
given data.
• If in a distribution
Mean = Median = Mode
then that distribution is known as a symmetrical distribution.

[Figure: symmetrical curve, Mean = Median = Mode]

• If in a distribution
Mean ≠ Median ≠ Mode
then it is not a symmetrical distribution; it is called a skewed
distribution, and it can be either positively skewed or negatively skewed.

[Figure: positively skewed curve; from left to right: Mode, Median, Mean]

[Figure: negatively skewed curve; from left to right: Mean, Median, Mode]


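Measures of skewness:

The important measures of skewness are:

1) First coefficient of skewness:

$$\alpha_1 = \frac{\text{Mean} - \text{Mode}}{S.D} = \frac{\bar{X} - \hat{X}}{S}$$

2) Second coefficient of skewness:

$$\alpha_2 = \frac{3(\text{Mean} - \text{Median})}{S.D} = \frac{3(\bar{X} - \tilde{X})}{S}$$

3) Quartile coefficient of skewness:

$$\alpha_3 = \frac{(Q_3 - \tilde{X}) - (\tilde{X} - Q_1)}{Q_3 - Q_1}$$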

Note:

- Coefficient of skewness = zero: symmetric curve.
- Coefficient of skewness positive (+): curve skewed to the right.
- Coefficient of skewness negative (−): curve skewed to the left.

Example:

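Using the following data,

$\bar{X} = 74.92$, $\hat{X} = 76.33$, $\tilde{X} = 75.227$, $S = 22.17$

find $\alpha_1$ and $\alpha_2$.

Solution:

$$\alpha_1 = \frac{\bar{X} - \hat{X}}{S} = \frac{74.92 - 76.33}{22.17} = -0.0636$$

and

$$\alpha_2 = \frac{3(\bar{X} - \tilde{X})}{S} = \frac{3(74.92 - 75.227)}{22.17} = -0.0415$$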
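
The two coefficients in this example can be recomputed with the following sketch. The function names and the wording returned by `describe` are assumptions made for this illustration.

```python
def first_coefficient(mean, mode, sd):
    """First coefficient of skewness: (mean - mode) / S."""
    return (mean - mode) / sd


def second_coefficient(mean, median, sd):
    """Second coefficient of skewness: 3 * (mean - median) / S."""
    return 3 * (mean - median) / sd


def describe(coefficient):
    """Translate the sign of a skewness coefficient into words."""
    if coefficient > 0:
        return "skewed to the right (positive skewness)"
    if coefficient < 0:
        return "skewed to the left (negative skewness)"
    return "symmetric"


alpha_1 = first_coefficient(mean=74.92, mode=76.33, sd=22.17)
alpha_2 = second_coefficient(mean=74.92, median=75.227, sd=22.17)
print(round(alpha_1, 4), describe(alpha_1))
print(round(alpha_2, 4), describe(alpha_2))
```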


Since $\alpha_1 < 0$ and $\alpha_2 < 0$, the curve of the data is skewed
to the left (negative skewness).

3.8 Kurtosis:

Kurtosis measures the degree of peakedness of the frequency curve of a
distribution, that is, the height of the curve relative to its width.
There are three cases:

Platykurtic (flat)       β > 0.263
Mesokurtic (normal)      β = 0.263
Leptokurtic (peaked)     β < 0.263

Measure of Kurtosis:

• The percentile coefficient of kurtosis:

$$\beta = \frac{Q_3 - Q_1}{2(P_{90} - P_{10})}$$

Example:

Using the following data,

$P_{10} = 15.09$, $P_{90} = 78$, $Q_1 = 23.33$, $Q_3 = 66$

find the percentile coefficient of kurtosis $\beta$.

Solution:

$$\beta = \frac{Q_3 - Q_1}{2(P_{90} - P_{10})} = \frac{66 - 23.33}{2(78 - 15.09)} = \frac{42.67}{125.82} = 0.339$$
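
A short sketch of this calculation and of the three-way classification is shown below. The function names and the `tolerance` used to decide when β counts as equal to 0.263 are assumptions of this example, not part of the formula.

```python
NORMAL_BETA = 0.263  # value of the coefficient for the normal (mesokurtic) curve


def percentile_kurtosis(q1, q3, p10, p90):
    """Percentile coefficient of kurtosis: (Q3 - Q1) / (2 * (P90 - P10))."""
    return (q3 - q1) / (2 * (p90 - p10))


def classify(beta, tolerance=0.0005):
    """Classify a curve by comparing beta with 0.263 (tolerance is assumed)."""
    if abs(beta - NORMAL_BETA) <= tolerance:
        return "mesokurtic"
    return "platykurtic" if beta > NORMAL_BETA else "leptokurtic"


beta = percentile_kurtosis(q1=23.33, q3=66, p10=15.09, p90=78)
shape = classify(beta)
print(round(beta, 3), shape)
```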

Since $\beta > 0.263$, the frequency curve of this data is platykurtic
(flat).


Exercises:

1. Fill in the blanks:

i. In a distribution, S.D = 6. If all observations are multiplied by 2,
the resulting S.D is ………………….…

ii. If the minimum value in a set is 9 and its range is 57, the maximum
value of the set is …………………

iii. The standard deviation of the five observations 5, 5, 5, 5, 5 is ……….…

iv. The standard deviation of 10 observations is 15. If 5 is added to each
observation, the value of the new standard deviation is ……………………..

2. Choose the nearest number to your answer.

The mean of four numbers is 71.5 . If three of the numbers are 58, 76,
and 88, then the value of the fourth number is equal to:

(A) 76 (B) 82 (C) 60 (D) 64

3. Find $\bar{X}$, $\tilde{X}$, $\hat{X}$, C.V, $Q_1$, $Q_3$, M.D, Q.D and $S^2$ for the
following data:

15, 10, 12, 12, 10, 9, 10, 14

4. For this table:

Intervals   40-50   50-60   60-70   70-80   80-90
f_i           2       7      12       9       5

Find $\alpha_1$, $\alpha_2$, $\beta$, C.V, $P_{90}$, $D_5$.

5. If we have 5 numbers $X_1, X_2, X_3, X_4, X_5$ where $\sum_{i=1}^{5} (X_i - 6) = 45$:

- Find $\bar{X}$ and $\sum_{i=1}^{5} X_i$.

- If we know that $X_1 = 20$, $X_2 = 15$, $X_3 = 10$, $X_4 = 20$, find also $X_5$.



6. Two samples have been taken from a population:

First sample:   $\sum_{i=1}^{50} X_i^2 = 2984$,   $\sum_{i=1}^{50} X_i = 400$

Second sample:  $\sum_{i=1}^{20} Y_i^2 = 5691$,   $\sum_{i=1}^{20} Y_i = 270$

1. Calculate the variance of each sample.

2. Which sample is more homogeneous?
