Chapter3 - Notes and Exercises

The document provides statistical analysis of two datasets: the heights of 16 small statues found in Egypt and the litres of petrol sold over 60 days at a service station. It includes calculations for mean, median, mode, quartiles, percentiles, variance, standard deviation, coefficient of variation, and skewness, along with methods for constructing a box-plot. Detailed solutions and steps for each calculation are presented to illustrate the statistical concepts applied.

Uploaded by

differhlungwani
Copyright
© All Rights Reserved

Statistics

Chapter 3

Example of ungrouped data

In a rare discovery in Egypt, 16 small statues were found, presumed to be the first from
a much larger population of similar statues yet to be discovered. The heights (in
centimetres) of the statues are:
12.1 12.0 11.7 12.4 12.1 12.2 12.0 12.3
11.9 11.8 12.3 11.9 12.2 12.1 11.8 12.0

Calculate the following:


1. What is the average (mean) height of the statues? (2)
2. Value of the median (2)
3. Is there a mode? (1)
4. Q1 (lower quartile) (2)
5. Q3 (upper quartile) (2)
6. Calculate P30 (percentile 30) (2)
7. Variance (s²) and standard deviation (s) (3 each)
8. Coefficient of variation (CV); indicate variability (2)
9. Coefficient of skewness (SK); comment on shape of data (2)
10. Box-plot; include determination of outliers in data set (5)

Solutions:
1. Mean: $\bar{x} = \frac{\sum x}{n} = \frac{192.8}{16} = 12.05$
• Add all the values: 12.1 + 12.0 + 11.7 + ... + 11.8 + 12.0 = 192.8
• Divide the sum by the number of values (n = 16)

2. Median (n is an even number)

Step 1: Arrange the data values in ascending order:
11.7 11.8 11.8 11.9 11.9 12.0 12.0 12.0 12.1 12.1 12.1 12.2 12.2 12.3 12.3 12.4
Step 2: Determine the median position: $\frac{n}{2} = \frac{16}{2} = 8$, so the median is the average of the 8th and 9th values
Step 3: Calculate the median value: $\frac{12.0 + 12.1}{2} = 12.05$
• Note: when n is odd, the median position is $\frac{n+1}{2}$

3. Mode: the most frequently occurring value(s) = 12.0 and 12.1 (each occurs three times, so there are two modes)
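
The first three answers can be cross-checked with a short Python sketch. It relies only on the standard-library `statistics` module; the variable names are illustrative and not part of the original notes.

```python
import statistics

# Heights (cm) of the 16 statues, as listed in the exercise
heights = [12.1, 12.0, 11.7, 12.4, 12.1, 12.2, 12.0, 12.3,
           11.9, 11.8, 12.3, 11.9, 12.2, 12.1, 11.8, 12.0]

# Mean: sum of the values divided by the number of values
mean = sum(heights) / len(heights)

# Median: for an even n, the average of the two middle values
median = statistics.median(heights)

# Mode(s): multimode returns every value tied for the highest frequency
modes = sorted(statistics.multimode(heights))

print(f"mean = {mean:.2f}, median = {median:.2f}, modes = {modes}")
```

`statistics.multimode` (Python 3.8 or later) is used rather than `statistics.mode` because this data set has two values tied for the highest frequency.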

4. Q1 [use the same steps as for the median]

Step 1: Arrange the data in ascending order
Step 2: Determine the Q1 position (same for n odd or even):
$\frac{n+1}{4} = \frac{16+1}{4} = 4.25$ [thus 0.25 of the way between the 4th and 5th values]
Step 3: Calculate the value of Q1: 11.9 + 0.25(11.9 – 11.9) = 11.9
• [4th value + 0.25(5th value – 4th value)]

5. Q3 [use the same steps as for the median]

Step 1: Arrange the data in ascending order
Step 2: Determine the Q3 position (same for n odd or even):
$\frac{3(n+1)}{4} = \frac{3(16+1)}{4} = 12.75$ [thus 0.75 of the way between the 12th and 13th values]
Step 3: Calculate the value of Q3: 12.2 + 0.75(12.2 – 12.2) = 12.2
• [12th value + 0.75(13th value – 12th value)]

6. P30 [use the same steps as for the quartiles]

Step 1: Arrange the data in ascending order
Step 2: Determine the P30 position (same for n odd or even), with j = 30:
$\frac{j(n+1)}{100} = \frac{30(16+1)}{100} = 5.1$ [thus 0.1 of the way between the 5th and 6th values]
Step 3: Calculate the value of P30: 11.9 + 0.1(12.0 – 11.9) = 11.91
• [5th value + 0.1(6th value – 5th value)]
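
The position-and-interpolate steps used for Q1, Q3 and P30 can be sketched as one small Python function. This is a minimal illustration of the (n + 1) position method from the notes; the function names `value_at_position` and `percentile` are hypothetical.

```python
def value_at_position(sorted_values, position):
    """Return the value at a 1-based, possibly fractional, position."""
    if not 1 <= position <= len(sorted_values):
        raise ValueError("position lies outside the data")
    whole = int(position)            # e.g. 4 for position 4.25
    fraction = position - whole      # e.g. 0.25
    lower = sorted_values[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[whole]
    # lower value + fraction * (next value - lower value)
    return lower + fraction * (upper - lower)


def percentile(values, j):
    """j-th percentile using the position j(n + 1)/100."""
    ordered = sorted(values)                    # Step 1
    position = j * (len(ordered) + 1) / 100     # Step 2
    return value_at_position(ordered, position) # Step 3


heights = [12.1, 12.0, 11.7, 12.4, 12.1, 12.2, 12.0, 12.3,
           11.9, 11.8, 12.3, 11.9, 12.2, 12.1, 11.8, 12.0]

q1 = percentile(heights, 25)   # position 4.25
q3 = percentile(heights, 75)   # position 12.75
p30 = percentile(heights, 30)  # position 5.1
print(round(q1, 2), round(q3, 2), round(p30, 2))
```

Other textbooks and software packages use different position rules, so results can differ slightly from this method. For quartiles, Python's `statistics.quantiles(heights, n=4)` with its default `method="exclusive"` appears to follow the same (n + 1) positioning.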

7. Variance: $s^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{2\,323.84 - 16(12.05)^2}{16-1} = \frac{0.60}{15} = 0.04$

Standard deviation: $s = \sqrt{s^2} = \sqrt{0.04} = 0.2$

• Note: $\sum x^2 = 11.7^2 + 11.8^2 + 11.8^2 + 11.9^2 + \dots + 12.3^2 + 12.4^2 = 2\,323.84$
• Note: use the value for $\bar{x}$ as calculated in question 1: $\bar{x} = 12.05$
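
As a check on the arithmetic, the computational formula for the sample variance can be evaluated directly. The sketch below assumes nothing beyond the data in the exercise; the variable names are illustrative.

```python
import math

heights = [12.1, 12.0, 11.7, 12.4, 12.1, 12.2, 12.0, 12.3,
           11.9, 11.8, 12.3, 11.9, 12.2, 12.1, 11.8, 12.0]

n = len(heights)
mean = sum(heights) / n

# Sum of the squared values (about 2 323.84 in the notes)
sum_sq = sum(x * x for x in heights)

# Sample variance: (sum of x^2 - n * mean^2) / (n - 1)
variance = (sum_sq - n * mean ** 2) / (n - 1)

# Standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(f"sum of squares = {sum_sq:.2f}, s^2 = {variance:.2f}, s = {std_dev:.2f}")
```

The formula subtracts two large, nearly equal numbers, so in floating point the result is only approximately 0.04; `statistics.variance(heights)` offers an independent check.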

8. Coefficient of variation: $CV = \frac{s}{\bar{x}} \times 100\% = \frac{0.2}{12.05} \times 100\% = 1.66\%$ (very little variability)

9. Skewness: $SK = \frac{3(\bar{x} - \text{Median})}{s} = \frac{3(12.05 - 12.05)}{0.2} = 0$ (no skewness; the data are symmetrical)

• Note: use the values for $\bar{x}$, the median and s as calculated in questions 1, 2 and 7.
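
The coefficient of variation and the skewness coefficient used above both follow from the mean, median and standard deviation, as this short sketch shows. Variable names are illustrative.

```python
import statistics

heights = [12.1, 12.0, 11.7, 12.4, 12.1, 12.2, 12.0, 12.3,
           11.9, 11.8, 12.3, 11.9, 12.2, 12.1, 11.8, 12.0]

mean = statistics.mean(heights)
median = statistics.median(heights)
s = statistics.stdev(heights)      # sample standard deviation (n - 1)

# Coefficient of variation: standard deviation as a percentage of the mean
cv = s / mean * 100

# Skewness coefficient from the notes: 3(mean - median) / s
sk = 3 * (mean - median) / s

print(f"CV = {cv:.2f}%, SK = {sk:.2f}")
```

Because of floating-point rounding, `sk` may come out as a tiny non-zero number rather than exactly 0; it should be read as zero skewness.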

10. Box-plot

Step 1: Set up the five-number summary:

Minimum value: 11.7
Q1: 11.9
Median: 12.05
Q3: 12.2
Maximum value: 12.4

Step 2: At the bottom of the graph, construct a number line that includes the minimum and maximum values (here from 11.7 to 12.4 in steps of 0.1).
Step 3: Draw a box between the values of Q1 and Q3, and mark the median with a line inside the box.
Step 4: Draw a horizontal line from the minimum value to Q1 (the left edge of the box), and another from Q3 (the right edge of the box) to the maximum value.
Step 5: Check for outliers.

Answer for graph/box-plot:

Box-plot for heights of statues in Egypt

[Figure: box-plot drawn above a number line from 11.7 to 12.4, with arrows marking Min. (11.7), Q1 (11.9), Me (12.05), Q3 (12.2) and Max. (12.4)]

Calculations for outliers:

• Lower limit: Q1 – 1.5(Q3 – Q1) = 11.9 – 1.5(12.2 – 11.9) = 11.45
• Upper limit: Q3 + 1.5(Q3 – Q1) = 12.2 + 1.5(12.2 – 11.9) = 12.65

The minimum value of 11.7 and the maximum value of 12.4 do not lie outside these two limits;
thus there are no outliers.
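
The outlier check can also be expressed in a few lines of Python. This sketch takes Q1 and Q3 from questions 4 and 5 as given rather than recomputing them; the variable names are illustrative.

```python
heights = [12.1, 12.0, 11.7, 12.4, 12.1, 12.2, 12.0, 12.3,
           11.9, 11.8, 12.3, 11.9, 12.2, 12.1, 11.8, 12.0]

q1, q3 = 11.9, 12.2          # values from questions 4 and 5
iqr = q3 - q1                # interquartile range

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Any value outside the two limits counts as an outlier
outliers = [x for x in heights if x < lower_limit or x > upper_limit]

print(f"limits: {lower_limit:.2f} to {upper_limit:.2f}, outliers: {outliers}")
```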

Example of grouped data – Chapter 3

The number of litres of petrol (in 1000 litres) sold at a service station was recorded for
each of 60 days. The amounts were summarised as follows:

Amount (1000 litres)   Days (f)   Midpoint (x)   fx            Cum f
0 – < 10               6          5              30            6
10 – < 20              14         15             210           20     ← Q1 interval
20 – < 30              21         25             525           41     ← median interval; mode interval
30 – < 40              13         35             455           54     ← Q3 interval
40 – < 50              6          45             270           60
Total                  Ʃf = 60                   Ʃfx = 1 490

The first two columns are given. The x and fx columns are added for question 1 and the
Cum f column for question 2. In each class, the first value is the lower limit and the
second value is the upper limit.

Calculate the following:


1. Arithmetic mean (average) (3)
2. Median (4)
3. Mode (4)
4. Q1 (4)
5. Q3 (4)

Solution:
Notes:
• class width: c = 10
• number of class intervals: k = 5
• count of observations per class: f (2nd column, given)
• number of observations/sample size: n = Ʃf
• take note of the "lower limits" and "upper limits" as indicated above

1. To calculate the mean: $\bar{x} = \frac{\sum fx}{\sum f}$

Step 1: Calculate (add a column for) x, the class midpoint: lower limit + ½c
• find c (class width) = 10, so ½c = 5; add ½c (in this case 5) to each lower
limit, e.g. lower limit of 1st class = 0, thus 0 + 5 = 5 (x for 1st class);
lower limit of 2nd class = 10, thus 10 + 5 = 15 (x for 2nd class); and so on
for all classes: x: 5, 15, 25, 35, 45
Step 2: Calculate (add a column for) fx, e.g. f for 1st class = 6 and x for 1st class = 5,
thus fx for 1st class = 6 × 5 = 30; 2nd class: 14 × 15 = 210; and so on for each
class: fx: 30, 210, 525, 455, 270
Step 3: Find Ʃfx by adding: 30 + 210 + 525 + 455 + 270 = 1 490
Step 4: Substitute in the formula:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1\,490}{60} = 24.83$
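
The four steps for the grouped mean translate directly into code. The sketch below stores each class as a hypothetical (lower limit, upper limit, frequency) tuple; that layout is an assumption made for illustration.

```python
# Each class: (lower limit, upper limit, frequency f), in 1000 litres
classes = [(0, 10, 6), (10, 20, 14), (20, 30, 21), (30, 40, 13), (40, 50, 6)]

# Step 1: class midpoints x (same as lower limit + half the class width)
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]

# Step 2: f times x for each class
fx = [f * x for (_, _, f), x in zip(classes, midpoints)]

# Step 3: totals
n = sum(f for _, _, f in classes)
sum_fx = sum(fx)

# Step 4: mean = sum of fx / sum of f
mean = sum_fx / n

print(f"n = {n}, sum fx = {sum_fx:.0f}, mean = {mean:.2f}")
```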
2. To calculate the median: $Me = O_{me} + \frac{c\left[\frac{n}{2} - f(<)\right]}{f_{me}}$

where: • $O_{me}$ : lower limit of the median interval
• c : class width
• $\frac{n}{2}$ : median position
• f(<) : cumulative f of the previous class
• $f_{me}$ : f (frequency) of the median interval

Step 1: Calculate (add a column for) cum f: 6, 20, 41, 54, 60
Step 2: Determine the median position: $\frac{n}{2} = \frac{60}{2} = 30$, i.e. the 30th value
Step 3: Identify the median class interval by looking at cum f: the 30th value
lies in the 3rd class, the first class whose cum f is > 30
Step 4: Substitute the values in the formula:

$Me = 20 + \frac{10(30 - 20)}{21} = 24.76$
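
Finding the median class from the cumulative frequencies and then interpolating can be sketched as follows. The list names are illustrative, and the class search uses ">=" rather than ">" as a small assumption that gives the same answer for this data.

```python
from itertools import accumulate

lower_limits = [0, 10, 20, 30, 40]   # lower limit of each class
freqs = [6, 14, 21, 13, 6]           # frequency f of each class
c = 10                               # class width

# Step 1: cumulative frequencies
cum_f = list(accumulate(freqs))      # [6, 20, 41, 54, 60]

# Step 2: median position n/2
n = cum_f[-1]
position = n / 2

# Step 3: first class whose cumulative frequency reaches the position
i = next(k for k, cf in enumerate(cum_f) if cf >= position)
f_below = cum_f[i - 1] if i > 0 else 0   # cum f of the previous class

# Step 4: lower limit + c * (position - cum f below) / class frequency
median = lower_limits[i] + c * (position - f_below) / freqs[i]

print(f"median class index = {i}, median = {median:.2f}")
```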

3. To calculate the mode: $Mo = O_{mo} + \frac{c(f_m - f_{m-1})}{2f_m - f_{m-1} - f_{m+1}}$

where: • $O_{mo}$ : lower limit of the mode interval
• c : class width
• $f_m$ : f (frequency) of the mode interval
• $f_{m-1}$ : f of the previous class
• $f_{m+1}$ : f of the next class

Step 1: Identify the mode interval: choose the class interval with the highest
frequency (f): the 3rd class (with f of 21)
Step 2: Substitute in the formula:

$Mo = 20 + \frac{10(21 - 14)}{2(21) - 14 - 13} = 24.67$
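
The grouped-mode formula can be checked with the sketch below. It assumes a single modal class and, as an extra assumption not stated in the notes, treats a missing neighbouring class as having frequency 0.

```python
lower_limits = [0, 10, 20, 30, 40]   # lower limit of each class
freqs = [6, 14, 21, 13, 6]           # frequency f of each class
c = 10                               # class width

# Step 1: modal class = class with the highest frequency
m = freqs.index(max(freqs))
f_m = freqs[m]
f_prev = freqs[m - 1] if m > 0 else 0               # assumption: 0 if no previous class
f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumption: 0 if no next class

# Step 2: substitute in the formula
mode = lower_limits[m] + c * (f_m - f_prev) / (2 * f_m - f_prev - f_next)

print(f"modal class index = {m}, mode = {mode:.2f}")
```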

• NB: the formulae for Q1 and Q3 are not given on the list of formulae; students must
derive them from the formula for the median

4. To calculate Q1 (on the same basis as the median): $Q_1 = O_{q1} + \frac{c\left[\frac{n}{4} - f(<)\right]}{f_{q1}}$

where: • $O_{q1}$ : lower limit of the Q1 interval
• c : class width
• $\frac{n}{4}$ : Q1 position
• f(<) : cumulative f of the previous class
• $f_{q1}$ : f (frequency) of the Q1 interval

Step 1: Add a column for cum f: 6, 20, 41, 54, 60
Step 2: Determine the Q1 position: $\frac{n}{4} = \frac{60}{4} = 15$, i.e. the 15th value
Step 3: Identify the Q1 class interval by looking at cum f: the 15th value
lies in the 2nd class, the first class whose cum f is > 15
Step 4: Substitute the values in the formula:

$Q_1 = 10 + \frac{10(15 - 6)}{14} = 16.43$

5. To calculate Q3 (on the same basis as the median): $Q_3 = O_{q3} + \frac{c\left[\frac{3n}{4} - f(<)\right]}{f_{q3}}$

where: • $O_{q3}$ : lower limit of the Q3 interval
• c : class width
• $\frac{3n}{4}$ : Q3 position
• f(<) : cumulative f of the previous class
• $f_{q3}$ : f (frequency) of the Q3 interval

Step 1: Add a column for cum f: 6, 20, 41, 54, 60
Step 2: Determine the Q3 position: $\frac{3n}{4} = \frac{3(60)}{4} = 45$, i.e. the 45th value
Step 3: Identify the Q3 class interval by looking at cum f: the 45th value
lies in the 4th class, the first class whose cum f is > 45
Step 4: Substitute the values in the formula (f(<) is the cum f of the 3rd class, 41):

$Q_3 = 30 + \frac{10(45 - 41)}{13} = 33.08$
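
Since Q1, Q3 and the median all use the same interpolation with a different position, they can be covered by one helper. The function `grouped_quantile` below is a hypothetical sketch, not a formula from the list of formulae.

```python
def grouped_quantile(lower_limits, freqs, c, fraction):
    """Interpolated value at position fraction * n for grouped data.

    fraction = 0.25 gives Q1, 0.5 the median and 0.75 gives Q3.
    """
    n = sum(freqs)
    position = fraction * n
    cum_below = 0                      # cumulative f of the previous classes
    for lower, f in zip(lower_limits, freqs):
        # First non-empty class whose cumulative frequency reaches the position
        if f > 0 and cum_below + f >= position:
            return lower + c * (position - cum_below) / f
        cum_below += f
    raise ValueError("fraction must lie between 0 and 1")


lower_limits = [0, 10, 20, 30, 40]
freqs = [6, 14, 21, 13, 6]

q1 = grouped_quantile(lower_limits, freqs, 10, 0.25)   # position 15
q3 = grouped_quantile(lower_limits, freqs, 10, 0.75)   # position 45
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}")
```

For Q3 the cumulative frequency below the 4th class is 41, so the interpolation is 30 + 10(45 - 41)/13.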
