Data Analysis Exam Question new
DATA ANALYSIS EXAM QUESTIONS
Created by T. Madas
Question 1 (**)
The number of phone text messages sent by 11 different students is given below.
14, 25, 31, 36, 37, 41, 51, 52, 55, 79, 112.
a) Find the lower quartile, the median and the upper quartile of the data.
c) Draw a suitably labelled box plot for this data, clearly indicating any outliers.
Question 2 (**)
The number of bottles of red wine sold by a local supermarket over a two-week period
is shown below.
22, 14, 11, 33, 32, 45, 4, 12, 13, 20, 27, 44, 30, 15.
c) Find the median and the quartiles of the data and use them to determine if there
are any outliers.
Question 3 (**+)
The concentration of lactic acid, in appropriate units, after a period of intense exercise
was measured in the blood of 12 marathon runners.
Athlete                     A    B    C    D    E    F    G    H    I    J    K    L
Lactic Acid Concentration   180  172  110  175  256  140  241  450  205  375  402  195
$\dfrac{3(\text{mean} - \text{median})}{\text{standard deviation}}$.
c) Evaluate this expression for this data and hence state its skew.
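The expression in part (c) can be checked numerically. The sketch below assumes the standard deviation is the population form (dividing by n); the function name `pearson_median_skew` is introduced here purely for illustration.

```python
from statistics import mean, median, pstdev

# Lactic acid concentrations for athletes A to L, copied from the table above.
concentrations = [180, 172, 110, 175, 256, 140, 241, 450, 205, 375, 402, 195]

def pearson_median_skew(data):
    """Return 3 * (mean - median) / standard deviation.

    Assumption: the standard deviation is the population form (divide by n).
    """
    return 3 * (mean(data) - median(data)) / pstdev(data)

skew = pearson_median_skew(concentrations)
print(round(skew, 2))
```

A positive value indicates positive skew and a negative value indicates negative skew.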
Question 4 (**+)
The % marks, rounded to the nearest integer, of a recent Mathematics test taken by 16
students, were summarised in an ordered stem and leaf diagram.
4 | 7
5 | 2, 3, 8
6 | 0, 3, 4, a, b
7 | 3, 6, c, d, 8
8 | 1, 9
where 5 | 2 = 52.
b) Given the median is 68 and a ≠ b , find the value of a and the value of b .
Question 5 (**+)
A company decides to give their 23 employees a skills test in order to decide if any of
these employees need to be retrained.
The maximum possible score in this test is 50 and the results are summarised in an
ordered stem and leaf diagram.
0 | 5
1 | 9, 9
2 | 1, 6, 8
3 | 3, 4, 5, 7
4 | 2, 3, 4, 4, 8, 9, 9
5 | 0, 0, 0, 0, 0, 0
where 2 | 9 = 29.
The company decides to retrain any employee whose score is less than the lower
quartile minus the interquartile range.
d) Draw a suitably labelled box plot for this data, clearly indicating any outliers,
as found in part (c).
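The retraining rule can be sketched in code. The quartile convention used below is an assumption, since the question does not state one: compute k·n/4, average two neighbouring observations when that is a whole number, and otherwise round up to the next observation. The helper name `quartile` is hypothetical.

```python
import math

# Skills-test scores read from the stem and leaf diagram above (23 employees).
scores = [5, 19, 19, 21, 26, 28, 33, 34, 35, 37, 42, 43,
          44, 44, 48, 49, 49, 50, 50, 50, 50, 50, 50]

def quartile(sorted_data, k):
    """k-th quartile (k = 1, 2, 3) of already sorted data.

    Assumed convention: take k*n/4; if it is a whole number, average that
    observation and the next one; otherwise round up to the next observation.
    """
    n = len(sorted_data)
    numerator = k * n
    if numerator % 4 == 0:
        i = numerator // 4
        return (sorted_data[i - 1] + sorted_data[i]) / 2
    return sorted_data[math.ceil(numerator / 4) - 1]

q1, q2, q3 = (quartile(scores, k) for k in (1, 2, 3))
iqr = q3 - q1
threshold = q1 - iqr  # rule stated in the question: lower quartile minus IQR
to_retrain = [s for s in scores if s < threshold]
print(q1, q2, q3, threshold, to_retrain)
```

A different quartile convention (for example linear interpolation between positions) may give slightly different quartiles and therefore a different threshold.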
Question 6 (**+)
The following set of data shows the number of posts made, in a given day, in a social
media site by a group of individuals.
1, 12, 13, 14, 16, 17, 20, 21, 23, 24, 26, 39, 55.
Question 7 (***)
A farmer keeps chickens and sells most of the eggs they lay.
The table below summarizes information about the number of eggs laid by his chickens
every week, for a period of 47 weeks.
a) Calculate the mean and the standard deviation of the eggs laid per week.
c) If the farmer only sells 45 eggs per week and keeps the rest for his family, find
the mean and the standard deviation of the eggs he keeps for his family.
d) Use the median and mean to determine the skew of the above data, and hence
determine whether this data can be modelled by a Normal distribution.
Question 8 (***)
The number of hours worked in a given week by a group of 64 individuals is
summarized in the table below.
Hours (nearest hour)    Frequency
1 – 10                  5
11 – 20                 16
21 – 25                 14
26 – 30                 17
31 – 40                 10
41 – 59                 2
Question 9 (***)
A group of patients with a certain respiratory condition were asked to hold their breath
for as long as they could.
Time t (in seconds)     Frequency
0 < t ≤ 10              30
10 < t ≤ 15             35
15 < t ≤ 18             33
18 < t ≤ 20             20
20 < t ≤ 30             25
30 < t ≤ 50             10
b) Use the histogram to estimate the number of patients that managed to hold their
breath between 24 and 36 seconds.
c) Calculate estimates for the mean and standard deviation of this data.
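Estimates of this kind are usually obtained by treating every observation in a class as if it sat at the class midpoint. The sketch below follows that approach with the population form of the standard deviation; both choices are assumptions rather than a stated method, and the function name `grouped_mean_sd` is hypothetical.

```python
import math

# (lower bound, upper bound, frequency) for each class in the table above.
classes = [(0, 10, 30), (10, 15, 35), (15, 18, 33),
           (18, 20, 20), (20, 30, 25), (30, 50, 10)]

def grouped_mean_sd(classes):
    """Estimate the mean and population standard deviation from class midpoints."""
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2
    return mean, math.sqrt(variance)

mean_t, sd_t = grouped_mean_sd(classes)
print(round(mean_t, 2), round(sd_t, 2))
```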
Question 10 (***)
The daily commuting distances of 125 individuals, rounded to the nearest mile, are
summarised in the table below.
Distance (nearest mile)    Frequency
0 – 9                      12
10 – 19                    22
20 – 29                    48
30 – 39                    26
40 – 49                    8
50 – 59                    5
60 – 69                    3
70 – 79                    1
a) Estimate the mean and the standard deviation of these commuting distances.
d) Explain which out of the mean and standard deviation or the median and the
interquartile range are more appropriate measures to summarize this data.
Question 11 (***)
The ages of the residents of Arnold Street are denoted by x and the ages of the residents
of Benedict Street are denoted by y.
These are summarized in the following back to back stem and leaf diagram.
                                x |   | y
                                5 | 0 |
                       5, 5, 3, 3 | 1 |
                          9, 9, 1 | 2 | 5
  9, 8, 6, 5, 5, 4, 3, 2, 2, 2, 1 | 3 | 6, 7, 8
              6, 4, 1, 0, 0, 0, 0 | 4 | 1, 2, 2, 3, 4, 8
                                9 | 5 | 1, 4, 4, 4, 4, 5, 8, 8
                                  | 6 | 1, 3, 4, 4, 5, 9, 9
                                  | 7 | 2, 6, 9
a) Find separately for the residents of Arnold Street and Benedict Street, ...
ii. ... the lower quartile, the median and the upper quartile.
$\dfrac{\text{mean} - \text{mode}}{\text{standard deviation}}$.
MMS-D ,
x: mode = 40 , Q1 = 29 , Q2 = 34 , Q3 = 40 , $\bar{x} \approx 32.07$ , $\sigma_x \approx 11.77$ , skew ≈ −0.67
y: mode = 54 , Q1 = 42.5 , Q2 = 54 , Q3 = 64 , $\bar{y} \approx 54.14$ , $\sigma_y \approx 13.09$ , skew ≈ 0.01
Question 12 (***)
The number of hours worked in a given week by a group of 64 freelance electricians
is summarized in the table below.
Hours (nearest hour)    Frequency
1 – 10                  5
11 – 20                 16
21 – 25                 14
26 – 30                 17
31 – 40                 10
41 – 59                 2
b) Use the histogram to estimate the number of freelance electricians that worked
between 15 and 37 hours during that week.
MMS-L , ≈ 48 , Q2 ≈ 24.4
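The two quoted estimates can be reproduced by linear interpolation within classes. The sketch below rests on two assumptions: hours recorded to the nearest hour give class boundaries at the half-hours (so "11 – 20" covers 10.5 to 20.5), and "between 15 and 37 hours" is read as 14.5 to 37.5. That reading gives a count close to the quoted ≈ 48. The function names are hypothetical.

```python
# (lower boundary, upper boundary, frequency); boundaries sit at half-hours
# because the hours are recorded to the nearest hour (assumption).
classes = [(0.5, 10.5, 5), (10.5, 20.5, 16), (20.5, 25.5, 14),
           (25.5, 30.5, 17), (30.5, 40.5, 10), (40.5, 59.5, 2)]

def count_between(classes, a, b):
    """Estimate how many observations fall in [a, b], assuming each class
    is spread evenly across its width (equal area, equal frequency)."""
    total = 0.0
    for lo, hi, f in classes:
        overlap = max(0.0, min(hi, b) - max(lo, a))
        total += f * overlap / (hi - lo)
    return total

def median_by_interpolation(classes):
    """Estimate the median as the (n/2)-th observation by linear interpolation."""
    n = sum(f for _, _, f in classes)
    target = n / 2
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f

estimate = count_between(classes, 14.5, 37.5)
med = median_by_interpolation(classes)
print(round(estimate, 1), round(med, 1))
```

Using 15 and 37 directly as the limits, without the half-hour adjustment, gives a slightly smaller count.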
Question 13 (***)
The number of hours spent on homework by 70 students, in a particular week, is
summarized in the table below.
Hours (nearest hour)    Frequency
2 – 3                   6
4 – 6                   18
7 – 9                   15
10 – 11                 18
12                      7
13 – 15                 6
b) Use the histogram to estimate the number of students that spent between 7.75
and 13.5 hours during that week.
MMS-E , ≈ 36 , Q2 ≈ 7.72
Question 14 (***)
The times taken to complete a 3 mile run, in minutes, by the members of a jogging
club are summarized in the table below.
Times (nearest minute)    Frequency
11 – 14                   24
15 – 17                   24
18 – 19                   19
20                        11
21 – 23                   21
24 – 28                   15
d) Find the proportion of data which lies within 3 standard deviations of the mean.
Question 15 (***)
The monthly mileages of a sales rep are summarised in the table below.
$y = \dfrac{x - 3325}{50}$,
where x represents the midpoint of each class, estimate the mean and the standard
deviation of this data.
Question 16 (***+)
[Histogram: Frequency Density on the vertical axis; figure not reproduced]
The histogram above shows the distribution of the heights, to the nearest cm , of some
plants in a garden centre. It is further given that there were 18 plants with a height
between 5 cm and 8 cm , rounded to the nearest cm .
b) Estimate, by calculation, the mean and the standard deviation of the heights of
these plants.
Question 17 (***+)
In a histogram the commuting times of a group of individuals, correct to the nearest
minute, are plotted on the x axis.
Question 18 (***+)
The diameters of fine sand particles, in mm, are summarised in the table below.
y = 50 ( x − 0.09 ) ,
where x represents the midpoint of each class, estimate the mean and the
standard deviation of this data.
Question 19 (***+)
In a histogram the weights of apples, W grams, are plotted on the x axis.
In this histogram the class 125 ≤ W < 130 has a frequency of 75 and is represented by
a rectangle of base 1.8 cm and height 12 cm .
In the same histogram the class 150 ≤ W < 170 has a frequency of 40 .
Find the measurements, in cm , of the rectangle that represents the class 150 ≤ W < 170 .
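A sketch of the scaling involved, assuming the usual histogram conventions: the base of a rectangle is proportional to the class width and its area is proportional to the class frequency. The function name `scaled_rectangle` and its parameter names are hypothetical.

```python
def scaled_rectangle(ref_width, ref_freq, ref_base_cm, ref_height_cm,
                     width, freq):
    """Base and height (in cm) of a histogram rectangle, scaled from a
    reference rectangle drawn in the same histogram."""
    cm_per_unit = ref_base_cm / ref_width                    # horizontal scale
    area_per_item = ref_base_cm * ref_height_cm / ref_freq   # cm^2 per unit of frequency
    base_cm = cm_per_unit * width
    height_cm = area_per_item * freq / base_cm
    return base_cm, height_cm

# Reference class 125 <= W < 130: width 5, frequency 75, drawn 1.8 cm by 12 cm.
# Target class 150 <= W < 170: width 20, frequency 40.
base, height = scaled_rectangle(5, 75, 1.8, 12, 20, 40)
print(round(base, 2), round(height, 2))
```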
Question 20 (***+)
The masses, x kg , of 40 students were measured and the results were summarized
using the notation below.
$\sum_{n=1}^{40} (x_n - 50) = 140$ and $\sum_{n=1}^{40} (x_n - 50)^2 = 4490$.
Calculate the mean and standard deviation of the masses of these 40 students.
MMS-O , $\bar{x} = 53.5$ , $\sigma = 10$
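The quoted values can be checked with a short calculation on the coded sums. The sketch assumes the population form of the standard deviation (dividing by n); the function name `mean_sd_from_coded_sums` is hypothetical.

```python
import math

def mean_sd_from_coded_sums(n, shift, sum_d, sum_d2):
    """Mean and population standard deviation of x, given sums of d = x - shift."""
    mean_d = sum_d / n
    # Subtracting a constant from every value leaves the spread unchanged,
    # so the variance of d is also the variance of x.
    variance = sum_d2 / n - mean_d ** 2
    return shift + mean_d, math.sqrt(variance)

mean_x, sd_x = mean_sd_from_coded_sums(n=40, shift=50, sum_d=140, sum_d2=4490)
print(mean_x, sd_x)
```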
Question 21 (***+)
In a histogram the weights of peaches, correct to the nearest gram, are plotted on the x
axis.
In this histogram the class 146 − 150 has a frequency of 75 and is represented by a
rectangle of base 2.8 cm and height 7.5 cm .
MMS-D , f = 210
Question 22 (***+)
The following information about 5 observations of x is shown below.
$\sum_{i=1}^{5} \dfrac{x_i - 255}{2} = 50$ and $\sum_{i=1}^{5} \left( \dfrac{x_i - 255}{2} \right)^2 = 1650$.
Question 23 (***+)
In a histogram the heights, h cm , of primary school pupils are plotted on the x axis.
In this histogram the class 120 ≤ h < 130 has a frequency of 72 and is represented by a
rectangle of base 4.2 cm and height 9 cm .
MMS-J , f = 32
Question 24 (***+)
The table below shows the length of time, rounded to the nearest minute, spent by a
group of patients for their dentist's check up visit.
Determine the standard deviation of these times, given that the mean of these times is
18.6 minutes.
MMS-M , σ ≈ 9.37
Question 25 (***+)
In a histogram the weights of baby hamsters, correct to the nearest gram, are plotted on
the x axis.
Question 26 (***+)
The distances, rounded to the nearest mile, of 64 journeys covered by a taxi driver
during a given week are summarized in the table below.
Distance (nearest mile)    Frequency
3 – 5                      12
6 – 7                      14
8                          19
9 – 11                     13
12 – 17                    6
a) Estimate the mean and the standard deviation of these weekly distances.
In a histogram drawn for the above data, the class 3 – 5 is represented by a rectangle of
base length 1.2 cm and height 5 cm .
c) Find the base length and height of the rectangle representing the class 12 – 17
in the same histogram.
It is further given that the lower and upper quartiles of these distances are 6.07 and
9.19 , respectively.
e) By considering the skewness using the averages, discuss briefly whether the
above set of data can be modelled by a normal distribution.
Question 27 (***+)
b) Estimate, by linear interpolation, the median value and hence determine with
justification, the skewness of the data.
In a histogram drawn for the above data, the 1 ≤ w < 3 class is represented by a
rectangle of base length 2.4 cm and height 2.5 cm .
c) Find the base length and height of the rectangle representing the 6.5 ≤ w < 7
class in the same histogram.
It is further given that the lower and upper quartiles of these distances are 4.68 and
6.43 , respectively.
e) Discuss briefly whether the above set of data can be modelled by a normal
distribution.
Question 28 (***+)
The masses of 68 cows, in kg, are summarised in the table below.
$y = \dfrac{x - 662.5}{25}$,
where x represents the midpoint of each class, estimate the mean and standard
deviation of this data.
b) Estimate, by the method of linear interpolation, the median mass of these cows.
Question 29 (****)
The histogram below shows the distribution of the marks of 250 students.
[Histogram: Frequency Density (vertical axis) against Marks (horizontal axis, labelled at 0, 20, 40, 50, 60 and 100); figure not reproduced]
c) Calculate estimates for the mean and standard deviation of the marks of these
students.
Question 30 (****)
The mean and standard deviation of 20 observations x1, x2 , x3, ..., x20 are
The mean and standard deviation of 12 observations y1, y2 , y3, ..., y12 are
$\bar{y} = 25$ and $\sigma_y = 7.5$.
Question 31 (****)
The mean and standard deviation of the test marks of 40 pupils in a Mathematics class
are 65 and 18 , respectively.
The mean and standard deviation of the test marks of the 24 boys of the class are 72
and 20 , respectively.
Find the mean and standard deviation of the test marks of the 16 girls of the class.
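One possible route is to recover the girls' totals by subtracting the boys' totals from the class totals. The sketch below assumes every standard deviation is the population form, so that the sum of squares of a group equals n(σ² + mean²); the function name `remaining_group` is hypothetical.

```python
import math

def remaining_group(n_all, mean_all, sd_all, n_a, mean_a, sd_a):
    """Size, mean and population sd of the part of a data set left over
    after removing a subgroup with known size, mean and sd."""
    n_b = n_all - n_a
    sum_b = n_all * mean_all - n_a * mean_a
    # Sum of squares of a group is n * (sd^2 + mean^2) for the population sd.
    sumsq_b = n_all * (sd_all**2 + mean_all**2) - n_a * (sd_a**2 + mean_a**2)
    mean_b = sum_b / n_b
    var_b = sumsq_b / n_b - mean_b**2
    return n_b, mean_b, math.sqrt(var_b)

# Whole class: 40 pupils, mean 65, sd 18. Boys: 24 pupils, mean 72, sd 20.
n_girls, mean_girls, sd_girls = remaining_group(40, 65, 18, 24, 72, 20)
print(n_girls, mean_girls, round(sd_girls, 2))
```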
Question 32 (****)
The masses, x kg , of 40 students were measured and the results were summarized
using the notation below.
$\sum_{n=1}^{40} (x_n - 50) = 150$ and $\sum_{n=1}^{40} (x_n - 50)^2 = 4650$.
Determine the value of $\sum_{n=1}^{40} (x_n)^2$.
MMS-X , $\sum_{n=1}^{40} (x_n)^2 = 119650$
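One way to reach the quoted value is to expand the coded sums. The working below is a sketch of that approach, not necessarily the intended method.

```latex
\begin{align*}
\sum_{n=1}^{40}(x_n-50) &= \sum_{n=1}^{40}x_n - 40\times 50 = 150
  \quad\Rightarrow\quad \sum_{n=1}^{40}x_n = 2150,\\
\sum_{n=1}^{40}(x_n-50)^2 &= \sum_{n=1}^{40}x_n^2 - 100\sum_{n=1}^{40}x_n + 40\times 50^2 = 4650,\\
\sum_{n=1}^{40}x_n^2 &= 4650 + 100\times 2150 - 100000 = 119650.
\end{align*}
```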
Question 33 (****+)
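It is given that for a sample of data $x_1, x_2, x_3, x_4, x_5, \dots, x_n$ the mean $\bar{x}$ and standard
deviation $\sigma$ are
$\bar{x} = \dfrac{1}{n}\sum_{r=1}^{n} x_r = 2$ and $\sigma = \sqrt{\dfrac{1}{n}\sum_{r=1}^{n} (x_r)^2 - \dfrac{1}{n^2}\left(\sum_{r=1}^{n} x_r\right)^2} = 3$.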
$\sum_{r=1}^{n} (x_r + 1)^2$.
MMS-S , $\sum_{r=1}^{n} (x_r + 1)^2 = 18n$
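The quoted result follows from the given mean and standard deviation. The sketch below assumes the standard deviation is the square root of the mean of the squares minus the square of the mean, which is the reading consistent with the quoted answer.

```latex
\begin{align*}
\sum_{r=1}^{n} x_r &= n\bar{x} = 2n,\\
\sigma^2 = \frac{1}{n}\sum_{r=1}^{n} x_r^2 - \bar{x}^2 = 9
  \quad&\Rightarrow\quad \sum_{r=1}^{n} x_r^2 = n(9 + 4) = 13n,\\
\sum_{r=1}^{n} (x_r + 1)^2 &= \sum_{r=1}^{n} x_r^2 + 2\sum_{r=1}^{n} x_r + n
  = 13n + 4n + n = 18n.
\end{align*}
```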
Question 34 (****+)
The test marks, x , of 20 students were coded and their results were summarized as
$\sum (x - 10) = 220$ and $\sum (x - 10)^2 = 2720$.
$\sum x^2 = 9120$.
b) Calculate the mean and standard deviation of the test marks of these students.
MMS-U , $\bar{x} = 21$ , $\sigma = \sqrt{15} \approx 3.87$