Statistics Book 2
Prices
Introduction
This unit and Unit 3 examine, in various ways, the question:
Are people getting better or worse off?
Because this is a statistics module, we shall concentrate on the statistical
aspects of the question. This unit focuses on statistics about prices, and
Unit 3 moves on to consider statistics about earnings; this enables us to
look at the question of whether earnings have been increasing more rapidly
than prices.
However, it is not the case that statistics can provide all the answers – or
even the best answer – to the question of whether people are getting better
or worse off. There are many non-statistical issues which are relevant and
it is important to put the statistical approach in its correct perspective. To
take just one example: if earnings are rising rapidly but unemployment is
also rising, then no statistical analysis based on a comparison of earnings
with prices will have any relevance to the circumstances of a person who
has become unemployed.
In the question examined in these units, ‘people’ does not refer specifically
to you, Open University students, but to the whole of society in the UK.
That is quite a big batch (more than 62 million in 2010, according to an
estimate from the UK’s Office for National Statistics), consisting of men,
women and children, living alone, in large or small households, or in
institutions; some of them working, others unemployed, some retired and
others not yet old enough for paid work.
It is not possible, using statistical techniques, to provide a complete answer
to this one question covering such a big theme, particularly an answer
which is valid for all these people and their varied economic and social
circumstances; data and techniques both have to be used with common
sense. Instead, the aim of these texts is more modest: to explore small
batches of data relevant to the question (and relating to some individuals
and groups in society), using basic analytical and graphical techniques.
We start with price data and look at some different ways of measuring the
overall location of a batch of price figures for a single item. In looking for
patterns in data, the initial procedures are to round the figures, if
necessary, in an appropriate and convenient way, then to draw a stemplot
(as described in Unit 1). The next step is to find a measure representing
the location of the batch; this will be a value lying between the lowest and
highest values of the batch. You have already met one important location
measure: the median. (There will be more about this in what follows.)
Another very important measure is the arithmetic mean, which is
introduced in Subsection 1.3.
Section 2 shows how to calculate the weighted mean, which is a quantity
related to the arithmetic mean. You will also learn about some
circumstances where it makes sense to calculate a weighted mean.
Unit 2 Prices
1 Measuring location
Measuring location has two components:
• gathering data about the quantity of interest
• determining a value to represent the location of the data.
The task of gathering appropriate data is somewhat problem-specific –
general strategies are available, but exact details usually need to be decided
for each problem. To determine the price of an electric kettle, for example,
we would have to decide the size and type of kettle we’re interested in,
where and when it’s purchased, and so forth. In contrast, choosing a value
to summarise the location of a set of data is more straightforward. In this
section, we will focus on the two most common measures of location: the
median and the mean. The data gathered about the quantity of interest
does not affect the way we calculate these location measures.
That is, if you drink this particular coffee, then changes in its price in
your locality will affect your cost of living. Similarly, your costs and
economic well-being will also be affected by what happens to the
prices of all the other things you need or like to consume.
On the other hand, someone who never buys instant coffee will be
unaffected by any change in its price; they will be much more
interested in what happens to the prices of alternative products such
as ground coffee, tea, milk or fruit juice. The problem of measuring
the effect of price changes on individuals with different consumption
patterns will be considered in Section 5.
26 8 8 8 8 9
27 5 9
28
29 5 5 5 5 9
30 5
31 5
32
33
34
35
36 9
This shows at a glance that if you shop around, you might well find
this brand of coffee on sale at less than 270p. (Indeed some stores
seem to have been ‘price matching’ at the lowest price of 268p.) On
the other hand, if you are not too careful about making price
comparisons then you might pay considerably more than 300p (£3).
However, you are most likely to find a shop with the coffee priced
between about 270p and 300p. Although there is no one price for this
coffee, it seems reasonable to say that the overall location of the price
is a bit less than 300p.
The median of the batch is a useful measure of the overall location of
the values in a batch. You met the median in Subsection 4.2 of
Unit 1; it was defined as the middle value of a batch of figures when
the values are placed in order. Let us revise, and extend slightly, what
you learned about the median in Unit 1.
The stemplot in Figure 1 shows the prices arranged in order of size.
We can label each of these 15 prices with a symbol indicating where it
comes in the ordered batch. A convenient way of showing this is to
write each value as the symbol x plus a subscript number in brackets,
where the subscript number shows the position of that value within
the ordered batch. Figure 2 shows the 15 prices written out in
ascending order using this subscript notation.
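[Figure 2: the 15 coffee prices in ascending order, in subscript notation. For
example, x(3) has subscript (3), so it is the third value in the ordered batch.]
x(1) x(2) x(3) x(4) x(5) x(6) x(7) x(8) x(9) x(10) x(11) x(12) x(13) x(14) x(15)
268  268  268  268  269  275  279  295  295  295   295   299   305   315   369
Here x(1) = 268 is the lower extreme EL, x(8) = 295 is the median, and
x(15) = 369 is the upper extreme EU.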
[Figure: the 15 ordered values arranged in an upside-down V shape, pairing x(1)
with x(15), x(2) with x(14), and so on up to x(7) with x(9), with the median
x(8) in the middle.]
60 70 53 81 74
85 90 79 65 70
53 90
60 85
65 81
70 79
70 74
Figure 4 Prices of 10 digital cameras
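As a rough illustration of finding the median by position, the sketch below sorts a batch and picks out the value at position (n + 1)/2, taking the halfway point between the two middle values when the batch size is even. The function name `median` and the variable names are our own choices for this sketch, not notation from the unit; the data are the coffee prices from Figure 2 and the ten camera prices above.

```python
def median(values):
    """Median of a batch: the value at position (n + 1) / 2 in the ordered batch."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty batch is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd batch size: position (n + 1) / 2 is a single value.
        return ordered[middle]
    # Even batch size: the position ends in 1/2, so take the value
    # halfway between the two neighbouring values.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Coffee prices in pence (Figure 2) and camera prices in pounds (Figure 4).
coffee_prices = [268, 268, 268, 268, 269, 275, 279, 295,
                 295, 295, 295, 299, 305, 315, 369]
camera_prices = [60, 70, 53, 81, 74, 85, 90, 79, 65, 70]

print(median(coffee_prices))  # 295, the value x(8)
print(median(camera_prices))  # 72.0, halfway between x(5) = 70 and x(6) = 74
```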
0 9
1 0
1 2 3 3 3
1 4 5 5 5 5
1 6 6 7
1 8 8 9
2
2
2 4 5
2 7
n = 20 0 9 represents £90
Figure 5 Prices of all flat-screen televisions with a screen size of 24 inches or
less on a major UK retailer’s website on a day in February 2012
This subsection can now be finished by using some of the methods we have
met to examine a batch of data consisting of two parts, or sub-batches.
Table 3 presents the average price of gas, in pence per kilowatt hour
(kWh), in 2010, for typical consumers on credit tariffs in 14 cities in the
UK. These cities have been divided into two sub-batches: seven northern
cities and seven southern cities. (Legally, at the time of writing, Ipswich is
a town, not a city, but we shall ignore that distinction here.)
Table 3 Average gas prices in 14 cities (pence per kWh)
Northern                          Southern
Aberdeen               3.740      Birmingham     3.805
Edinburgh              3.740      Canterbury     3.796
Leeds                  3.776      Cardiff        3.743
Liverpool              3.801      Ipswich        3.760
Manchester             3.801      London         3.818
Newcastle-upon-Tyne    3.804      Plymouth       3.784
Nottingham             3.767      Southampton    3.795
Arithmetic mean
The arithmetic mean is the sum of all the values in the batch divided
by the size of the batch. More briefly,
$$\text{mean} = \frac{\text{sum}}{\text{size}}.$$
There are other kinds of mean, such as the geometric mean and the
harmonic mean, but in this module we shall be using only the arithmetic
mean; the word mean will therefore normally be used for arithmetic mean.
Note that in calculating the mean, the order in which the values are
summed is irrelevant.
For a larger batch size, you may find it helpful to set out your calculations
systematically in a table. However, in practice the raw data are usually fed
directly into a computer or calculator. In general, it is a good idea to
check your calculations by reworking them. If possible, use a different
method in the reworking; for example, you could sum the numbers in the
opposite order.
The formula ‘mean = sum/size’ can be expressed more concisely as follows.
Referring to the values in the batch by $x$, the ‘sum’ can be written as
$\sum x$. Here $\Sigma$ is the Greek (capital) letter Sigma, the Greek version
of S, and is used in statistics to denote ‘the sum of’. Also, the symbol
$\bar{x}$ is often used to denote the mean – and as you have already seen in
stemplots, $n$ can be used to denote the batch size. (Some calculators use keys
marked $\sum x$ and $\bar{x}$ to produce the sum and the mean of a batch
directly.)
Using this notation,
$$\text{mean} = \frac{\text{sum}}{\text{size}}$$
can be written as
$$\bar{x} = \frac{\sum x}{n}.$$
In this module we shall normally round the mean to one more figure than
the original data.
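To make the formula concrete, here is a minimal sketch that computes the mean of the seven northern gas prices in Table 3, reworks the sum in the opposite order as a check, and rounds to one more figure than the data. The function name `mean` and the variable names are illustrative assumptions, not part of the module materials.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the values divided by the batch size."""
    if not values:
        raise ValueError("mean of an empty batch is undefined")
    return sum(values) / len(values)


# Gas prices (pence per kWh) for the seven northern cities in Table 3.
northern = [3.740, 3.740, 3.776, 3.801, 3.801, 3.804, 3.767]

forward = mean(northern)
backward = mean(list(reversed(northern)))  # rework the sum in the opposite order

# Floating-point sums can differ in the last digit, so compare with a tolerance.
assert math.isclose(forward, backward)

# The data have three decimal places, so report the mean to four.
print(round(forward, 4))  # 3.7756
```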
In the following activities, you can investigate some other ways in which
the median is more resistant than the mean.
In Activity 2 (Subsection 1.2) you may have noticed that Cardiff and
Ipswich had rather low gas prices compared to the other southern cities.
Here you are going to examine the effect of deleting them from the batch of
southern cities. Complete the following table and comment on your results.
Suppose the value for London had been misprinted as 8.318 instead of
3.818 (quite an easy mistake to make!). How would this affect your results
for the batch of five southern cities (again omitting Cardiff and Ipswich)?
Batch Mean Median
Five cities (correct data)
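One way to check your answers to this activity is sketched below. It uses Python's standard `statistics` module to compare the mean and the median of the five southern cities with and without the hypothetical misprint for London; the dictionary names are our own.

```python
from statistics import mean, median

# Five southern cities from Table 3 (Cardiff and Ipswich omitted), pence per kWh.
correct = {"Birmingham": 3.805, "Canterbury": 3.796, "London": 3.818,
           "Plymouth": 3.784, "Southampton": 3.795}

# The same batch with the hypothetical misprint for London.
misprinted = dict(correct, London=8.318)

for label, batch in (("correct", correct), ("misprinted", misprinted)):
    prices = list(batch.values())
    print(label, round(mean(prices), 4), median(prices))
# correct 3.7996 3.796
# misprinted 4.6996 3.796
```

The misprint moves the mean by 0.9p per kWh but leaves the median unchanged, which is what is meant by calling the median more resistant.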
Suppose you wanted to use these values – the correct ones, of course – to
estimate the average price of gas over the whole country. The simple
arithmetic mean of the 14 values given in Table 3 would not allow for the
fact that much more gas is consumed in London, at a relatively high price,
than in other cities. To take account of this you would need to calculate
what is known as a weighted arithmetic mean. Weighted means are the
subject of the next section.
Exercises on Section 1
0 7
1 5
2
3 3 5
4 2 2 3
5 5 8
6 4 6 8
7 1 1 6 8 9
8 0 1 1 3 4 5 5 6 9
9 1 1 3 5 9
10 0 0
n = 33 0 7 represents a score of 7%
You have now covered the material needed for Subsection 2.1 of
the Computer Book.
2 Weighted means
For goods and services, price changes vary considerably from one to
another. Central to the theme question of this unit and the next, Are
people getting better or worse off?, there is a need to find a fair method of
calculating the average price change over a wide range of goods and
services. Clearly a 10% rise in the price of bread is of greater significance
to most people than a similar rise in the price of clothes pegs, say. What
we need to take account of, then, are the relative weightings attached to
the various price changes under consideration.
[Figure 6 Means of biscuit prices: a scale in pence marking the two means,
74.0 and 81.6.]
If we had all the individual prices, five from Alan and eight from
Beena, then they could be amalgamated into a single batch of 13
prices, and from this combined batch we could calculate the mean
price of the standard packet at all 13 shops. However, our two
investigators have unfortunately not written down, nor can they fully
remember, the prices from individual shops. Is there anything we can
do to calculate the mean of the combined batch?
Fortunately there is, as long as we are interested in arithmetic means.
(If they had recorded the medians instead, then there would have
been very little we could do.)
The mean of the combined batch of all 13 prices will be calculated as
$$\frac{\text{sum (of the combined batch prices)}}{\text{size (of the combined batch)}}.$$
We already know that the size of the combined batch is the sum of
the sizes of the two original batches; that is, 5 + 8 = 13. The problem
here is how to find the sum of the combined batch of Alan’s and
Beena’s prices. The solution is to rearrange the familiar formula
$$\text{mean} = \frac{\text{sum}}{\text{size}}$$
so that it reads
$$\text{sum} = \text{mean} \times \text{size}.$$
This will allow us to find the sums of Alan’s five prices and Beena’s
eight prices separately. Adding the results will produce the sum of the
combined batch prices. Finally, dividing by 13 completes the
calculation of finding the combined batch mean.
Let us call the sum of Alan’s prices ‘sum(A)’ and the sum of Beena’s
prices ‘sum(B)’.
For Alan: mean = 81.6 and size = 5, so sum(A) = 81.6 × 5 = 408.
For Beena: mean = 74.0 and size = 8, so sum(B) = 74.0 × 8 = 592.
For the combined batch:
$$\text{mean} = \frac{\text{combined sum}}{\text{combined size}} = \frac{408 + 592}{13} = \frac{1000}{13} \approx 76.9.$$
Here, the result has been rounded to give the same number of digits
as in the two original means.
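The calculation just carried out can be sketched in a few lines of code. The function name `combined_mean` and its parameter names are our own labels for this illustration; the function recovers each batch sum as mean × size, adds the sums, and divides by the combined size.

```python
def combined_mean(mean_a, size_a, mean_b, size_b):
    """Mean of two batches pooled together, recovered from their means and sizes."""
    sum_a = mean_a * size_a  # sum = mean x size
    sum_b = mean_b * size_b
    return (sum_a + sum_b) / (size_a + size_b)


# Alan: mean 81.6p over 5 shops; Beena: mean 74.0p over 8 shops.
result = combined_mean(81.6, 5, 74.0, 8)
print(round(result, 1))  # 76.9
```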
The process that we have used above is an important one. It will be used
several times in the rest of this unit. The box below summarises the
method, using symbols.
To see why the term weighted mean is used for such an expression, imagine
that Figure 7 shows a horizontal bar with two weights, of sizes 5 and 8,
hanging on it at the points 81.6 and 74.0, and that you need to find the
point at which the bar will balance. This point is at the weighted mean:
approximately 76.9.
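[Figure 7: a horizontal bar marked in pence, with a weight of size 8 hanging at
74.0, a weight of size 5 hanging at 81.6, and the balance point at 76.9.]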
To find the mean of the combined batch we use the formula above,
with
$$\bar{x}_A = 119,\quad n_A = 7,\quad \bar{x}_B = 185,\quad n_B = 13.$$
This gives
$$\bar{x}_C = \frac{(119 \times 7) + (185 \times 13)}{7 + 13} = \frac{833 + 2405}{20} = \frac{3238}{20} = 161.9 \approx 162.$$
Note that this is the weighted mean of 119 and 185 with weights 7
and 13 respectively. It lies between 119 and 185 but it is nearer to 185
because this has the greater weight: 13 compared with 7.
Weighted means have many uses, two of which you have already met. The
type of weights depends on the particular use. In our uses, the weights
were the following.
• The sizes of the batches, when we were calculating the combined batch
mean from two batch means.
• The quantities bought, when we were calculating the mean price of a
commodity bought on two separate occasions.
Another very important use is in the construction of an index, such as the
Retail Prices Index; we shall therefore be making much use of weighted
means in the final sections of this unit.
In the next example, we do not have all the information required to
calculate the mean, but we can still get a reasonable answer by using
weights.
Using the rules for weighted means, would you expect the weighted mean
price to be nearer the London price or the Edinburgh price? To check,
calculate the weighted mean price.
We have seen in Example 11 and Activity 7 that only the ratio of the
weights affects the answer, not the individual weights. So weights are often
chosen to add up to a convenient number like 100 or 1000. This is Rule 1
for weighted means (see Subsection 2.1).
Activity 7 should also have reminded you of another important property of
a weighted mean of two numbers: the weighted mean lies nearer to the
number having the larger weight. This is part of Rule 2 for weighted
means.
This is the formula which is used to find the weighted mean of any set of
numbers, each with a corresponding weight.
Sum 20 10 139.4
You will meet many examples of weighted means of larger sets of numbers
in Subsection 5.2, but we shall end this section with one more example.
or, in symbols,
$$\frac{\sum xw}{\sum w}.$$
As $\sum xw = 6973.436$ and $\sum w = 1834$, the weighted mean is
$$\frac{6973.436}{1834} = 3.802\,310 \approx 3.802.$$
So the weighted mean of these gas prices, using approximate
population figures as weights, is 3.802p per kWh.
Note that this weighted mean is larger than all but three of the gas
prices for individual cities. That is because the cities with the two
highest populations, London and Birmingham, also have the highest
gas prices, and the weighted mean gas price is pulled towards these
high prices.
Although the details of the calculation above are written out in full in
Table 5, in practice, using even a simple calculator, this is not necessary. It
is usually possible to keep a running sum of both the weights and the
products as the data are being entered. One way of doing this is to
accumulate the sum of the weights into the calculator’s memory while the
sum of the products is accumulated on the display. If you are using a
specialist statistics calculator, the task is generally very straightforward.
Simply enter each price and its corresponding weight using the method
described in your calculator instructions for finding a weighted mean.
Use your calculator to check that the sum of weights and sum of products
of the data in Table 5 are, respectively, 1834 and 6973.436, and that the
weighted mean is 3.802. (No solution is given to this activity.)
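If you would rather check these sums in software than on a calculator, the following sketch may help. It assumes that the weights are the population figures (in 10 000s) listed for the same 14 cities in Table 6, which do add up to 1834; the prices are the gas prices from Table 3. The function name `weighted_mean` is our own choice.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of (value x weight) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# (city, gas price in pence per kWh from Table 3, weight)
# Assumption: weights are the populations in 10 000s given in Table 6.
gas = [
    ("Aberdeen", 3.740, 19), ("Edinburgh", 3.740, 42), ("Leeds", 3.776, 150),
    ("Liverpool", 3.801, 82), ("Manchester", 3.801, 224),
    ("Newcastle-upon-Tyne", 3.804, 88), ("Nottingham", 3.767, 67),
    ("Birmingham", 3.805, 228), ("Canterbury", 3.796, 5), ("Cardiff", 3.743, 33),
    ("Ipswich", 3.760, 14), ("London", 3.818, 828), ("Plymouth", 3.784, 24),
    ("Southampton", 3.795, 30),
]
prices = [price for _, price, _ in gas]
weights = [weight for _, _, weight in gas]

sum_of_products = sum(x * w for x, w in zip(prices, weights))
print(sum(weights))                               # 1834
print(round(sum_of_products, 3))                  # 6973.436
print(round(weighted_mean(prices, weights), 3))   # 3.802
```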
Table 6 is similar to Table 5, but this time it presents the average price of
electricity, in pence per kilowatt hour (kWh). These data are again for the
year 2010 for typical consumers on credit tariffs in the same 14 cities we
have been considering for gas prices, with the addition of Belfast. Again,
the weights are the approximate populations of the relevant urban areas,
in 10 000s.
City                   Price x (p per kWh)   Weight w   Price × weight xw
Aberdeen               13.76                  19
Belfast                15.03                  58
Edinburgh              13.86                  42
Leeds                  12.70                 150
Liverpool              13.89                  82
Manchester             12.65                 224
Newcastle-upon-Tyne    12.97                  88
Nottingham             12.64                  67
Birmingham             12.89                 228
Canterbury             12.92                   5
Cardiff                13.83                  33
Ipswich                12.84                  14
London                 13.17                 828
Plymouth               13.61                  24
Southampton            13.41                  30
Sum
Use these data to calculate the weighted mean electricity price. (Your
calculator will almost certainly allow you to do this without writing out all
the values in the xw column.)
Exercises on Section 2
3 Measuring spread
As you have already seen, it is difficult to measure price changes when they
so often vary from shop to shop and region to region. Taking some average
value, such as the median or the mean, helps to simplify the problem.
However, it would be a mistake to ignore the notion of spread, as averages
on their own can be misleading.
Information about spread can be very important in statistical analysis,
where you are often interested in comparing two or more batches. In this
section we shall look first at measures of spread, and then at some
methods of summarising the shape of a batch of data.
[Cartoon caption: ‘I like to sleep each night with my feet in the oven and my
head in the freezer. That way I’m comfortable on average.’]
But how can spread be measured? Just as there are several ways of
measuring location (mean, median, etc.), there are also several ways of
measuring spread. Here, we shall examine two such measures: the range
and the interquartile range.
In the next unit you will learn about a further measure of spread called the
standard deviation.
The range
The range is the distance between the lower and the upper extremes.
It can be calculated from the formula:
range = EU − EL,
where EU is the upper extreme and EL is the lower extreme.
Given an ordered batch of data, for example in a stemplot, the range can
easily be calculated. However, the range tells us very little about how the
values in the main body of the data are spread. It is also very sensitive to
changes in the extreme values, like those considered in Subsection 1.4. It
would be better to have a measure of spread that conveys more
information about the spread of values in the main body of the data. One
such measure is based upon the difference between two particular values in
the batch, known as the quartiles. As the name suggests, the two
quartiles lie one quarter of the way into the batch from either end. The
major part of the next subsection describes how to find them.
[Figure 9: ∧∧-shaped diagram for a batch of 15 values. The extremes x(1) and
x(15) and the median x(8) are at the low points, and x(4) and x(12) are at the
two peaks.]
The rule adopted here is the one used by Minitab. If your calculator can
find quartiles, note that it may use a different rule, and you may also have
used different rules in other Open University modules.
As you might have expected, the rule involves dividing (n + 1) by 4,
where n is the batch size (as opposed to dividing by 2 to find the median).
However, the rule is slightly more complicated for the quartiles and it
depends on whether (n + 1) is exactly divisible by 4.
The quartiles
The lower quartile Q1 is at position $\frac{n + 1}{4}$ in the ordered batch.
The upper quartile Q3 is at position $\frac{3(n + 1)}{4}$ in the ordered batch.
If (n + 1) is exactly divisible by 4, these positions correspond to a
single value in the batch.
If (n + 1) is not exactly divisible by 4, then the positions are to be
interpreted as follows.
• A position which is a whole number followed by $\frac{1}{2}$ means ‘halfway
between the two positions either side’ (as was the case for finding
the median).
• A position which is a whole number followed by $\frac{1}{4}$ means ‘one
quarter of the way from the position below to the position above’.
So for instance if a position is $5\frac{1}{4}$, the quartile is the number
one quarter of the way from x(5) to x(6).
• A position which is a whole number followed by $\frac{3}{4}$ means ‘three
quarters of the way from the position below to the position above’.
So for instance if a position is $4\frac{3}{4}$, the quartile is the number
three quarters of the way from x(4) to x(5).
Before we actually use these rules to find quartiles, let us look at some
more examples of ∧∧-shaped diagrams for different batch sizes n. The case
where (n + 1) is exactly divisible by 4, so that $\frac{1}{4}(n + 1)$ is a
whole number, was shown in Figure 9. The following three figures show the
three other possible scenarios, where (n + 1) is not exactly divisible by 4.
[Three ∧∧-shaped diagrams, each marking the median, for batch sizes where
(n + 1) is not exactly divisible by 4.]
0 9
1 0
1 2 3 3 3
1 4 5 5 5 5
1 6 6 7
1 8 8 9
2
2
2 4 5
2 7
n = 20 0 9 represents £90
Figure 13 Prices of flat-screen televisions with a screen size of 24 inches
or less
To calculate the lower quartile Q1 you need to find the number that is
one quarter of the way from x(5) to x(6) . These values are both 130, so
Q1 is 130. To calculate the upper quartile Q3 you need to find the
number three quarters of the way from x(15) to x(16) . These values are
both 180, so Q3 is 180.
That example was easier than it might have been, because for each
quartile the two numbers we had to consider turned out to be equal!
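The quartile rule can also be written as a short function. The sketch below is one possible reading of the rule in the box, not a reproduction of any particular package's code; the function name `quartile`, its parameter `k` and the restriction to batches of at least three values are our own assumptions. It is checked on the television prices from Figure 13 and, for a case that needs genuine interpolation, on the ten camera prices from Figure 4.

```python
import math


def quartile(values, k):
    """k-th quartile (k = 1 or 3), at position k(n + 1)/4 in the ordered batch."""
    if k not in (1, 3):
        raise ValueError("k must be 1 (lower) or 3 (upper)")
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch needs a batch of at least 3 values")
    position = k * (n + 1) / 4        # may end in .25, .5 or .75
    below = math.floor(position)      # whole-number part of the position
    fraction = position - below       # 0, 1/4, 1/2 or 3/4
    lower_value = ordered[below - 1]  # x(below); list indices start at 0
    if fraction == 0:
        return lower_value
    upper_value = ordered[below]      # x(below + 1)
    # Move the stated fraction of the way from x(below) to x(below + 1).
    return lower_value + fraction * (upper_value - lower_value)


tv_prices = [90, 100, 120, 130, 130, 130, 140, 150, 150, 150,
             150, 160, 160, 170, 180, 180, 190, 240, 250, 270]
camera_prices = [60, 70, 53, 81, 74, 85, 90, 79, 65, 70]

print(quartile(tv_prices, 1), quartile(tv_prices, 3))          # 130.0 180.0
print(quartile(camera_prices, 1), quartile(camera_prices, 3))  # 63.75 82.0
```

For the cameras, n = 10, so the positions are 2¾ and 8¼: the lower quartile is three quarters of the way from x(2) = 60 to x(3) = 65, and the upper quartile is one quarter of the way from x(8) = 81 to x(9) = 85.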
(a) Find the lower and upper quartiles of the batch of 15 coffee prices in
Figure 14. (This batch of coffee prices was first introduced in Table 1
of Subsection 1.1.)
26 8 8 8 8 9
27 5 9
28
29 5 5 5 5 9
30 5
31 5
32
33
34
35
36 9
(b) Find the lower and upper quartiles of the batch of 14 gas prices in
Figure 15. (This batch of gas prices was first introduced in Table 3 of
Subsection 1.2.)
374 0 0 3
375
376 0 7
377 6
378 4
379 5 6
380 1 1 4 5
381 8
A measure of spread
Now we can define a new measure of spread based entirely on the lower
and upper quartiles.
Calculate both the range and the interquartile range of the batch of
15 coffee prices, last seen in Figure 14.
In Activity 10(b) you found the quartiles of the 14 gas prices from
Activity 2 (Subsection 1.2). Find the interquartile range.
You may be wondering why you are being asked to learn a new measure of
spread when you already know the range. As a measure of spread, the
range (EU − EL) is not very satisfactory because it is not resistant to the
effects of unrepresentative extreme values. The interquartile range, by
contrast, is a highly resistant measure of spread (because it is not sensitive
to the effects of values lying outside the middle 50% of the batch) and it is
generally the preferred choice.
[Figure 16 Values in a five-figure summary: the extremes EL and EU, the
quartiles Q1 and Q3, and the median M.]
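Five-figure summary
The five values are set out, together with the batch size, in the layout

            M
    n    Q1   Q3
         EL   EU

where
    n   batch size
    M   median
    Q1  lower quartile
    Q3  upper quartile
    EL  lower extreme
    EU  upper extreme

For the batch of 20 television prices, this gives

              150
    n = 20  130  180
             90  270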
We hope you agree that the five-figure summary is quite an efficient way of
presenting a summary of a batch of data.
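A five-figure summary is also easy to produce in software. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later). As far as we understand it, its default `"exclusive"` method places the cut points at positions k(n + 1)/4 with linear interpolation, which agrees with the quartile rule used in this unit, but you should treat that agreement as an assumption to check rather than a guarantee. The function name `five_figure_summary` and the dictionary keys are our own.

```python
from statistics import quantiles


def five_figure_summary(values):
    """Return n, EL, Q1, M, Q3 and EU for a batch of numbers."""
    ordered = sorted(values)
    # n=4 asks for the three cut points that split the batch into quarters.
    q1, m, q3 = quantiles(ordered, n=4, method="exclusive")
    return {"n": len(ordered), "EL": ordered[0], "Q1": q1,
            "M": m, "Q3": q3, "EU": ordered[-1]}


tv_prices = [90, 100, 120, 130, 130, 130, 140, 150, 150, 150,
             150, 160, 160, 170, 180, 180, 190, 240, 250, 270]

summary = five_figure_summary(tv_prices)
print(summary)
print("range =", summary["EU"] - summary["EL"])  # range = 180
print("IQR =", summary["Q3"] - summary["Q1"])    # IQR = 50.0
```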
The central feature of this diagram is a box – hence the name box plot. The
box extends from the lower quartile (at the left-hand edge of the box) to
the upper quartile (the right-hand edge). This part of the diagram contains
50% of the values in the batch. The length of this box is thus the
interquartile range.
Outside the box are two whiskers. (Boxplots are sometimes called
box-and-whisker diagrams.) In many cases, such as in Figure 17, the
whiskers extend all the way out to the extremes. Each whisker then covers
the end 25% of the batch and the distance between the two whisker-ends is
then the range. (You will see examples later where the whiskers do not go
right out to the extremes.)
So far we have dealt with four figures from the five-figure summary: the
two quartiles and the two extremes. The remaining figure is perhaps the
most important: it is the median, whose position is shown by putting a
vertical line through the box.
Thus a boxplot shows clearly the division of the data into four parts: the
two whiskers and the two sections of the box; these are the four parts of
the ∧∧-shaped diagram and each contains (approximately) 25% of values
in the batch (see Figure 18).
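[Figure 18: a boxplot with the lower extreme EL, lower quartile Q1, median M,
upper quartile Q3 and upper extreme EU labelled.]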
You can see that each whisker is longer than half the length of the
box.
However, this boxplot has a new feature. The whisker on the left goes
right down to the lower extreme. But the whisker on the right does
not go right to the upper extreme. The highest extreme data value,
270, which might potentially be regarded as an outlier, is marked
separately with a star. Then the whisker extends only to cover the
data values that are not extreme enough to be regarded as potential
outliers. The highest of these values is 250.
In Unit 3, you will learn in detail how to draw a boxplot. This
includes a rule to decide which data values (if any) can be regarded as
potential outliers that are plotted separately on the diagram.
A stemplot of the gas price data from Activity 2 (Subsection 1.2) is shown,
yet again, in Figure 20.
374 0 0 3
375
376 0 7
377 6
378 4
379 5 6
380 1 1 4 5
381 8
5 3
5
6 0
6 5
7 0 0 4
7 9
8 1
8 5
9 0
n = 10 5 3 represents £53
Figure 22 Stemplot of ten camera prices
[Figure 23 Boxplot of batch of ten camera prices: horizontal axis in £, from 50
to 90.]
You have now spent quite a lot of time looking at various ways of
investigating prices and, in particular, at methods of measuring the
location and spread of the prices of particular commodities.
In order to begin to answer our question, Are people getting better or worse
off?, we need to know not just location (and spread) of prices but also how
these prices are changing from year to year. That is the subject of the rest
of this unit.
Exercises on Section 3
0 7
1 5
2
3 3 5
4 2 2 3
5 5 8
6 4 6 8
7 1 1 6 8 9
8 0 1 1 3 4 5 5 6 9
9 1 1 3 5 9
10 0 0
n = 33 0 7 represents a score of 7%
(b) For the television prices in Exercise 1, find the quartiles and calculate
the interquartile range. The table of prices is given below.
[Figure 24 Boxplot of batch of 33 arithmetic scores: horizontal axis in %, from
0 to 100.]
4 A simple chained price index
2007 is the
base year
Work out the increase in Gradgrind’s gas price between 2007 and 2008 as a
percentage of the 2007 price.
So we could say that, for this company at least, gas has gone up by 20.8%.
In other words, for every £1 they spent on gas in 2007, they would have
spent £1.208 in 2008 if they had bought the same amount of gas in each
year. Or putting it another way, for every 100 units of money (pence,
pounds, whatever) they spent in 2007, they would have spent 120.8 units
of money in 2008 if they had bought the same amount. So a way of
representing this price change would have been to define an index for the
gas price such that it takes the value 100 for 2007, and 120.8 for 2008.
Notice that the value of the gas price index for 2008 could be calculated as
$$(\text{value of the index in 2007, which is taken as 100}) \times \frac{\text{gas price in 2008}}{\text{gas price in 2007}}.$$
That is, the value of the index in one year is the value of the index in the
previous year multiplied by a price ratio, in this case the gas price ratio for
2008 relative to 2007. This ratio, as a number, is 1.208.
But Gradgrind did not only use gas, they used electricity as well, and the
aim here is to find a representation of their overall fuel price change, not
just the change in gas prices.
An electricity price ratio for 2008 relative to 2007 can be worked out, like
the gas price ratio. It is $\frac{87}{76} \approx 1.145$.
Use the electricity price ratio above to find the increase in Gradgrind’s
electricity price between 2007 and 2008 as a percentage of the 2007 price.
What would the 2008 value be for a price index of Gradgrind’s electricity
price alone, calculated in the same way as the gas price index (with 2007
as the base year)?
But this has got us no further in finding a price index that simultaneously
covers both fuels.
One possibility might be to look at how Gradgrind’s total expenditure on
these two fuels changed from 2007 to 2008. The expenditures are given in
Table 8.
2007 2008
Gas 9 298 8 145
Electricity 3 205 2 991
Total 12 503 11 136
This seems not to have helped. The total expenditure went down, but you
have already seen that the prices of both gas and electricity went up.
Use the data in Tables 7 and 8 to find the quantity of each fuel that
Gradgrind used in 2007 and 2008 (in MWh). Hence explain why the
energy expenditure fell.
This is indeed how a chained index of this kind is calculated – but the
calculations are rather messy. You might be wondering whether it would
be simpler to calculate the overall energy price ratio as a weighted mean of
the two price ratios for the two fuels, in much the same way that weighted
means were used to combine prices in Section 2. If you did think this, you
would be right – and furthermore, the resulting overall energy price ratio is
exactly the same as has just been found, if we make the right choice of
weights. The overall energy price ratio for 2008 relative to 2007 is just a
weighted mean of the two price ratios for gas and electricity, with the 2007
expenditures as weights.
Just to show it really does come to the same thing, let us see how it works
with the numbers, using the formula for weighted means in Subsection 2.3.
Price ratio (2008 relative to 2007) Weight (2007 expenditure)
Gas 1.208 9298
Electricity 1.145 3205
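With the figures in the table above, the weighted mean can be worked through as in the sketch below. The helper name `weighted_mean` and the variable names are our own; the weights are the 2007 expenditures, and the 2008 index value is obtained by multiplying the base-year value of 100 by the overall ratio.

```python
def weighted_mean(values, weights):
    """Sum of (value x weight) divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


price_ratios = [1.208, 1.145]    # gas, electricity: 2008 relative to 2007
expenditure_2007 = [9298, 3205]  # pounds, used as the weights

overall_ratio = weighted_mean(price_ratios, expenditure_2007)
index_2008 = 100 * overall_ratio  # the index is 100 in the base year, 2007

print(round(overall_ratio, 3))  # 1.192
print(round(index_2008, 1))     # 119.2
```

The overall ratio lies between the two individual ratios and is closer to the gas ratio, because gas has the larger weight.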
Table 9 Gradgrind’s energy prices and expenditures for 2008 and 2009
2008 2009
Gas price (£/MWh) 29 30
Gas expenditure (£) 8 145 23 733
Electricity price (£/MWh) 87 98
Electricity expenditure (£) 2 991 2 275
(a) Using the data in Table 9, calculate the price ratios for gas and for
electricity, in each case for 2009 relative to 2008.
(b) With the 2008 expenditures as weights, use your answers to part (a)
to calculate the overall energy price ratio for 2009 relative to 2008.
(c) Now see what happens if you use the 2009 expenditures as weights to
calculate the overall energy price ratio for 2009 relative to 2008. How
do the results of the calculation differ from what you got in part (b)?
The reason that the price ratios you calculated in parts (b) and (c) in
Activity 17 were so different is that Gradgrind’s ‘energy mix’ changed a lot
over the year. Compared with 2008, in 2009 they spent a great deal more
on gas but less on electricity. The weighted mean of the gas and electricity
price ratios is, in both cases, nearer the price ratio for gas than that for
electricity – this is Rule 2 for weighted means – but it is even nearer the
gas weighted mean when the 2009 expenditures are used. This is because
the weight for gas is proportionally much greater than it is when the 2008
expenditures are used as weights.
This all shows that it does make a difference which expenditures are used
as weights. In practice, it is much more common to use the expenditures
from the earlier year – 2008 in this case – as weights. In some
circumstances, though, there are good reasons for using the later year, or
indeed some more complicated set of weights that depend on both
expenditures. However, in this unit we shall use the expenditures from the
earlier year to provide the weights, partly because that matches more
closely what is done in calculating the official UK price indices.
Another possibility for weights would have been to continue to use the
2007 expenditures. These were used to find the overall energy price ratio
for 2008 relative to 2007 and could be used for later years as well. Again,
in some circumstances this would make sense, but here the pattern of
Gradgrind’s fuel expenditure has changed a lot over time, and weights
should change in consequence. To continue to use the 2007 expenditures
for all later years would mean that this change in the relative importance
to Gradgrind of the two fuels would never be taken into account. Instead,
to obtain the overall energy price ratio from one year to the next, we use
the fuel expenditures in the earlier year as weights, so each year the
weights change.
That determines the choice of weights in forming an overall price ratio.
Now, how is that used to find the energy price index? Here we simply
continue the ‘chaining’ that started when finding the 2008 index: the 2009
index is found by multiplying the value of the index for the previous year,
2008, by the overall energy price ratio for 2009 relative to 2008. The value
of the index for 2008 was calculated earlier as 119.2, and (using the
weights from the previous year) the overall energy price ratio for 2009
relative to 2008 was found in Activity 17(b) as 1.059. So the value of
Gradgrind’s energy price index for 2009 is
119.2 × 1.059 ≈ 126.2.
(So, in a particular kind of average way, Gradgrind’s energy prices for 2009
have risen by 26.2% since the base year, 2007.)
Unit 2 Prices
In general, the value of the index for a particular year is found by multiplying the
value of the index for the previous year by the overall energy price ratio for
that year relative to the previous year. This is illustrated in Figure 27.
In the process of chaining, the overall price ratio is calculated anew each
year, looking back only at the previous year. The ratio is used to ‘chain’ to
earlier years and hence determine the value of the index. This method of
calculating a chained price index is summarised below. Although there
were only two commodities (gas and electricity) in Gradgrind’s index, this
summary is not restricted to two commodities.
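The chaining step can be sketched in Python as follows. The function names are our own, and the code works for any number of commodities. To mirror the hand calculation in the text, the overall price ratio is rounded to three decimal places before it is used; that rounding is a choice made for this sketch, not a requirement of the method.

```python
def overall_price_ratio(old_prices, new_prices, old_expenditures):
    """Weighted mean of the price ratios, weighted by earlier-year spending."""
    ratios = [new / old for old, new in zip(old_prices, new_prices)]
    weighted_sum = sum(r * w for r, w in zip(ratios, old_expenditures))
    return weighted_sum / sum(old_expenditures)

def next_index_value(previous_index, ratio):
    """Chain one year forward: previous index times overall price ratio."""
    return previous_index * ratio

index_2008 = 119.2  # value of the index for 2008, from the text

# 2008 and 2009 prices (gas, electricity) with 2008 expenditures as weights.
ratio_2009 = round(overall_price_ratio([29, 87], [30, 98], [8145, 2991]), 3)
index_2009 = next_index_value(index_2008, ratio_2009)

print(ratio_2009)            # 1.059
print(round(index_2009, 1))  # 126.2
```

Repeating the same two calls with each new year's prices, and the previous year's expenditures as weights, extends the chain one year at a time.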
Use the data in Table 10, and other necessary numbers from previous
calculations, to calculate the value of Gradgrind’s energy price index
for 2010.
Table 10 Gradgrind’s energy prices and expenditures for 2009 and 2010
                                2009     2010
Gas price (£/MWh)                 30       28
Gas expenditure (£)           23 733   23 969
Electricity price (£/MWh)         98       88
Electricity expenditure (£)    2 275    2 920
The Retail Prices Index (RPI), published by the UK Office for National
Statistics, is calculated once a month rather than once a year, but the
method used is basically that outlined above, though with far more than
two commodities. The process of finding the weights in the Retail Prices
Index is also more complicated, because it involves taking into account the
expenditures of millions of people as measured in a major survey. However,
the principles are the same as for Gradgrind. The calculation each January
follows exactly this method. In the other 11 months of the year, the
calculation is very similar but uses only the increases in prices since the
previous January. In the next section, you will learn more about how all
this works.
Exercise on Section 4
                                2010     2011
Gas price (£/MWh)                 28       30
Gas expenditure (£)           23 969   24 282
Electricity price (£/MWh)         88       86
Electricity expenditure (£)    2 920    3 117
5 The UK government price indices
For the RPI, the price ratio for the basket each month is calculated
relative to the previous January. Then the value of the index is obtained
by multiplying the value of the index for the previous January by this price
ratio. For example,
RPI for Nov. 2011 = RPI for Jan. 2011
× (price ratio for Nov. 2011 relative to Jan. 2011).
The CPI works in much the same way, except that price ratios are
calculated relative to the previous December. So, for example,
CPI for Nov. 2011 = CPI for Dec. 2010
× (price ratio for Nov. 2011 relative to Dec. 2010).
Since these price indices are calculated from price ratios, they measure
price changes in terms of the ratio of the overall level of prices in a given
month to the overall level of prices at an earlier date. In practice, data on
most prices are collected on a particular day near the middle of the month;
the values of the RPI and CPI calculated using these data are referred to
simply as the values of the RPI and CPI for the month. For example, the
RPI took the value 239.9 in February 2012. This value measures the ratio
of the overall level of prices in February 2012 to the overall level of prices
on a date at which the index was fixed at its starting value of 100. This
date, called a base date, is 13 January 1987 (at the time of writing). Thus
the general level of prices in February 2012, as measured by the RPI, was
239.9/100 = 2.399 times the general level of prices in January 1987.
The base date has no significance other than to act as a reference point.
(The CPI base date is 2005 and this refers to the average level of prices
throughout 2005, not to a specific date in 2005.)
The RPI and CPI are each based on a very large ‘basket’ of goods and
services. (The two baskets are similar, but not exactly the same.) Each
contains around 700 items including most of the usual things people buy:
food, clothes, fuel, household goods, housing, transport, services, and so
on.
Each basket is an ‘average’ basket for a broad range of households. The
items in the baskets are often grouped into broader categories. For the
RPI, the five fundamental groups are:
• Food and catering.
• Alcohol and tobacco.
• Housing and household expenditure.
• Personal expenditure.
• Travel and leisure.
These groups are divided into 14 more detailed subgroups (which are
further divided into sections), as shown in Figure 28.
[Figure 28: chart with an inner circle and an outer ring.
Inner circle (five groups): Food and catering; Alcohol and tobacco;
Housing and household expenditure; Personal expenditure; Travel and leisure.
Outer ring (14 subgroups): Food; Catering; Alcoholic drink; Tobacco;
Housing; Fuel and light; Household goods; Household services;
Clothing and footwear; Personal goods and services; Motoring expenditure;
Fares and other travel costs; Leisure goods; Leisure services.]
Figure 28 Structure of the RPI in 2012 (based on data from the Office for
National Statistics)
The inner circle shows the five groups, and the outer ring shows the
14 subgroups. Notice that in the inner circle the sector labelled ‘Food and
catering’ has been drawn almost twice as large (as measured by area) as
that labelled ‘Alcohol and tobacco’. This reflects the fact that the typical
household spends nearly twice as much on food and catering as on alcohol
and tobacco. The weight of an item or group reflects how much money is
spent on it. So the weight of the ‘Food and catering’ group is almost twice
that of ‘Alcohol and tobacco’.
The outer ring represents the same total expenditure as the inner circle,
but in more detail. For example, in the outer ring the area labelled ‘Food’
(which mostly consists of food bought for use in the home) is more than
twice as large as that labelled ‘Catering’ (which includes meals in
restaurants and canteens, and take-away meals and snacks), reflecting the
fact that the typical household spends more than twice as much on food as
on catering; the weight of the subgroup ‘Food’ is more than double the
weight of the subgroup ‘Catering’.
(a) Using Figure 28, estimate roughly what fraction of the expenditure of
a typical household is on each of the following groups and subgroups:
• Personal expenditure
• Housing and household expenditure
• Housing
(b) Suppose that a household spends a total of £540 per week on goods
and services that are covered by the RPI. Use your answers to
part (a) to estimate very approximately how much is spent each week
on each of the groups and subgroups in part (a).
To ensure that the basket of goods for the index reflects the proportion of
average spending devoted to different types of goods and services, it is
necessary to find out how people actually spend their money. The Living
Costs and Food Survey (LCF) records the spending reported by a sample
of 5000 households spread throughout the UK. Data from the LCF are
used to calculate the weights of most of the items included in the
RPI basket. Since 1962, the weights have been revised each year, so that
the index is always based on a basket of goods and services that is as
up to date as possible. Because of this regular weight revision, the index is
chained (as was the Gradgrind Ltd index).
(Most of the weights for the CPI come from a different source, the UK
National Accounts, though in turn this source is partly based on data from
the LCF. Again, the weights are revised each year.)
The weight of a group or subgroup directly depends on the average
expenditure of households on that item. In Subsection 2.1, you saw that it
is only the relative size of the weights that affects the value of the weighted
mean – this is Rule 1 for weighted means. So instead of using the average
expenditure on an item as its weight, the expenditure figures for the items
can all be multiplied by the same factor to produce a new, more
convenient, set of weights. For the RPI, this factor is chosen so that the
sum of the weights is 1000.
Table 12 shows the 2012 weights used in the RPI for the groups and
subgroups. Notice that each group weight is obtained by summing the
weights for its subgroups.
Table 12 2012 RPI weights
The total expenditure was £1764. So the group weights were calculated by
multiplying all the group total expenditures by a constant factor of
1000/1764, to ensure the weights sum to 1000. The weight for ‘Food and
catering’, for example, is
470 × 1000/1764 ≈ 266.
Another way to calculate this is to multiply the proportion of monthly
expenditure spent on food and catering by 1000. The proportion is
470/1764 ≈ 0.266.
Since the total weight is 1000, the weight for ‘Food and catering’ is
0.266 × 1000 = 266.
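The scaling of expenditures to weights that sum to 1000 can be sketched as below. Only the ‘Food and catering’ figure (£470) and the total (£1764) are taken from the text; the entry ‘All other groups’ is simply the remainder, introduced here for illustration, and the function name scale_weights is our own.

```python
def scale_weights(expenditures, target_total=1000):
    """Rescale expenditures so that the resulting weights sum to target_total."""
    total = sum(expenditures.values())
    return {name: amount * target_total / total
            for name, amount in expenditures.items()}

# Monthly expenditure in pounds. The second entry is the remainder of the
# 1764 total and stands in for the other four groups.
monthly_spend = {
    "Food and catering": 470,
    "All other groups": 1764 - 470,
}

weights = scale_weights(monthly_spend)
print(round(weights["Food and catering"]))  # 266
print(round(sum(weights.values())))         # 1000
```

Because only the relative sizes of the weights matter (Rule 1 for weighted means), the rescaled weights give the same weighted mean as the raw expenditures.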
Notice that the group weights for this particular household differ quite
considerably from those used in the RPI in 2012 (see Table 12). For
instance, a much greater proportion of expenditure is on ‘Food and
catering’ and a much smaller proportion is spent on ‘Alcohol and tobacco’.
Make rough estimates of your own household’s expenditure last year and
complete the final columns of the checklist above. For some categories, you
may find it easier just to make a rough estimate of, say, your annual
expenditure and then divide by 12. If you have no idea at all for a
category, then use the corresponding figure in the checklist as a starting
point for your own expenditure and adjust it up or down depending on
how you think you spend your money. One way of checking that your
figures are sensible is to consider how the sum of the expenditures relates
to your household’s monthly income. Do not spend more than 15 minutes
on estimating your expenditure; accurate figures are not needed.
Divide each group expenditure by your monthly expenditure total and
then multiply by 1000 to calculate your household’s group weights.
How do your household’s weights compare with those used in the RPI
in 2012?
One aim of the RPI is to make it possible to compare prices in any two
months, and this involves calculating a value of the price index itself for
every month.
In the 1950s, the mangle, crisps and dance hall admissions were
added to the basket, with soap flakes among the items taken out.
Two decades later, the cassette recorder and dried mashed
potato made it in, with prunes being excluded.
Then after the turn of the century, mobile phone handsets and
fruit smoothies were included. The old fashioned staples of an
evening at home – gin and slippers – were removed from the
basket.
So now, in 2012, it is the turn of tablet computers to be added
to mark the growing popularity of this type of technology.
That received the most coverage when it was added to the
basket of goods, with the ONS highlighting this digital-age
addition in its media releases.
But those seafaring captains who once used the then unusual
fruit as a symbol to show they were home and hosting might be
astonished to find that centuries on, the pineapple has also been
added to the inflation basket.
Technically, the pineapple has been added to give more varied
coverage in the basket of fruit and vegetables, the prices of
which can be volatile.
(Source: BBC News website, 14 March 2012)
The next steps in the process combine these price ratios, using weighted
means, to obtain 14 subgroup price ratios, and then the group price ratios
for the five groups. Finally, the group price ratios are combined to give the
all-item price ratio. This is the price ratio, relative to the previous
January, for the ‘basket’ of goods and services as a whole that make up the
RPI.
The all-item price ratio tells us how, on average, the RPI ‘basket’
compares in price with the previous January. The value of the RPI for a
given month is found by the method described in Section 4, that is, by
multiplying the value of the RPI for the previous January by the all-item
price ratio for that month (relative to the previous January):
RPI for month x = (RPI for previous January)
× (all-item price ratio for month x)
Thus, to calculate the RPI for November 2011, the final step is to multiply
the value of the RPI in January 2011 by the all-item price ratio for
November 2011.
You may have noticed that the weights here do not exactly match
those in Table 12. That is because the weights here are the 2011
weights, and those in Table 12 are the 2012 weights, and as has been
explained, the weights are revised each year.
The all-item price ratio is a weighted average of the group price ratios
given in the table. If the price ratios are denoted by the letter r, and
the weights by w, then the weighted mean of the price ratios is the
sum of the five values of rw divided by the sum of the five values of w.
The same 2011 weights were used to calculate the RPI for every month
from February 2011 to January 2012 inclusive. For each of these months,
the price ratios were calculated relative to January 2011, and the RPI was
finally calculated by multiplying the RPI for January 2011 by the all-item
price ratio for the month in question. In February 2012, however, the
process began again (as it does every February). A new set of weights, the
2012 weights, came into use. Price ratios were calculated relative to
January 2012, and the RPI was found by multiplying the RPI value
for January 2012 by the all-item price ratio. This procedure was used until
January 2013, and so on.
The process of calculating the RPI can be summarised as follows.
5. The value of the RPI for that month is found by multiplying the
value of the RPI for the previous January by the all-item price
ratio:
RPI for month x = (RPI for previous January)
× (all-item price ratio for month x).
The weights for a particular year are used in calculating the RPI for
every month from February of that year to January of the following
year.
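The final two steps of this calculation, combining the group price ratios and then chaining to the previous January, can be sketched in Python. The group price ratios and weights below are HYPOTHETICAL round numbers chosen only to illustrate the arithmetic; they are not the published figures. The January 2011 RPI value of 229.0 is the one quoted in this section, and the function names are our own.

```python
def all_item_price_ratio(ratios, weights):
    """Weighted mean of the group price ratios: sum of r*w over sum of w."""
    return sum(r * w for r, w in zip(ratios, weights)) / sum(weights)

def rpi_for_month(rpi_previous_january, ratios, weights):
    """RPI for a month = RPI for previous January x all-item price ratio."""
    return rpi_previous_january * all_item_price_ratio(ratios, weights)

# HYPOTHETICAL values for the five groups (illustration only).
group_ratios = [1.05, 1.08, 1.01, 1.03, 1.02]   # relative to previous January
group_weights = [200, 100, 400, 100, 200]       # chosen to sum to 1000

rpi_january = 229.0  # RPI for January 2011, as quoted in the text

ratio = all_item_price_ratio(group_ratios, group_weights)
rpi = rpi_for_month(rpi_january, group_ratios, group_weights)
print(round(ratio, 3), round(rpi, 1))  # 1.029 235.6
```

The same two functions apply unchanged to the subgroup and section levels, since each level is just a weighted mean of the price ratios below it.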
Find the value of the RPI in July 2011 by completing the following table
and the formulas below. The value of the RPI in January 2011 was 229.0.
(The base date was January 1987.)
Table 14 Calculating the RPI for July 2011
Sum
(Source: Office for National Statistics)
sum(w) = ____ ,
sum of products (rw) = ____ ,
all-item price ratio = (sum of products (rw)) / sum(w) = ____ ,
value of RPI in July 2011 = ____ .
The published value for the RPI in July 2011 was 234.7, slightly different
from the value you should have obtained in Activity 21 (that is, 234.6).
The discrepancy arises because the government statisticians work to greater
accuracy during their RPI calculations, and round only at the end before
publishing the results.
The following activity is intended to help you draw together many of the
ideas you have met in this section, both about what the RPI is and how it
is calculated.
The fact that the inflation rates that are generally reported in the media
relate to price increases (as measured in a price index) over a whole year
means that one has to be careful in interpreting the figures, in several ways.
• Media reports might say that ‘inflation is falling’, but this does not
mean that prices are falling. It simply means that the annual inflation
rate is less than it was the previous month. So when the BBC headline
said that the (annual) inflation rate had fallen to 3.4% in February 2012,
it meant that the February 2012 rate was smaller than the January 2012
rate (which was 3.6%). Prices were still rising, but not quite so quickly.
• The change in price levels over one month may be, and indeed usually is,
considerably different from the annual inflation rate. For instance, prices
actually fell between December 2011 and January 2012: the CPI was
121.7 in December 2011 and 121.1 in January 2012. (Prices in the UK
usually fall between December and January, as Christmas
shopping ends and the January sales begin.) But the annual inflation
rate for January 2012, measured by the CPI, was 3.6%.
• The effect of a single major cause of increased prices can persist in the
annual inflation rates long after the prices originally increased. For
instance, the standard rate of value added tax (VAT) in the UK went up
from 17.5% to 20% at the start of January 2011, causing a one-off
increase in the price (to consumers) of many goods and services. This
showed up in the annual inflation rate for January 2011, where prices
were 4.0% higher than a year earlier. Moreover, the annual inflation rate
for every other month in 2011 was also affected by the VAT increase,
because in each case the CPI was being compared to the CPI in the
corresponding month in 2010, before the VAT increase.
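The distinction between a one-month price change and an annual inflation rate comes down to which two index values are compared. The sketch below (the function name percentage_change is our own) uses the two CPI values quoted above to find the one-month change; an annual inflation rate is the same calculation applied to index values twelve months apart, but the CPI for January 2011 is not quoted here, so that figure is not recomputed.

```python
def percentage_change(earlier_index, later_index):
    """Percentage change in the price level between two index values."""
    return (later_index / earlier_index - 1) * 100

cpi_dec_2011 = 121.7
cpi_jan_2012 = 121.1

monthly_change = percentage_change(cpi_dec_2011, cpi_jan_2012)
print(round(monthly_change, 1))  # -0.5
```

The negative result shows prices falling over the month, even though the annual rate for January 2012 was positive.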
Another important use of price indices like the RPI and CPI is for
index-linking. This is used for such things as savings and pensions, as a
means of safeguarding the value of money held or received in these forms.
Index-linking an amount
To index-link any amount of money, the amount in question is
multiplied by the same ratio as the change in the value of the price
index. Another term for this process is indexation.
Pensions can be, and indeed increasingly are, index-linked using the CPI
rather than the RPI.
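As a sketch of index-linking, the code below scales an amount by the ratio of two index values. The two RPI values (229.0 for January 2011 and 239.9 for February 2012) are quoted elsewhere in this section; the £100 amount and the function name index_link are hypothetical, chosen only to illustrate the rule.

```python
def index_link(amount, index_then, index_now):
    """Scale an amount by the same ratio as the change in the price index."""
    return amount * index_now / index_then

rpi_jan_2011 = 229.0
rpi_feb_2012 = 239.9

amount_then = 100.00  # hypothetical amount in pounds
amount_now = index_link(amount_then, rpi_jan_2011, rpi_feb_2012)
print(round(amount_now, 2))  # 104.76
```

Passing CPI values instead of RPI values index-links the amount to the CPI; the calculation itself is unchanged.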
The purchasing power of the pound could be calculated using the CPI
instead, though the figures published by the Office for National Statistics
do happen to use the RPI.
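A minimal sketch of the purchasing-power calculation follows, assuming the usual definition: the purchasing power (in pence) of the pound at a later date, compared with an earlier date, is 100 times the earlier index value divided by the later one. The function name is our own, and the two RPI values are those quoted in this section for January 2011 and February 2012.

```python
def purchasing_power_pence(index_earlier, index_later):
    """Pence that a pound at the later date is worth in earlier-date terms."""
    return 100 * index_earlier / index_later

rpi_jan_2011 = 229.0
rpi_feb_2012 = 239.9

pence = purchasing_power_pence(rpi_jan_2011, rpi_feb_2012)
print(round(pence, 1))  # 95.5
```

The same function accepts CPI values if purchasing power is to be measured with that index instead.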
For each of the following months, use the values of the RPI in Table 15 to
calculate the annual inflation rate (based on the RPI) and to calculate the
purchasing power of the pound (in pence) compared to one year previously.
(a) May 2010 (b) October 2011 (c) March 2011
You have seen that the RPI can be used as a way of updating the value of
a pension to take account of general increases in prices (index-linking).
The RPI is used in other similar ways, for instance to update the levels of
some other state benefits and investments. But the CPI could be used for
these purposes.
Why are there two different indices? Let’s look at how this arose. As well
as its use for index-linking, which is basically to compensate for price
changes, the RPI previously played an important role in the management
of the UK economy generally. The government sets targets for the rate of
inflation, and the Bank of England Monetary Policy Committee adjusts
interest rates to try to achieve these targets. Until the end of 2003, these
inflation targets were based on the RPI, or to be precise, on another price
index called RPIX which is similar to the RPI but omits owner-occupiers’
mortgage interest payments from the calculations. (There are good
economic reasons for this omission, to do with the fact that in many ways
the purchase of a house has the character of a long-term investment, unlike
the purchase of, say, a bag of potatoes.) From 2004, the inflation targets
have instead been set in terms of the CPI. The CPI is calculated in a way
that matches similar inflation measures in other countries of the European
Union. (So it can be used for international comparisons.)
In terms of general principles, though, and also in terms of most of the
details of how the indices are calculated, the differences between the RPI
and CPI are not actually very great. As mentioned in Subsection 5.1, the
CPI reflects the spending of a wider population than the RPI. Partly
because of this, there are certain items (e.g. university accommodation
fees) that are included in the CPI but not the RPI. There are also certain
items that are included in the RPI but not the CPI, notably some
owner-occupiers’ housing costs such as mortgage interest payments and
house-building insurance. Finally, the CPI uses a different method to the
RPI for combining individual price measurements.
Because of these differences, inflation as measured by the CPI tends to
be rather lower than that measured by the RPI. In Example 23,
you saw that the annual inflation rate in February 2012 as measured by
the CPI was 3.4%. The annual inflation rate in the same month, as
measured by the RPI, was 3.7%, as you saw in Activity 23. The RPI
continues to be calculated and published, and to be used to index-link
payments such as savings rates and some pensions. There are
reasons why the RPI is more appropriate than the CPI for some such
purposes, and it seems likely to continue in use for a long time.
Furthermore, changes in how index-linking is done can be politically very
controversial. For instance, in 2010, the UK government announced that in
future, public sector pensions would be index-linked to the CPI rather
than the RPI, which caused major complaints from those affected (because
inflation as measured by the CPI is usually lower than that measured
using the RPI, so pensions will not increase so much in money terms).
You might be asking yourself which is the ‘correct’ measure of inflation –
RPI, CPI, or something else entirely. There is no such thing as a single
‘correct’ measure. Different measures are appropriate for different
purposes. That’s why it is important to understand just what is being
measured and how.
In this section, you have seen how price rises are measured using an index
of retail prices. Earnings are discussed in the next unit. Only when prices
and earnings have both been considered can you begin to answer the
central question of these two units: Are people getting better or worse off?
In the next unit, you will see how to use a price index in conjunction with
an index of earnings to see whether rises in earnings are keeping pace with
rises in prices.
Exercises on Section 5
6 Computer work: measures of location
Total
(Source: Office for National Statistics)
Summary
In this unit you have been discovering how statistics can be used to answer
questions about prices. You have learned how to find a single number to
summarise the price of an item at a particular point in time, even though
the item might be available from a number of sources. You have also
learned how to combine information on prices across a range of goods and
services. Then, through the use of price ratios, you have seen how changes
in price over time can be quantified. In particular, you have learned about
chained price indices such as the Retail Prices Index (RPI) and Consumer
Prices Index (CPI), used in the UK to measure inflation.
Two more measures of location, the mean and weighted mean, have been
introduced. The mean is a sensitive measure whereas the median is a
resistant measure. The weighted mean only depends on the relative sizes
of the weights, and the weighted mean of two numbers is always closer to
the number with the larger weight.
You have learned about measures of spread, in particular the range and
the interquartile range, and about quartiles, from which the interquartile
range is calculated. The five-figure summary was described, which consists
of the minimum, lower quartile, median, upper quartile and maximum,
along with the size of the batch. A way of displaying the five-figure
summary, the boxplot, was introduced. The ‘box’ in the boxplot runs
between the lower and upper quartiles and has a line in it corresponding to
the median, thus displaying three of the five numbers in the five-figure
summary. The other two numbers in the five-figure summary, the
minimum and maximum, are shown by the ends of the whiskers or the
positions of potential outliers.
You have learned how the RPI and the CPI are calculated by the Office for
National Statistics from a ‘basket’ of goods using weighted means to give
price ratios, group price ratios and all-item price ratios. These
all-item price ratios are then chained to give the value of the index
relative to a base date. The RPI and CPI can be used to calculate
inflation, to index-link amounts of money and to calculate the purchasing
power of the pound at one time compared with another.
Learning outcomes
After working through this unit, you should be able to:
• find the median of a batch of data
• find the mean of a batch of data
• describe what is meant by a resistant measure of location, and identify
which measures are resistant
• find the weighted mean of two numbers with associated weights
• use the weighted mean to combine two batch means to find the mean of
the combined batch
• use the weighted mean to find the overall average cost of a commodity
from the price paid and quantity purchased on two occasions
• understand the use of a weighted mean in other contexts and for larger
sets of numbers
• find the upper and lower quartiles and the interquartile range of a batch
of data
• prepare a five-figure summary of a batch of data
• interpret the boxplot of a batch of data
• use the boxplot to investigate the overall shape of a batch of data, in
particular its symmetry and skewness
• calculate a simple chained price index and explain what is meant by its
base date
• describe the major steps in producing the Retail Prices Index
• calculate the value of the Retail Prices Index from the five group price
ratios and weights
• use the Retail Prices Index or the Consumer Prices Index to compare
the general level of prices at two dates and calculate the rise in the
general level of prices over a year (the annual rate of inflation)
• use the Retail Prices Index or the Consumer Prices Index to do
index-linking calculations, and use the Retail Prices Index to find the
purchasing power of the pound at one date compared with another.
Solutions to activities
Solution to Activity 1
For a batch size of 20, the median position is ½(20 + 1) = 10½. So, the
median will be halfway between x(10) and x(11). These are both 150, so the
median is £150.
Solution to Activity 2
(a) A stemplot of all 14 prices in the table is shown below.
374 | 0 0 3
375 |
376 | 0 7
377 | 6
378 | 4
379 | 5 6
380 | 1 1 4 5
381 | 8
(b) Stemplots for the prices for northern and southern cities are shown
below.
Northern          Southern
374 | 0 0         374 | 3
375 |             375 |
376 | 7           376 | 0
377 | 6           377 |
378 |             378 | 4
379 |             379 | 5 6
380 | 1 1 4       380 | 5
381 |             381 | 8
(c) For a batch size of 14, the median position is ½(14 + 1) = 7½. So, the
all-cities median will be halfway between x(7) and x(8). These
are 3.784 and 3.795, so the median is 3.7895, which is 3.790 when
rounded to three decimal places. (The rounded median should be
written as 3.790 and not 3.79, to show it is accurate to three decimal
places and not just two.)
For the northern and southern batches, both of size 7, the median for
each is the value of x(4) (that is, ½(7 + 1) = 4). This is 3.776 for the
northern batch and 3.795 for the southern batch.
The range is the difference between the upper extreme, E_U, and the
lower extreme, E_L (range = E_U − E_L). So the all-cities range is
3.818 − 3.740 = 0.078,
the range for the northern batch is
3.804 − 3.740 = 0.064,
and the range for the southern batch is
3.818 − 3.743 = 0.075.
The medians and ranges are summarised below.
Median Range
All cities 3.790 0.078
Northern cities 3.776 0.064
Southern cities 3.795 0.075
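The medians and ranges in this table can be checked with a short Python sketch. The prices are read off the stemplots in part (b), and the helper value_at_position (our own name) implements the rule used above: the median sits at position ½(n + 1) in the sorted batch, taking the value halfway between neighbours when that position is not a whole number.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position in a sorted batch."""
    lower = int(position)          # whole-number part of the position
    fraction = position - lower    # fractional part (0, 0.25, 0.5 or 0.75)
    if fraction == 0:
        return sorted_values[lower - 1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)

def median(values):
    ordered = sorted(values)
    return value_at_position(ordered, (len(ordered) + 1) / 2)

def batch_range(values):
    return max(values) - min(values)

# Gas prices (pence per kWh) read from the stemplots.
northern = [3.740, 3.740, 3.767, 3.776, 3.801, 3.801, 3.804]
southern = [3.743, 3.760, 3.784, 3.795, 3.796, 3.805, 3.818]
all_cities = northern + southern

for name, batch in [("All cities", all_cities),
                    ("Northern cities", northern),
                    ("Southern cities", southern)]:
    print(name, round(median(batch), 4), round(batch_range(batch), 3))
```

For the all-cities batch the code gives the unrounded median 3.7895, which rounds to the 3.790 shown in the table.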
Thus the general level of gas prices in the country as a whole was
about 3.790p per kWh. The average price differed by only
0.078p per kWh across the 14 cities.
The difference between the median prices for the northern and
southern cities is 0.019p per kWh (3.795 − 3.776 = 0.019), with the
south having the higher median.
The analysis does not clearly reveal whether the general level of gas
prices for typical consumers in 2010 was higher in the south or in the
north, though there is an indication that prices were a little higher in
the south. The range of prices was also rather greater in the south. It
is worth noting that the differences in gas prices between the cities in
Table 3 were generally small, when measured in pence per kWh –
although, with a typical annual gas usage of 18 000 kWh, the price
difference between the most expensive city and the cheapest would
amount to an annual difference in bills of about £14 on a typical bill
of somewhere around £700.
Solution to Activity 3
Using the data for the prices from Activity 1:
mean = sum/size = (90 + 100 + . . . + 270)/20 = £162.
Or using the Σ notation, Σx = 90 + 100 + . . . + 270 = 3240 and n = 20, so
mean = x̄ = Σx/n = 3240/20 = £162.
Solution to Activity 4
The entries are
Mean Median
3.7859 3.795
3.7996 3.796
Whereas deletion of Cardiff and Ipswich has the effect of increasing the
mean price by 0.0137p per kWh, the median price increases by only 0.001p
per kWh. This is what we would expect as, in general, the more resistant a
measure is, the less it changes when a few extreme values are deleted.
Solution to Activity 5
The entries are
Mean Median
3.7996 3.796
4.6996 3.796
Solution to Activity 6
You should expect the weighted mean price to be nearer the London price,
because of Rule 2 for weighted means (Subsection 2.1) and given that
London has a much larger weight than Edinburgh.
The weighted mean price given by the formula in Example 11 is (after
rounding) 3.814p per kWh, which is indeed much closer to the London
price than to the Edinburgh price.
Solution to Activity 7
(a) OS = ((80 × 50) + (60 × 50))/(50 + 50) = (4000 + 3000)/100 = 7000/100 = 70.
This is the same as a simple (unweighted) mean of the two scores,
because the two component scores have equal weight. It lies exactly
halfway between the two scores (½(80 + 60) = 70).
(b) OS = ((80 × 40) + (60 × 60))/(40 + 60) = (3200 + 3600)/100 = 6800/100 = 68.
This is slightly less than the simple mean in (a) because the
component with the lower score (TMA) has the greater weight.
Solution to Activity 9
The table showing the required sums (and the values in the xw column,
that you may not have had to write down), is as follows.
Thus Σxw = 24 854.49, Σw = 1892 and
Σxw/Σw = 24 854.49/1892 = 13.136 623 ≈ 13.14.
So the weighted mean of electricity prices is 13.14p per kWh.
Solution to Activity 10
(a) Here, because $n = 15$, an appropriate picture of the data would be
Figure 9. To find the lower and upper quartiles, $Q_1$ and $Q_3$, of this
batch, first find $\tfrac{1}{4}(n + 1) = 4$ and $\tfrac{3}{4}(n + 1) = 12$. Therefore $Q_1 = 268$p
and $Q_3 = 299$p.
(b) For this batch, $n = 14$ so $\tfrac{1}{4}(n + 1) = 3\tfrac{3}{4}$ and $\tfrac{3}{4}(n + 1) = 11\tfrac{1}{4}$.
\[ Q_1 = 3.743 + \tfrac{3}{4}(3.760 - 3.743) = 3.755\,75 \simeq 3.756 \]
and
\[ Q_3 = 3.801 + \tfrac{1}{4}(3.804 - 3.801) = 3.801\,75 \simeq 3.802. \]
So the lower quartile is 3.756p per kWh and the upper quartile is
3.802p per kWh.
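The quartile rule used above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name quartile is invented for the example, and in the batch below only the values in positions 3, 4, 11 and 12 are the ones quoted in part (b); the remaining entries are placeholders chosen so that the batch has 14 sorted values.

```python
from fractions import Fraction

def quartile(sorted_values, fraction):
    """Return the value at position fraction * (n + 1) of a sorted batch.

    Positions are counted from 1. A fractional position is handled by
    moving that fraction of the way from one value to the next.
    Assumes the batch is sorted and has at least three values.
    """
    n = len(sorted_values)
    position = Fraction(fraction) * (n + 1)
    whole = int(position)      # whole-number part of the position
    part = position - whole    # fractional part: 0, 1/4, 1/2 or 3/4
    lower = sorted_values[whole - 1]
    if part == 0:
        return lower
    upper = sorted_values[whole]
    return lower + float(part) * (upper - lower)

# Placeholder batch of 14 prices (p per kWh). Only positions 3, 4, 11
# and 12 are taken from the solution; the other values are made up.
batch = [3.740, 3.741, 3.743, 3.760, 3.770, 3.780, 3.789,
         3.791, 3.795, 3.798, 3.801, 3.804, 3.810, 3.818]

q1 = quartile(batch, Fraction(1, 4))   # position 3 3/4
q3 = quartile(batch, Fraction(3, 4))   # position 11 1/4
print(q1, q3)
```

Using Fraction for the position keeps the quarter and three-quarter parts exact, so the check for a whole-number position is reliable.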
Solution to Activity 11
The range is the distance between the extremes:
\[ \text{range} = E_U - E_L = 369\text{p} - 268\text{p} = 101\text{p}. \]
The interquartile range is the distance between the quartiles:
\[ \text{IQR} = Q_3 - Q_1 = 299\text{p} - 268\text{p} = 31\text{p}. \]
Solution to Activity 12
The quartiles, before rounding, are $Q_1 = 3.755\,75$ and $Q_3 = 3.801\,75$. So
\[ \text{IQR} = Q_3 - Q_1 = 3.801\,75 - 3.755\,75 = 0.046, \]
and the interquartile range is 0.046p per kWh.
Solution to Activity 13
(a) All the necessary figures have already been calculated. You found the
median (3.790) in Activity 2 and the quartiles ($Q_1 = 3.756$,
$Q_3 = 3.802$) in Activity 10. The extremes ($E_L = 3.740$, $E_U = 3.818$)
and the batch size ($n = 14$) are clearly shown in the stemplot.
So the five-figure summary is as follows.
                 3.790
n = 14     3.756       3.802
           3.740       3.818
(b) Looking at the stemplot, on the whole the lower values are more
spread out, indicating that the data are not symmetric and are
left-skew.
The central box of the boxplot again shows left skewness, with the
left-hand part of the box being clearly longer than the right-hand
part. However, this skewness does not show up in the lengths of the
whiskers in this batch – they are both the same length.
Solution to Activity 14
The increase (in £/MWh) is $29 - 24 = 5$. This is $\frac{5}{24} \simeq 0.208$ as a
proportion of the 2007 price. That is, $\frac{5}{24} \times 100\% \simeq 20.8\%$ of the 2007
price. Or you might have worked this out by finding that the 2008 price is
$\frac{29}{24} \times 100\% \simeq 120.8\%$ of the 2007 price, so that again the increase is 20.8%
of the 2007 price.
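Both routes to the percentage increase can be checked with a short Python sketch. The function names here are assumptions made for illustration only.

```python
def percentage_increase(old_price, new_price):
    """Increase expressed as a percentage of the old price."""
    return (new_price - old_price) / old_price * 100

def as_percentage_of(old_price, new_price):
    """New price expressed as a percentage of the old price."""
    return new_price / old_price * 100

# Gas prices in £/MWh: 24 in 2007 and 29 in 2008.
increase = percentage_increase(24, 29)
relative = as_percentage_of(24, 29)

# The two results differ by exactly 100 percentage points
# (apart from floating-point rounding).
print(round(increase, 1), round(relative, 1))
```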
Solution to Activity 15
The 2008 electricity price is 1.145 × 100% = 114.5% of the 2007 price, so
that the increase is 14.5% of the 2007 price.
The 2008 value of the electricity price index is
(value of the index in 2007, which is 100)
× (electricity price ratio for 2008 relative to 2007)
= 100 × 1.145 = 114.5.
Solution to Activity 16
The expenditure on a particular fuel in a particular year can be calculated
as expenditure = quantity used × price. Therefore, if the expenditure and
price are known, the quantity used can be calculated as
\[ \text{quantity used} = \frac{\text{expenditure}}{\text{price}}. \]
In 2007, Gradgrind’s gas cost £24 per MWh, and they spent £9298 on gas,
so the amount of gas they used in MWh was
\[ \frac{9298}{24} \simeq 387.4. \]
The other amounts, in MWh, are found in a similar way, and all are shown
in the following table.
2007 2008
Gas 387.4 280.9
Electricity 42.2 34.4
The reason that the expenditures went down is simply that Gradgrind
used less of each fuel in 2008 than in 2007.
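The division of expenditure by price can be sketched in Python as follows. The function name quantity_used is an assumption for this example. The 2008 prices (£29 and £87 per MWh) and 2008 expenditures (£8145 and £2991) are the figures that appear in the solutions to Activities 14 and 17; the 2007 electricity expenditure is not quoted in these solutions, so that entry is left out.

```python
def quantity_used(expenditure, price):
    """Quantity in MWh, given expenditure in £ and price in £ per MWh."""
    return expenditure / price

# (fuel, year): (expenditure in £, price in £ per MWh)
records = {
    ("Gas", 2007): (9298, 24),
    ("Gas", 2008): (8145, 29),
    ("Electricity", 2008): (2991, 87),
}

quantities = {
    key: round(quantity_used(expenditure, price), 1)
    for key, (expenditure, price) in records.items()
}

for (fuel, year), mwh in quantities.items():
    print(fuel, year, mwh)
```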
Solution to Activity 17
(a) The gas price ratio for 2009 relative to 2008 is
\[ \frac{30}{29} \simeq 1.034. \]
The electricity price ratio for 2009 relative to 2008 is
\[ \frac{98}{87} \simeq 1.126. \]
(Over this year, electricity prices rose a lot more than gas prices.)
(b) The overall energy price ratio for 2009 relative to 2008 is
\[ \frac{(1.034 \times 8145) + (1.126 \times 2991)}{8145 + 2991} = \frac{11\,789.796}{11\,136} \simeq 1.059. \]
(c) Using the 2009 expenditures for weights instead of the 2008
expenditures, the overall energy price ratio for 2009 relative to 2008 is
\[ \frac{(1.034 \times 23\,733) + (1.126 \times 2275)}{23\,733 + 2275} = \frac{27\,101.572}{26\,008} \simeq 1.042. \]
This price ratio is considerably less than the one found in part (b).
(Note that if full calculator accuracy is retained throughout the
calculations, the price ratio is 1.043 to three decimal places.)
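The effect of the choice of weights, and of rounding the individual price ratios first, can be seen in the following Python sketch. The function and variable names are assumptions made for the example; the prices and expenditures are those used in parts (a) to (c).

```python
def overall_price_ratio(price_ratios, expenditures):
    """Weighted mean of price ratios, using expenditures as the weights."""
    weighted_sum = sum(r * w for r, w in zip(price_ratios, expenditures))
    return weighted_sum / sum(expenditures)

gas_ratio = 30 / 29          # gas price ratio, 2009 relative to 2008
electricity_ratio = 98 / 87  # electricity price ratio, 2009 relative to 2008

# Ratios rounded to three decimal places, as in the written solution.
rounded_ratios = [round(gas_ratio, 3), round(electricity_ratio, 3)]
# Ratios kept at full accuracy.
full_ratios = [gas_ratio, electricity_ratio]

weights_2008 = [8145, 2991]    # gas, electricity expenditures (£)
weights_2009 = [23733, 2275]   # gas, electricity expenditures (£)

part_b = overall_price_ratio(rounded_ratios, weights_2008)
part_c = overall_price_ratio(rounded_ratios, weights_2009)
part_c_full = overall_price_ratio(full_ratios, weights_2009)

print(round(part_b, 3), round(part_c, 3), round(part_c_full, 3))
```

The difference between the last two results comes only from rounding the gas and electricity ratios before combining them.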
Solution to Activity 18
The gas price ratio for 2010 relative to 2009 is
\[ \frac{28}{30} \simeq 0.933. \]
The electricity price ratio for 2010 relative to 2009 is
\[ \frac{88}{98} \simeq 0.898. \]
(Both price ratios are less than 1 because, over this year, Gradgrind’s gas
and electricity prices both fell.)
The overall energy price ratio for 2010 relative to 2009 is
\[ \frac{(0.933 \times 23\,733) + (0.898 \times 2275)}{23\,733 + 2275} = \frac{24\,185.839}{26\,008} \simeq 0.930. \]
Then the value of the index for 2010 is found by multiplying the 2009
value of the index by this overall price ratio, giving
\[ 126.2 \times 0.930 \simeq 117.4. \]
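The chaining step, in which each year's index value is the previous value multiplied by the overall price ratio, can be sketched in Python. The names below are assumptions for illustration; the sketch rounds at the same points as the written solution so that the intermediate figures match.

```python
def overall_price_ratio(price_ratios, expenditures):
    """Weighted mean of price ratios, using expenditures as the weights."""
    weighted_sum = sum(r * w for r, w in zip(price_ratios, expenditures))
    return weighted_sum / sum(expenditures)

def next_index_value(previous_index, price_ratio):
    """Chain the index forward by one year."""
    return previous_index * price_ratio

# Price ratios for 2010 relative to 2009, rounded to three decimal places.
gas_ratio = round(28 / 30, 3)
electricity_ratio = round(88 / 98, 3)

# Weights: the 2009 expenditures on gas and electricity (£).
overall = overall_price_ratio([gas_ratio, electricity_ratio], [23733, 2275])

# The 2009 index value was 126.2.
index_2010 = next_index_value(126.2, round(overall, 3))
print(round(overall, 3), round(index_2010, 1))
```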
Solution to Activity 19
(a) What you need to remember here is that the size of an area represents
the proportion of expenditure on that class of goods or services.
(Also, it is admittedly not very easy to estimate these areas ‘by eye’!
Your estimates might quite reasonably differ from those given here.)
• The sector for ‘Personal expenditure’ looks as if it is approximately
a tenth of the whole inner circle – so approximately a tenth of total
expenditure is personal expenditure.
• ‘Housing and household expenditure’ looks as if it is somewhere
between a third and a half of the inner circle – perhaps
approximately two fifths – so approximately two fifths of
expenditure is on housing and household expenditure.
• The area for ‘Housing’ takes up about a quarter of the outer ring,
so about a quarter of expenditure is on housing.
(b) The amount spent each week on ‘Personal expenditure’ is
approximately
\[ \tfrac{1}{10} \times \pounds 540 = \pounds 54. \]
The amount spent each week on ‘Housing and household expenditure’
is approximately
\[ \tfrac{2}{5} \times \pounds 540 = \pounds 216 \simeq \pounds 220. \]
The amount spent each week on ‘Housing’ is approximately
\[ \tfrac{1}{4} \times \pounds 540 = \pounds 135 \simeq \pounds 140. \]
Recall, however, that the weights represent average proportions of
expenditure, and the spending patterns of the selected household may
differ from those of the ‘typical’ household.
Solution to Activity 20
Every household will be different, but think about the reasons for any
large differences between your weights and those for the RPI.
Solution to Activity 21
                                    Price ratio for July 2011   2011      Price ratio
                                    relative to January 2011    weights   × weight
Group                               r                           w         rw
Food and catering                   1.024                       165       168.960
Alcohol and tobacco                 1.042                        88        91.696
Housing and household expenditure   1.012                       408       412.896
Personal expenditure                1.053                        82        86.346
Travel and leisure                  1.030                       257       264.710
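The products in the rw column, and the totals that follow from the five rows of the table, can be checked with a short Python sketch. This is an illustration only: the variable names are assumptions, and the totals printed are simply computed from the rows as given.

```python
# (group, price ratio r, weight w), copied from the rows of the table.
groups = [
    ("Food and catering", 1.024, 165),
    ("Alcohol and tobacco", 1.042, 88),
    ("Housing and household expenditure", 1.012, 408),
    ("Personal expenditure", 1.053, 82),
    ("Travel and leisure", 1.030, 257),
]

# Price ratio x weight for each group (the rw column).
products = {name: r * w for name, r, w in groups}

sum_w = sum(w for _, _, w in groups)
sum_rw = sum(products.values())

# Weighted mean of the group price ratios.
all_item_ratio = sum_rw / sum_w

for name, rw in products.items():
    print(f"{name}: {rw:.3f}")
print(sum_w, round(sum_rw, 3), round(all_item_ratio, 6))
```

Dividing the sum of the products by the sum of the weights is the same weighted-mean calculation used for the energy price ratios, now with the RPI group weights.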
Solution to Activity 22
More detail has been included in these comments than is expected from
you. When you read them, make sure you understand all the points
mentioned.
(a) The RPI is calculated using the price ratio and weight of each item.
Since the weights of items change very little from one year to the
next, the price ratio alone will normally tell you whether a change in
price is likely to lead to an increase or a decrease in the value of the
RPI. If a price rises, then the price ratio is greater than one, so the
RPI is likely to increase as a result. If a price falls, then the price
ratio is less than one, so the RPI is likely to decrease. Therefore, since
the price of leisure goods fell, this is likely to lead to a decrease in the
value of the RPI. For a similar reason, the increase in the price of
canteen meals is likely to lead to an increase in the value of the RPI.
(b) Both changes are likely to be small for two reasons. First, the price
changes are themselves fairly small. Second, leisure goods and canteen
meals form only part of a household’s expenditure: no single group,
subgroup or section will have a large effect on the RPI on its own,
unless there is a very large change in its price.
(c) The weight of ‘Leisure goods’ was 33 in 2012 (see Table 12). Since
‘Canteen meals’ is only one section in the subgroup ‘Catering’, which
had weight 47 in 2012, the weight of ‘Canteen meals’ will be much
smaller than 47. (In fact it was 3.) So the weight of ‘Leisure goods’ is
much larger than the weight of ‘Canteen meals’.
(d) Since the weight of ‘Leisure goods’ is much larger than the weight of
‘Canteen meals’, and the percentage changes in the prices are not too
different in size, the change in the price of leisure goods is likely to
have a much larger effect on the value of the RPI as a whole.
Solution to Activity 23
The ratio of the two RPI values is
\[ \frac{\text{value of RPI in February 2012}}{\text{value of RPI in February 2011}} = \frac{239.9}{231.3} \simeq 1.037, \]
or 103.7%. Therefore the annual inflation rate, based on the RPI, was
3.7%. (Note that this is slightly higher than the annual inflation rate
measured using the CPI.)
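The step from the ratio of two index values to an annual inflation rate can be written as a small Python function. The name annual_inflation_rate is an assumption made for this sketch.

```python
def annual_inflation_rate(index_now, index_year_ago):
    """Percentage change in a price index over twelve months."""
    return (index_now / index_year_ago - 1) * 100

# RPI in February 2012 and in February 2011.
rate = annual_inflation_rate(239.9, 231.3)
print(round(rate, 1))
```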
Solution to Activity 24
The weekly amount in November 2011 should be
\[ \pounds 120 \times \frac{121.2}{115.6} \simeq \pounds 125.81. \]
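Index-linking scales an amount by the ratio of the later index value to the earlier one. A minimal Python sketch, with the function name index_linked_amount assumed for illustration, is:

```python
def index_linked_amount(amount, index_earlier, index_later):
    """Scale an amount so that it keeps pace with a price index."""
    return amount * index_later / index_earlier

# £120 per week, updated using index values 115.6 (earlier) and 121.2 (later).
weekly_amount = index_linked_amount(120, 115.6, 121.2)
print(f"£{weekly_amount:.2f}")
```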
Solution to Activity 25
(a) For May 2010, the ratio of the value of the RPI to its value one year
earlier is
\[ \frac{223.6}{212.8} \simeq 1.051, \]
so the annual inflation rate is 5.1%.
The purchasing power of the pound compared to one year previously is
\[ \frac{212.8}{223.6} \times 100\text{p} \simeq 95\text{p}. \]
(b) For October 2011, the ratio of the value of the RPI to its value one
year earlier is
\[ \frac{238.0}{225.8} \simeq 1.054, \]
so the annual inflation rate is 5.4%.
The purchasing power of the pound compared to one year previously is
\[ \frac{225.8}{238.0} \times 100\text{p} \simeq 95\text{p}. \]
(c) For March 2011, the ratio of the value of the RPI to its value one year
earlier is
\[ \frac{232.5}{220.7} \simeq 1.053, \]
so the annual inflation rate is 5.3%.
The purchasing power of the pound compared to one year previously is
\[ \frac{220.7}{232.5} \times 100\text{p} \simeq 95\text{p}. \]
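The purchasing power calculation turns the RPI ratio upside down: the earlier value is divided by the later one. The following Python sketch applies this to the three pairs of RPI values above; the function name and the dictionary layout are assumptions made for the example.

```python
def purchasing_power_pence(index_now, index_year_ago):
    """Purchasing power of £1 now, in pence, relative to one year earlier."""
    return index_year_ago / index_now * 100

# month: (RPI in that month, RPI one year earlier)
rpi_pairs = {
    "May 2010": (223.6, 212.8),
    "October 2011": (238.0, 225.8),
    "March 2011": (232.5, 220.7),
}

purchasing_power = {
    month: round(purchasing_power_pence(now, year_ago))
    for month, (now, year_ago) in rpi_pairs.items()
}

for month, pence in purchasing_power.items():
    print(month, f"{pence}p")
```

Although the three inflation rates differ slightly, each purchasing power rounds to the same whole number of pence.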
Solutions to exercises
Solution to Exercise 1
(a) For the arithmetic scores, the position of the median is
$\tfrac{1}{2}(33 + 1) = 17$, so the median is 79%.
(b) For the television prices, the position of the median is $\tfrac{1}{2}(26 + 1) = 13\tfrac{1}{2}$,
so the median is halfway between $x_{(13)}$ and $x_{(14)}$. Thus, the median is
$\tfrac{1}{2}(\pounds 269 + \pounds 270) = \pounds 269.5 \simeq \pounds 270$.
Solution to Exercise 2
For the batch of arithmetic scores in part (a) of Exercise 1, the sum of the
33 values is 2326 and
\[ \frac{2326}{33} \simeq 70.5. \]
Therefore, the mean is 70.5%. (The original data are given to the nearest
whole number, so the mean is rounded to one decimal place.)
For the batch of television prices in part (b) of Exercise 1, the sum of the
26 values is 7856 and
\[ \frac{7856}{26} = 302.1538 \simeq 302.2. \]
Therefore, the mean is £302.2.
Solution to Exercise 3
For the median, there are now 17 prices left in the batch, so the median is
at position $\tfrac{1}{2}(17 + 1) = 9$. It is therefore 150.
The sum of the remaining 17 values is 2480, so the mean is
\[ \frac{2480}{17} = 145.8824 \simeq 146. \]
In this case, removing the three highest prices has not changed the median
at all, but it has reduced the mean considerably. This illustrates that the
median is a more resistant measure than the mean.
Solution to Exercise 4
Mean price of all the cameras is
\[ \frac{(80.7 \times 10) + (78.5 \times 17)}{10 + 17} = \frac{2141.5}{27}, \]
which is £79.3 (rounded to the same accuracy as the original means).
Solution to Exercise 5
Mean price of all the material is
\[ \frac{(10.95 \times 8.5) + (12.70 \times 6)}{8.5 + 6} = \frac{169.275}{14.5}, \]
which is £11.67 (rounded to the nearest penny).
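In Exercises 4 and 5 the weights are the batch sizes or quantities rather than percentages, but the calculation has the same form. A Python sketch, with the function name combined_mean assumed for illustration, is:

```python
def combined_mean(means, sizes):
    """Mean of several batches combined, weighting each mean by its size."""
    total = sum(m * s for m, s in zip(means, sizes))
    return total / sum(sizes)

# Exercise 4: means of £80.7 and £78.5 for batches of 10 and 17 cameras.
cameras = combined_mean([80.7, 78.5], [10, 17])

# Exercise 5: prices of £10.95 and £12.70 with weights 8.5 and 6.
material = combined_mean([10.95, 12.70], [8.5, 6])

print(round(cameras, 1), round(material, 2))
```

Note that the weights need not be whole numbers, as the second calculation shows.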
Solution to Exercise 6
(a) For the arithmetic scores, $n = 33$ so $\tfrac{1}{4}(n + 1) = 8\tfrac{1}{2}$ and $\tfrac{3}{4}(n + 1) = 25\tfrac{1}{2}$.
The lower quartile is therefore
\[ Q_1 = \tfrac{1}{2}(55 + 58)\% = 56.5\% \simeq 57\%. \]
The upper quartile is
\[ Q_3 = \tfrac{1}{2}(86 + 89)\% = 87.5\% \simeq 88\%. \]
The interquartile range is
\[ Q_3 - Q_1 = 87.5\% - 56.5\% = 31\%. \]
Solution to Exercise 7
(a) Arithmetic scores:
From the stemplot, $n = 33$, $E_L = 7$ and $E_U = 100$.
                 79
n = 33     57          88
           7           100
Five-figure summary of arithmetic scores
                 270
n = 26     230         327
           170         699
Five-figure summary of television prices
Solution to Exercise 8
For the boxplot of arithmetic scores, the left part of the box is longer than
the right part, and the left whisker is also considerably longer than the
right. This batch is left-skew (as was also found in Unit 1 (Activity 20,
Subsection 5.2)).
For the boxplot of television prices, the right part of the box is rather
longer than the left part. The right whisker is also rather longer than the
left, and if one also takes into account the fact that two potential outliers
have been marked, the top 25% of the data are clearly much more spread
out than the bottom 25%. This batch is right-skew.
Solution to Exercise 9
The gas price ratio for 2011 relative to 2010 is
\[ \frac{30}{28} \simeq 1.071. \]
The electricity price ratio for 2011 relative to 2010 is
\[ \frac{86}{88} \simeq 0.977. \]
The overall energy price ratio for 2011 relative to 2010 is
\[ \frac{(1.071 \times 23\,969) + (0.977 \times 2920)}{23\,969 + 2920} = \frac{28\,523.639}{26\,889} \simeq 1.061. \]
Then the value of the index for 2011 is found by multiplying the 2010
value of the index by this overall price ratio, giving
\[ 117.4 \times 1.061 \simeq 124.6. \]
Solution to Exercise 10
\[ \sum w = 1000, \qquad \sum rw = 1007.760, \]
\[ \text{all-item price ratio} = \frac{\sum rw}{\sum w} = \frac{1007.760}{1000} = 1.007\,760, \]
Solution to Exercise 11
(a) For October 2010, the ratio of the value of the RPI to its value one
year earlier is
\[ \frac{225.8}{216.0} \simeq 1.045, \]
so the annual inflation rate is 4.5%.
The purchasing power of the pound compared to one year previously is
\[ \frac{216.0}{225.8} \times 100\text{p} \simeq 96\text{p}. \]
(b) For January 2011, the ratio of the value of the RPI to its value one
year earlier is
\[ \frac{229.0}{217.9} \simeq 1.051, \]
so the annual inflation rate is 5.1%.
The purchasing power of the pound compared to one year previously is
\[ \frac{217.9}{229.0} \times 100\text{p} \simeq 95\text{p}. \]
Solution to Exercise 12
The RPI for April 2011 was 234.4 and the RPI for April 2010 was 222.8.
So in April 2011, the pension should be
\[ \pounds 800 \times \frac{234.4}{222.8} \simeq \pounds 842 \text{ per month.} \]
Acknowledgements
Grateful acknowledgement is made to the following sources:
Table 3 Adapted from:
https://www.gov.uk/government/statistical-data-sets/annual-domestic-energy-price-statistics
Table 5 Taken from:
http://en.wikipedia.org/wiki/List_of_conurbations_in_the_United_Kingdom.
This file is licensed under the Creative Commons Attribution Licence
http://creativecommons.org/licenses/by/3.0/
Table 6 Department of Energy and Climate Change
Tables 13–15 Office for National Statistics licensed under the
Open Government Licence v.1.0
Table 16 Adapted from data from the Office for National Statistics licensed
under the Open Government Licence v.1.0
Figure 28 Crown copyright material is reproduced under Class Licence
Number C01W0000065 with the permission of the Controller, Office of
Public Sector Information (OPSI)
Subsection 1.1 figure, ‘Data, data, data!’, Mary Evans Picture Library
Subsection 1.2 figure, ‘An upside down V-shape’, GIDZY /
www.flickr.com. This file is licensed under the Creative Commons
Attribution Licence http://creativecommons.org/licenses/by/3.0/
Subsection 1.2 figure, ‘Not that kind of flat screen’, Joey Gannon /
www.flickr.com. This file is licensed under the Creative Commons
Attribution-Share Alike Licence
http://creativecommons.org/licenses/by-sa/3.0/
Subsection 3.2 figure, ‘More birds, now showing the shape of the ∧∧
diagram’, JUMBERO / www.flickr.com. This file is licensed under the
Creative Commons Attribution-Share Alike Licence
http://creativecommons.org/licenses/by-sa/3.0/
Subsection 3.3 quote from McCullagh, P. (2003): The Royal Statistical
Society
Subsection 3.3 photo of John Tukey: Taken from
http://rchsbowman.wordpress.com/2011/09/03/statistics-notes-%E2%80%94-biography-%E2%80%94-john-wilder-tukey/
Subsection 3.3 cartoon: www.causeweb.org
Subsection 5.2 quote from BBC News website, 14 March 2012: Taken from
www.bbc.co.uk/news/business-17356286
Every effort has been made to contact copyright holders. If any have been
inadvertently overlooked the publishers will be pleased to make the
necessary arrangements at the first opportunity.