Timeseriesanalysisguide 222321
Some data, such as sales data, may incorporate a seasonal trend. The underlying theory for
approaches to forecasting in these situations is summarised in this document.
Table 5
Time period (t)   1      2      3      4      5      6      7
Sales (y)         577.2  510.2  442.8  559.0  536.5  471.0  585.9
Plotting the sales figures on a graph shows a gradual increase, but the variation within the data
is significant. Closer inspection shows a distinct “seasonality” within the data: a similar peak or
trough recurs every three time periods. It is therefore reasonable to include the seasonality
element when forecasting future sales. A technique known as time series analysis may be used
to forecast sales with seasonality.
[Figure: sales plotted against time period (t = 1 to 18); vertical axis from 400 to 700]
Observation of the data shows that there is a 3-season seasonality, i.e. the peaks and troughs
occur every 3 time periods.
So, consider a sequential 3-season period (any number of periods can be used, but an odd
number is easier to centre and makes the theory easier to follow). Each observed value y is the
sum of the underlying trend x, the seasonal variation s and the residual r:
For period n-1: $y_{n-1} = x_{n-1} + s_{n-1} + r_{n-1}$
For period n: $y_n = x_n + s_n + r_n$
For period n+1: $y_{n+1} = x_{n+1} + s_{n+1} + r_{n+1}$
Averaging these three sequential data points over the 3-season year (a bar over a symbol
indicates a mean) and centring the average on period n gives:
$\bar{y}_n = \bar{x}_n + \bar{s}_n + \bar{r}_n$
If seasonal variations are regular over time and the averaging has been done over a full season
cycle then there will be as many data points above the average as there are below the average.
Hence $\bar{s}_n = 0$; in other words, we have smoothed out the seasonality by taking an average. If
the 3-point sequential averages are taken at every data point along the data set then we arrive at
a set of data with the seasonality removed. This procedure is called centred moving averaging.
If the residual component is random then the random values will tend to cancel each other out
when averaged, so $\bar{r}_n \approx 0$.
Hence, as $\bar{y}_n$ is approximately $x_n$, the centred moving average values may be used as a
primary estimate of the x values and hence of the underlying trend equation.
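The centred moving average described above can be sketched in a few lines of Python. This is a minimal illustration, assuming an odd window and the seven sales values from Table 5; the function name `centred_moving_average` is hypothetical and not part of the original guide.

```python
def centred_moving_average(y, window=3):
    """Return centred moving averages of y.

    Positions that do not have a full window around them (the first and
    last `window // 2` points) are returned as None.
    """
    if window % 2 == 0:
        raise ValueError("this sketch handles odd windows only")
    half = window // 2
    cma = [None] * len(y)
    for i in range(half, len(y) - half):
        # Average the point itself plus `half` neighbours on each side.
        cma[i] = sum(y[i - half:i + half + 1]) / window
    return cma


# Sales values for t = 1..7 from Table 5.
sales = [577.2, 510.2, 442.8, 559.0, 536.5, 471.0, 585.9]
cma = centred_moving_average(sales, window=3)

for t, value in enumerate(cma, start=1):
    print(t, "-" if value is None else round(value, 1))
```

The first and last periods have no centred average because a full 3-point window is not available there.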
The amount of seasonality (i.e. how far a value lies above or below the average amount) may be
estimated by calculating the ratio of the actual values to the centred moving averages:
$\frac{y_n}{\bar{y}_n} = \frac{x_n + s_n + r_n}{x_n}$, so $\frac{y_n}{\bar{y}_n} = 1 + \frac{s_n + r_n}{x_n}$
$I_n = \frac{x_n + s_n}{x_n}$, so $I_n = 1 + \frac{s_n}{x_n}$
$I_n = I_{n+3} = I_{n+6} = I_{n+9}$ etc.,
so the average of the estimates for a given season will provide the mean seasonal index for that
season.
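As a rough illustration of the ratio-and-average step, the sketch below groups the y/CMA ratios by season and takes the mean of each group. It assumes a 3-period cycle and uses only the twelve sales and CMA values that appear in Table 6, so the means will not exactly match the indices quoted in the guide (1.09, 1.05 and 0.86), which are averaged over all 18 periods. The function name `seasonal_indices` is hypothetical.

```python
def seasonal_indices(y, cma, period=3):
    """Mean of y/CMA for each season.

    Seasons are keyed by position within the cycle: key 0 covers
    t = 1, 4, 7, ..., key 1 covers t = 2, 5, 8, ..., and so on.
    Periods with no CMA (None) are skipped.
    """
    ratios = {k: [] for k in range(period)}
    for i, (obs, avg) in enumerate(zip(y, cma)):
        if avg is not None:
            ratios[i % period].append(obs / avg)
    return {k: sum(v) / len(v) for k, v in ratios.items() if v}


# Sales and CMA values for t = 1..12 as printed in Table 6 (t = 1 has no CMA).
sales = [577.2, 510.2, 442.8, 559.0, 536.5, 471.0,
         585.9, 587.8, 457.1, 602.5, 564.7, 498.9]
cma = [None, 510.1, 504.0, 512.8, 522.2, 531.1,
       548.2, 543.6, 549.1, 541.4, 555.4, 552.4]

indices = seasonal_indices(sales, cma, period=3)
for season, value in sorted(indices.items()):
    print(season, round(value, 3))
```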
So, having obtained estimates for $x_n$ and $I_n$, successive seasonal variations can be estimated:
$s_n = (I_n \times x_n) - x_n$, so $s_n = (I_n - 1)\,x_n$
The difference between the fitted data and the actual data (the residuals r) may be used to
estimate the confidence limits of the extrapolation.
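The two estimates above can be sketched as follows: s comes from $s_n = (I_n - 1)\,x_n$ and r from rearranging $y_n = x_n + s_n + r_n$. The inputs are the rounded index and trend values printed in Table 6, so the results may differ from the table's s and r columns by a unit or two (the table was presumably built from unrounded figures). Using the sample standard deviation of the residuals as the spread measure is an assumption of this sketch, and the function names are hypothetical.

```python
from statistics import stdev


def seasonal_variation(index, x):
    """s = (I - 1) * x."""
    return (index - 1.0) * x


def residual(y, x, s):
    """r = y - x - s, from y = x + s + r."""
    return y - x - s


# Values for t = 1..12 as printed in Table 6.
sales = [577.2, 510.2, 442.8, 559.0, 536.5, 471.0,
         585.9, 587.8, 457.1, 602.5, 564.7, 498.9]
trend = [504, 509, 513, 518, 523, 528, 533, 538, 543, 548, 553, 559]
indices = [1.09, 1.05, 0.86] * 4  # average seasonality index, repeating

s_est = [seasonal_variation(i, x) for i, x in zip(indices, trend)]
r_est = [residual(y, x, s) for y, x, s in zip(sales, trend, s_est)]

# Sample standard deviation of the residuals (assumed spread measure).
residual_sd = stdev(r_est)

for t, (s, r) in enumerate(zip(s_est, r_est), start=1):
    print(t, round(s, 1), round(r, 1))
print("residual standard deviation:", round(residual_sd, 1))
```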
Table 6
Time     Sales (y)   Centred moving   y/CMA    Average seasonality   x (estimated   s (est)   r
period   (=x+s+r)    average (CMA)    (=In)    index (In)            from graph)
(t)
1        577.2       -                -        1.09                  504            46        27
2        510.2       510.1            1.000    1.05                  509            25        -24
3        442.8       504.0            0.879    0.86                  513            -71       1
4        559.0       512.8            1.090    1.09                  518            47        -7
5        536.5       522.2            1.027    1.05                  523            26        -13
6        471.0       531.1            0.887    0.86                  528            -73       17
7        585.9       548.2            1.069    1.09                  533            48        5
8        587.8       543.6            1.081    1.05                  538            27        23
9        457.1       549.1            0.832    0.86                  543            -76       -10
10       602.5       541.4            1.113    1.09                  548            50        4
11       564.7       555.4            1.017    1.05                  553            27        -16
12       498.9       552.4            0.903    0.86                  559            -78       18
1. Construct a table of sequential sales data (y). Determine from the seasonality how many time
periods represent a full cycle of seasonality. For this technique to work, the number of periods
used to take centred moving averages must cover the full seasonality (i.e. peak to peak or
trough to trough), no more and no less. This is because when the averages are calculated the
high seasons must cancel out the low seasons. In this example, the periodicity is 3 time periods
so the centred moving average values must be calculated from 3 sequential data points.
2. Calculate the average of 3 sequential y values and put the answer in the centre row in the
CMA column of Table 6. For example, the first three points are 577.2, 510.2 and 442.8, the
average is 510.1 and this value goes into the row t=2.
3. Divide the y values by the CMA values and put the answers into the y/CMA column of Table 6,
which represents the seasonality index. Take the average of all the values representing one
particular season, e.g. rows 4, 7, 10, 13 and 16, and put this average value (1.09) into the average
seasonality column for every time period in which this season appears, i.e. rows 4, 7, 10, 13 and 16.
Do the same for the other 2 seasons (their averages are 1.05 and 0.86).
4. In order to determine the trend line (x) plot a graph of CMA against t as shown below.
[Figure: CMA plotted against time period (t = 1 to 18); vertical axis from 500 to 600]
This shows the upward trend without the complication of the seasonality. Decide what trend
equation this trend is best represented by (linear, binomial, exponential, logistic) then determine
[Figure: CMA plotted against time period (t = 1 to 18); vertical axis from 500 to 600]
The best straight line is represented by the linear equation: x = 496.6 + 5.115t
Now, either calculate the values of x from the equation or read off their values from the straight
line graph, and put these values into the (x estimated from graph) column of Table 6.
5. The values of s and r in Table 6 may be determined using the equations stated above.
6. In order to forecast the sales values for periods t = 19, 20 and 21, first determine the
underlying trend value (x), either by calculation from the trend equation with t = 19, 20 and 21
or by extrapolation and reading off the values from the graph.
Multiply each x value by its respective seasonality index (In) to introduce the seasonality element.
The result is the forecast value, +/- the standard deviation determined from the
residuals r.
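Steps 4 and 6 can be sketched together in Python. The first part fits an ordinary least-squares straight line to the CMA values; because only the values for t = 2 to 12 appear in Table 6, the fitted coefficients will be close to, but not the same as, the guide's x = 496.6 + 5.115t. The second part forecasts t = 19, 20 and 21 using the guide's own trend equation and seasonal indices. The function names and the mapping of seasons to `t % 3` are assumptions made for this illustration.

```python
def fit_line(ts, values):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(ts)
    t_mean = sum(ts) / n
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, values))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sxy / sxx
    return v_mean - slope * t_mean, slope


def forecast(t, intercept, slope, indices, period=3):
    """Trend value at t multiplied by the seasonal index for that season."""
    return (intercept + slope * t) * indices[t % period]


# Step 4: fit a trend line to the CMA values available in Table 6 (t = 2..12).
cma_t = list(range(2, 13))
cma = [510.1, 504.0, 512.8, 522.2, 531.1, 548.2,
       543.6, 549.1, 541.4, 555.4, 552.4]
fitted_intercept, fitted_slope = fit_line(cma_t, cma)
print("fitted trend: x = %.1f + %.3f t" % (fitted_intercept, fitted_slope))

# Step 6: forecast with the guide's trend equation x = 496.6 + 5.115t.
# Indices keyed by t % 3: t = 1, 4, 7, ... -> 1.09; t = 2, 5, 8, ... -> 1.05;
# t = 3, 6, 9, ... -> 0.86.
INDICES = {1: 1.09, 2: 1.05, 0: 0.86}
forecasts = {t: forecast(t, 496.6, 5.115, INDICES) for t in (19, 20, 21)}
for t, value in forecasts.items():
    print(t, round(value, 1))
```

Each forecast would then be quoted +/- the standard deviation of the residuals r, as described in step 6.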