STATIONARITY
Basic definitions
Time series – A time series $(x_1, \ldots, x_T)$ is assumed to be a sequence of real values taken at successive equally spaced points in time, from time $t = 1$ to time $t = T$.
Lag – A fixed amount of passing time. One set of observations in a time series is plotted (lagged) against a second, later set of data. For some specific time point $t$, the observation $x_{t-k}$ ($k$ time points before time $t$) is called the $k$th lag of $x_t$.
For example,
$\mathrm{Lag}_1(Y_2) = Y_1$ and $\mathrm{Lag}_5(Y_9) = Y_4$
Probability space – A probability space is a triple $(\Omega, \mathcal{F}, P)$, where:
$\Omega$ – sample space (a non-empty set)
$\mathcal{F}$ – a $\sigma$-algebra (sigma-algebra) of subsets of $\Omega$. A $\sigma$-algebra is a collection of subsets of $\Omega$ that contains $\Omega$ itself and is closed under complements and countable unions (all unions, complements, complements of unions, unions of complements, etc.)
$P$ – a probability measure, defined for all members of $\mathcal{F}$
Random variable – A real random variable (or real stochastic variable) on $(\Omega, \mathcal{F}, P)$ is a function $X : \Omega \to \mathbb{R}$ such that the inverse image of any interval $(-\infty, a)$ belongs to $\mathcal{F}$, i.e. a measurable function.
Stochastic process – A real stochastic process is a family of real random variables $X = \{x_t(\omega) : t \in T\}$, all defined on the same probability space $(\Omega, \mathcal{F}, P)$. The set $T$ is called the index set of the process, and if $T$ is an interval of $\mathbb{R}$, then the process is called a continuous stochastic process.
Definitions of stationarity
Stationarity means that the statistical properties of the process do not change over
time.
A time series is stationary if there is no systematic change in mean (no trend), if
there is no systematic change in variance and if strictly periodic variations have
been removed.
A time series is said to be strictly stationary if the probability structure of the process does not change with time, i.e. if the joint distribution of $X_{t_1}, X_{t_2}, \ldots, X_{t_n}$ is the same as the joint distribution of $X_{t_1+h}, X_{t_2+h}, \ldots, X_{t_n+h}$ for all time points $t_1, \ldots, t_n$, where $h$ is the distance (shift) between the observations.
For examples of time series, see Figures 1 and 2 below:
Figure 1: Sales of a Pharmaceutical Product
Weekly sales of a generic pharmaceutical product shown in Figure 1 appear to be constant over time, an example of a stationary time series.
Figure 2: Chemical Process Viscosity Readings
The viscosity readings plotted in Figure 2 exhibit autocorrelated behavior, that is, a value above the long-run average tends to be followed by other values above the average, while a value below the average tends to be followed by other values below the average.
A strictly stationary time series is one for which the probabilistic behavior of every collection of values $\{X_{t_1}, X_{t_2}, \ldots, X_{t_k}\}$ is identical to that of the time-shifted set $\{X_{t_1+h}, X_{t_2+h}, \ldots, X_{t_k+h}\}$, i.e.
$$P(x_{t_1} \le c_1, \ldots, x_{t_k} \le c_k) = P(x_{t_1+h} \le c_1, \ldots, x_{t_k+h} \le c_k) \qquad (*)$$
for all $k = 1, 2, \ldots$, all time points $t_1, t_2, \ldots, t_k$, all numbers $c_1, c_2, \ldots, c_k$, and all time shifts $h = 0, \pm 1, \pm 2, \ldots$
If a time series is strictly stationary, then all the multivariate distribution functions for subsets of variables must agree with their counterparts in the shifted set for all values of the shift parameter $h$.
e.g. if $k = 1$, then from equation (*),
$$P(x_s \le c) = P(x_t \le c) \qquad (**)$$
for any time points $s$ and $t$.
This implies for example, that the probability that the value of a time series sampled hourly is
negative at 1 am is the same as at 10am.
Therefore, if the mean function $\mu_t$ of the series $x_t$ exists, equation (**) implies that $\mu_s = \mu_t$ for all $s$ and $t$, i.e. $\mu_t$ is constant.
Stationarity implies a type of statistical equilibrium or stability in the data. Consequently, the time series has a constant mean defined as
$$\mu_y = E[y] = \int_{-\infty}^{\infty} y f(y)\,dy$$
and a constant variance defined as
$$\sigma_y^2 = \operatorname{var}(y) = \int_{-\infty}^{\infty} (y - \mu_y)^2 f(y)\,dy$$
The sample mean and sample variance are used to estimate these parameters. If the observations in the time series are $y_1, y_2, \ldots, y_T$, then the sample mean is
$$\bar{y} = \hat{\mu}_y = \frac{1}{T}\sum_{t=1}^{T} y_t$$
and the sample variance is
$$s^2 = \hat{\sigma}_y^2 = \frac{1}{T}\sum_{t=1}^{T} (y_t - \bar{y})^2$$
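As a minimal sketch of these two estimators in Python (assuming NumPy is available; the series y is hypothetical illustration data, not taken from the text):

```python
import numpy as np

# Hypothetical observed series y_1, ..., y_T (illustration only)
y = np.array([29.3, 30.1, 28.7, 29.9, 30.4, 29.5, 28.8, 30.2])
T = len(y)

y_bar = y.sum() / T                 # sample mean: (1/T) * sum of y_t
s2 = ((y - y_bar) ** 2).sum() / T   # sample variance with divisor T, as in the formula above

print(f"sample mean     = {y_bar:.4f}")
print(f"sample variance = {s2:.4f}")
```

Note that the divisor is $T$ rather than $T - 1$, matching the formula given above.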
If a time series is stationary, this means that the joint probability distribution of any two observations, say $Y_t$ and $Y_{t+k}$, is the same for any two time periods $t$ and $t+k$ that are separated by the same interval $k$. Useful information about this joint distribution, and hence about the nature of the time series, can be obtained by plotting a scatter diagram of all the data pairs $(Y_t, Y_{t+k})$ that are separated by the same interval $k$. The interval $k$ is called the lag.
See the plots below:
Figure 3: A Scatter Diagram for sales of a Product “A” at lag k=1
In Figure 3, the plotted pairs of adjacent observations $(Y_t, Y_{t+1})$ seem to be uncorrelated, i.e. the value of $Y$ in the current period does not provide any useful information about the value of $Y$ that will be observed in the next period.
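A lag scatter diagram like the one in Figure 3 can be sketched as follows (assuming NumPy and Matplotlib; the series y here is simulated white noise standing in for the sales data):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
y = rng.normal(size=200)       # simulated stand-in series; replace with real data

k = 1                          # lag
plt.scatter(y[:-k], y[k:])     # all pairs (Y_t, Y_{t+k})
plt.xlabel("$Y_t$")
plt.ylabel("$Y_{t+k}$")
plt.title(f"Lag plot at k = {k}")
plt.show()
```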
The covariance between $Y_t$ and its value at another period, say $Y_{t+k}$, is called the autocovariance at lag $k$, defined by
$$\gamma_k = \operatorname{cov}(Y_t, Y_{t+k}) = E[(Y_t - \mu)(Y_{t+k} - \mu)]$$
The collection of the values $\gamma_k$, $k = 0, 1, 2, \ldots$, is called the autocovariance function.
Remark: The autocovariance at lag $k = 0$ is just the variance of the time series, i.e. $\gamma_0 = \sigma_y^2$.
Figure 4: A Scatter Diagram for sales of a Product “B” at lag k=1
In Figure 4, we observe that the pairs of adjacent observations $(Y_t, Y_{t+1})$ are positively correlated, i.e. a small value of $Y$ tends to be followed in the next time period by another small value of $Y$, and a large value of $Y$ tends to be followed by another large value of $Y$.
The autocorrelation coefficient at lag $k$ is given by
$$\rho_k = \frac{E[(Y_t - \mu)(Y_{t+k} - \mu)]}{\sqrt{E[(Y_t - \mu)^2]\,E[(Y_{t+k} - \mu)^2]}} = \frac{\operatorname{cov}(Y_t, Y_{t+k})}{\operatorname{var}(Y_t)} = \frac{\gamma_k}{\gamma_0}$$
The collection of the values $\rho_k$, $k = 0, 1, 2, \ldots$, is called the autocorrelation function (ACF).
Remark:
By definition $\rho_0 = 1$. Also, the ACF is independent of the scale of measurement of the time series, so it is a dimensionless quantity. Furthermore, $\rho_k = \rho_{-k}$, i.e. the autocorrelation function is symmetric around zero, so it is only necessary to compute the positive (or negative) half.
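A minimal numerical illustration of these properties (assuming NumPy; the helper r anticipates the sample ACF estimator defined formally later in this section): $r_0$ is exactly 1, and rescaling the series leaves the ACF unchanged, confirming it is dimensionless.

```python
import numpy as np

def r(y, k):
    """Sample autocorrelation at lag k (estimator defined formally below)."""
    T, y_bar = len(y), y.mean()
    return np.sum((y[:T - k] - y_bar) * (y[k:] - y_bar)) / np.sum((y - y_bar) ** 2)

rng = np.random.default_rng(2)
y = rng.normal(size=500)       # simulated illustration series

print(r(y, 0))                 # exactly 1, since rho_0 = gamma_0 / gamma_0
print(r(y, 3))                 # some lag-3 value
print(r(100 * y, 3))           # unchanged: the ACF is scale-free (dimensionless)
```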
If a time series has a finite mean and autocovariance function, it is said to be second-order stationary (or weakly stationary of order 2). If, in addition, the joint probability distribution of the observations at all times is multivariate normal, then that is sufficient for the time series to be strictly stationary.
A weakly stationary time series $X_t$ is a finite-variance process in which:
i. the mean value function $\mu_t$ is constant and does not depend on the time $t$;
ii. the autocovariance function $\gamma(s, t)$ depends on $s$ and $t$ only through their difference $(s - t)$.
A process is called second-order stationary in the weak sense if its mean is constant and its autocovariance is independent of time, depending only on the distance $h$ between the variables, not on time itself.
Remarks:
1. Henceforth we will be considering stationarity in the weak sense
2. Most of the probability theory of time series is concerned with stationary time series, and for this reason time series analysis will often require one to transform a non-stationary time series into a stationary one so as to use this theory.
For a stationary stochastic process the mean (first moment) is constant and can be assumed to be zero without loss of generality, and the second moment is the variance. Therefore the theory of a stationary stochastic process, or time series, is essentially the theory of its correlation. The theoretical correlation expresses the dependence of time series observations on each other. This dependence can also be expressed by a regression model which represents the present observation as the sum of uncorrelated (orthogonal) parts: one that depends on the preceding observation and one independent error term.
E.g. consider $\{x_t, x_{t-1}, x_{t-2}, x_{t-3}, \ldots, x_0\}$.
In such a stochastic process, $x_t$ depends only on $x_{t-1}$ and loses memory of earlier history. The relationship between $x_t$ and $x_{t-1}$ can be represented as
$$x_t = \phi x_{t-1} + e_t \qquad (1)$$
where
$x_t$ = observation at time $t$
$x_{t-1}$ = observation at time $t-1$
$\phi$ = constant
$e_t$ = sequence of uncorrelated variables (error terms)
Since (1) expresses the dependence or regression of $x_t$ on its past values, it is called an autoregressive model.
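A short simulation sketch of model (1) in Python (assuming NumPy; the value $\phi = 0.7$ is an illustrative choice, with $|\phi| < 1$ so the simulated process is stationary):

```python
import numpy as np

# Simulate the autoregressive model x_t = phi * x_{t-1} + e_t  (equation (1)).
# phi = 0.7 is an illustrative choice; |phi| < 1 keeps the process stationary.
rng = np.random.default_rng(42)
phi, n = 0.7, 300
e = rng.normal(size=n)          # uncorrelated error terms
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + e[t]

print(np.corrcoef(x[:-1], x[1:])[0, 1])   # lag-1 correlation, roughly phi
```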
It is necessary to estimate the autocovariance and autocorrelation functions from a time series of finite length, say $y_1, y_2, \ldots, y_T$. The usual estimate of the autocovariance function is
$$c_k = \hat{\gamma}_k = \frac{1}{T}\sum_{t=1}^{T-k} (y_t - \bar{y})(y_{t+k} - \bar{y}), \qquad k = 0, 1, 2, \ldots, K$$
and the autocorrelation function is estimated by the sample autocorrelation function (or sample ACF)
$$r_k = \hat{\rho}_k = \frac{c_k}{c_0}, \qquad k = 0, 1, 2, \ldots, K$$
Remark:
At least 50 observations are required to give a reliable estimate of the ACF, and the individual sample autocorrelations should be calculated up to lag $K$, where $K$ is about $T/4$.
Often we will need to determine if the autocorrelation coefficient at a particular lag is zero. This can be done by comparing the sample autocorrelation coefficient at lag $k$, $r_k$, to its standard error. If we make the assumption that the true value of the autocorrelation coefficient at lag $k$ is zero ($\rho_k = 0$), then the variance of the sample autocorrelation coefficient is
$$\operatorname{var}(r_k) \approx \frac{1}{T}$$
and the standard error is
$$\operatorname{se}(r_k) \approx \frac{1}{\sqrt{T}}$$
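Putting the estimators together, a hedged Python sketch (assuming NumPy) that computes $r_k$ up to lag $T/4$ and flags coefficients larger in magnitude than twice the standard error, a common rule of thumb for judging whether $\rho_k = 0$ is plausible:

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample ACF r_k = c_k / c_0, using the estimators given above."""
    T, y_bar = len(y), y.mean()
    c0 = np.sum((y - y_bar) ** 2) / T
    return np.array([np.sum((y[:T - k] - y_bar) * (y[k:] - y_bar)) / (T * c0)
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(7)
y = rng.normal(size=100)          # hypothetical white-noise series, T = 100
K = len(y) // 4                   # compute up to lag T/4, as the remark suggests
r = sample_acf(y, K)
se = 1 / np.sqrt(len(y))          # standard error under rho_k = 0

for k in range(1, K + 1):
    flag = "  <- exceeds 2*se" if abs(r[k]) > 2 * se else ""
    print(f"lag {k:2d}: r_k = {r[k]: .3f}{flag}")
```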
Example:
The data below shows chemical process viscosity readings for a manufacturing company.
Calculate the
i) ACF at lag k=1
ii) The Autocovariance Function at lag k=1 and k=2
Solution:
i) Recall that:
$$r_k = \hat{\rho}_k = \frac{c_k}{c_0}, \qquad k = 0, 1, 2, \ldots, K$$
ii) Left as an assignment
NON-STATIONARITY.
A time series that exhibits a trend is a non-stationary time series.
Non-stationary data are unpredictable and cannot be modeled or forecast directly. The results estimated by using non-stationary time series may be spurious (false, fake) in that they may indicate a relationship between two variables where none exists.
A non-stationary process has a variable variance and a mean that does not remain near, or return to, a long-run mean over time.
Types of non-stationary Processes
1. Random walk with or without a drift (slow steady change)
2. Deterministic trends – trends that are constant, positive or negative, and independent of time for the whole life of the series.
A random walk predicts that the value at time $t$ will be equal to the last period's value plus a stochastic component that is white noise, meaning $e_t$ is iid with mean 0 and variance $\sigma^2$:
$$Y_t = Y_{t-1} + e_t$$
The variance of a random walk evolves over time and goes to infinity as $t$ grows, so the randomness cannot be predicted.
A random walk with a drift, i.e. $Y_t = a + Y_{t-1} + e_t$, is a random walk in which the value at time $t$ equals the last period's value plus a constant term $a$. A random walk with a drift is often confused with a deterministic trend $Y_t = a + bt + e_t$, because both include a drift and a white noise component, but the value at time $t$ in the case of a random walk is regressed on the last period's value ($Y_{t-1}$), while in the case of a deterministic trend it is regressed on a time trend $t$.
A non-stationary process with a deterministic trend has a mean that grows around a fixed trend, which is constant and independent of time.
A random walk with or without a drift can be transformed to a stationary process by
I. The process of differencing
Let $Y_t = a + bt + e_t$ (the deterministic-trend case; the random-walk case is handled the same way). Then
$$\nabla Y_t = Y_t - Y_{t-1} = (a + bt + e_t) - (a + b(t-1) + e_{t-1}) = b + e_t - e_{t-1}$$
Then
$$E[\nabla Y_t] = b$$
which is a constant, and
$$\operatorname{var}(\nabla Y_t) = \operatorname{var}(e_t) + \operatorname{var}(e_{t-1}) = \sigma^2 + \sigma^2 = 2\sigma^2$$
which is also a constant.
Therefore $\nabla Y_t = Y_t - Y_{t-1}$ is stationary, because its mean and variance are constant.
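The derivation above differences the deterministic-trend model; the sketch below (assuming NumPy; drift $a = 0.5$ and unit error variance are illustrative choices) applies the same operation to a random walk with drift $Y_t = a + Y_{t-1} + e_t$, whose first difference $a + e_t$ likewise has constant mean and variance:

```python
import numpy as np

# Differencing a random walk with drift, Y_t = a + Y_{t-1} + e_t.
# The first difference a + e_t should have constant mean (≈ a) and variance (≈ 1).
rng = np.random.default_rng(3)
n, a = 1000, 0.5
e = rng.normal(size=n)
y = a * np.arange(1, n + 1) + np.cumsum(e)   # Y_t = a*t + e_1 + ... + e_t

dy = np.diff(y)                              # first difference: Y_t - Y_{t-1} = a + e_t
print(dy.mean(), dy.var())                   # ≈ 0.5 and ≈ 1
```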
Remark:
A stochastic process $\{e_t : t = 0, 1, 2, \ldots\}$ is called a white noise process if it is a sequence of independent and identically distributed (iid) random variables with
$$E[e_t] = \mu_e, \qquad \operatorname{var}[e_t] = \sigma_e^2$$
Both $\mu_e$ and $\sigma_e^2$ are constants (free of $t$).
A time series process $Y_t$ generally contains two different types of variation:
1. Systematic variation (that we would like to capture and model, e.g. trend, seasonality)
2. Random variation (inherent background noise in the process)
Suppose that $\{e_t\}$ is a zero-mean white noise process with $\operatorname{var}[e_t] = \sigma_e^2$. Define
$Y_1 = e_1$
$Y_2 = e_1 + e_2$
$Y_3 = e_1 + e_2 + e_3$
$\vdots$
$Y_n = e_1 + e_2 + \cdots + e_n$
By this definition, note that we can write, for $t \ge 1$ (with $Y_0 = 0$),
$$Y_t = Y_{t-1} + e_t, \qquad \text{where } E[e_t] = 0 \text{ and } \operatorname{var}[e_t] = \sigma_e^2$$
The process $\{Y_t\}$ is called a random walk process.
Random walk processes are used to model stock prices, movements of molecules in gases and liquids, animal locations, etc.
The mean of $\{Y_t\}$ is given by
$$\mu_t = E[Y_t] = E[e_1 + e_2 + \cdots + e_t] = E[e_1] + E[e_2] + \cdots + E[e_t] = 0$$
This implies that $\{Y_t\}$ is a zero-mean process.
The variance of $\{Y_t\}$ is given by
$$\operatorname{var}[Y_t] = \operatorname{var}[e_1 + e_2 + \cdots + e_t] = \operatorname{var}[e_1] + \operatorname{var}[e_2] + \cdots + \operatorname{var}[e_t] = t\sigma_e^2$$
since $\operatorname{var}[e_1] = \operatorname{var}[e_2] = \cdots = \operatorname{var}[e_t] = \sigma_e^2$ and $\operatorname{cov}(e_t, e_s) = 0$ for $t \ne s$. Because $\operatorname{var}[Y_t] = t\sigma_e^2$ grows with $t$, the random walk is not stationary.
Autocovariance function:
For $t \le s$, the autocovariance of $Y_t$ and $Y_s$ is given by
$$\gamma_{t,s} = \operatorname{cov}(Y_t, Y_s) = \operatorname{cov}(e_1 + e_2 + \cdots + e_t,\ e_1 + e_2 + \cdots + e_t + e_{t+1} + \cdots + e_s)$$
$$= \sum_{i=1}^{t}\operatorname{cov}(e_i, e_i) + \sum_{i \ne j}\operatorname{cov}(e_i, e_j) = \sum_{i=1}^{t}\operatorname{var}(e_i) = \sigma_e^2 + \sigma_e^2 + \cdots + \sigma_e^2 = t\sigma_e^2$$
Since $\gamma_{t,s} = \gamma_{s,t}$, the autocovariance function for a random walk process is
$$\gamma_{t,s} = t\sigma_e^2, \qquad \text{for } 1 \le t \le s$$
Autocorrelation function:
For $1 \le t \le s$, the autocorrelation function for a random walk process is given by
$$\rho_{t,s} = \operatorname{corr}(Y_t, Y_s) = \frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\,\gamma_{s,s}}} = \frac{t\sigma_e^2}{\sqrt{t\sigma_e^2 \cdot s\sigma_e^2}} = \sqrt{\frac{t}{s}}$$
Remark:
1. When $t$ is close to $s$, the autocorrelation $\rho_{t,s}$ is close to 1, i.e. the two observations $Y_t$ and $Y_s$ are close together in time and are likely to be close together in value, especially when $t$ and $s$ are both large.
2. When $t$ is far from $s$, i.e. when the two points $Y_t$ and $Y_s$ are far apart in time, the autocorrelation is closer to zero (0).
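These two remarks can be checked by simulation (assuming NumPy; the path count and time points are illustrative): averaging over many random-walk paths should reproduce $\operatorname{var}[Y_t] \approx t\sigma_e^2$ and $\operatorname{corr}(Y_t, Y_s) \approx \sqrt{t/s}$.

```python
import numpy as np

# Monte Carlo check of the random-walk results: var[Y_t] = t * sigma_e^2 and
# corr(Y_t, Y_s) = sqrt(t/s) for t <= s, here with sigma_e^2 = 1.
rng = np.random.default_rng(11)
n_paths, n_steps = 20_000, 100
walks = np.cumsum(rng.normal(size=(n_paths, n_steps)), axis=1)  # Y_t = e_1 + ... + e_t

t, s = 25, 100                                 # 1-indexed time points
Yt, Ys = walks[:, t - 1], walks[:, s - 1]
print("var[Y_t] ≈", Yt.var(), "  theory:", t)
print("corr     ≈", np.corrcoef(Yt, Ys)[0, 1], "  theory:", np.sqrt(t / s))
```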
Example:
Suppose that $\{e_t\}$ is a zero-mean white noise process with $\operatorname{var}(e_t) = \sigma_e^2$. Define
$$Y_t = \frac{1}{3}(e_t + e_{t-1} + e_{t-2})$$
i.e. $Y_t$ is a moving average of the white noise process (averaged over the most recent three time periods).
Find the
i) Mean function
ii) Variance function
iii) Autocovariance function
Solution:
i) $\mu_t = E(Y_t) = \frac{1}{3}E(e_t + e_{t-1} + e_{t-2}) = \frac{1}{3}\left[E(e_t) + E(e_{t-1}) + E(e_{t-2})\right] = 0$
Since $\{e_t\}$ is a zero-mean process, $Y_t$ is also a zero-mean process.
ii) $\operatorname{var}(Y_t) = \operatorname{var}\left(\frac{1}{3}(e_t + e_{t-1} + e_{t-2})\right) = \frac{1}{9}\left[\operatorname{var}(e_t) + \operatorname{var}(e_{t-1}) + \operatorname{var}(e_{t-2})\right] = \frac{3}{9}\sigma_e^2 = \frac{\sigma_e^2}{3}$
since $\operatorname{var}(e_t) = \sigma_e^2$ for all $t$ and $e_t$, $e_{t-1}$, $e_{t-2}$ are independent (all covariance terms are zero).
iii) Case 1: $s = t$
$$\gamma_{t,s} = \operatorname{cov}(Y_t, Y_s) = \operatorname{var}(Y_t) = \frac{\sigma_e^2}{3}$$
Case 2: $s = t + 1$
$$\gamma_{t,t+1} = \operatorname{cov}(Y_t, Y_{t+1}) = \operatorname{cov}\left(\frac{1}{3}(e_t + e_{t-1} + e_{t-2}),\ \frac{1}{3}(e_{t+1} + e_t + e_{t-1})\right)$$
$$= \frac{1}{9}\left[\operatorname{cov}(e_t, e_t) + \operatorname{cov}(e_{t-1}, e_{t-1})\right] = \frac{1}{9}\left[\operatorname{var}(e_t) + \operatorname{var}(e_{t-1})\right] = \frac{2\sigma_e^2}{9}$$
Case 3: $s = t + 2$
$$\gamma_{t,t+2} = \operatorname{cov}(Y_t, Y_{t+2}) = \operatorname{cov}\left(\frac{1}{3}(e_t + e_{t-1} + e_{t-2}),\ \frac{1}{3}(e_{t+2} + e_{t+1} + e_t)\right)$$
$$= \frac{1}{9}\operatorname{cov}(e_t, e_t) = \frac{1}{9}\operatorname{var}(e_t) = \frac{\sigma_e^2}{9}$$
Case 4: $|s - t| > 2$
$$\gamma_{t,s} = 0$$
since $Y_t$ and $Y_s$ will have no common white noise error terms; by symmetry $\gamma_{t,s} = \gamma_{s,t}$, so the cases with $s < t$ follow as well.
The autocovariance function can be written as
$$\gamma_{t,s} = \begin{cases} \dfrac{\sigma_e^2}{3}, & |t - s| = 0 \\[4pt] \dfrac{2\sigma_e^2}{9}, & |t - s| = 1 \\[4pt] \dfrac{\sigma_e^2}{9}, & |t - s| = 2 \\[4pt] 0, & |t - s| > 2 \end{cases}$$
The autocorrelation function is given by
$$\rho_{t,s} = \operatorname{corr}(Y_t, Y_s) = \frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\,\gamma_{s,s}}}$$
Since $\gamma_{t,t} = \gamma_{s,s} = \dfrac{\sigma_e^2}{3}$, the autocorrelation function for this process is
$$\rho_{t,s} = \begin{cases} 1, & |t - s| = 0 \\[2pt] \dfrac{2}{3}, & |t - s| = 1 \\[2pt] \dfrac{1}{3}, & |t - s| = 2 \\[2pt] 0, & |t - s| > 2 \end{cases} \qquad \text{(Show this)}$$
Remarks:
Observations $Y_t$ and $Y_s$ that are one unit apart in time have the same autocorrelation regardless of the values of $t$ and $s$.
Observations $Y_t$ and $Y_s$ that are two units apart in time have the same autocorrelation regardless of the values of $t$ and $s$.
Observations $Y_t$ and $Y_s$ that are more than two units apart in time are uncorrelated.
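As a closing check (assuming NumPy; the sample size is an illustrative choice), simulating the three-point moving average should reproduce the autocorrelations $1$, $\tfrac{2}{3}$, $\tfrac{1}{3}$ and $0$:

```python
import numpy as np

# Simulation check of the moving average Y_t = (e_t + e_{t-1} + e_{t-2}) / 3.
# Theory: rho = 1, 2/3, 1/3 at |t - s| = 0, 1, 2 and 0 beyond.
rng = np.random.default_rng(5)
e = rng.normal(size=100_000)
y = (e[2:] + e[1:-1] + e[:-2]) / 3      # moving average of the white noise

for k in range(4):
    r = 1.0 if k == 0 else np.corrcoef(y[:-k], y[k:])[0, 1]
    print(f"|t - s| = {k}: simulated rho ≈ {r: .3f}")
```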