Introductory Econometrics for Finance
Chris Brooks
Solutions to Review Questions - Chapter 9
1. (a) A number of stylised features of financial data have been suggested at the start of Chapter 9 and in other places throughout the book:

- Frequency: Stock market prices are measured every time there is a trade or somebody posts a new quote, so often the frequency of the data is very high.
- Non-stationarity: Financial data (asset prices) are covariance non-stationary; but if we assume that we are talking about returns from here on, then we can validly consider them to be stationary.
- Linear independence: They typically have little evidence of linear (autoregressive) dependence, especially at low frequency.
- Non-normality: They are not normally distributed – they are fat-tailed.
- Volatility pooling and asymmetries in volatility: The returns exhibit volatility clustering and leverage effects.
Of these, we can allow for the non-stationarity within the linear (ARIMA)
framework, and we can use whatever frequency of data we like to form the
models, but we cannot hope to capture the other features using a linear
model with Gaussian disturbances.
(b) GARCH models are designed to capture the volatility clustering effects in
the returns (GARCH(1,1) can model the dependence in the squared returns, or
squared residuals), and they can also capture some of the unconditional
leptokurtosis, so that even if the residuals of a linear model of the form given
by the first part of the equation in part (e), the û t ’s, are leptokurtic, the
standardised residuals from the GARCH estimation are likely to be less
leptokurtic. Standard GARCH models cannot, however, account for leverage
effects.
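To illustrate the point about leptokurtosis numerically, here is a minimal sketch (not from the book; all parameter values are illustrative assumptions) that simulates a GARCH(1,1) process and compares the excess kurtosis of the simulated disturbances with that of the standardised residuals:

```python
# Minimal sketch: simulate a GARCH(1,1) process and show that the raw
# disturbances u_t are fat-tailed while the standardised residuals
# u_t / sqrt(h_t) are much closer to normal. Parameters are hypothetical.
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(42)
T = 100_000
alpha0, alpha1, beta = 0.0001, 0.10, 0.85  # assumed values; 0.10 + 0.85 < 1

h = np.empty(T)                            # conditional variance h_t
u = np.empty(T)                            # disturbances u_t = sqrt(h_t) * z_t
z = rng.standard_normal(T)
h[0] = alpha0 / (1 - alpha1 - beta)        # start at the unconditional variance
u[0] = np.sqrt(h[0]) * z[0]
for t in range(1, T):
    h[t] = alpha0 + alpha1 * u[t-1]**2 + beta * h[t-1]
    u[t] = np.sqrt(h[t]) * z[t]

# Excess kurtosis: clearly positive (fat tails) for u_t, near zero for the
# standardised residuals u_t / sqrt(h_t).
print("excess kurtosis of u_t:            ", kurtosis(u))
print("excess kurtosis of u_t / sqrt(h_t):", kurtosis(u / np.sqrt(h)))
```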
(c) GARCH(1,1) goes some way to get around these problems. The GARCH(1,1) model has only three parameters in the conditional variance equation, compared to q+1 for the ARCH(q) model, so it is more parsimonious. Since there are fewer parameters than in a typical qth-order ARCH model, it is less likely that the estimated values of one or more of these three parameters would be negative than for all q+1 parameters. Also, the GARCH(1,1) model can usually still capture all of the significant dependence in the squared returns, since it is possible to write the GARCH(1,1) model as an ARCH(∞), so that lags of the squared residuals back into the infinite past help to explain the current value of the conditional variance, h_t. A numerical check of this representation is sketched below.
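Back-substituting for h_{t-1} repeatedly gives h_t = α₀/(1 − β) + α₁ Σ_{i≥1} β^{i−1} u²_{t−i}. The following sketch (my own illustration, with assumed parameter values) computes the conditional variance both ways and confirms they agree:

```python
# Numerical check (assumed parameters): GARCH(1,1) written as an ARCH(infinity),
#   h_t = alpha0/(1 - beta) + alpha1 * sum_{i>=1} beta^(i-1) * u_{t-i}^2 .
import numpy as np

rng = np.random.default_rng(0)
alpha0, alpha1, beta = 0.0001, 0.10, 0.85
u2 = rng.standard_normal(2000) ** 2 * 0.01     # simulated squared residuals

# Conditional variance from the GARCH(1,1) recursion
h = alpha0 / (1 - alpha1 - beta)
for t in range(len(u2)):
    h = alpha0 + alpha1 * u2[t] + beta * h

# The same value from the (truncated) ARCH(infinity) representation
lags = u2[::-1][:1000]                          # most recent squared residual first
weights = alpha1 * beta ** np.arange(1000)      # declining weights alpha1 * beta^(i-1)
h_inf = alpha0 / (1 - beta) + weights @ lags

print(h, h_inf)                                 # agree to many decimal places
```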
(d) There are a number of models that you could choose from; the relevant ones discussed in Chapter 9 include EGARCH, GJR and GARCH-M.
The first two of these are designed to capture leverage effects. These are
asymmetries in the response of volatility to positive or negative returns. The
standard GARCH model cannot capture these, since we are squaring the
lagged error term, and we are therefore losing its sign.
The conditional variance equations for the EGARCH and GJR models are respectively:

\[ \ln(\sigma_t^2) = \omega + \beta \ln(\sigma_{t-1}^2) + \gamma \frac{u_{t-1}}{\sqrt{\sigma_{t-1}^2}} + \alpha \left[ \frac{|u_{t-1}|}{\sqrt{\sigma_{t-1}^2}} - \sqrt{\frac{2}{\pi}} \right] \]
and
\[ \sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1} \]

where I_{t-1} = 1 if u_{t-1} < 0, and I_{t-1} = 0 otherwise.
For a leverage effect, we would see γ > 0 in the GJR model and γ < 0 in the EGARCH model.
The EGARCH model also has the added benefit that it is expressed in
terms of the log of the conditional variance, so that even if the parameters are negative, the
conditional variance will always be positive. We do not therefore have to
artificially impose non-negativity constraints.
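As a quick illustration of the asymmetry the GJR model introduces, the sketch below (hypothetical parameter values throughout) updates the conditional variance for a positive and a negative shock of the same size:

```python
# Toy illustration (assumed parameters, not values from the question) of why
# the GJR model captures leverage effects: a negative shock raises next
# period's conditional variance by more than a positive shock of equal size.
alpha0, alpha1, beta, gamma = 0.0001, 0.05, 0.85, 0.10
h_prev = 0.0004  # hypothetical lagged conditional variance

def gjr_variance(u_prev: float) -> float:
    """sigma_t^2 = alpha0 + alpha1*u_{t-1}^2 + beta*sigma_{t-1}^2
                   + gamma*u_{t-1}^2*I(u_{t-1} < 0)"""
    indicator = 1.0 if u_prev < 0 else 0.0
    return alpha0 + alpha1 * u_prev**2 + beta * h_prev + gamma * u_prev**2 * indicator

print(gjr_variance(+0.02))  # positive shock
print(gjr_variance(-0.02))  # same-sized negative shock: larger variance
```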
Finally, the GARCH-M model allows the risk (measured by σ_{t-1}, the lagged conditional standard deviation) of the asset to influence the return, so that we expect a positive coefficient for δ. Note that some authors use σ_t (i.e. a contemporaneous term).
(e) Since y_t are returns, we would expect their mean value (which will be given by μ) to be positive and small. We are not told the frequency of the data, but suppose that we had a year of daily returns data; then μ would be the average daily percentage return over the year, which might be, say, 0.05 (percent). We would expect the value of α₀ again to be small, say 0.0001, or something of that order. The unconditional variance of the disturbances would be given by α₀/(1 − (α₁ + α₂)). Typical values for α₁ and α₂ would be 0.15 and 0.8 respectively. The important thing is that all three alphas must be positive, and the sum of α₁ and α₂ would be expected to be less than, but close to, unity, with α₂ > α₁.
(f) Since the model was estimated using maximum likelihood, it does not
seem natural to test this restriction using the F-test via comparisons of
residual sums of squares (and a t-test cannot be used since it is a test
involving more than one coefficient). Thus we should use one of the
approaches to hypothesis testing based on the principles of maximum
likelihood (Wald, Lagrange Multiplier, Likelihood Ratio). The easiest one to use
would be the likelihood ratio test, which would be computed as follows:
1. Estimate the unrestricted model and obtain the maximised value of the log-
likelihood function.
2. Impose the restriction by rearranging the model, and estimate the restricted model, again obtaining the value of the likelihood at the new optimum. Note that this value of the LLF is likely to be lower than the unconstrained maximum.
3. Then form the likelihood ratio test statistic, given by

\[ LR = -2(\ln L_r - \ln L_u) \sim \chi^2(m) \]

where ln L_r and ln L_u are the maximised values of the log-likelihood function for the restricted and unrestricted models respectively, and m is the number of restrictions imposed. The statistic is then compared with the critical value from the χ² distribution, and the restriction is rejected if the test statistic exceeds the critical value.
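In code, the computation in step 3 amounts to the following sketch (the log-likelihood values and the number of restrictions below are placeholders, not figures from the question):

```python
# Minimal sketch of the LR test computation; the two LLF values and m are
# hypothetical placeholders standing in for estimation output.
from scipy.stats import chi2

llf_unrestricted = -1120.3   # maximised LLF, unrestricted model (placeholder)
llf_restricted = -1124.8     # maximised LLF, restricted model (placeholder)
m = 1                        # number of restrictions imposed (placeholder)

lr_stat = -2.0 * (llf_restricted - llf_unrestricted)
p_value = chi2.sf(lr_stat, df=m)
print(lr_stat, p_value)      # reject the restriction if p_value is small
```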
(g) Forecasts of the conditional variance from a GARCH(1,1) model are constructed recursively. The one-step ahead forecast made at time T is

\[ h^f_{1,T} = \alpha_0 + \alpha_1 u_T^2 + \beta h_T \]

and forecasts for longer horizons are obtained by substituting the previous forecast back into the conditional variance equation:
\[ h^f_{2,T} = \alpha_0 + (\alpha_1 + \beta) h^f_{1,T} \]

\[ h^f_{3,T} = \alpha_0 + (\alpha_1 + \beta) h^f_{2,T} \]
And so on. This is the method we could use to forecast the conditional variance of y_t. If y_t were, say, daily returns on the FTSE, we could use these volatility forecasts as an input to the Black-Scholes equation to help determine the appropriate price of FTSE index options.
(h) An s-step ahead forecast for the conditional variance could be written

\[ h^f_{s,T} = \alpha_0 \sum_{i=1}^{s-1} (\alpha_1 + \beta)^{i-1} + (\alpha_1 + \beta)^{s-1} h^f_{1,T} \qquad (x) \]
For the new value of β, the persistence of shocks to the conditional variance, given by (α₁ + β), is 0.1251 + 0.98 = 1.1051, which is bigger than 1. It is obvious from equation (x) that any value of (α₁ + β) bigger than one will lead the forecasts to explode. The forecasts will keep on increasing and will tend to infinity as the forecast horizon increases (i.e. as s increases). This is obviously an undesirable property of a forecasting model! This is called "non-stationarity in variance".

For (α₁ + β) < 1, the forecasts will converge on the unconditional variance as the forecast horizon increases. For (α₁ + β) = 1, known as "integrated GARCH" or IGARCH, there is a unit root in the conditional variance, and the forecasts will stay constant as the forecast horizon increases.
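The sketch below implements the recursion behind equation (x) and illustrates the three regimes; apart from the 0.1251 + 0.98 case from the question, the parameter values (α₀ and the one-step forecast h^f_{1,T}) are illustrative assumptions:

```python
# Sketch of multi-step GARCH(1,1) variance forecasts under three persistence
# regimes. alpha0 and h1 are hypothetical; 1.1051 is the value from the question.
import numpy as np

def garch_forecasts(alpha0, persistence, h1, steps):
    """s-step ahead forecasts from h^f_{s,T} = alpha0 + (alpha1+beta)*h^f_{s-1,T}."""
    h = [h1]
    for _ in range(steps - 1):
        h.append(alpha0 + persistence * h[-1])
    return np.array(h)

h1 = 0.0005  # hypothetical one-step ahead forecast
print(garch_forecasts(0.0001, 0.95, h1, 200)[-1])    # converges to 0.0001/(1-0.95)
print(garch_forecasts(0.0, 1.0, h1, 200)[-1])        # IGARCH (alpha0 = 0): stays at h1
print(garch_forecasts(0.0001, 1.1051, h1, 200)[-1])  # explodes as s increases
```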
2. (a) Maximum likelihood works by finding the most likely values of the parameters
given the actual data. More specifically, a log-likelihood function is formed, usually
based upon a normality assumption for the disturbance terms, and the values of
the parameters that maximise it are sought. Maximum likelihood estimation can
be employed to find parameter values for both linear and non-linear models.
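As a concrete sketch of how this works in practice (my own illustration, not the book's code), the following numerically maximises the Gaussian log-likelihood of a zero-mean GARCH(1,1) model; the placeholder series would be replaced by real returns:

```python
# Sketch: maximum likelihood estimation of a zero-mean GARCH(1,1) by
# numerically minimising the negative Gaussian log-likelihood.
import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(params, r):
    alpha0, alpha1, beta = params
    T = len(r)
    h = np.empty(T)
    h[0] = np.var(r)                      # initialise at the sample variance
    for t in range(1, T):
        h[t] = alpha0 + alpha1 * r[t-1]**2 + beta * h[t-1]
    # Gaussian LLF: -0.5 * sum( log(2*pi) + log(h_t) + r_t^2 / h_t )
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(h) + r**2 / h)

rng = np.random.default_rng(1)
r = rng.standard_normal(2000) * 0.01      # placeholder returns; use real data here

result = minimize(neg_log_likelihood, x0=[1e-5, 0.1, 0.8], args=(r,),
                  bounds=[(1e-8, None), (0, 1), (0, 1)], method="L-BFGS-B")
print(result.x)  # estimated (alpha0, alpha1, beta); interpretable with real returns
```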
(b) The three hypothesis testing procedures available within the maximum
likelihood approach are Lagrange multiplier (LM), likelihood ratio (LR) and Wald
tests. The differences between them are described in Figure 9.4, and are not
defined again here. The Lagrange multiplier test involves estimation only under the
null hypothesis, the likelihood ratio test involves estimation under both the null
and the alternative hypothesis, while the Wald test involves estimation only under
the alternative. Given this, it should be evident that the LM test will in many cases
be the simplest to compute since the restrictions implied by the null hypothesis
will usually lead to some terms cancelling out to give a simplified model relative to
the unrestricted model.
(c) OLS will give identical parameter estimates for all of the intercept and slope
parameters, but will give a slightly different parameter estimate for the variance of
the disturbances. These are shown in the Appendix to Chapter 9. The difference in
the OLS and maximum likelihood estimators for the variance of the disturbances
can be seen by comparing the divisors of equations (9A.25) and (9A.26).
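A quick numerical check of this divisor difference, using simulated data (an illustration of my own, not an example from the book):

```python
# OLS uses s^2 = RSS/(T - k), while the ML estimator of the disturbance
# variance uses RSS/T; both are computed below for a simple regression.
import numpy as np

rng = np.random.default_rng(7)
T, k = 100, 2
x = rng.standard_normal(T)
y = 0.5 + 1.5 * x + rng.standard_normal(T)       # true disturbance variance = 1

X = np.column_stack([np.ones(T), x])
beta_hat, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
rss = float(rss[0])

print(rss / (T - k))   # OLS estimator of the disturbance variance
print(rss / T)         # ML estimator: slightly smaller, biased in small samples
```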
3. (a) The unconditional variance of a random variable could be thought of, abusing
the terminology somewhat, as the variance without reference to a time index, or
rather the variance of the data taken as a whole, without conditioning on a
particular information set. The conditional variance, on the other hand, is the
variance of a random variable at a particular point in time, conditional upon a
particular information set. The variance of ut, t2 , conditional upon its previous
values, may be written t2 = Var(ut ut-1, ut-2,...) = E[(ut-E(ut))2 ut-1, ut-2,...], while
the unconditional variance would simply be Var(ut) = 2.
(b) Equation (9.120) is an equation showing that the variance of the disturbances is
not fixed over time, but rather varies systematically according to a GARCH process.
This is therefore an example of heteroscedasticity. Thus, the consequences if it
were present but ignored would be those described in Chapter 5. In summary, the
coefficient estimates would still be consistent and unbiased but not efficient. There
is therefore the possibility that the standard error estimates calculated using the
usual formulae would be incorrect leading to inappropriate inferences.
(c) There are of course a large number of competing methods for measuring and
forecasting volatility, and it is worth stating at the outset that no research has
suggested that one method is universally superior to all others, so that each
method has its merits and may work well in certain circumstances. Historical
measures of volatility are just simple average measures – for example, the
standard deviation of daily returns over a 3-year period. As such, they are the
simplest to calculate, but suffer from a number of shortcomings. First, since the
observations are unweighted, historical volatility can be slow to respond to
changing market circumstances, and would not take advantage of short-term
persistence in volatility that could lead to more accurate short-term forecasts.
Second, if there is an extreme event (e.g. a market crash), this will lead the
measured volatility to be high for a number of observations equal to the
measurement sample length. For example, suppose that volatility is being
measured using a 1-year (250-day) sample of returns, which is being rolled forward
one observation at a time to produce a series of 1-step ahead volatility forecasts. If
a market crash occurs on day t, this will increase the measured level of volatility by
the same amount right until day t+250 (i.e. it will not decay away) and then it will
disappear completely from the sample so that measured volatility will fall abruptly.
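A small simulation makes this second problem concrete (the crash date and crash-day return below are hypothetical):

```python
# Illustration: a single crash-day return keeps a rolling 250-day volatility
# measure elevated for exactly 250 observations, then drops out abruptly.
import numpy as np

rng = np.random.default_rng(3)
r = rng.standard_normal(1000) * 0.01
r[400] = -0.20                          # hypothetical market crash on day t = 400

window = 250
rolling_vol = np.array([r[t - window:t].std() for t in range(window, len(r))])
# rolling_vol jumps when day 400 enters the window, stays elevated for 250
# observations with equal weight, then falls abruptly when the crash leaves.
print(rolling_vol.max(), rolling_vol[-1])
```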
Exponential weighting of observations as the EWMA model does, where the
weight attached to each observation in the calculation of volatility declines
exponentially as the observations go further back in time, will resolve both of
these issues. However, if forecasts are produced from an EWMA model, these
forecasts will not converge upon the long-term mean volatility estimate as the
prediction horizon increases, and this may be undesirable (see part (a) of this
question). There is also the issue of how the λ parameter is calculated (see equation (9.5), although, of course, it can be estimated using maximum likelihood).
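For concreteness, here is a sketch of the EWMA recursion in the form σ_t² = λσ_{t−1}² + (1 − λ)r_{t−1}²; the value λ = 0.94 is a common convention rather than a figure from the book:

```python
# Sketch of the EWMA variance recursion; lambda = 0.94 is a conventional
# choice, and the return series is a placeholder.
import numpy as np

def ewma_variance(r, lam=0.94):
    var = np.empty(len(r))
    var[0] = r[:30].var()               # initialise from an early subsample
    for t in range(1, len(r)):
        var[t] = lam * var[t-1] + (1 - lam) * r[t-1] ** 2
    return var

rng = np.random.default_rng(5)
r = rng.standard_normal(1000) * 0.01    # placeholder returns
var = ewma_variance(r)
# Note: all multi-step forecasts from an EWMA equal the latest variance
# estimate, so they never converge back towards the long-term average.
print(var[-1])
```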
GARCH models overcome this problem with the forecasts as well, since a GARCH
model that is “stationary in variance” will have forecasts that converge upon the
long-term average as the horizon increases (see part (a) of this question). GARCH
models will also overcome the two problems with unweighted averages described
above. However, GARCH models are far more difficult to estimate than the other
two models, and sometimes, when estimation goes wrong, the resulting
parameter estimates can be nonsensical, leading to nonsensical forecasts as well.
Thus it is important to apply a “reality check” to estimated GARCH models to
ensure that the coefficient estimates are intuitively plausible.
Finally, implied volatility estimates are those derived from the prices of traded
options. The “market-implied” volatility forecasts are obtained by “backing out”
the volatility from the price of an option using an option pricing formula together
with an iterative search procedure. Financial market practitioners would probably
argue that implied forecasts of the future volatility of the underlying asset are
likely to be more accurate than those estimated from statistical models because
the people who work in financial markets know more about what is likely to
happen to those instruments in the future than econometricians do.
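A sketch of this "backing out" procedure under Black-Scholes assumptions follows; all of the option inputs below are hypothetical:

```python
# Sketch: imply the volatility from a European call price by finding the
# sigma that equates the Black-Scholes model price to the market price.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def bs_call(S, K, r, T, sigma):
    """Black-Scholes price of a European call option."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

S, K, r, T = 100.0, 105.0, 0.03, 0.5   # spot, strike, riskless rate, maturity (assumed)
market_price = 3.50                     # observed option price (hypothetical)

# One-dimensional root search for the implied volatility
implied_vol = brentq(lambda s: bs_call(S, K, r, T, s) - market_price, 1e-4, 5.0)
print(implied_vol)
```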
Also, an “inaccurate” volatility forecast implied from an option price may imply an
inaccurate option price and therefore the possibility of arbitrage opportunities.
However, the empirical evidence on the accuracy of implied versus statistical
forecasting models is mixed, and some research suggests that implied volatility
systematically over-estimates the true volatility of the underlying asset returns.
This may arise from the use of an incorrect option pricing formula to obtain the
implied volatility – for example, the Black-Scholes model assumes that the
volatility of the underlying asset is fixed (non-stochastic), and also that the returns
to the underlying asset are normally distributed. Both of these assumptions are at
best tenuous.
A further reason for the apparent failure of the implied model may be a
manifestation of the “peso problem”. This occurs when market practitioners
include in the information set that they use to price options the possibility of a very
extreme return that has a low probability of occurrence, but has important
ramifications for the price of the option due to its sheer size. If this event does not
occur in the sample period over which the implied and actual volatilities are
compared, the implied model will appear inaccurate. Yet this does not mean that
the practitioners’ forecasts were wrong, but rather simply that the low-probability,
high-impact event did not happen during that sample period. It is also worth
stating that only one implied volatility can be calculated from each option price for
the “average” volatility of the underlying asset over the remaining lifetime of the
option.
4. (a) The coefficients expected would be very small for the conditional mean coefficients, μ₁ and μ₂, since they are average daily returns, and they could be positive or negative, although a positive average return is probably more likely. Similarly, the intercept terms in the conditional variance equations would also be expected to be small and positive, since this is daily data. The coefficients on the lagged squared error and lagged conditional variance in the conditional variance equations must lie between zero and one, and more specifically, the following might be expected: α₁₁ and α₂₂ of around 0.1-0.3; β₁₁ and β₂₂ of around 0.5-0.8, with α₁₁ + β₁₁ < 1 and α₂₂ + β₂₂ < 1. The coefficient values for the conditional covariance equation are more difficult to predict, although α₁₂ + β₁₂ < 1 is still required for the model to be useful for forecasting covariances. The parameters in this equation could be negative, although given that the returns for two stock markets are likely to be positively correlated, the parameters would probably be positive, although the model would still be a valid one if they were not.
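For concreteness, here is a one-step update of a diagonal VECH-type model consistent with the parameter ranges above; every numerical value below is an assumption chosen purely for illustration:

```python
# Illustrative diagonal-VECH one-step update (all values are assumptions):
#   h_11,t = a01 + a11*u_{1,t-1}^2         + b11*h_11,t-1
#   h_22,t = a02 + a22*u_{2,t-1}^2         + b22*h_22,t-1
#   h_12,t = a03 + a12*u_{1,t-1}*u_{2,t-1} + b12*h_12,t-1
a01, a11, b11 = 1e-5, 0.15, 0.75
a02, a22, b22 = 1e-5, 0.20, 0.70
a03, a12, b12 = 5e-6, 0.10, 0.80

u1, u2 = 0.012, 0.008                 # hypothetical lagged residuals
h11, h22, h12 = 2e-4, 3e-4, 1e-4      # hypothetical lagged (co)variances

h11 = a01 + a11 * u1**2 + b11 * h11
h22 = a02 + a22 * u2**2 + b22 * h22
h12 = a03 + a12 * u1 * u2 + b12 * h12
print(h11, h22, h12, h12 / (h11 * h22) ** 0.5)   # implied conditional correlation
```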
(b) One of two procedures could be used. Either the daily returns data would
be transformed into weekly returns data by adding up the returns over all of
the trading days in each week, or the model would be estimated using the
daily data. Daily forecasts would then be produced up to 10 days (2 trading
weeks) ahead.
In both cases, the models would be estimated, and forecasts made of the conditional variance and conditional covariance. If daily data were used to estimate the model, the conditional covariance forecasts for the 5 trading days in a week would be added together to form a covariance forecast for that week, and similarly for the variance. If the returns had been aggregated to the weekly frequency, the forecasts used would simply be 1-step ahead.
Finally, the conditional covariance forecast for the week would be divided by
the product of the square roots of the conditional variance forecasts to obtain
a correlation forecast.
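The aggregation step might look as follows in code (all of the daily forecast values below are placeholders):

```python
# Sketch: combine 5 daily variance and covariance forecasts into a weekly
# correlation forecast. All forecast values are placeholders.
import numpy as np

var1 = np.array([2.0e-4, 2.1e-4, 2.2e-4, 2.2e-4, 2.3e-4])  # market 1, days 1-5
var2 = np.array([3.0e-4, 3.0e-4, 3.1e-4, 3.1e-4, 3.2e-4])  # market 2, days 1-5
cov12 = np.array([1.1e-4, 1.1e-4, 1.2e-4, 1.2e-4, 1.2e-4]) # covariance, days 1-5

# Weekly covariance divided by the product of weekly standard deviations
weekly_corr = cov12.sum() / np.sqrt(var1.sum() * var2.sum())
print(weekly_corr)
```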
(d) The simple historical approach is obviously the simplest to calculate, but
has two main drawbacks. First, it does not weight information: so any
observations within the sample will be given equal weight, while those
outside the sample will automatically be given a weight of zero. Second, any
extreme observations in the sample will have an equal effect until they
abruptly drop out of the measurement period. For example, suppose that one year of daily data is used to estimate volatility. If the sample is rolled through one day at a time, an observation corresponding to a market crash will appear in the next 250 samples, with equal effect, but will then disappear altogether.
Finally, implied correlations may at first blush appear to be the best method
for calculating correlation forecasts accurately, for they rely on information
obtained from the market itself. After all, who should know better about
future correlations in the markets than the people who work in those
markets? However, market-based measures of volatility and correlation are
sometimes surprisingly inaccurate, and are also sometimes difficult to obtain.
Most fundamentally, correlation forecasts will only be available where there is
an option traded whose payoffs depend on the prices of two underlying
assets. For all other situations, a market-based correlation forecast will simply
not be available.
5. (a) A news impact curve shows the effect of shocks of different magnitudes on the
next period’s volatility. These curves can be used to examine visually whether there
are any asymmetry effects in volatility for a particular set of data. For the data given
in this question, the way I would approach it is to put values of the lagged error into column A, ranging from –1 to +1 in increments of 0.01, and then simply enter the formulae for the GARCH and EGARCH models into columns B and C, referring to those values of the lagged error in column A. The graph obtained would be as follows:
[Figure: News impact curves for the GARCH and EGARCH models. Horizontal axis: value of lagged shock, from –1 to +1; vertical axis: value of conditional variance, from 0 to 0.2.]
This graph is a bit of an odd one, in the sense that the conditional variance is always
lower for the EGARCH model. This may suggest estimation error in one of the models.
There is some evidence for asymmetries in the case of the EGARCH model since the
value of the conditional variance is 0.1 for a shock of 1 and 0.12 for a shock of –1.
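The same spreadsheet exercise can be done in code. The sketch below uses placeholder coefficient values rather than the estimates given in the question, so it will not reproduce the figure exactly unless those estimates are substituted in:

```python
# Sketch of news impact curves for GARCH(1,1) and EGARCH. All coefficient
# values below are placeholders, not the estimates from the question.
import numpy as np

shocks = np.linspace(-1.0, 1.0, 201)   # lagged errors from -1 to +1
h_lag = 0.05                            # lagged conditional variance, held fixed

# GARCH(1,1): sigma_t^2 = alpha0 + alpha1*u_{t-1}^2 + beta*sigma_{t-1}^2
a0, a1, b = 0.005, 0.10, 0.80
garch_nic = a0 + a1 * shocks**2 + b * h_lag

# EGARCH: log(sigma_t^2) = omega + beta*log(sigma_{t-1}^2)
#   + gamma*u_{t-1}/sigma_{t-1} + alpha*(|u_{t-1}|/sigma_{t-1} - sqrt(2/pi))
omega, beta_e, gamma, alpha = -0.5, 0.80, -0.05, 0.15
std_lag = np.sqrt(h_lag)
egarch_nic = np.exp(omega + beta_e * np.log(h_lag)
                    + gamma * shocks / std_lag
                    + alpha * (np.abs(shocks) / std_lag - np.sqrt(2 / np.pi)))

# With gamma < 0, the EGARCH curve is higher for negative shocks than for
# positive shocks of the same size, i.e. it is asymmetric.
print(garch_nic[0], garch_nic[-1], egarch_nic[0], egarch_nic[-1])
```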
(b) This is a tricky one. The leverage effect is used to rationalise a finding of
asymmetries in equity returns, but such an argument cannot be applied to foreign
exchange returns, since the concept of a Debt/Equity ratio has no meaning in that
context.
On the other hand, there is equally no reason to suppose that there are no
asymmetries in the case of fx data. The data used here were daily US dollar – British
pound returns. It might be the case, for example, that news relating to one country has a differential impact compared with equally good or bad news relating to another. To
offer one illustration, it might be the case that bad news about the currently weak pound has a bigger impact on volatility than news about the currently strong dollar. This would lead to asymmetries in the news impact curve. Finally, it is also worth noting that the asymmetry term in the EGARCH model, γ, is not statistically significant in this case.
© Chris Brooks 2014