Functional Form and Prediction: OLS Estimation - Assumptions
Lecture 5
RS - Econometrics I - Lecture 5
Y = β1 + β2 X2 + β3 X3 + β4 X4

• Linear in parameters (intrinsic linear), nonlinear in variables:

Y = β1 + β2 X2² + β3 X3 + β4 log X4

Z2 = X2², Z3 = X3, Z4 = log X4

Y = β1 + β2 Z2 + β3 Z3 + β4 Z4
Note: We get a nonlinear relation between Y and X, but OLS can still be used.
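To illustrate the point, here is a minimal sketch (in Python; all variable names, the simulated data, and the coefficient values are illustrative assumptions, not from the lecture): generate data from the nonlinear-in-variables model above, build the transformed regressors Z, and run plain OLS.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
X2 = rng.uniform(1, 10, n)
X3 = rng.uniform(1, 10, n)
X4 = rng.uniform(1, 10, n)

# Simulated DGP: Y = 1 + 0.5*X2^2 + 2*X3 + 3*log(X4) + eps
Y = 1 + 0.5 * X2**2 + 2 * X3 + 3 * np.log(X4) + rng.normal(0, 1, n)

# Transform: Z2 = X2^2, Z3 = X3, Z4 = log(X4); the model is linear in the Z's
Z = np.column_stack([np.ones(n), X2**2, X3, np.log(X4)])
b, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # plain OLS recovers the betas
```

Because the model is linear in the parameters, no special estimator is needed; the nonlinearity lives entirely in the constructed regressors.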
Y = β1 + β2 X2 + β3 X2²

[Figure: Y plotted against X2]
Ŷ = b1 + b2 X2 + b3 X2² = b1 + b2 X2 + b3 X3, with X3 = X2²

[Figure: fitted quadratic plotted against X2 = x2 and against X3 = (x2)²]
[Matlab demo]
Y = β0 + β1 X1 + β2 X2 + β3 X3 + β4 X4 + ε
Coefficients:
Estimate Std. Error t value Pr(>|t|)
x0 -0.004765 0.002854 -1.670 0.0955 .
xx1 0.906527 0.057281 15.826 <2e-16 ***
xx2 -0.215128 0.084965 -2.532 0.0116 *
xx3 -0.173160 0.085054 -2.036 0.0422 *
xx4 -0.143191 0.617314 -0.232 0.8167 => Not significant
Y = β1 + β2 X2 + β3 X2² + β4 X2³ + ... + βk+1 X2^k
• Nonlinear in parameters:

Y = β1 + β2 X2 + β3 X3 + β2 β3 X4
Note: We can fit both equations into one single equation using a
linear approximation:
E[yi|X] = β00 + β01 xi + β10 (xi − t0)+^0 + β11 (xi − t0)+^1
where (xi – t0)+ is the positive part of (xi – t0) and zero otherwise.
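A small sketch of this single-equation linear approximation (in Python; the knot t0, the simulated data, and all coefficient values are illustrative assumptions): `jump` plays the role of (x − t0)+^0 and `pos` the role of (x − t0)+^1, so one OLS regression covers both regimes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
t0 = 5.0                                # knot (break point), assumed known
x = rng.uniform(0, 10, n)
pos = np.maximum(x - t0, 0.0)           # (x - t0)+^1: kink term
jump = (x > t0).astype(float)           # (x - t0)+^0: level-shift term

# Assumed DGP: intercept 1, slope 2; jump of 3 and extra slope 1.5 after t0
y = 1 + 2 * x + 3 * jump + 1.5 * pos + rng.normal(0, 0.5, n)

# One single OLS regression fits both regimes at once
X = np.column_stack([np.ones(n), x, jump, pos])
b, *_ = np.linalg.lstsq(X, y, rcond=None)   # [b00, b01, b10, b11]
```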
• Linear model: Y = β1 + β2 X

• (Semi-)log model: log Y = β1 + β2 X

• Box–Cox transformation: (Y^λ − 1)/λ = β1 + β2 X

(Y^λ − 1)/λ = Y − 1      when λ = 1
(Y^λ − 1)/λ → log(Y)    when λ → 0
• Setting λ = 0 gives the (semi-)logarithmic model (think about the limit as λ tends to zero). We can estimate λ. One would like to test whether λ is equal to 0 or 1. It is possible that it is neither!
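One standard way to estimate λ is to maximize the concentrated (profile) log-likelihood over a grid. A Python sketch with simulated data whose true λ is 0, i.e., a log model (the DGP and all names are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1, 5, n)
# Simulated DGP: log(y) = 0.5 + 0.3 x + eps, i.e., true lambda = 0
y = np.exp(0.5 + 0.3 * x + rng.normal(0, 0.2, n))

def boxcox(y, lam):
    # (y^lam - 1)/lam, with the lam -> 0 limit log(y)
    return np.log(y) if abs(lam) < 1e-8 else (y**lam - 1) / lam

def profile_loglik(lam):
    z = boxcox(y, lam)
    X = np.column_stack([np.ones(n), x])
    e = z - X @ np.linalg.lstsq(X, z, rcond=None)[0]
    sig2 = (e @ e) / n
    # concentrated log-likelihood; (lam - 1)*sum(log y) is the Jacobian term
    return -n / 2 * np.log(sig2) + (lam - 1) * np.log(y).sum()

grid = np.linspace(-1, 2, 61)                     # grid search over lambda
lam_hat = grid[np.argmax([profile_loglik(l) for l in grid])]
```

The estimate λ̂ should land near 0 here; comparing the profile likelihood at λ̂ against λ = 0 and λ = 1 is the basis for the likelihood-ratio test mentioned above.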
> summary(fit_ramsey)
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.004547 0.002871 -1.584 0.1137
Mkt_RF 0.903783 0.058003 15.582 <2e-16 ***
SMB -0.217268 0.085128 -2.552 0.0110 *
HML -0.173276 0.084875 -2.042 0.0417 *
y_hat2 -0.289197 0.763526 -0.379 0.7050 Not significant!
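The RESET mechanics behind this output, fit the model, add a power of ŷ, and test its significance, can be sketched in Python (the data here are simulated and deliberately misspecified so the test rejects; all names are illustrative, not the Fama–French data above):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
x = rng.uniform(0, 4, n)
y = 1 + x + 0.5 * x**2 + rng.normal(0, 0.3, n)   # true relation is quadratic

def ols(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

# Step 1: fit the (misspecified) linear model and save the fitted values
X1 = np.column_stack([np.ones(n), x])
b1, e1 = ols(X1, y)
yhat = X1 @ b1

# Step 2: add yhat^2 and test its significance with an F-test (1 restriction)
X2 = np.column_stack([X1, yhat**2])
b2, e2 = ols(X2, y)
RSS1, RSS2 = e1 @ e1, e2 @ e2
F_reset = ((RSS1 - RSS2) / 1) / (RSS2 / (n - X2.shape[1]))
```

With a correctly specified model (as in the output above), the added ŷ² term is insignificant; here the omitted quadratic makes F_reset very large.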
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.007195 0.002566 -2.804 0.00522 **
Mkt_RF 0.902968 0.056345 16.026 < 2e-16 ***
SMB -0.240186 0.084013 -2.859 0.00441 **
HML -0.190710 0.084317 -2.262 0.02409 *
Jan_1 0.026993 0.008923 3.025 0.00260 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
W = (b1 − b2)′ {Var[b1 − b2]}⁻¹ (b1 − b2)

• This test is a bit more flexible, since it is easy to allow for different
formulations for Var[b1 − b2]. (In econometrics, violations of (A3)
are common, for example, different variances in regimes 1 & 2.)
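A sketch of this Wald statistic (in Python; the two simulated regimes, sample sizes, and parameter values are illustrative assumptions). Because the regimes are estimated on separate samples, Var[b1 − b2] is estimated as V1 + V2, and different error variances per regime are allowed:

```python
import numpy as np

rng = np.random.default_rng(3)
k, n1, n2 = 3, 150, 150
beta = np.array([1.0, 0.5, -0.2])   # same parameters in both regimes (H0 true)

def fit(n, sigma):
    # OLS fit on simulated regime data; returns b and its estimated covariance
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = X @ beta + rng.normal(0, sigma, n)
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    s2 = (e @ e) / (n - k)
    return b, s2 * XtX_inv

# Two regimes with different error variances (the (A3) violation noted above)
b1, V1 = fit(n1, 1.0)
b2, V2 = fit(n2, 2.0)
d = b1 - b2
W = float(d @ np.linalg.inv(V1 + V2) @ d)   # ~ chi^2_k under H0
```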
The event caused structural change in the model. TSB, the time of the
structural break, separates the behaviour of the model into two
regimes/categories ("before" & "after").
• Under H0 (no structural change), the parameters are the same for all i.
• What events may have this effect on a model? A financial crisis, a big
recession, an oil shock, Covid-19, etc.
• Testing for structural change is the most popular use of Chow tests.
In the model, the oil shock affects both the constant and the slope:

                             Constant    Slope
Before oil shock (D73 = 0):  β0          β1
After oil shock (D73 = 1):   β0 + β2     β1 + γ1
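A sketch of this dummy-interaction setup (in Python; the simulated data, break date, and coefficient values are illustrative assumptions): regressing on x, D73, and D73·x recovers the "before" parameters directly and the "after" parameters as sums.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 240
x = rng.normal(0, 1, n)
D73 = (np.arange(n) >= 120).astype(float)   # dummy: 0 before break, 1 after

# Assumed DGP: constant shifts by 0.8 and slope by -0.5 after the break
y = 1.0 + 2.0 * x + 0.8 * D73 - 0.5 * D73 * x + rng.normal(0, 0.3, n)

# Regression: y = b0 + b1*x + b2*D73 + g1*(D73*x) + e
X = np.column_stack([np.ones(n), x, D73, D73 * x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

const_before, slope_before = b[0], b[1]               # beta0, beta1
const_after, slope_after = b[0] + b[2], b[1] + b[3]   # beta0 + beta2, beta1 + gamma1
```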
Note: Recall that the Chow test is an F-test: we are testing a joint
hypothesis in which all coefficients are subject to structural change.
- It can deal with only one structural break, i.e., two categories!
The max (supremum) is taken over all potential breaks in (τmin, τmax);
for example, τmin = T × .15 and τmax = T × .85.
where 0 < r_min < r_max < 1 and Bk(·) is a "Brownian Bridge" process defined
on [0,1]. Percentiles of this distribution, as functions of r_max, r_min and k,
are tabulated in Andrews (1993). (Critical values are much larger than the χ² ones.)
• Q: Multiple breaks?
T1 <- round(T*.15)
T2 <- round(T*.85)
k <- ncol(x)                              # number of regressors
All_F <- matrix(0, T2 - T1 + 1, 1)
b_R <- solve(t(x) %*% x) %*% t(x) %*% y   # restricted (no-break) fit
e_R <- y - x %*% b_R
RSS_R <- as.numeric(t(e_R) %*% e_R)       # restricted RSS
t <- T1
while (t <= T2) {
  y_1 <- y[1:t]                           # regime 1: observations 1, ..., t
  x_u1 <- x[1:t, ]
  b_1 <- solve(t(x_u1) %*% x_u1) %*% t(x_u1) %*% y_1
  e1 <- y_1 - x_u1 %*% b_1
  RSS1 <- as.numeric(t(e1) %*% e1)        # RSS regime 1
  kk <- t + 1
  y_2 <- y[kk:T]                          # regime 2: observations t+1, ..., T
  x_u2 <- x[kk:T, ]
  b_2 <- solve(t(x_u2) %*% x_u2) %*% t(x_u2) %*% y_2
  e2 <- y_2 - x_u2 %*% b_2
  RSS2 <- as.numeric(t(e2) %*% e2)        # RSS regime 2
  # Chow F-statistic at candidate break date t
  All_F[t - T1 + 1] <- ((RSS_R - RSS1 - RSS2)/k) / ((RSS1 + RSS2)/(T - 2*k))
  t <- t + 1
}
• Objective: Forecast
• Distinction: Ex post vs. Ex ante forecasting
– Ex post: RHS data are observed
– Ex ante (true forecasting): RHS data must be forecasted
Notation:
- Prediction for T made at T: Ŷ_T.
- Forecast for T+l made at T: Ŷ_{T+l}, Ŷ_{T+l|T}, or Ŷ_T(l),
where T is the forecast origin and l is the forecast horizon. Then,
Ŷ_T(l): l-step ahead forecast = forecasted value of Y_{T+l} at time T.
[Figure: monthly series from Jan-99 to Jul-10, split into the estimation period and out-of-sample forecasts]
Steps to measure forecast accuracy:
1) Select a (long) part of the sample (the estimation period) to estimate the
parameters of the model. (Get in-sample forecasts, ŷ.)
2) Keep a (short) part of the sample to check the model's forecasting
skills. This is the validation step: you can calculate the true MSE or MAE.
3) If happy with Step 2), proceed to do out-of-sample forecasts.
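The estimation/validation split above can be sketched as follows (in Python; the simulated data and the sizes T and m are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(11)
T, m = 200, 24                 # estimation period T, validation period m
x = rng.normal(0, 1, T + m)
y = 0.5 + 1.2 * x + rng.normal(0, 0.4, T + m)

# 1) Estimate the parameters on the first T observations only
X_est = np.column_stack([np.ones(T), x[:T]])
b, *_ = np.linalg.lstsq(X_est, y[:T], rcond=None)

# 2) Validation: forecast the held-out m observations, then compute MSE / MAE
X_val = np.column_stack([np.ones(m), x[T:]])
e = y[T:] - X_val @ b
MSE = float(np.mean(e**2))
MAE = float(np.mean(np.abs(e)))
```

Note this is ex post evaluation: the RHS data for the validation period are observed, not forecasted.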
Two cases:
(1) If x0 is given, i.e., constants. Then,
Var[ŷ0 − y0|x0] = x0′ Var[b|x0] x0 + σ²
Form C.I. as usual.

Var[e⁰] = σ² [1 + 1/n + (x⁰ − x̄)′ (Z′M⁰Z)⁻¹ (x⁰ − x̄)]
Note: A large σ², a small n, and large deviations from the means all increase
the variance of the forecast error, i.e., they decrease forecast precision.
Prediction Intervals
Example (continuation): We want to calculate the variance of the forecast
error for the given x0 = [1.0000 -0.0189 -0.0142 -0.0027].
Recall we got ŷ0 = b′x0 = -0.01877587
Then,
Estimated Var[ŷ0 − y0|x0] = x0′ Var[b|x0] x0 + s² = 0.003429632
> var_ef_0 <- t(x_0)%*% Var_b%*% x_0 + Sigma2
> var_ef_0
[,1]
[1,] 0.003429632
> sqrt(var_ef_0)
[,1]
[1,] 0.05856306
Prediction Intervals
Example (continuation):
> # (1-alpha)% C.I. for prediction (alpha = .05)
> CI_lb <- y_f0 - 1.96 * sqrt(var_ef_0)
> CI_lb
[1] -0.1335594
> CI_ub <- y_f0 + 1.96 * sqrt(var_ef_0)
> CI_ub
[1] 0.09600778
Root Mean Square Error:

RMSE = √[(1/m) Σ_{i=T+1}^{T+m} (ŷi − yi)²] = √[(1/m) Σ_{i=T+1}^{T+m} ei²]

Theil's U-stat:

U = √[(1/m) Σ_{i=T+1}^{T+m} ei²] / √[(1/T) Σ_{i=1}^{T} yi²]
• The lower the above criteria, say MSE, the better the forecasting
ability of our model.
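Both criteria are straightforward to compute; a Python sketch (the observed series and the forecasts below are simulated placeholders, not real data):

```python
import numpy as np

rng = np.random.default_rng(2)
T, m = 300, 30
y = 1.0 + rng.normal(0, 0.5, T + m)      # observed series (simulated)
y_hat = y[T:] + rng.normal(0, 0.3, m)    # m out-of-sample forecasts (simulated)

e = y_hat - y[T:]                        # forecast errors, i = T+1, ..., T+m
RMSE = float(np.sqrt(np.mean(e**2)))
# Theil's U: RMSE scaled by the root mean square of the observed series
U = RMSE / float(np.sqrt(np.mean(y[:T]**2)))
```

The scaling makes U unit-free, so it can be compared across series of different magnitudes.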
• Consider the pair of RVs et(1) + et(2) and et(1) − et(2). Now,
• That is, we test H0 by testing that the two RVs are not correlated!
This idea is due to Morgan, Granger and Newbold (MGN, 1977).
(1) Do a regression: xt = β zt + εt
(2) Test H0: β = 0 with a simple t-test.
The MGN test statistic is exactly the same as that for testing the null
hypothesis that β = 0 in this regression (recall: b = (X′X)-1X′y). This is
the approach taken by Harvey, Leybourne and Newbold (1997).
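A sketch of this regression version of the MGN test (in Python; the forecast-error series are simulated with unequal variances, so H0 is false by construction and the t-stat should be large; all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(9)
m = 60
e1 = rng.normal(0, 1.0, m)   # forecast errors from model 1
e2 = rng.normal(0, 2.0, m)   # forecast errors from model 2 (larger variance)

# Equal MSEs  <=>  Cov(e1 + e2, e1 - e2) = Var(e1) - Var(e2) = 0
x_mgn = e1 + e2
z_mgn = e1 - e2

# Steps (1)-(2): regress x on z (no intercept) and t-test H0: beta = 0
beta = (z_mgn @ x_mgn) / (z_mgn @ z_mgn)
resid = x_mgn - beta * z_mgn
s2 = (resid @ resid) / (m - 1)
t_stat = beta / np.sqrt(s2 / (z_mgn @ z_mgn))
```

Here β̂ is negative because Var(e1) < Var(e2); under equal forecast accuracy the t-stat would be small.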
Using R, we create the forecast errors for both models and compute the MSEs:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.05688 0.03512 1.619 0.117
x_mgn 2.77770 0.58332 4.762 5.32e-05 ***
F = [(RSS_R − RSS1) / T2] / [RSS1 / (T1 − k)] ~ F(T2, T1 − k)