Computer Lab 4

Mahmood Ul Hassan and Andriy Andreev

Packages

library(forecast)
library(lmtest)
library(tseries)

Summary

In this lab:
1. We will recap the last three computer labs.
2. You will learn how to use an R function, written for this course, to convert an Arima forecast of logged prices to a forecast of regular prices.
3. You will learn how to plot this forecast.
4. You will learn how to test for ARCH effects.

1. Recap

Let’s start with what we have learned so far.

Step I: I read in my dataset, which I downloaded from Yahoo Finance. I downloaded Google’s stock data from January 2015 until January 2022.

GOOG <- read.csv("GOOG_7.csv")

For simplicity, in this computer lab I keep only the date column and Adj.Close, and I make sure that there is only one observation per month.

GOOG <- GOOG[, -c(2,3,4,5,7)]

Step II: I use a plot to describe the time series. I use the “ts” function to create a time series object, and I compute the log price and the log return.

GOOG$Adj.Close <- ts(GOOG$Adj.Close, start = c(2016, 1), frequency = 12)

### start = c(year, month)

GOOG$log_Adj.close <- log(GOOG$Adj.Close)

GOOG$logreturn_Adj.close <- c(NA, diff(GOOG$log_Adj.close))

GOOG$logreturn_Adj.close <- ts(GOOG$logreturn_Adj.close, start = c(2016, 1), frequency = 12)

ts.plot(GOOG$logreturn_Adj.close)
abline(h = 0, lty = 2, col = "dark red")

[Figure: time series plot of GOOG$logreturn_Adj.close against Time, 2016 to 2023, with a dashed horizontal line at zero]

Do I see any clear pattern in the plot? Can I say that the time series is stationary? Is there a way to test whether the time series is stationary?

Step III: I run a regular Dickey-Fuller test (k = 0, that is, without lagged difference terms).

adf.test(GOOG$logreturn_Adj.close[-(1:2)], k = 0)

## Warning in adf.test(GOOG$logreturn_Adj.close[-(1:2)], k = 0): p-value smaller than printed p-value

##
## Augmented Dickey-Fuller Test
##
## data: GOOG$logreturn_Adj.close[-(1:2)]
## Dickey-Fuller = -10.818, Lag order = 0, p-value = 0.01
## alternative hypothesis: stationary
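
The logic of the test above can be illustrated without the course data. The following is a minimal base R sketch, assuming a simulated random walk; the names `log_price`, `log_return` and `df_stat` are hypothetical and not part of the lab. It computes the simple Dickey-Fuller t-statistic by regressing the first difference of a series on its lagged level. A strongly negative value speaks against a unit root. The sketch uses a regression with an intercept only, so its numbers are not directly comparable to those of `adf.test`, which also includes a linear trend.

```r
set.seed(1)

# Simulated log price: a random walk with drift (illustrative data only)
log_price  <- cumsum(rnorm(200, mean = 0.01, sd = 0.05))
log_return <- diff(log_price)

# Dickey-Fuller statistic: t-value of the lagged level in
# diff(y)[t] = a + rho * y[t - 1] + e[t]
df_stat <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  fit  <- lm(dy ~ ylag)
  unname(coef(summary(fit))["ylag", "t value"])
}

stat_level  <- df_stat(log_price)   # level series: contains a unit root by construction
stat_return <- df_stat(log_return)  # differenced series: strongly negative statistic

c(level = stat_level, return = stat_return)
```

The differenced series behaves like white noise, which is why its statistic is far more negative than that of the level series.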

Note about the Dickey-Fuller test: transform your data until the time series is stationary.

Step IV: I create a variable called time that I will need later, and I split the time series into a training set and a testing set:

GOOG$time <- 1:nrow(GOOG)

training <- GOOG[1:(nrow(GOOG) - 4), ]

testing <- GOOG[(nrow(GOOG) - 3): nrow(GOOG), ]
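
The same split can be written with an explicit holdout length, so that the time index of the forecast horizon is derived rather than typed in by hand. This is only a sketch on a made-up data frame; `prices`, `h` and `forecast_time` are hypothetical names. With the 85 rows used in this lab and `h = 4`, the derived index would be 82:85.

```r
# Toy data frame standing in for GOOG (values are made up)
prices <- data.frame(time = 1:10, Adj.Close = seq(50, 95, by = 5))

h <- 4                      # number of periods held out for testing
n <- nrow(prices)

training <- prices[1:(n - h), ]        # first n - h rows
testing  <- prices[(n - h + 1):n, ]    # last h rows

# Time index of the forecast horizon, derived instead of hard-coded
forecast_time <- (n - h + 1):n
forecast_time
```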

Step V: I estimate a random walk using the training set. Let us create a forecast assuming that the log price is a random walk with drift. This means that we fit an ARIMA(0,1,0) with drift to the log price. Alternatively, we could fit an ARIMA(0,0,0) to the log return, but this would make forecasting a little harder.

ARIMA010 <- Arima(training$log_Adj.close, order = c(0, 1, 0), include.drift = TRUE)

summary(ARIMA010)

## Series: training$log_Adj.close
## ARIMA(0,1,0) with drift
##
## Coefficients:
## drift
## 0.0119
## s.e. 0.0075
##
## sigma^2 = 0.004538: log likelihood = 102.8
## AIC=-201.6 AICc=-201.45 BIC=-196.84
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 4.448157e-05 0.06652704 0.05162596 -0.005778455 1.224137 0.9506535
## ACF1
## Training set -0.1613332

### print the confidence interval

confint(ARIMA010)

## 2.5 % 97.5 %
## drift -0.002782661 0.02655799

### print the p-values

coeftest(ARIMA010)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## drift 0.011888 0.007485 1.5882 0.1122
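
The z value, the p-value and the confidence interval printed above all follow from the estimate and its standard error. The short sketch below reproduces them by hand from the printed numbers; the variable names are illustrative, and the normal approximation is the one implied by the "z test" label in the output.

```r
# Values copied from the coeftest() output above
estimate  <- 0.011888
std_error <- 0.007485

z_value <- estimate / std_error
p_value <- 2 * pnorm(-abs(z_value))          # two-sided p-value

# 95% confidence interval: estimate +/- 1.96 standard errors
ci <- estimate + c(-1, 1) * qnorm(0.975) * std_error

round(c(z = z_value, p = p_value, lower = ci[1], upper = ci[2]), 4)
```

Because the interval contains zero and the p-value is above 0.05, the drift is not significantly different from zero at the 5% level.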

checkresiduals(ARIMA010)

[Figure: residuals from ARIMA(0,1,0) with drift: time plot of the residuals, ACF of the residuals by lag, and histogram of the residuals]

##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,0) with drift
## Q* = 9.2848, df = 10, p-value = 0.5053
##
## Model df: 0. Total lags used: 10
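
The Ljung-Box part of this output can also be obtained with `Box.test` from base R. The sketch below runs it on simulated white noise standing in for the model residuals (an assumption, since the course data are not reproduced here). The `fitdf` argument is the number of estimated AR and MA coefficients, which corresponds to the "Model df" shown above.

```r
set.seed(2)

# Stand-in for model residuals: simulated white noise (illustrative only)
res <- rnorm(80, sd = 0.067)

# Ljung-Box test on the first 10 autocorrelations.
# fitdf = number of estimated AR and MA coefficients (0 for ARIMA(0,1,0));
# the degrees of freedom of the test are lag - fitdf.
lb <- Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 0)

lb$statistic
lb$parameter   # degrees of freedom
lb$p.value
```

A large p-value means that the hypothesis of uncorrelated residuals is not rejected.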

Step VII: We estimate a regression model using time as an explanatory variable

regression <- lm(formula = Adj.Close ~ time, data = training)

summary(regression)

##
## Call:
## lm(formula = Adj.Close ~ time, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -27.107 -11.687 1.010 6.737 39.004
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 20.23703 3.36707 6.01 5.4e-08 ***
## time 1.27185 0.07134 17.83 < 2e-16 ***

## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1
##
## Residual standard error: 15.01 on 79 degrees of freedom
## Multiple R-squared: 0.8009, Adjusted R-squared: 0.7984
## F-statistic: 317.8 on 1 and 79 DF, p-value: < 2.2e-16

### forecast four periods forward

### you need to create a data frame with the time for which you want to predict
forward_data <- data.frame(time = 82:85)
regression_forecast <- predict.lm(regression, newdata = forward_data)

### how to plot the regression model with the training set and
### including the forecast and the testing set
plot((1:nrow(GOOG)), GOOG$Adj.Close, type = "l", xlim = c(0, 85))
lines((1:(nrow(GOOG) - 4)), regression$fitted.values, col = "dark blue", lwd = 2, lty = 2)
lines(82:85, testing$Adj.Close, type = "o", col = "dark red", lwd = 2)
lines(82:85, regression_forecast, type = "o", col = "dark blue", lwd = 2)
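[Figure: GOOG$Adj.Close against the time index (1:nrow(GOOG)), with the fitted regression line (dashed blue), the regression forecast (blue) and the testing values (dark red)]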

regression_forecast_error <- testing$Adj.Close - regression_forecast

regression_table <- data.frame(cbind(testing$Adj.Close, regression_forecast, regression_forecast_error))

names(regression_table) <- c("testing", "estimated", "residuals")
regression_table

## testing estimated residuals
## 1 94.66 124.5290 -29.86901
## 2 101.45 125.8009 -24.35087
## 3 88.73 127.0727 -38.34272
## 4 99.87 128.3446 -28.47457

Calculate RMSE for the regression model using the above table.

library(Metrics)

##
## Attaching package: ’Metrics’

## The following object is masked from ’package:forecast’:

##
## accuracy

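RMSE_regression <- rmse(regression_table$testing, regression_table$estimated)
RMSE_regression

Use rmse() here rather than rmsle(): rmsle() returns the root mean squared logarithmic error (0.277366 for this table), which is on a different scale and cannot be compared with the RMSE of the other models. For the forecast errors in the table above, the RMSE is approximately 30.68.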

Step VIII: Fit a Holt’s smoothing model

holt_model <-holt(training$Adj.Close, h = 4, exponential = FALSE)

summary(holt_model)

##
## Forecast method: Holt’s method
##
## Model Information:
## Holt’s method
##
## Call:
## holt(y = training$Adj.Close, h = 4, exponential = FALSE)
##
## Smoothing parameters:
## alpha = 0.5209
## beta = 0.2279
##
## Initial states:
## l = 36.3075
## b = -0.1851
##
## sigma: 5.7216
##
## AIC AICc BIC
## 644.4166 645.2166 656.3888
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE

## Training set -0.2593424 5.57854 3.94528 -0.1856814 5.374785 0.9461912
## ACF1
## Training set 0.04402264
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 82 95.13739 87.80486 102.46992 83.92325 106.3515
## 83 90.16544 81.00525 99.32562 76.15614 104.1747
## 84 85.19349 73.56632 96.82065 67.41128 102.9757
## 85 80.22153 65.62033 94.82274 57.89092 102.5521

plot(holt_model)
lines((nrow(GOOG) - 3):nrow(GOOG), testing$Adj.Close, col = "dark red", lwd = 2, type = "o")

[Figure: forecasts from Holt's method, with the testing values overlaid in dark red]

### forecast four periods forward

holt_forecast <- holt_model$mean
holt_forecast_error <- testing$Adj.Close - holt_forecast
### summarize the calculations in a table
holt_table <-data.frame(cbind(testing$Adj.Close, holt_forecast, holt_forecast_error))
names(holt_table) <- c("testing", "Holt's forecast",
"Holt's forecast error")

holt_table

## testing Holt’s forecast Holt’s forecast error

## 1 94.66 95.13739 -0.4773835
## 2 101.45 90.16544 11.2845606
## 3 88.73 85.19349 3.5365177
## 4 99.87 80.22153 19.6484688

Calculate RMSE for Holt’s model using the above table.

RMSE_Holt <- rmse(holt_table$testing, holt_table$`Holt's forecast`)
RMSE_Holt

## [1] 11.46885
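
The RMSE is the square root of the mean squared forecast error, so it can be checked by hand without the Metrics package. The sketch below does this with the values copied from the Holt table above; the variable names are illustrative. It should agree with the value printed above up to rounding.

```r
# Values copied from the Holt table above
actual    <- c(94.66, 101.45, 88.73, 99.87)
predicted <- c(95.13739, 90.16544, 85.19349, 80.22153)

forecast_error <- actual - predicted

rmse_by_hand <- sqrt(mean(forecast_error^2))   # root mean squared error
mae_by_hand  <- mean(abs(forecast_error))      # mean absolute error, for comparison

c(RMSE = rmse_by_hand, MAE = mae_by_hand)
```

Because the errors are squared before averaging, the large error in the fourth period dominates the RMSE more than it dominates the MAE.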

Step IX: Acf and Pacf plots

Acf(training$logreturn_Adj.close, main = "ACF", lag.max=12)

[Figure: ACF of training$logreturn_Adj.close, lags 1 to 12]

Pacf(training$logreturn_Adj.close, main = "PACF",lag.max=12)

[Figure: PACF of training$logreturn_Adj.close, lags 1 to 12]
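
The dashed bounds in ACF and PACF plots are approximately plus or minus 1.96 divided by the square root of the number of observations. The sketch below computes the sample autocorrelations and this bound with base R on simulated returns; the data and variable names are assumptions for illustration, not the lab's series.

```r
set.seed(3)

# Stand-in for the monthly log returns (simulated, illustrative only)
log_return <- rnorm(80, mean = 0.01, sd = 0.067)

# Sample autocorrelations for lags 1 to 12, without plotting
# (the first element returned by acf() is lag 0 and is dropped)
rho <- acf(log_return, lag.max = 12, plot = FALSE)$acf[-1]

# Approximate 95% bounds under the white-noise hypothesis
bound <- qnorm(0.975) / sqrt(length(log_return))

# Lags whose autocorrelation falls outside the bounds
which(abs(rho) > bound)
```

With 80 observations the bound is about 0.22, so only autocorrelations larger than that in absolute value would be read as significant.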

Step X: Fit ARIMA models

Since there is no reason to believe that the current stock value depends on the stock value from 11 months ago, we start with an ARIMA(4,1,4) model and search for a good model based on p-values and AIC. During the model selection process, we only need to check the residuals of the good candidate models.

### model p = 4, d = 1, q = 4
ARIMA414 <- Arima(training$log_Adj.close, order = c(4, 1, 4), include.drift = TRUE)
summary(ARIMA414)

## Series: training$log_Adj.close
## ARIMA(4,1,4) with drift
##
## Coefficients:
## ar1 ar2 ar3 ar4 ma1 ma2 ma3 ma4
## 0.5028 -0.0516 0.8321 -0.6155 -0.6951 0.0119 -0.7309 0.9673
## s.e. 0.1569 0.1050 0.1052 0.1563 0.1692 0.1603 0.1176 0.1364
## drift
## 0.0097
## s.e. 0.0103
##
## sigma^2 = 0.003702: log likelihood = 112.17
## AIC=-204.34 AICc=-201.15 BIC=-180.52
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE

## Training set 0.0005269856 0.05696635 0.04394368 0.01383662 1.046896 0.80919
## ACF1
## Training set -0.0738096

coeftest(ARIMA414)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.5027654 0.1568676 3.2050 0.00135 **
## ar2 -0.0516019 0.1050358 -0.4913 0.62323
## ar3 0.8321288 0.1052306 7.9077 2.622e-15 ***
## ar4 -0.6154997 0.1563331 -3.9371 8.247e-05 ***
## ma1 -0.6950654 0.1691727 -4.1086 3.980e-05 ***
## ma2 0.0118910 0.1603453 0.0742 0.94088
## ma3 -0.7308628 0.1176463 -6.2124 5.219e-10 ***
## ma4 0.9672586 0.1364291 7.0898 1.343e-12 ***
## drift 0.0097372 0.0103336 0.9423 0.34605
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

One AR term (ar2) and one MA term (ma2) are not significant. ma2 has the highest p-value (0.94088), so we reduce the MA order by one and fit an ARIMA(4,1,3) next.

### model p = 4, d = 1, q = 3
ARIMA413 <- Arima(training$log_Adj.close, order = c(4, 1, 3), include.drift = TRUE)
summary(ARIMA413)

## Series: training$log_Adj.close
## ARIMA(4,1,3) with drift
##
## Coefficients:
## ar1 ar2 ar3 ar4 ma1 ma2 ma3 drift
## -0.5202 -0.1688 0.7615 0.2864 0.4014 0.1952 -0.7185 0.0106
## s.e. 0.2075 0.2143 0.2035 0.1147 0.1923 0.2116 0.1891 0.0094
##
## sigma^2 = 0.004093: log likelihood = 108.65
## AIC=-199.29 AICc=-196.72 BIC=-177.86
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0001364897 0.06032049 0.04816633 0.002002214 1.141745 0.8869469
## ACF1
## Training set 0.003984436

coeftest(ARIMA413)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)

## ar1 -0.520187 0.207484 -2.5071 0.0121720 *
## ar2 -0.168842 0.214318 -0.7878 0.4308073
## ar3 0.761455 0.203473 3.7423 0.0001823 ***
## ar4 0.286442 0.114684 2.4977 0.0125016 *
## ma1 0.401415 0.192256 2.0879 0.0368057 *
## ma2 0.195182 0.211648 0.9222 0.3564249
## ma3 -0.718515 0.189109 -3.7995 0.0001450 ***
## drift 0.010569 0.009396 1.1248 0.2606547
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Again one AR term (ar2) and one MA term (ma2) are not significant. ar2 has the highest p-value among them (0.4308073), so we reduce the AR order by one in the next step.

### model p = 3, d = 1, q = 3
ARIMA313 <- Arima(training$log_Adj.close, order = c(3, 1, 3), include.drift = TRUE)
summary(ARIMA313)

## Series: training$log_Adj.close
## ARIMA(3,1,3) with drift
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 ma3 drift
## 0.6482 0.7182 -0.5771 -0.9484 -0.5408 0.8180 0.0096
## s.e. 0.2572 0.3175 0.1677 0.2735 0.4247 0.2451 0.0102
##
## sigma^2 = 0.003974: log likelihood = 109.74
## AIC=-203.48 AICc=-201.45 BIC=-184.43
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.000445232 0.05984848 0.04597165 0.01122462 1.093784 0.8465336
## ACF1
## Training set -0.009143674

coeftest(ARIMA313)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.6482315 0.2571628 2.5207 0.0117120 *
## ar2 0.7182306 0.3174787 2.2623 0.0236792 *
## ar3 -0.5771159 0.1677336 -3.4407 0.0005803 ***
## ma1 -0.9484376 0.2734755 -3.4681 0.0005242 ***
## ma2 -0.5408088 0.4247258 -1.2733 0.2029071
## ma3 0.8180417 0.2451190 3.3373 0.0008459 ***
## drift 0.0096475 0.0101886 0.9469 0.3436925
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

The ma2 term is not significant and has the highest p-value among the AR and MA terms (0.2029071), so we reduce the MA order by one in the next step.

### model p = 3, d = 1, q = 2
ARIMA312 <- Arima(training$log_Adj.close, order = c(3, 1, 2), include.drift = TRUE)
summary(ARIMA312)

## Series: training$log_Adj.close
## ARIMA(3,1,2) with drift
##
## Coefficients:
## ar1 ar2 ar3 ma1 ma2 drift
## -0.2865 0.7583 0.1679 0.1287 -0.6889 0.0107
## s.e. 0.3477 0.2348 0.1395 0.3253 0.2554 0.0091
##
## sigma^2 = 0.004644: log likelihood = 104.45
## AIC=-194.9 AICc=-193.34 BIC=-178.22
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set 0.0001101548 0.06513442 0.05039626 -0.0003218164 1.195042
## MASE ACF1
## Training set 0.9280095 -0.02219786

coeftest(ARIMA312)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.2864539 0.3477100 -0.8238 0.410036
## ar2 0.7582733 0.2347994 3.2295 0.001240 **
## ar3 0.1679470 0.1394598 1.2043 0.228486
## ma1 0.1287263 0.3252842 0.3957 0.692301
## ma2 -0.6888583 0.2553647 -2.6975 0.006985 **
## drift 0.0106558 0.0091212 1.1682 0.242708
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Again some AR and MA terms are not significant. ma1 has the highest p-value (0.692301), so we reduce the MA order by one in the next step.

### model p = 3, d = 1, q = 1
ARIMA311<- Arima(training$log_Adj.close, order = c(3, 1, 1), include.drift = TRUE)
summary(ARIMA311)

## Series: training$log_Adj.close
## ARIMA(3,1,1) with drift
##
## Coefficients:
## ar1 ar2 ar3 ma1 drift
## 0.5143 0.1317 0.1315 -0.6995 0.0099
## s.e. 0.2109 0.1301 0.1184 0.1806 0.0099
##
## sigma^2 = 0.004516: log likelihood = 104.99

## AIC=-197.98 AICc=-196.83 BIC=-183.69
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0003057396 0.06466655 0.05085581 0.006067046 1.209861 0.9364717
## ACF1
## Training set -0.009775339

coeftest(ARIMA311)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.5143211 0.2109247 2.4384 0.014752 *
## ar2 0.1317039 0.1301457 1.0120 0.311551
## ar3 0.1315252 0.1183873 1.1110 0.266580
## ma1 -0.6994986 0.1805579 -3.8741 0.000107 ***
## drift 0.0099416 0.0099499 0.9992 0.317714
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

Two AR terms (ar2 and ar3) are not significant. ar2 has the highest p-value among them (0.311551), so we reduce the AR order by one in the next step.

### model p = 2, d = 1, q = 1
ARIMA211 <- Arima(training$log_Adj.close, order = c(2, 1, 1), include.drift = TRUE)
summary(ARIMA211)

## Series: training$log_Adj.close
## ARIMA(2,1,1) with drift
##
## Coefficients:
## ar1 ar2 ma1 drift
## -1.0395 -0.1206 0.8770 0.0123
## s.e. 0.2861 0.1372 0.2547 0.0064
##
## sigma^2 = 0.004572: log likelihood = 104.03
## AIC=-198.05 AICc=-197.24 BIC=-186.14
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.000123686 0.06549716 0.05068155 -0.01062332 1.198091 0.9332629
## ACF1
## Training set -0.00432817

coeftest(ARIMA211)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)

## ar1 -1.0394680 0.2861321 -3.6328 0.0002803 ***
## ar2 -0.1205858 0.1371987 -0.8789 0.3794480
## ma1 0.8770004 0.2547166 3.4430 0.0005752 ***
## drift 0.0122529 0.0064187 1.9090 0.0562681 .
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

The ar2 term is not significant and has the highest p-value (0.3794480), so we drop it from the model in the next step.

### model p = 1, d = 1, q = 1
ARIMA111 <- Arima(training$log_Adj.close, order = c(1, 1, 1), include.drift = TRUE)
summary(ARIMA111)

## Series: training$log_Adj.close
## ARIMA(1,1,1) with drift
##
## Coefficients:
## ar1 ma1 drift
## -0.8314 0.7474 0.0120
## s.e. 0.4017 0.4775 0.0071
##
## sigma^2 = 0.004559: log likelihood = 103.62
## AIC=-199.23 AICc=-198.7 BIC=-189.7
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -9.764215e-05 0.06583236 0.05094031 -0.009183767 1.206142
## MASE ACF1
## Training set 0.9380277 -0.07078971

coeftest(ARIMA111)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.831374 0.401676 -2.0698 0.03847 *
## ma1 0.747417 0.477457 1.5654 0.11749
## drift 0.012026 0.007071 1.7008 0.08898 .
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

checkresiduals(ARIMA111)

[Figure: residuals from ARIMA(1,1,1) with drift: time plot of the residuals, ACF of the residuals by lag, and histogram of the residuals]

##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,1,1) with drift
## Q* = 9.2513, df = 8, p-value = 0.3215
##
## Model df: 2. Total lags used: 10

The ma1 term is not significant and has the highest p-value (0.11749), so we drop it from the model in the next step.

### model p = 1, d = 1, q = 0
ARIMA110 <- Arima(training$log_Adj.close, order = c(1, 1, 0), include.drift = TRUE)
summary(ARIMA110)

## Series: training$log_Adj.close
## ARIMA(1,1,0) with drift
##
## Coefficients:
## ar1 drift
## -0.1705 0.0123
## s.e. 0.1136 0.0063
##
## sigma^2 = 0.004469: log likelihood = 103.91
## AIC=-201.82 AICc=-201.51 BIC=-194.68

##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -9.986018e-05 0.06559776 0.05051347 -0.01029073 1.193776 0.9301678
## ACF1
## Training set -0.002278289

coeftest(ARIMA110)

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.1705052 0.1135769 -1.5012 0.1333
## drift 0.0122775 0.0063211 1.9423 0.0521 .
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

checkresiduals(ARIMA110)

[Figure: residuals from ARIMA(1,1,0) with drift: time plot of the residuals, ACF of the residuals by lag, and histogram of the residuals]

##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,1,0) with drift

## Q* = 7.3749, df = 9, p-value = 0.5981
##
## Model df: 1. Total lags used: 10

The model seems to be a good model, as it has a small AIC value. It is nevertheless recommended to also check ARIMA(0,1,1).

### model p =0, d = 1, q = 1

ARIMA011 <- Arima(training$log_Adj.close, order = c(0, 1, 1), include.drift = TRUE)
summary(ARIMA011)

## Series: training$log_Adj.close
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## -0.1626 0.0123
## s.e. 0.1074 0.0062
##
## sigma^2 = 0.004473: log likelihood = 103.87
## AIC=-201.75 AICc=-201.43 BIC=-194.6
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0001226027 0.06563078 0.05059285 -0.01106166 1.195624 0.9316295
## ACF1
## Training set -0.00824293

coeftest(ARIMA011 )

##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ma1 -0.1626465 0.1073594 -1.5150 0.12978
## drift 0.0123500 0.0062058 1.9901 0.04658 *
## ---
## Signif. codes: 0 ’***’ 0.001 ’**’ 0.01 ’*’ 0.05 ’.’ 0.1 ’ ’ 1

checkresiduals(ARIMA011 )

[Figure: residuals from ARIMA(0,1,1) with drift: time plot of the residuals, ACF of the residuals by lag, and histogram of the residuals]

##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 7.4427, df = 9, p-value = 0.5911
##
## Model df: 1. Total lags used: 10

This model also looks like a good model; it also has a small AIC.

Step XI: AIC and BIC table for the selected good models

AIC_table <- data.frame(cbind(ARIMA010$aic, ARIMA011$aic, ARIMA110$aic, ARIMA111$aic),
                        row.names = "AIC")
names(AIC_table) <- c("ARIMA(0, 1, 0)", "ARIMA(0, 1, 1)", "ARIMA(1, 1, 0)",
                      "ARIMA(1, 1, 1)")
AIC_table

##     ARIMA(0, 1, 0) ARIMA(0, 1, 1) ARIMA(1, 1, 0) ARIMA(1, 1, 1)
## AIC      -201.6024      -201.7459      -201.8237      -199.2316

Repeat the same for BIC

BIC_table <- data.frame(cbind(ARIMA010$bic, ARIMA011$bic, ARIMA110$bic, ARIMA111$bic),
row.names = "BIC")
names(BIC_table) <- c("ARIMA(0, 1, 0)", "ARIMA(0, 1, 1)", "ARIMA(1, 1, 0)",
"ARIMA(1, 1, 1)")
BIC_table

##     ARIMA(0, 1, 0) ARIMA(0, 1, 1) ARIMA(1, 1, 0) ARIMA(1, 1, 1)
## BIC      -196.8384      -194.5998      -194.6776      -189.7035
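
Both criteria are simple functions of the log likelihood, so the table entries can be checked against the model summaries. The sketch below does this for two of the models using the rounded log likelihoods printed earlier. The helper `info_criteria` is hypothetical, and `n = 80` is an assumption: the 81 training observations minus the one lost to differencing. Results agree with the tables only up to the rounding of the log likelihood.

```r
# Information criteria from the log likelihood:
#   AIC = -2 * logLik + 2 * k
#   BIC = -2 * logLik + log(n) * k
# k counts the estimated coefficients plus the innovation variance.
info_criteria <- function(loglik, k, n) {
  c(AIC = -2 * loglik + 2 * k,
    BIC = -2 * loglik + log(n) * k)
}

# n = 80: assumed number of observations left after differencing the
# 81 training observations once
ic_010 <- info_criteria(loglik = 102.80, k = 2, n = 80)  # drift + variance
ic_110 <- info_criteria(loglik = 103.91, k = 3, n = 80)  # ar1 + drift + variance

round(rbind("ARIMA(0, 1, 0)" = ic_010, "ARIMA(1, 1, 0)" = ic_110), 2)
```

BIC penalizes each extra parameter by log(80), roughly 4.38, instead of 2, which is why it favors the smaller ARIMA(0,1,0) model while the AIC values of the three smallest models are nearly identical.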

2. R function for the transformation

The ARIMA models above are fitted to the log price, so their forecasts are on the log scale. We want to compare the true closing prices of the testing set with the forecast. First, we want to create a plot where we compare the forecasted values with confidence intervals to the true test values. To make this easier, we give you a shortcut: a specially made function that takes a log-price forecast from the forecast package and converts it to a regular price forecast.

log2price <- function(a.forecast){

### back-transform the point forecasts from the log scale
a.forecast$mean <- exp(a.forecast$mean)
### back-transform the upper limits of the 80% and 95% intervals
a.forecast$upper <- exp(a.forecast$upper)
### back-transform the lower limits of the 80% and 95% intervals
a.forecast$lower <- exp(a.forecast$lower)
### back-transform the original (logged) time series
a.forecast$x <- exp(a.forecast$x)
return(a.forecast)
}

First, we use the function “forecast”, which gives us the forecast for four periods ahead:

### model p = 0, d = 1, q = 0
forecast010 <- forecast(ARIMA010, h = 4)
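
To see what the forecast and its back-transformation amount to for a random walk with drift, the calculation can be sketched by hand in base R. The example below uses a simulated log price, and all names are illustrative. The intervals ignore the uncertainty in the estimated drift and use a simple variance estimate, so they may differ slightly from what `forecast` reports.

```r
set.seed(4)

# Simulated log price following a random walk with drift (illustrative only)
log_price <- log(50) + cumsum(rnorm(81, mean = 0.012, sd = 0.067))

step  <- diff(log_price)
drift <- mean(step)            # estimated drift per period
sigma <- sd(step)              # standard deviation of the innovations
h     <- 1:4                   # forecast horizons

# Forecast on the log scale: last value plus h times the drift;
# the interval widens with sqrt(h)
log_mean  <- log_price[length(log_price)] + h * drift
log_lower <- log_mean - qnorm(0.975) * sigma * sqrt(h)
log_upper <- log_mean + qnorm(0.975) * sigma * sqrt(h)

# Back-transform to prices, as log2price() does for a forecast object
price_forecast <- data.frame(h     = h,
                             lower = exp(log_lower),
                             point = exp(log_mean),
                             upper = exp(log_upper))
price_forecast
```

After exponentiating, the point forecast grows by the constant factor exp(drift) per period, and the interval is no longer symmetric around it.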

3. Plotting the True Forecast

Then I will use the “log2price” function to turn the logged forecast into a regular forecast. Then we can
create the plot.

priceforecast010 <- log2price(forecast010)

plot(priceforecast010)
lines(82:85, testing$Adj.Close, type = "o", col = "dark red", lwd = 2)

[Figure: forecasts from ARIMA(0,1,0) with drift on the price scale, with the testing values overlaid in dark red]

I will summarize the forecasts for the last 4 points with the testing values and their difference in a table

### summarize the calculations in a table

ARIMA010_forecast <- exp(forecast010$mean)
ARIMA010_forecast_error <- testing$Adj.Close - ARIMA010_forecast
ARIMA010_table <- data.frame(cbind(testing$Adj.Close, ARIMA010_forecast, ARIMA010_forecast_error))
ARIMA010_table

##   testing.Adj.Close ARIMA010_forecast ARIMA010_forecast_error
## 1             94.66          97.29982              -2.6398176
## 2            101.45          98.46339               2.9866055
## 3             88.73          99.64088             -10.9108730
## 4             99.87         100.83244              -0.9624386

Calculate RMSE for ARIMA(0, 1, 0) model using the above table.

RMSE_ARIMA010<-rmse(ARIMA010_table$testing.Adj.Close,ARIMA010_table$ARIMA010_forecast)
RMSE_ARIMA010

## [1] 5.82799

Repeat the same procedure for the other ARIMA models.

Compare the RMSE of all the estimated models. The model with the lowest RMSE forecasts the testing set best.

4. Test for ARCH effects

After selecting the best model based on RMSE, AIC and BIC, you can test the model for ARCH effects
among the residuals. To do that you will need to install the package “FinTS”

##install.packages("FinTS")

Once the package is installed, you do not need to install it again, and you do not need to call library() before running the command below, which performs the ARCH test:

FinTS::ArchTest(ARIMA010$residuals)

##
## ARCH LM-test; Null hypothesis: no ARCH effects
##
## data: ARIMA010$residuals
## Chi-squared = 17.831, df = 12, p-value = 0.1209

Note that there are two colons (“::”) between FinTS and ArchTest. Make sure that you write the command exactly as above; with only one “:” you will get an error.
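
The idea behind the ARCH LM test can also be sketched in base R: regress the squared residuals on their own lags and compare the number of observations times R-squared with a chi-squared distribution. The sketch below uses 12 lags, matching the degrees of freedom in the output above, and simulated white noise in place of the model residuals. It illustrates the test's logic under these assumptions and is not meant to replace `FinTS::ArchTest`.

```r
set.seed(5)

# Stand-in for model residuals (simulated white noise, illustrative only)
res  <- rnorm(80, sd = 0.067)
lags <- 12

# Regress the squared residual on its first 12 lags
sq      <- embed(res^2, lags + 1)   # column 1: current value, columns 2 to 13: lags
aux_fit <- lm(sq[, 1] ~ sq[, -1])

# LM statistic: number of usable observations times R-squared,
# compared with a chi-squared distribution with `lags` degrees of freedom
lm_stat <- nrow(sq) * summary(aux_fit)$r.squared
p_value <- pchisq(lm_stat, df = lags, lower.tail = FALSE)

c(statistic = lm_stat, df = lags, p.value = p_value)
```

A large p-value, as in the output above, means that the hypothesis of no ARCH effects is not rejected.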
