Data Analysis Activity 3
STAT 4534
Dr. Marco A. R. Ferreira
17th Nov, 2022
Task 1
The time series is not stationary: its level changes over time, with a clear upward trend.
A log transformation followed by first differencing is appropriate for this data set.
The log transformation helps stabilize the variability of the series, and differencing the
logged series removes the upward trend.
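The effect of taking logs and then differencing can be sketched on a synthetic series. This is only an illustration under stated assumptions: the data are simulated (not the industrial production series), and the names `tt`, `x`, `log.x`, and `diff.log.x` are hypothetical.

```r
# Synthetic quarterly series with exponential growth (illustrative data only)
set.seed(1)
tt <- 1:160
x <- ts(100 * exp(0.005 * tt + rnorm(160, sd = 0.01)),
        start = 1980, frequency = 4)

log.x <- log(x)            # the log turns exponential growth into a linear trend
diff.log.x <- diff(log.x)  # the first difference removes that linear trend

# Slope of a fitted linear trend before and after differencing
slope.before <- unname(coef(lm(log.x ~ time(log.x)))[2])
slope.after  <- unname(coef(lm(diff.log.x ~ time(diff.log.x)))[2])
c(before = slope.before, after = slope.after)
```

In this sketch the fitted slope of the logged series is clearly positive, while the slope of the differenced log series is close to zero, which is the behavior expected of a detrended series.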
Task 2
Plotting the sample ACF and sample PACF of the differenced time series
The preliminary candidate orders are p = 1 and q = 1. The PACF cuts off after lag 1, which
suggests an AR(1) model for the differenced series; an MA(1) component also seems plausible.
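The link between a PACF cutoff at lag 1 and an AR(1) model can be checked against the theoretical ACF and PACF. The sketch below uses `ARMAacf` from base R with a hypothetical coefficient `phi = 0.5`; that value is an assumption for illustration and is not estimated from the data.

```r
phi <- 0.5  # hypothetical AR(1) coefficient, chosen only for illustration

# Theoretical ACF (lags 0 to 5) and PACF (lags 1 to 5) of an AR(1) process
theo.acf  <- ARMAacf(ar = phi, lag.max = 5)
theo.pacf <- ARMAacf(ar = phi, lag.max = 5, pacf = TRUE)

round(theo.acf, 3)   # decays geometrically as phi^k
round(theo.pacf, 3)  # equals phi at lag 1 and zero afterwards
```

A sample PACF with one clear spike at lag 1 and a sample ACF that tails off would match this pattern.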
Task 3
BIC selects ARIMA(1,1,0) as the best model, as we suspected in the previous task. This gives us
confidence in using ARIMA(1,1,0).
AIC favors ARIMA(1,1,1), although ARIMA(1,1,0) is a very close second.
The AIC and BIC results do not agree exactly, but both are informative for
choosing a model.
Based on these results, I choose ARIMA(1,1,0) as the best fit for the data.
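The disagreement between the two criteria follows from their penalty terms. As a sketch in the usual textbook form (software packages may scale or count parameters slightly differently), with $\hat{L}$ the maximized likelihood, $k$ the number of estimated parameters, and $n$ the number of observations:

```latex
\begin{aligned}
\mathrm{AIC} &= -2\log\hat{L} + 2k, \\
\mathrm{BIC} &= -2\log\hat{L} + k\log n .
\end{aligned}
```

Whenever $\log n > 2$, BIC charges more per parameter than AIC. Assuming the series covers 1980 through 2019 quarterly, $n$ is about 160 and $\log n \approx 5.1$, so BIC tends to prefer the smaller ARIMA(1,1,0) while AIC can still favor ARIMA(1,1,1). Writing $x_t$ for the original series and $B$ for the backshift operator, the chosen model is

```latex
(1 - \phi B)(1 - B)\log x_t = w_t, \qquad w_t \sim \text{white noise with variance } \sigma_w^2 .
```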
The assumption of normality of the residuals is reasonable, as most values follow the reference
line in the Q-Q plot. However, some values depart from the line and produce an S-shaped curve,
which may reflect the nature of the data (industrial production).
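An S-shaped Q-Q plot typically means that the tails of the residual distribution differ from those of a normal distribution. The sketch below illustrates this with simulated heavy-tailed values; the sample `heavy` is hypothetical and stands in for residuals, it is not taken from the fitted model.

```r
set.seed(7)
heavy <- rt(160, df = 3)  # hypothetical heavy-tailed "residuals"

# Theoretical normal quantiles (x) against sample quantiles (y), without plotting
qq <- qqnorm(heavy, plot.it = FALSE)

# To view the plot with its reference line, run:
#   qqnorm(heavy); qqline(heavy)

# A formal check of normality to accompany the visual one
sw <- shapiro.test(heavy)
sw$p.value
```

With real residuals, `heavy` would be replaced by the residuals of the fitted model, for example `residuals(fit)` for an object returned by `arima`.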
The p-values of the Ljung-Box statistic lie on or just above the significance line, and none
falls well below it, so we can say that the residuals appear to be uncorrelated, though only
marginally.
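The Ljung-Box check can also be run numerically with `Box.test` from base R. The sketch below uses a simulated ARIMA(1,1,0) series as a stand-in for the data; the coefficient 0.4, the series length, and the lag of 10 are assumptions for illustration.

```r
set.seed(42)
# Simulated ARIMA(1,1,0) series (illustrative data only)
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.4), n = 160)

fit <- arima(y, order = c(1, 1, 0))

# fitdf = 1 accounts for the one estimated AR coefficient,
# so the test has lag - fitdf = 9 degrees of freedom
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
lb$p.value
```

A p-value above 0.05 means the test does not reject the hypothesis that the residuals are uncorrelated up to the chosen lag.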
Plotting the forecast for industrial production time series 12 quarters ahead
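Because the model is fitted to the logged series, the forecasts and their limits are on the log scale and have to be exponentiated to return to the original units. As a sketch, assuming approximately normal forecast errors on the log scale, with $\hat{y}_{n+h}$ the $h$-step forecast of $\log x_{n+h}$ and $\mathrm{se}_h$ its standard error, an approximate 95% prediction interval for $x_{n+h}$ is

```latex
\left[\, \exp\!\left(\hat{y}_{n+h} - 1.96\,\mathrm{se}_h\right),\;
        \exp\!\left(\hat{y}_{n+h} + 1.96\,\mathrm{se}_h\right) \right].
```

Under the same assumption, $\exp(\hat{y}_{n+h})$ estimates the median of $x_{n+h}$ rather than its mean, and the interval is not symmetric around it on the original scale.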
#Jay Kapoor
#Stat 4534 Time Series Analysis
#DAA 3
library(readr)
library(astsa)
library(polynom)
#Task 1
le <- read_csv("D:/STAT 4534 time series/Analysis Activity 3/IPB50001SQ-1980-2019.csv")
industrial.production <- ts(le[,2],start=1980,frequency=4)
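tsplot(industrial.production)
# Log-transformed series and its first difference, used in the tasks below
log.ip <- log(industrial.production)
diff.log.ip <- diff(log.ip)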
#Task2
acf2(diff.log.ip,max.lag=40)
# Task 3
n = length(log.ip)
max.p = 5
max.d = 1
max.q = 5
max.P = 0
max.D = 0
max.Q = 0
BIC.array =array(NA,dim=c(max.p+1,max.d+1,max.q+1,max.P+1,max.D+1,max.Q+1))
AIC.array =array(NA,dim=c(max.p+1,max.d+1,max.q+1,max.P+1,max.D+1,max.Q+1))
best.bic <- 1e8
x.ts = log.ip
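for (p in 0:max.p) for(d in 0:max.d) for(q in 0:max.q)
for (P in 0:max.P) for(D in 0:max.D) for(Q in 0:max.Q)
{
# This is a modification of a function originally from the book:
# Cowpertwait, P.S.P., Metcalfe, A.V. (2009), Introductory Time
# Series with R, Springer.
# Modified by M.A.R. Ferreira (2016, 2020).
# NOTE: the fitting step is missing from this listing; the lines marked
# "reconstructed" are an assumption, not the original code.
cat("p=",p,", d=",d,", q=",q,", P=",P,", D=",D,", Q=",Q,"\n")
fit <- tryCatch(arima(x.ts, order=c(p,d,q), seasonal=list(order=c(P,D,Q),
  period=frequency(x.ts)), method="ML"), error=function(e) NULL) # reconstructed
if (is.null(fit)) next # reconstructed: skip models that fail to fit
k <- length(fit$coef) + 1 # reconstructed: coefficients plus innovation variance
idx <- cbind(p,d,q,P,D,Q) + 1 # reconstructed: position of this model in the arrays
AIC.array[idx] <- -2*fit$loglik + 2*k # reconstructed
BIC.array[idx] <- -2*fit$loglik + log(n)*k # reconstructed
if (BIC.array[idx] < best.bic) { best.bic <- BIC.array[idx]; best.fit <- fit; best.model <- c(p,d,q,P,D,Q) }
}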
best.bic
best.fit
best.model
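#Task4
# Fit the candidate models compared below (orders taken from the object names)
arima211 <- arima(log.ip, order=c(2,1,1))
arima111 <- arima(log.ip, order=c(1,1,1))
arima110 <- arima(log.ip, order=c(1,1,0))
AIC(arima211)
AIC(arima111)
AIC(arima110)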
#Task 5
par(mfrow=c(1,1))
ip.pred <- sarima.for(log.ip,12,1,1,0)
ip.pred$pred
ip.pred$se
exp(ip.pred$pred-1.96* ip.pred$se)
exp(ip.pred$pred+1.96* ip.pred$se)