
Skripsi-R - Latex: 1 Libraries and Import Data

This R Markdown document discusses time series modeling and forecasting of Indonesia's GDP data from 1970 to 2013. Several ARIMA models are fitted to the Box-Cox transformed GDP data and their parameters are estimated. The best fitting model is identified to be ARIMA(0,1,1) based on the lowest AICc value. Diagnostic checks including the Ljung-Box test indicate that the residuals of this best model are not autocorrelated. Various R packages are used for time series analysis, regression, and output formatting.

Uploaded by

Kurnia Wanto
Skripsi-R_LaTeX

Kurnia Wanto
08/08/2015
This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF,
and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

1 Libraries and Import Data

# ------------------------------------ Libraries ------------------------------------
library(forecast) # --- BoxCox Arima auto.arima function is in forecast package
## Loading required package: zoo
##
## Attaching package: 'zoo'
##
## The following objects are masked from 'package:base':
##
##     as.Date, as.Date.numeric
##
## Loading required package: timeDate
## This is forecast 5.9

library(MASS)  # --- boxcox function is in MASS package
library(FitAR) # --- LjungBoxTest function is in FitAR package
## Loading required package: lattice
## Loading required package: leaps
## Loading required package: ltsa
## Loading required package: bestglm
##
## Attaching package: 'FitAR'
##
## The following object is masked from 'package:forecast':
##
##     BoxCox

library(tsoutliers) # --- tso function is in tsoutliers package
library(lmtest)     # --- coeftest function is in lmtest package
library(stargazer)  # --- stargazer function is in stargazer package
##
## Please cite as:
##
## Hlavac, Marek (2015). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2. http://CRAN.R-project.org/package=stargazer

library(TSA) # --- arimax function is in TSA package
## Loading required package: locfit
## locfit 1.5-9.1   2013-03-22
## Loading required package: mgcv
## Loading required package: nlme
##
## Attaching package: 'nlme'
##
## The following object is masked from 'package:forecast':
##
##     getResponse
##
## This is mgcv 1.8-7. For overview type 'help("mgcv-package")'.
## Loading required package: tseries
##
## Attaching package: 'TSA'
##
## The following object is masked from 'package:forecast':
##
##     fitted.Arima
##
## The following objects are masked from 'package:timeDate':
##
##     kurtosis, skewness
##
## The following objects are masked from 'package:stats':
##
##     acf, arima
##
## The following object is masked from 'package:utils':
##
##     tar

# --------------------------------- 0_Import_Data.R ---------------------------------
options(width=80)
raw_data <- read.csv("/media/kurnia/Brain/Skripsi/DATA/World\ Bank/inflation.csv")
PDB_ID <- ts(raw_data$GDP, start=1970, end=2013)
#PDB_ID <- PDB_ID/1000000000
PDB_ID
## Time Series:
## Start = 1970
## End = 2013
## Frequency = 1
##  [1]   9656740014   9849117953  11605084560  17171181163  27227710999
##  [6]  32147953008  39328674730  48396143465  54298158340  55122620334
## [11]  78013206038  92473878832  94715163814  85369201879  87612439197
## [16]  87338874330  80060657612  75929617715  88787623310 101000000000
## [21] 114000000000 128000000000 139000000000 158000000000 177000000000
## [26] 202000000000 227000000000 216000000000  95445548017 140000000000
## [31] 165000000000 160000000000 196000000000 235000000000 257000000000
## [36] 286000000000 365000000000 432000000000 510000000000 540000000000
## [41] 709000000000 846000000000 877000000000 868000000000
# stat.desc(rwpdb)
plot(PDB_ID, xlab="Waktu", ylab="US Dollar", col="blue", type="l",
main="Plot PDB Indonesia 1970-2013")
points(PDB_ID, cex = .5, col = "red")
abline(v=1998, col=1, lty=2)
text(1998, PDB_ID[41], "1998\n(t=29)", pos=2)

[Figure: "Plot PDB Indonesia 1970-2013". x-axis: Waktu (time), 1970 to 2010; y-axis: US Dollar, 0e+00 to 8e+11; dashed vertical line at 1998 (t=29).]

2 ARIMA Modeling

2.1 Plot of Nt

# ----------------------------------- 1-Plot_Nt.R -----------------------------------
# Creating Nt
Nt <- ts(PDB_ID[1:28], start=1970, end=1997)
# Creating Layout 1,1;2,3
m <- rbind(c(1, 1), c(2, 3))
layout(m)
par(mar = c(3, 3, 1, 1))
plot(Nt, xlab="Waktu", ylab="US Dollar", col="blue", type="p",
main="Plot PDB Indonesia 1970-1997 (Nt)")
points(Nt, cex = .5, col = "red")
# Trend Linear
t <- 1970:1997
trend_PDB <- glm(Nt~t)


abline(trend_PDB, col="dark blue", lwd=2)
# ACF
acf(Nt, 20, xlim=c(1,20))
text(10,0.8, "ACF")

[Figure: top panel, "Plot PDB Indonesia 1970-1997 (Nt)", US Dollar against Waktu (time) with a linear trend line; bottom panels, ACF and PACF of series Nt for lags up to 20.]

# PACF
pacf(Nt, 20, ylim=c(-.4,1))
text(10,0.8, "PACF")
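The sample autocorrelations that `acf()` plots can also be computed by hand, which makes explicit what the function reports. The following is only an illustrative sketch: `sample_acf` is a hypothetical helper that is not part of the original script, and the 28 values are copied from the printed `PDB_ID` output for 1970-1997, so small differences from the original data file are possible. `stats::acf` is called explicitly because the TSA package masks `acf`.

```r
# Nt values for 1970-1997, copied from the printed PDB_ID output
# (the original CSV is not available here, so treat them as approximate).
Nt <- ts(c(9656740014, 9849117953, 11605084560, 17171181163, 27227710999,
           32147953008, 39328674730, 48396143465, 54298158340, 55122620334,
           78013206038, 92473878832, 94715163814, 85369201879, 87612439197,
           87338874330, 80060657612, 75929617715, 88787623310, 101000000000,
           114000000000, 128000000000, 139000000000, 158000000000, 177000000000,
           202000000000, 227000000000, 216000000000),
         start = 1970)

# Hypothetical helper: lag-k sample autocorrelation,
# r_k = sum((x_t - xbar) * (x_{t+k} - xbar)) / sum((x_t - xbar)^2)
sample_acf <- function(x, k) {
  x <- as.numeric(x)
  n <- length(x)
  d <- x - mean(x)
  sum(d[1:(n - k)] * d[(1 + k):n]) / sum(d^2)
}

manual  <- sapply(1:5, function(k) sample_acf(Nt, k))
# stats::acf returns lag 0 first, so drop the first element
builtin <- as.numeric(stats::acf(Nt, lag.max = 5, plot = FALSE)$acf)[-1]
print(round(cbind(lag = 1:5, manual = manual, builtin = builtin), 3))
```

The two columns should agree, since `stats::acf` uses the same estimator with the full-sample mean and variance.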


# Show the ACF & PACF value
acf(Nt, 27, plot=FALSE)
##
## Autocorrelations of series 'Nt', by lag
##
##      1      2      3      4      5      6      7      8      9     10     11
##  0.872  0.704  0.552  0.419  0.303  0.206  0.123  0.058  0.011 -0.021 -0.020
##     12     13     14     15     16     17     18     19     20     21     22
## -0.019 -0.031 -0.058 -0.095 -0.157 -0.233 -0.300 -0.336 -0.369 -0.390 -0.392
##     23     24     25     26     27
## -0.381 -0.350 -0.291 -0.206 -0.099
pacf(Nt, 27, plot=FALSE)
##
## Partial autocorrelations of series 'Nt', by lag
##
##      1      2      3      4      5      6      7      8      9     10     11
##  0.872 -0.231 -0.012 -0.034 -0.035 -0.022 -0.036 -0.001 -0.011  0.006  0.082
##     12     13     14     15     16     17     18     19     20     21     22
## -0.051 -0.055 -0.068 -0.056 -0.151 -0.104 -0.043  0.019 -0.105 -0.019 -0.023
##     23     24     25     26     27
## -0.040  0.010  0.062  0.080  0.113

2.2 Box-Cox Transformation

# --------------------------------- 2-BoxCox_Trans.R --------------------------------
# t1 <- 1:length(Nt)
t1 <- 1:length(PDB_ID)
# --- Search for optimal lambda
par(mar = c(2, 4, 0, 1))
MASS::boxcox(lm(PDB_ID~t1), lambda= seq(-1,1,1/10),
ylab = "log-Likelihood")

[Figure: Box-Cox profile log-likelihood against lambda over -1.0 to 1.0, with the 95% confidence level marked.]

lambda.model <- forecast::BoxCox.lambda(PDB_ID)
# --- Box-Cox Transformation
Nt_box <- forecast::BoxCox(Nt, lambda = lambda.model)
# --- It is ambiguous whether to use the lambda for Nt (0.5) or for PDB_ID (0.2)
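For reference, the Box-Cox transformation maps a positive value x to (x^lambda - 1) / lambda when lambda is not zero and to log(x) when lambda is zero. The sketch below writes this out in base R together with its inverse. The helpers `box_cox` and `inv_box_cox` are hypothetical names introduced for illustration and are not the forecast package's implementation; the lambda of 0.2 is assumed here only because the comment in the script mentions roughly that value for PDB_ID.

```r
# Hypothetical helpers for positive data; not the forecast package's code.
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}
inv_box_cox <- function(y, lambda) {
  if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
}

# Three values taken from the printed PDB_ID output (1970, 1998, 2013).
x <- c(9656740014, 95445548017, 868000000000)
lambda <- 0.2  # assumed for illustration only

y <- box_cox(x, lambda)
print(y)
# The inverse should recover the original values up to floating-point error.
print(all.equal(inv_box_cox(y, lambda), x))
```

Forecasts produced on the transformed scale have to be passed through the inverse before they can be read in US dollars; `forecast()` does this automatically when the model was fitted with a `lambda` argument.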

2.3 Plot of Wt

# ----------------------------------- 3-Plot_Wt.R -----------------------------------

# --- Generating Wt
Wt <- as.numeric(diff(Nt_box))
# --- Creating Layout 1,1;2,3
m <- rbind(c(1, 1), c(2, 3))
layout(m)
par(mar = c(3, 3, 1, 1))
plot(Wt, xlab="Waktu", ylab="US Dollar",
col="blue", type="o",
main="Plot Wt",
#ylim=c(mean(Wt)-sd(Wt), mean(Wt)+sd(Wt))
)
points(Wt, cex = .5, col = "red")
# --- Trend Linear
t <- 1:27
trend_PDB <- glm(Wt~t)
#abline(trend_PDB, col="dark blue", lwd=2)
abline(h=mean(Wt))

# --- ACF
acf(Wt, 20, xlim=c(1,20))
text(10,0.2, "ACF")
# --- PACF
pacf(Wt, 20)
text(10,0.2, "PACF")

[Figure: top panel, "Plot Wt" against Waktu (time) with a horizontal line at the mean; bottom panels, ACF and PACF of series Wt for lags up to 20.]

# --- Reset the par


par(mfrow=c(1,1))


2.4 ARIMA Model Identification

# ----------------------------------- 4-Id_Model.R ----------------------------------
# Save ARIMA model
Nt_arima <- auto.arima(Nt, max.d=2, seasonal=FALSE, lambda=lambda.model,
stepwise=TRUE, trace=TRUE, max.p=5, max.q=5)
##
##  ARIMA(2,1,2) with drift         : Inf
##  ARIMA(0,1,0) with drift         : 227.5713
##  ARIMA(1,1,0) with drift         : 225.634
##  ARIMA(0,1,1) with drift         : 224.9311
##  ARIMA(1,1,1) with drift         : 227.4983
##  ARIMA(0,1,2) with drift         : 227.3469
##  ARIMA(1,1,2) with drift         : Inf
##  ARIMA(0,1,1)                    : 240.4512
##
##  Best model: ARIMA(0,1,1) with drift

#summary(Nt_arima)

2.5 ARIMA Parameter Estimation

# ----------------------------------- 5-Est_Par.R -----------------------------------
# --- Creating All Possible Models
model_010 <- Arima(Nt, order = c(0,1,0), lambda = lambda.model, include.drift=FALSE)
model_110 <- Arima(Nt, order = c(1,1,0), lambda = lambda.model, include.drift=FALSE)
model_011 <- Arima(Nt, order = c(0,1,1), lambda = lambda.model, include.drift=FALSE)
model_111 <- Arima(Nt, order = c(1,1,1), lambda = lambda.model, include.drift=FALSE)
model_010d <- Arima(Nt, order = c(0,1,0), lambda = lambda.model, include.drift=TRUE)
model_110d <- Arima(Nt, order = c(1,1,0), lambda = lambda.model, include.drift=TRUE)
model_011d <- Arima(Nt, order = c(0,1,1), lambda = lambda.model, include.drift=TRUE)
model_111d <- Arima(Nt, order = c(1,1,1), lambda = lambda.model, include.drift=TRUE)

# --- Function for rounded value


# AIC
raic <- function (model) {
round(model$aic,2)
}
# AICc
raicc <- function (model) {
round(model$aicc,2)
}
# BIC
rbic <- function (model) {
round(model$bic,2)
}
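The AICc and BIC values that these helpers round can be related to the AIC by the usual small-sample formulas, AICc = AIC + 2k(k+1)/(n-k-1) and BIC = AIC + k(log(n) - 2), where k counts the estimated parameters including the innovation variance and n is the number of observations left after differencing. The sketch below checks this on a simulated series with base R's `stats::arima`. The data are synthetic, not the GDP series, and the variable names are illustrative assumptions.

```r
set.seed(1)
# Simulated stand-in series (not the GDP data): integrated MA(1) noise.
y <- cumsum(as.numeric(arima.sim(list(ma = 0.4), n = 40)))

fit <- stats::arima(y, order = c(0, 1, 1))

k <- length(fit$coef) + 1   # estimated coefficients plus the innovation variance
n <- fit$nobs               # observations actually used after differencing

aicc <- fit$aic + 2 * k * (k + 1) / (n - k - 1)
bic  <- fit$aic + k * (log(n) - 2)

print(round(c(AIC = fit$aic, AICc = aicc, BIC = bic), 2))
```

With only 27 differenced observations, as for Nt, the AICc correction is not negligible, which is one reason to prefer it over the plain AIC when ranking the candidate models.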

# --- Calculate parameters P-Value


PValue <- function (model) {
(1-pnorm(abs(model$coef)/sqrt(diag(model$var.coef))))*2
}
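The `PValue` helper is a two-sided z-test: each coefficient is divided by its standard error, taken from the diagonal of `var.coef`, and the result is compared with the standard normal distribution. A minimal sketch of the same calculation on a simulated series follows. It uses `stats::arima` rather than `forecast::Arima` so that it runs without extra packages, and the data and variable names are illustrative assumptions.

```r
set.seed(42)
# Simulated stand-in series (not the GDP data).
y <- cumsum(as.numeric(arima.sim(list(ma = 0.3), n = 40)))
fit <- stats::arima(y, order = c(0, 1, 1))

# z statistic: estimate divided by its standard error
z <- fit$coef / sqrt(diag(fit$var.coef))

# Two-sided p-value; 2 * pnorm(-|z|) is algebraically the same as
# 2 * (1 - pnorm(|z|)) but loses less precision for large |z|.
p_two_sided <- 2 * pnorm(-abs(z))

print(cbind(estimate = fit$coef, z = z, p = p_two_sided))
```

The same numbers are what `lmtest::coeftest` reports as "z value" and "Pr(>|z|)" later in the document.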

# --- Estimating parameters for all possible models:
# Helper: print the coefficients of a fitted model and their p-values
EstimasiParamater <- function (model) {
cat("Parameter:\n")
print(model$coef)
cat("\nP-Values:\n")
print(PValue(model))
}
# --- ARIMA(0,1,0)
EstimasiParamater(model_010)
## Parameter:
## numeric(0)
##
## P-Values:
## numeric(0)
# --- ARIMA(0,1,0) with drift
EstimasiParamater(model_010d)
## Parameter:
##    drift
## 16.28479
##
## P-Values:
##        drift
## 2.552839e-06
# --- ARIMA(1,1,0)
EstimasiParamater(model_110)
## Parameter:
##       ar1
## 0.6717558
##
## P-Values:
##          ar1
## 4.528871e-07
# --- ARIMA(1,1,0) with drift
EstimasiParamater(model_110d)
## Parameter:
##        ar1      drift
##  0.4123133 15.2953648
##
## P-Values:
##         ar1       drift
## 0.025444345 0.003901791
# --- ARIMA(0,1,1)
EstimasiParamater(model_011)
## Parameter:
##       ma1
## 0.5568536
##
## P-Values:
##          ma1
## 1.081214e-06
# --- ARIMA(0,1,1) with drift
EstimasiParamater(model_011d)
## Parameter:
##        ma1      drift
##  0.4436015 15.6496435
##
## P-Values:
##          ma1        drift
## 0.0029888904 0.0004494747
# --- ARIMA(1,1,1)
EstimasiParamater(model_111)
## Parameter:
##       ar1       ma1
## 0.6093421 0.1197083
##
## P-Values:
##        ar1        ma1
## 0.01929869 0.76162787
# --- ARIMA(1,1,1) with drift
EstimasiParamater(model_111d)
## Parameter:
##        ar1        ma1      drift
##  0.1523234  0.3491976 15.4399096
##
## P-Values:
##         ar1         ma1       drift
## 0.641274708 0.209684262 0.001625286

2.6 ARIMA Model Diagnostics

# ----------------------------------- 6-Diag_Mod.R ----------------------------------
# --- Ljung-Box Test for Nt
# --- OR can use stats::Box.test, lag = min(10,n/5) <-- Rob J. Hyndman
Box.test(Nt_arima$residuals, lag = round(length(Nt)/5,0),
type = "Ljung-Box", fitdf = 1)
##
## Box-Ljung test
##
## data: Nt_arima$residuals
## X-squared = 4.6519, df = 5, p-value = 0.4598
# --- Kolmogorov-Smirnov Test
ks.test(Nt_arima$residuals, "pnorm",
mean(Nt_arima$residuals),
sd(Nt_arima$residuals))
##
## One-sample Kolmogorov-Smirnov test
##
## data: Nt_arima$residuals
## D = 0.12779, p-value = 0.7032
## alternative hypothesis: two-sided
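To make the Ljung-Box output easier to read, the statistic can be reproduced by hand: Q = n(n+2) times the sum over lags k = 1..h of r_k^2 / (n-k), compared with a chi-squared distribution on h - fitdf degrees of freedom. With 28 observations the lag rule used in the script gives h = round(28/5) = 6, and `fitdf = 1` leaves the 5 degrees of freedom shown in the output. The sketch below uses simulated residuals, not the thesis residuals, and the variable names are illustrative assumptions.

```r
set.seed(123)
e <- rnorm(28)      # simulated stand-in for the model residuals
n <- length(e)
h <- round(n / 5)   # same lag rule as in the script: round(28 / 5) = 6
fitdf <- 1          # one estimated MA coefficient, as in ARIMA(0,1,1)

# Sample autocorrelations at lags 1..h (stats::acf returns lag 0 first)
r <- as.numeric(stats::acf(e, lag.max = h, plot = FALSE)$acf)[-1]

# Ljung-Box statistic and its p-value
Q <- n * (n + 2) * sum(r^2 / (n - seq_len(h)))
p_value <- pchisq(Q, df = h - fitdf, lower.tail = FALSE)

bt <- Box.test(e, lag = h, type = "Ljung-Box", fitdf = fitdf)
print(c(manual_Q = Q, box_test_Q = unname(bt$statistic),
        manual_p = p_value, box_test_p = bt$p.value))
```

A large p-value, as in the output for `Nt_arima`, means the test finds no evidence of remaining autocorrelation in the residuals.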

3 Intervention Analysis

3.1 Identification of the Intervention Order

# ----------------------------------- 7-Id_Intv.R -----------------------------------
# --- Box-Cox Transformation on PDB_ID, not sure using lambda for Nt or PDB (?)
PDB_box <- forecast::BoxCox(PDB_ID, lambda.model)
Nt_forecast <- forecast(Nt_arima, h = 16)
# --- h-step forecast for Nt
Nt_forecast
##      Point Forecast        Lo 80        Hi 80        Lo 95        Hi 95
## 1998   220497997785 197534077911 2.455465e+11 186176054195 2.596972e+11
## 1999   239159144920 197466158998 2.876040e+11 177876939931 3.162485e+11
## 2000   259061071731 202699960086 3.272955e+11 177111862586 3.688236e+11
## 2001   280264502550 210287262821 3.677124e+11 179363315820 4.221130e+11
## 2002   302832102148 219460621107 4.097694e+11 183438458883 4.774870e+11
## 2003   326828505812 229908261893 4.539303e+11 188837142936 5.356283e+11
## 2004   352320349432 241481884597 5.004994e+11 195306739202 5.969774e+11
## 2005   379376299558 254107926700 5.497112e+11 202707108543 6.618670e+11
## 2006   408067083449 267752449080 6.017649e+11 210957219540 7.305761e+11
## 2007   438465519120 282404821614 6.568411e+11 220010366901 8.033544e+11
## 2008   470646545360 298069266141 7.151101e+11 229841331962 8.804357e+11
## 2009   504687251748 314760107458 7.767370e+11 240439173807 9.620444e+11
## 2010   540666908659 332498938821 8.418844e+11 251802924955 1.048401e+12
## 2011   578666997253 351312847171 9.107142e+11 263938893390 1.139723e+12
## 2012   618771239451 371233258510 9.833889e+11 276858903815 1.236229e+12
## 2013   661065627910 392295163134 1.060072e+12 290579113715 1.338139e+12


# --- Identify the intervention order from a plot of the model residuals
# NOTE: elements 1:28 are model residuals on the Box-Cox scale, while elements
# 29:44 are forecast errors in US dollars, so the two segments are on different scales.
error_idintv <- rep(0,44)
error_idintv[1:28] <- model_011d$residuals
error_idintv[29:44] <- PDB_ID[29:44] - Nt_forecast$mean
plot(error_idintv, type="h", xlab="Waktu (T)", ylab = "Residual", xaxt = "n",
#ylim=c(-15*sd(model_011d$residuals), 15*sd(model_011d$residuals))
)
abline(h=c(-3*sd(model_011d$residuals), 3*sd(model_011d$residuals)),
col="blue", lty=2)
abline(v = 29, col = "red", lty = 3, lwd = 1.5)
text(29, 200, "T=29",cex = .8, pos = 2)
axis(1, at = c(0,10,20,30,40), labels = c("T-29" ,"T-19", "T-9", "T+1", "T+11"))

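[Figure: residual plot. x-axis: Waktu (T) with ticks T-29, T-19, T-9, T+1, T+11; y-axis: Residual; dotted vertical line at T=29.]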
error_idintv
##  [1]  4.858446e-01 -1.246533e+01  6.359059e+00  2.457999e+01  2.869941e+01
##  [6] -7.179901e+00  1.422428e+01  6.659506e+00 -2.208022e+00 -1.249470e+01
## [11]  4.190256e+01 -7.418031e+00 -8.507763e+00 -2.844919e+01  1.075969e+00
## [16] -1.662309e+01 -2.195532e+01 -1.412280e+01  1.512103e+01 -1.582604e+00
## [21]  5.067220e+00  1.711697e+00 -2.171178e+00  7.913454e+00  1.363048e+00
## [26]  8.221203e+00  2.866833e+00 -2.641994e+01 -1.250524e+11 -9.915914e+10
## [31] -9.406107e+10 -1.202645e+11 -1.068321e+11 -9.182851e+10 -9.532035e+10
## [36] -9.337630e+10 -4.306708e+10 -6.465519e+09  3.935345e+10  3.531275e+10
## [41]  1.683331e+11  2.673330e+11  2.582288e+11  2.069344e+11
# --- Plot PDB_ID vs forecasting of Nt
plot(forecast(model_011d, h=16), main =NA, ylab="US Dollar" )
points(PDB_ID, cex=.5, col="dark red", pch=19)
lines(PDB_ID, col="red")
abline(v=1998, lty=2)
text(1998, PDB_ID[41], "1998\n(t=29)", pos=2)
legend("topleft", legend = c("PDB_ID", "Peramalan Nt"), cex=0.75, lty=1,
col=c("dark red", "blue"), pch=c(19,NA))

[Figure: PDB_ID against the forecast of Nt ("Peramalan Nt"), 1970 to 2013. y-axis: US Dollar, 0.0e+00 to 8.0e+11; dashed vertical line at 1998 (t=29).]

# --- Detecting Outliers with tsoutliers::tso


PDB_outlier <- tsoutliers::tso(PDB_box, types = c("AO","LS","TC"),
maxit.iloop=10, tsmethod = "auto.arima")
plot(PDB_outlier)

[Figure: tsoutliers plot with two panels, "Original and adjusted series" and "Outlier effects", 1970 to 2010.]

3.2 Intervention Parameter Estimation

# ----------------------------------- 8-Est_Intv.R ----------------------------------
# --- Pulse: Abrupt Temporary
cobalah2 <- TSA::arimax(PDB_box, order = c(0,1,1), xtransf = data.frame(
T29 = 1*(seq(PDB_ID)==29)), transfer = list(c(1,0)))
cobalah2
##
## Call:
## TSA::arimax(x = PDB_box, order = c(0, 1, 1), xtransf = data.frame(T29 = 1 *
##     (seq(PDB_ID) == 29)), transfer = list(c(1, 0)))
##
## Coefficients:
##          ma1  T29-AR1    T29-MA0
##       0.4394   0.2055  -112.4912
## s.e.  0.1182   0.0804    14.5695
##
## sigma^2 estimated as 496.7:  log likelihood = -194.6,  aic = 395.19
# --- Test significance of model coefficients
coeftest(cobalah2)
##
## z test of coefficients:
##
##            Estimate  Std. Error z value  Pr(>|z|)
## ma1        0.439418    0.118198  3.7176 0.0002011 ***
## T29-AR1    0.205494    0.080446  2.5544 0.0106362 *
## T29-MA0 -112.491185   14.569525 -7.7210 1.154e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# stargazer(coeftest(cobalah2))

3.3 Intervention Model Diagnostics

# ----------------------------------- 9-Diag_Intv.R ---------------------------------
# --- Generating f(It), based on cobalah2
pulse29 <- filter(1*(seq(PDB_ID)==29), filter = 0.2055,
method = "rec", sides = 1) * -112.4912
# --- Compute Model for PDB_ID
PDB_arima <- Arima(PDB_ID, lambda = lambda.model, order = c(0,1,1),
include.constant = TRUE, xreg = pulse29)
# --- Ljung-Box Test for Independence of residuals
# , lag = min(10,n/5) <-- Rob J. Hyndman
Box.test(PDB_arima$residuals,
lag=round(length(PDB_ID)/5,0), type = "Ljung-Box", fitdf = 1)
##
## Box-Ljung test
##
## data: PDB_arima$residuals
## X-squared = 14.784, df = 8, p-value = 0.06348

# --- Kolmogorov-Smirnov Test for Normality of residuals


ks.test(PDB_arima$residuals, "pnorm",
mean(PDB_arima$residuals),
sd(PDB_arima$residuals))
##
## One-sample Kolmogorov-Smirnov test
##
## data: PDB_arima$residuals
## D = 0.14849, p-value = 0.2597
## alternative hypothesis: two-sided
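The intervention term f(It) built with `filter(..., method = "rec")` is a pulse at t = 29 whose effect decays geometrically: it equals omega * delta^(t - 29) for t >= 29 and zero before. The sketch below checks the recursive filter against that closed form, using the two coefficients reported by `arimax` (T29-MA0 as omega, T29-AR1 as delta). The variable names are illustrative assumptions. Extending the indicator past the sample end is shown as one possible way to obtain future regressor values deterministically; it is an alternative to, not a description of, the `auto.arima(pulse29)` step used in the script.

```r
omega <- -112.4912  # T29-MA0 estimate reported by arimax
delta <- 0.2055     # T29-AR1 estimate reported by arimax
T_int <- 29         # intervention time index (1998)
n <- 44             # length of PDB_ID
h <- 5              # forecast horizon used in the script

# Pulse indicator extended h steps beyond the sample
pulse <- as.numeric(seq_len(n + h) == T_int)

# Recursive filter: y[t] = pulse[t] + delta * y[t - 1], then scaled by omega
f_It <- as.numeric(stats::filter(pulse, filter = delta, method = "recursive")) * omega

# Closed form: omega * delta^(t - T) for t >= T, zero before
t_idx <- seq_len(n + h)
closed <- ifelse(t_idx >= T_int, omega * delta^(t_idx - T_int), 0)

print(round(f_It[29:33], 4))            # effect in the first five years
future_xreg <- f_It[(n + 1):(n + h)]    # deterministic values for 2014-2018
print(future_xreg)
```

Because delta is well below 1, the effect has essentially vanished by the end of the sample, so the future regressor values are very close to zero.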

3.4 Intervention Model Forecasting

# --- Generating f(It)


pulse29 <- filter(1*(seq(PDB_ID)==29),
filter = 0.2055, method = "rec",
sides = 1)*-112.4912
# --- Compute Model for PDB_ID
PDB_arima <- Arima(PDB_ID, lambda = lambda.model,
order = c(0,1,1),
include.constant = TRUE, xreg = pulse29)
# --- Future Value of xreg
xreg.rob = forecast(auto.arima(pulse29), h=5)$mean
# --- Forecasting
forecast(PDB_arima, xreg = xreg.rob)
##      Point Forecast        Lo 80        Hi 80        Lo 95        Hi 95
## 2014   9.045913e+11 820870436392 9.949980e+11 779128419373 1.045696e+12
## 2015   9.669871e+11 817268518979 1.137852e+12 745876770536 1.237594e+12
## 2016   1.032775e+12 832476248750 1.269836e+12 739794925023 1.411843e+12
## 2017   1.102090e+12 855854126920 1.401906e+12 744601046140 1.585104e+12
## 2018   1.175072e+12 884545319931 1.537246e+12 755887552783 1.762173e+12
plot(forecast(PDB_arima, xreg = xreg.rob), main=NA)

[Figure: forecast of PDB_ID from the intervention model, 1970 to 2018, with prediction intervals. y-axis: 0.0e+00 to 1.0e+12.]
