Time Series Analysis Using R

This document discusses time series analysis of London housing prices using R. It begins with an exploratory data analysis of house prices in Westminster, finding prices ranged from £121,387 to £1,117,408 with an average of £521,837. Classical and Bayesian time series models are then fit to the data and compared. For the classical model, tests for stationarity and transformations are applied before fitting ARIMA/SARIMA models. For the Bayesian model, prior distributions are specified and parameters are estimated. The models are compared based on goodness of fit to select the best for forecasting future house prices.

Uploaded by

John Kalar


Table of Contents

Introduction
    Overview
Exploratory Data Analysis
Model fitting and Forecasting
    Time Series
    Time series with Bayesian approach
Conclusions
References
Appendix
    R scripts

Introduction
Overview
In economic terms, the housing market, like any market, is usually determined by supply and
demand. However, although the market can reach equilibrium, that equilibrium can lead to
social problems such as housing unaffordability. In the particular case of London, England,
the last few decades have seen prices rise significantly: at the end of last year alone, for the
first time in history, the average cost per property exceeded the £500,000 mark
(Antonakakis, 2018). This makes London the most expensive region in the UK to live in.
On the problem of housing unaffordability, it should be noted that the rate of growth of
housing prices has come to exceed the increase in individual incomes. In some countries, by
2014 housing costs had even exceeded the average wage by a factor of 10, whereas in 1997
they were only several times the average wage.

As a result of the COVID-19 pandemic, a concentration of demand for housing has been
reported in England during this period. This has several origins. First-time buyers do not
face the complications of moving house, such as securing a target home, making arrangements
and payments, and contemplating repairs and refurbishments, so they have had an incentive to
buy properties, which are normally at the lower end of house prices. This, in turn, has led to
an increase in demand and hence higher prices. Another factor that has played a role in the
increase in prices is the change in people's housing preferences as a result of the strict
confinement measures decreed by the government at the end of March 2020.

Exploratory Data Analysis

A database containing information on dwellings in London was used. Its variables are date,
area, average house price (recorded in GBP), area code, number of houses sold and an
indicator of whether the area is a London borough. The data are monthly, from January 1995
to January 2020, and 45 areas are treated as London regions; each region therefore has 301
observations per variable. For the purposes of this analysis, the study analysed the time
series of average house prices in order to make a prediction. In particular, the study works
with the Westminster region. Descriptive statistics of the average price variable for the
chosen region are presented in Table 1.

Table 1: Descriptive statistics of House Price (Mean)

                     Minimum       Mean     Maximum
Average price (£)    121,387    521,837   1,117,408

The graphs in Figure 1 show the normality analysis and the histograms of the residuals. No
anomaly is observed that would lead the study to suspect that the residuals are anything
other than white noise.

Figure 1: House price normality test

As expected, the minimum corresponds to the first observation (that is, January 1995) and
the maximum was reached in February 2018. As it is a time series, the study is more
concerned with the most recent data. The data from 2018 have a mean of £991,349, so the
study would expect the predictions to be around this value.
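
The summary figures discussed above could be obtained along the following lines. This is only a minimal sketch: the series is simulated rather than taken from the study's data, the object name `price` is hypothetical, and "data from 2018" is assumed to mean every observation from January 2018 onwards.

```r
# Simulated stand-in for the Westminster average-price series (hypothetical data):
# 301 monthly observations from January 1995 to January 2020.
set.seed(42)
price <- ts(120000 + cumsum(abs(rnorm(301, mean = 3000, sd = 4000))),
            start = c(1995, 1), frequency = 12)

# Descriptive statistics of the whole series (cf. Table 1)
stats_all <- c(min = min(price), mean = mean(price), max = max(price))

# Time stamps of the minimum and maximum (year plus fraction of a year)
when_min <- time(price)[which.min(price)]
when_max <- time(price)[which.max(price)]

# Mean of the most recent part of the series (assumed: January 2018 onwards)
recent <- window(price, start = c(2018, 1))
mean_recent <- mean(recent)

print(round(stats_all))
print(c(when_min = when_min, when_max = when_max, mean_recent = mean_recent))
```

Using `window()` keeps the time index attached to the subset, which avoids counting rows by hand when selecting a date range.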

Model fitting and Forecasting

In the classical time series approach, broadly speaking, the study has a series of values
that are correlated because of their dependence on time (Laptev et al., 2017). Through
different techniques, the study sought to analyse the information obtained in order to
identify a pattern that describes this information over time. This pattern is then extended
over a specific period in order to produce a forecast. Time series models in the Bayesian
approach have the same mathematical structure as in the classical approach, such as the
Box-Jenkins model, but they differ in the estimation of the parameters: these are treated as
random variables and as such have a probability distribution associated with them
(Laptev et al., 2017).

First, a Breusch-Pagan test is performed to verify the homoscedasticity of the
observations. If heteroscedasticity is found, a Box-Cox transformation is applied to the data
to correct it, and the Breusch-Pagan test is repeated to check the condition again
(Đalić & Terzić, 2021). Once homoscedasticity holds, differencing is used to make the series
stationary, since stationarity is necessary to fit ARMA or ARIMA models. Then, the
Dickey-Fuller and KPSS tests are performed to check for stationarity in the series
(Fedorová, 2016). Finally, based on the previous results, an ARIMA or SARIMA fit is
proposed and a prediction is made for 5 future values.

For the Bayesian model, the procedure is as follows. Firstly, the study works with the same
differenced series used for the classical model (Xiao et al., 2017). Then, initial values are
defined for the parameters of the prior distributions, with the aim of estimating the
moving-average terms. The burn-in is performed and the number of Markov chains for the
estimation is established. Following Xiao et al. (2017), the convergence of the parameters is
checked graphically and by means of a Gelman test. Finally, a prediction is made for 5 future
values. To conclude the above procedure, the classical and Bayesian models are compared
according to goodness-of-fit criteria, and it is determined which one offers the better model.
Finally, a new Bayesian model is proposed in order to find a better estimate of the variance,
testing several models and comparing them in terms of their p-values, on the condition that
there is no correlation between their parameters.

Time Series
A time series is the succession of observations generated by a stochastic process whose
index is taken relative to time. In a time series it is assumed that there is a correlation
structure between observations; that is, they are not independent (Woodward et al., 2017).
The study plotted the time series of the price column for the Westminster region using R, as
shown in Figure 2.

Figure 2: Time series price trend

For the time series, it is essential to have constant variance, so the study performed a
homoscedasticity test, the Breusch-Pagan test, with the hypotheses:

H0: The data are homoscedastic (have constant variance).

H1: The data are heteroscedastic (the variance is not constant).
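
To make the test concrete, the sketch below computes the studentized (Koenker) form of the Breusch-Pagan statistic by hand in base R. It is an illustration under assumptions rather than the study's script: the data are simulated, and the mean model is assumed to be a regression of the series on a linear time trend (`t_idx` is a hypothetical name).

```r
# Simulated series whose spread grows over time (hypothetical data)
set.seed(1)
n <- 301
t_idx <- seq_len(n)
y <- 1000 + 50 * t_idx + rnorm(n, sd = 2 * t_idx)

# Step 1: fit the mean model (assumed here: a linear time trend)
fit <- lm(y ~ t_idx)

# Step 2: auxiliary regression of the squared residuals on the same regressor
aux <- lm(resid(fit)^2 ~ t_idx)

# Step 3: studentized Breusch-Pagan statistic = n * R^2 of the auxiliary
# regression, compared with a chi-squared distribution whose degrees of
# freedom equal the number of regressors (1 here)
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

cat("BP statistic:", round(bp_stat, 2), " p-value:", signif(bp_pval, 3), "\n")
# If the lmtest package is installed, lmtest::bptest(fit) should give the same statistic.
```

A p-value below 0.05 leads to rejecting H0, which is the situation that motivates a variance-stabilising transformation.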

The test returned a p-value < 0.05, so H0 is rejected and the data are not homoscedastic.
A Box-Cox transformation was therefore applied to make the variance of the series constant.
The BoxCox command transforms the data according to a parameter lambda. To find a lambda
that gives a suitable transformation, the study first called BoxCox.lambda and then
transformed the data with BoxCox using that value (Bauer et al., 2019). Lambda was computed
with the Guerrero method (method = "guerrero"), which is simply the way lambda is chosen.
The log-likelihood method could have been used instead by changing "guerrero" to "loglik";
however, this would change the value of lambda and consequently the transformation. As
"guerrero" worked well, this method was retained.
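
The two commands named above can be combined as in the following sketch. It assumes the forecast package is installed and uses simulated data; `price`, `price_bc` and the other object names are hypothetical. The option `method = "guerrero"` selects Guerrero's method, one of the two methods that `BoxCox.lambda` accepts (the other is `"loglik"`).

```r
library(forecast)  # provides BoxCox.lambda(), BoxCox() and InvBoxCox()

# Simulated monthly series with level-dependent variance (hypothetical data)
set.seed(2)
price <- ts(exp(log(120000) + cumsum(rnorm(301, mean = 0.007, sd = 0.02))),
            start = c(1995, 1), frequency = 12)

# Choose lambda with Guerrero's method, then transform the series
lambda <- BoxCox.lambda(price, method = "guerrero")
price_bc <- BoxCox(price, lambda)

# For comparison: the log-likelihood method will generally give another lambda
lambda_loglik <- BoxCox.lambda(price, method = "loglik")

# InvBoxCox() maps transformed values back to the original scale,
# which is needed to report forecasts in pounds
price_back <- InvBoxCox(price_bc, lambda)

print(c(guerrero = lambda, loglik = lambda_loglik))
```

Keeping `lambda` in a variable matters because the same value has to be reused for the back-transformation.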

Finally, the study re-ran the test on the transformed series and obtained a p-value =
0.2557, so H0 is not rejected; that is, the transformed data are homoscedastic. Stationarity
was then checked with two tests: the Dickey-Fuller test and the
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test. As the series did not pass the Dickey-Fuller
test, the study differenced it, after which it passed both the Dickey-Fuller test and the
KPSS test, as presented in Figure 3.
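
A sketch of this stationarity check is given below, assuming the tseries package and a simulated random walk in place of the study's series. Note that the two tests have opposite null hypotheses: the augmented Dickey-Fuller test takes a unit root (non-stationarity) as H0, whereas KPSS takes stationarity as H0, so "passing" means a small p-value for the former and a large one for the latter.

```r
library(tseries)  # provides adf.test() and kpss.test()

# Simulated random walk: non-stationary in levels (hypothetical data)
set.seed(3)
x <- ts(cumsum(rnorm(301)), start = c(1995, 1), frequency = 12)

# Augmented Dickey-Fuller test on the levels (H0: unit root)
adf_level <- suppressWarnings(adf.test(x))

# First difference, then repeat both tests
dx <- diff(x)
adf_diff <- suppressWarnings(adf.test(dx))    # H0: unit root
kpss_diff <- suppressWarnings(kpss.test(dx))  # H0: level stationarity

# suppressWarnings() hides the notes that tseries emits when a p-value
# falls outside the range of its lookup table
print(c(adf_level = adf_level$p.value,
        adf_diff = adf_diff$p.value,
        kpss_diff = kpss_diff$p.value))
```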

Figure 3: Decomposition of the time series.

Thereafter, the analysis used the sample ACF and PACF to get an idea of the number of lags,
so that the study could propose an ARIMA or SARIMA fit. Theoretically, an AR(p) process has
the first p lags of the PACF outside the confidence bands, after which the remaining lags
quickly tend to zero. Similarly, an MA(q) process has the first q lags of the ACF outside the
confidence bands, after which the remaining lags quickly tend to zero. An ARMA(p,q) process
is therefore expected to show a combination of both patterns, as shown in Figure 4.
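
The cut-off behaviour can be checked on simulated processes with base R alone. In this sketch the coefficients, sample size and object names are arbitrary choices made for illustration, and the bands are the usual approximate 95% limits of plus or minus 1.96 divided by the square root of n.

```r
set.seed(4)
n <- 300

# Simulated AR(2) and MA(1) processes (illustrative coefficients)
ar2 <- arima.sim(model = list(ar = c(0.6, -0.3)), n = n)
ma1 <- arima.sim(model = list(ma = 0.7), n = n)

# Approximate 95% confidence band for sample (partial) autocorrelations
band <- qnorm(0.975) / sqrt(n)

# PACF of the AR(2): expected to cut off after lag 2
pacf_ar2 <- pacf(ar2, lag.max = 20, plot = FALSE)$acf[, 1, 1]

# ACF of the MA(1): expected to cut off after lag 1 (lag 0 is dropped)
acf_ma1 <- acf(ma1, lag.max = 20, plot = FALSE)$acf[-1, 1, 1]

# Lags falling outside the bands; a few spurious exceedances are normal
cat("AR(2) PACF lags outside bands:", which(abs(pacf_ar2) > band), "\n")
cat("MA(1) ACF lags outside bands: ", which(abs(acf_ma1) > band), "\n")
```

Calling `acf()` and `pacf()` with the default `plot = TRUE` draws the correlograms with the bands directly.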

Figure 4: Differentiation of the time series

By means of auto.arima, the analysis obtained the model ARIMA(2,0,3) with non-zero mean,
with AIC = -1402.07, AICc = -1401.68 and BIC = -1376.14. Finally, the study used forecast to
obtain the prediction graph; the results are shown in Figure 5.
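
The automatic selection and the 5-step forecast might look as follows. This is a sketch on simulated data, assuming the forecast package; `seasonal = FALSE` is an assumption made here to restrict the search to non-seasonal models, and the selected order will not in general match the ARIMA(2,0,3) reported by the study.

```r
library(forecast)  # provides auto.arima() and forecast()

# Simulated stationary monthly series (hypothetical data)
set.seed(5)
y <- ts(arima.sim(model = list(ar = c(0.5, 0.2), ma = 0.4), n = 300),
        start = c(1995, 2), frequency = 12)

# Automatic order selection among non-seasonal ARIMA models
fit <- auto.arima(y, seasonal = FALSE)

# Information criteria stored on the fitted object
print(c(AIC = fit$aic, AICc = fit$aicc, BIC = fit$bic))

# Point forecasts and 80% / 95% prediction intervals for 5 future values
fc <- forecast(fit, h = 5)
print(fc)
# plot(fc) draws the series with the forecast and its bands
```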

Figure 5: Projection of future values with confidence bands showing the full series

Figure 6: Projection of future values with confidence bands showing the series from 2016

Time series with Bayesian approach

The study then attempted to fit the ARIMA(2,0,3) model proposed in the classical approach,
using JAGS and the forecast package. For this, the study had to define the model equation,
which requires two autoregressive parameters (called ρ1 and ρ2) and 3 latent variables
(θ1, θ2 and θ3) for the moving averages. Once this was in place, an auxiliary variable z was
defined to carry the moving averages forward. The results obtained are shown in Figure 7.
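
One way such a model could be written for JAGS is sketched below. It is not the study's script: the priors, the intercept name `c0`, the burn-in and chain lengths, and the simulated data are all assumptions made for illustration, and running it requires the rjags package together with a JAGS installation. The auxiliary node `z[t]` holds the one-step residual so that the moving-average terms can refer to earlier values.

```r
library(rjags)  # interface to JAGS; JAGS itself must be installed

arma23_model <- "
model {
  # No residuals exist before the series starts, so the first three are set to 0
  for (t in 1:3) {
    z[t] <- 0
  }
  for (t in 4:N) {
    ar_part[t] <- rho[1] * y[t-1] + rho[2] * y[t-2]
    ma_part[t] <- theta[1] * z[t-1] + theta[2] * z[t-2] + theta[3] * z[t-3]
    mu[t] <- c0 + ar_part[t] + ma_part[t]
    y[t] ~ dnorm(mu[t], tau)
    z[t] <- y[t] - mu[t]   # auxiliary variable carrying the MA terms forward
  }
  # Vague priors, assumed for illustration (dnorm takes a precision in JAGS)
  c0 ~ dnorm(0, 0.001)
  for (i in 1:2) { rho[i] ~ dunif(-1, 1) }
  for (j in 1:3) { theta[j] ~ dunif(-1, 1) }
  tau ~ dgamma(0.01, 0.01)
  sigma <- 1 / sqrt(tau)
}"

# Simulated ARMA(2,3) data standing in for the differenced series (hypothetical)
set.seed(6)
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2),
                                       ma = c(0.4, 0.2, 0.1)), n = 300))

jm <- jags.model(textConnection(arma23_model),
                 data = list(y = y, N = length(y)),
                 n.chains = 3, quiet = TRUE)
update(jm, n.iter = 1000)                       # burn-in
draws <- coda.samples(jm, variable.names = c("rho", "theta", "sigma"),
                      n.iter = 5000)
print(summary(draws))
```

The uniform priors on `rho` do not by themselves enforce stationarity, so a more careful version would constrain the autoregressive parameters jointly.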

Figure 7: Trace and density plots of the estimated parameters using 3 chains

Figure 8: Trace and density plots of the estimated parameters using 3 chains

It is observed that the traces and the densities converge. Moreover, they converge to
values very close to zero, which is also what was seen in the classical model. At the same
time, the residuals behave normally, which is also evident in Figure 9.
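
Alongside the visual check, convergence across chains can be summarised numerically with the Gelman-Rubin diagnostic from the coda package. The sketch below uses simulated draws in place of real sampler output, and the parameter names `rho1` and `theta1` are hypothetical; with actual JAGS output, the object returned by `coda.samples` can be passed to `gelman.diag` directly.

```r
library(coda)  # provides mcmc(), mcmc.list(), gelman.diag()

set.seed(7)

# One simulated "chain" of 2000 draws for two parameters (hypothetical values)
make_chain <- function() {
  mcmc(cbind(rho1 = rnorm(2000, mean = 0.05, sd = 0.02),
             theta1 = rnorm(2000, mean = -0.03, sd = 0.02)))
}

# Three chains, matching the number of chains used for the figures
chains <- mcmc.list(make_chain(), make_chain(), make_chain())

# Potential scale reduction factors; values close to 1 suggest convergence
gd <- gelman.diag(chains)
print(gd)

# traceplot(chains) and densplot(chains) give the graphical counterpart
```

A common rule of thumb treats point estimates below about 1.1 as acceptable, although the threshold is a convention rather than a formal test.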


Figure 9: Residuals for the ARIMA (2,0,3) model using the Bayesian approach

From Figure 9, it can be seen that the residuals behave like white noise: the ACF and PACF
plots remain within the bands. The tests of the assumptions were performed and the residuals
passed the test of independence (Ljung-Box). However, in the plot of the fitted values
against the observed ones, the study noticed that the variance is underestimated, since the
fitted values are below the observed ones. The prediction is close to the one presented in
the classical model, so it can be stated that the model fits well, although there were issues
from the beginning with the variance of the series not being constant. The study therefore
looked for another model with the auto.sarima command, which returned an ARIMA(1,0,2).
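
The independence check on the residuals can be reproduced in base R with `Box.test`. The sketch below fits a classical ARIMA(1,0,2) to simulated data purely to obtain residuals; the lag of 12 is an arbitrary choice, and `fitdf` is set to the number of estimated ARMA coefficients so that the degrees of freedom are adjusted.

```r
# Simulated ARMA(1,2) series (hypothetical data)
set.seed(8)
y <- arima.sim(model = list(ar = 0.5, ma = c(0.4, 0.2)), n = 300)

# Classical fit, used here only to produce residuals
fit <- arima(y, order = c(1, 0, 2))
res <- residuals(fit)

# Ljung-Box test, H0: the residuals are independent (no autocorrelation)
# fitdf = p + q = 3 removes the degrees of freedom used by the ARMA terms
lb <- Box.test(res, lag = 12, type = "Ljung-Box", fitdf = 3)
print(lb)

# acf(res) and pacf(res) give the matching plots with confidence bands
```

A large p-value means H0 is not rejected, which is what "passing" the independence test refers to.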

The study attempted to fit different models with the ggplot2 package. Most of them did not
pass the constant variance test, so the study kept the ones with the highest p-value for this
test and with no correlation in the parameters. The models kept were: ARIMA(1,0,2),
ARIMA(2,0,3), ARIMA(4,0,2), ARIMA(2,0,1) and ARIMA(3,0,3). Finally, the model prediction
was obtained; the results are shown in Figure 10.

Figure 10: Model prediction (the best model obtained was the ARIMA(1,0,2))

Comparing these models using the results in the figures above, it can be concluded that the
one with the best fit was the ARIMA(1,0,2), which was in fact the one suggested by the
auto.sarima command.
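
A comparison across the candidate orders can be organised as a small table. The sketch below is a classical analogue only: it fits each order with `stats::arima` on simulated data and ranks the fits by AIC and BIC, which are assumed stand-ins here for the criteria the study applied to its Bayesian fits.

```r
# Simulated series (hypothetical data)
set.seed(9)
y <- arima.sim(model = list(ar = 0.5, ma = c(0.4, 0.2)), n = 300)

# Candidate (p, d, q) orders listed in the text
orders <- list(c(1, 0, 2), c(2, 0, 3), c(4, 0, 2), c(2, 0, 1), c(3, 0, 3))

# Fit one order and return its information criteria; NA if the fit fails
fit_one <- function(ord) {
  label <- sprintf("ARIMA(%d,%d,%d)", ord[1], ord[2], ord[3])
  tryCatch({
    fit <- suppressWarnings(arima(y, order = ord, method = "ML"))
    data.frame(model = label, aic = AIC(fit), bic = BIC(fit))
  }, error = function(e) {
    data.frame(model = label, aic = NA_real_, bic = NA_real_)
  })
}

comparison <- do.call(rbind, lapply(orders, fit_one))

# Lower values indicate a better trade-off between fit and complexity
print(comparison[order(comparison$aic), ])
```

Wrapping each fit in `tryCatch` keeps one failed optimisation from aborting the whole comparison.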

Conclusions
In the end there was no single best model for the time series fit. The study attempted to
use the model obtained in the classical approach for the Bayesian part as well, but for the
Bayesian methods it was not the best option. This shows the importance of having different
ways of approaching the modelling, and how difficult it can be to reach a good fit,
especially in the Bayesian setting, which requires more computational work. That cost was
not large for the study data, but with databases of millions of records it can make trying
different models complicated. The R scripts used to obtain the study results are annexed.

References
Antonakakis, N., 2018. Rethinking London's 'ripple effect' on house prices: other UK regions
transmit shocks too. British Politics and Policy at LSE.

Bauer, A., Züfle, M., Herbst, N. and Kounev, S., 2019, June. Best practices for time series
forecasting (tutorial). In 2019 IEEE 4th International Workshops on Foundations and
Applications of Self* Systems (FAS*W) (pp. 255-256). IEEE.

Đalić, I. and Terzić, S., 2021. Violation of the assumption of homoscedasticity and detection
of heteroscedasticity. Decision Making: Applications in Management and
Engineering, 4(1), pp.1-18.

Database obtained from https://www.kaggle.com/justinas/housing-in-london

Fedorová, D., 2016. Selection of unit root test on the basis of length of the time series and
value of AR(1) parameter. Statistika, 96(3), p.3.

Laptev, N., Yosinski, J., Li, L.E. and Smyl, S., 2017, August. Time-series extreme event
forecasting with neural networks at Uber. In International Conference on Machine
Learning (Vol. 34, pp. 1-5).

Woodward, W.A., Gray, H.L. and Elliott, A.C., 2017. Applied time series analysis with R.
CRC Press.

Xiao, Q., Chaoqin, C. and Li, Z., 2017. Time series prediction using dynamic Bayesian
network. Optik, 135, pp.98-103.

Appendix
R scripts
