
Implementation of Time Series Approaches

to Financial Data

Noshin Nawar Sadat

Supervisor: Dr. Mahbub Majumdar

A thesis submitted in partial fulfillment of the requirements for the

degree of Bachelor of Computer Science & Engineering

in the

Department of Computer Science & Engineering


BRAC University

August 2016
Declaration

This is to certify that this thesis report, titled "Implementation of Time Series
Approach to Financial Data", is submitted by Noshin Nawar Sadat (ID: 12101017)
to the Department of Computer Science and Engineering, School of Engineering and
Computer Science, BRAC University, in partial fulfillment of the requirements for the
degree of Bachelor of Science in Computer Science and Engineering. I, hereby, declare
that this thesis and the work presented in it are my own and it has not been submitted to
any other University or Institute for the award of any other degree or diploma. Every work
that has been used as a reference for this thesis has been cited properly.

Author: Noshin Nawar Sadat (ID: 12101017)
Supervisor: Dr. Mahbub Majumdar, Professor, Department of CSE, BRAC University, Dhaka

Acknowledgements
The completion of this thesis would not have been possible without the support, help and
encouragement of several people. First of all, I would like to express my sincere gratitude
to Dr. Mahbub Alam Majumdar, my supervisor, for his continuous support and guidance.
He has been a constant source of encouragement and enthusiasm throughout the time of
this project.
I am also very grateful to Dr. Md. Haider Ali, Professor and Chairperson, Department
of Computer Science and Engineering, BRAC University. His classes, as well as his
support for all his students' work, are a source of inspiration for me and everyone else
in the department.
I would also like to thank all of my teachers, starting from kindergarten to the University.
If it weren’t for their teachings, I might not have been able to accomplish whatever I have
accomplished till now.
My heartfelt gratitude goes to Ipshita Bonhi Upoma, my dear friend, who listened to me
patiently and kept me motivated by warning me about the deadline almost every day.
I would also like to thank Nuzhat Ashraf Mahsa and Faiza Nuzhat Joyee, for their
understanding and support. I have to thank all my school friends as well; our regular
chats over the phone helped me get rid of all my stress within minutes.
Finally, my deepest gratitude goes to my parents and my sister, for their unconditional
sacrifices, love and support throughout my life.

ABSTRACT
We study a time series approach to financial data, specifically the ARIMA models, and
build a web-based platform for stock market enthusiasts to analyze time series of stock
market returns data and fit ARIMA models to the series to forecast future returns.
This system also acts as an informative tool by providing helpful instructions to the users
regarding the analysis and model-fitting procedure. It uses R to perform the statistical
computations.

Contents

List of Figures

List of Tables

1 Introduction
1.1 Time Series Approach
1.2 Motivation
1.3 Thesis Outline

2 Stock Market Prediction Methods
2.1 Efficient Market Hypothesis
2.2 Random Walk Hypothesis
2.3 Typical Stock Prediction Methods
2.4 Literature Review

3 Summary of Theory
3.1 Time Series Analysis
3.2 Some Basic Concepts
3.2.1 Mean Function
3.2.2 Autocovariance Function
3.2.3 Autocorrelation Function (ACF)
3.2.4 Random Walk
3.2.5 Simple Moving Average (SMA)
3.2.6 Stationarity
3.2.7 White Noise
3.3 Trend Estimation
3.3.1 Estimating Constant Mean
3.3.2 Estimating Non-Constant Mean
3.3.3 Analyzing Estimated Outputs
3.4 Time Series Models
3.4.1 General Linear Process
3.4.2 Moving Average Process
3.4.3 Autoregressive Process
3.4.4 ARMA Models
3.4.5 ARIMA Models
3.4.6 Backshift Operator
3.5 Box-Jenkins Procedure
3.5.1 Model Specification
3.5.2 Parameter Estimation
3.5.3 Model Diagnostics
3.6 Forecasting
3.6.1 Minimum Mean Square Error Forecasting
3.6.2 Forecasting ARIMA Models
3.6.3 Limits of Prediction
3.6.4 Updating ARIMA Forecasts
3.6.5 Forecasting Transformed Series

4 Stock Returns Forecasting System
4.1 Proposed System
4.1.1 Why use stock returns data instead of stock prices?
4.1.2 How the Proposed System Works
4.2 System Outputs
4.2.1 Outputs of Analysis Section
4.2.2 Outputs of Model-Fitting
4.3 System Implementation
4.3.1 The Website
4.3.2 R
4.3.3 Database
4.3.4 Integration of PHP and R
4.4 Challenges

5 Further Exploration
5.1 Theoretical Approach
5.1.1 ARCH/GARCH Models
5.1.2 ARIMAX Models
5.2 Real Time Analysis
5.3 System

6 Conclusion

Bibliography
List of Figures

3.1 Time Series Plot of Annual Diameter of the Hem of Women's Skirts
3.2 Time Series Plot of Simple Moving Average of order 12
3.3 Plot of First Order Difference of the Diameter of Hem Series
3.4 Plot of Second Order Difference of the Diameter of Hem Series
3.5 ACF Plot of the Diameter of Hem Series
3.6 PACF Plot of the Diameter of Hem Series
3.7 Plot of Residuals of the Fitted Model to the Diameter of Hem Series
3.8 Q-Q plot of Residuals of the Fitted Model to the Diameter of Hem Series
3.9 Histogram plot of Residuals of the Fitted Model to the Diameter of Hem Series
3.10 ACF plot of Residuals of the Fitted Model to the Diameter of Hem Series

4.1 An overview of the System
4.2 Usecase Diagram of the System
4.3 An Activity Diagram of the System
4.4 A Data Flow Diagram of the System
4.5 Time Series Plot of Stock Returns of YHOO
4.6 Summary of Stock Returns Data of YHOO
4.7 Sample ACF Plot of Stock Returns Data of YHOO
4.8 Sample PACF Plot of Stock Returns Data of YHOO
4.9 EACF of Stock Returns Data of YHOO
4.10 Q-Q Plot of Stock Returns Data of YHOO
4.11 Histogram Plot of Stock Returns Data of YHOO
4.12 ADF Test Results of Stock Returns Data of YHOO
4.13 Summary of Fitted Model of Stock Returns Data of YHOO
4.14 Time Series Plot of Residuals of Fitted Model on YHOO Stock Returns
4.15 ACF Plot of Residuals of Fitted Model on YHOO Stock Returns
4.16 PACF Plot of Residuals of Fitted Model on YHOO Stock Returns
4.17 Q-Q Plot of Residuals of Fitted Model on YHOO Stock Returns
4.18 Results of Ljung-Box Test on Residuals of Fitted Model on YHOO Stock Returns
4.19 Plot of Forecast of Fitted Model on YHOO Stock Returns
4.20 Summary of Forecast of Fitted Model on YHOO Stock Returns
4.21 Errors of Forecast of Fitted Model on YHOO Stock Returns
4.22 The Homepage of the System
4.23 The Analysis Page Before Input is Submitted
4.24 The Analysis Page After Input is Submitted
4.25 The Model-Fitting Page Before Input is Submitted
4.26 The Model-Fitting Page After Input is Submitted
4.27 ER Diagram for the System

5.1 ACF Plot of AAPL Stock Returns for 2015
5.2 PACF Plot of AAPL Stock Returns for 2015
5.3 EACF of AAPL Stock Returns for 2015
5.4 Histogram Plot of AAPL Stock Returns for 2015
5.5 Q-Q Plot of AAPL Stock Returns for 2015
5.6 ACF Plot of Absolute Values of AAPL Stock Returns for 2015
5.7 PACF Plot of Absolute Values of AAPL Stock Returns for 2015
List of Tables

3.1 Behavior of ACF and PACF for Different ARMA Models

4.1 Values of AIC, AICc and BIC after fitting different models
Chapter 1

Introduction

The analysis of financial data for the prediction of future stock returns has always been
an important field of research. According to [17], stock prices may not only signal future
changes in the economy of a country but also have direct effects on the economic activities
of the country. Although not always reliable, stock price movements are considered
useful indicators of the business cycle, and to some extent they also affect household
consumption and corporate investment in a country. Hence, being able to predict future
stock prices successfully not only allows investors to make significant profits and avoid
losses but also helps the government take appropriate measures to prepare for possible
economic downturns. For that reason, numerous theories and methods are being devised
to make accurate predictions of the stock market through the analysis of financial data.
In our project, we focused on creating a platform for the prediction of companies' stock
returns using a time series approach. We built a web-based system where users can
analyze the stock returns data of forty companies listed on NASDAQ and NYSE and then
use that analysis to fit ARIMA models to those data. All the analysis and model-fitting
tasks were done using the R software. This is just the initial step towards building a more
comprehensive and sophisticated system where users will be able to build their own models
based on any time series approach and make predictions of future stock returns in order
to take appropriate measures.

1.1 Time Series Approach


In the time series approach to stock price analysis and forecasting, the stock price is
considered to incorporate all important and available information about the stock. By
figuring out the underlying structure and function that produced the past observations of
the price, time series analysis of stock prices aims to forecast future prices or trends.

1.2 Motivation
Not all investors are capable of analyzing stock market data on their own when making
their trading decisions; they have to depend on the analysis of others instead. Even
if they wish to do it themselves, they face various problems: collecting the data,
acquiring analysis tools, learning to use those tools, and so on. Moreover, those who are
involved in research in this field might not always have access to a computer with statistical
tools installed. In such cases, having access to an online statistical system to
perform instant analysis on authentic data would be preferable.

We plan to create such a platform for any stock market enthusiast, which will

• Allow them to perform various methods of time series analysis on stock returns data

• Provide authentic data

• Be easy to use

• Not require any installation

• Be accessible anywhere

• Allow them to save the outputs of their analysis for future reference

• Provide helpful information regarding all the tests we allow them to perform and
how to interpret those tests

With that final goal in mind, we have initiated this project. Currently, we have created a
platform where users can analyze the time series of companies' daily stock returns data,
fit ARIMA models of any order to that data, and make forecasts for the next 10 days.
This platform will eventually be extended and refined to achieve our final goal.

1.3 Thesis Outline


The outline of this paper is as follows:

• Chapter 2 gives a basic introduction to the different methods that are used for
stock market prediction and it also discusses some papers that have dealt with
stock market prediction using time series analysis.

• Chapter 3 introduces and explains the concepts of ARIMA modeling, the time
series approach that we have focused on in our work.

• Chapter 4 describes our implementation work, a website for stock market
prediction using a time series approach.

• Chapter 5 explains the limitations of our project and also provides directions for
further research.

• Chapter 6 summarizes the whole paper and gives conclusive remarks regarding the
project.

Chapter 2

Stock Market Prediction Methods

Despite many negative views regarding stock market prediction, people still try to come
up with different ways to forecast future price movements in the stock market. Thus, we
now have numerous methods of stock market prediction, ranging from simple fundamental
analysis to more complicated statistical analysis methods. With the advent of computers
into the stock market prediction scene, the analysis and development of prediction methods
have become much easier. This chapter discusses several such methods of analysis,
which can be used singly or in combination to predict stock price movements.

2.1 Efficient Market Hypothesis


The Efficient Market Hypothesis or EMH, as developed by [11], suggests that it is
impossible to beat the stock market because it is an efficient market, in which stock prices
reflect all available and relevant information. Whenever new information comes up,
it is immediately spread by the news and the stock price adjusts to it in no
time. Therefore, stocks always trade at their fair value, and it is not possible to
predict trends in the market or to identify undervalued stocks. The only possible way
of making a profit, then, is to buy riskier investments. This theory has one big flaw,
however: it assumes that the stock market is efficient. For the stock market to be
efficient, the following criteria have to be met [13]:

• All investors should be able to access high-speed and advanced systems of stock
price analysis.

• There should be a stock price analysis method which is universally accepted as
correct.

• All the investors in the market should be rational decision makers. Their decisions
should not be influenced by their emotions.

• The investors would not wish to make any higher profit than the other investors
because that would not be possible.

None of the above-mentioned conditions is met by the present stock market; hence,
we cannot call it efficient. The stock market often shows trends in its price movements,
and so it is possible to predict future trends based on past prices, if
modeled correctly.


2.2 Random Walk Hypothesis


[14] stated that stock prices cannot be predicted accurately using price history. He
described stock price movements as a statistical process called the random walk. According
to his theory, the deviation of each observation from the mean is purely random and,
therefore, unpredictable.

However, several economists and professors of finance have conducted a number of tests
and studies which claim that some sort of trend does exist in the stock market,
and hence that the stock market can be predicted to some degree.

2.3 Typical Stock Prediction Methods


Stock market prediction methods can be divided into two main categories depending
on how share prices are evaluated. These complementary methods are often used by
themselves or in combination. They are as follows:
1. Fundamental Analysis
Fundamental analysis is based on the assumption that stock prices in the stock
market do not represent the actual real value of the stock. Fundamental analysis
asserts that the correct value of the stock can be found out by analyzing the
fundamentals of the company. The fundamentals include anything that is related to
the economy of the company. The fundamentalists claim that an investor can make a
profit by buying undervalued stocks and holding on to them until the market realizes
its mistake and corrects the prices to their actual values.

The different fundamental factors can be divided into two groups and the analysis
methods of these two groups are termed as:
i) Quantitative Analysis
The quantitative factors are all the factors that can be expressed in numerical
terms. This type of analysis involves delving into the financial statements to
learn about the company's revenues, assets, liabilities, expenses and all other
financial aspects.
ii) Qualitative Analysis
Qualitative factors are the intangible, non-measurable aspects of the company,
such as the company's business model, competitive advantage, management,
competition, customers, market share, government regulations, etc. In other
words, it involves the analysis of the company itself, the industry in which the
company specializes, as well as the economic condition of the country in which
the company operates.
The intrinsic value of the stock is determined by performing these two analysis
methods. If the measured intrinsic value turns out to be greater than the actual
market value, then the stock is bought. If it is the same as the market price, then
it is held. And if it is lower than the market price, then it is sold.
2. Technical Analysis
Technical analysis is based on the following three principles:


• Market Discounts Everything


The price of a stock reflects all the relevant and available information in the
market. Technical analysts believe that all the factors that could affect the
company, such as, the company’s fundamental factors, the economic factors
as well as the market psychology, are all accounted for in the price of that
company’s stock and hence, there is no need to analyze them separately.
• Price Moves in Trends
The price movements follow a trend and when such trend has been established,
the likelihood of the future stock prices to be in that same direction increases.
• History Tends to Repeat Itself
The pattern of price movements in the past tends to repeat itself in the present
as market participants have a tendency to react in a consistent manner to
similar stimuli all the time.

Technical analysis (or charting) only focuses on the price movements in the market.
It involves identifying patterns of price and volume movement in the market by
using charts and other tools and using those patterns to predict future activities in
the market. It does not care about the undervalued stocks and it does not try to
find the intrinsic values of any stock.

There are two types of tools for technical analysis:

a) Charts
Charts are just graphical representations of stock prices over a set time frame.
They can vary in time scale or price scale. Depending on the information to be
retrieved, the most commonly used charts are the line chart, bar chart, candlestick
chart and point and figure chart.
b) Indicators & Oscillators
Indicators use price and volume information to measure the flow of money,
momentum and trends in the stock market. Indicators are used either to form
buy or sell signals, or to confirm price movements. They are of two types:
leading and lagging. Leading indicators help predict future prices by preceding
price movements. Lagging indicators follow price movements and work as a tool
of confirmation. Some indicators are constructed in such a way that they fall
within a bounded range; these are called oscillators. Crossovers and divergences
in the indicators are used to form buy or sell decisions. Some popular indicators
are the Accumulation/Distribution Line, Average Directional Index (ADX), Aroon,
Aroon Oscillator, Moving Average Convergence Divergence (MACD), Relative
Strength Index (RSI), On Balance Volume (OBV) and Stochastic Oscillator.

Time Series Analysis

Time Series Analysis may be considered a mathematical version of technical analysis:
it uses statistical tools to extract meaningful information from a given stock market
price series and makes predictions on the basis of that information. However, obtaining
absolutely accurate forecasts of stock prices using time series analysis alone is not
easy.


Machine Learning

Various machine learning algorithms are being applied to stock market prediction.
Notable among them are Support Vector Machines, Linear Regression, Online Learning,
Expert Weighting and Prediction using Decision Stumps. These machine learning
algorithms are applied based on the same assumption as technical analysis: that
the prices of stocks have all the relevant information embedded in them [18].
Using machine learning techniques alone, however, may not give accurate results,
so a hybrid of several algorithms, or of algorithms and some other analysis technique,
can be used. Moreover, the same technique or algorithm might not work for every
company's stock price prediction.

2.4 Literature Review


In order to complete this project, a good understanding of Time Series Analysis was
required. To understand the basic concepts of time series analysis, the books of [9] and
[8] were very helpful. The different topics of time series analysis were explained in great
detail with easy to understand examples in these books. Moreover, the lecture note by
[20] and the online learning materials provided by PennState Eberly College of Science
on [2] were also very useful in our endeavor to understand time series. In order to learn
R programming, the online resources of [4] and [5] were of great help.

We went through several papers that dealt with stock market prediction using time series
models so that we could understand the method better. One of the papers was written
by [7], who used the closing price data of the Nokia Stock Index and the Zenith Bank
Stock Index to build separate ARIMA models for the two companies. The ARIMA models
with smaller BIC and standard error of regression and higher adjusted R² were chosen as
the best models. Another criterion was that the residuals of the models should be white
noise. They found that their models provided satisfactory short-term predictions.

[12] used the Box-Jenkins method to fit ARIMA models to the stock closing prices of AAPL,
MSFT, COKE, KR, WINN, ASML, AATI and PEP. They chose to use the closing prices
of the past ten years of the companies or since the year the companies went public. The
data was collected from Yahoo Finance. After analysis, almost all their models were
AR(1) models either in differenced or undifferenced form. They had used the stock price
data of eight companies specializing in four different industries to find if there were any
similarities between the industries. It was found that the stocks of the same industry did
not behave in a similar manner.

A study on the effectiveness of ARIMA models to predict future prices of fifty-six stocks
from seven sectors of India was conducted by [15]. For their work, they used the past
twenty-three months' data. To choose the best model, they used the value of AICc. All
their built models were able to predict stock prices with above 85% accuracy.

Another paper attempted to combine traditional time series analysis techniques with
information from the Google trend website and the Yahoo finance website to predict
weekly changes in stock prices [21]. They collected important news related to a particular
stock over a five year span and they used the Google trend index values of this stock to
measure the magnitude of these events. They found significant correlation between the


values of the important news/events and the weekly stock prices. They collected weekly
stock prices of AAPL from the Yahoo Finance website and extracted important news related
to the AAPL stock by using the Key Developments feature under the Events tab on the Yahoo
Finance website. Each piece of news was then analyzed and given a positive or negative
value depending on its influence. The weekly search index of the term AAPL was extracted
from Google Trends. The starting points of the news data and the Google Trends data were
set to one month before the stock price data in order to find the relation between the
news at one time and the stock prices at a later time. To analyze the historical stock
prices, they performed ARIMA time series analysis after first-degree differencing of the
square root of the raw data. It was found by plotting the autocorrelation function and
partial autocorrelation function that the transformed stock prices essentially followed an
ARIMA(0,1,0) process.

[16] proposed a hybrid Support Vector Machine and ARIMA model for stock price
forecasting. They used daily closing prices of the fifty days (from October 21, 2002
to December 31, 2002) of ten stocks for their research. They used the closing prices in the
month of January 2003 as the validation set. The closing prices of February 2003 were
used as testing dataset. They tried making one step ahead forecasts of the hybrid model
as well as SVM and ARIMA models separately. It turned out that the hybrid model had
outperformed the other two models.

[19] collected 50 randomly selected stocks from the Yahoo Finance website and applied time
series decomposition (TSD), Holt's exponential smoothing (HES), Winters' exponential
smoothing (WES), the Box-Jenkins (B/J) methodology, and neural networks (NN) to the
dataset to analyze it and predict future prices. For the NN model, they divided
the dataset into three groups: a training set, a validation set and a testing set. Instead
of using the data directly, they used normalized data, which reduced the errors. They used
the back propagation algorithm to train their system. Their models fit the data with R²
almost equal to 0.995.

[10] created a system to forecast movements in the stock market in a given day by using
time series analysis on the S&P 500 values. They also performed market sentiment
analysis on data collected from Twitter to find out whether the addition of it increased
the accuracy of the prediction. They collected the S&P 500 values from Yahoo Finance
and the Twitter data from the Twitter Census: Stock Tweets dataset from Infochimps,
which included around 2.3 million stock tweets. This dataset was then modified for the
purpose of their work. They had three labels for the S&P index movement: up, down, and
same. To predict the S&P movements, they used five different attributes. To analyze the
sentiments in the tweet dataset, they used a Naive Bayes classifier. The sentiments were
labeled as up, down or same. After incorporating the sentiment analysis results with the
time series analysis results, they found that the accuracy had improved.

From the above discussion, it is clear that although time series analysis alone is a good
candidate for stock price forecasting, to get a more accurate result, other factors and
techniques should also be incorporated.

Chapter 3

Summary of Theory

In this chapter, we discuss the time series analysis approach that we have used in
our project. We try to summarize what we have learned from the books, lecture notes and
other online materials mentioned in Section 2.4 of this paper. To explain the topics,
we have used the dataset on the annual diameter of the hem of women's skirts
from 1866 to 1911 provided by [1]. We used the R software as our tool to perform the
different analyses on the dataset. We shall discuss R in detail in Chapter 4.

3.1 Time Series Analysis


Time series is a collection of observations made sequentially over a time interval. Time
series data are being generated every day in different fields of application, such as,

• In Finance: daily closing prices of stock, daily exchange rates, etc.

• In Economics: monthly total exports, monthly data on unemployment, etc.

• In Physical Sciences: daily rainfall, monthly average air temperature, etc.

• In Marketing: annual or monthly average sales figures, etc.

• In Demographic Studies: monthly or annual population of a city, etc.

• In Medicine: brain wave activity during an EEG, etc.

• In Process Control: Color property of batches of product, etc.

A time series can be either continuous or discrete. It is said to be continuous if it
consists of observations taken continuously in time. If the observations are taken at
specific, equal time intervals, then it is said to be discrete. In this paper, we will be
working with discrete time series.

In the time series, the distance between any two consecutive time points must be the
same and each time point must have at most one observation. That is, if the series is an
observation of monthly data, then it must have the observations of every month; and for
each month, only one observation should be taken.

The following is an example of a time series plot of the annual diameter of the hem of
women's skirts:

Figure 3.1: Time Series Plot of Annual Diameter of the Hem of Women’s Skirts

A stochastic process is a sequence of random variables, represented as follows:
$$\{Y_t : t = 0, \pm 1, \pm 2, \ldots\}$$
Stochastic processes are used to model observed time series.

Time Series Analysis consists of various techniques of analyzing the time series data
with the aim of extracting significant statistics and other important features of data,
usually in order to make forecasts of future values based on the past observations.
Therefore, during time series analysis, the order of the observations must be maintained.
Otherwise, the very meaning of the data would change.

3.2 Some Basic Concepts


3.2.1 Mean Function

If $\{Y_t : t = 0, \pm 1, \pm 2, \ldots\}$ is a time series, then its mean function is the expected value of the observation at time $t$,
$$\mu_t = E(Y_t) \quad \text{for } t = 0, \pm 1, \pm 2, \ldots \tag{3.1}$$

3.2.2 Autocovariance Function

If $t$ and $s$ are two time points in the time series, then the covariance of the observations at those two points is given by
$$\mathrm{Cov}(Y_t, Y_s) = E[(Y_t - \mu_t)(Y_s - \mu_s)] \tag{3.2}$$
$$= E[Y_t Y_s] - \mu_t \mu_s \tag{3.3}$$
The autocovariance function $\gamma_{t,s}$ is then the sequence
$$\gamma_{t,s} = \mathrm{Cov}(Y_t, Y_s) \quad \text{for } t, s = 0, \pm 1, \pm 2, \ldots \tag{3.4}$$

3.2.3 Autocorrelation Function (ACF)

The correlation between two observations at time points $t$ and $s$ is given by
$$\mathrm{Corr}(Y_t, Y_s) = \frac{\mathrm{Cov}(Y_t, Y_s)}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_s)}} \tag{3.5}$$
$$= \frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\,\gamma_{s,s}}} \tag{3.6}$$
The autocorrelation function $\rho_{t,s}$ is then
$$\rho_{t,s} = \mathrm{Corr}(Y_t, Y_s) \quad \text{for } t, s = 0, \pm 1, \pm 2, \ldots \tag{3.7}$$
The value of $\rho_{t,s}$ always lies between $-1$ and $+1$. A value close to $\pm 1$ indicates strong linear dependence between the two observations, whereas a value close to $0$ indicates weak linear dependence. When $\rho_{t,s} = 0$, the two observations are said to be uncorrelated.
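
In practice, the ACF is estimated from the observed series. A minimal sketch in R (the software used throughout this thesis), assuming a simulated series in place of real data:

# Estimate and plot the sample autocorrelation function with base R's acf().
set.seed(1)
y <- arima.sim(model = list(ar = 0.6), n = 200)  # an illustrative AR(1) series
acf(y, lag.max = 20)                             # sample ACF with confidence bounds
acf(y, plot = FALSE)$acf                         # the numeric estimates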

3.2.4 Random Walk

If $e_1, e_2, \ldots, e_t$ is a sequence of independent and identically distributed random variables with mean $0$ and variance $\sigma_e^2$, then the observation at time $t = 1$ is
$$Y_1 = e_1$$
At time $t = 2$,
$$Y_2 = e_1 + e_2$$
and this continues, so that
$$Y_t = e_1 + e_2 + \cdots + e_t$$
Thus the observation at time $t$ can be expressed as
$$Y_t = Y_{t-1} + e_t \tag{3.8}$$
Now, the mean of $Y_t$ is given by
$$\mu_t = E(Y_t) = E(e_1) + E(e_2) + \cdots + E(e_t) = 0 \tag{3.9}$$
and the variance is given by
$$\mathrm{Var}(Y_t) = \mathrm{Var}(e_1) + \mathrm{Var}(e_2) + \cdots + \mathrm{Var}(e_t) = t\sigma_e^2 \tag{3.10}$$
The covariance between two time points $t$ and $s$, where $s - t = k$, is
$$\gamma_{t,s} = \mathrm{Cov}(Y_t, Y_s) = \mathrm{Cov}\!\left(\sum_{i=1}^{t} e_i,\; \sum_{j=1}^{s} e_j\right)$$
Since the $e_i$ are independent, only the terms $\mathrm{Cov}(e_i, e_i) = \mathrm{Var}(e_i)$ with $i \le t$ survive, so
$$\gamma_{t,s} = \mathrm{Var}(e_1) + \cdots + \mathrm{Var}(e_t) = t\sigma_e^2 \tag{3.11}$$
Hence the autocovariance function of the process is
$$\gamma_{t,s} = t\sigma_e^2 \quad \text{for } 1 \le t \le s$$
and the autocorrelation function can then be expressed as
$$\rho_{t,s} = \frac{\gamma_{t,s}}{\sqrt{\gamma_{t,t}\,\gamma_{s,s}}} = \sqrt{\frac{t}{s}} \quad \text{for } 1 \le t \le s \tag{3.12}$$
From (3.12) it is clear that the larger the lag $k = s - t$, the smaller the value of $\rho_{t,s}$; that is, the correlation between two observations weakens as the lag between them increases.
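
A minimal sketch of the random walk (3.8) in R, assuming standard normal noise (that is, $\sigma_e^2 = 1$):

# Simulate Y_t = Y_{t-1} + e_t by accumulating iid noise terms.
set.seed(1)
e <- rnorm(100)       # e_1, ..., e_100
Y <- cumsum(e)        # Y_t = e_1 + e_2 + ... + e_t
plot.ts(Y, main = "A simulated random walk")
cor(Y[-100], Y[-1])   # strong lag-1 correlation, as rho = sqrt(t/s) suggests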

3.2.5 Simple Moving Average (SMA)

The arithmetic moving average, calculated by adding the observations of a time series over a certain number of time points and then dividing the result by that number of time points, is called the simple moving average or SMA. The total number of time points taken into account can be varied if needed.

Say $Y_{-(n-1)}, \ldots, Y_{-3}, Y_{-2}, Y_{-1}, Y_0$ is a time series. If we wish to calculate its simple moving average over $n$ time points at $t = 0$, then
$$SMA = \frac{Y_{-(n-1)} + \cdots + Y_{-1} + Y_0}{n} \tag{3.13}$$
The simple moving average smooths the series by filtering out the noise and thus helps to reveal trends in the series. Often a weighted moving average is used instead, with the weights chosen according to the requirements of the analysis.

If a simple moving average of order 12 is applied to the time series of the annual diameter of the hem of women's skirts, we get the output shown in Figure 3.2.


Figure 3.2: Time Series Plot of Simple Moving Average of order 12

It would not be an easy task to see trends visually in most time series data, such as the
data of daily stock prices, as there would be a lot more fluctuations in them. Hence,
smoothing out the time series data, using the simple moving average method, as we did
in Figure 3.2, will help us to see the trends clearly.
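
A minimal sketch of (3.13) in R, using the base function stats::filter() on an illustrative simulated series (the skirt-hem data itself is not reproduced here):

# Simple moving average of order 12: each value is the mean of the current
# observation and the 11 preceding ones.
set.seed(1)
y <- ts(cumsum(rnorm(120)))                         # an illustrative noisy series
sma <- stats::filter(y, rep(1 / 12, 12), sides = 1)
plot(y, col = "grey")
lines(sma, col = "red")                             # the smoothed trend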

3.2.6 Stationarity
Say $\{Y_t\}$ is a time series. If the joint probability distribution of $Y_{t_1}, Y_{t_2}, \ldots, Y_{t_n}$ is the same as the joint probability distribution of $Y_{t_1-k}, Y_{t_2-k}, \ldots, Y_{t_n-k}$ for all sets of time points $t_1, t_2, \ldots, t_n$ and all lags $k$, then $\{Y_t\}$ is said to be a strictly stationary process. Shifting the time origin by $k$ does not affect the joint probability distribution, which implies that the joint distribution depends only on the intervals between $t_1, t_2, \ldots, t_n$.

If $n = 1$, the univariate distribution of $Y_t$ is the same for all $t$, so both the mean function and the variance remain constant:
$$\mu_t = \mu, \qquad \mathrm{Var}(Y_t) = \sigma^2$$

If $n = 2$, the bivariate distribution of $Y_{t_1}$ and $Y_{t_2}$ is the same as that of $Y_{t_1-k}$ and $Y_{t_2-k}$. Therefore,
$$\gamma_{t_1,t_2} = \mathrm{Cov}(Y_{t_1}, Y_{t_2}) = \mathrm{Cov}(Y_{t_1-k}, Y_{t_2-k})$$
Setting $k = t_1$ and then $k = t_2$,
$$\gamma_{t_1,t_2} = \mathrm{Cov}(Y_0, Y_{t_2-t_1}) = \mathrm{Cov}(Y_{t_1-t_2}, Y_0) = \mathrm{Cov}(Y_0, Y_{|t_1-t_2|}) = \gamma_{0,|t_1-t_2|}$$
Hence the covariance between $Y_{t_1}$ and $Y_{t_2}$ depends only on the lag between them and not on the actual time points $t_1$ and $t_2$. Therefore, for stationary processes, we can write the autocovariance and autocorrelation functions as follows:
$$\gamma_k = \mathrm{Cov}(Y_t, Y_{t-k}) \tag{3.14}$$
$$\rho_k = \mathrm{Corr}(Y_t, Y_{t-k}) \tag{3.15}$$
Moreover,
$$\rho_k = \frac{\gamma_k}{\gamma_0} \tag{3.16}$$

$\{Y_t\}$ is termed a weakly stationary or second-order stationary process if its mean function is constant over time and its covariance depends only on the lag, not on the actual time $t$.

In this paper, we will only discuss univariate, weakly stationary time series.

3.2.7 White Noise

White noise is a sequence $\{e_t\}$ of independent, identically distributed random variables with mean zero and variance $\sigma_e^2$. It is a strictly stationary series, where
$$\gamma_k = \begin{cases} \mathrm{Var}(e_t), & k = 0 \\ 0, & k \neq 0 \end{cases} \tag{3.17}$$
and
$$\rho_k = \begin{cases} 1, & k = 0 \\ 0, & k \neq 0 \end{cases} \tag{3.18}$$

3.3 Trend Estimation

In stationary time series, we assume that the mean function is constant. In practice, however, that is rarely the case, and so we often need to consider mean functions that are simple functions of time, or trends. These trends can be either stochastic or deterministic. Stochastic trends are impossible to model because they tend to show completely different characteristics with every simulation, e.g. the random walk model. A deterministic trend, on the other hand, can be modeled using deterministic functions to represent it. For example, a possible model of a time series with a deterministic trend could be
$$Y_t = \mu_t + X_t$$
where $\mu_t$ is a deterministic function and $X_t$ is the unobserved deviation from $\mu_t$, having zero mean. We might consider $\mu_t$ to be periodic; we could also assume it to be a linear or quadratic function of time. However, it must be kept in mind that whenever we state that $E(X_t) = 0$, we are assuming that the trend $\mu_t$ lasts forever.

3.3.1 Estimating Constant Mean

Say,
$$Y_t = \mu + X_t \tag{3.19}$$
The most common estimate of the constant mean in (3.19) is the sample mean, calculated from the observed time series $Y_1, Y_2, \ldots, Y_n$:
$$\bar{Y} = \frac{1}{n}\sum_{t=1}^{n} Y_t$$
Therefore, $E(\bar{Y}) = \mu$.
To determine the accuracy of $\bar{Y}$ as an estimate of $\mu$, we have to make some assumptions about $X_t$ and test them. Say $\{Y_t\}$, or equivalently $\{X_t\}$, is a stationary time series with autocorrelation function $\rho_k$. Then,
$$\mathrm{Var}(\bar{Y}) = \frac{1}{n^2}\,\mathrm{Var}\!\left[\sum_{t=1}^{n} Y_t\right] = \frac{1}{n^2}\sum_{t=1}^{n}\sum_{s=1}^{n}\mathrm{Cov}(Y_t, Y_s) = \frac{1}{n^2}\sum_{t=1}^{n}\sum_{s=1}^{n}\gamma_{t-s}$$
Now, putting $t - s = k$ and $t = j$, the index set transforms as
$$\{1 \le t \le n,\ 1 \le s \le n\} \implies \{k+1 \le j \le n+k,\ 1 \le j \le n\} \implies \{k > 0,\ k+1 \le j \le n\} \cup \{k \le 0,\ 1 \le j \le n+k\}$$
Therefore,
$$\mathrm{Var}(\bar{Y}) = \frac{1}{n^2}\left[\sum_{k=1}^{n-1}\sum_{j=k+1}^{n}\gamma_k + \sum_{k=-n+1}^{0}\sum_{j=1}^{n+k}\gamma_k\right] = \frac{1}{n^2}\left[\sum_{k=1}^{n-1}(n-k)\gamma_k + \sum_{k=-n+1}^{0}(n+k)\gamma_k\right]$$
$$= \frac{1}{n}\sum_{k=-n+1}^{n-1}\left(1 - \frac{|k|}{n}\right)\gamma_k = \frac{\gamma_0}{n}\sum_{k=-n+1}^{n-1}\left(1 - \frac{|k|}{n}\right)\rho_k \tag{3.20}$$
(3.20) is used to evaluate the estimate of $\mu$. Assuming different models for $\{X_t\}$, the corresponding value of $\rho_k$ is substituted and the variance of the estimate is approximated. If the variance of the estimate varies with the sample size $n$, then the estimate is rejected.
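
A minimal numerical sketch of (3.20) in R, under the illustrative assumption that $\rho_k = \phi^{|k|}$ (an AR(1)-type correlation structure, not a result taken from any dataset in this thesis):

# Evaluate Var(Ybar) from equation (3.20) for an assumed rho_k = phi^|k|.
var_sample_mean <- function(n, phi, gamma0 = 1) {
  k <- seq(-(n - 1), n - 1)
  (gamma0 / n) * sum((1 - abs(k) / n) * phi^abs(k))
}
var_sample_mean(n = 100, phi = 0.5)  # about 3 x gamma0/n: positive correlation inflates the variance
var_sample_mean(n = 200, phi = 0.5)  # roughly halves as n doubles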

3.3.2 Estimating Non-Constant Mean

Regression analysis can be used to estimate the parameters of non-constant mean trend models. For example, if the trend is a linear function of time, then the mean is represented as
$$\mu_t = \beta_0 + \beta_1 t \tag{3.21}$$
where $\beta_0$ is the intercept and $\beta_1$ is the slope. Both need to be estimated, which is done by choosing the values of $\beta_0$ and $\beta_1$ that minimize
$$Q(\beta_0, \beta_1) = \sum_{t=1}^{n}\left[Y_t - (\beta_0 + \beta_1 t)\right]^2$$
The solution is
$$\hat{\beta}_1 = \frac{\sum_{t=1}^{n}(Y_t - \bar{Y})(t - \bar{t})}{\sum_{t=1}^{n}(t - \bar{t})^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{t} \tag{3.22}$$
where $\bar{t} = \frac{n+1}{2}$. The least squares estimate of the slope can also be expressed as
$$\hat{\beta}_1 = \frac{\sum_{t=1}^{n}(t - \bar{t})Y_t}{\sum_{t=1}^{n}(t - \bar{t})^2} \tag{3.23}$$

Say $c_1, c_2, \ldots, c_m$ are constants and $t_1, t_2, \ldots, t_m$ are time points. Then
$$\mathrm{Var}\!\left[\sum_{i=1}^{m} c_i Y_{t_i}\right] = \sum_{i=1}^{m} c_i^2\,\mathrm{Var}(Y_{t_i}) + 2\sum_{i=2}^{m}\sum_{j=1}^{i-1} c_i c_j\,\mathrm{Cov}(Y_{t_i}, Y_{t_j}) \tag{3.24}$$

Using (3.23) and (3.24), we find the variance of $\hat{\beta}_1$:
$$\mathrm{Var}(\hat{\beta}_1) = \mathrm{Var}\!\left(\frac{\sum_{t=1}^{n}(t-\bar{t})Y_t}{\sum_{t=1}^{n}(t-\bar{t})^2}\right) = \left\{\frac{1}{\sum_{t=1}^{n}(t-\bar{t})^2}\right\}^2\,\mathrm{Var}\!\left(\sum_{t=1}^{n}(t-\bar{t})Y_t\right)$$
$$= \frac{144}{n^2(n^2-1)^2}\left[\sum_{t=1}^{n}(t-\bar{t})^2\,\mathrm{Var}(Y_t) + 2\sum_{s=2}^{n}\sum_{t=1}^{s-1}(t-\bar{t})(s-\bar{t})\,\mathrm{Cov}(Y_t, Y_s)\right]$$
$$= \frac{144}{n^2(n^2-1)^2}\left[\frac{n(n^2-1)}{12}\gamma_0 + 2\sum_{s=2}^{n}\sum_{t=1}^{s-1}(t-\bar{t})(s-\bar{t})\gamma_{s-t}\right]$$
$$= \frac{12\gamma_0}{n(n^2-1)} + \frac{288\gamma_0}{n^2(n^2-1)^2}\sum_{s=2}^{n}\sum_{t=1}^{s-1}(t-\bar{t})(s-\bar{t})\rho_{s-t}$$
$$= \frac{12\gamma_0}{n(n^2-1)}\left[1 + \frac{24}{n(n^2-1)}\sum_{s=2}^{n}\sum_{t=1}^{s-1}(t-\bar{t})(s-\bar{t})\rho_{s-t}\right] \tag{3.25}$$
Here we replaced $\sum_{t=1}^{n}(t-\bar{t})^2$ by $\frac{n(n^2-1)}{12}$.
Using (3.25), the precision of the estimates of the linear trend model can be evaluated in the same way as that of the constant mean model.
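
A minimal sketch of fitting the linear trend (3.21) by least squares in R, with an illustrative simulated series standing in for real data:

# Least squares fit of Y_t = beta0 + beta1 * t + X_t using lm().
set.seed(1)
t <- 1:100
y <- 10 + 0.5 * t + rnorm(100, sd = 3)   # true beta0 = 10, beta1 = 0.5
fit <- lm(y ~ t)
coef(fit)                                # estimates of the intercept and slope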

If the data is monthly seasonal, then it is assumed that there are twelve constants $\beta_1, \beta_2, \ldots, \beta_{12}$, each equal to the average of the observations for the corresponding month:
$$\mu_t = \begin{cases} \beta_1, & t = 1, 13, \ldots \\ \beta_2, & t = 2, 14, \ldots \\ \;\vdots & \\ \beta_{12}, & t = 12, 24, \ldots \end{cases} \tag{3.26}$$
(3.26) is also known as the seasonal means model. The estimate of any parameter of the seasonal model, say $\hat{\beta}_j$, is given by
$$\hat{\beta}_j = \frac{1}{N}\sum_{i=0}^{N-1} Y_{j+12i}$$
where $N$ is the number of years of monthly data.

Since $\hat{\beta}_j$ is like $\bar{Y}$ computed over every 12th value only, we can transform (3.20) as follows:
$$\mathrm{Var}(\hat{\beta}_j) = \frac{\gamma_0}{N}\left[1 + 2\sum_{k=1}^{N-1}\left(1 - \frac{k}{N}\right)\rho_{12k}\right] \quad \text{for } j = 1, 2, \ldots, 12 \tag{3.27}$$
Using (3.27), the precision of the estimates of the seasonal model can be evaluated in the same way as that of the constant mean model.
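
A minimal sketch of the seasonal means model (3.26) in R, assuming illustrative monthly data; regressing on a month factor without an intercept yields one coefficient per month (the season() helper in the TSA package serves the same purpose):

# Fit twelve monthly means by dropping the intercept from the regression.
set.seed(1)
y <- ts(rep(1:12, 10) + rnorm(120), frequency = 12)  # 10 years of monthly data
month <- factor(cycle(y))                            # month label 1..12 for each point
fit <- lm(y ~ month - 1)                             # "- 1" removes the intercept
coef(fit)                                            # beta_1, ..., beta_12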


Some seasonal trends can be modeled using cosine curves, which maintain smoothness in the transitions from one time period to another. Such a model is
$$\mu_t = \beta\cos(2\pi f t + \Phi) \tag{3.28}$$
where $\beta$ is the amplitude, $f$ the frequency, and $\Phi$ the phase of the curve. As $t$ changes, the curve fluctuates between the highest value $\beta$ and the lowest value $-\beta$.
A more convenient form of (3.28) is
$$\beta\cos(2\pi f t + \Phi) = \beta_1\cos(2\pi f t) + \beta_2\sin(2\pi f t) \tag{3.29}$$
where
$$\beta = \sqrt{\beta_1^2 + \beta_2^2}, \qquad \Phi = \operatorname{atan}\!\left(-\frac{\beta_2}{\beta_1}\right) \tag{3.30}$$
and, conversely,
$$\beta_1 = \beta\cos(\Phi), \qquad \beta_2 = -\beta\sin(\Phi) \tag{3.31}$$
Thus, the simplest model for the mean would be
$$\mu_t = \beta_0 + \beta_1\cos(2\pi f t) + \beta_2\sin(2\pi f t) \tag{3.32}$$
Say the frequency is $f = \frac{m}{n}$, where $m$ is an integer and $1 \le m < \frac{n}{2}$. Then
$$\hat{\beta}_1 = \frac{2}{n}\sum_{t=1}^{n}\cos\!\left(\frac{2\pi m t}{n}\right)Y_t, \qquad \hat{\beta}_2 = \frac{2}{n}\sum_{t=1}^{n}\sin\!\left(\frac{2\pi m t}{n}\right)Y_t \tag{3.33}$$
Using (3.33) and (3.24), we get the variance of $\hat{\beta}_1$:
$$\mathrm{Var}(\hat{\beta}_1) = \mathrm{Var}\!\left(\frac{2}{n}\sum_{t=1}^{n}\cos\!\left(\frac{2\pi m t}{n}\right)Y_t\right)$$
$$= \frac{4}{n^2}\left[\sum_{t=1}^{n}\cos^2\!\left(\frac{2\pi m t}{n}\right)\mathrm{Var}(Y_t) + 2\sum_{s=2}^{n}\sum_{t=1}^{s-1}\cos\!\left(\frac{2\pi m t}{n}\right)\cos\!\left(\frac{2\pi m s}{n}\right)\mathrm{Cov}(Y_t, Y_s)\right]$$
$$= \frac{4}{n^2}\cdot\frac{n}{2}\,\gamma_0 + \frac{8}{n^2}\sum_{s=2}^{n}\sum_{t=1}^{s-1}\cos\!\left(\frac{2\pi m t}{n}\right)\cos\!\left(\frac{2\pi m s}{n}\right)\gamma_{s-t}$$
$$= \frac{2\gamma_0}{n}\left[1 + \frac{4}{n}\sum_{s=2}^{n}\sum_{t=1}^{s-1}\cos\!\left(\frac{2\pi m t}{n}\right)\cos\!\left(\frac{2\pi m s}{n}\right)\rho_{s-t}\right] \tag{3.34}$$
Here we replaced $\sum_{t=1}^{n}\cos^2\!\left(\frac{2\pi m t}{n}\right)$ by $\frac{n}{2}$.
Similarly, the variance of $\hat{\beta}_2$ can be calculated by replacing the cosines with sines in the above derivation.
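
A minimal sketch of fitting the cosine trend model (3.32) in R, assuming an illustrative monthly frequency $f = 1/12$; the amplitude can then be recovered via (3.30):

# Harmonic regression: the cos and sin terms enter lm() as ordinary regressors.
set.seed(1)
t <- 1:120
f <- 1 / 12
y <- 5 + 2 * cos(2 * pi * f * t + 0.6) + rnorm(120)
fit <- lm(y ~ cos(2 * pi * f * t) + sin(2 * pi * f * t))
coef(fit)                 # beta0, beta1, beta2 as in (3.32)
b <- coef(fit)[2:3]
sqrt(sum(b^2))            # estimated amplitude beta, close to the true value 2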


3.3.3 Analyzing Estimated Outputs

After estimating $\mu_t$, we can estimate the stochastic component $X_t$ for each $t$ using
$$\hat{X}_t = Y_t - \hat{\mu}_t$$
The standard deviation of $\{X_t\}$ can be calculated, if its variance is constant, by the residual standard deviation, which is given by
$$s = \sqrt{\frac{1}{n-p}\sum_{t=1}^{n}(Y_t - \hat{\mu}_t)^2} \tag{3.35}$$
where $p$ is the number of parameters estimated for $\mu_t$ and $n - p$ is the degrees of freedom for $s$. $s$ is an absolute measure of the estimated trend's goodness of fit: the smaller its value, the better the fit.
Another measure of the estimated trend's goodness of fit is the value of $R^2$, also known as the coefficient of determination or multiple R-squared. It is unitless and is defined as the square of the sample correlation coefficient between the series of observations and the estimated trend.
The adjusted $R^2$ gives an approximately unbiased estimate that accounts for the number of parameters estimated in the trend; its value is a small adjustment to $R^2$. If there are several models with different numbers of parameters, the adjusted $R^2$ helps to compare them.
The standard deviations, also known as Std. Errors, of the estimated coefficients should not be taken into consideration unless the stochastic component is found to be white noise. Dividing each estimated regression coefficient by its standard error gives the t-values or t-ratios; these too are of no use if the stochastic component is not white noise.
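
In R, all of these quantities can be read from summary() of a fitted lm object; a minimal sketch, reusing the illustrative linear-trend fit from above:

# s, R-squared, adjusted R-squared and t-values in one summary() call.
set.seed(1)
t <- 1:100
y <- 10 + 0.5 * t + rnorm(100, sd = 3)
sm <- summary(lm(y ~ t))
sm$sigma                  # residual standard deviation, equation (3.35)
sm$r.squared              # multiple R-squared
sm$adj.r.squared          # adjusted R-squared
coef(sm)                  # Estimate, Std. Error and t value for each coefficient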

3.4 Time Series Models

Autoregressive Integrated Moving Average models, also known as ARIMA models, are one of several approaches to time series forecasting. They aim to exploit the autocorrelations in the time series data to make future predictions. ARIMA models are a broad class of parametric models that have gained a lot of popularity in time series forecasting. The fundamental concepts of ARIMA models are discussed in this section.

3.4.1 General Linear Process

A general linear process is a weighted linear combination of present and past white noise terms. If $\{Y_t\}$ is an observed time series and $\{e_t\}$ is an unobserved white noise series, where $e_1, e_2, \ldots, e_t$ are independent, identically distributed random variables, then $\{Y_t\}$ can be expressed as a general linear process in the following manner:
$$Y_t = \Psi_0 e_t + \Psi_1 e_{t-1} + \Psi_2 e_{t-2} + \cdots \tag{3.36}$$
In order to make the infinite series on the right-hand side of (3.36) meaningful, we assume that
$$\sum_{i=1}^{\infty}\Psi_i^2 < \infty \tag{3.37}$$
Since $\{e_t\}$ is not observable, we can take $\Psi_0 = 1$. Therefore,
$$Y_t = e_t + \Psi_1 e_{t-1} + \Psi_2 e_{t-2} + \cdots \tag{3.38}$$
The $\Psi$s are often considered to form an exponentially decaying sequence, $\Psi_j = \phi^j$, where the value of $\phi$ is strictly between $-1$ and $+1$. Then (3.38) becomes
$$Y_t = e_t + \phi e_{t-1} + \phi^2 e_{t-2} + \cdots \tag{3.39}$$
The mean function of $Y_t$ is
$$E(Y_t) = E(e_t) + \phi E(e_{t-1}) + \phi^2 E(e_{t-2}) + \cdots = 0$$
The variance of $Y_t$ is
$$\mathrm{Var}(Y_t) = \mathrm{Var}(e_t) + \phi^2\,\mathrm{Var}(e_{t-1}) + \phi^4\,\mathrm{Var}(e_{t-2}) + \cdots = \sigma_e^2(1 + \phi^2 + \phi^4 + \cdots) = \frac{\sigma_e^2}{1-\phi^2}$$
The covariance between two consecutive observations is
$$\mathrm{Cov}(Y_t, Y_{t-1}) = \phi\sigma_e^2(1 + \phi^2 + \phi^4 + \cdots) = \frac{\phi\sigma_e^2}{1-\phi^2}$$
since only the products of matching noise terms contribute. Therefore, the covariance between two observations that are $k$ lags apart is
$$\mathrm{Cov}(Y_t, Y_{t-k}) = \frac{\phi^k\sigma_e^2}{1-\phi^2}$$
The autocovariance function thus depends only on the lag $k$ and not on the actual time, so we can conclude that $Y_t$ is a stationary process.
Now, the correlation between two consecutive observations is
$$\mathrm{Corr}(Y_t, Y_{t-1}) = \frac{\mathrm{Cov}(Y_t, Y_{t-1})}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-1})}} = \frac{\phi\sigma_e^2/(1-\phi^2)}{\sigma_e^2/(1-\phi^2)} = \phi$$
Therefore, the correlation between two observations that are $k$ lags apart is
$$\mathrm{Corr}(Y_t, Y_{t-k}) = \phi^k \tag{3.40}$$
Hence, for a general linear process with $\Psi_0 = 1$, we have
$$E(Y_t) = 0, \qquad \gamma_k = \sigma_e^2\sum_{i=0}^{\infty}\Psi_i\Psi_{i+k} \quad \text{for } k \ge 0 \tag{3.41}$$
If the general linear process has non-zero mean, then it can be expressed as follows:
$$Y_t = \mu + e_t + \Psi_1 e_{t-1} + \Psi_2 e_{t-2} + \cdots$$
3.4.2 Moving Average Process

A moving average (MA) process is a general linear process having a finite number of $\Psi$ weights. It is represented as follows:
$$Y_t = e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2} - \cdots - \theta_q e_{t-q} \tag{3.42}$$
Such a process is known as a moving average process of order $q$, MA(q).

It is called a moving average process because the weights $1, -\theta_1, -\theta_2, \ldots, -\theta_q$ are applied to the variables $e_t, e_{t-1}, \ldots, e_{t-q}$ to get $Y_t$. The weights are then shifted once to the right and applied to the next set of variables $e_{t+1}, e_t, \ldots, e_{t-q+1}$ to get $Y_{t+1}$, and so on.

MA(1) Process

An MA(1) process is represented as follows:
$$Y_t = e_t - \theta e_{t-1}$$
Here,
$$E(Y_t) = E(e_t) - \theta E(e_{t-1}) = 0$$
$$\gamma_0 = \mathrm{Var}(Y_t) = \mathrm{Var}(e_t) + \theta^2\,\mathrm{Var}(e_{t-1}) = \sigma_e^2(1 + \theta^2)$$
$$\gamma_1 = \mathrm{Cov}(Y_t, Y_{t-1}) = \mathrm{Cov}(e_t - \theta e_{t-1},\; e_{t-1} - \theta e_{t-2}) = -\theta\sigma_e^2$$
since the only contributing term is $\mathrm{Cov}(-\theta e_{t-1}, e_{t-1})$. Similarly,
$$\gamma_2 = \mathrm{Cov}(Y_t, Y_{t-2}) = \mathrm{Cov}(e_t - \theta e_{t-1},\; e_{t-2} - \theta e_{t-3}) = 0$$
and in general $\gamma_k = \mathrm{Cov}(Y_t, Y_{t-k}) = 0$ whenever $k \ge 2$. Therefore,
$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{-\theta\sigma_e^2}{\sigma_e^2(1+\theta^2)} = \frac{-\theta}{1+\theta^2}$$
And $\rho_k = 0$ for $k \ge 2$.
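
This cut-off in the ACF can be checked in R with ARMAacf(). Note the sign convention: R writes the MA(1) model as $Y_t = e_t + \theta e_{t-1}$, so the $\theta$ of this chapter enters with a minus sign:

# Theoretical ACF of an MA(1) with theta = 0.7 in this chapter's convention.
theta <- 0.7
-theta / (1 + theta^2)             # rho_1 = -0.4698...
ARMAacf(ma = -theta, lag.max = 3)  # lag 1 matches; lags 2 and 3 are exactly zero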

MA(2) Process

An MA(2) process is represented by
$$Y_t = e_t - \theta_1 e_{t-1} - \theta_2 e_{t-2}$$
Here,
$$E(Y_t) = E(e_t) - \theta_1 E(e_{t-1}) - \theta_2 E(e_{t-2}) = 0$$
$$\gamma_0 = \mathrm{Var}(Y_t) = \mathrm{Var}(e_t) + \theta_1^2\,\mathrm{Var}(e_{t-1}) + \theta_2^2\,\mathrm{Var}(e_{t-2}) = \sigma_e^2(1 + \theta_1^2 + \theta_2^2)$$
$$\gamma_1 = \mathrm{Cov}(Y_t, Y_{t-1}) = \mathrm{Cov}(-\theta_1 e_{t-1}, e_{t-1}) + \mathrm{Cov}(-\theta_2 e_{t-2}, -\theta_1 e_{t-2}) = (-\theta_1 + \theta_1\theta_2)\sigma_e^2$$
$$\gamma_2 = \mathrm{Cov}(Y_t, Y_{t-2}) = \mathrm{Cov}(-\theta_2 e_{t-2}, e_{t-2}) = -\theta_2\sigma_e^2$$
and $\gamma_k = \mathrm{Cov}(Y_t, Y_{t-k}) = 0$ whenever $k \ge 3$. Therefore,
$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{-\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}$$
$$\rho_2 = \frac{\gamma_2}{\gamma_0} = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2}$$
And $\rho_k = 0$ for $k \ge 3$.

MA(q) Process

For the MA(q) process in (3.42), the mean function is given by
$$E(Y_t) = E(e_t) - \theta_1 E(e_{t-1}) - \cdots - \theta_q E(e_{t-q}) = 0$$
The variance of $Y_t$ is
$$\mathrm{Var}(Y_t) = \mathrm{Var}(e_t) + \theta_1^2\,\mathrm{Var}(e_{t-1}) + \cdots + \theta_q^2\,\mathrm{Var}(e_{t-q}) = \sigma_e^2(1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2) \tag{3.43}$$
The autocovariance and autocorrelation functions are
$$\gamma_k = \begin{cases} \sigma_e^2(-\theta_k + \theta_1\theta_{k+1} + \theta_2\theta_{k+2} + \cdots + \theta_{q-k}\theta_q), & k \le q \\ 0, & k > q \end{cases} \tag{3.44}$$
$$\rho_k = \begin{cases} \dfrac{-\theta_k + \theta_1\theta_{k+1} + \theta_2\theta_{k+2} + \cdots + \theta_{q-k}\theta_q}{1 + \theta_1^2 + \theta_2^2 + \cdots + \theta_q^2}, & k \le q \\ 0, & k > q \end{cases} \tag{3.45}$$

Invertibility
In the case of an MA(1) process, both $\theta$ and $\frac{1}{\theta}$ give the same ACF. This is not acceptable, as it would lead to wrong estimates of the parameters during model specification; it has to be ensured that no two values of the same parameter lead to the same ACF for an MA process. Invertible MA processes are those with a unique ACF. Let us consider the following MA(1) process:
$$Y_t = e_t - \theta e_{t-1} \quad\Longrightarrow\quad e_t = Y_t + \theta e_{t-1}$$
Replacing $t$ by $t-1$, we get $e_{t-1} = Y_{t-1} + \theta e_{t-2}$. Therefore,
$$e_t = Y_t + \theta(Y_{t-1} + \theta e_{t-2}) = Y_t + \theta Y_{t-1} + \theta^2 e_{t-2}$$
The substitution may continue infinitely into the past if $|\theta| < 1$; the MA(1) model is thus inverted into an infinite-order AR model. Hence MA(1) is said to be invertible if $|\theta| < 1$.
The MA(q) characteristic polynomial is
$$\theta(x) = 1 - \theta_1 x - \theta_2 x^2 - \cdots - \theta_q x^q$$
and the MA characteristic equation is
$$1 - \theta_1 x - \theta_2 x^2 - \cdots - \theta_q x^q = 0$$
To show that the MA(q) model is invertible, we must show that coefficients $\pi_j$ exist such that
$$Y_t = \pi_1 Y_{t-1} + \pi_2 Y_{t-2} + \cdots + e_t$$
This is only possible if the roots of the MA characteristic equation all exceed 1 in absolute value.
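
This root condition is easy to check in R with the base function polyroot(), which takes polynomial coefficients in increasing order; a minimal sketch for an illustrative MA(2):

# Invertibility check: all roots of 1 - theta1*x - theta2*x^2 must have modulus > 1.
theta <- c(0.5, -0.3)            # illustrative values of theta_1 and theta_2
roots <- polyroot(c(1, -theta))  # coefficients of the MA characteristic polynomial
Mod(roots)                       # both moduli exceed 1, so this MA(2) is invertible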


3.4.3 Autoregressive Process

In autoregressive processes, the current value $Y_t$ is a linear combination of the $p$ most recent past values plus an error term $e_t$ at time $t$, into which everything not explained by the past values of the series is incorporated. If $Y_t$ is an autoregressive process of order $p$, that is, AR(p), then it can be expressed as follows:
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + e_t \tag{3.46}$$
Here, $e_t$ is independent of all past values $Y_{t-1}, Y_{t-2}, \ldots$

AR(1) Process
An AR(1) process can be written as
$$Y_t = \phi Y_{t-1} + e_t \tag{3.47}$$
Say the process mean has been subtracted, so that the mean function of the series is $E(Y_t) = 0$. The variance is
$$\gamma_0 = \mathrm{Var}(Y_t) = \mathrm{Var}(\phi Y_{t-1} + e_t) = \phi^2\gamma_0 + \sigma_e^2$$
$$\therefore\ \gamma_0 = \frac{\sigma_e^2}{1 - \phi^2} \tag{3.48}$$
where $\phi^2 < 1$.
If we multiply both sides of (3.47) by $Y_{t-k}$ and take expectations, we get
$$E(Y_t Y_{t-k}) = \phi E(Y_{t-1}Y_{t-k}) + E(e_t Y_{t-k})$$
Since $e_t$ is independent of $Y_{t-k}$, $E(e_t Y_{t-k}) = E(e_t)E(Y_{t-k}) = 0$, so
$$\gamma_k = \phi\gamma_{k-1} \quad \text{for } k \ge 1 \tag{3.49}$$
When $k = 1$, $\gamma_1 = \phi\gamma_0 = \phi\frac{\sigma_e^2}{1-\phi^2}$; when $k = 2$, $\gamma_2 = \phi\gamma_1 = \phi^2\frac{\sigma_e^2}{1-\phi^2}$. Therefore,
$$\gamma_k = \phi^k\frac{\sigma_e^2}{1-\phi^2} \tag{3.50}$$
and
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \phi^k \quad \text{for } k \ge 1 \tag{3.51}$$
Since $\phi^2 < 1$, $\rho_k$ decreases exponentially as the lag $k$ increases.

Now, substituting $Y_{t-1} = \phi Y_{t-2} + e_{t-1}$ into (3.47),
$$Y_t = \phi(\phi Y_{t-2} + e_{t-1}) + e_t = \phi^2 Y_{t-2} + \phi e_{t-1} + e_t$$
and, continuing the substitution,
$$Y_t = \phi^k Y_{t-k} + \phi^{k-1} e_{t-k+1} + \cdots + \phi^2 e_{t-2} + \phi e_{t-1} + e_t \tag{3.52}$$
If the series on the right-hand side of (3.52) is allowed to continue infinitely instead, we can write it as
$$Y_t = e_t + \phi e_{t-1} + \phi^2 e_{t-2} + \cdots \tag{3.53}$$
This is (3.38) with the $\Psi_j$ there replaced by $\phi^j$.
The AR characteristic polynomial of the AR(1) process is
$$\phi(x) = 1 - \phi x$$
It is used to determine the stationarity of the AR(1) process. The corresponding AR characteristic equation is
$$1 - \phi x = 0$$
The AR(1) model is stationary when the root of the characteristic equation exceeds 1 in absolute value. Thus $x = \frac{1}{\phi}$ has to exceed 1 in absolute value, which happens only when $|\phi| < 1$.
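
A minimal sketch in R: simulate a stationary AR(1) and compare the sample ACF with the theoretical $\phi^k$ of (3.51):

# arima.sim() itself rejects non-stationary coefficients, so |phi| < 1 is required.
set.seed(1)
phi <- 0.6
y <- arima.sim(model = list(ar = phi), n = 500)
acf(y, lag.max = 4, plot = FALSE)$acf  # sample ACF (lag 0 first)
phi^(0:4)                              # theoretical 1, 0.6, 0.36, 0.216, 0.1296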

AR(2) Process
An AR(2) process is written as,

Yt = φ1 Yt−1 + φ2 Yt−2 + et (3.54)

The following is an AR characteristic model for AR(2) process:

φ(x) = 1 − φ1 x − φ2 x2

It is used to explain the stationarity of AR(2) process. The corresponding AR


characteristic equation is,
1 − φ1 x − φ2 x 2 = 0

25
CHAPTER 3. SUMMARY OF THEORY

The stationarity condition of the AR(2) model is obtained from the roots of the characteristic equation: both roots, taken in absolute value, must exceed 1. Now, the roots of the characteristic equation are given by,

x = (φ1 ± √(φ1² + 4φ2 )) / (−2φ2 )    (3.55)

Here, |x| > 1 iff,

φ1 + φ2 < 1,  φ2 − φ1 < 1,  |φ2 | < 1    (3.56)
Multiplying both sides of (3.54) by Yt−k and taking expectations, we get,

E(Yt Yt−k ) = φ1 E(Yt−1 Yt−k ) + φ2 E(Yt−2 Yt−k ) + E(et Yt−k )

γk = φ1 γk−1 + φ2 γk−2 + 0
∴ γk = φ1 γk−1 + φ2 γk−2 for k ≥ 1    (3.57)

Dividing through by γ0 ,
γk /γ0 = φ1 γk−1 /γ0 + φ2 γk−2 /γ0
∴ ρk = φ1 ρk−1 + φ2 ρk−2 for k ≥ 1    (3.58)

(3.57) and (3.58) are known as the Yule Walker Equations.


The variance of the AR(2) model is,

γ0 = Var(Yt )
   = Var(φ1 Yt−1 + φ2 Yt−2 + et )
   = Var(φ1 Yt−1 + φ2 Yt−2 ) + Var(et )
   = Var(φ1 Yt−1 ) + Var(φ2 Yt−2 ) + 2Cov(φ1 Yt−1 , φ2 Yt−2 ) + σe²
   = φ1² γ0 + φ2² γ0 + 2φ1 φ2 γ1 + σe²
   = (φ1² + φ2² )γ0 + 2φ1 φ2 γ1 + σe²    (3.59)

If k = 1, (3.57) gives,

γ1 = φ1 γ0 + φ2 γ1
∴ γ1 = φ1 γ0 / (1 − φ2 )

Therefore, from (3.59),

γ0 = φ1² γ0 + φ2² γ0 + 2φ1 φ2 [φ1 γ0 /(1 − φ2 )] + σe²
   = [φ1² (1 − φ2 )γ0 + φ2² (1 − φ2 )γ0 + 2φ1² φ2 γ0 + (1 − φ2 )σe² ] / (1 − φ2 )
∴ (1 − φ2 )γ0 = γ0 [φ1² (1 − φ2 ) + φ2² (1 − φ2 ) + 2φ1² φ2 ] + (1 − φ2 )σe²

∴ γ0 = (1 − φ2 )σe² / [1 − φ2 − φ1² (1 − φ2 ) − φ2² (1 − φ2 ) − 2φ1² φ2 ]
     = (1 − φ2 )σe² / [(1 − φ2 )(1 − φ1² − φ2² ) − 2φ1² φ2 ]    (3.60)


AR(p) Process
The AR characteristic polynomial for an AR(p) model is,

φ(x) = 1 − φ1 x − φ2 x2 − ..... − φp xp (3.61)

The corresponding AR characteristic equation is then represented by,

1 − φ1 x − φ2 x2 − ..... − φp xp = 0 (3.62)

The stationarity conditions of the AR(p) model are obtained from the roots of the characteristic equation: all roots, taken in absolute value, must exceed 1. This implies the necessary conditions:
φ1 + φ2 + ... + φp < 1 and |φp | < 1    (3.63)
Multiplying both sides of (3.46) by Yt−k and taking expectations, we get,

E(Yt Yt−k ) = φ1 E(Yt−1 Yt−k ) + φ2 E(Yt−2 Yt−k ) + ...... + φp E(Yt−p Yt−k ) + E(et Yt−k )
∴ γk = φ1 γk−1 + φ2 γk−2 + ..... + φp γk−p + 0
γk = φ1 γk−1 + φ2 γk−2 + ..... + φp γk−p    (3.64)
∴ ρk = φ1 ρk−1 + φ2 ρk−2 + ..... + φp ρk−p for k ≥ 1    (3.65)

We get the following Yule Walker equations if we set k = 1, 2, .....p, ρ0 = 1 and ρ−k = ρk
in (3.65):

ρ1 = φ1 + φ2 ρ1 + φ3 ρ2 + ..... + φp ρp−1
ρ2 = φ1 ρ1 + φ2 + φ3 ρ1 + ..... + φp ρp−2
.
.
.
ρp = φ1 ρp−1 + φ2 ρp−2 + φ3 ρp−3 + ..... + φp    (3.66)

Multiplying both sides of (3.46) by Yt and taking expectations, we get,

E(Yt Yt ) = φ1 E(Yt−1 Yt ) + φ2 E(Yt−2 Yt ) + .... + φp E(Yt−p Yt ) + E(et Yt )

∴ γ0 = φ1 γ1 + φ2 γ2 + .... + φp γp + σe²
=⇒ 1 = φ1 ρ1 + φ2 ρ2 + .... + φp ρp + σe² /γ0
=⇒ σe² /γ0 = 1 − φ1 ρ1 − φ2 ρ2 − .... − φp ρp
∴ γ0 = σe² / (1 − φ1 ρ1 − φ2 ρ2 − .... − φp ρp )    (3.67)

3.4.4 ARMA Models


Autoregressive moving average models describe time series that are partly autoregressive and partly moving average. If a series Yt combines an autoregressive process of order p with a moving average process of order q, then it is known as an ARMA(p,q) process. It is expressed as follows:

Yt = φ1 Yt−1 + φ2 Yt−2 + ..... + φp Yt−p + et − θ1 et−1 − θ2 et−2 − .... − θq et−q (3.68)


ARMA(1,1) Model
An ARMA(1,1) model is shown below,
Yt = φYt−1 + et − θet−1 (3.69)
Here,

E(et Yt ) = E[et (φYt−1 + et − θet−1 )]
          = 0 + E(et et ) + 0
          = Var(et ) = σe²

E(et−1 Yt ) = E[et−1 (φYt−1 + et − θet−1 )]
            = φE(et−1 Yt−1 ) + E(et−1 et ) − θE(et−1 et−1 )
            = φσe² + 0 − θσe²
            = σe² (φ − θ)
Multiplying both sides of (3.69) by Yt−k (k ≥ 2) and taking expectations, we get,
E(Yt Yt−k ) = E(φYt−1 Yt−k + et Yt−k − θet−1 Yt−k )
∴ γk = φE(Yt−1 Yt−k ) + E(et Yt−k ) − θE(et−1 Yt−k )
∴ γk = φγk−1 + 0 − 0
∴ γk = φγk−1
If k=0,
γ0 = φE(Yt−1 Yt ) + E(et Yt ) − θE(et−1 Yt )
= φγ1 + σe2 − σe2 (φ − θ)θ
= φγ1 + σe2 [1 − (φ − θ)θ]
If k=1,
γ1 = φE(Yt−1 Yt−1 ) + E(et Yt−1 ) − θE(et−1 Yt−1 )
= φγ0 + 0 − θσe2
= φγ0 − θσe2
Therefore,

γ0 = φγ1 + σe² [1 − (φ − θ)θ]
γ1 = φγ0 − θσe²    (3.70)
γk = φγk−1 for k ≥ 2
From the first two equations of (3.70), we get,

γ0 = φ(φγ0 − θσe² ) + σe² [1 − (φ − θ)θ]
   = φ² γ0 − φθσe² + σe² − φθσe² + θ² σe²
   = φ² γ0 − 2φθσe² + σe² + θ² σe²
∴ γ0 = (σe² − 2φθσe² + θ² σe² ) / (1 − φ² )
     = [(1 − 2φθ + θ² ) / (1 − φ² )] σe²    (3.71)


If we solve the recursion, we get,

ρk = [(1 − θφ)(φ − θ) / (1 − 2θφ + θ² )] φ^(k−1) for k ≥ 1    (3.72)
To get the stationarity conditions of the ARMA(1,1) process, we have to ensure that the
absolute value of the root of the AR characteristic equation 1 − φx = 0 exceeds 1. This
happens when,
|φ| < 1
This is the stationarity condition of ARMA(1,1) model.

3.4.5 ARIMA Models


If a process {Yt } is non-stationary, then it means that it has a non-constant mean over
time. If we difference consecutive observations of the time series, then the mean gets
stabilized to some extent as the changes in the level get removed. If required, differencing
can be done more than once on the time series data to achieve stationarity.
If differencing is done once, it is called first order differencing,
∇Yt = Yt − Yt−1
If it is done twice, then it is called second order differencing, and so on. The second order difference looks like the following:
∇²Yt = ∇(∇Yt )
     = (Yt − Yt−1 ) − (Yt−1 − Yt−2 )
     = Yt − 2Yt−1 + Yt−2
If we perform the first order difference of the series in Figure 3.1, we get the output shown in Figure 3.3.

Figure 3.3: Plot of First Order Difference of the Diameter of Hem Series


It can be seen clearly in Figure 3.3 that the trend has been removed substantially from
the series. If we perform second order difference on the series, then we will get,

Figure 3.4: Plot of Second Order Difference of the Diameter of Hem Series

{Yt } will be an integrated autoregressive moving average process if its dth difference, denoted by Wt = ∇^d Yt , is a stationary ARMA process. Therefore, {Wt } follows an ARMA(p,q) model and {Yt } follows an ARIMA(p,d,q) model. Typically, the value of d is 0 (meaning no differencing), 1 or 2.
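In R, the ∇ operator corresponds to the base function diff(); the sketch below shows first and second order differencing on an illustrative numeric vector.

# Minimal sketch: first and second order differencing with diff(), which
# implements the nabla operator used above. y is an illustrative vector.
y <- c(1, 4, 9, 16, 25)
diff(y)                    # first difference:  3 5 7 9
diff(y, differences = 2)   # second difference: 2 2 2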

An ARIMA(p,1,q) series is represented as follows in terms of its observations:

Yt − Yt−1 = φ1 (Yt−1 − Yt−2 ) + φ2 (Yt−2 − Yt−3 ) + ..... + φp (Yt−p − Yt−p−1 )


+et − θ1 et−1 − θ2 et−2 − ...... − θq et−q
=⇒ Yt = (1 + φ1 )Yt−1 + (φ2 − φ1 )Yt−2 + (φ3 − φ2 )Yt−3 + .... + (φp − φp−1 )Yt−p − φp Yt−p−1
+et − θ1 et−1 − θ2 et−2 − ...... − θq et−q (3.73)

Or, it can be expressed as follows:

Wt = φ1 Wt−1 + φ2 Wt−2 + ..... + φp Wt−p + et − θ1 et−1 − θ2 et−2 − ...... − θq et−q (3.74)

(3.73), which looks like an ARMA(p+1,q) process, is also known as the difference equation
form of the ARIMA model.

The characteristic polynomial equation of ARIMA (p,d,q) model is as follows:

1−(1+φ1 )x−(φ2 −φ1 )x2 −.....−(φp −φp−1 )xp +φp xp+1 = (1−x)(1−φ1 x−φ2 x2 −....−φp xp )

From the above equation, we can see that one of the roots is x = 1, which implies
non-stationarity. The other roots are the roots of the characteristic polynomial equation


of the stationary time series ∇Yt .

We consider stationary time series to have zero mean. However, if we wish to accommodate
a non-zero constant mean µ in the ARMA process {Wt }, we can suppose that,

Wt −µ = φ1 (Wt−1 −µ)+φ2 (Wt−2 −µ)+.....+φp (Wt−p −µ)+et −θ1 et−1 −θ2 et−2 −......−θq et−q

We can also add a constant term θ0 to the model instead:

Wt = θ0 + φ1 Wt−1 + φ2 Wt−2 + ..... + φp Wt−p + et − θ1 et−1 − θ2 et−2 − ...... − θq et−q

If we take expectations on both sides of the above equation, we get,

E(Wt ) = θ0 + φ1 E(Wt−1 ) + φ2 E(Wt−2 ) + ..... + φp E(Wt−p ) + E(et )


−θ1 E(et−1 ) − θ2 E(et−2 ) − ...... − θq E(et−q )
∴ µ = θ0 + φ1 µ + φ2 µ + ..... + φp µ + 0 − 0 − 0 − ..... − 0
=⇒ µ = θ0 + (φ1 + φ2 + ..... + φp )µ
∴ µ = θ0 / (1 − φ1 − φ2 − ...... − φp )    (3.75)
∴ θ0 = µ(1 − φ1 − φ2 − ...... − φp )    (3.76)

Often, it is observed in time series that the higher its level is, the more variations it shows
around that level and vice versa. That is, its variance increases as its level increases. In
such cases, transforming the data-set to its log form will result in a series with constant
variance over time. If the level of the original time series varies exponentially with time,
then the new log transformed time series will show a linear time trend, which can be
removed by differencing. Therefore, the new series can be expressed as,

∇[log(Yt )] = Xt    (3.77)

In stock price predictions, often, the returns are considered to perform analysis. These
returns are the differences of the logarithms of the stock prices.
Data can also be transformed by using power functions. Such a transformation is known as a power transformation. For a given λ, it is defined by,

g(x) = (x^λ − 1)/λ, for λ ≠ 0
g(x) = log(x),      for λ = 0    (3.78)

The value of λ is estimated and used to transform the non-stationary time series. The power transformation only works if the data values are positive. Otherwise, the absolute value of the smallest observation is first added to all the data values to make them positive, and the transformation is then applied.

3.4.6 Backshift Operator


The backshift operator of a time series operates on the time index of the observations
of the series to produce the previous observations, e.g. BYt = Yt−1 .


Applying the backshift operator on the general MA(q) model, we get,

Yt = et − θ1 et−1 − θ2 et−2 − ....... − θq et−q
   = et − θ1 Bet − θ2 B²et − ....... − θq B^q et
   = (1 − θ1 B − θ2 B² − ....... − θq B^q )et
   = θ(B)et

Applying the backshift operator on the general AR(p) model, we get,

Yt = φ1 Yt−1 + φ2 Yt−2 + ..... + φp Yt−p + et

=⇒ et = Yt − φ1 Yt−1 − φ2 Yt−2 − ..... − φp Yt−p
      = Yt − φ1 BYt − φ2 B²Yt − ..... − φp B^p Yt
      = (1 − φ1 B − φ2 B² − ..... − φp B^p )Yt
      = φ(B)Yt

Applying the backshift operator on the general ARMA(p,q) model, we get,

φ(B)Yt = θ(B)et

Applying the backshift operator on the differencing equations, we get,

∇Yt = Yt − Yt−1
    = Yt − BYt
    = (1 − B)Yt

∇²Yt = Yt − 2Yt−1 + Yt−2
     = Yt − 2BYt + B²Yt
     = (1 − B)²Yt

Applying the backshift operator on the general ARIMA(p,d,q) model, we get,

φ(B)(1 − B)^d Yt = θ(B)et

3.5 Box-Jenkins Procedure


The ARIMA models discussed in Section 3.4 are fitted to time series data for further analysis and forecasting. George Box and Gwilym Jenkins established a method for finding the best fit of ARIMA models to past values of a time series. This method is known as the Box-Jenkins procedure and is widely used in time series analysis and forecasting.
The Box-Jenkins procedure consists of three steps:

1. Model Specification

2. Parameter Estimation

3. Model Diagnostics


3.5.1 Model Specification


Model specification involves determining reasonable yet tentative values for p, d and
q of the ARIMA(p,d,q) model to fit to the time series data. The tools that are used for
this purpose are discussed here.

Sample Autocorrelation Function

If Y1 , Y2 , ......Yn is the observed time series, then,

mean: Ȳ = (1/n) Σ_{t=1}^{n} Yt

autocovariance: ck = γ̂k = (1/n) Σ_{t=k+1}^{n} (Yt − Ȳ)(Yt−k − Ȳ)

autocorrelation: rk = ρ̂k = γ̂k / γ̂0
                = Σ_{t=k+1}^{n} (Yt − Ȳ)(Yt−k − Ȳ) / Σ_{t=1}^{n} (Yt − Ȳ)²    (3.79)
rk is the sample autocorrelation function and it is used to identify an MA(q) process. The plot of rk against lag k is called a correlogram. We know from (3.45) that for k > q, the autocorrelation function ρk becomes zero. So, if rk cuts off (drops inside the significance bounds) after a particular lag in the correlogram, then we can say that the series is an MA process, and the cut-off lag gives us the value of q.

Figure 3.5: ACF Plot of the Diameter of Hem Series


We plot the ACF of our example time series on the diameter of hem of skirts in Figure 3.5.
From the plot, we can see that the ACF exceeds the significance bound at lag 1. So, we
can guess that the series could be an MA(1) process.
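A minimal R sketch of this identification step on a simulated series: the sample ACF of an MA(1) process should cut off after lag 1. Note that R's arima.sim() writes the MA part with a plus sign, Yt = et + θet−1, so the coefficient below is −θ in the notation of this chapter.

# Minimal sketch: the sample ACF of a simulated MA(1) series cuts off after lag 1.
set.seed(42)
y <- arima.sim(model = list(ma = -0.7), n = 200)   # theta = 0.7 in our notation
acf(y)   # only the lag 1 autocorrelation should clearly cross the bounds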

Sample Partial Autocorrelation Function

rk works as a good indicator of the order q of a moving average process. However, in the case of autoregressive processes, the autocorrelation function never becomes zero after a certain number of lags; it simply dies off. Hence, we have to define a function for the correlation between two observations Yt and Yt−k in such a way that the effect of the intervening observations Yt−1 , Yt−2 , ......, Yt−k+1 is removed. This function is known as the partial autocorrelation function or PACF, and is denoted by φkk .
If {Yt } is a normally distributed time series, then φkk is defined by,

φkk = Corr(Yt , Yt−k |Yt−1 , Yt−2 , ......Yt−k+1 )    (3.80)

In case we wish to define φkk for both normally distributed and non-normally distributed
series, then, we can assume that the prediction of Yt is based on a linear combination of
its intervening variables:

β1 Yt−1 + β2 Yt−2 + ............. + βk−1 Yt−k+1

Here, βs are selected in such a way that the mean square error of the prediction gets
minimized. Since it is a stationary series, the prediction of Yt−k will also be based on a
linear combination of its intervening variables:

β1 Yt−k+1 + β2 Yt−k+2 + ............. + βk−1 Yt−1

Then the PACF at lag k will be,

φkk = Corr(Yt − β1 Yt−1 − β2 Yt−2 − ............. − βk−1 Yt−k+1 , Yt−k − β1 Yt−k+1 − β2 Yt−k+2 −
............. − βk−1 Yt−1 ) (3.81)

We always take φ11 to be equal to ρ1 , since at lag 1 there are no intervening observations. It can be shown that, based on Yt−1 alone, the best linear prediction of Yt is ρ1 Yt−1 . Therefore,

Cov(Yt − ρ1 Yt−1 , Yt−2 − ρ1 Yt−1 ) = Cov(Yt , Yt−2 ) − ρ1 Cov(Yt , Yt−1 ) − ρ1 Cov(Yt−1 , Yt−2 ) +
ρ21 Cov(Yt−1 , Yt−1 )
= γ2 − ρ1 γ1 − ρ1 γ1 + ρ21 γ0
= ρ2 γ0 − ρ21 γ0 − ρ21 γ0 + ρ21 γ0
= (ρ2 − ρ21 − ρ21 + ρ21 )γ0
= (ρ2 − ρ21 )γ0

Var(Yt − ρ1 Yt−1 ) = Var(Yt−2 − ρ1 Yt−1 )
                   = γ0 − ρ1² γ0
                   = γ0 (1 − ρ1² )


Therefore,

φ22 = (ρ2 − ρ1² )γ0 / [γ0 (1 − ρ1² )] = (ρ2 − ρ1² ) / (1 − ρ1² )    (3.82)

For an AR(1) model, where ρk = φ^k ,

φ22 = (ρ2 − ρ1² )/(1 − ρ1² ) = (φ² − φ² )/(1 − φ² ) = 0

Therefore, for AR(1) process, φkk = 0 for all k > 1. So, we can say that the PACF for an
AR(p) process would cut off when the lag becomes greater than its order. That is,

φkk = 0 for all k > p (3.83)

In the case of an MA(1) process, from (3.82), we get,

φ22 = [0 − (−θ/(1 + θ² ))² ] / [1 − (−θ/(1 + θ² ))² ]
    = −θ² / [(1 + θ² )² − θ² ]
    = −θ² / (1 + 2θ² + θ⁴ − θ² )
    = −θ² / (1 + θ² + θ⁴ )    (3.84)
Thus, for an MA(q) model, φkk never becomes exactly zero; it only dies off. The PACF can therefore be used as a tool to exclusively identify an AR process.

The value of φkk can be found by using the following Yule Walker equations:

ρj = φk1 ρj−1 + φk2 ρj−2 + ..... + φkk ρj−k for j = 1, 2, ....k (3.85)

Here, we assume that the values of ρ1 , ρ2 ...., ρk are given. By estimating the ρs with the sample autocorrelation functions, that is, by replacing the ρk s by rk s, we can solve (3.85) to get the sample partial autocorrelations (φ̂kk ). There is a method called the Levinson-Durbin method, by which we can show that (3.85) can be solved recursively to find an equation for φkk :

φkk = [ρk − Σ_{j=1}^{k−1} φk−1,j ρk−j ] / [1 − Σ_{j=1}^{k−1} φk−1,j ρj ]    (3.86)

Here, φk,j = φk−1,j − φkk φk−1,k−j for j = 1, 2, ....k − 1. We plot the PACF of our example
time series on the diameter of hem of skirts in Figure 3.6. From the plot, we can see that
the PACF exceeds the significance bound at lag 1. So, we can guess that the series could
be an AR(1) process.
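To make (3.86) concrete, here is a small R sketch of the Levinson-Durbin recursion, assuming the autocorrelations rho[1..K] are given; its output can be checked against the built-in ARMAacf().

# A small sketch of the Levinson-Durbin recursion (3.86): given autocorrelations
# rho[1..K], it returns the partial autocorrelations phi_kk for k = 1..K.
levinson_pacf <- function(rho) {
  K <- length(rho)
  phi <- matrix(0, K, K)          # phi[k, j] of the recursion
  phi[1, 1] <- rho[1]             # phi_11 = rho_1
  if (K > 1) for (k in 2:K) {
    j <- 1:(k - 1)
    phi[k, k] <- (rho[k] - sum(phi[k - 1, j] * rho[k - j])) /
                 (1 - sum(phi[k - 1, j] * rho[j]))
    phi[k, j] <- phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
  }
  diag(phi)
}
# For an AR(1) process with phi = 0.6, only the lag 1 PACF is non-zero:
levinson_pacf(0.6^(1:5))                      # 0.6, 0, 0, 0, 0
ARMAacf(ar = 0.6, lag.max = 5, pacf = TRUE)   # built-in check gives the same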


Figure 3.6: PACF Plot of the Diameter of Hem Series

Extended Autocorrelation Function

Both sample ACF and sample PACF are very effective in identifying the MA(q) and
AR(p) models respectively. However, in case of mixed ARMA(p,q) models, both the
ACF and the PACF tend to tail off instead of becoming zero within a finite number of
lags (See Table 3.1).

         AR(p)                   MA(q)                   ARMA(p,q)

ACF      Dies off                Cuts off after q lags   Dies off
PACF     Cuts off after p lags   Dies off                Dies off

Table 3.1: Behavior of ACF and PACF for Different ARMA Models

Various tools are used in such cases, where the series seems to follow a mixed ARMA model, such as the extended autocorrelation function (EACF), the corner method, the smallest canonical correlation method, etc. In our work, we consider only the EACF method.

In the EACF method, it is assumed that if we can determine the autoregressive part of
a mixed ARMA model, by ”filtering” it out from that model, we can get a pure moving
average process. We can then use sample ACF to determine the order of the moving
average part.

The coefficients of the autoregressive part are estimated by using a finite sequence of
regressions. If {Yt } is a true ARMA(1,1) model, then,

Yt = φYt−1 + et − θet−1


Performing a linear regression of Yt on Yt−1 gives an estimator of φ which is inconsistent. Performing another regression of Yt on Yt−1 and on the lag one residuals of the first regression results in a consistent estimator φ̃. Then, the autoregressive part of the series will be φ̃Yt−1 . By subtracting it from the series, we get an approximately pure moving average series:

Wt = Yt − φ̃Yt−1
Similarly, for an ARMA(p,q) process, we can estimate the autoregressive coefficients by a sequence of q regressions. If we consider the AR order to be k and the MA order to be j, then the remaining pure MA series will be:

Wt,k,j = Yt − φ̃1 Yt−1 − ......... − φ̃k Yt−k    (3.87)
The sample ACF of Wt,k,j is known as the extended ACF. It was suggested by Tsay and Tiao that the sample EACF information should be summarized in a table, where the (k,j)-th element is ’X’ if the sample ACF at lag j+1 of Wt,k,j is significantly different from zero, and ’O’ otherwise.
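A minimal R sketch of this table, assuming the TSA package (which implements the Tsay-Tiao EACF) is installed; the simulated ARMA(1,1) coefficients are illustrative.

# Minimal sketch: the EACF table of Tsay and Tiao via the TSA package.
library(TSA)
set.seed(1)
y <- arima.sim(model = list(ar = 0.6, ma = 0.4), n = 300)  # an ARMA-type series
eacf(y)   # look for the upper-left vertex of the triangle of o's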

Finding ’d’

If the sample ACF of a time series fails to die off quickly as the number of lags increases,
it can be assumed that the series is non-stationary. In such cases, by using the different
transformation methods, including differencing methods, we can turn it into a stationary
series. If any order of differencing is done on the series, then the value of d of the
ARIMA(p,d,q) model becomes that order.

Over-differencing
If we difference any stationary series, we get another stationary series. Over-differencing
a series can lead to various complications in the modeling process and so, care should be
taken while choosing a differencing order.

Dickey Fuller Unit Root Test


The Dickey Fuller unit root test is a method of hypothesis testing to find if a series is
stationary or not. In this test, the null hypothesis is that the series is non-stationary. To
reject the null hypothesis, the probability value has to be less than 0.1. Let us assume
that in the following model, {Xt } is a stationary AR(k) process:
Yt = αYt−1 + Xt
Here, {Yt } will be stationary if |α| < 1, and non-stationary if α = 1. Let a = α − 1, and note that under the null hypothesis that {Yt } is non-stationary, Xt = Yt − Yt−1 . Since {Xt } is a stationary AR(k) process, we get,

Yt − Yt−1 = αYt−1 − Yt−1 + Xt
          = (α − 1)Yt−1 + Xt
          = aYt−1 + φ1 Xt−1 + φ2 Xt−2 + ....... + φk Xt−k + et
          = aYt−1 + φ1 (Yt−1 − Yt−2 ) + φ2 (Yt−2 − Yt−3 ) + ..... + φk (Yt−k − Yt−k−1 ) + et    (3.88)


Here, {Yt } will be difference non-stationary if α = 1, that is, if a = 0. If −1 < α < 1, then {Yt } will follow an AR(k+1) model, whose AR characteristic equation is,

(1 − φ1 x − φ2 x² − ....... − φk x^k )(1 − αx) = 0

Therefore, if x = 1 is a root, that is, if the equation has a unit root, then the process is considered non-stationary, which is the null hypothesis. Otherwise, it is considered stationary. Therefore, to find whether a series needs differencing or not, all that is required is to test whether the characteristic equation has a unit root.
If there is a possibility that the series has a non-zero mean, then an intercept term is added to (3.88). The test, augmented with the lagged difference terms as in (3.88), is known as the Augmented Dickey Fuller test.
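A minimal R sketch of this test, assuming the tseries package is installed. The random walk is simulated only to illustrate the two outcomes.

# Minimal sketch of the Augmented Dickey Fuller test via the tseries package.
# The null hypothesis is that the series is non-stationary (has a unit root).
library(tseries)
set.seed(7)
y <- cumsum(rnorm(200))   # a random walk, hence non-stationary
adf.test(y)               # typically a large p-value: cannot reject the unit root
adf.test(diff(y))         # typically a small p-value: differenced series is stationary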

Akaike’s Information Criterion (AIC)


In this method, only that model is chosen, which minimizes the AIC. The AIC is given
by,
AIC = −2log(maximum likelihood) + 2k (3.89)
If the model contains an intercept or constant term, then k=p+q+1. Otherwise k=p+q.
Here, 2k is the penalty term. By penalizing the number of parameters, it is ensured that models with too many unnecessary parameters do not get chosen. AIC estimates the mean Kullback-Leibler divergence of the estimated model from the actual model. If Y1 , Y2 ....Yn is a series with true probability density function p(y1 , y2 , ....., yn ) and if its estimated probability density function is qθ (y1 , y2 , ....., yn ), having parameter θ, then the Kullback-Leibler divergence of p from qθ is given by,

D(p, qθ ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} ....... ∫_{−∞}^{∞} p(y1 , y2 , ....., yn ) log[ p(y1 , y2 , ....., yn ) / qθ (y1 , y2 , ....., yn ) ] dy1 dy2 .....dyn

AIC is an estimator of E[D(p, qθ̂ )], where θ̂ is the maximum likelihood estimator of the
vector parameter θ.

Corrected Akaike’s Information Criterion (AICc )


AIC's estimates are biased, and so Hurvich and Tsai proposed a corrected version of AIC, called AICc . They added a non-stochastic penalty term to the existing equation of AIC (3.89) to eliminate the bias:

AICc = AIC + 2(k + 1)(k + 2)/(n − k − 2)    (3.90)

Here,
k = the total number of parameters, excluding the noise variance
n = the effective sample size
If k/n is greater than 10%, AICc outperforms both AIC and BIC.

Bayesian Information Criterion (BIC)


In this method, only that model is chosen which minimizes the BIC. The BIC is given by,

BIC = −2log(maximum likelihood) + k log(n)    (3.91)


The orders specified by BIC for the model of a true ARMA(p,q) process are consistent as the sample size increases. However, if the true process is not an ARMA(p,q) process, then the AIC leads to a more optimal selection of orders than BIC.
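The sketch below compares candidate orders by these three criteria in R. AIC() and BIC() are built in; AICc is computed from (3.90), and the candidate orders and series are illustrative.

# Minimal sketch: comparing candidate ARIMA orders by AIC, AICc and BIC.
y <- diff(log(AirPassengers))                 # any stationary series will do
n <- length(y)
for (ord in list(c(1, 0, 1), c(2, 0, 1), c(0, 0, 2))) {
  fit  <- arima(y, order = ord)
  k    <- length(coef(fit))                   # p + q (+1 for the mean term)
  aicc <- AIC(fit) + 2 * (k + 1) * (k + 2) / (n - k - 2)   # per (3.90)
  cat(sprintf("ARIMA(%d,%d,%d): AIC=%.2f AICc=%.2f BIC=%.2f\n",
              ord[1], ord[2], ord[3], AIC(fit), aicc, BIC(fit)))
}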

3.5.2 Parameter Estimation


Once a model is specified for a stationary time series (possibly a non-stationary series that has been transformed into a stationary one), the parameters of the model are estimated. The different methods of parameter estimation are discussed here.

Method of Moments Estimators


In this method, the sample moments are equated to the theoretical moments and after
solving the resulting equations, the estimates of the unknown parameters are acquired.
In the case of AR(p) models, by equating the ρi s to the ri s for i = 1, 2, 3....p, we get,

r1 = φ1 + φ2 r1 + φ3 r2 + ..... + φp rp−1
r2 = φ1 r1 + φ2 + φ3 r1 + ..... + φp rp−2
.
.
.
rp = φ1 rp−1 + φ2 rp−2 + φ3 rp−3 + ..... + φp    (3.92)

These Yule Walker equations are solved to get φ̂1 , φ̂2 .....φ̂p .
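A minimal R sketch of this method-of-moments step for an AR(2) model, writing (3.92) in matrix form and solving it with solve(); the simulated coefficients are illustrative.

# Minimal sketch: Yule-Walker (method of moments) estimates for an AR(2) model.
set.seed(3)
y <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 500)
r <- acf(y, lag.max = 2, plot = FALSE)$acf[-1]   # r1, r2
R <- matrix(c(1, r[1], r[1], 1), nrow = 2)       # matrix form of (3.92)
phi_hat <- solve(R, r)                           # estimates of phi1, phi2
# ar.yw(y, order.max = 2, aic = FALSE) gives the same Yule-Walker estimates.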
In the case of MA(q) models, the method of moments is not a good estimator. We know that for an MA(1) process,

ρ1 = −θ/(1 + θ² )
When we equate ρ1 to r1 :

• if r1 = ±0.5, the solutions are not invertible

• if |r1 | > 0.5, no solution exists

• if |r1 | < 0.5, only one of the solutions is invertible

In the case of ARMA(1,1) models, first we find φ̂ using the following formula:

φ̂ = r2 /r1    (3.93)

Then, by equating the ρs to the r's in (3.72), we get,

r1 = (1 − θφ̂)(φ̂ − θ) / (1 − 2θφ̂ + θ² )    (3.94)

We solve (3.94) to get θ̂. Since the model has MA process in it, care has to be taken to
make sure that only invertible solutions are taken.


In order to estimate the noise variance, σe² , first the sample variance of the process is estimated using the following formula:

s² = (1/(n − 1)) Σ_{t=1}^{n} (Yt − Ȳ)²    (3.95)

Then the relationships among the variance, the noise variance, the θs and the φs are used to estimate the noise variance. For AR(p) models, from (3.67), we get,

σ̂e² = (1 − φ̂1 r1 − φ̂2 r2 − .......... − φ̂p rp )s²    (3.96)

For MA(q) models, from (3.44), we get,

σ̂e² = s² / (1 + θ1² + θ2² + ......... + θq² )    (3.97)

For ARMA(1,1) models, from (3.71), we get,

σ̂e² = [(1 − φ̂² ) / (1 − 2φ̂θ̂ + θ̂² )] s²    (3.98)

Least Squares Estimation (LSE)


In this method, we take a non-zero mean, µ, into consideration and include it in our
model. Then we estimate it along with other parameters using least squares.
In case of AR(1) models, after including the non-zero mean, we get,

Yt − µ = φ(Yt−1 − µ) + et (3.99)

In LSE method, estimates are made by minimizing the sum of squares of the differences,

(Yt − µ) − φ(Yt−1 − µ)

The conditional sum of squares function of an AR(1) model is given by,

Sc (φ, µ) = Σ_{t=2}^{n} [(Yt − µ) − φ(Yt−1 − µ)]²    (3.100)

φ and µ are estimated by the values that minimize Sc (φ, µ) given Y1 , Y2 , .....Yn . Setting ∂Sc /∂µ = 0, we get,

Σ_{t=2}^{n} 2[(Yt − µ) − φ(Yt−1 − µ)](−1 + φ) = 0

∴ µ = [Σ_{t=2}^{n} Yt − φ Σ_{t=2}^{n} Yt−1 ] / [(n − 1)(1 − φ)]    (3.101)

For large n, we get,

Σ_{t=2}^{n} Yt /(n − 1) ≈ Σ_{t=2}^{n} Yt−1 /(n − 1) ≈ Ȳ


Therefore, from (3.101), we get,

µ̂ ≈ (Ȳ − φȲ)/(1 − φ) = Ȳ    (3.102)

Again, setting ∂Sc (φ, Ȳ)/∂φ = 0,

Σ_{t=2}^{n} 2[(Yt − Ȳ) − φ(Yt−1 − Ȳ)](Yt−1 − Ȳ) = 0

φ̂ = Σ_{t=2}^{n} (Yt − Ȳ)(Yt−1 − Ȳ) / Σ_{t=2}^{n} (Yt−1 − Ȳ)²

This is nearly identical to r1 , and so for large n the method of moments and least squares estimators essentially coincide.

For higher order AR processes, it can be shown that,

µ̂ = Ȳ    (3.103)
In the case of AR(2) models, the conditional sum of squares function is given by,

Sc (φ1 , φ2 , Ȳ) = Σ_{t=3}^{n} [(Yt − Ȳ) − φ1 (Yt−1 − Ȳ) − φ2 (Yt−2 − Ȳ)]²    (3.104)

To estimate φ̂1 and φ̂2 , we set ∂Sc /∂φ1 = 0 and ∂Sc /∂φ2 = 0. The resulting equations are divided by Σ_{t=3}^{n} (Yt − Ȳ)² and, when rearranged, they turn into Yule Walker equations like (3.92). These equations are then solved for φ̂1 and φ̂2 . The same principle is followed to find the parameter estimates of higher order AR processes.

In the case of estimating θ in an MA(1) model, the model is expressed in its inverted form:
Yt = −θYt−1 − θ²Yt−2 − ..... + et
Thus, by using LSE, the value of θ can be estimated by minimizing,

Sc (θ) = Σ (et )² = Σ [Yt + θYt−1 + θ²Yt−2 + .....]²    (3.105)

Here, et is a function of the unknown parameter θ and the observed series. If we know the value of e0 , which is commonly assumed to be zero, then to calculate e1 , e2 , .....en we use the following equation:
et = Yt + θet−1    (3.106)
which is a rearranged version of the MA(1) model. Thus, we get,

e1 = Y1 


e2 = Y2 + θe1  



. 
(3.107)
. 


.





en = Yn + θen−1


Here, Y1 , Y2 , .....Yn are the observed values. Now, we can minimize Sc (θ) = Σ (et )² numerically to get the value of θ. For higher order MA(q) models, the same principle applies. et = et (θ1 , θ2 , .....θq ) is calculated recursively using the following equation:

et = Yt + θ1 et−1 + θ2 et−2 + ..... + θq et−q    (3.108)

Here, it is assumed that e0 = e−1 = e−2 ..... = e−q = 0. The sum of squares is then minimized with the help of a multivariate numerical method.

In case of general ARMA(p,q) models, we use the same technique as the pure MA model.
The following equation is used to compute the values of e1 , e2 , .....en :

et = Yt − φ1 Yt−1 − φ2 Yt−2 − ..... − φp Yt−p + θ1 et−1 + θ2 et−2 + .... + θq et−q (3.109)

Here, it is assumed that ep = ep−1 = .....ep+1−q = 0. In order to obtain the least squares
estimate of all the parameters, Sc (φ1 , φ2 , φ3 , ....φp , θ1 , θ2 , .....θq ) is minimized.

Maximum Likelihood Estimator (MLE)


In this method only those values are chosen for the parameters which maximize the
likelihood function. The joint probability density of obtaining the actual observed data
is called the likelihood function L.

In the case of the AR(1) model, the probability density function of each white noise term et is,

(1/√(2πσe² )) exp(−et² /(2σe² )) for −∞ < et < ∞

Then, since the white noise terms are independently and identically distributed, the joint probability density function of e2 , e3 , ....en is given by,

(1/√((2πσe² )^(n−1) )) exp(−Σ_{t=2}^{n} et² /(2σe² ))    (3.110)

Say, Y1 = y1 is given and,

Y2 − µ = φ(Y1 − µ) + e2
Y3 − µ = φ(Y2 − µ) + e3
.
.
.
Yn − µ = φ(Yn−1 − µ) + en    (3.111)

Then the joint probability density of Y2 , Y3 , ....Yn is given by the following equation,

f (y2 , y3 , ....yn |y1 ) = (1/√((2πσe² )^(n−1) )) exp(−Σ_{t=2}^{n} [(yt − µ) − φ(yt−1 − µ)]² /(2σe² ))    (3.112)

Since this is an AR(1) process, the marginal probability distribution of Y1 will be a normal distribution with mean µ and variance σe² /(1 − φ² ). So, the joint probability density of Y1 , Y2 , Y3 , ....Yn will be equal to the joint probability density of Y2 , Y3 , ....Yn multiplied


by the marginal probability density of Y1 .


The likelihood function of the AR(1) model is,

L(φ, µ, σe² ) = (1/√((2πσe² )^n )) √(1 − φ² ) exp[−S(φ, µ)/(2σe² )]    (3.113)

Here, S(φ, µ) is known as the unconditional sum of squares function and is given by,

S(φ, µ) = Σ_{t=2}^{n} [(Yt − µ) − φ(Yt−1 − µ)]² + (1 − φ² )(Y1 − µ)²    (3.114)

Usually, instead of the likelihood function itself, its log is used. The log likelihood function of the AR(1) model is given by,

l(φ, µ, σe² ) = −(n/2) log(2π) − (n/2) log(σe² ) + (1/2) log(1 − φ² ) − S(φ, µ)/(2σe² )    (3.115)

3.5.3 Model Diagnostics


After the models are specified and their parameters are estimated, the next step is to diagnose the fitted models and check whether they fit the data well. There are two approaches for accomplishing this task; they can be used either singly or together.

Residual Analysis
Residuals are the differences between the actual terms and the predicted terms of a model.
For a general ARMA(p,q) model where the existing MA process is inverted to form an
infinite autoregressive process, the residual is given by,

eˆt = Yt − πˆ1 Yt−1 − πˆ2 Yt−2 − ...... (3.116)

The residuals will resemble white noise if the models are a good fit, that is, the residuals
will have zero mean and a constant standard deviation. Residuals are analyzed using
different methods:

1. Plots of Residuals

• The residuals are plotted over time and observed. If the residuals resemble
white noise, then there will be no trend visible and the plot would scatter
around a horizontal level forming a somewhat rectangular shape. There will
be no increase or decrease in the variation of the plot around the horizontal line. After fitting an ARIMA(1,2,1) model to our original time series of the diameter of the hem of women's skirts, the residuals we get are plotted in Figure 3.7.


Figure 3.7: Plot of Residuals of the Fitted Model to the Diameter of Hem Series

2. Normality of the Residuals

• The Q-Q plot, or Quantile-Quantile plot, shows the quantiles of the residuals versus the theoretical quantiles of a normal distribution. If the points follow, very closely, the straight line which passes through the first and third quartiles of the series, then we can say that the residuals are normally distributed. The Q-Q plot of our example series is shown in Figure 3.8. It seems to follow the said straight line pretty closely.
• Histogram plots of the residuals can also help assess normality. If the histogram is somewhat symmetrical and tails off at the two ends, then we can say that it resembles a normal distribution. The histogram plot of our example series is shown in Figure 3.9. The plot seems to indicate a normal distribution to some extent.

3. Sample Autocorrelation Function (ACF)

• The sample autocorrelation function is plotted to look for correlations among the residuals at different lags. Horizontal lines are drawn on either side of zero at twice the approximate standard error of the sample ACF, that is, at ±2/√n. These are known as the significance bounds. If the values stay within the significance bounds, then we can say that the residuals are uncorrelated. The sample partial autocorrelation function is also plotted to ensure that there is no correlation among the residuals. The ACF plot of our example is shown in Figure 3.10. The plot shows significant correlation at lag 5, but not at the smaller lags. We can say that the model was not a perfect fit, as there still seems to remain some information in the residuals.


Figure 3.8: Q-Q plot of Residuals of the Fitted Model to the Diameter of Hem Series

Figure 3.9: Histogram plot of Residuals of the Fitted Model to the Diameter of Hem
Series


Figure 3.10: ACF plot of Residuals of the Fitted Model to the Diameter of Hem Series

4. Ljung Box test

• The Ljung Box test is based on the following statistic:

Q∗ = n(n + 2) Σ_{k=1}^{m} r̂k² /(n − k)    (3.117)

Here, n is the sample size and r̂k is the k-th sample autocorrelation of the residuals. If the fitted model is correct, then the Ljung Box statistic Q∗ will have a χ² distribution with m − p − q degrees of freedom. If the probability value is greater than 0.05, then the residuals can be considered uncorrelated.
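A minimal R sketch of this diagnostic, on an illustrative fitted AR(1) model; fitdf = p + q adjusts the degrees of freedom as described above.

# Minimal sketch: Ljung-Box test on the residuals of a fitted model.
set.seed(5)
y   <- arima.sim(model = list(ar = 0.5), n = 300)
fit <- arima(y, order = c(1, 0, 0))
Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 1)
# A p-value above 0.05 means we cannot reject that the residuals are uncorrelated.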

Overfitting and Parameter Redundancy


In this method, once a model is fitted to a series, another model which contains our model
as a special case, should be fitted to the series. The orders of the latter model should
not be too big compared to those of the former. That is, if the original fitted model
was an AR(1) model, then the ”bigger” model should be either ARMA(1,1) or AR(2).
This will help to analyze more accurately. After fitting the latter model, if the estimated
values of the new parameters are significantly different from 0 and/or if the estimates of
the parameters which are common to both the models are significantly different, then the
former model might have not been a good fit. One way of choosing the bigger model is
to see what the residual analysis indicates. For example, if the ACF of the residuals after fitting an MA(1) model shows significant correlation at lag 2, then an MA(2) model should be chosen instead of ARMA(1,1).


3.6 Forecasting
3.6.1 Minimum Mean Square Error Forecasting
If the time series Y1 , Y2 , ....Yt is given, then the minimum mean square error forecast of
Yt+l which is l time units after t, is,
Ŷt (l) = E(Yt+l |Y1 , Y2 .....Yt ) (3.118)

3.6.2 Forecasting ARIMA Models


AR(1) Models
In case of an AR(1) process with non-zero mean, to forecast l time units into the future,
we have from (3.99) and (3.118),
Yt+l − µ = φ(Yt+l−1 − µ) + et+l
E(Yt+l |Y1 , Y2 ....Yt ) − µ = φ[E(Yt+l−1 |Y1 , Y2 ....Yt ) − µ] + E(et+l |Y1 , Y2 ....Yt )
Ŷt (l) − µ = φ[Ŷt (l − 1) − µ] + 0
Ŷt (l) = µ + φ[Ŷt (l − 1) − µ] for l ≥ 1    (3.119)
From (3.119), we can see that we can make forecasts up to any lead time by recursively
forecasting for smaller lead times. This equation is also known as the difference equation
form of the forecasts. To get a more explicit expression for Ŷt (l),
Ŷt (l) = φ[Ŷt (l − 1) − µ] + µ
= φ{φ[Ŷt (l − 2) − µ]} + µ
.
.
.
= φl−1 [Ŷt (1) − µ] + µ
= φl (Yt − µ) + µ (3.120)
As |φ| < 1,
Ŷt (l) ≈ µ for large l    (3.121)
A one-step ahead forecast error will be given by,
et (1) = Yt+1 − Ŷt (1)
= [φ(Yt − µ) + µ + et+1 ] − [φ(Yt − µ) + µ]
∴ et (1) = et+1 (3.122)
∴ V ar(et (1)) = σe2 (3.123)
In general linear process form, an AR(1) model can be written as follows:
Yt = et + φet−1 + φ2 et−2 + ...... (3.124)
Now, the l step ahead forecast error would be:
et (l) = Yt+l − Ŷt (l)
= Yt+l − µ − φl (Yt − µ)
= et+l + φet+l−1 + φ2 et+l−2 + ...... + φl−1 et+1 + φl et + ...... − φl (et + φet−1 + φ2 et−2 + ......)
∴ et (l) = et+l + φet+l−1 + .... + φl−1 et+1 (3.125)


(3.125) can also be written as,

et (l) = et+l + Ψ1 et+l−1 + .... + Ψl−1 et+1 (3.126)

Here, E[et (l)] = 0 and,

V ar(et (l)) = σe2 (1 + Ψ21 + Ψ22 + ...... + Ψ2l−1 ) (3.127)

Therefore, as the lead increases, the forecast error also increases.

MA(1) Models
In case of an MA(1) model with non-zero mean, to forecast 1 time unit into the future,
we have,

Yt+1 = µ + et+1 − θet


E[Yt+1 |Y1 , Y2 , ...Yt ] = µ + 0 − θE[et |Y1 , Y2 , ...Yt ]
Ŷt (1) = µ − θet (3.128)

Then, the one step ahead forecast error is,

et (1) = Yt+1 − Ŷt (1)


= (µ + et+1 − θet ) − (µ − θet )
= et+1

To forecast l > 1 time units into the future, we have,

Yt+l = µ + et+l − θet+l−1

E[Yt+l |Y1 , Y2 , ...Yt ] = µ + 0 − 0
∴ Ŷt (l) = µ for l > 1    (3.129)

ARMA(p,q) Models
The difference equation form for forecasting of general ARMA(p,q) model is given by,

Ŷt (l) = φ1 Ŷt (l − 1) + φ2 Ŷt (l − 2) + .... + φp Ŷt (l − p) + θ0 − θ1 E(et+l−1 |Y1 , Y2 , ....Yt )


−θ2 E(et+l−2 |Y1 , Y2 , ....Yt ) − ... − θq E(et+l−q |Y1 , Y2 , ....Yt ) (3.130)

Here,

E(et+j |Y1 , Y2 , ....Yt ) = 0 for j > 0, and = et+j for j ≤ 0    (3.131)

In case the models are invertible, et can be written as a linear combination of the infinite
sequence Yt , Yt−1 , Yt−2 , ......, using π-weights. However, as j increases, the π-weights die
out exponentially fast. In fact, for j > t − q, πj is assumed to be negligible.
For leads l = 1, 2, ...., q, the noise terms et−(q−1) , ....et−1 , et already appear in (3.130). However, for l > q, the autoregressive portion and the constant term θ0 determine the general nature of the forecast.

Ŷt (l) = φ1 Ŷt (l − 1) + φ2 Ŷt (l − 2) + .... + φp Ŷt (l − p) + θ0 for l > q (3.132)


Moreover, θ0 = µ(1 − φ1 − φ2 − ..... − φp ). Therefore, (3.132) can be written as:

Ŷt (l)−µ = φ1 [Ŷt (l−1)−µ]+φ2 [Ŷt (l−2)−µ]+....+φp [Ŷt (l−p)−µ] for l > q (3.133)

As l increases, Ŷt (l) − µ decays to zero for any stationary ARMA model and the long term
forecast then gives the process mean, µ.
An ARIMA model can be written as follows:

Yt+l = Ct (l) + It (l) for l > 1 (3.134)

Here,
Ct (l) = a certain function of Yt , Yt−1 , Yt−2 , ...... and,

It (l) = et+l + Ψ1 et+l−1 + Ψ2 et+l−2 + .... + Ψl−1 et+1 for l > 1 (3.135)

Then,

Ŷt (l) = E(Ct (l)|Y1 , Y2 , ...Yt ) + E(It (l)|Y1 , Y2 , ...Yt )


= Ct (l)

And,

et (l) = Yt+l − Ŷt (l)


= Ct (l) + It (l) − Ct (l)
= It (l)
= et+l + Ψ1 et+l−1 + Ψ2 et+l−2 + .... + Ψl−1 et+1

So, for general ARIMA process, we can write,

E(et (l)) = 0 for l > 1 (3.136)

And,

Var(et (l)) = σe² Σ_{j=0}^{l−1} Ψj² for l ≥ 1    (3.137)

Non-stationary Models
Using an ARMA(p+1,q) model, we can express an ARIMA(p,1,q) model as follows:

Yt = ϕ1 Yt−1 +ϕ2 Yt−2 +.....+ϕp Yt−p +ϕp+1 Yt−p−1 +et −θ1 et−1 −θ2 et−2 −.....−θq et−q (3.138)

Here,

ϕ1 = 1 + φ1 ,  ϕj = φj − φj−1 for j = 2, ....p,  ϕp+1 = −φp    (3.139)

If the differencing order of the ARIMA model is d, then there will be p + d such ϕ coefficients. Using (3.130) and (3.131), replacing p with (p + d) and the φj s by the ϕj s, we can do the forecasting.


3.6.3 Limits of Prediction


In a general ARIMA series, if the white noise terms {et } are independently and identically normally distributed, then the forecast error,
et (l) = Yt+l − Ŷt (l)
will also be normally distributed. Therefore, using the standard normal percentile, Z_{1−α/2} , it can be stated that,

P[−Z_{1−α/2} < (Yt+l − Ŷt (l))/√(Var(et (l))) < Z_{1−α/2} ] = 1 − α

Here, (1 − α) is the given level of confidence. The above equation can also be written as follows:

P[Ŷt (l) − Z_{1−α/2} √(Var(et (l))) < Yt+l < Ŷt (l) + Z_{1−α/2} √(Var(et (l)))] = 1 − α

Hence, it can be said with (1 − α)100% confidence that Yt+l will lie within the limits:

Ŷt (l) ± Z_{1−α/2} √(Var(et (l)))    (3.140)
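A minimal R sketch of (3.140): predict() on a fitted ARIMA model returns the forecasts and their standard errors, from which the prediction limits follow; a 95% interval uses Z_{1−α/2} = qnorm(0.975). The fitted model is illustrative.

# Minimal sketch: prediction limits from the forecast standard errors.
set.seed(11)
y    <- arima.sim(model = list(ar = 0.7), n = 200)
fit  <- arima(y, order = c(1, 0, 0))
pred <- predict(fit, n.ahead = 10)
lower <- pred$pred - qnorm(0.975) * pred$se
upper <- pred$pred + qnorm(0.975) * pred$se
cbind(forecast = pred$pred, lower, upper)   # 95% prediction limits per (3.140)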

3.6.4 Updating ARIMA Forecasts


Say we have made a forecast l + 1 steps into the future, so we have Ŷt (l + 1). Now, once we get the observation of the next time unit, t + 1, we would like to update our forecast with origin at t + 1, that is, Ŷt+1 (l). Therefore, using (3.134) and (3.135), we get,

Yt+l+1 = Ct (l + 1) + et+l+1 + Ψ1 et+l + Ψ2 et+l−1 + ... + Ψl et+1
Ŷt+1 (l) = Ct (l + 1) + Ψl et+1
∴ Ŷt+1 (l) = Ŷt (l + 1) + Ψl [Yt+1 − Ŷt (1)]    (3.141)

3.6.5 Forecasting Transformed Series


Transformation by Differencing
In case the transformations were done by differencing, then two approaches could be used
to forecast them:
1. Forecasting the original non-stationary series using the difference equation form and
replacing the φs with ϕs.
2. Forecasting the stationary differenced series first and then summing the series to
undo the differencing.
Log Transformations
If log transformations were done on the original time series Yt to get Zt = log(Yt ), then
the minimum mean square error forecast of the original series is expressed as,

exp{Ẑt (l) + (1/2)Var[et (l)]}    (3.142)

This only works properly if the log-transformed series is normally distributed. If that normality assumption does not hold, then a different method would be preferred.

Chapter 4

Stock Returns Forecasting System

In this chapter, we discuss our proposed web-based system for stock returns forecasting. We also explain how we implemented the system and the challenges we faced while doing so.

4.1 Proposed System


The development of the stock returns forecasting system is aimed towards aiding any
stock market enthusiast who wishes to use their own model for stock price forecasting. At
present, the system provides facilities to fit ARIMA models of any order to the time series
of stock returns of a company. We intend to include other models into the picture in the
future. This system not only helps the users to do their own analysis and model-fitting,
but it also provides them with useful descriptions of the different concepts to help them
understand better. Therefore, even if someone has no background in time series analysis,
they can use this website and get a rough idea of what is going on.
A key advantage of such a system is that it does not require any installation or coding knowledge. Users can access it anytime and anywhere as long as there is an internet connection. They can also download the outputs of their analysis for future use.

4.1.1 Why use stock returns data instead of stock prices?


In order to fit ARIMA models to any time series, the first and foremost condition is that the time series has to be stationary. Unfortunately, the time series of stock market prices almost always fails the Augmented Dickey Fuller unit root test for stationarity (see Section 3.5.1 for details); that is, such series are almost always non-stationary in nature. Taking the first difference of the logarithm of the stock price series leads to a stationary time series in most cases. Even if it is not stationary, it can be turned into one by differencing it further. Therefore, for our convenience, instead of using the time series of stock prices, we use the stock returns time series, which is calculated using the following formula:

rt = 100(log(pt ) − log(pt−1 ))

We multiply by 100 because the raw returns values are very small and would otherwise lead to round-off errors.
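In R, this computation is a one-liner; the closing prices below are hypothetical values used only to illustrate the formula.

# Minimal sketch of the returns computation; prices is a hypothetical vector
# of daily closing prices.
prices  <- c(36.5, 36.8, 36.2, 36.9, 37.1)
returns <- 100 * diff(log(prices))   # r_t = 100(log p_t - log p_{t-1})
returns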


4.1.2 How the Proposed System Works


• Users choose whether they wish to analyze a particular stock returns data or to fit
ARIMA models to them.

• If they choose to analyze the data

– They are asked to select a particular stock symbol from the dropdown list and
a past time interval. If the time interval is less than ten trading days, we reject
their input and ask them to choose a bigger time interval. We also ask them
if they would like to perform differencing on their selected series. We set the
limit of differencing to 3 as more differencing of the series is not recommended.
On the right of the input form, they can see what outputs they will get after
they click the Analyze button along with their description.
– When they give the required inputs and click the Analyze button, the inputs
are sent to the server. The R software in the server extracts the dataset from
the database and executes several functions on them to generate the outputs.
The outputs are then saved in a directory of the server by R, which are then
fetched and shown to the users on the website. Users also get the choice of
downloading the generated outputs.

• If they choose to fit models to the data

– They are asked to select a particular stock symbol from the dropdown list, a
past time interval, and the orders of the ARIMA model they wish to fit. If
the time interval is less than ten trading days or less than the selected orders
of the model, we reject their input and ask them to give proper inputs. To
fit the models, we also give them the choice to select a method for parameter
estimation and to select whether they wish to fit the model considering the
series to have a non-zero mean. By default, these are set to ”CSS-ML” and
”No mean” respectively. Here, CSS stands for Conditional Sum of Squares
and ML stands for Maximum Likelihood. The mean is not included by default
because, when the series is being used to fit an ARIMA model, we expect it to be a stationary series with zero mean. On the right of the input form,
they can see what outputs they will get after they click the Fit Model button
along with their description.
– When they click the Fit Model button, similar steps are executed as the
Analysis section. Only the functions executed by R and the generated outputs
are different.


Figure 4.1: An overview of the System

Figure 4.2: Usecase Diagram of the System


Figure 4.3: An Activity Diagram of the System


Figure 4.4: A Data Flow Diagram of the System

4.2 System Outputs


In this section, we use the time series of YHOO stock closing prices for the first half
of 2016 and explain the different outputs provided by the system. We also mention the
different functions of R that we used to get those results.

4.2.1 Outputs of Analysis Section


In this section, the stock returns are calculated using the closing prices of the YHOO
stock for the first half of the year 2016 with the help of the formula mentioned in Section
4.1.1 and then they are used as a time series to give the following outputs:

Plot of Stock Returns


The stock returns are plotted (Figure 4.5) as time series using the plot(ts()) function
of R. Plotting the time series helps us to have a rough idea about its nature. From the
figure, it seems as if the series has constant mean and variance.


Figure 4.5: Time Series Plot of Stock Returns of YHOO

Summary of Stock Returns

Figure 4.6: Summary of Stock Returns Data of YHOO

A summary of the time series data (Figure 4.6) is provided using the summary() function
of R, where,

• Min: minimum value in the dataset

• 1st Qu.: first quartile; 25% of the values are below the given quantity

• Median: the median value of the dataset

• 3rd Qu.: third quartile; 75% of the values are below the given quantity

• Max: maximum value in the dataset


Plot of ACF

Figure 4.7: Sample ACF Plot of Stock Returns Data of YHOO

The Autocorrelation Function (ACF) of the stock returns time series is plotted using the acf() function of R. The plot of the ACF helps us to identify the order of a pure MA(q) model for a time series. Starting from lag 0, the last lag at which the ACF crosses the significance bound (blue dashed line) is the order, q, of the MA(q) model. The significance bound is set at ±2/√n, where n is the length of the series. If the ACF does not cross the significance bound at the first lag, but does so at later lags, then we assume that q = 0. From the plot in Figure 4.7, we can see that the ACF at lag 1 has crossed the significance bound. So, we can assume that the series has an MA(1) component in it.

Plot of PACF

The Partial Autocorrelation Function (PACF) of the stock returns time series is plotted using the pacf() function of R. The plot of the PACF helps us to identify the order of a pure AR(p) model for a time series. Starting from lag 1, the last lag at which the PACF crosses the significance bound (blue dashed line) is the order, p, of the AR(p) model. The significance bound is set at ±2/√n, where n is the length of the series. If the PACF does not cross the significance bound at the first lag, but does so at later lags, then we assume that p = 0. From the plot in Figure 4.8, we can see that the PACF at lag 1 crosses the significance bound and so, we can assume that the series has an AR(1) component in it.


Figure 4.8: Sample PACF Plot of Stock Returns Data of YHOO

EACF

We get the Extended Autocorrelation Function (EACF) by using the eacf() function in
R. For a mixed ARMA(p,q) model, Extended Autocorrelation Function (EACF) helps
us to identify the possible values of p and q of an ARMA model for a time series. Let the AR order be k and the MA order be j. Then, in the $symbol table of the output, the element in the k-th row and j-th column is set to x if, for AR order k, the ACF at lag j+1 is significantly different from zero. Otherwise, it is set to o. The trick to interpreting the output is to look for a triangle of o's in the $symbol table; the upper left-hand vertex of the triangle indicates the order of the ARMA(p,q) model. In our case (Figure 4.9), the EACF table does not look too clear, as an exact triangle has not been formed anywhere. So, we can try fitting ARMA(0,5), ARMA(1,1), ARMA(2,1), ARMA(3,3) and ARMA(4,3) models to the series and choose the one which fits best based on the AIC, AICc or BIC values.

Q-Q Plot

The Quantile-Quantile plot or the Q-Q plot is plotted using the qqnorm() and qqline()
functions of R. This plot helps us to find whether a time series is normally distributed or
not. If the plot of the values looks like a straight line, then we can say that the series is
normally distributed. From the Figure 4.10, we can see that the most of the values seem
to align with the straight line in the middle and then they move away at the two ends.
We can say that it is somewhat normally distributed.


Figure 4.9: EACF of Stock Returns Data of YHOO

Figure 4.10: Q-Q Plot of Stock Returns Data of YHOO


Histogram Plot

Figure 4.11: Histogram Plot of Stock Returns Data of YHOO

The histogram of the probability densities of the returns series is plotted using the hist() function of R. Along with it, we also draw a curve which represents the theoretical normal distribution of the series. This plot helps us to find whether a time series is normally distributed or not: the histogram should be somewhat symmetric and tail off at both the high and low ends, as a normal distribution would. From Figure 4.11, we can say that the series roughly follows a normal distribution.

ADF Test

Figure 4.12: ADF Test Results of Stock Returns Data of YHOO

The Augmented Dickey Fuller Test (Figure 4.12) is run on the returns series using the
adf.test() function of R. This test helps us to find whether a time series is stationary or


not. If the probability value, i.e. p.value, is greater than 0.1, then we cannot reject the null hypothesis which states that the series is non-stationary. From the output of our example (Figure 4.12), we can say that the series is stationary, as the p.value is not greater than 0.1.
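The analysis steps described above can be condensed into a short R sketch, assuming the TSA and tseries packages are installed and that `returns` already holds the returns series computed as in Section 4.1.1. This is an illustrative outline of the steps, not the system's exact script.

# A condensed sketch of the analysis outputs described in this section.
r <- ts(returns)
plot(r); summary(r)                   # time series plot and summary statistics
acf(r); pacf(r)                       # sample ACF and PACF with significance bounds
TSA::eacf(r)                          # extended ACF table
qqnorm(r); qqline(r)                  # Q-Q plot against the normal distribution
hist(r, freq = FALSE)                 # histogram of probability densities
curve(dnorm(x, mean(r), sd(r)), add = TRUE)   # theoretical normal curve
tseries::adf.test(r)                  # Augmented Dickey Fuller test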

4.2.2 Outputs of Model-Fitting


In this section, we take the same inputs as the analysis section, along with inputs for the orders p, d and q of the ARIMA(p,d,q) model. We try to fit the several candidate ARIMA models to the series that we identified during the analysis in Section 4.2.1. We chose to assume zero mean while fitting the models and used the combination of conditional sum of squares and maximum likelihood methods for parameter estimation.

ARIMA     AIC      AICc     BIC

(1,0,1)   540.37   540.57   548.83
(0,0,5)   539.49   540.21   556.42
(2,0,1)   542.32   542.66   553.60
(3,0,3)   537.23   538.19   556.97
(4,0,3)   541.73   542.98   564.29

Table 4.1: Values of AIC, AICc and BIC after fitting different models

After trying to fit the different models, the ARIMA(3,0,3) model showed the least value of AICc (see Table 4.1), so we chose to show only the outputs of fitting the ARIMA(3,0,3) model to our returns series. The typical outputs shown by our system are discussed below.

Plot of Stock Returns and Summary of Stock Returns

The plot of time series of stock returns and its summary are shown as output in this
section as well for reference.

Summary of Fitted Model

After the ARIMA(3,0,3) model is fitted to the series, as shown in Figure 4.13, the
estimates of the coefficients and their respective standard errors are given. Also, the
following are shown:

Sigma-square
The variance of the series as assumed by the fitted model.
Log Likelihood
It quantifies the relative abilities of the estimates to explain the observed data.
AIC
This is Akaike's Information Criterion. The model with the least value of AIC should be chosen.


Figure 4.13: Summary of Fitted Model of Stock Returns Data of YHOO

AICc
This is the corrected form of AIC and is said to outperform both AIC and BIC in model
selection. Models with the least value of AICc should be chosen.
BIC
Bayesian Information Criterion. Models with the least value of BIC should be chosen.
ME
It is the mean error of the fitted values.
RMSE
It is the root mean square error of the fitted values.
MAE
It is the mean absolute error of the fitted values.
MPE
It is the mean percentage error of the fitted values. Since it gives error in percentage, it
can be used to compare models with different datasets.
MAPE
It is the mean absolute percentage error of the fitted values. Since it gives error in
percentage, it can be used to compare models with different datasets.
MASE
It is the mean absolute scaled error of the fitted values. It is also used to compare models
with different datasets.
ACF1
It is the first order autocorrelation coefficient of the residuals, i.e. the correlation coefficient between the first N−1 observations and the last N−1 observations.

Plot of Residuals of Fitted Model

Plotting the residuals (Figure 4.14) helps us to have a rough idea about its nature. From
the figure, it seems like the residual series has constant mean and variance.


Figure 4.14: Time Series Plot of Residuals of Fitted Model on YHOO Stock Returns

ACF Plot of Residuals

Figure 4.15: ACF Plot of Residuals of Fitted Model on YHOO Stock Returns


The ACF plot (Figure 4.15) shows that the residual series has no MA(q) component in it, as none of the ACF values crossed the significance bound.

PACF Plot of Residuals

Figure 4.16: PACF Plot of Residuals of Fitted Model on YHOO Stock Returns

The PACF plot (Figure 4.16) shows that the residual series has no AR(p) component in it, as none of the PACF values crossed the significance bound.

Q-Q Plot of Residuals

The Q-Q plot (Figure 4.17) of the residuals shows that the residuals are normally
distributed to a great extent.

Ljung-Box Test

The Ljung-Box test was run on the residuals using the Box.test() function of R (Figure 4.18) with the default lag, which is 1. This test allows us to find whether the error terms are correlated or not. If p.value > 0.05, then we cannot reject the null hypothesis that the error terms are uncorrelated. Since the p.value here is 0.6613, we cannot reject the null hypothesis that the adjacent error terms are uncorrelated.


Figure 4.17: Q-Q Plot of Residuals of Fitted Model on YHOO Stock Returns

Figure 4.18: Results of Ljung-Box Test on Residuals of Fitted Model on YHOO Stock
Returns

Plot of Forecast of Fitted Model

We plot the forecast for the next 10 trading days along with an 80% prediction interval and a 95% prediction interval by using the plot.forecast() function, as shown in Figure 4.19. If the actual values of the next ten days are available, then those are plotted as well for comparison.

Forecast Values

Using the forecast.Arima() function, we get the forecast values for the next 10 trading days along with an 80% prediction interval and a 95% prediction interval for the forecast (Figure 4.20).


Figure 4.19: Plot of Forecast of Fitted Model on YHOO Stock Returns

Figure 4.20: Summary of Forecast of Fitted Model on YHOO Stock Returns

Forecast Errors

Figure 4.21: Errors of Forecast of Fitted Model on YHOO Stock Returns

Using the accuracy() function, we get the errors of forecast of next 10 trading days


(Figure 4.21). The training set errors are the errors faced during model fitting and the
test set errors are the forecast errors. We look at MPE or the mean percentage error to
determine the accuracy for the forecast. Unfortunately, in this case, the error is huge. We
can try checking the other models that we had ignored previously to see if the error gets
reduced. If they lead to huge errors as well, the chances of which are fairly high, then
other factors or models would need to be incorporated to build a better model. More about this is discussed in Section 5.1.

4.3 System Implementation


Our system is a responsive website built using PHP (ver. 5.6.23), HTML 5, CSS, JavaScript and MySQL. We used the XAMPP software to create a local server on our laptop. The database server we used was MariaDB 10.1.13. As mentioned before, R was used to perform the different analyses and to generate the results.

4.3.1 The Website


The website consists of three pages: the Home page, the Analysis page and the
Model-Fitting page. All of these pages were developed with the help of the templates
provided by [3]. We inserted a stock ticker watch-list widget into our website, obtained
from [6]. The watch list monitors the stock quotes of 100 S&P 500
companies.

The Homepage
The homepage gives a brief explanation of the facilities our site provides.

The Analysis Page


On the Analysis page, the user sees a form on the left where they are asked to choose a
ticker name, a date interval and a differencing order. On the right, we give them an idea
of the outputs they will get after the analysis and explain how to interpret those
results. Once they fill in and submit the form, the outputs are shown on the right with
their descriptions at the bottom. We also give them the option to download all the
results, which consist of .png files and a .txt file.

The Model-Fitting Page


The functionality of this page is the same as that of the Analysis page, except that the
user provides additional inputs to fit the ARIMA models. The outputs shown here are also
different.


Figure 4.22: The Homepage of the System


Figure 4.23: The Analysis Page Before Input is Submitted

Figure 4.24: The Analysis Page After Input is Submitted


Figure 4.25: The Model-Fitting Page Before Input is Submitted


Figure 4.26: The Model-Fitting Page After Input is Submitted

4.3.2 R
R is a programming language and software environment that is widely used for
statistical computation. It is not only used by statisticians to perform data analysis;
it also helps in developing statistical software. R scripts (with the .R extension) are
simple text files in which we write all the commands that we want R to execute. We wrote
three separate R scripts for our project: update.R, model.R and analysis.R. For now, we
run update.R manually to update the database every day. The model.R and analysis.R
scripts are run depending on which page's form the user submits.
To learn the R language, as well as to create the R script files for our project, we used
the RStudio software, which is an IDE for R.


Data Frames
Data frames in R are used to store data tables. A data frame is essentially a list of
vectors of equal length. In our project, we used R to collect data from Yahoo Finance, to
store the data in the database and to extract the required data from the database. All of
this was done using data frames. The data from Yahoo Finance were collected as data
frames, and the data frames were then modified according to our needs. We then appended
the data from the data frames to the existing table in the database using R.
The data frames imported from Yahoo Finance had the dates column set as the
row names attribute. The row names attribute is a character vector whose length equals
the number of rows of the data frame. To append the data from a data frame, we used
the dbWriteTable() function. This function creates a table in the database if the table
does not exist; otherwise, it simply appends the data. Consequently, the database table
has the same attribute names as the data frame. We tried to change the name of the
row names attribute to Date but could not do so. As this did not hamper our work in
any way, we decided to leave it as it was.
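A minimal sketch of this import-and-append step; the connection parameters and the
chosen ticker are illustrative assumptions, not our actual configuration:

library(quantmod)   # getSymbols()
library(RMySQL)     # MySQL(), dbConnect(), dbWriteTable()

# Connection details here are placeholders
con <- dbConnect(MySQL(), dbname = "stocks", host = "localhost",
                 user = "root", password = "")

getSymbols("YHOO", src = "yahoo")   # creates an xts object named YHOO
df <- as.data.frame(YHOO)           # the dates become the row names attribute
df$Symbol <- "YHOO"                 # add the symbol column used in the key

# Creates the hp table if it is missing; otherwise appends the new rows
dbWriteTable(con, "hp", df, append = TRUE, row.names = TRUE)
dbDisconnect(con)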

Packages and Functions Used


R has a lot of useful built-in functions for performing statistical analysis. However, we
sometimes require more specialized functions. R's active user community has built numerous
useful specialized packages, which are available on the CRAN website. They can also be
installed directly from RStudio.
The packages that we needed for our project are as follows:

• DBI : This package helps to build the communication between R and the database
management system.
i) dbConnect() – Used to build connection with the database.
ii) dbSendQuery() - Used to execute a query on the connected database.
iii) fetch() – Used to fetch records from the previously executed query.
iv) dbWriteTable() - Used to copy dataframes into the database table.
• forecast : This package helps to analyze and display univariate time series forecasts.
It also requires the installation of the zoo package.
i) Arima() - Used to fit an ARIMA model to a time series.
ii) forecast.Arima() - Used to make forecasts up to a specified number of steps ahead.
iii) plot.forecast() - Used to plot the forecasted series.
iv) accuracy() - Used to calculate the accuracy of the fitted model and its forecasts.
• TSA : This package was created by [9]. It includes various functions for time
series analysis. It also requires the installation of the leaps, locfit, mgcv and tseries
packages.
i) eacf() - Used to compute the sample EACF of the data.
• tseries : This package includes various functions for time series analysis as well as
computational finance.
i) adf.test() - Used to perform the Augmented Dickey-Fuller test on the data.
• quantmod : This package contains tools for downloading financial data, plotting
common charts and doing technical analysis. It also requires the TTR and xts
packages.
i) getSymbols() – Used to collect the historical prices from Yahoo Finance.
• RMySQL : This package implements the DBI interface for MySQL and MariaDB
databases.
i) MySQL() - Used to authenticate and connect to one or more MySQL databases.
Many more built-in functions were used in our project, such as acf(), pacf(), png(),
capture.output(), hist() and Box.test().
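As a brief illustration of how several of these functions fit together in the analysis
script, consider the following sketch; the prices variable and the output file names are
assumptions:

library(tseries)   # adf.test()
library(TSA)       # eacf()

returns <- diff(log(prices))   # log returns from a closing-price series

adf.test(returns)              # Augmented Dickey-Fuller test for stationarity

png("acf.png");  acf(returns);  dev.off()    # save plots for the website
png("pacf.png"); pacf(returns); dev.off()

capture.output(eacf(returns), file = "eacf.txt")   # save the EACF table as text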

4.3.3 Database
In our database, we currently have the historical daily price data of forty S&P 500
companies listed on NASDAQ and NYSE, starting from January 1980. However, some companies'
data begin later than 1980, as earlier prices were not available. The data were collected
as data frames and then copied into our database table by R.

Figure 4.27: ER Diagram for the System

Our database consists of only one table at present, the hp table. It stores the
historical price data of the different companies. The table has a composite primary key
consisting of the date and the stock symbol, i.e. the row names and Symbol attributes.
Although we work only with the Close prices of the stocks, we chose to keep the other
retrieved information as well, so that we can use it in the future.
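A minimal sketch of how a series is pulled from this table for analysis, reusing the con
connection from the earlier sketch; the symbol and date range are illustrative:

# Fetch one company's closing prices for a chosen interval
res <- dbSendQuery(con, "SELECT row_names, Close FROM hp
                         WHERE Symbol = 'YHOO'
                           AND row_names BETWEEN '2015-01-01' AND '2015-12-31'")
prices <- fetch(res, n = -1)   # n = -1 fetches all remaining rows
dbClearResult(res)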

4.3.4 Integration of PHP and R


When a user submits the inputs, PHP executes the R script, passing the input variables,
concatenated into a single string, through its exec() command. If the user submits
the form on the Analysis page, the analysis.R script is executed as follows:


exec("Rscript --vanilla E:\analysis.R ".\$param)

In case they use the Model-Fitting page, the following code is executed:

exec("Rscript --vanilla E:\model.R ".\$param)

At the beginning of both R scripts, there is the following function call:

commandArgs(trailingOnly=TRUE)

This function captures all the arguments supplied on the command line when the R session
is invoked. Setting trailingOnly=TRUE ensures that only the trailing arguments (those
after the script name) are captured. Based on these arguments, the R script runs the
different functions to generate the results. The results are saved in a directory by R
and are then fetched and shown on the website using PHP and HTML. If the user presses the
download button, a download.php file is called; it uses the Content-Disposition header to
force the browser to show the dialogue box for saving the .txt or .png file.
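A minimal sketch of how the arguments might be unpacked at the top of analysis.R; the
argument order shown is an assumption:

args <- commandArgs(trailingOnly = TRUE)

ticker <- args[1]              # e.g. "YHOO"
from   <- args[2]              # start of the date interval
to     <- args[3]              # end of the date interval
d      <- as.numeric(args[4])  # differencing order chosen on the form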

4.4 Challenges
The biggest challenge in this project was that we had no prior knowledge of time series
analysis. There was a lot to learn in a very short amount of time, and the bulk of the
time was spent on understanding the different concepts of time series analysis and
forecasting. There are still more topics to be covered and incorporated into this
project. Moreover, the initial intention was to store the historical prices of all
companies listed on NASDAQ, NYSE and AMEX starting from the year 1980. However, importing
the data of NASDAQ alone caused the database performance to deteriorate: the server would
occasionally freeze or take several minutes to answer a query. Therefore, we decided to
use a smaller database for the system, consisting of 40 companies' historical prices.

Chapter 5

Further Exploration

5.1 Theoretical Approach


In our project, we considered fitting ARIMA models to stock returns time series, which
almost always demonstrate stationarity. Often, when trying to fit ARIMA models to a time
series of stock returns, the ACF, PACF and EACF plots, as well as the Q-Q plots and
histograms, seem to indicate that the series is white noise, which would mean that it
cannot be used to predict future values. However, with further in-depth analysis of the
series, we can find more hidden information in it.
Let us consider the example of the stock returns of AAPL for the year 2015. The ACF
and PACF plots show no significant correlations among the returns, as shown in
Figure 5.1 and Figure 5.2.

Figure 5.1: ACF Plot of AAPL Stock Returns for 2015


Figure 5.2: PACF Plot of AAPL Stock Returns for 2015

The EACF table (Figure 5.3) hints at an ARMA(0,0) model for the series, i.e. white noise,
which is consistent with the prices themselves following a random walk.

Figure 5.3: EACF of AAPL Stock Returns for 2015

The histogram plot (Figure 5.4) and Q-Q plot (Figure 5.5) indicate approximate normality,
and the ADF test yields a p-value of less than 0.1, so we reject the null hypothesis of a
unit root and conclude that the series is stationary.


Figure 5.4: Histogram Plot of AAPL Stock Returns for 2015

Figure 5.5: Q-Q Plot of AAPL Stock Returns for 2015

All these outputs indicate that the series is white noise, that is, it is independently and
identically distributed with constant mean and variance.
If a particular time series is independently and identically distributed, then the
absolute value, the square or the logarithm of that time series will also be independently
and identically distributed. However, if we plot the ACF and PACF of the absolute values
of the AAPL dataset, we see significant autocorrelations in the series, as shown in
Figure 5.6 and Figure 5.7 respectively.

Figure 5.6: ACF Plot of Absolute Values of AAPL Stock Returns for 2015

Figure 5.7: PACF Plot of Absolute Values of AAPL Stock Returns for 2015

Thus we can conclude that there is still information present within the returns time
series.
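A minimal sketch of this check, assuming returns holds the AAPL returns series:

abs_ret <- abs(returns)   # absolute returns

acf(abs_ret)    # significant spikes here point to volatility clustering
pacf(abs_ret)   # compare with Figures 5.6 and 5.7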


5.1.1 ARCH/GARCH Models


There could be higher-order dependence, such as heavy-tailed distributions or volatility
clustering, in the returns time series that cannot be detected by simple ARIMA models.
The conditional variance of the series might vary over time. This type of behaviour is
very common in financial time series and is usually addressed with the Generalized
Autoregressive Conditional Heteroscedasticity (GARCH) models.

A GARCH(p,q) model is expressed as follows:

σ²t|t−1 = ω + β1 σ²t−1|t−2 + … + βp σ²t−p|t−p−1 + α1 r²t−1 + α2 r²t−2 + … + αq r²t−q (5.1)

Here,
p = number of lags of the conditional variance
q = order of the ARCH terms

An ARCH(q) model is given by

σ²t|t−1 = ω + α1 r²t−1 + α2 r²t−2 + … + αq r²t−q (5.2)

By combining the ARIMA models with the ARCH/GARCH models, we could extract
more information from the series of AAPL stock returns.
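As a minimal sketch, the garch() function from the tseries package already used in this
project can fit such a model (the rugarch package is a richer alternative):

library(tseries)

# Fit a GARCH(1,1) model to the returns series; order = c(p, q)
g <- garch(returns, order = c(1, 1))
summary(g)   # coefficient estimates and residual diagnostics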

5.1.2 ARIMAX Models


Time series analysis assumes that all observations are taken at equal intervals of time.
This is not the case here, as the stock market remains closed on non-trading days, during
which a lot of information may circulate and cause dramatic changes in the prices on the
first trading day. In such cases, it is difficult to fit ARIMA models to the data set. In
order to make the time series approach to model-fitting and prediction more effective, we
have to take other factors into account. [12] suggested incorporating secondary variables
such as the company's competition, political events in the company's country, natural
disasters, and speculation about the company in the market.

ARIMAX models are simply ARIMA models with additional explanatory variables
provided by economic theory. They can be expressed as follows:

Yt = βXt + φ1 Yt−1 + … + φp Yt−p − θ1 et−1 − … − θq et−q + et (5.3)

Here,
Xt = a covariate at time t
β = coefficient of the covariate
By including external variables into our ARIMA model, we could extract more information
from the series of AAPL stock returns.
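A minimal sketch using the xreg argument of the forecast package's Arima() function;
sentiment and future_sentiment are hypothetical covariate series aligned with the returns:

library(forecast)

# ARIMA model with an external regressor; the (1,0,1) order is illustrative
fit_x <- Arima(returns, order = c(1, 0, 1), xreg = sentiment)

# Forecasting then requires future values of the covariate
forecast(fit_x, h = 10, xreg = future_sentiment)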

5.2 Real Time Analysis


In this project, we worked only with the daily closing prices of stock market data,
whereas stock quotes are updated every minute. The closing prices are updated at the end
of the day when the market closes. Within this time frame, a lot can happen, none of
which gets included in our analysis. If we had minute-by-minute data at hand, we could
have fitted models to the data more effectively and thus obtained better predictions. In
the future, we intend to collect live data and process it in real time to give outputs to
the users.

5.3 System
We could incorporate the following features in the website:

• Build an Android/iOS application

• Forum for discussion

• Video tutorials regarding how to perform the analysis and model fitting

• Make the analysis and model-fitting process more interactive

Chapter 6

Conclusion

This project was undertaken with a view to contributing to the financial technology
sector by allowing investors, or any interested party, to learn to analyze stock market
data and to use that knowledge to build their own models. In order to bring this project
to fruition, we started by studying the various methods of stock market prediction. We
then chose to work with the time series approach to the analysis and forecasting of
financial data. We studied the different ARIMA models and built the website based on
them. While working with ARIMA models, we realized that pure ARIMA models alone cannot
lead to accurate predictions; more factors need to be taken into account. The next
evolution of our project is to combine such models with the basic ARIMA models and thus
make the system more effective in terms of analysis and forecasting.

Bibliography

[1] Rob J. Hyndman. URL http://robjhyndman.com/tsdldata/roberts/skirts.dat.

[2] Applied time series analysis. URL https://onlinecourses.science.psu.edu/stat510/node/41.

[3] w3schools. URL http://www.w3schools.com/w3css/.

[4] Econometrics academy. URL https://sites.google.com/site/econometricsacademy/econometrics-models/time-series-arima-models.

[5] R-bloggers. URL https://www.r-bloggers.com/.

[6] Tc2000. URL https://widgets.tc2000.com/.

[7] A. A. Adebiyi, A. O. Adewumi, and C. K. Ayo. Stock price prediction using the arima
model. AMSS 16th International Conference on Computer Modeling and Simulation, pages
105–111, 2014.

[8] C. Chatfield. The Analysis of Time Series : Theory and Practice. Springer, 1975.

[9] J. D. Cryer and K. Chan. Time Series Analysis With Applications in R. Springer, 2008.

[10] T. Ding, V. Fang, and D. Zuo. Stock market prediction based on time series data
and market sentiment. URL http://murphy.wot.eecs.northwestern.edu/~pzu918/EECS349/final_dZuo_tDing_vFang.pdf.

[11] E. F. Fama. The behavior of stock-market prices. Journal of Business, 38:34–105,
January 1965.

[12] S. Green. Time series analysis of stock prices using the box-jenkins approach. Electronic
Theses & Dissertations, 2011.

[13] Investopedia. Efficient market hypothesis: Is the stock market efficient? URL http://www.investopedia.com/articles/basics/04/022004.asp.

[14] B. Malkiel. A Random Walk Down Wall Street. W. W. Norton & Company, 1973.

[15] P. Mondal, L. Shit, and S. Goswami. Study of effectiveness of time series modeling (arima)
in forecasting stock prices. International Journal of Computer Science, Engineering and
Applications (IJCSEA), 4:13–29, April 2014.

[16] P. Pai and C. Lin. A hybrid arima and support vector machines model in stock price
forecasting. The International Journal of Management Science, 33:497–505, 2004.

[17] D. K. Pearce. Stock prices and the economy. Federal Reserve Bank of Kansas City Economic
Review, pages 7–22, 1983.


[18] V. H. Shah. Machine learning techniques for stock prediction. URL http://www.vatsals.com/.

[19] K. Tseng, O. Kwon, and L. C. Tjung. Time series and neural network forecast of daily
stock prices. Investment Management and Financial Innovations, 9:32–54, 2012.

[20] R. Weber. Time series, 1999. URL http://www.statslab.cam.ac.uk/~rrw1/timeseries/index.html.

[21] S. Y. Xu. Stock price forecasting using information from Yahoo Finance and Google
Trend. URL https://www.econ.berkeley.edu/sites/default/files/Selene%20Yue%20Xu.pdf.
