Forecasting Gold Price.
Abstract
This paper reviews different techniques used for forecasting in time series analysis. Various techniques have been used in the field
of time series forecasting, from the traditional Box-Jenkins approach to neural networks, the most popular technique nowadays.
However, no single method is best for time series forecasting, as each forecasting method has its own requirements and
restrictions. To determine the movement of the gold price, many different methods have been implemented by various authors
in proposing their models. This review discusses several time series forecasting methods, drawing on journal articles related
to the gold price and on data mining techniques in time series forecasting, all retrieved from Google Scholar.
1. Introduction
A precious metal is a rare, naturally occurring metallic chemical element of high economic value; precious metals were important as currency in the past but are now regarded mainly as investment and industrial commodities. Historically, gold has remained resilient and performed better than the S&P 500 index during the 1973-1974 market crash, the oil crisis of the mid-1980s, the crisis of the new millennium combined with the tax bubble in the 1990s, and the global financial crisis in 2008 (The Edge Malaysia, 2017). Hence, gold is known as a safe-haven asset, as it has an inverse correlation with the US dollar. Investors tend to preserve the value of their assets when the economy is uncertain by investing in gold, since it does not carry heavy liability or unpredictability (Shafiee and Topal 2010) [2, 12]. Gold acts as insurance for investors: it has low correlation with most assets and thus can be used to reduce portfolio volatility and minimize losses during extreme market conditions.

The movement of the gold price is of concern to investors, financial specialists, governments, and those who gain their profit through gold trading. The gold price is influenced by many factors, which include macroeconomic variables and business cycles. However, related studies also suggest that there is a high correlation between the gold price and the crude oil price (Chen and Fang 2013) [3, 18]. Besides, other authors' results show that there is a significant relationship between oil and the precious metals, which include gold, platinum and silver. Additional research also indicated that oil impacts gold more than gold impacts oil (Gabralla, Jammazi et al. 2013) [4].

The prices of precious metals vary from day to day, and a lot of research has been done using different techniques. The traditional way of doing statistical analysis can be more complex and difficult today, as data storage is getting cheaper and lots of data can be collected for analysis. Big data is often mentioned in various industries because readily available data can be accessed by anyone to perform business analysis, which could reduce the cost and resources spent on data collection. Hence, data mining techniques are getting more popular, as they are used to extract the needed information from a big pile of data for specific analysis.

The traditional and popular way of doing time series forecasting is the Box-Jenkins method. However, it can only deal with univariate time series analysis, which includes only the dependent variable itself. Data mining allows the addition of various independent variables to help forecast the movement of the dependent variable and allows us to understand the indicators behind it. Hence, it adds value to any business.

Most research is concerned only with the movement of the gold price in relation to macroeconomic variables such as GDP growth rate, inflation rate, and consumer price index. There is no specific data mining method used to forecast the gold price with other precious metals. In a study of the price volatilities of four precious metals (gold, silver, platinum, and palladium), the authors claimed that precious metals are too distinct to be considered a single asset class, or represented by a single index (Batten, Ciner et al. 2010) [5].

2. Literature Review
The advancement of technology has boosted the collection and storage of data to a whole new level. Nowadays, data can be easily collected by various gadgets which connect to a computer system. As data storage gets cheaper and data are easily collected, data can also be retrieved at one's fingertips through the internet by using computational software. This situation brings a whole new challenge to the field of data analysis, as traditional statistical analysis cannot perform efficiently as
International Journal of Multidisciplinary Research and Development
the data sets get larger, because it would take far more time and energy for a human being to do so. Hence, both computational skills and analytical skills are required in order to improve the performance of data analysis.

Big data is heard of more and more often nowadays. Big data usually includes data sets so large and complex that their size is beyond the ability of standard statistical software tools to capture, curate, manage, and process within a tolerable elapsed time (Snijders, Matzat et al. 2012) [6]. Big data requires a set of techniques and technologies with new forms of integration to reveal insights from datasets that are diverse, complex, and of a massive scale (Hashem, Yaqoob et al. 2015) [7].

Data mining is the process of extracting information from a huge data set and transforming it into an understandable structure for future use. Data mining is an interdisciplinary field of study that is able to discover patterns and models from a large amount of information stored in data warehouses (Gaber, Zaslavsky et al. 2005) [8]. Data mining also refers to extracting or mining knowledge from large data stores or sets (Al-Radaideh, Assaf et al. 2013) [9]. However, data mining is not something new, as various authors have been using the technique in their research for the past decade.

research and a SARIMA model of order (0, 1, 1)(0, 1, 1) with seasonal period 12 is the model chosen by the authors. The authors suggested that the Bayesian information criterion (BIC) is preferred to Akaike's information criterion (AIC) for comparing different models, as BIC penalizes the addition of extra parameters more severely than AIC does.

A comparison was made by Shafiee and Topal (2010) [2, 12] of the accuracy of gold price forecasts from the reverting jump and dip diffusion model and the ARIMA model. The study focuses on gold price trends over the past 40 years, from January 1968 to December 2008, and the analysis used monthly data to forecast the gold price for the next 10 years. The authors conducted the Augmented Dickey-Fuller (ADF) unit root test, found the gold price to be non-stationary, and applied a first-difference transformation. The reverting jump and dip diffusion model was claimed to perform better than the ARIMA model through comparison of the RMSE and MAE of both models. The new model proposed by the authors, which considers the drift, the diffusion components, and the jump and dip periods of the gold price, is as follows.

X t 1 2 t 2 (1 3 D 1 4 D 2 u t (2)
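The quantities named in this part of the review (first differencing, RMSE, MAE, AIC and BIC) can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the reviewed papers: the function names, the Gaussian least-squares form of AIC and BIC (stated up to an additive constant), and the toy price series are all assumptions. It shows the first-difference transformation, the two error measures used to compare forecasts, and why BIC penalizes an extra parameter more heavily than AIC whenever the sample has at least 8 observations (ln n > 2).

```python
import math


def first_difference(series):
    """Return the first-differenced series x[t] - x[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def aic(rss, n, k):
    """AIC for a Gaussian least-squares fit, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def bic(rss, n, k):
    """BIC for a Gaussian least-squares fit, up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


# Hypothetical monthly prices, used only to exercise the functions.
prices = [100.0, 102.0, 101.0, 105.0, 110.0]
changes = first_difference(prices)

# Naive "no change" forecast: the next price equals the current price.
actual = prices[1:]
naive = prices[:-1]
print("First differences:", changes)
print("RMSE of naive forecast:", rmse(actual, naive))
print("MAE of naive forecast:", mae(actual, naive))

# Same residual sum of squares, one extra parameter (assumed n and rss):
# the BIC increase is ln(n), the AIC increase is 2.
n, rss = 120, 50.0
print("AIC increase:", aic(rss, n, 3) - aic(rss, n, 2))
print("BIC increase:", bic(rss, n, 3) - bic(rss, n, 2))
```

In practice the candidate models would be fitted with a statistics package and the lower criterion value preferred; the sketch only isolates the penalty terms so that the difference between the two criteria is visible.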
of ARIMA and ARIMAX model on time series forecasting. The authors implement wavelet analysis (WA) to decompose the trend, seasonality, process variations, and noise of the time series. The decomposed data are then used as exogenous variables in constructing an ARIMA model. The results show that the WA components are good regressors in ARIMAX and are able to capture nonlinear patterns well. ARIMAX-WD performs better than ARIMA in long-term and nonlinear time series forecasting applications.

2.3 Regression
A linear regression approach is proposed by Gharehchopogh, Bonab et al. (2013) [16] in a case study of predicting the S&P 500 index. The authors selected the volume as the dependent variable and the average daily price as the independent variable. The results show that the accuracy is satisfactory, as 61.35% similarity was observed when compared to the out-of-sample data. However, the model has a low R-squared value, which indicates that the model contributes little to the prediction of volume. The model could be improved by considering other factors that could affect the daily trading volume, chosen via variable selection through data mining.

A multiple regression method based on a fundamentalist approach is proposed by Baker and Van Tassel (1985) [17] to determine the variables that affect the gold price and to build a model for long-term gold price prediction. Changes in the commodity price index, changes in the value of the US dollar, and the future inflation rate were found to affect the monthly change in the gold price. The results suggested that the price of gold can be expected to rise if there is a general increase in commodity prices.

Another multiple regression approach is conducted by Chen and Fang (2013) [3, 18]. The return on investment in gold has a higher correlation with other commodities (aluminium, petroleum, zinc, etc.) than with the business cycle and the main macroeconomic variables. The authors aimed to explore the long-term determinants of the gold price by considering the US money supply M2, the CRB index, the US dollar index, the Dow Jones Industrial Average, and the SPDR holdings as the exogenous variables in building the regression model. The study concludes that the US money supply M2 is the factor that most affects the gold price, with which it is highly positively correlated. The CRB index of inflation levels also shows a high positive correlation with the gold price. However, the model showed that the gold price has a low negative correlation with the Dow Jones Industrial Average and a low positive correlation with the US dollar index and the SPDR gold trust positions.

According to experts, many economic factors could affect the movement of the gold price. The multiple linear regression method has been used to predict the gold price based on several economic factors such as inflation, currency price movements and others (Ismail, Yahya et al. 2009) [19]. Stepwise regression is used to remove the correlation between variables. At the same time, the authors proposed considering the effects of significant lags in the cause-and-effect process in order to improve the performance of the model. The Durbin-Watson statistic is used to check whether autocorrelation exists in the error terms. The error mean square (MSE) and the mean squared prediction error (MSPR) are used to evaluate the model accuracy. From the study, the factors affecting the gold price include the Commodity Research Bureau (CRB) future index, the USD/Euro foreign exchange rate (EUROUSD), the inflation rate and the money supply (M1).

2.4 Support vector regression
Support Vector Regression (SVR) is used as a learning algorithm to predict the gold price by learning the pattern of the historical gold price (Navin 2013) [20]. The study assumes that the historical data incorporate the behaviour of all external effects. The data are transformed into a generic dataset and then passed through a cross-validation process before being fed as inputs into the support vector regression model. The type of kernel and its parameters are selected and applied to the model, and the process is repeated until the accuracy of the model is good enough. The model is then tested with the out-of-sample dataset for gold price prediction.

2.5 Neural network
A neural network and a genetic algorithm were implemented by Mirmirani and Li (2004) [21] to analyse the movement of the gold price. Applying a genetic algorithm along with neural networks is said to improve the learning power and robustness of the system. The analysis used daily cash prices of gold from 12/31/1974 to 12/31/1998, which consist of 6008 data points. The results show that historical gold prices strongly affect the future gold price. The authors also claimed that there is short-term time dependence of gold price movements with a time lag of 36 days.

Neural networks have also been applied by Grudnitski and Osburn (1993) [22] for forecasting S&P and gold futures prices. Other than focusing on the price trend itself, the authors also consider general economic conditions and traders' expectations about the future market. The authors conclude that neural networks can be applied to forecast price changes of the markets with four factors considered, which are:
1. Derivation of the parameters of the networks relies heavily on published commitments.
2. Selection of futures whose prices are relatively insulated from natural phenomena.
3. Appropriate length for the training period.
4. Trading selectively rather than in every period.

In constructing a neural network model, researchers may have a hard time estimating the weights to be used in the model. It is not an easy task, as the number of weights may be large and the objective function may have local minima (Faraway and Chatfield 1998) [11]. The application of neural network models is still in doubt among researchers, as some studies have suggested that the neural network model is no better than other methods. Neural networks have several drawbacks, which include excessive training times, difficulty in obtaining and replicating a stable solution, overfitting of the model, and the black-box nature of the solutions. Although neural network modelling is a non-parametric analysis whose process can be completely automated on a computer, black boxes can sometimes give silly results.

2.6 Other Methods in Time Series Forecasting
Ensemble machine learning was proposed by Gabralla, Jammazi et al. (2013) [4] to study the forecasting performance of the daily WTI crude oil price with consideration of a number of influential features as inputs. Best-first search and a genetic algorithm were used by the authors to select the variables that influence the crude oil price. According to the authors, it is possible to use commercial material instead of gold price for crude oil price prediction. Their results show that the ensemble method performed better than SMOReg and IBL using only three attributes.

Another approach, using data mining classification in stock market prediction, is proposed by Paliyawan (2006) [23]. The author performs the analysis using four different approaches: neural network, decision tree, naïve Bayes, and k-nearest neighbours. The author predicts whether the market index will go up, go down or stay the same in the next 1, 6, and 21 days. Meanwhile, the author claimed that although NN models have high accuracy, many of them are not put into practice due to the inability of neural networks to explain their reasoning. In the results, the decision tree (D-Tree) with 3 classes is considered the best model in the study. The author applied chart pattern discovery to drill down into the model to find valuable patterns which could be easily used in decision making.

3. Discussion
This section highlights the key points expressed by the authors of previous studies.
naïve Bayes, and k-nearest neighbours | discovery can be applied to drill down into the model to find valuable patterns. Although NN models have high accuracy, many of them are not put into practice due to the inability of neural networks to explain their reasoning.
Yaziz, Azizan et al. (2013) [14] | ARIMA-GARCH | To propose a hybrid method of ARIMA-GARCH modelling to forecast the gold price by using a total of 40 daily gold prices. ARIMA and GARCH are able to complement each other to achieve a better forecasting model than the traditional ARIMA model.
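How the two parts of an ARIMA-GARCH hybrid complement each other can be illustrated with a short sketch: a mean model such as ARIMA produces residuals, and a GARCH(1,1) recursion then models the time-varying variance of those residuals. The Python below is a sketch under stated assumptions and is not the procedure of Yaziz, Azizan et al.: the function name, the residual values and the parameters omega, alpha and beta are hypothetical, and in practice the parameters would be estimated from data (typically by maximum likelihood) rather than fixed by hand.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]

    The first value is initialised at the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]
    # Each new variance depends on the previous residual and previous variance.
    for eps in residuals[:-1]:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


# Hypothetical residuals from a fitted mean model (for example, an ARIMA fit).
residuals = [0.5, -1.2, 0.3, 2.0, -0.4]
variances = garch11_variance(residuals, omega=0.1, alpha=0.2, beta=0.7)

for eps, s2 in zip(residuals, variances):
    print(f"residual={eps:+.2f}  conditional variance={s2:.4f}")
```

A large residual raises the conditional variance of the following period, which is the volatility clustering that a constant-variance ARIMA model on its own does not capture.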
Previous studies showed that several approaches have been implemented for time series forecasting in economic and financial time series analysis. However, no single method is best for time series forecasting, as each forecasting method has its own requirements and restrictions. Therefore, the traditional Box-Jenkins approach to time series analysis could be further improved. Table 2 describes the contributions that could be recognized in future research.

Table 2: Contributions of the research
To forecast the movement of the gold price with consideration of the prices of other precious metals.
To identify the relationship of gold to other precious metals.
To compare the accuracy of forecasting models using data mining forecasting techniques and the traditional Box-Jenkins approach.

4. Conclusion
The paper presents a review of the literature concerned with the variety of methods that have been applied to time series forecasting in the field of economics and finance. Recorded data are growing faster than ever before, which leads to huge data sets. Traditional statistical analysis methods may need to be modified in order to improve the performance and speed of analysis. In short, statistical knowledge and computational skills have to be combined in order to boost the performance of statistical analysis. Further study could consider other industrial commodities as exogenous variables that could have driven the gold price. Data mining techniques that could complement traditional statistical analysis should be considered in time series analysis and forecasting in future work.

5. Acknowledgement
We would like to thank Universiti Tun Hussein Onn Malaysia (UTHM) and the Office for Research, Innovation, Commercialization and Consultancy Management (ORICC), UTHM, for kindly providing us with the internal funding TIER 1 for proofreading (Vot U903).

6. References
1. The Edge Markets. Cover Story: Backed By Solid Gold. Retrieved on 2017-2018, from http://www.theedgemarkets.com/article/cover-story-backed-solid-gold
2. Shafiee S, Topal E. An overview of global gold market and gold price forecasting. Resources Policy. 2010; 35(3):178-189.
3. Chen X, Fang Y. Enterprise systems in financial sector - an application in precious metal trading forecasting. Enterprise Information Systems. 2013; 7(4):558-568.
4. Gabralla LA, et al. Oil price prediction using ensemble machine learning. Computing, Electrical and Electronics Engineering (ICCEEE), International Conference on, IEEE, 2013.
5. Batten JA, et al. The macroeconomic determinants of volatility in precious metals markets. Resources Policy. 2010; 35(2):65-71.
6. Snijders C, et al. Big Data: big gaps of knowledge in the field of internet science. International Journal of Internet Science. 2012; 7(1):1-5.
7. Hashem IAT, et al. The rise of big data on cloud computing: Review and open research issues. Information Systems. 2015; 47:98-115.
8. Gaber MM, et al. Mining data streams: a review. ACM SIGMOD Record. 2005; 34(2):18-26.
9. Al-Radaideh QA, et al. Predicting stock prices using data mining techniques. The International Arab Conference on Information Technology (ACIT'2013), 2013.
10. Guha B, Bandyopadhyay G. Gold price forecasting using ARIMA model. Journal of Advanced Management Science. 2016; 4(2).
11. Faraway J, Chatfield C. Time series forecasting with neural networks: a comparative study using the air line data. Journal of the Royal Statistical Society: Series C (Applied Statistics). 1998; 47(2):231-250.
12. Shafiee S, Topal E. An overview of global gold market and gold price forecasting. Resources Policy. 2010; 35(3):178-189.
13. Khan MMA. Forecasting of gold prices (Box Jenkins approach). International Journal of Emerging Technology and Advanced Engineering. 2013; 3(3):662-670.
14. Yaziz S, et al. The performance of hybrid ARIMA-GARCH modeling in forecasting gold price. 20th International Congress on Modelling and Simulation, Adelaide, 2013.
15. Wongdhamma W. Upgrade from ARIMA to ARIMAX to Improve Forecasting Accuracy of Nonlinear Time-Series: Create Your Own Exogenous Variables Using Wavelet Analysis, 2016.
16. Gharehchopogh FS, et al. A linear regression approach to prediction of stock market trading volume: a case study. International Journal of Managing Value and Supply Chains. 2013; 4(3):25.
17. Baker SA, Van Tassel RC. Forecasting the price of gold: A fundamentalist approach. Atlantic Economic Journal. 1985; 13(4):43-51.
18. Chen X, Fang Y. Enterprise systems in financial sector - an application in precious metal trading forecasting. Enterprise Information Systems. 2013; 7(4):558-568.
19. Ismail Z, et al. Forecasting gold prices using multiple linear regression method. American Journal of Applied Sciences. 2009; 6(8):1509.
20. Navin DG. Big Data Analytics for Gold Price Forecasting Based on Decision Tree Algorithm and Support Vector