mathematics

Article
Statistical Modeling of High Frequency Datasets Using the
ARIMA-ANN Hybrid
Etaf Alshawarbeh 1 , Alanazi Talal Abdulrahman 1 and Eslam Hussam 2,3, *

1 Department of Mathematics, College of Science, University of Ha’il, Ha’il P.O. Box 55476, Saudi Arabia;
[email protected] (E.A.); [email protected] (A.T.A.)
2 Department of Accounting, College of Business Administration in Hawtat bani Tamim,
Prince Sattam bin Abdulaziz University, Hawtat bani Tamim, Saudi Arabia
3 Department of Mathematics, Faculty of Science, Helwan University, Cairo 12613, Egypt
* Correspondence: [email protected]

Abstract: The core objective of this work is to predict stock market indices using the autoregressive
integrated moving average (ARIMA), the artificial neural network (ANN), and their combination in
the form of ARIMA-ANN. Financial data are, in fact, trending, noisy and highly volatile. To tackle
their chaotic nature and forecast the three considered stock markets, namely the Nasdaq stock
exchange (United States), the Nikkei stock exchange (Japan), and the French stock exchange
(CAC 40 index), we use novel approaches. The data are taken from the Yahoo Finance website for the
period from 4 January 2010 to 20 August 2021. To assess the relative predictive effectiveness of the
selected tools, the dataset was divided into two distinct subsets: 75% of the data was allocated for
training purposes, while the remaining 25% was reserved for testing. The empirical results suggest
that ARIMA-ANN produces more accurate forecasts than its separate components for all three stock
markets. In light of this, it may be inferred that the combined tool is more effective in analyzing
financial data and provides a more accurate comparative prediction.

Keywords: stock markets; machine learning; hybridization; forecasting

MSC: 60E05
Citation: Alshawarbeh, E.; Abdulrahman, A.T.; Hussam, E. Statistical Modeling of High
Frequency Datasets Using the ARIMA-ANN Hybrid. Mathematics 2023, 11, 4594.
https://doi.org/10.3390/math11224594

Academic Editor: Antonella Basso

Received: 9 October 2023
Revised: 29 October 2023
Accepted: 2 November 2023
Published: 9 November 2023

Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an
open access article distributed under the terms and conditions of the Creative Commons
Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Mathematics 2023, 11, 4594. https://doi.org/10.3390/math11224594 https://www.mdpi.com/journal/mathematics

1. Introduction
The stock market, or equity market, consists of numerous stock exchanges across the
globe. The general public and investors sell and purchase shares, whose prices fluctuate
constantly by dint of the law of demand and supply. A stock or share represents partial
possession of a company or corporation. Buyers attempt to purchase a share at the lowest
feasible price, while sellers attempt to sell it at the highest price [1]. One of the most
significant venues for raising capital is the stock market, alongside debt markets, which
are more intimidating but not publicly traded. Due to the high liquidity of the stock
market, investors can quickly and easily buy and sell securities. A rising stock market and
widespread participation in this are the two main indicators of an improving economy.
Stock market fluctuations can have a considerable influence on individuals as well
as the whole economy. A dramatic drop in stock prices can be extremely destabilizing for
economic activities. For example, the 1929 stock market collapse was the primary cause
of the Great Depression in the 1930s [2]. When stock prices are high, a large number of
companies are likely to launch an initial public offering (IPO) in order to enhance their
capital by transferring ownership of their businesses. During a bull market, mergers and
acquisitions are also influential. Due to the greater investment, economic development is
accelerated [1].
What if investors could predict when the price of a stock would increase or decrease?
They would invest all their funds in that company in order to maximize their profits.
However, it is feasible to estimate the unknown parameters and achieve a forecast for the
future based on historical and current data regarding specific shares. This type of analysis
is referred to as technical analysis or machine learning (ML). ML models have shown effectiveness
in a variety of financial processes, including portfolio management [3] and bankruptcy
forecasting [4].
ML is an AI subfield concerned with developing and testing algorithms with the aid of
data. Automation is taking over a lot of industries; using mathematical models, computers
make quick decisions about online trade [5]. This generates markets in which the long-term
outlook is replaced by short-term fluctuations and sell-offs. The algorithms that are
most often used for predicting and analyzing the stock market and future movements are
the support vector machine (SVM) and the ANN. Using tick data, these systems are reported
to achieve up to 99.9% accuracy. Financial forecasting is characterized by data intensity,
non-stationarity, noise, a lack of structure, and hidden relationships [6].
Ref. [7] utilized neural networks to predict US stock prices and demonstrated that
neural networks outperform conventional models such as generalized linear models, principal
component regressions, and regression trees. Long short-term memory (LSTM) networks
were utilized by [8] in order to accurately predict stock trends that attract investor sentiment
and report big profits. Ref. [9] utilized neural networks to predict bond excess returns
and reported large economic gains. The neural network model has also been applied to
cryptocurrencies in some of the literature; these studies demonstrate that the approach is
more accurate at predicting future price changes [10,11]. Fathali et al. [12] used various
neural network techniques, including recurrent neural networks (RNNs), LSTM, and
convolutional neural networks (CNNs), for anticipating stock market price movements.
They discovered that LSTM is the best model after running numerous experiments with
different inputs and epochs. Ref. [13] used random forests to examine how investor
confidence affects US monthly aggregate realized stock-market volatility, in addition to
a large number of financial and macroeconomic variables. They found that investor
confidence, specifically investor confidence uncertainty, predicts overall realized volatility
and its “good” and “bad” variants out-of-sample. Ref. [14] introduced an investor attention
index that relies on proxies found in the existing literature. Their findings indicate that
this index effectively forecasts the stock market risk premium, demonstrating its predictive
accuracy in both the sample and post-sample periods. Notably, the individual proxies
exhibit a limited predictive ability when considered independently. Ref. [15] showed
that the Markov-switching multifractal (MSM) model is superior to the
dynamic conditional correlation-generalized autoregressive conditional heteroscedasticity
(DCC-GARCH) model in terms of predictive accuracy. Ref. [16] predicted three stock
market indexes of SAARC countries using the ARIMA model and novel machine-learning
techniques including multilayer perceptron and recurrent neural networks. They showed
that hybrid models are a viable choice for forecasting financial time-series data. The study
carried out by [17] demonstrated that the integration of ARIMA and ANN models yields
a superior predictive performance compared to the individual use of either ARIMA or
ANN models. To predict stock market movement, Ref. [18] evaluated a variety of ML
algorithms for the standard time series model, and it was determined that LSTM accurately
predicts stock market data. To address the challenge of predicting stock closing prices,
Ref. [19] proposed the Deep Convolutional Generative Adversarial Network (DCGAN)
architecture and demonstrated that it outperforms current tools in both single-step and
multi-step forecasting, demonstrating that deep learning (and GANs in particular) is a
promising tool for financial time series forecasting.
Ref. [20] compared the forecast performance of volatilities using two different hybrid
ANN models and GARCH-type models. The results demonstrate notable leverage effects in
the Chinese energy market and that the EGARCH-ANN model outperforms other models
in predicting the volatilities of log-returns series.
The goal of the study in [21] was to develop a novel parallel hybrid model in
order to provide a comprehensive hybrid framework that can accurately simulate all pure
and mixed linear and/or nonlinear patterns found in real-world time series. The suggested
hybrid model performs better than the individual ARIMA, MLPNN, RBFNN, and LSTM models,
as well as the ARIMA-MLPNN and MLPNN-ARIMA series hybrid models and the parallel
hybridization of the ARIMA and MLP models.
Numerous time series forecasting techniques that employ linear and nonlinear
models, alone or in combination, have been studied by [22]. The research indicates that
integrating linear and nonlinear models can enhance forecasting accuracy. Nevertheless,
in some circumstances, the performance of those current methods may be limited by
specific assumptions that they make. We offer a novel hybrid technique that operates
within a broader framework: ARIMA-ANN. We demonstrate that combining our hybrid
approach with EMD with any of the other approaches that we employed independently
can be a useful strategy to increase the forecasting accuracy attained by conventional
hybrid methods.
In the fields of economics and finance, there is a pressing need to enhance the precision
of forecasts to the utmost degree. In order to effectively implement strong macroeconomic
policies, it is important to engage in empirical analyses and strategic planning that relies on
projections pertaining to significant macroeconomic indicators. Consequently, a range of
univariate and multivariate methodologies have been devised to effectively manage data
noise and enhance the precision of forecasting. However, it is important to acknowledge
that real-world phenomena do not strictly adhere to either linear or nonlinear patterns.
Consequently, both linear and nonlinear models frequently fall short of accurately representing
the underlying trend within the data. This study integrates linear and nonlinear models to
develop a hybrid model, specifically ARIMA-ANN, which effectively incorporates both
linear and nonlinear components of a series. Consequently, this hybrid model enhances
predictive accuracy in comparison to the use of individual linear (ARIMA) or nonlinear
(ANN) models alone.
Our research aims to bridge a significant gap in the existing literature by investigating
the use of stock market indices within the context of G7 countries. These nations, including
the United States, Canada, Japan, Germany, France, the United Kingdom, and Italy,
collectively represent some of the world’s largest and most influential economies. Despite their
critical role in the global financial landscape, there has been a notable scarcity of studies
that explore the application of stock market indices in hybrid models within this specific
group of countries.
The central objective of our research is to enhance prediction accuracy by integrating
both linear and non-linear modeling approaches, specifically by combining the linear
(ARIMA) model with a nonlinear (ANN) model. Thus, our study focuses on analyzing the
historical closing prices of key stock indices, namely the Nasdaq stock exchange in the United
States, the Nikkei stock exchange in Japan, and the CAC 40 index in France. These indices
represent a sample from the G7 countries, and our aim is to evaluate and compare the
predictive capabilities of standalone linear and non-linear models against a hybrid model,
known as ARIMA-ANN.
In the specific context of G7 countries, numerous prior research endeavours have
employed various forecasting techniques, such as AR, ARIMA, ANN, and VAR, among
others. However, a notable gap exists in the utilisation of hybrid models for this purpose.
As previously discussed, hybrid models are deemed more appropriate for forecasting due
to their ability to capture both linear and nonlinear trends in the data. This characteristic
ultimately leads to more precise and accurate forecasts. The primary objective of our
research is to investigate the efficacy of the hybrid ARIMA-ANN model in comparison
to the individual ARIMA and ANN models. This analysis is conducted using a dataset
comprising stock market indices.
The remaining sections of the paper are organized as follows. Section 2 discusses the
data and the procedures. Section 3 presents the research’s empirical findings. The paper
arrives at a conclusion in Section 4.

2. Data and Methods

Within this section, we shall provide a comprehensive examination of the stock markets
that are the focal point of our inquiry. In this study, we explore the complexities associated
with data acquisition and preprocessing methodologies, elucidating the process by which
we gathered and prepared the data for subsequent analysis. In addition, we expand on
the procedures utilised in the current study, offering a comprehensive description of the
strategies and approaches adopted to conduct our research.

2.1. Data
This research uses daily data on the closing prices of three stock market indexes:
the Nasdaq stock exchange in the United States, the Nikkei stock exchange in Japan,
and the CAC 40 index (a benchmark French stock market index). The data were taken from
the Yahoo Finance website for the period from 4 January 2010 to 20 August 2021. In order
to assess the prediction capabilities of the hybrid model in comparison to the individual
ARIMA and ANN models, the dataset was divided into two distinct subsets: a training set
including 75 percent of the data, and a testing set comprising the remaining 25 percent. The
training data were utilized to calibrate the models, whereas the testing data were employed
to assess the predictive capability of the underlying tools.
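The split described above can be sketched as follows. This is a minimal illustration that assumes the split is chronological (the earliest 75% of observations for training, the most recent 25% for testing), which the text does not state explicitly; the function name and the toy prices are hypothetical.

```python
def chronological_split(series, train_fraction=0.75):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(series) * train_fraction)  # index of the first test observation
    return series[:cut], series[cut:]


# Toy closing prices (made-up numbers, not the Yahoo Finance data).
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0, 103.2, 105.6]
train, test = chronological_split(prices)
print(len(train), len(test))  # 6 2
```

Keeping the order intact means the models are always evaluated on observations that come after everything they were calibrated on.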

2.2. Methodology
The Linear and Non-Linear Models
The field of time series prediction is experiencing rapid growth and holds significant
potential for future improvement. A commonly employed strategy for improving the
accuracy of predictions involves the integration of multiple methods. This approach relies
on the inherent abilities of various models or methodologies with the aim of constructing a
prediction framework that is both more resilient and precise. Extensive research has been
conducted in this particular domain, resulting in the proposal of various combinations of
approaches, as documented in the existing literature [23–25].
ARIMA: In recent decades, ARIMA has become a popular statistical methodology
for forecasting stationary and non-stationary time series data. This model
incorporates autoregressive (AR) and moving average (MA) components, as well as a data
transformation step called differencing. Nevertheless, the ARIMA model has certain
limitations, such as the assumption of linearity, a condition that is challenging to satisfy in
practical scenarios, and its reliance solely on historical data as input variables. The ARIMA
model reduces to an autoregressive moving average (ARMA) model when the differencing
component is removed. In general, the ARMA model can be considered a specific
instance of the more comprehensive ARIMA model, and its formulation is represented by
Equation (1).
$$X_t = b + \sum_{i=1}^{p} \gamma_i X_{t-i} + \mu_t - \sum_{j=1}^{q} \theta_j \mu_{t-j} \tag{1}$$

The ARMA model is utilized to predict the value of a time series variable ($X_t$) one step
ahead. This prediction is based on the historical values of the time series
($X_{t-1}, X_{t-2}, \ldots, X_{t-p}$) and the previous errors ($\mu_{t-1}, \mu_{t-2}, \ldots, \mu_{t-q}$).
The parameters $\gamma_i$ and $\theta_j$ are unknown, whereas $b$ represents an intercept term.
The stochastic error term $\mu_t$ is independently and identically distributed, with a mean of
zero and a variance of $\delta^2$. The model incorporates prior values up to orders $p$ and $q$.
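As a rough illustration of Equation (1), the sketch below computes a one-step-ahead forecast from already-estimated parameters. It assumes the unknown future error is replaced by its mean of zero; the function name, argument layout, and numerical values are hypothetical and are not taken from the paper.

```python
def arma_one_step(history, errors, b, gamma, theta):
    """One-step-ahead ARMA(p, q) forecast following Equation (1).

    history: past values, most recent last (at least p entries)
    errors:  past error terms, most recent last (at least q entries)
    gamma:   [gamma_1, ..., gamma_p], theta: [theta_1, ..., theta_q]
    The unknown future error mu_t is replaced by its mean, zero.
    """
    p, q = len(gamma), len(theta)
    if len(history) < p or len(errors) < q:
        raise ValueError("not enough past values or past errors")
    ar_part = sum(gamma[i] * history[-1 - i] for i in range(p))  # gamma_i * X_{t-i}
    ma_part = sum(theta[j] * errors[-1 - j] for j in range(q))   # theta_j * mu_{t-j}
    return b + ar_part - ma_part


# ARMA(2, 1) with made-up parameters: X_{t-2} = 1.0, X_{t-1} = 2.0, mu_{t-1} = 0.1.
forecast = arma_one_step([1.0, 2.0], [0.1], b=0.5, gamma=[0.6, 0.2], theta=[0.3])
print(round(forecast, 2))  # 1.87
```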
In order to make the preceding formula easier to understand, the backward shift
operator ($A$), defined by $A^i X_t = X_{t-i}$, is substituted for the lagged terms in
Equation (1). As a result, the ARMA model can be mathematically
represented in the following manner:

$$X_t = b + \sum_{i=1}^{p} \gamma_i A^i X_t + \mu_t - \sum_{j=1}^{q} \theta_j A^j \mu_t \tag{2}$$

Then, after adjusting the terms associated with $X_t$ in Equation (2), we can obtain the
following ARMA model:

$$\left(1 - \sum_{i=1}^{p} \gamma_i A^i\right) X_t = b + \left(1 - \sum_{j=1}^{q} \theta_j A^j\right) \mu_t \tag{3}$$

For a simplified version of the expression:

$$\gamma_p(A) X_t = b + \theta_q(A) \mu_t \tag{4}$$

where

$$\gamma_p(A) = 1 - \sum_{i=1}^{p} \gamma_i A^i \tag{5}$$

and

$$\theta_q(A) = 1 - \sum_{j=1}^{q} \theta_j A^j$$

representing, respectively, the AR operator and MA operator.

Because the ARMA model cannot accommodate a unit root in time series
data, it is necessary to apply a difference transformation to obtain stationarity and attain
accurate findings. The integration term is then incorporated in this manner:

$$\gamma_p(A)(1 - A)^s X_t = b + \theta_q(A) \mu_t \tag{6}$$

where $s$ denotes the order of differencing.
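The differencing operator in Equation (6) can be illustrated with a short sketch. The helper names are hypothetical; `log_difference` corresponds to taking the natural logarithm followed by a first difference, and the toy series is made up.

```python
import math


def difference(series, order=1):
    """Apply (1 - A)^order, i.e. difference the series `order` times."""
    result = list(series)
    for _ in range(order):
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


def log_difference(prices):
    """First difference of the natural logarithm of a positive price series."""
    return difference([math.log(p) for p in prices], order=1)


trend = [1.0, 4.0, 9.0, 16.0, 25.0]  # quadratic trend
print(difference(trend, order=1))    # [3.0, 5.0, 7.0, 9.0]
print(difference(trend, order=2))    # [2.0, 2.0, 2.0]
```

Each pass shortens the series by one observation, which is why a series differenced s times has s fewer points than the original.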

The ANNs approach: The relaxation of the linear constraint in the model form leads
to a vast range of alternative non-linear structures that can be utilized for the purpose of
explaining and predicting a time series. A well-established nonlinear model should be
globally adequate to deal with the specific nonlinear structure of the data. For further detail,
we refer to [26]. ANNs are specifically designed to effectively approximate nonlinearities
present in datasets.
A variety of nonlinear issues can be simulated by ANNs, which are flexible computer
frameworks. One primary advantage of ANN models in comparison to other non-linear
models is in their capacity to effectively estimate a diverse array of functions [27]. Its
strength comes from the simultaneous processing of data. No prior assumptions regarding
the model shape are required during the construction process. Instead, the ANN models
are primarily specified by the data attributes.
The utilization of a single hidden-layer feed-forward network is a commonly
employed functional framework for the purpose of time series prediction [28]. The model
is characterized by a network of three layers of simple processing units connected by
acyclic links. The relationship between the output ($Q_m$) and the inputs
($Q_{m-1}, Q_{m-2}, \ldots, Q_{m-n}$) is depicted mathematically below.

$$Q_m = \beta_0 + \sum_{l=1}^{k} \beta_l \, g\!\left(\alpha_{0l} + \sum_{i=1}^{n} \alpha_{il} Q_{m-i}\right) + e_m \tag{7}$$

The model parameters, $\beta_l$ ($l = 0, 1, 2, \ldots, k$) and $\alpha_{il}$ ($i = 0, 1, 2, \ldots, n$;
$l = 1, 2, \ldots, k$), are commonly referred to as connection weights. The variable $n$
represents the number of input nodes, whereas $k$ denotes the number of hidden nodes.
The logistic function is frequently employed as a transfer function in the hidden layer
and can be expressed as follows:

$$g(x) = \frac{1}{1 + \exp(-x)} \tag{8}$$

Consequently, the ANN model described in Equation (7) exhibits the ability to execute
a non-linear functional mapping. This mapping is achieved by utilizing prior observations
($Q_{m-1}, Q_{m-2}, \ldots, Q_{m-n}$) to predict the future value $Q_m$:

$$Q_m = f(Q_{m-1}, Q_{m-2}, \ldots, Q_{m-n}, v) + e_m \tag{9}$$
Here, $v$ stands for a vector containing all parameters, and $f$ is a function based on
the network structure and connection weights. As a result, the neural networks (NNs)
correspond to a nonlinear AR model. One output node in the output layer is used in
Equation (9) to produce a one-step ahead prediction.
In terms of prediction, simple ANN algorithms are extremely effective. Time series
data are frequently better forecasted by NNs with one or two hidden nodes [28].
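The forward pass of the single hidden-layer network in Equations (7) and (8) can be sketched as below. This is only an illustration of the mapping, not a training procedure; the weight layout (one row of input weights per hidden node, bias first) and the numerical values are assumptions made for the example.

```python
import math


def logistic(x):
    """Logistic transfer function g(x) = 1 / (1 + exp(-x)), Equation (8)."""
    return 1.0 / (1.0 + math.exp(-x))


def ann_forecast(lags, beta, alpha):
    """Forward pass of Equation (7) without the error term e_m.

    lags:  [Q_{m-1}, ..., Q_{m-n}]
    beta:  [beta_0, beta_1, ..., beta_k], where beta_0 is the output bias
    alpha: k rows, row l holding [alpha_0l, alpha_1l, ..., alpha_nl]
    """
    k = len(beta) - 1
    if len(alpha) != k or any(len(row) != len(lags) + 1 for row in alpha):
        raise ValueError("weight shapes do not match the inputs")
    output = beta[0]
    for l in range(k):
        # Weighted sum of the lagged inputs plus the hidden-node bias alpha_0l.
        activation = alpha[l][0] + sum(a * q for a, q in zip(alpha[l][1:], lags))
        output += beta[l + 1] * logistic(activation)
    return output


# Two lagged inputs (n = 2) and two hidden nodes (k = 2), with made-up weights.
lags = [0.2, -0.1]
beta = [0.1, 0.8, -0.4]
alpha = [[0.0, 1.5, -0.5], [0.3, -1.0, 2.0]]
print(ann_forecast(lags, beta, alpha))
```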
model: In a nutshell, the process of developing a hybrid model involves two [28].
Hybrid
distinct In theIninitial
model:
stages. a nutshell,
stage,the
theprocess
ARIMAofmodeldeveloping a hybrid
is employed tomodel
examineinvolves two
the linear
distinct stages. In the initial stage, the ARIMA model is employed to examine
aspect of the data. In the second stage, the residuals recovered from the estimated ARIMA the linear
aspect of
model arethe data.
used toIn the second
build a neural stage, the residuals
network. recovered
The residuals from
of the the estimated
ARIMA model ARIMA
include
model are used to build a neural network. The residuals of the
significant information pertaining to nonlinearities, as the ARIMA model is unableARIMA model include
to ef-
significant
fectively information
represent pertaining
the nonlinear to nonlinearities,
pattern present in theas the The
data. ARIMAANNs’model is unable
algorithm can beto
effectively represent the nonlinear pattern present in the data. The ANNs’
used to forecast the residuals of an ARIMA model. The hybrid model uses the distinct algorithm can be
used to forecast the residuals of an ARIMA model. The hybrid model uses the distinct traits
traits and strengths of the ANN and ARIMA models to identify alternative structures.
and strengths of the ANN and ARIMA models to identify alternative structures. Linear
Linear and non-linear patterns can be adequately described using multiple models, and
and non-linear patterns can be adequately described using multiple models, and their
their predictions can be combined to improve overall modelling and predictability [28].
predictions can be combined to improve overall modelling and predictability [28]. Figure 1
Figure 1 shows the steps followed in this study.
shows the steps followed in this study.
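The two-stage structure described above can be sketched in a few lines. To stay self-contained, this sketch swaps in simpler components than the paper uses: a least-squares AR(1) fit stands in for the ARIMA stage, and the residual model is a pluggable callable where a trained ANN would go. The function names and the placeholder `fit_damped_persistence` are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of X_t = b + gamma * X_{t-1} (stand-in for ARIMA)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("series has no variation to fit")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    gamma = sxy / sxx
    return mean_y - gamma * mean_x, gamma


def fit_damped_persistence(residuals):
    """Placeholder for the ANN stage: predicts half of the latest residual."""
    def predict(recent_residuals):
        return 0.5 * recent_residuals[-1]
    return predict


def hybrid_one_step(series, fit_residual_model):
    """Stage 1: linear forecast. Stage 2: forecast of the linear residuals."""
    b, gamma = fit_ar1(series)
    fitted = [b + gamma * prev for prev in series[:-1]]
    residuals = [actual - fit for actual, fit in zip(series[1:], fitted)]
    linear_forecast = b + gamma * series[-1]
    residual_forecast = fit_residual_model(residuals)(residuals)
    # The hybrid forecast is the sum of the two components.
    return linear_forecast + residual_forecast, linear_forecast, residual_forecast


series = [1.0, 1.2, 1.1, 1.5, 1.4, 1.9, 1.8, 2.4]  # made-up values
total, linear, nonlinear = hybrid_one_step(series, fit_damped_persistence)
print(total, linear, nonlinear)
```

Passing the residual model as a callable keeps the combination step independent of how the nonlinear stage is implemented.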

Figure 1. Flowchart of hybrid model. Noted: In the initial stage, the ARIMA model is employed to
examine the linear aspect of the data. The residuals recovered from an estimated ARIMA model are
used to build a neural network in the second stage. Finally, to make a hybrid, the forecasted values
of ANN and ARIMA are added.

3. Empirical Results
This section provides a thorough analysis and graphical representation of the three
stock markets.
3.1. Nasdaq USA Stock Market
In Figure 2a, the original series is shown to increase over time, which shows that
the underlying series is non-stationary. More specifically, the statistical characteristics
exhibit temporal variability. To achieve smoothness and eliminate fluctuations from the
data, we initially transform the series by taking the natural logarithm and then perform
the first difference to achieve stationarity. Figure 2b portrays the graph of the transformed
time series, which manifests that the series is difference stationary. In Figure 3a, the ACF
plot is steadily declining. This is another indication of a unit root. As Figure 3c shows, after
we performed the transformation, the ACF plot declines very quickly, which suggests
a differenced stationary series. Thus, we can proceed with the stationary series. Certain
patterns in the ACF and PACF plots correspond to specific orders of q and p.
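The contrast between a slowly declining ACF at level and a quickly declining ACF after differencing can be reproduced on synthetic data. The sketch below uses a simulated random walk rather than the Nasdaq series, so the numbers it prints are illustrative only; `sample_acf` is a hypothetical helper implementing the usual sample autocorrelation.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations rho_1, ..., rho_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def first_difference(series):
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Simulated random walk (a unit-root process), not real market data.
random.seed(0)
level, value = [], 0.0
for _ in range(500):
    value += random.gauss(0.0, 1.0)
    level.append(value)

acf_level = sample_acf(level, max_lag=10)
acf_diff = sample_acf(first_difference(level), max_lag=10)
print("level, lags 1-3:", [round(r, 2) for r in acf_level[:3]])
print("diff,  lags 1-3:", [round(r, 2) for r in acf_diff[:3]])
```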

Figure 2. Level and first difference of USA stock market. Panel (a) shows that the series increases
over time; after taking the first difference, shown in panel (b), the series is mean stationary.
Noted: Level and first difference of USA stock market, where the series at level shows an increasing
trend and achieves smoothness after difference transformation.

There are a few ways in which we can observe the residuals’ randomness in the
estimated model. We adopt a graphical approach, as well as a statistical approach, in
Figure 4. The residuals’ ACF reveals no serious autocorrelations. The last plot on the
bottom provides p-values for the Ljung–Box statistic for each lag up to 10. These tests
consider the accumulated residual autocorrelation from lag 1. The dashed blue line indicates
a 5 percent level of significance, and it can be observed that all p-values (denoted by circles)
are above this. Thus, we can conclude that residuals are purely random. Hence, this model
is suitable for prediction.
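For reference, the Ljung–Box statistic behind these p-values has the standard form below. The symbols $n$ (number of residuals), $h$ (number of lags tested), and $\hat{\rho}_k$ (lag-$k$ sample autocorrelation of the residuals) are notation introduced here for illustration and do not appear in the paper.

```latex
Q(h) = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{2}}{n-k}
```

Under the null hypothesis of no residual autocorrelation, $Q(h)$ is approximately chi-squared distributed, with $h - p - q$ degrees of freedom when the residuals come from a fitted ARMA($p$, $q$) model. A p-value above 0.05 therefore means the null hypothesis is not rejected at the 5 percent level.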
Post ARIMA modeling, we utilize another approach for forecasting, known as ANN.
ANN is considered the most well-known machine learning technique for forecasting.
Therefore, this study adopts this technique to capture the complex behavior of the Nasdaq
US stock market and, as a result, achieve a better forecast. The process of configuring
the ANN is comprehensively elucidated in Section 2.2. In the ANN model fitting, we
employ an iterative approach, utilizing a trial-and-error method to determine the optimal
number of hidden layers. To elucidate this, we commence with a single hidden layer and
individually increment the layer count until we achieve the most precise outcome. During
this progression, it was observed that the minimum test error was attained when employing
three hidden layers and five input layers.
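The trial-and-error search described above amounts to evaluating each candidate configuration and keeping the one with the smallest test error. A minimal sketch follows; the error function is a made-up stand-in (no network is trained here), and representing a configuration as an `(n_inputs, n_hidden)` pair is an assumption made for the example.

```python
def select_configuration(candidates, test_error):
    """Return the candidate with the smallest test error, plus all scores."""
    scores = {config: test_error(config) for config in candidates}
    best = min(scores, key=scores.get)
    return best, scores


def toy_test_error(config):
    """Hypothetical error surface with its minimum at 5 inputs and 3 hidden units."""
    n_inputs, n_hidden = config
    return (n_inputs - 5) ** 2 + (n_hidden - 3) ** 2 + 0.1


candidates = [(n_inputs, n_hidden)
              for n_inputs in range(1, 7)
              for n_hidden in range(1, 5)]
best, scores = select_configuration(candidates, toy_test_error)
print(best)  # (5, 3)
```

In practice `test_error` would fit the network for the given configuration and return its error on the held-out data.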

Figure 3. ACF and PACF plots. Noted: ACF and PACF for level (a,b), where the ACF is steadily
declining, which indicates a unit root, and for differenced data (c,d), where the ACF plot is
declining very fast, which is evidence of stationarity.

The same methodological approach was replicated in the construction of the hybrid
model. Here, the task was to identify the ideal configuration of the hybrid model. The
iterative process led to the selection of two hidden layers and four input layers as the
configuration that yielded the most favorable results.
Figure 5 shows our comparison of different time series and machine learning models.
This shows how well the predictions worked visually, with the height of each bar
showing the extent to which the predicted values differed from the actual values. A
lower bar height is indicative of a smaller margin of error, reflecting a higher level of
accuracy in the prediction.
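The extent to which predictions differ from actual values is typically summarized by an error measure. This section does not name the measure plotted in Figure 5, so the sketch below uses the mean absolute error (MAE) and root mean squared error (RMSE) purely as common examples; the function names and numbers are illustrative.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


actual = [0.0, 0.0]
predicted = [3.0, 4.0]
print(mae(actual, predicted))             # 3.5
print(round(rmse(actual, predicted), 4))  # 3.5355
```

Both measures are zero for a perfect forecast, and a smaller value corresponds to a lower bar in a chart of this kind.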
Upon a detailed examination of Figure 5, several key observations and insights come
to the fore. First and foremost, it is evident that the ANN model exhibits a commendable
ability to capture the directional movements of the Nasdaq US stock market. This implies
that, when using the ANN model in isolation, it can offer a relatively accurate forecast.
This is a testament to the power of neural networks to uncover complex patterns and
relationships within financial time series data.
Figure 4. Diagnostic check. Noted: The ACF of the residuals shows no significant autocorrelations.
The dashed blue line indicates a 5 percent significance level, and it can be observed that all p-values
(denoted by circles) are above this, which ensures the randomness of residuals.

Figure 5. Forecast comparison across several models. Noted: This presents a comparison of time
Figure 5. Forecast comparison across several models. Noted: This presents a comparison of time
series and machine learning models. The smaller height of a bar is evidence of an accurate prediction.
series and machine learning models. The smaller height of a bar is evidence of an accurate predic-
Herein, the hybrid model outperforms the rival models.
tion. Herein, the hybrid model outperforms the rival models.
However, the most intriguing findings emerge when we turn our attention to the
hybrid model, specifically the ARIMA-ANN combination. When compared to both the
standalone ARIMA and ANN models in this situation, it is clear that the forecast errors
produced by the ARIMA-ANN hybrid model are significantly lower. This reduction
in forecast errors signifies a higher level of predictive accuracy when utilizing the
hybrid approach.
The observed improvement in forecast accuracy achieved with the ARIMA-ANN
hybrid model can be attributed to its unique ability to combine the strengths of two distinct
forecasting methodologies. The ARIMA component excels in modeling linear trends and
capturing seasonality, while the ANN component is adept at handling complex, nonlinear
relationships in the data. By integrating these two approaches, the hybrid model leverages
their complementary strengths, resulting in a more precise forecast.
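The two-stage idea described above can be illustrated with a minimal sketch. It assumes the residual-based scheme commonly associated with Zhang [24]: a linear model is fitted first, and a network is then trained on the linear model's residuals, with the two forecasts summed. Everything concrete here is an assumption made for illustration: a least-squares AR(1) stands in for ARIMA, a tiny hand-written one-hidden-layer network stands in for the ANN, the data are synthetic, and the names `fit_ar1` and `train_mlp` are hypothetical. It is not the configuration used in this study.

```python
import math
import random

def fit_ar1(y):
    """Least-squares fit of y[t] = c + phi * y[t-1]; a simple stand-in for ARIMA."""
    x, z = y[:-1], y[1:]
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxx = sum((a - mx) ** 2 for a in x)
    phi = sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sxx
    return mz - phi * mx, phi

def train_mlp(inputs, targets, hidden=3, epochs=200, lr=0.1, seed=0):
    """One-hidden-layer tanh network (scalar in, scalar out), full-batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [0.0] * hidden  # zero output weights: training starts from "no correction"
    b2 = 0.0
    n = len(inputs)
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        for x, t in zip(inputs, targets):
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - t
            g_b2 += err
            for j in range(hidden):
                back = err * w2[j] * (1.0 - h[j] ** 2)  # gradient reaching hidden unit j
                g_w2[j] += err * h[j]
                g_w1[j] += back * x
                g_b1[j] += back
        for j in range(hidden):
            w1[j] -= lr * g_w1[j] / n
            b1[j] -= lr * g_b1[j] / n
            w2[j] -= lr * g_w2[j] / n
        b2 -= lr * g_b2 / n

    def predict(x):
        return sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden)) + b2
    return predict

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Synthetic series with a linear and a nonlinear part (illustrative only, not the paper's data).
rng = random.Random(7)
y = [0.0]
for _ in range(299):
    y.append(0.6 * y[-1] + 0.3 * math.sin(3.0 * y[-1]) + rng.gauss(0.0, 0.1))

# Stage 1: linear model and its residuals e[t] = y[t] - linear forecast of y[t].
c, phi = fit_ar1(y)
linear = [c + phi * prev for prev in y[:-1]]           # forecasts for y[1:]
resid = [obs - fit for obs, fit in zip(y[1:], linear)]

# Stage 2: the network maps the previous residual to the current one.
net = train_mlp(resid[:-1], resid[1:])
hybrid = [lin + net(e_prev) for lin, e_prev in zip(linear[1:], resid[:-1])]  # forecasts for y[2:]

print("linear MSE:", mse(y[2:], linear[1:]))
print("hybrid MSE:", mse(y[2:], hybrid))
```

The comparison printed here is in-sample and on toy data, so it only shows how the two forecasts are combined; an evaluation comparable to the one reported above would use a held-out test period.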
3.2. Nikkei Japan Stock Market
Figure 6a demonstrates a clear upward trend in the series at level, indicating that the
underlying series is non-stationary. In order to achieve flatness and remove fluctuations
from an underlying series, researchers commonly employ a logarithm transformation,
followed by the application of the first difference to establish stationarity. Figure 6b
displays a plot of the transformed series, which is difference stationary. Figure 7a
demonstrates a slow, consistent decrease in the autocorrelation function (ACF) plot, which
serves as additional evidence of the presence of a unit root. Figure 7c exhibits a distinct
decline in the ACF plot after the transformation, indicating the achievement of stationarity.
The orders q and p are identified from the distinct patterns observed in the plots of the ACF
and the partial autocorrelation function (PACF), respectively.
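The transformation and the ACF check described above can be sketched as follows. This is a hedged illustration on a synthetic random walk with drift, not the Nikkei data; the helper names `log_diff` and `acf` are hypothetical, and the sample ACF uses the usual biased estimator.

```python
import math
import random

def log_diff(prices):
    """First difference of the log series (log returns)."""
    logs = [math.log(p) for p in prices]
    return [b - a for a, b in zip(logs, logs[1:])]

def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

# Synthetic index: a random walk with drift in logs (illustrative only).
rng = random.Random(42)
log_price = math.log(100.0)
prices = []
for _ in range(500):
    log_price += 0.0005 + rng.gauss(0.0, 0.01)
    prices.append(math.exp(log_price))

level_acf = acf([math.log(p) for p in prices], 5)
diff_acf = acf(log_diff(prices), 5)
print("level:      ", [round(r, 2) for r in level_acf])
print("differenced:", [round(r, 2) for r in diff_acf])
```

For a series with a unit root, the level autocorrelations stay close to one over the first lags, whereas those of the differenced series fall towards zero immediately, which is the pattern the ACF plots are read for.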

Figure 6. Level and first difference of Japanese stock market. Figure (a) demonstrates increasing
trend, while figure (b) mean stationary. Noted: Level and first difference of Japanese stock market,
where the series at level shows an upward trend and achieves smoothness after difference
transformation.
In Figure 8, the ACF or autocorrelation coefficient of the residuals of fitted ARIMA for
lag 1–30 is within the limits. Moreover, the Ljung–Box test also supports this result. Thus,
we can conclude that residuals are purely random. Hence, this approach can be applied to
forecasting. Post ARIMA prediction, we utilized the ANN algorithm and then a hybrid of
both. We used an iterative process to fit the ANN model, determining the ideal number of
hidden layers through trial and error. We started with one hidden layer and progressively
added layers individually until we reached the most accurate result. It was discovered that
using two hidden layers and three input layers resulted in the lowest test error.
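The trial-and-error search over the number of hidden layers can be sketched as a simple loop. The 75%/25% chronological split follows the split stated in the abstract; the function names are hypothetical, and `toy_fit_and_score` is only a placeholder for "train an ANN with this many hidden layers and return its test error", so the numbers it produces carry no meaning.

```python
def chronological_split(series, train_frac=0.75):
    """Split a time series in order: first part for training, remainder for testing."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

def select_hidden_layers(train, test, fit_and_score, max_layers=5):
    """Try 1..max_layers hidden layers and return the count with the lowest test error."""
    errors = {}
    for n_layers in range(1, max_layers + 1):
        errors[n_layers] = fit_and_score(train, test, n_layers)
    best = min(errors, key=errors.get)
    return best, errors

def toy_fit_and_score(train, test, n_layers):
    """Hypothetical placeholder: a real version would train an ANN and return its test error."""
    return abs(n_layers - 2) + 0.1

series = list(range(100))
train, test = chronological_split(series)
best, errors = select_hidden_layers(train, test, toy_fit_and_score)
print(len(train), len(test), best)  # prints: 75 25 2
```

The same loop could be run over the number of inputs. If the test period is also used for the final comparison, holding out a separate validation segment for this search is one possible refinement.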

Figure 7. ACF and PACF plots. Noted: ACF and PACF for level (a,b), where the ACF is steadily
declining, which indicates a unit root problem, and differenced data (c,d), where the ACF plot is
declining very fast, which is evidence of a stationary series.

The insights drawn from Figure 9 are particularly illuminating, shedding light on the
performance of various forecasting models in the context of the Nikkei Japan stock market.
This visual representation allows us to discern and interpret the relative accuracy of
these models by observing the heights of the bars, where lower heights signify smaller
forecast errors and, consequently, a higher degree of predictive precision.
Upon a closer examination of Figure 9, it becomes evident that the ANN algorithm
displays a commendable capacity to capture the overarching trend of the Nikkei Japan
stock market. This indicates that, when utilized as a standalone model, the ANN is adept
at providing forecasts that align well with the actual market movements. This observation
underscores the ability of neural networks to uncover and incorporate intricate patterns
and nuances within the time series data of the Nikkei index, contributing to its strong
forecasting performance.
However, the most striking findings emerge when we shift our focus to the hybrid
model, specifically the ARIMA-ANN combination. In this context, it becomes readily
apparent that the forecast errors generated by the ARIMA-ANN hybrid model are notably
reduced when compared to the separate ARIMA and ANN models. This reduction in
forecast errors is a clear manifestation of the heightened predictive accuracy that the hybrid
approach offers.

Figure 8. Diagnostic check. Noted: The ACF of the residuals does not exhibit any statistically
significant autocorrelations. The dashed blue line represents the 5 percent significance level. It is
evident that all p-values, indicated by circles, are higher than this threshold, indicating that the
residuals exhibit randomness.

Figure 9. Forecast comparison across several models. Noted: A comparison between ML models and
time series is presented. An accurate prediction is demonstrated by a shorter bar. The hybrid model
outperformed the other models in the present case.

The unique ability of the ARIMA-ANN hybrid model to combine the best features
of two different modelling approaches is what makes it better at making predictions. The
ARIMA component excels in capturing linear trends, and it effectively addresses issues
related to seasonality. Meanwhile, the ANN component demonstrates its prowess in
dealing with the complexity of non-linear relationships within the data. By integrating
these two approaches, the hybrid model capitalizes on their complementary strengths,
culminating in a more precise and reliable forecast.

3.3. France Stock Market (CAC 40 Index)

We can see in Figure 10a that the original stock market time series is increasing over
time, which suggests that the series is non-stationary and suffers from a unit root problem.
To achieve flatness and remove fluctuations in the data, the logarithm transformation is
implemented, and a difference transformation is performed to obtain a stationary series.
Figure 10b represents the transformed series, which appears stationary. In Figure 11a, a
gradual decrease in the ACF plot is further evidence of a unit root. Following transformation,
in Figure 11c, we can notice a sharp fall in the ACF plot. This confirms that the series is a
first difference stationary series. Certain orders of q and p are connected to a specific pattern
in the ACF and PACF plots, respectively.

Figure 10. Level and first difference in French stock market. Figure (a) is showing increasing trend,
while figure (b) mean stationary. Noted: Level and first difference in French stock market, where
the series at level shows an upward trend and achieves smoothness after difference transformation.

Looking at the residuals correlogram and the Ljung–Box test shown in Figure 12, it
is clear that there is no noticeable spike, and the p-values from the Ljung–Box test are
higher than the 5% significance level. These results provide no evidence against the null
hypothesis that the residuals are random. Therefore, it can be inferred that the residuals
exhibit the characteristics of white noise, and this model has the potential to be utilised
for making predictions. After ARIMA prediction, the subsequent
step employs the ANN technique. Subsequently, a combination of both ARIMA and ANN
strategies is utilised. We fit the ANN model iteratively, exploring until we found the
optimal number of hidden layers. We began with a single hidden layer and worked our
way up to the most accurate outcome, layer by layer. Along the way, it was found that the
lowest test error was achieved with three input layers and three hidden layers.
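The residual check described above rests on the Ljung–Box statistic, Q = n(n+2) Σ r_k² / (n−k) summed over lags k = 1..h, which is compared with a chi-square quantile. The sketch below is illustrative only: it uses synthetic series rather than the ARIMA residuals of this study, hard-codes the 95% chi-square quantile for 10 degrees of freedom (18.307), and, as a simplifying assumption, does not reduce the degrees of freedom by the number of fitted ARMA parameters.

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    return sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n)) / denom

def ljung_box_q(residuals, max_lag=10):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    n = len(residuals)
    return n * (n + 2) * sum(autocorr(residuals, k) ** 2 / (n - k)
                             for k in range(1, max_lag + 1))

CHI2_95_DF10 = 18.307  # 95% quantile of the chi-square distribution, 10 degrees of freedom

rng = random.Random(1)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]        # behaves like well-fitted residuals
persistent = [0.0]                                        # strongly autocorrelated series
for _ in range(499):
    persistent.append(0.9 * persistent[-1] + rng.gauss(0.0, 1.0))

for name, series in (("white noise", white), ("AR(1), phi=0.9", persistent)):
    q = ljung_box_q(series)
    verdict = "reject randomness" if q > CHI2_95_DF10 else "no evidence against randomness"
    print(name, round(q, 2), verdict)
```

A Q value below the critical value corresponds to a p-value above the 5% line in the diagnostic plots, which is the situation reported for the fitted ARIMA models.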
The insights derived from Figure 13 offer a compelling perspective of the performance
of various forecasting models within the intricate landscape of the French stock market.
This visual representation provides a clear means of gauging the relative accuracy of these
models, with lower bar heights indicating smaller forecast errors and, by extension, a
higher level of predictive accuracy.
Upon a detailed examination of Figure 13, a notable observation comes to the forefront:
the ANN algorithm demonstrates a strong ability to capture the underlying trends of the
French stock market. This implies that, when employed as a standalone model, the ANN
excels at providing forecasts that closely align with actual market behavior. This finding
underscores the capacity of neural networks to uncover and incorporate the subtleties and
intricacies within the time series data of the French stock market, contributing to its robust
forecasting performance.

Figure 11. ACF and PACF plots. Noted: ACF and PACF for level (a,b), where the ACF is steadily
declining, which indicates a unit root problem, and differenced data (c,d), where the ACF plot is
declining very fast, which confirms stationarity.

However, the most remarkable findings are unveiled as we shift our focus towards
the hybrid model, specifically the fusion of ARIMA and ANN. When compared to the
individual ARIMA and ANN models, it is clear that the ARIMA-ANN hybrid model
significantly reduces the forecast errors. This substantial reduction in forecast errors reflects
a higher degree of predictive accuracy, affirming the superior forecasting capability of the
hybrid approach.
The improvement in forecasting precision obtained with the ARIMA-ANN hybrid
model is a direct consequence of its unique ability to harness the strengths of two distinct
modeling methodologies. The ARIMA component effectively captures linear trends and
addresses seasonality in the data, while the ANN component excels at managing the
complexities of non-linear relationships. By seamlessly integrating these two approaches,
the hybrid model optimally leverages their complementary strengths, culminating in a
forecast that is both accurate and robust.

Figure 12. Diagnostic check. Noted: The ACF of the residuals shows no significant autocorrelations.
The dashed blue line indicates a 5 percent significance level, and it can be observed that all p-values
(denoted by circles) are above this, which ensures the randomness of residuals.

Figure 13. Forecast comparison across several models. Noted: A comparison of time series and
ML models is produced. The shorter bar denotes accurate prediction. Herein, the hybrid model
outperforms the rival models.

3.4. Difference among the Three Datasets Results
This study presents a novel approach that combines the ARIMA and ANN models
and is then applied to three financial markets within the G7. The findings of this study
demonstrate that the hybridization of these models yields highly beneficial results
in terms of prediction. It is worth noting that, among the financial markets considered, the
hybrid approach exhibits a notably low level of forecast inaccuracy when applied to the
Nasdaq USA stock market as compared to the other financial markets under consideration.
By contrast, in the case of the Nikkei Japan stock market, there is a particularly large
degree of forecasting error.

4. Conclusions
Almost all financial decision-makers, such as investors, money managers, hedge
funds, and investment banks, need to forecast financial asset prices such as exchange
rates, options, bonds, interest rates, and stocks, among other things, with the aim of
making productive decisions. Therefore, to date, the modification and development
of new models have not stopped in research on the management of financial markets.
According to previous research, prediction plays a key role in financial markets; however,
this is a difficult task. Thus, financial stakeholders face many difficulties in achieving
accurate forecasts. In the forecasting literature, merging multiple models is one of the
most popular ways to gain additional accuracy in comparison with individual models.
The literature has put forth a number of methods for dealing with the limitations of the
separate approaches and generating more trustworthy results. A combining approach
that decomposes a time series into two parts, linear and non-linear, is the most popular
approach, and has been theoretically as well as empirically accepted to be more successful
than an individual model. These models have advantages in terms of linearity and
nonlinearity in the time series nexus.
The current study compares the predictive power of a hybrid linear/nonlinear model
(ARIMA-ANN) with that of its components (ARIMA and ANN) using the data of
three stock market indices from G7 countries. Empirical research based on three popular
real datasets of stock prices from the three stock market indexes, namely the Nasdaq
stock exchange (United States), the Nikkei stock exchange (Japan), and the France stock
exchange (CAC 40 index),
demonstrates that using a hybrid model yields a more accurate forecast than using separate
components. It is generally believed that a hybrid model can deliver results that are, to
some extent, better than those obtained by individual models. Based on an analysis of real
data, the findings revealed that the hybrid ARIMA-ANN is overall superior to individual
ANN and ARIMA models. For all the considered stock exchange indexes, the RMSE and
MAE values observed in the hybrid model exhibited a significant reduction in comparison
to the individual models.
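For reference, the two accuracy measures named above follow their standard definitions: RMSE is the square root of the mean squared forecast error, and MAE is the mean absolute forecast error. A minimal sketch is given below; the numbers are made up for illustration and are not results from this study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: sqrt(mean((actual - predicted)^2))."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: mean(|actual - predicted|)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Made-up values, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 105.0]
print("RMSE:", round(rmse(actual, predicted), 4))
print("MAE: ", round(mae(actual, predicted), 4))
```

Because RMSE squares the errors before averaging, it penalises occasional large misses more heavily than MAE, so reporting both gives a fuller picture of forecast accuracy.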
The scope of this study primarily centres on univariate analysis, wherein forecasting
models are built solely on historical data related to the stock market indices under consid-
eration. Numerous external variables, including economic indicators, political events, and
global trends, can have a profound impact on market movements. Incorporating external
economic and financial indicators, such as geopolitical events or macroeconomic data, into
the forecasting models can enhance their predictive power. Future studies could explore
the impact of exogenous variables on model accuracy. A combination of LSTM and ANN
could also be explored for the prediction of complex stock market data.

Author Contributions: Software, A.T.A.; Validation, E.A.; Investigation, E.H. All authors have read
and agreed to the published version of the manuscript.
Funding: This research has been funded by Deputy for Research & Innovation, Ministry of Education
through Initiative of Institutional Funding at University of Ha’il—Saudi Arabia through project
number IFP-22 055.
Data Availability Statement: All data available in the paper with related references.
Acknowledgments: This research has been funded by Deputy for Research & Innovation, Ministry
of Education through Initiative of Institutional Funding at University of Ha’il—Saudi Arabia through
project number IFP-22 055.
Conflicts of Interest: There is no conflict of interest regarding publishing this paper.
Mathematics 2023, 11, 4594 17 of 17

References
1. Chhajer, P.; Shah, M.; Kshirsagar, A. The applications of artificial neural networks, support vector machines, and long–short term
memory for stock market prediction. Decis. Anal. J. 2022, 2, 100015. [CrossRef]
2. Pettinger, T. UK Wage Growth. Economics Help. 2019. Available online: https://www.economicshelp.org/blog/6994
/economics/uk-wage-growth/ (accessed on 8 October 2019).
3. Yun, H.; Lee, M.; Kang, Y.S.; Seok, J. Portfolio management via two-stage deep learning with a joint cost. Expert Syst. Appl. 2020,
143, 113041. [CrossRef]
4. Kou, G.; Xu, Y.; Peng, Y.; Shen, F.; Chen, Y.; Chang, K.; Kou, S. Bankruptcy prediction for SMEs using transactional data and
two-stage multiobjective feature selection. Decis. Support Syst. 2021, 140, 113429. [CrossRef]
5. Kshirsagar, A.; Shah, M. Anatomized study of security solutions for multimedia: Deep learning-enabled authentication,
cryptography and information hiding. In Advanced Security Solutions for Multimedia; IOP Publishing: Bristol, UK, 2021.
6. Solanki, P.; Baldaniya, D.; Jogani, D.; Chaudhary, B.; Shah, M.; Kshirsagar, A. Artificial intelligence: New age of transformation in
petroleum upstream. Pet. Res. 2022, 7, 106–114. [CrossRef]
7. Gu, S.; Kelly, B.; Xiu, D. Empirical asset pricing via machine learning. Rev. Financ. Stud. 2020, 33, 2223–2273. [CrossRef]
8. Zhang, Y.; Chu, G.; Shen, D. The role of investor attention in predicting stock prices: The long short-term memory networks
perspective. Financ. Res. Lett. 2021, 38, 101484. [CrossRef]
9. Bianchi, D.; Büchner, M.; Hoogteijling, T.; Tamoni, A. Corrigendum: Bond risk premiums with machine learning. Rev. Financ.
Stud. 2021, 34, 1090–1103. [CrossRef]
10. Anghel, D.G. A reality check on trading rule performance in the cryptocurrency market: Machine learning vs. technical analysis.
Financ. Res. Lett. 2021, 39, 101655. [CrossRef]
11. Liu, M.; Li, G.; Li, J.; Zhu, X.; Yao, Y. Forecasting the price of Bitcoin using deep learning. Financ. Res. Lett. 2021, 40, 101755.
[CrossRef]
12. Fathali, Z.; Kodia, Z.; Ben Said, L. Stock market prediction of Nifty 50 index applying machine learning techniques. Appl. Artif.
Intell. 2022, 36, 2111134. [CrossRef]
13. Gupta, R.; Nel, J.; Pierdzioch, C. Investor confidence and forecastability of US stock market realized volatility: Evidence from
machine learning. J. Behav. Financ. 2023, 24, 111–122. [CrossRef]
14. Chen, X.; Wu, C. Retail investor attention and information asymmetry: Evidence from China. Pac.-Basin Financ. J. 2022, 75, 101847.
[CrossRef]
15. Liu, R.; Gupta, R. Investors’ uncertainty and forecasting stock market volatility. J. Behav. Financ. 2022, 23, 327–337. [CrossRef]
16. Peng, Z.; Khan, F.U.; Khan, F.; Shaikh, P.A.; Yonghong, D.; Ullah, I.; Ullah, F. An Application of Hybrid Models for Weekly Stock
Market Index Prediction: Empirical Evidence from SAARC Countries. Complexity 2021, 2021, 5663302. [CrossRef]
17. Khan, F.; Urooj, A.; Muhammadullah, S. An ARIMA-ANN hybrid model for monthly gold price forecasting: Empirical evidence
from Pakistan. Pak. Econ. Rev. 2021, 4, 61–75.
18. Majumder, A.; Rahman, M.M.; Biswas, A.A.; Zulfiker, M.S.; Basak, S. Stock Market Prediction: A Time Series Analysis. In Smart
Systems: Innovations in Computing: Proceedings of SSIC 2021; Springer: Singapore, 2022; pp. 389–401.
19. Staffini, A. Stock price forecasting by a deep convolutional generative adversarial network. Front. Artif. Intell. 2022, 5, 837596.
[CrossRef]
20. Lu, X.; Que, D.; Cao, G. Volatility forecast based on the hybrid artificial neural network and GARCH-type models. Procedia
Comput. Sci. 2016, 91, 1044–1049. [CrossRef]
21. Hajirahimi, Z.; Khashei, M. A novel parallel hybrid model based on series hybrid models of ARIMA and ANN models. Neural
Process. Lett. 2022, 54, 2319–2337. [CrossRef]
22. Rudin, C.; Ertekin, Ş. Learning customized and optimized lists of rules with mathematical programming. Math. Program. Comput.
2018, 10, 659–702. [CrossRef]
23. Armstrong, J.S. Combining forecasts. In Principles of Forecasting; Armstrong, J.S., Ed.; Kluwer Academic Publishers: Norwell, MA,
USA, 2001.
24. Zhang, G.P. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 2003, 50, 159–175.
[CrossRef]
25. Armstrong, J.S. Findings from evidence-based forecasting: Methods for reducing forecast error. Int. J. Forecast. 2006, 22, 583–598.
[CrossRef]
26. De Gooijer, J.G.; Kumar, K. Some recent developments in non-linear time series modelling, testing, and forecasting. Int. J. Forecast.
1992, 8, 135–156. [CrossRef]
27. Khashei, M.; Bijari, M. A novel hybridization of artificial neural networks and ARIMA models for time series forecasting. Appl.
Soft Comput. 2011, 11, 2664–2675. [CrossRef]
28. Zhang, G.; Patuwo, B.E.; Hu, M.Y. Forecasting with artificial neural networks: The state of the art. Int. J. Forecast. 1998, 14, 35–62.
[CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.



Keywords: stock markets; machine learning; hybridization; forecasting

Mathematics 2023, 11, 4594. https://doi.org/10.3390/math11224594 https://www.mdpi.com/journal/mathematics

2. Data and Methods

For a simplified version of the expression:

$\gamma_p(A)\,X_t = b + \theta_q(A)\,\mu_t$ (4)

representing, respectively, the AR operator and MA operator.

$\gamma_p(A)\,(1 - A)^s X_t = b + \theta_q(A)\,\mu_t$ (6)
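
As a rough illustration of the differencing factor (1 − A) raised to the power s in Equation (6), the sketch below applies it to two toy series. It assumes that A is the backshift (lag) operator, so that A X_t = X_{t−1}, and that s is the order of differencing; the function name `difference` and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np

def difference(x, s):
    """Apply (1 - A)^s to x, assuming A is the backshift operator (A x_t = x_{t-1})."""
    x = np.asarray(x, dtype=float)
    for _ in range(s):
        # One pass of (1 - A): x_t - x_{t-1}; each pass shortens the series by one.
        x = x[1:] - x[:-1]
    return x

# Toy data only (not the stock index data used in the paper).
t = np.arange(10)
trend = 3.0 + 2.0 * t      # linear trend
quad = 1.0 + t ** 2        # quadratic trend

d1 = difference(trend, 1)  # a linear trend becomes constant after one difference
d2 = difference(quad, 2)   # a quadratic trend becomes constant after two differences
```

Under these assumptions, a series with a polynomial trend of degree s is reduced to a constant level by s rounds of differencing, which is the role the factor plays before the AR and MA operators are fitted. The same result can be obtained with `np.diff(x, n=s)`.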

$\beta_l$ ($l = 0, 1, 2, \ldots, k$) and $\alpha_{il}$ ($i = 0, 1, 2, \ldots, n$; $l = 0, 1, 2, \ldots, k$) are the model parameters,


3.3. France Stock Market (CAC 40 Index)


Figure 13. Forecast

3.4. Difference among the Three Datasets Results
