Article
Predicting Economic Trends and Stock Market Prices with Deep
Learning and Advanced Machine Learning Techniques
Victor Chang 1, * , Qianwen Ariel Xu 1 , Anyamele Chidozie 2 and Hai Wang 3
1 Department of Operations and Information Management, Aston Business School, Aston University,
Birmingham B4 7ET, UK; [email protected]
2 School of Computing, Engineering and Digital Technologies, Teesside University,
Middlesbrough TS1 3BX, UK
3 School of Computer Science and Digital Technologies, Aston University, Birmingham B4 7ET, UK;
[email protected]
* Correspondence: [email protected] or [email protected]
Abstract: The volatile and non-linear nature of stock market data, particularly in the post-pandemic
era, poses significant challenges for accurate financial forecasting. To address these challenges, this
research develops advanced deep learning and machine learning algorithms to predict financial
trends, quantify risks, and forecast stock prices, focusing on the technology sector. Our study seeks to
answer the following question: “Which deep learning and supervised machine learning algorithms
are the most accurate and efficient in predicting economic trends and stock market prices, and
under what conditions do they perform best?” We focus on two advanced recurrent neural network
(RNN) models, long short-term memory (LSTM) and Gated Recurrent Unit (GRU), to evaluate
their efficiency in predicting technology industry stock prices. Additionally, we integrate statistical
methods such as autoregressive integrated moving average (ARIMA) and Facebook Prophet and
machine learning algorithms like Extreme Gradient Boosting (XGBoost) to enhance the robustness
of our predictions. Unlike classical statistical algorithms, LSTM and GRU models can identify and
retain important data sequences, enabling more accurate predictions. Our experimental results show
that the GRU model outperforms the LSTM model in terms of prediction accuracy and training
time across multiple metrics such as RMSE and MAE. This study offers crucial insights into the predictive capabilities of deep learning models and advanced machine learning techniques for financial forecasting, highlighting the potential of GRU and XGBoost for more accurate and efficient stock price prediction in the technology sector.

Keywords: stock prices; deep learning; artificial neural networks; recurrent neural networks; long short-term memory (LSTM); gated recurrent unit (GRU)

Citation: Chang, V.; Xu, Q.A.; Chidozie, A.; Wang, H. Predicting Economic Trends and Stock Market Prices with Deep Learning and Advanced Machine Learning Techniques. Electronics 2024, 13, 3396. https://doi.org/10.3390/electronics13173396
1. Introduction

Stock prices are shaped by numerous factors, such as company performance, brand value, market activity, inflation, trends, and investor
sentiment. While some aspects, like sales and purchases, can be estimated, the complexities
add a layer of difficulty to developing accurate models that capture trends and forecast
future prices. Predicting future trends can be the difference between investment success
and failure for investors. Traditional methods, such as technical and fundamental analyses,
have been used to study patterns and predict future stock prices, but they often fall short
when dealing with the dynamic, non-stationary nature of stock markets influenced by
factors like announcement headlines, social media tweets, corporate news, and other mood
indicators [1,2].
Over the years, numerous statistical methods like regressions and time series models
(ARIMA, SARIMA [3], GARCH) have been employed to predict future stock prices. While
beneficial in some respects, these methods struggle with handling stock price data. For
example, the autoregressive integrated moving average (ARIMA) model has been applied
to predict the stock market using historical financial data; however, these statistical models
often fall short due to the non-linear structure of time series data [4].
To overcome the inefficiencies of statistical methods, various Artificial Intelligence (AI)
models have been developed and integrated into statistical analysis to predict future stock
market trends. These include classical machine learning (ML) algorithms such as Support
Vector Machines (SVMs) [5] and Random Forest (RF) [6], as well as deep learning (DL)
algorithms such as recurrent neural networks (RNNs) [7], Convolutional Neural Networks
(CNNs) [8], and other deep learning methods for multivariate time series data analysis. Rao
and Reimherr introduce a novel class of non-linear function-on-function regression models
specifically designed for functional data using neural networks. The authors propose
two model-fitting strategies: Function-on-Function Direct Neural Networks (FFDNNs)
and Function-on-Function Basis Neural Networks (FFBNNs). These strategies are tailored
to leverage the inherent structure of functional data and capture complex relationships
between functional predictors and responses [9]. These AI models, with their capacity to
learn from extensive datasets and continuously improve, offer promising potential for
automated and more accurate future stock price predictions.
Deep learning methods have been extensively used in the existing literature to predict
future stock prices, significantly contributing to improved model accuracy [10]. White
was a pioneer in implementing an artificial neural network (ANN) for financial market
forecasting, using the daily prices of IBM as a database [11]. Although this initial study did
not achieve the expected results, it highlighted several difficulties, such as the overfitting
problem and the low complexity of the neural network, which used only a few entries and
one hidden layer. This study highlighted possible future improvements, including adding
more features to the ANN, working with different forecasting horizons, and evaluating
model profitability. Over the years, deep learning capabilities have greatly improved, and
various parameter tuning methods have been developed to address the issues mentioned
by White [11]. A family of recurrent neural network (RNN) architectures, including variations of gated recurrent units (GRUs) and long short-term memory (LSTM), have
become popular methods for predicting stock market patterns. Recent studies highlight
the effectiveness of combining sentiment analysis with deep learning models. For instance,
Sonkiya et al. used BERT for sentiment analysis and GANs for stock price prediction, show-
ing improved performance over traditional methods like ARIMA and neural networks
such as LSTM and GRU [12]. Similarly, Maqsood et al. demonstrated that incorporating
sentiment from local and global events into deep learning models enhances prediction
accuracy, as evidenced by improved RMSE and MAE metrics [13]. Another innovative
approach by Patil et al. utilized graph theory to model the stock market as a complex
network. Their hybrid models, which combined graph-based structural information with
deep learning and traditional machine learning techniques, outperformed standard models
by leveraging the spatio-temporal relationships between stocks [14].
Despite these advancements, there is a notable gap in current research. Compara-
tive analyses of LSTM and GRU for predicting stock prices of technology companies are
insufficient. Existing studies often lack the necessary industry specificity, resulting in un-
satisfactory predictions when models trained on general stock market data are applied
to specific industries or companies. This study aims to address this gap by focusing on
the technology industry and applying LSTM and GRU models to enhance the precision
of technology stock forecasts. By comparing these models, we aim to determine the more
effective method for predicting technology sector stock prices, ultimately aiding investors
in making data-driven decisions.
This study uniquely applies LSTM and GRU deep learning models, along with various
machine learning algorithms, to predict stock prices in the technology sector. We aim to
identify the more effective model among them, offering a crucial contribution to financial
forecasting. The objective is to better understand the patterns, trends, and volatility of
the tech stock market and develop an efficient model to bolster the accuracy of tech stock
forecasts, enabling data-driven decision-making for investors.
The remainder of this paper is structured as follows: Section 2 provides a brief intro-
duction to the various computational methods and data analysis techniques utilized in our
study. Section 3 covers the preliminaries, our approach, and illustrative examples. Section 4
presents the numerical results, while Section 5 details additional experiments and validations. Section 6 discusses our contributions, and Section 7 concludes the paper and highlights directions for future research.
2. Theoretical Background
This section outlines the various computational methods and data analysis techniques
employed in our study to predict stock price movements. The theoretical foundation of our
approach relies on both deep learning and traditional machine learning frameworks. Deep
learning is particularly adept at processing and learning from large datasets, making it ideal
for the complex patterns observed in stock market data. Machine learning algorithms like
XGBoost complement deep learning by providing efficient, scalable methods for regression
and classification.
This recurrent structure allows the network to remember the context during training. In the next section, we summarize LSTM and GRU networks.
Figure 1. Simple RNN with an input circle and its equivalent unrolled presentation.
LSTM and GRU Networks

Hochreiter and Schmidhuber developed an exceptional type of RNN that can learn over long distances. Various other researchers later improved this leading effort [17–19]. LSTM and GRUs were developed to solve the protracted dependency problem. Sutton and Barto discussed the evolution and refinement of LSTM and GRUs from RNNs [20]. RNNs are made up of a series of repeating neural network modules. In a standard RNN, repeating modules contain a simple computational node, represented by a single tanh activation function, as shown in Figure 2.
Figure 2. The repeating module in a standard RNN contains a single layer.
LSTM cells can track information over multiple time steps. Information is added or eliminated through structures called gates. Gates naturally allow information through via a sigmoid neural net layer and a pointwise multiplication. The repeating module in an LSTM is shown in Figure 3. LSTM models process the information by first forgetting irrelevant parts of the previous state, then storing the most relevant parts of the new information to the state of the cell, thirdly updating their internal status, and then finally producing the output.
Figure 3. The repeating module in an LSTM.
The forget gate in an LSTM unit determines which cell state information to exclude from the model. The memory cell takes the previous instant ht−1 and the current input information xt and transforms them into a long vector (ht−1 , xt ) to become

ft = σ(Wf ·[ht−1 , xt ] + bf ) (1)

where Wf is the weight matrix for the forget gate and bf is the bias term. The input gate determines which new information is stored in the cell state:

it = σ(Wi ·[ht−1 , xt ] + bi ) (2)

A tanh layer produces a vector of candidate values:

C̃t = tanh(WC ·[ht−1 , xt ] + bC ) (3)

The cell state is then updated by combining the retained part of the previous state with the new candidates:

Ct = ft ∗ Ct−1 + it ∗ C̃t (4)

Finally, the output gate determines which parts of the cell state are emitted:

ot = σ(Wo ·[ht−1 , xt ] + bo ) (5)

where Wo is the weight matrix for the output gate and bo is the bias term.

The final output value of the cell is defined as

ht = ot ∗ tanh(Ct ) (6)
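To make Equations (1)–(6) concrete, the following minimal NumPy sketch performs a single LSTM cell step; the weight matrices, dimensions, and initialization are illustrative placeholders, not the trained parameters used in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following Equations (1)-(6). W and b hold the
    weight matrices and biases of the forget (f), input (i),
    candidate (c), and output (o) transforms."""
    v = np.concatenate([h_prev, x_t])        # long vector (h_{t-1}, x_t)
    f_t = sigmoid(W["f"] @ v + b["f"])       # forget gate, Eq. (1)
    i_t = sigmoid(W["i"] @ v + b["i"])       # input gate, Eq. (2)
    c_tilde = np.tanh(W["c"] @ v + b["c"])   # candidate state, Eq. (3)
    c_t = f_t * c_prev + i_t * c_tilde       # cell state update, Eq. (4)
    o_t = sigmoid(W["o"] @ v + b["o"])       # output gate, Eq. (5)
    h_t = o_t * np.tanh(c_t)                 # hidden state, Eq. (6)
    return h_t, c_t

# Toy dimensions: 1 input feature, 4 hidden units, random placeholder weights.
rng = np.random.default_rng(0)
n_in, n_h = 1, 4
W = {k: rng.normal(scale=0.1, size=(n_h, n_h + n_in)) for k in "fico"}
b = {k: np.zeros(n_h) for k in "fico"}
h, c = lstm_step(np.array([0.5]), np.zeros(n_h), np.zeros(n_h), W, b)
```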
Cho created the Gated Recurrent Unit (GRU), a kind of RNN, in 2014 with the purpose of fixing the vanishing gradient issue of RNNs [21]. The GRU's key benefit over other structures is that it requires fewer parameters, trains quicker, and requires less data to generalize. The structure of the GRU model is shown in Figure 4.
Figure 4. The internal structure of the GRU model.
The update and reset gates produce intermediate values zt and rt , respectively, while the final memory of the general-purpose unit stores the result ht [21]. The update gate specifies the amount of prior input xt and output ht−1 that should be conveyed to the next cell, governed by the weight Wz . The reset gate determines how much data should be erased from memory.

The following are the most essential equations that characterize the operation of the GRU:
zt = σ(Wz ·[ht−1 , xt ]) (7)
rt = σ(Wr ·[ht−1 , xt ]) (8)
h̃t = tanh(Wh ·[rt ∗ ht−1 , xt ]) (9)

ht = (1 − zt ) ∗ ht−1 + zt ∗ h̃t (10)
where Wz , Wr , and Wh are the weight matrices for the update gate, reset gate, and candidate
activation, respectively. The operator · denotes matrix multiplication, while ∗ denotes
element-wise multiplication. The functions σ and tanh are the sigmoid and hyperbolic
tangent activation functions, respectively.
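A matching sketch of one GRU step following Equations (7)–(10), again with random placeholder weights and with the bias terms omitted, as in the equations above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, Wz, Wr, Wh):
    """One GRU step following Equations (7)-(10)."""
    v = np.concatenate([h_prev, x_t])
    z_t = sigmoid(Wz @ v)                        # update gate, Eq. (7)
    r_t = sigmoid(Wr @ v)                        # reset gate, Eq. (8)
    h_tilde = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]))  # Eq. (9)
    return (1.0 - z_t) * h_prev + z_t * h_tilde  # new state, Eq. (10)

rng = np.random.default_rng(0)
n_in, n_h = 1, 4
Wz, Wr, Wh = (rng.normal(scale=0.1, size=(n_h, n_h + n_in)) for _ in range(3))
h_t = gru_step(np.array([0.5]), np.zeros(n_h), Wz, Wr, Wh)
```

Note how the GRU uses three weight matrices against the LSTM's four, which is one source of its smaller parameter count and faster training.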
In this paper, we will use deep learning (DL) models to analyze selected technological
stock patterns as one-dimensional time series and attempt to forecast future stock prices by
examining past historical prices and the most critical technical indicators. This research
will compare the performance of the LSTM and the GRU ensemble models on selected
technology stock data to investigate stock price patterns.
An attention mechanism can further help such models focus on the most informative time steps. An alignment score et is first computed for each element of the input sequence:

et = tanh(Wa ·{x1 , x2 , . . . , xT } + b) (11)
These scores are then normalized using the softmax function to produce the attention
weights (at ):
at = exp(et ) / ∑Tk=1 exp(ek ) (12)
The attention mechanism generally involves two steps: the first phase involves calculating the attention distribution, and the second step involves computing the weighted average of the incoming information using the attention distribution as a guide. The process is initiated with the attention scoring function S, whose result is passed to the softmax layer to generate the attention weights (a1 , a2 , . . . , aT ). Finally, the attention weight vector is weighted and averaged against the input data to arrive at the final result. The attention process is shown in Figure 5.
Figure 5. The basic structure of the attention model.
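A compact sketch of this two-step attention computation, scoring a sequence of hidden states with Equation (11), normalizing with Equation (12), and returning the weighted average; the scoring parameters are illustrative placeholders:

```python
import numpy as np

def attention_pool(H, w_a, b):
    """H is a (T, d) sequence of hidden states. Scores each step
    (Eq. (11)), normalizes with softmax (Eq. (12)), and returns the
    attention-weighted average of the inputs."""
    e = np.tanh(H @ w_a + b)    # alignment scores e_t
    a = np.exp(e - e.max())     # numerically stable softmax
    a = a / a.sum()             # attention weights a_t
    return a @ H                # weighted average of the inputs

T, d = 60, 8                    # e.g., a 60-step look-back window
H = np.random.default_rng(1).normal(size=(T, d))
context = attention_pool(H, np.ones(d) / d, 0.0)
```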
The ARIMA(p, d, q) model forecasts a series as a linear combination of its own lagged values and lagged forecast errors:

ŷt = µ + ϕ1 yt−1 + . . . + ϕp yt−p + θ1 et−1 + . . . + θq et−q (13)

where ŷt represents the forecasted value, µ is the mean term, ϕ1 , . . ., ϕp are the autoregressive coefficients, yt−1 , . . ., yt−p are the lagged values of the series, θ1 , . . ., θq are the moving average coefficients, and et−1 , . . ., et−q are the lagged forecast errors.
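For illustration, a model of this form can be fitted with statsmodels; the (p, d, q) = (5, 1, 0) order below is an assumed example rather than the configuration tuned in this study.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic placeholder series; in practice this is a stock's closing prices.
prices = pd.Series(np.cumsum(np.random.default_rng(2).normal(size=500)) + 100)

model = ARIMA(prices, order=(5, 1, 0))   # p = 5 lags, d = 1 difference, q = 0
fitted = model.fit()
forecast = fitted.forecast(steps=30)     # 30-day-ahead forecast
print(forecast.head())
```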
Quadir et al. [28] propose a novel optimization approach for stock price forecasting employing a multi-layered sequential LSTM. The findings show that the improved LSTM model outperforms more conventional approaches in terms of prediction, especially when it comes to identifying the complex patterns of stock price fluctuations. The model's reliance on substantial computational resources and extensive, high-quality datasets is one of the study's acknowledged potential shortcomings.
A dedicated recurrent neural network (RNN) architecture designed for time series
data is introduced by Lu and Xu [29] with a focus on stock price prediction. The temporal
relationships and non-linear patterns of financial data, among other difficulties, are well
handled by the TRNN model. The TRNN outperforms conventional RNN architectures
in prediction accuracy and processing overhead by incorporating approaches specific to time series data. This paper makes a strong argument for the use of sophisticated RNN models
in stock price prediction by highlighting the significance of customized neural network
designs in financial forecasting.
Yahoo Finance provides market data for stocks, indices, currencies, and options worldwide. In addition, it offers extensive historical data, some going back
many decades. This is especially useful for long-term financial analysis and historical
research. We utilized stock data from Apple (AAPL), Amazon (AMZN), Google (GOOG),
and Microsoft (MSFT) from the past ten years, sourced from the Yahoo Finance database.
These stocks were chosen to leverage the findings of this study in building effective price
forecasting algorithms to aid investment decisions. Exploratory data analysis (EDA) will
be employed to gain a better understanding of the basic characteristics and nature of the
collected dataset, including data visualization.
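A minimal sketch of this collection step; the yfinance package is used here for illustration (the study retrieved the same fields through Python's connection to the Yahoo Finance API), with the tickers and date range taken from the study:

```python
import yfinance as yf

TICKERS = ["AAPL", "AMZN", "GOOG", "MSFT"]

# Daily open, high, low, close, adjusted close, and volume per ticker.
data = yf.download(TICKERS, start="2013-01-01", end="2022-03-30",
                   auto_adjust=False, group_by="ticker")

aapl_close = data["AAPL"]["Close"]       # closing prices for one stock
print(aapl_close.describe())
```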
To prepare the data for modeling, the features are normalized. (i) Min-Max normalization: this method rescales the data to the range [0, 1] and can be computed as the following:

x′ = (x − min( x )) / (max( x ) − min( x )) (14)
where x is the original value, and min(x) and max(x) are the minimum and maximum
values in the dataset, respectively.
(ii) Mean normalization: this method adjusts the data based on the mean and can be
computed as the following:
x′ = (x − average( x )) / (max( x ) − min( x )) (15)
(iii) Z-score normalization: this method standardizes the data using the mean and standard deviation of the feature:

x′ = (x − µ) / σ (16)

where µ and σ are the mean and standard deviation of the feature values, respectively. The models are evaluated with the Root Mean Square Error (RMSE), computed as

RMSE = √( (1/N) ∑Ni=1 (yi − ŷi )2 ) (17)

where yi and ŷi are the actual and forecasted values, respectively, and N is the total number of observations.
MAE = (1/N) ∑Nj=1 |xj − x̂j | (18)
R2 = 1 − ( ∑Ni=1 (yi − ŷi )2 ) / ( ∑Ni=1 (yi − ȳ)2 ) (19)

where ȳ is the mean of the actual values.
MAD = (1/N) ∑Nj=1 1[ sign(xj − xj−1 ) = sign( x̂j − x̂j−1 ) ] (20)

where N is the total number of observations (trading days), 1[·] is the indicator function, and xj and x̂j are the actual and forecast values, respectively.
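A sketch of how these evaluation metrics can be computed with NumPy; the directional measure follows Equation (20) as reconstructed above:

```python
import numpy as np

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))      # Eq. (17)

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))              # Eq. (18)

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot                   # Eq. (19)

def directional_accuracy(y, y_hat):
    """Share of days where the forecast moves in the same direction
    as the actual series, cf. Eq. (20)."""
    return np.mean(np.sign(np.diff(y)) == np.sign(np.diff(y_hat)))

y = np.array([10.0, 10.5, 10.2, 10.8, 11.0])       # actual prices
y_hat = np.array([10.1, 10.4, 10.3, 10.6, 11.1])   # forecasts
print(rmse(y, y_hat), mae(y, y_hat), directional_accuracy(y, y_hat))
```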
The methodology framework is developed as follows (Figure 6). In summary, we start
with data collection, gathering historical stock price data and financial indicators from Ya-
hoo Finance for companies such as Apple, Amazon, Google, and Microsoft. This is followed
by data exploration, where we perform exploratory data analysis (EDA) to understand the
dataset’s characteristics and trends. Next, we prepare the data by splitting it into training
(80%) and testing (20%) sets, employing k-fold cross-validation to ensure robustness. Pre-
processing and normalization are then applied, using techniques like Min-Max, Mean, and
Z-score normalization to make the data suitable for model training. For model construction,
we develop long short-term memory (LSTM) and Gated Recurrent Unit (GRU) models, as
well as XGBoost and Facebook Prophet, for machine learning approaches to predict future
stock prices. The models’ performance is evaluated using metrics such as Mean Absolute
Error (MAE), Root Mean Square Error (RMSE), Mean Directional Accuracy (MDA), and the
coefficient of determination (R2 ) to assess accuracy and effectiveness. Finally, we conduct
a risk–return tradeoff analysis to examine the predicted stock prices in terms of risk and
return, aiding investment decisions. This comprehensive and systematic approach ensures
the development and evaluation of effective stock price prediction models, enhancing the
accuracy of financial forecasts and supporting informed investment choices.
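A minimal sketch of the preparation steps just described, assuming a chronological 80/20 split, Min-Max scaling fitted on the training portion only, and 60-step look-back windows:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def make_windows(series, look_back=60):
    """Turn a 1-D series into (samples, look_back) inputs and
    next-step targets."""
    X = np.stack([series[i - look_back:i] for i in range(look_back, len(series))])
    y = series[look_back:]
    return X, y

# Synthetic placeholder prices; in practice these come from Yahoo Finance.
prices = np.cumsum(np.random.default_rng(3).normal(size=1000)) + 100
split = int(len(prices) * 0.8)                 # 80% train, 20% test

scaler = MinMaxScaler()                        # Min-Max normalization, Eq. (14)
train = scaler.fit_transform(prices[:split, None]).ravel()
test = scaler.transform(prices[split:, None]).ravel()   # no leakage from test

X_train, y_train = make_windows(train)
X_test, y_test = make_windows(test)
```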
We now proceed to evaluate its performance through comprehensive numerical results
and analyses in the next sections.
Figure 7. The architecture diagram for processing and analyzing data.
Input Layer: Time series data, such as stock prices over a given look-back period (e.g., 60 time steps), make up the input data. A minimum of 4 GB memory is expected.

LSTM Layer (128 Units): With 128 units, the LSTM layer is the first hidden layer. The LSTM layer keeps a memory of prior inputs over numerous time steps, which allows it to identify long-term dependencies in the time series data. This aids in seeing patterns that might not be obvious at first but are essential for precise forecasting.

GRU Layer (128 Units): After that, the output of the LSTM layer is sent to a GRU layer, which has 128 units as well. Similar to the LSTM, the GRU layer may still capture dependencies across time, yet it is more computationally efficient overall. By combining the benefits of both recurrent unit types, the LSTM and GRU layers improve the model's capacity to identify intricate temporal patterns in the data.

Dense Layer (64 Units): The following layer consists of 64 units and is dense (completely connected). In order to create a more condensed representation that will be utilized to produce the final prediction, this layer processes the output from the GRU layer.

Dense Layer (32 Units): The features retrieved by the earlier layers are further refined by a second dense layer with 32 units, which reduces the amount of data that will go into the final forecast.

Output Layer: The prediction, usually the stock price for the following time step or day, is provided by the output layer, which is the last layer. This layer has a linear activation function, which is common for regression tasks like stock price prediction, and includes a single unit for predicting a single value.
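A minimal Keras sketch of the architecture described above (a 128-unit LSTM feeding a 128-unit GRU, then 64- and 32-unit dense layers and a single linear output unit); the 0.2 dropout rate and 60-step look-back follow the parameter settings reported below, while the dense-layer activations, optimizer, and loss are illustrative assumptions:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout

LOOK_BACK, N_FEATURES = 60, 1

model = Sequential([
    # The LSTM layer returns the full sequence so the GRU can consume it.
    LSTM(128, return_sequences=True, input_shape=(LOOK_BACK, N_FEATURES)),
    Dropout(0.2),
    GRU(128),                       # second recurrent layer
    Dropout(0.2),
    Dense(64, activation="relu"),   # condensed representation
    Dense(32, activation="relu"),   # further feature refinement
    Dense(1, activation="linear"),  # next-day price prediction
])

model.compile(optimizer="adam", loss="mse")
model.summary()
```

Training would then call model.fit on the windowed training data, with early stopping used to halt once the validation loss stops improving.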
4. Numerical Results
4.1. Data Preprocessing and Exploratory Analysis
This study collected daily historical stock datasets for Apple, Google, Microsoft, and
Amazon stocks using the API of Yahoo Finance. The selected stocks are from international
public companies traded at both NASDAQ and the NYSE. The time series data range from
1 January 2013 to 30 March 2022, encompassing 3775 trading days. The daily time series data
was downloaded automatically using Python’s connection to the Yahoo Finance API. Daily
open price, daily highest price, daily lowest price, daily close price, daily adjusted closing
price, and daily trading volume are all included in the dataset. Table 1 below presents the
description of the features provided in the datasets downloaded from Yahoo Finance.
Table 1. Description of the features in the datasets downloaded from Yahoo Finance.

FEATURE                 DESCRIPTION
OPEN PRICE              The price at which a stock was initially traded at the start of a trading day.
CLOSE PRICE             The last price of a stock in the last transaction on a given trading day.
HIGH PRICE              The highest price at which a stock traded on a specified trading day.
LOW PRICE               The lowest price at which a stock traded on a specified trading day.
ADJUSTED CLOSE PRICE    The close price adjusted to reflect dividends and splits.
TRADING VOLUME          The total number of shares/contracts traded on a given trading day.
The closing data were used to compute the daily returns for each technological stock
used to train the models. The most straightforward and obvious way to understand the
stock trend is through the characteristics of the stock price. Compared to the absolute value
of stock prices, price trend returns are more effective in stock forecasting. Different stocks
have different base prices, leading to large variations in absolute stock price values. Using
daily returns reduces the prediction's sensitivity to the price base.
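A one-step pandas sketch of this daily-return computation (the closing-price series below is a synthetic placeholder):

```python
import numpy as np
import pandas as pd

# Placeholder closing prices; in the study these come from Yahoo Finance.
close = pd.Series(np.cumsum(np.random.default_rng(4).normal(size=252)) + 150)

daily_returns = close.pct_change().dropna()   # (P_t - P_{t-1}) / P_{t-1}
print(daily_returns.mean(), daily_returns.std())
```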
The standard deviations of the features, especially for the adjusted closing price and trading volume, indicate that the stock prices and trading volumes of these companies were more volatile during the reporting period.

For the training and testing of the model, the data were split into training and test sets, with 80% of the total data used for training the model and the remaining 20% used for testing. Figure 8 below shows the closing price line chart for the selected technological stocks, providing a quick overview of the collected data.
Figure 8. The closing price of the technological stock prices selected.
Table 2. Parameter settings for the LSTM and GRU models.

PARAMETER                  VALUES
NODES WITHIN INPUT LAYER   look-back period × input features
STEPS                      2026, with an early stopping criterion (patience of 1 epoch)
BATCH SIZE                 1
HIDDEN LAYER               1 LSTM/GRU layer with 128 units
DROPOUT LAYER              0.2 dropout rate
LOOK-BACK PERIOD           60
OUTPUT LAYER               1
Table 3. Performance of the LSTM and GRU models for the selected technology stocks.

MODELS  RMSE      MAE      R2      MAD       TRAINING TIME (SECS)  STOCK
LSTM    9.1463    7.8058   0.7609  6.0689    54                    Apple
GRU     3.4273    6.5298   0.8229  6.7607    51                    Apple
LSTM    103.5552  61.7088  0.4679  107.5323  57                    Google
GRU     67.4582   35.1966  0.7742  115.7959  52                    Google
LSTM    32.3734   31.045   0.6998  13.8247   63                    Microsoft
GRU     8.0805    5.2005   0.8319  6.6803    61                    Microsoft
LSTM    116.948   5.8673   0.7479  191.7697  64                    Amazon
GRU     82.9599   3.8673   0.8731  182.9617  61                    Amazon
Table 4. Comparison of Model Performance Against State-of-the-Art Methods Reported in the Literature.
Figure 9. Apple stock: actual and predicted close price, LSTM and GRU models.
4.3.2. Google Stock Prediction
Using the daily historical stock datasets for Google Inc. (Mountain View, CA, USA),
along with technical indicators, the performance of the LSTM and GRU models was
assessed using MAE, RMSE, MAD, and R2 .
From Table 3, it can be observed that the GRU model makes more accurate stock
price predictions for Google, with lower RMSE and MAE values (67.4582 and 35.1966,
respectively), compared to the LSTM model (103.5552 and 61.7088, respectively). The R2
value is also higher for the GRU model (0.7742) than for the LSTM model (0.4679). The
GRU model also has a shorter training time. Figure 10 below depicts the pattern of the
actual closing prices and predicted closing prices for the LSTM and GRU models.
Figure 10. Google stock: actual and predicted close, LSTM and GRU models.
Figure 11. Microsoft stock: actual and predicted close price, LSTM and GRU models.
Figure 12. Amazon stock: actual and predicted close price, LSTM and GRU models.
4.4. Predicted Risk–Return Tradeoff

A risk–return tradeoff plot was created to link the predicted stock prices from the GRU model with effective decision-making. This plot visually represents the model's performance by connecting the risks with the predicted returns among the stock prices. It visualizes these tradeoffs for the four technology stocks considered in this study: Apple, Google, Microsoft, and Amazon. The risk–return tradeoff plot is presented in Figure 13 below.
Figure 13. Predicted risk–return tradeoff plot.
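A minimal sketch of how such a plot can be produced, pairing each stock's expected return (the mean of its predicted daily returns) with its risk (their standard deviation); the return series below are random placeholders rather than the GRU outputs of this study:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
# Placeholder predicted daily returns per stock.
predicted_returns = {name: rng.normal(loc=mu, scale=s, size=252)
                     for name, mu, s in [("Apple", 0.0010, 0.018),
                                         ("Google", 0.0006, 0.016),
                                         ("Microsoft", 0.0011, 0.017),
                                         ("Amazon", 0.0009, 0.022)]}

for name, r in predicted_returns.items():
    plt.scatter(r.std(), r.mean(), label=name)   # x = risk, y = expected return
plt.xlabel("Risk (std. dev. of predicted daily returns)")
plt.ylabel("Expected daily return")
plt.legend()
plt.show()
```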
As observed from the predicted risk–return tradeoff plot presented above, there is a positive relationship between risk and expected returns for each of the four technology stocks. Investing in Google stock is associated with the lowest risk and the lowest expected returns, while the risk associated with Amazon stock is higher than that of Apple stock. However, Apple stock predicted higher expected returns despite its lower risk compared with Amazon stock. This could be due to the high price of Amazon's stock, which might result in more price volatility. Apple stock's counter-intuitive scenario might result from market sentiment, Apple's strong financial performance, or the company's potential for future growth. It suggests that Apple could provide an attractive risk–return tradeoff for investors. Microsoft stock is also associated with lower risk but considerably higher expected returns when compared with Google stock. This might reflect investor confidence in Microsoft's business model, its diverse range of offerings, and its solid financial performance. The risk–return tradeoff chart shows that the risk and return profiles of different stocks vary even within the same industry. The analysis provides decision-makers with an effective tool to align their investment decisions with their risk appetite and return expectations.

5. Additional Experiments and Validations

Our objective is to identify the most accurate model for each of the four stock prices.
technology
The primary goal of this analysis is to construct reliable and precise forecasting models
specifically tailored for short- to medium-term predictions. To ensure the consistency and
reliability of these models, we conducted a validation exercise over a 30-day time horizon,
starting from 1 January 2023. This timeframe simulates the intended use of these models in
real-world scenarios, allowing us to assess their practical effectiveness and suitability for
forecasting stock prices in a reasonable timeframe. The results are summarized in Table 5.
XGBoost demonstrated a relatively strong performance, with low RMSE scores, suggesting its effectiveness in capturing Microsoft's stock price trends. ARIMA, on the other hand,
exhibited a higher RMSE, indicating some difficulty in accurately predicting the stock price
movements. With the highest RMSE, Facebook Prophet has faced substantial challenges in
providing accurate forecasts for Microsoft stock.
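For illustration, a sketch of the XGBoost setup under the same windowing idea, regressing the next day's price on its lagged values over a held-out validation horizon; the hyperparameters and data are assumed placeholders, not the tuned values:

```python
import numpy as np
from xgboost import XGBRegressor

def lag_features(series, n_lags=60):
    """Build (samples, n_lags) inputs and next-step targets."""
    X = np.stack([series[i - n_lags:i] for i in range(n_lags, len(series))])
    return X, series[n_lags:]

prices = np.cumsum(np.random.default_rng(6).normal(size=1000)) + 100
X, y = lag_features(prices)
split = int(len(X) * 0.8)                # hold out the most recent window

model = XGBRegressor(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X[:split], y[:split])
preds = model.predict(X[split:])
print("validation RMSE:", np.sqrt(np.mean((preds - y[split:]) ** 2)))
```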
The technology sector has been a key driver of economic growth. The potential for improved technology stock price forecasting to enhance market efficiency and capital allocation is a compelling prospect.
6. Discussions
6.1. Contributions
In this paper, we make three major contributions.
First, we developed machine learning (ML) frameworks for social, economic, and
demographic prediction, as we have developed ML models to perform accurate analysis
and predictions for selected stock prices and the risks associated with them. Modeling
stock prices provides crucial insights into the dynamics of financial markets, with profound
implications for the economy and society at large. Stock markets essentially represent public
expectations of corporate growth and economic health. Advanced forecasting fuels data-
driven decision-making, risk assessment, and policy actions that shape social outcomes [33].
A second contribution is our use of big data and data sources for digital and computa-
tional analysis, since we have used a big data approach to analyze stock market prices and
predictions and investigate their relations to the US market and its economy. This research
implemented machine learning on an extensive dataset of 3775 daily observations across
four major technology stocks over ten years. The data-intensive modeling approaches
demonstrate the power of modern computational statistics to uncover complex patterns in
economic time series data [34].
Third, we used deep learning for stock prediction, as the primary goal of this study
was to employ deep learning, AI, and machine learning methods like the recurrent neural
network (RNN) to accurately anticipate the pattern of future stock prices in the technology
sector. We used daily technology stock data and basic technical indicators and compared
LSTM and GRU models, which belong to the RNN family, to ascertain which of them is
more efficient in predicting stock prices of technology industries. To achieve this aim, this
study collected daily historical stock datasets for Apple, Google, Microsoft, and Amazon
stocks from the API of Yahoo Finance through the Python ‘pandas_datareader.data’ and
Yahoo Finance library. The stocks selected are for international public companies traded
at both NASDAQ and the NYSE. The time series data range from 1 January 2013 to
30 March 2022. Together, the series contains 3775 trading days. The daily time series data
were downloaded automatically through Python's connection to the Yahoo Finance API.
The dataset includes daily prices: open, highest, lowest, close, adjusted close, and daily
trading volume.
The study applied deep learning models to analyze selected technological stock pat-
terns as a one-dimensional time series and forecast future stock prices by examining past
historical prices and the most critical technical indicators. The analysis built a compari-
son system to examine the performance of the LSTM and the GRU ensemble models on
the selected technology stock data to identify a parsimonious model for the real-world
representation of the technology stock markets.
The performances of the LSTM and GRU models were assessed using the Mean
Absolute Error (MAE), the Root Mean Square Error (RMSE), the Mean Absolute Deviation
(MAD), and the coefficient of determination (R2 ). From the results, it is observed that the
GRU model produces predictions closer to the actual values of the Apple, Google, Amazon, and Microsoft stocks, as the RMSE and MAE values are
considerably lower for the GRU model for all of the technology stocks than for the LSTM
model. Moreover, the model’s fit (R2 ) is observed to be better for the GRU model than
for the LSTM model. It was also observed from our analysis that the GRU model has a
shorter training time than the LSTM model. Therefore, the GRU model produced a better
forecasting system for predicting daily technology stock data and fundamental technical
indicators. It can be used to efficiently estimate the pattern of future stock prices within the
technology industry.
Electronics 2024, 13, 3396 23 of 27
Lastly, this study linked the predicted stock from the GRU model with effective
decision-making. The risk–return tradeoff plot was computed as a visual depiction of the
model performance to connect risks from predicted returns among the technological stock
prices, and it can be observed that there is a positive relationship between risk and expected
returns for each of the four technology stocks considered in this study. Investing in Google
stock is associated with the lowest risk and lowest expected returns. The risk associated with Amazon stock is higher than that of Apple stock; however, Apple stock predicted higher expected returns despite its lower risk compared with Amazon stock.
Microsoft stock is also associated with lower risk, but considerably higher expected returns
compared with Google stocks.
The present study has several contributions. Firstly, our study focuses on the technol-
ogy sector, comparing LSTM and GRU models specifically for technology stocks like Apple,
Google, Microsoft, and Amazon. This sector-specific analysis reveals unique patterns that
are not seen in broader market studies, providing more useful insights for technology
investors. Additionally, we evaluate not only the accuracy but also the training efficiency
of the models, offering practical insights into their computational performance. Our study
also includes a risk–return analysis based on predicted stock prices, giving practical insights
into investment strategies. These points highlight the unique aspects of our approach and
the significant contributions of our work.
While our study focuses on predicting stock prices in the technology sector, the ad-
vanced deep learning (DL) and machine learning (ML) models we employ have broad
applicability across various industries. For example, these models can predict patient out-
comes and optimize treatment plans in the healthcare sector. The energy sector can utilize
our solution to forecast consumption patterns and optimize grid operations. In retail, our
models can forecast sales and manage inventory. Financial institutions can leverage these
techniques for credit scoring, fraud detection, and risk management. By demonstrating the
versatility of our solution, we highlight its potential to address diverse challenges across
different sectors, underscoring the broad impact and utility of our approach.
7. Conclusions
This study has made important contributions to financial market analysis by utilizing
cutting-edge machine learning and deep learning techniques. We successfully designed ML
models that predict stock prices with precision and assess associated risks, offering critical
insights into the factors that shape financial markets and, by extension, the broader economy.
By applying big data approaches to analyze extensive historical stock data, we showcased
the effectiveness of modern computational methods in revealing intricate patterns within
economic datasets. Our comparison of LSTM and GRU models demonstrated that the GRU
model excels in both prediction accuracy and computational efficiency, particularly within
the technology sector. Additionally, the study’s analysis of the risk–return relationship
provided actionable insights for investors, highlighting the distinct behaviors of major
technology stocks. These findings not only enhance our understanding of stock market
dynamics but also provide a strong foundation for future research in financial forecasting
and investment strategy development.
Author Contributions: Conceptualization, V.C. and A.C.; methodology, V.C.; software, A.C.; valida-
tion, V.C., Q.A.X. and H.W.; formal analysis, V.C., A.C. and H.W.; investigation, V.C. and Q.A.X.; re-
sources, V.C.; data curation, A.C.; writing—original draft preparation, V.C. and A.C.; writing—review
and editing, V.C., Q.A.X. and H.W.; visualization, V.C. and A.C.; supervision, V.C.; project administra-
tion, V.C.; funding acquisition, V.C. All authors have read and agreed to the published version of
the manuscript.
Funding: This work is partly supported by VC Research (VCR 000221) and Leverhulme Trust
(VP1-2023-025).
Data Availability Statement: Data are available upon request. Readers can also download data from
Yahoo Finance or Google Finance.
Acknowledgments: We thank Akash Prasad and Akram Dehnokhalaji for spending some time to
help improve part of this research project.
Conflicts of Interest: The authors declare no conflicts of interest.
References
1. Ariyo, A.A.; Adewumi, A.O.; Ayo, C.K. Stock Price Prediction Using the ARIMA Model. In Proceedings of the 2014 UKSim-AMSS
16th International Conference on Computer Modelling and Simulation, Cambridge, UK, 26–28 March 2014; pp. 106–112.
2. Nicholas Refenes, A.; Zapranis, A.; Francis, G. Stock performance modeling using neural networks: A comparative study with
regression models. Neural Netw. 1994, 7, 375–388. [CrossRef]
3. Malki, A.; Atlam, E.-S.; Hassanien, A.E.; Ewis, A.; Dagnew, G.; Gad, I. SARIMA model-based forecasting required number of
COVID-19 vaccines globally and empirical analysis of peoples’ view towards the vaccines. Alex. Eng. J. 2022, 61, 12091–12110.
[CrossRef]
4. Paliari, I.; Karanikola, A.; Kotsiantis, S. A comparison of the optimized LSTM, XGBOOST and ARIMA in Time Series forecasting.
In Proceedings of the 2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA), Chania,
Crete, Greece, 12–14 July 2021; pp. 1–7.
5. Cheng, Y.; Yi, J.; Yang, X.; Lai, K.K.; Seco, L. A CEEMD-ARIMA-SVM model with structural breaks to forecast the crude oil prices
linked with extreme events. Soft Comput. 2022, 26, 8537–8551. [CrossRef]
6. Chatterjee, A.; Bhowmick, H.; Sen, J. Stock Price Prediction Using Time Series, Econometric, Machine Learning, and Deep
Learning Models. In Proceedings of the 2021 IEEE Mysore Sub Section International Conference (MysuruCon), Hassan, India,
24–25 October 2021; pp. 289–296.
7. Escudero, P.; Alcocer, W.; Paredes, J. Recurrent Neural Networks and ARIMA Models for Euro/Dollar Exchange Rate Forecasting.
Appl. Sci. 2021, 11, 5658. [CrossRef]
8. Liang, F.; Liang, F.; Zhang, H.; Zhang, H.; Fang, Y.; Fang, Y. The Analysis of Global RMB Exchange Rate Forecasting and Risk
Early Warning Using ARIMA and CNN Model. J. Organ. End User Comput. (JOEUC) 2022, 34, 1–25. [CrossRef]
9. Rao, A.R.; Reimherr, M. Modern non-linear function-on-function regression. Stat. Comput. 2023, 33, 130. [CrossRef]
10. Thakkar, A.; Chaudhari, K. A comprehensive survey on deep neural networks for stock market: The need, challenges, and future
directions. Expert Syst. Appl. 2021, 177, 114800. [CrossRef]
11. White, H. Economic prediction using neural networks: The case of IBM daily stock returns. In Proceedings of the IEEE 1988
International Conference on Neural Networks, San Diego, CA, USA, 24–27 July 1988; Volume 452, pp. 451–458.
12. Sonkiya, P.; Bajpai, V.; Bansal, A. Stock price prediction using BERT and GAN. arXiv 2021, arXiv:2107.09055.
13. Maqsood, H.; Mehmood, I.; Maqsood, M.; Yasir, M.; Afzal, S.; Aadil, F.; Selim, M.M.; Muhammad, K. A local and global event
sentiment based efficient stock exchange forecasting using deep learning. Int. J. Inf. Manag. 2020, 50, 432–451. [CrossRef]
14. Patil, P.; Wu, C.-S.M.; Potika, K.; Orang, M. Stock Market Prediction Using Ensemble of Graph Theory, Machine Learning
and Deep Learning Models. In Proceedings of the 3rd International Conference on Software Engineering and Information
Management, Sydney, NSW, Australia, 12–15 January 2020; pp. 85–92.
15. Hochreiter, S. Untersuchungen zu Dynamischen Neuronalen Netzen. Diploma Thesis, Technische Universität München,
München, Germany, 1991.
16. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw.
1994, 5, 157–166. [CrossRef]
17. Hochreiter, S. The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions. Int. J. Uncertain.
Fuzziness Knowl. Based Syst. 1998, 6, 107–116. [CrossRef]
18. Schmidhuber, J.; Wierstra, D.; Gagliolo, M.; Gomez, F. Training recurrent networks by Evolino. Neural Comput. 2007, 19, 757–779.
[CrossRef] [PubMed]
19. Chen, J.; Chaudhari, N.S. Segmented-Memory Recurrent Neural Networks. IEEE Trans. Neural Netw. 2009, 20, 1267–1280.
[CrossRef]
20. Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; A Bradford Book; The MIT Press: Cambridge, MA, USA; London,
UK, 2018.
21. Cho, K.; van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning Phrase Representations
using RNN Encoder–Decoder for Statistical Machine Translation. In Proceedings of the 2014 Conference on Empirical Methods in
Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
22. Jiang, L.; Subramanian, P. Forecasting of Stock Price Using Autoregressive Integrated Moving Average Model. J. Comput. Theor.
Nanosci. 2019, 16, 3519–3524. [CrossRef]
23. Kavzoglu, T.; Teke, A. Predictive Performances of Ensemble Machine Learning Algorithms in Landslide Susceptibility Mapping
Using Random Forest, Extreme Gradient Boosting (XGBoost) and Natural Gradient Boosting (NGBoost). Arab. J. Sci. Eng. 2022,
47, 7367–7385. [CrossRef]
24. Lilly Sheeba, S.; Neha, G.; Anirudh Ragavender, R.M.; Divya, D. Time Series Model for Stock Market Prediction Utilising Prophet.
Turk. J. Comput. Math. Educ. (TURCOMAT) 2021, 12, 4529–4534.
25. Yun, K.K.; Yoon, S.W.; Won, D. Interpretable stock price forecasting model using genetic algorithm-machine learning regressions
and best feature subset selection. Expert Syst. Appl. 2023, 213, 118803. [CrossRef]
26. Han, C.; Fu, X. Challenge and Opportunity: Deep Learning-Based Stock Price Prediction by Using Bi-Directional LSTM Model.
Front. Bus. Econ. Manag. 2023, 8, 51–54. [CrossRef]
27. Zhao, Y.; Yang, G. Deep Learning-based Integrated Framework for stock price movement prediction. Appl. Soft Comput. 2023,
133, 109921. [CrossRef]
28. Quadir, A.; Kapoor, S.; Junni, A.V.C.; Sivaraman, A.K.; Tee, K.F.; Sabireen, H.; Janakiraman, N. Novel optimization approach for
stock price forecasting using multi-layered sequential LSTM. Appl. Soft Comput. 2023, 134, 109830. [CrossRef]
29. Lu, M.; Xu, X. TRNN: An efficient time-series recurrent neural network for stock price prediction. Inf. Sci. 2024, 657, 119951.
[CrossRef]
30. Salah, S.; Alsamamra, H.R.; Shoqeir, J.H. Exploring Wind Speed for Energy Considerations in Eastern Jerusalem-Palestine Using
Machine-Learning Algorithms. Energies 2022, 15, 2602. [CrossRef]
31. Li, Z.; Yu, H.; Xu, J.; Liu, J.; Mo, Y. Stock Market Analysis and Prediction Using LSTM: A Case Study on Technology Stocks. Innov.
Appl. Eng. Technol. 2023, 2, 1–6. [CrossRef]
32. Liu, Y. Analysis and forecast of stock price based on LSTM algorithm. In Proceedings of the 2021 IEEE International Conference on
Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI), Fuzhou, China, 24–26 September
2021; pp. 76–79.
33. Jabeur, S.B.; Mefteh-Wali, S.; Viviani, J.-L. Forecasting gold price with the XGBoost algorithm and SHAP interaction values. Ann.
Oper. Res. 2024, 334, 679–699. [CrossRef]
34. Kumar, G.; Jain, S.; Singh, U.P. Stock Market Forecasting Using Computational Intelligence: A Survey. Arch. Comput. Methods Eng.
2020, 28, 1069–1101. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.