Machine Learning Models in Finance
2023
DECLARATION
This research is our original work and has not been presented for a degree award in any other
university.
……………...
The research proposal has been submitted for examination with my approval as a university
supervisor.
Signature……………………………… Date…………………………………….
ACKNOWLEDGEMENT
We primarily thank the Almighty God for His grace throughout our research work and for enabling us to reach this stage in our careers.
It would not have been possible to complete this project proposal without the support of our families. Their care, understanding, prayers and continued support have enabled us to reach this far, and we are forever grateful.
It has also been a great honor and privilege to undergo training in this prestigious university.
We are also highly indebted to Dr. Susan Mwelu for her guidance and constant supervision
throughout the research project proposal process.
LIST OF ABBREVIATIONS
A/D Accumulation/ Distribution
AI Artificial Intelligence
AR AutoRegressive
MAE Mean Absolute Error
CONTENTS
DECLARATION
ACKNOWLEDGEMENT
LIST OF ABBREVIATIONS
CHAPTER ONE
INTRODUCTION
1.1 BACKGROUND OF THE STUDY
1.2 PROBLEM STATEMENT
1.3 JUSTIFICATION OF THE STUDY
1.4 OBJECTIVES
1.4.1 GENERAL OBJECTIVE
1.4.2 SPECIFIC OBJECTIVES
1.5 SCOPE OF THE STUDY
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
2.2 Theoretical review
2.3 Critiques of the existing literature relevant to the study
2.4 Summary
2.5 Research gaps
CHAPTER THREE
METHODOLOGY
3.1 Introduction
3.2 Research Design
3.3 The target population
3.4 Sampling techniques and illustration
3.5 The Instruments
MODELS
3.5.1 Long Short Term Memory
3.6 Data collection
3.6.1 Variables description
3.7 Process and evaluation
3.7.1 Process
3.7.2 Evaluation
REFERENCES
WORK PLAN
Gantt chart
CHAPTER ONE
INTRODUCTION
1.1 BACKGROUND OF THE STUDY
Fundamentally, a stock market brings together buyers and sellers dealing in stocks or shares. The stock market is known to be volatile, random and unpredictable due to its continuously changing stream of data. Stock data is a sequence of prices for a given stock recorded at equal time intervals, i.e., time-series data. Stock traders focus mainly on accurate prediction of share prices in order to maximize profit. According to Cao, Tay and Yao (2019), deep learning for stock prediction refers to the application of deep neural networks, a subset of machine learning algorithms, to forecasting stock prices or making investment decisions in the financial markets.
Globally, stock markets are increasingly adopting deep learning techniques for stock prediction.
With digital technologies and vast financial data, researchers and practitioners are exploring deep
learning's potential to uncover patterns and improve prediction accuracy. From Wall Street to
financial hubs in Europe and Asia, deep learning is enhancing stock market forecasting, enabling
informed investment decisions and risk management strategies.
Africa is a diverse continent with different countries and economies, each with its unique
perspective on stock markets. Predicting stock markets in Africa, like in any region, presents
challenges and uncertainties. Key factors to consider when discussing the outlook of African stock
markets include economic growth, commodity prices, the political and regulatory environment,
investor sentiment, capital market development, and regional integration. In recent years, Africa's
financial landscape has experienced rapid evolution, with stock exchanges emerging as crucial
drivers of economic growth and investment opportunities. Deep learning for stock prediction is
gradually gaining attention within the African context. As African stock exchanges embrace
digitization and technological advancements, researchers and analysts are exploring the
application of deep learning algorithms to predict stock prices, identify market trends, and enhance
investment strategies.
The stock market in Kenya refers to the Nairobi Securities Exchange (NSE), which is the primary
securities exchange in the country. The NSE plays a crucial role in facilitating the buying and
selling of securities, including stocks, bonds, and other financial instruments. Shareholders, however, do not execute trades directly, nor do buyers and sellers meet to negotiate sales. They trade by giving instructions to their Stockbrokers, who in turn execute orders by automatically matching mutually agreeable prices through the NSE trading software.
Stockbrokers are the only authorized agencies that can act on behalf of investors in the stock trade
business. Stockbrokers do not just execute client orders, they are also charged with the
responsibility of advising clients on NSE trades (Government of Kenya, 2009). In the stock trade
business, investors, shareholders, traders and clients usually mean the same thing. These are the
entities that invest in the stock market by buying and selling stocks through Stockbrokers. In their
advisory role, some Stockbrokers base their advice on their observation of short term price
movements (trend). Other Stockbrokers do basic research into the fundamentals of the various
stocks or undertake technical analysis before they advise their clients on investment decisions.
However, none of these methods offers any assurance of profitability, and Stockbrokers usually state a caveat to that effect. While discussing the stock market, Graham (2003) describes technical approaches as those that generally urge investors to buy stocks because they are appreciating in value and to sell when their value declines. He calls this the 'popular' method, based on an immediate or short-term focus, and notes that many investors acquire common stocks for the mere 'excitement and temptation' of the market, a state of investing he describes as 'temperament'. Despite the popularity of technical analysis, he states that the approach is unreliable and akin to 'simple tossing of a coin' (Graham, 2003). Since none of the current methods of investment used by NSE Stockbrokers guarantees trade at a profit, investors can be said to be leaving important investment decisions to the mere temperament of Stockbrokers.

Despite this lack of investment advice that assures investors of profits, the sector holds roughly half as much wealth as bank deposits. It is therefore important to have carefully considered investment advice. The likelihood of financial losses due to inadequate or incorrect advice is detrimental to investors, especially the low-income citizens who may be investing their livelihoods in the stock exchange in the hope of some profit. Without some assurance of profit, potential investors have been reluctant to consider the NSE a serious investment sector. Lack of interest in this sector is therefore likely to stagnate the growth of the NSE and hence fail to help the country achieve the desired economic growth.

Stockbrokers need to be empowered with the capability to provide the best advice to their clients. Such empowerment should not only improve their reputation among clients but also benefit the brokers themselves, since they are paid a commission on each trade. If Stockbrokers do not have tools that can generate good advice, they are bound to fail in their duty, leading to apathy among investors. The current popular methods of technical analysis (using trends to guess future price movements) and fundamental analysis (a buy-and-hold principle for any company considered good) give no pointer to the exact price of a future trade. A system that can guide on the most likely next-day price is therefore missing and necessary, and such a predictive tool is desirable.

It is for this reason that there is a need to formulate a deep learning model that can be developed into a tool used by Stockbrokers to advise investors on exactly which stock to invest in, with a much better prospect of profitability. Such a tool should show not only the trend but also the most probable stock prices for the next day's trade. Additionally, once such a system has generated a shortlist of good stocks, it becomes possible to pick the most profitable of these stocks at any given time. The system should therefore be able to predict next-day events. Prediction is the foretelling of a future event or outcome before it occurs; trends (technical analysis), in contrast, would show many stocks moving in a particular direction without pinpointing the most profitable among them. A deep learning system can also provide a longer-term investment plan, so that the investor knows which stocks to buy and which ones to sell at any particular time in the future (prediction).

Deep learning techniques such as LSTM and GRU are computer algorithms formulated using specific AI rules to learn from data and then be used for tasks such as prediction. The emergence of deep learning, a subset of machine learning, has revolutionized stock prediction. The main goal is to study and apply deep learning techniques to the stock market in order to predict stock behavior and thus act on those predictions to reduce investment risk and generate profit. Deep learning models, particularly LSTM and GRU, have shown exceptional capabilities in capturing complex patterns and non-linear relationships.
platform, offering influential advice and executing trades. By developing deep learning tools to
assist Stockbrokers in providing predictive advice to clients, investors are more likely to make
informed trading decisions. Assured returns on investment will attract more investors to participate
in the Nairobi Securities Exchange (NSE), contributing to its growth and the overall economic
development of Kenya.
The NSE plays a significant role in the country's economic growth by facilitating the movement
of funds across various sectors. Investment in listed firms supports their development and
improves the savings culture among citizens. Stocks provide an opportunity for citizens to profit,
enhance their purchasing power, and actively participate in economic activities. This project also
bridges the gap between theoretical considerations in Information and Communications
Technology (ICT) systems and their practical implementation in solving real world financial
problems. It encourages the academic community to apply ICT to tackle financial problems.
1.4 OBJECTIVES
different market conditions. The study's outcomes could provide valuable insights for investors
and financial analysts, contributing to the growing field of deep learning applications in finance.
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
This literature review explores the theoretical and empirical framework of using deep learning
techniques for stock market prediction. It examines various studies, methodologies, and models
employed in the field of deep learning for predicting stock market trends. The review focuses on
the theoretical and empirical underpinnings of deep learning algorithms, their application in stock
market prediction, and the challenges associated with this approach. The findings contribute to a comprehensive understanding of the theoretical and empirical foundations of deep learning techniques for stock market prediction and help to identify research gaps in past studies.
2.2 Theoretical review
The data used for stock price forecasting is typically time-series data: a chronological sequence of observations of a specific variable, in this case stock prices. Time series analysis helps identify patterns, trends, and cycles within the data.
Early knowledge of bullish or bearish market trends can be advantageous for making wise
investment choices, and recognizing patterns can aid in identifying the best-performing companies
for specific periods.
Several methods are used for stock price forecasting, including fundamental analysis, technical analysis, and time series forecasting.
Fundamental analysis estimates a company's share value by analyzing its financial indicators,
making it suitable for long-term predictions. Technical analysis, on the other hand, relies on
historical price data to identify current trends and is more suited for short-term forecasts.
Time series forecasting involves two main classes of algorithms: linear models and non-linear
models. Linear models, such as AR, ARMA, and ARIMA, fit mathematical models to univariate
time series data. However, these models have limitations, as they do not account for latent
dynamics in the data and fail to identify interdependencies among various stocks.
The Efficient Market Hypothesis (EMH) and Random Walk Theory
(RWT) are two significant theories used to understand stock market pricing movements. EMH
suggests that all publicly available information is already reflected in stock prices, making
profitable predictions impossible. RWT argues that past stock price movements cannot reliably
predict future values, as stock prices change independently and randomly.
Trading strategies form the basis for making predictions in financial markets, with technical
trading and fundamentals trading being the most popular approaches. Technical trading relies on analyzing past price movements to identify current
trends, while fundamentals trading focuses on a company's intrinsic value based on financial data.
Deep learning, a subfield of machine learning, has gained immense popularity for its ability to
automatically learn hierarchical representations from large datasets using neural networks with
multiple layers (Schmidhuber, 2015). These neural networks, inspired by the human brain's
structure and function, consist of interconnected nodes organized into layers. Each node processes
inputs through a weighted sum and an activation function to produce an output, allowing
information to flow through the network, eventually leading to the desired output.
The history of neural networks can be traced back to the work of McCulloch and Pitts (1943), who proposed an artificial neuron model based on logical calculus. In the following
decades, Rosenblatt's perceptron model established the foundation for single-layer neural networks
capable of linear separability.
The breakthrough that paved the way for modern deep learning was the development of the
backpropagation algorithm (Rumelhart, Hinton, & Williams, 1986). This algorithm enabled the
efficient training of multilayer perceptrons (MLPs) with multiple interconnected layers of neurons,
allowing the network to learn complex patterns.
To handle sequential data with temporal dependencies, Recurrent Neural Networks (RNNs) were
introduced. However, the standard RNN suffered from the vanishing gradient problem, hindering
the learning of long-term dependencies. The solution came with Long Short-Term Memory
(LSTM), proposed by Hochreiter and Schmidhuber (1997), which introduced memory cells and
gating mechanisms to capture and propagate information over long sequences, making it suitable
for tasks like speech recognition and language modeling.
An alternative to LSTM, called Gated Recurrent Units (GRUs), was proposed by Cho et al. (2014).
GRUs simplify the gating mechanism while achieving comparable performance. They strike a
balance between performance and computational efficiency and have found applications in various
domains.
These advancements in neural network architectures, along with optimization algorithms and
regularization techniques, have led to significant breakthroughs in diverse fields. CNNs excel in
computer vision, while RNNs, LSTM, and GRUs have made significant contributions to natural
language processing and speech recognition.
Deep learning has emerged as a powerful tool for solving complex problems across various
domains. The continuous evolution of neural network architectures and techniques continues to
drive the advancement of artificial intelligence and push the boundaries of what is possible in
machine learning.
2.3 Critiques of the existing literature relevant to the study
The use of technical analysis, particularly trending, as a basis for predicting stock market prices
has been explored in various markets. Ndiritu (2010) developed a support system for trending
future prices in the NSE. In different studies, researchers applied technical analysis to predict stock
prices in markets like the New York Stock Exchange (Deng, Mitsubuchi, Shioda, Shimada & Sakurai, 2011), the Tehran Stock Exchange (Aghababaeyan & TamannaSiddiqui, 2011), the Bangladesh Stock Market (Khan, Alin & Hussain, 2011), and the Australian Stock Market (Pan, Tilakaratne & Yearwood, 2005). While these studies achieved varying levels of accuracy, their tools
were not commercialized or targeted for specific stockbrokers. Following the interest in stock price
prediction, the focus shifted towards deep learning techniques.
Guo (2022) explored the application of ARIMA, GARCH, and LSTM models for stock price
prediction on the S&P 500 stock market. The LSTM model outperformed ARIMA and GARCH
models, showcasing its ability to capture long-term dependencies in stock data. However, the
limitations of the study included its focus on one stock market and the use of historical data. Further
research is necessary to validate the findings across different stock markets and time periods.
Bhattacharjee and Bhattacharja (2019) conducted a comparative study between statistical and
machine learning methods for stock price prediction using data from Tesla and Apple. The results
indicated that machine learning methods, especially multi-layer perceptron and long short-term
memory (LSTM), outperformed statistical methods in capturing non-linear relationships within
stock data. Although the study only used historical data and not future predictions, it provided a
clear explanation of the models and their evaluation metrics.
Hota, Chakravarty, Paikaray & Bhoyar (2020) examined the application of four machine learning
algorithms to predict the opening price of American Airlines stocks. The results demonstrated that
the random forest algorithm outperformed other algorithms, including support vector regression,
decision tree, and artificial neural network, in terms of predictive accuracy. The authors suggested
using more recent data and exploring advanced evolutionary techniques for improving artificial
neural networks. Despite the dataset's time limitation, the study offers valuable insights into the
use of machine learning algorithms for stock price prediction.
Priya and Geetha (2022) present a machine learning model based on the random forest algorithm
for predicting stock market prices. The model achieved an impressive 75% accuracy in its
predictions. The authors acknowledged the limitations of using a single stock market dataset and
the need for further exploration into other deep learning models and additional features.
Nevertheless, the paper demonstrates the effectiveness of the random forest algorithm for stock
price prediction and lays the groundwork for future improvements and research in the field. While
the model's implementation and results are commendable, the small dataset and lack of comparison
with other algorithms pose limitations that need to be addressed for further validation.
Gao, Wang and Zhou (2021) investigate the performance of LSTM and GRU models for stock
market forecasting. The study reveals that both models can effectively predict stock prices, with
LASSO dimension reduction generally producing better results than PCA. The authors provide
valuable insights into the use of dimension reduction techniques and practical recommendations
for stock market forecasting. However, the study's limitations include the evaluation on a single
dataset, calling for further research to assess the generalizability of the results to other datasets and
explore additional dimension reduction methods.
Shahi, Shrestha, Neupane and Guo (2020) compare LSTM and GRU models for stock market
forecasting. The authors employ a cooperative deep learning architecture, which results in LSTM
outperforming GRU with an average accuracy of 52.3% compared to 50.7%. Incorporating
financial news sentiment further improves accuracy to 53.2%. The paper's contributions include
the comparative study, demonstrating the impact of sentiment analysis, and proposing the
cooperative deep learning architecture. However, limitations include the focus on the S&P 500
index and the absence of comparison with other machine learning methods, calling for further
research in different stock markets and algorithm comparisons.
2.4 Summary
This literature review delves into the theoretical and empirical aspects of using deep learning
techniques for stock market prediction. It covers the challenges faced in stock market prediction,
the different methods used for forecasting, and the importance of time series analysis for
identifying patterns and trends. The review highlights the significance of deep learning,
particularly LSTM and GRU models, for handling sequential data with temporal dependencies.
The review discusses the history of neural networks, from early artificial neuron models to the
breakthrough of backpropagation, CNNs, and RNNs. It emphasizes the importance of LSTM and
GRU in capturing long-term dependencies, making them suitable for tasks like speech recognition
and language modeling.
Critiques of existing literature are addressed, with a focus on the use of technical and fundamental
analysis for stock price prediction. The limitations of linear models and the advantages of deep
learning models are discussed. The review also touches on the Efficient Market Hypothesis and
Random Walk Theory, which influence stock market pricing movements.
Several papers are examined in the review. "Stock Price Prediction Using Machine Learning"
showcases the effectiveness of LSTM over ARIMA and GARCH models, while "Stock Price
Prediction: A Comparative Study between Traditional Statistical Approach and Machine Learning
Approach" favors machine learning models, especially multi-layer perceptron and LSTM, over
statistical methods. "Stock Market Prediction Using Machine Learning Techniques" highlights the
superiority of the random forest algorithm for predicting opening prices of American Airlines
stocks.
Overall, the literature review provides a comprehensive understanding of the theoretical and
empirical foundations of deep learning techniques for stock market prediction. It emphasizes the
potential and effectiveness of deep learning models, while acknowledging the limitations and areas
for further research. The review contributes valuable insights to the field of stock market
forecasting and highlights the continuous evolution of neural network architectures in driving
advancements in artificial intelligence and machine learning.
models in real-time or near-real-time stock market prediction will be valuable for assessing their
practicality in live trading scenarios. By addressing these research gaps, the field of deep learning
for stock market prediction will be enriched, making it more applicable and reliable in real-world
financial contexts.
CHAPTER THREE
METHODOLOGY
3.1 Introduction
In this study, we will investigate the use of Long Short-Term Memory (LSTM) and Gated
Recurrent Unit (GRU) neural networks for stock market prediction. The methodology will involve
data collection with relevant features, preprocessing, and partitioning for training, validation, and
testing. LSTM and GRU architectures are designed to capture long-term dependencies, and hyperparameter tuning will be performed for optimal performance. The models will then be evaluated
using performance metrics and compared to identify their strengths and weaknesses in predicting
stock market prices. The study aims to provide valuable insights for investors, traders, and
researchers seeking advanced deep learning techniques to enhance their decision-making.
By following this methodology, the research aims to contribute valuable knowledge to
the use of deep learning techniques for forecasting financial markets.
[Figure: Sampling procedure. NSE stocks are first stratified by market sector (Agricultural; Automobiles and Accessories; Commercial and Services; Construction and Allied; Energy and Petroleum; Insurance; Investment; Manufacturing and Allied; Telecommunication; Real Estate Investment; Exchange Traded Funds; Investment Services). Random sampling based on stock performance then selects KCB Group and Equity Holdings Ltd from the Banking sector, and BAT and EABL from the Manufacturing sector.]
3.5 The Instruments
MODELS
3.5.1 Long Short Term Memory
The Long Short-Term Memory (LSTM) model comprises multiple LSTM cells, each equipped with three vital gates: an input gate, an output gate, and a forget gate. These gates control the flow of information within the model, with the input gate determining the relevant information passed on to the next cell. Equations (1) and (2) show the related formulas, where h_{t-1} is the output at the prior time step (t - 1) and X_t is the input at the current time step t, passed through the sigmoid function S(t). All W and b are weight matrices and bias vectors that need to be learned during the training process. The forget gate output f_t defines how much information will be remembered or forgotten. The input gate defines which new information to store in the cell state (Equations (3)-(4)): the value i_t determines how much new information the cell state should retain, while a tanh function produces a candidate vector to be added to the cell state, computed from the prior output h_{t-1} and the current input X_t. The cell state C_t then receives the updated information that must be added to it (Equation (5)). The output gate regulates the transfer of information to the subsequent layer, while the forget gate handles the retention or dismissal of information from previous time steps. The value of O_t lies between 0 and 1 and indicates how much of the cell state information is output (Equation (6)); h_t is the LSTM block's output at time t (Equation (7)). The cells are stacked in a feed-forward fashion, with the final output derived from the last LSTM cell. During training, optimization algorithms such as stochastic gradient descent (SGD) or Adam are used to adjust the model's weights, minimizing the error between predicted and actual outputs. Once trained, the LSTM model can make predictions on new data, exploiting its ability to capture long-term dependencies in tasks such as time series forecasting.
S(t) = \frac{1}{1 + e^{-t}}            (2)

C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t            (5)
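For completeness, the remaining gate equations referenced above (Equations (1), (3), (4), (6) and (7)) can be stated in their standard form. The following is a sketch assuming the usual LSTM formulation, with [h_{t-1}, X_t] denoting the concatenation of the previous output and the current input, S the sigmoid function of Equation (2), and W_f, W_i, W_C, W_o, b_f, b_i, b_C, b_o the weights and biases learned during training:

f_t = S(W_f \cdot [h_{t-1}, X_t] + b_f)            (1)

i_t = S(W_i \cdot [h_{t-1}, X_t] + b_i)            (3)

\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, X_t] + b_C)            (4)

O_t = S(W_o \cdot [h_{t-1}, X_t] + b_o)            (6)

h_t = O_t \times \tanh(C_t)            (7)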
The Gated Recurrent Unit (GRU) is a simplified version of the LSTM model, designed to capture
long-term dependencies in sequential data. It also uses gates to control the flow of information,
but it has only two gates: an update gate and a reset gate. The Gated Recurrent Unit (GRU) model
is a type of recurrent neural network comprising multiple GRU cells. Each GRU cell is equipped
with two fundamental gates: an update gate (𝑧𝑡 ) and a reset gate (𝑟𝑡 ). These gates allow the model
to control the flow of information and handle long-term dependencies in sequential data. The
update gate determines how much of the previous cell's hidden state should be retained and how
much of the new candidate state should be integrated. On the other hand, the reset gate controls
which portions of the past information to forget. Below are the equations for these gates
respectively:
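In their standard form (a sketch assuming the usual GRU formulation, presumably Equations (8) and (9) in the proposal's numbering; S is the sigmoid function and W_z, W_r, b_z, b_r are learned weights and biases), the update and reset gates are:

z_t = S(W_z \cdot [h_{t-1}, X_t] + b_z)            (8)

r_t = S(W_r \cdot [h_{t-1}, X_t] + b_r)            (9)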
The candidate hidden state represents the new information that could be added to the hidden state.
The hidden state at the current time step is a combination of the previous hidden state and the
candidate hidden state, controlled by the update gate.
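In the standard formulation (again a sketch, with W_h and b_h assumed to be learned parameters and \odot denoting element-wise multiplication), the candidate hidden state is computed as:

\tilde{h}_t = \tanh(W_h \cdot [r_t \odot h_{t-1}, X_t] + b_h)            (10)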
h_t = (1 - z_t) \times h_{t-1} + z_t \times \tilde{h}_t            (11)
where z_t is the update gate, r_t is the reset gate, \tilde{h}_t is the candidate hidden state, and h_{t-1} and h_t are the hidden states at the previous and current time steps, respectively.
Like LSTM, the GRU cells are stacked in a feed-forward manner, with the final output derived
from the last GRU cell. During training, optimization algorithms such as stochastic gradient
descent (SGD) or Adam are used to adjust the model's weights, minimizing the error between
predicted and actual outputs. Once trained, the GRU model is capable of making predictions on
new data, leveraging its ability to effectively capture long-term dependencies in tasks such as time
series forecasting.
3.6.1 Variables description
Historical price data refers to the time series of past stock prices for a given security. It provides
valuable information about the historical trends, patterns, and volatility in the stock market. By
incorporating historical price data as an input variable in LSTM and GRU models, deep learning
algorithms can learn from the historical price patterns and potentially capture complex
relationships between past price movements and future stock performance.
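As an illustration of how such a price series can be fed to LSTM or GRU models, the following Python sketch converts a series of closing prices into overlapping input windows and next-day targets; the column name "Close", the file name and the 60-day window length are illustrative assumptions rather than choices fixed by this proposal.

import numpy as np
import pandas as pd

def make_windows(prices: pd.Series, window: int = 60):
    """Turn a price series into (samples, window, 1) inputs and next-day targets."""
    values = prices.to_numpy(dtype=float)
    X, y = [], []
    for end in range(window, len(values)):
        X.append(values[end - window:end])   # previous `window` closing prices
        y.append(values[end])                # the following day's closing price
    X = np.asarray(X).reshape(-1, window, 1)  # recurrent layers expect 3-D input
    return X, np.asarray(y)

# Hypothetical usage with a daily closing-price series:
# df = pd.read_csv("nse_stock.csv", parse_dates=["Date"], index_col="Date")
# X, y = make_windows(df["Close"], window=60)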
Macroeconomic variables
Gross Domestic Product (GDP) reflects the overall economic health, and its growth impacts
various sectors and market optimism. Inflation rate affects consumer spending and corporate
profits, while interest rates influence borrowing costs and economic activity. By including these
macroeconomic variables in deep learning models, we will be able to identify complex relationships
between economic factors and stock market movements, leading to more informed and accurate
stock price predictions, despite the inherent challenges in financial markets. Changes in interest
rates can influence investment decisions. Lower interest rates can encourage borrowing and
investment, potentially boosting stock prices. Major stock market indices can act as general proxies
for market sentiment and overall market performance. LSTM and GRU models can learn patterns
and relationships between these indices and individual stocks to make more informed predictions.
Technical indicators
Technical indicators are commonly used in stock prediction models, including those based on
LSTM and GRU architectures. These indicators are mathematical calculations based on historical
stock price and volume data that provide insights into the market's momentum, trends, and
potential reversal points. Below are brief descriptions of some of these indicators:
Weighted 14-day moving average
Similar to the simple moving average, the weighted moving average assigns different weights to
each closing price within the period. This can be useful in capturing short-term price movements
with higher weights on recent prices, and long-term trends with lower weights on older prices.
Momentum
Momentum measures the difference between the current closing price and the closing price "n"
periods ago. Including momentum as an input variable can provide information about the strength
and direction of recent price movements, which can help the deep learning model identify potential
continuation or reversal patterns.
Stochastic K%
The Stochastic Oscillator, represented by %K, compares the current closing price to the highest
and lowest prices over the last "n" periods. %K represents the relative position of the current price
within this range. By including Stochastic %K as an input variable, the model can identify
overbought and oversold conditions, which may signal potential reversals in stock price trends.
Stochastic K\% = \frac{C_t - LL_{t-n+1}}{HH_{t-n+1} - LL_{t-n+1}} \times 100            (14)
Stochastic D%:
Stochastic %D is a smoothed version of Stochastic %K, providing a more stable signal for the deep
learning model. By using both %K and %D, the model can capture short-term and longer-term
trends in the stock's price movements.
Stochastic D\% = \frac{K_t + K_{t-1} + \cdots + K_{t-n+1}}{n} \times 100            (15)
RSI measures the magnitude of recent price gains and losses over the last "n-1" periods. It
quantifies the stock's momentum and can help the deep learning model identify potential price
reversal points and overbought/oversold conditions.
Relative Strength Index (RSI) = 100 - \frac{100}{1 + \sum_{i=1}^{n-1} UP_{t-i} \, / \, \sum_{i=1}^{n-1} DW_{t-i}}            (16)
Signal (n)
Signal (n)ₜ is used in the calculation of the Moving Average Convergence Divergence (MACD).
By including the signal line as an input variable, the model can better analyze MACD crossovers
and potential trend changes.
Signal(n)_t = MACD_t \times \frac{2}{n+1} + Signal(n)_{t-1} \times \left(1 - \frac{2}{n+1}\right)            (17)
Larry William’s R %
Larry William's R %, similar to Stochastic %K, indicates the position of the current price relative
to the highest and lowest prices over the last "n" periods. Including this indicator can provide
additional insights into overbought and oversold conditions.
The A/D oscillator considers the relationship between the high, low, and closing prices, providing
information about the accumulation or distribution of the asset. This can be helpful for the deep
learning model in identifying potential buying or selling pressure in the market.
Accumulation/Distribution (A/D) oscillator = \frac{H_t - C_t}{H_t - L_t}            (19)
CCI measures the current price relative to its average over "n" periods, taking into account the
mean absolute deviation. By including CCI as an input variable, the deep learning model can gain
insights into potential trend reversals and extreme price movements.
CCI (Commodity Channel Index) = \frac{M_t - SM_t}{0.015 \, D_t}            (20)
where LL_{t-n+1} and HH_{t-n+1} are the lowest low and the highest high prices in the last n days, respectively, and UP_t and DW_t denote the upward and downward price changes at time t, respectively.
EMA(k)_t = EMA(k)_{t-1} \times \left(1 - \frac{2}{k+1}\right) + C_t \times \frac{2}{k+1}            (21)

M_t = \frac{H_t + L_t + C_t}{3}            (23)

SM_t = \frac{\sum_{i=0}^{n-1} M_{t-i}}{n}            (24)

D_t = \frac{\sum_{i=0}^{n-1} |M_{t-i} - SM_t|}{n}            (25)
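To make some of these definitions concrete, the Python sketch below computes momentum, Stochastic %K, RSI and the A/D oscillator with pandas, following the equations above; the column names ("Close", "High", "Low") and the 14-day period are illustrative assumptions, not values prescribed by the proposal.

import pandas as pd

def momentum(close: pd.Series, n: int = 14) -> pd.Series:
    # Difference between today's close and the close n periods ago
    return close - close.shift(n)

def stochastic_k(close: pd.Series, high: pd.Series, low: pd.Series, n: int = 14) -> pd.Series:
    ll = low.rolling(n).min()    # lowest low over the last n days
    hh = high.rolling(n).max()   # highest high over the last n days
    return (close - ll) / (hh - ll) * 100           # Equation (14)

def rsi(close: pd.Series, n: int = 14) -> pd.Series:
    change = close.diff()
    up = change.clip(lower=0).rolling(n).sum()       # summed upward moves (UP)
    down = (-change).clip(lower=0).rolling(n).sum()  # summed downward moves (DW)
    return 100 - 100 / (1 + up / down)               # Equation (16)

def ad_oscillator(close: pd.Series, high: pd.Series, low: pd.Series) -> pd.Series:
    return (high - close) / (high - low)             # Equation (19)

# Hypothetical usage:
# df["RSI_14"] = rsi(df["Close"])
# df["StochK_14"] = stochastic_k(df["Close"], df["High"], df["Low"])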
3.7 Process and evaluation
3.7.1 Process
In this research, we aim to explore the application of deep learning models, specifically Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), for stock prediction. Before the data are used for training, a preprocessing step is essential. We employ data cleaning, the process of detecting inaccurate or irrelevant records in a dataset and then replacing, modifying, or deleting them. The interquartile range (IQR), a measure of statistical dispersion that is robust to outliers, is used to detect outliers and adjust the dataset accordingly, as sketched below.
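A minimal sketch of such IQR-based cleaning follows; the 1.5 x IQR fences and the choice to clip (rather than delete) offending values are common conventions assumed here, not requirements stated in the proposal.

import pandas as pd

def clip_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values lying outside the IQR fences back to the fence values."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1                       # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series.clip(lower=lower, upper=upper)

# Hypothetical usage on a closing-price column:
# df["Close"] = clip_outliers_iqr(df["Close"])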
Importantly, to prevent indicators with larger values from dominating those with smaller ones, the values of the technical indicators for all groups are normalized independently. Data normalization refers to rescaling numeric features into a 0 to 1 range and is employed in machine learning to make the training model less sensitive to the scale of the variables.

The data processing and analysis will involve several key steps. First, historical stock market data will be collected, including daily or intraday price and volume data, as well as relevant financial indicators. Next, the data will be preprocessed to handle missing values, normalize the features, and potentially perform feature engineering to extract relevant patterns. The dataset will then be split into training, validation, and testing sets to train and evaluate the LSTM and GRU models. The deep learning models will be implemented using appropriate libraries such as TensorFlow or PyTorch, and we will experiment with various hyperparameters and architecture configurations to optimize model performance; a sketch of such an implementation is given below. Additionally, backtesting and cross-validation techniques will be employed to assess the robustness of the predictive models. The research will conclude with a comprehensive analysis of the LSTM and GRU models' performance in predicting stock prices, evaluating their accuracy and precision, and the results will be discussed to draw meaningful conclusions regarding the feasibility and effectiveness of deep learning techniques in stock market prediction.
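The sketch below shows how the LSTM and GRU models might be defined and trained in TensorFlow/Keras; the layer size, dropout rate, window length, number of epochs and other hyperparameters are placeholders to be tuned during the study, not values fixed by this proposal.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(cell: str, window: int = 60, n_features: int = 1) -> tf.keras.Model:
    """Build a single-layer recurrent regressor; `cell` is either "lstm" or "gru"."""
    rnn = layers.LSTM if cell == "lstm" else layers.GRU
    model = models.Sequential([
        layers.Input(shape=(window, n_features)),
        rnn(64),              # recurrent layer capturing temporal dependencies
        layers.Dropout(0.2),  # regularization against overfitting
        layers.Dense(1),      # next-day price (regression output)
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

# Hypothetical training run on pre-split, normalized windows:
# lstm_model = build_model("lstm")
# lstm_model.fit(X_train, y_train, validation_data=(X_val, y_val),
#                epochs=50, batch_size=32)
# gru_model = build_model("gru")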
3.7.2 Evaluation
Evaluation Measures
Mean Absolute Percentage Error (MAPE) is often employed to assess the performance of prediction methods. MAPE is also used as a measure of prediction accuracy for forecasting methods in the machine learning area; it commonly presents accuracy as a percentage. Its equation is shown below:
MAPE = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|            (26)
where y_i is the actual value and \hat{y}_i is the forecast value. In the formula, the absolute value of the difference between the two is divided by the actual value y_i, summed over every forecast, and divided by the number of data points. Finally, the result is expressed as a percentage by multiplying by 100.
Mean Absolute Error (MAE) is a measure of the difference between two values: it is the average of the absolute differences between the predictions and the actual values. MAE is a common measure of prediction error for regression analysis in the machine learning area. The formula is shown below:
MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|            (27)
where y_i is the true value and \hat{y}_i is the predicted value. In the formula, the absolute differences are summed over every forecast and divided by n (the number of samples).
Root Mean Square Error (RMSE) is the standard deviation of the prediction errors in regression
work. Prediction errors or residuals show the distance between real values and a prediction model,
and how they are spread out around the model. The metric indicates how data is concentrated near
the best fitting model. RMSE is the square root of the average of squared differences between
predictions and actual observations. Relative Root Mean Square Error (RRMSE) is similar to
RMSE and this takes the total squared error and normalizes it by dividing by the total squared error
of the predictor model. The formula is shown below:
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}            (28)
where y_i is the observed value, \hat{y}_i is the predicted value and n is the number of samples.
The Mean Squared Error (MSE) measures the quality of a predictor; its value is always non-negative, and values closer to zero are better. The MSE is the second moment of the error (about the origin) and incorporates both the variance of the prediction model (how widely spread the predictions are from one data sample to another) and its bias (how far the average predicted value is from the observations). The formula is shown below:
MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2            (29)
where y_i is the observed value, \hat{y}_i is the predicted value and n is the number of samples.
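These four measures can be computed directly from the arrays of actual and predicted values; a minimal NumPy sketch follows (it assumes no actual value y_i is zero, since MAPE divides by y_i).

import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    err = y_true - y_pred
    mape = 100.0 / len(y_true) * np.sum(np.abs(err / y_true))  # Equation (26)
    mae = np.mean(np.abs(err))                                  # Equation (27)
    mse = np.mean(err ** 2)                                     # Equation (29)
    rmse = np.sqrt(mse)                                         # Equation (28)
    return {"MAPE": mape, "MAE": mae, "RMSE": rmse, "MSE": mse}

# Hypothetical usage after training:
# scores = evaluate(y_test, lstm_model.predict(X_test).ravel())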
REFERENCES
Aghababaeyan, R., & TamannaSiddiqui, N. (2011). Forecasting the Tehran Stock Market
by Artificial Neural Network. International Journal of Advanced Computer Science and
Applications, Special Issue on Artificial Intelligence.
Bhattacharjee, I., & Bhattacharja, P. (2019). Stock Price Prediction: A Comparative Study
between Traditional Statistical Approach and Machine Learning Approach. International Journal
of Computer Applications, 975(8887), 1-6.
Cao, Y., Tay, F. E. H., & Yao, L. (2019). Deep LSTM with Attention for Large Vocabulary
Language Modeling. Proceedings of the 2019 Conference of the Association for Computational
Linguistics (ACL 2019).
Cho, K., van Merrienboer, B., Bahdanau, D., & Bengio, Y. (2014). Learning phrase
representations using RNN encoder-decoder for statistical machine translation. arXiv preprint
arXiv:1406.1078.
Deng, S., Mitsubuchi, T., Shioda, K., Shimada, T., & Sakurai, A. (2011). Combining
Technical Analysis with Sentiment Analysis for Stock Price Prediction. 2011 IEEE Ninth
International Conference on Dependable, Autonomic and Secure Computing, (pp. 800-807).
Gao, Y., Wang, R., & Zhou, E. (2021). Stock Prediction Based on Optimized LSTM and
GRU Models. Journal of Computational and Theoretical Nanoscience, 18(8), 4912-4918.
Government of Kenya. (2009). The Capital Markets Act, Laws of Kenya, Cap 485A.
Graham, B. (2003). The Intelligent Investor (Revised ed.). New York: HarperCollins Publishers Inc.
Guo, Y. (2022). Stock Price Prediction Using Machine Learning. Journal of Finance and
Investment Analysis, 1(1), 45-57.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation,
9(8), 1735-1780.
Hota, J., Chakravarty, S., Paikaray, B. K., & Bhoyar, H. (2020). Stock Market Prediction
Using Machine Learning Techniques. International Journal of Advanced Research in Computer
Science, 11(2), 35-42.
Investopedia. (2021, April 15). Stock Exchange. Retrieved September 9, 2021, from
https://www.investopedia.com/terms/s/stockexchange.asp
Khan, Z. H., Alin, T. S., & Hussain, A. (2011). Price Prediction of Share Market using
Artificial Neural Network (ANN). International Journal of Computer Applications (0975–8887), 22(2).
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep
convolutional neural networks. In Advances in neural information processing systems (pp. 1097-
1105).
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied
to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous
activity. The bulletin of mathematical biophysics, 5(4), 115-133.
Ndiritu, J. M. (2010). Technical Trading Support System (TTSS): A Stock Market Analyst
Support System. MSc. Thesis, University of Nairobi, School of Computing and Informatics,
Nairobi.
Pan, H., Tilakaratne, C., & Yearwood, J. (2005). Predicting Australian Stock Market Index
Using Neural Networks Exploiting Dynamical Swings and Intermarket Influences. Journal of
Research and Practice in Information Technology, 37(1).
Capital Markets Authority. Retrieved April 12, 2013, from http://cma.or.ke/index.php?option=com_docman&view=docman&Itemid=123
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and
organization in the brain. Psychological review, 65(6), 386-408.
Shahi, T. B., Shrestha, A., Neupane, A., & Guo, W. (2020). Stock Price Forecasting with
Deep Learning: A Comparative Study. IEEE Transactions on Neural Networks and Learning
Systems, 31(4), 1144-1155.
Viveka Priya, N., & Geetha, S. (2022). Stock Prediction Using Machine Learning
Techniques. Journal of Finance and Economics, 10(2), 123-137.
WORK PLAN
Gantt chart

[Gantt chart: stock market prediction using LSTM and GRU, 06-Sep to 25-Nov. Each task bar shows a start date and a duration in days. Tasks: literature review, data collection, data preparation, LSTM model development, GRU model development, model comparison, performance evaluation, results analysis, and finalizing research.]