Deep Reinforcement Learning for Stock Prediction
DOI: 10.54254/2755-2721/69/20241453
Mingkai Wang
The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China
Abstract. Because the stock market is chaotic and highly volatile, and is subject to many
real-world uncertainties, predicting stock prices remains a challenging goal. Given these
characteristics, the task can be treated as a classification problem, to which many machine
learning methods can be applied. Among these methods, the deep neural network has attracted
particular attention in recent years. This is mainly because of its advantages over more
conventional machine learning methods: its high degree of nonlinearity and its deep nonlinear
topologies allow it to describe complex situations appropriately. Later, the advantages of
reinforcement learning were added to the model, including improved handling of feature
dimensions, and the deep reinforcement learning method was proposed to improve performance.
Deep reinforcement learning combines the advantages of deep learning with those of
reinforcement learning. This paper discusses its characteristics and advantages, and finally
its limitations and future development.
1. Introduction
Predicting stock market prices gives investors a basis for making optimal trading decisions, but it
is a challenging task. Because the market is chaotic and highly volatile, price movement is
considered a stochastic process influenced by many real-world uncertainties, such as macroeconomic
factors, market expectations, and investor confidence.
Many previous studies have addressed stock market prediction, and most fall into either
fundamental or technical analysis. Fundamental analysis estimates intrinsic value by analyzing
internal and external variables, whereas technical analysis looks for patterns in the stock price
on which predictions can be based.
Before machine learning methods appeared, early researchers used statistical methods such as time
series analysis and multivariate statistics, as well as various econometric methods, to build
models and make predictions. However, because the market is chaotic, highly volatile, and
nonparametric, machine learning methods, which handle these properties better, came to be applied.
Even so, it remains difficult to predict exact stock prices, so it is better to treat the task as a
classification problem rather than as a problem of predicting exact values. Many machine learning
methods have been applied on this basis, such as the support vector machine (SVM), logistic
regression, and random forest.
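As a minimal sketch of this classification framing, the Python code below labels each day by whether the next closing price rises and fits a one-feature logistic regression by gradient descent. The price series, the single return feature, the scaling constant, and the training settings are all assumptions made for illustration; none of them is taken from the studies reviewed in this paper.

```python
import math

def daily_returns(prices):
    """Simple returns r_t = p_t / p_{t-1} - 1 for t = 1 .. n-1."""
    return [cur / prev - 1.0 for prev, cur in zip(prices, prices[1:])]

def direction_labels(prices):
    """y_t = 1 if the next close is higher than today's close, else 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(prices, prices[1:])]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=500):
    """One-feature logistic regression trained by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Hypothetical closing prices, invented for illustration only.
prices = [100.0, 101.5, 101.0, 102.2, 103.0, 102.1, 101.8, 102.9, 104.0, 103.5]
SCALE = 100.0  # express returns in percent so the feature is of order 1

returns = daily_returns(prices)    # r_1 .. r_9
labels = direction_labels(prices)  # y_0 .. y_8

# Feature: today's return r_t. Target: tomorrow's direction y_t, for t = 1 .. 8.
features = [r * SCALE for r in returns[:-1]]
targets = labels[1:]

w, b = fit_logistic(features, targets)
# Estimated probability of an up move after the most recent return r_9.
prob_up = sigmoid(w * returns[-1] * SCALE + b)
print(f"P(up tomorrow) = {prob_up:.3f}")
```

The output is a class probability rather than a price forecast, which is the point of the reformulation. An SVM or a random forest would take the same labeled pairs as input.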
© 2024 The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0
(https://creativecommons.org/licenses/by/4.0/).
Proceedings of the 6th International Conference on Computing and Data Science
Deep neural networks (DNNs) are an important class of machine learning methods because of their
advantages over conventional methods. The high degree of nonlinearity of a DNN allows it to
approximate the complexity of the influencing factors, since deep nonlinear topologies that
represent complex high-dimensional functions can be built by stacking hidden layers of nonlinear
activation functions. In addition, DNNs are numeric, data-driven, and adaptive, so they can
analyze inaccurate and noisy data, and they have been used extensively to predict time series.
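The stacking of hidden layers described above can be sketched as a plain forward pass. This is an illustrative, untrained network, not a model from any cited study. The layer sizes, the ReLU activation, the sigmoid output, the random initialization, and the input values are all assumptions.

```python
import math
import random

def relu(values):
    """Element-wise nonlinear activation max(0, x)."""
    return [max(0.0, v) for v in values]

def dense(weights, bias, inputs):
    """Affine map: one output per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(layers, inputs):
    """Apply ReLU after every hidden layer; squash the final logit to (0, 1)."""
    activation = inputs
    for weights, bias in layers[:-1]:
        activation = relu(dense(weights, bias, activation))
    weights, bias = layers[-1]
    logit = dense(weights, bias, activation)[0]
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)
sizes = [4, 8, 8, 1]  # 4 inputs, two hidden layers of 8 units, 1 output
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

x = [0.015, -0.005, 0.012, 0.008]  # hypothetical recent daily returns
p_up = forward(layers, x)
print(f"untrained output = {p_up:.3f}")
```

Each additional hidden layer composes another nonlinear map on top of the previous one, which is what lets a deep topology represent functions that a single linear model cannot.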
These advantages give DNNs good performance on prediction problems, so it is feasible to apply
deep learning to financial prediction. In this paper, we mainly show how well it predicts and
where its advantages come from. (If possible, we may strengthen its decision-making capability
by combining it with reinforcement learning.) For comparison, we also use some traditional
machine learning methods, such as SVM and logistic regression, and show the differences in the
results.
The remainder of this paper is organized as follows. Section 2 reviews related work on this
topic. Section 3 discusses reinforcement learning, including its characteristics and advantages,
and compares it with traditional machine learning methods to show its improvement. Finally, we
draw conclusions and discuss future development.
2. Literature review
In the early years, it was widely believed that predicting future trends in stock market prices
contradicted a basic principle known as the Efficient Market Hypothesis (Fama and Malkiel) [1].
Later, however, more researchers rejected this disputed theory by using more advanced algorithms
to model the more complex dynamics of the financial system. Representative work in this field
includes the studies by Lo and MacKinlay [2]. Examples of studies on the predictability of
long-term investment returns include those by De Bondt and Thaler [3-5].
With the rapid development of computing, machine learning methods based on statistics have
contributed greatly to stock price prediction. Machine learning methods handle complex nonlinear
analysis better than studies that rely on traditional statistics alone, which has enabled a
variety of approaches. Representative methods include logistic regression, genetic algorithms,
fuzzy theory, SVM, decision trees, and adaptive boosting (AdaBoost) [6-9]. For example, L Khaidem
et al. successfully used ensemble learning to improve the performance of random forest for stock
prediction [10]. As for logistic regression, J Gong and S Sun introduced innovative feature index
variables into the prediction model and proposed a special optimization process to select the
regression parameters [11]. In 2012, A Upadhyay et al. used seven independent financial ratios to
construct a multinomial logistic regression method [12]. SVM has also been widely studied. J Heo
and JY Yang evaluated the stock price predictability of SVM and the length of time over which it
keeps good performance, in support of the efficient market hypothesis [13]; meanwhile, F Wen et
al. proposed using singular spectrum analysis to improve the performance of SVM [14].
In the 1980s, NNs were applied by IBM to predict changes in daily stock prices and returns. An
autoregressive integrated moving average (ARIMA) model was also combined with an artificial
neural network (ANN) to predict time series in stock markets. In recent years, deep learning has
developed rapidly and has been widely applied to stock market price prediction. Different deep
learning methods suit different problems, such as the convolutional neural network (CNN), long
short-term memory (LSTM), and the recurrent neural network (RNN), but here we mainly focus on the
most frequently used methods, namely the deep neural network (DNN) and reinforcement learning
(RL).
DNNs can be used to predict stock trends because they perform well on prediction problems with
large amounts of data and nonlinear mapping relations. For this reason, much work has been done
recently. Shen et al. constructed a deep belief network using a continuous restricted
Boltzmann machine and used it to predict exchange rates. Through a comparison, they found that the
DL network was better than conventional feedforward NNs [15]. Song, Y. proposed a plunge
filtering technique for a DNN model to improve its accuracy, and the proposed model showed high
profitability [16]. Naik, N. proposed a DNN using the Boruta feature selection technique to
address the problem of selecting indicator features and identifying the relevant indicators, and
it performed much better than the ANN and SVM models [17]. Chatzis, S.P. proposed a DNN model
with boosted approaches to predict stock market crisis episodes [18]. His research showed that
price prediction is meaningful for anticipating stock market crises. Nakagawa, K. proposed a deep
factor model and a shallow model with DNN, and the former performed better than the linear model,
which implies a nonlinear relationship between stock returns and the factors in the financial
market [19]. The deep factor model also performed better than other machine learning methods,
including SVR and random forest. Chong, E. examined the effects of three feature extraction
methods, namely PCA, the auto-encoder, and the restricted Boltzmann machine, on the ability of a
DNN to predict future market behavior [20]. It was shown that additional information could be
extracted from the residuals of the auto-regressive model to improve the final performance of the
DNN.
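To make the idea of using autoregressive residuals as extra inputs concrete, the sketch below fits a first-order autoregressive model by least squares and pairs each return with its residual. It is a simplified illustration under stated assumptions, not the procedure of [20]. The AR order of 1, the return series, and the feature layout are all hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1} + e_t."""
    prev, cur = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_cur = sum(cur) / n
    cov = sum((p - mean_prev) * (c - mean_cur) for p, c in zip(prev, cur))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_cur - phi * mean_prev
    return c, phi

def ar1_residuals(series, c, phi):
    """e_t = x_t - (c + phi * x_{t-1}): the part the linear model misses."""
    return [cur - (c + phi * prev) for prev, cur in zip(series, series[1:])]

# Hypothetical daily returns, invented for illustration only.
returns = [0.015, -0.005, 0.012, 0.008, -0.009, -0.003, 0.011, 0.011, -0.005]

c, phi = fit_ar1(returns)
residuals = ar1_residuals(returns, c, phi)  # e_1 .. e_8

# One row per day t = 1 .. 8: (x_t, e_t), usable as inputs to a downstream network.
features = list(zip(returns[1:], residuals))
print(f"c = {c:.5f}, phi = {phi:.3f}, rows = {len(features)}")
```

Whatever structure remains in the residuals is, by construction, not captured by the linear autoregression, so a nonlinear model fed with them has a chance to exploit it.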
RL learns locally optimal trade timing from the response of the stock market by observing the
information surrounding each transaction, which can be regarded as the environment in
reinforcement learning, and there has been much work in this field. Shin, H.-G. proposed an RL
model combined with LSTM and CNN, which generated various charts from stock trading data and used
them as input layers [21]. The features extracted by the CNN were used to construct the LSTM
layer. The RL component defined the agent's policy network structure and reward, and provided the
final output. Wu, J. proposed an RL model with an LSTM-based agent to sense the dynamics of the
stock market and reduce the difficulty of designing indicators from massive data [22]. Carapuço,
J. proposed an RL Q-network model, in which three hidden layers of ReLU neurons were trained as
RL agents through the Q-learning algorithm [23]. The framework could consistently induce stable
learning that generalized to out-of-sample data. Kang, Q. proposed applying the Asynchronous
Advantage Actor-Critic (A3C) algorithm to the portfolio management problem, and later designed a
deep RL model [24]. H Yang proposed a novel ensemble strategy combining three deep reinforcement
learning algorithms to find the optimal trading strategy in a complex and dynamic stock market.
The strategy can adjust to maximize return subject to a risk constraint [25]. There are also
methods that combine DNN and RL, namely deep reinforcement learning (DRL), which combines the
perception ability of DNN with the decision-making ability of RL. For example, Y Li proposed
introducing DRL into financial applications and demonstrated its advantage in improving
prediction ability [26].
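The Q-learning update that underlies several of these models can be sketched in a few lines. The example below is a deliberately simplified tabular version, not the neural Q-network of [23] or any other cited system. The two-state market description, the two actions, the reward definition, the return series, and all hyperparameters are assumptions chosen for illustration.

```python
import random

ACTIONS = (0, 1)  # 0 = stay out of the market, 1 = hold one unit long

def state_of(returns, t):
    """Discretised state: 1 if the latest return was positive, else 0."""
    return 1 if returns[t] > 0 else 0

def q_update(q, s, a, reward, s_next, alpha, gamma):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(s_next, act)] for act in ACTIONS)
    q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])

def train_q(returns, episodes=200, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over repeated passes through the data."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in ACTIONS}
    for _ in range(episodes):
        for t in range(len(returns) - 1):
            s = state_of(returns, t)
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)  # explore
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])  # exploit
            reward = a * returns[t + 1]  # next-day return if long, else 0
            q_update(q, s, a, reward, state_of(returns, t + 1), alpha, gamma)
    return q

# Hypothetical return series with short runs of gains and losses.
returns = [0.01, 0.01, 0.01, -0.01, -0.01, -0.01] * 5

q = train_q(returns)
policy = {s: max(ACTIONS, key=lambda act: q[(s, act)]) for s in (0, 1)}
print(policy)
```

In a DRL model, the lookup table `q` is replaced by a deep network that maps a richer market state to action values, which is where the perception ability of the DNN enters.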
3. Discussion
References
[1] Schwartz, R. A. (1970). Efficient capital markets: A review of theory and empirical work:
Discussion. The Journal of Finance, 25(2), 421-423.
[2] Malkiel, B. G. (2003). The efficient market hypothesis and its critics. Journal of economic
perspectives, 17(1), 59-82.
[3] De Bondt, W. F., & Thaler, R. (1985). Does the stock market overreact? The Journal of
finance, 40(3), 793-805.
[4] Lim, K. P., & Brooks, R. (2011). The evolution of stock market efficiency over time: A survey
of the empirical literature. Journal of economic surveys, 25(1), 69-108.
[5] Sewell, M. (2011). History of the efficient market hypothesis. Rn, 11(04), 04.
[6] Hadavandi, E., Shavandi, H., & Ghanbari, A. (2010). Integration of genetic fuzzy systems and
artificial neural networks for stock price forecasting. Knowledge-Based Systems, 23(8),
800-808.
[7] Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers:
Implications for stock market efficiency. The Journal of Finance, 48(1), 65-91.
[8] Pai, P. F., & Lin, C. S. (2005). A hybrid ARIMA and support vector machines model in stock
price forecasting. Omega, 33(6), 497-505.
[9] Wu, M. C., Lin, S. Y., & Lin, C. H. (2006). An effective application of decision tree to stock
trading. Expert Systems with applications, 31(2), 270-274.
[10] Khaidem, L., Saha, S., & Dey, S. R. (2016). Predicting the direction of stock market prices
using random forest. arXiv preprint arXiv:1605.00003.
[11] Gong, J., & Sun, S. (2009, June). A new approach of stock price prediction based on logistic
regression model. In 2009 International conference on new trends in information and service
science (pp. 1366-1371). IEEE.
[12] Upadhyay, A., Bandyopadhyay, G., & Dutta, A. (2012). Forecasting stock performance in
indian market using multinomial logistic regression. Journal of Business Studies Quarterly, 3(3),
16.
[13] Heo, J., & Yang, J. Y. (2016). Stock price prediction based on financial statements using
SVM. International Journal of Hybrid Information Technology, 9(2), 57-66.
[14] Wen, F., Xiao, J., He, Z., & Gong, X. (2014). Stock
[15] Shen, F., Chao, J., & Zhao, J. (2015). Forecasting exchange rate using deep belief networks and
conjugate gradient method. Neurocomputing, 167, 243-253.
[16] Song, Y., Lee, J. W., & Lee, J. (2019). A study on novel filtering and relationship between
input-features and target-vectors in a deep learning model for stock price prediction. Applied
Intelligence, 49, 897-911.
[17] Naik, N., & Mohan, B. R. (2019). Stock price movements classification using machine and deep
learning techniques-the case study of indian stock market. In Engineering Applications of
Neural Networks: 20th International Conference, EANN 2019, Xersonisos, Crete, Greece, May
24-26, 2019, Proceedings 20 (pp. 445-452). Springer International Publishing.
[18] Chatzis, S. P., Siakoulis, V., Petropoulos, A., Stavroulakis, E., & Vlachogiannakis, N. (2018).
Forecasting stock market crisis events using deep and statistical machine learning
techniques. Expert systems with applications, 112, 353-371.
[19] Nakagawa, K., Uchida, T., & Aoshima, T. (2018, September). Deep factor model: Explaining
deep learning decisions for forecasting stock returns with layer-wise relevance propagation.
In Workshop on Mining Data for Financial Applications (pp. 37-50). Cham: Springer
International Publishing.
[20] Chong, E., Han, C., & Park, F. C. (2017). Deep learning networks for stock market analysis and
prediction: Methodology, data representations, and case studies. Expert Systems with
Applications, 83, 187-205.
[21] Shin, H. G., Ra, I., & Choi, Y. H. (2019, October). A deep multimodal reinforcement learning
system combined with CNN and LSTM for stock trading. In 2019 International conference on
information and communication technology convergence (ICTC) (pp. 7-11). IEEE.
[22] Wu, J., Wang, C., Xiong, L., & Sun, H. (2019, July). Quantitative
trading on stock market based on deep reinforcement learning. In 2019 International Joint
Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE.
[23] Carapuço, J., Neves, R., & Horta, N. (2018). Reinforcement learning applied to Forex
trading. Applied Soft Computing, 73, 783-794.
[24] Kang, Q., Zhou, H., & Kang, Y. (2018, October). An asynchronous advantage actor-critic
reinforcement learning method for stock selection and portfolio management. In Proceedings of
the 2nd International Conference on Big Data Research (pp. 141-145).
[25] Yang, H., Liu, X. Y., Zhong, S., & Walid, A. (2020, October). Deep reinforcement learning for
automated stock trading: An ensemble strategy. In Proceedings of the first ACM international
conference on AI in finance (pp. 1-8).
[26] Li, Y., Ni, P., & Chang, V. (2020). Application of deep reinforcement learning in stock trading
strategies and stock forecasting. Computing, 102(6), 1305-1322.