Proceedings of the 6th International Conference on Computing and Data Science

DOI: 10.54254/2755-2721/69/20241453

Deep reinforcement learning for stock prediction

Mingkai Wang
The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, China

[email protected]

Abstract. Stock prices are chaotic, highly volatile, and subject to many real-world
uncertainties, so predicting them is a persistent challenge. Because of these characteristics, the
task can be framed as a classification problem, to which many machine learning methods can
be applied. Among these methods, the deep neural network has attracted particular attention in
recent years, mainly because its strong nonlinearity and deep nonlinear topologies let it
describe complex situations better than more conventional machine learning methods. Later,
reinforcement learning was added to improve the feature dimensions available to the model,
and the resulting deep reinforcement learning method was proposed to improve performance
further. Deep reinforcement learning combines the advantages of deep learning with those of
reinforcement learning. This paper discusses its characteristics and advantages, and then its
limitations and future development.

Keywords: Stock prediction, Deep neural network, Reinforcement learning, Deep reinforcement learning.

1. Introduction
Predicting stock market prices gives investors a basis for optimal trading decisions, but it is a
challenging task. Because prices are chaotic and highly volatile, their movement is considered to be
a stochastic process influenced by many real-world uncertainties, such as macroeconomic factors,
market anticipation, and market confidence.
Many previous studies have addressed stock market prediction, and most fall into either
fundamental or technical analysis. Fundamental analysis estimates intrinsic value by analyzing
internal and external variables, while technical analysis looks for patterns in the stock price on
which predictions can be based.
Before machine learning methods appeared, early researchers used statistical methods such as
time series analysis and multivariate statistics, as well as various econometric methods, to build
models and make predictions. However, because stock prices are chaotic, highly volatile, and
nonparametric, machine learning methods, which handle these properties better, came to be applied.
Even so, it remains difficult to predict exact price values, so it is better to treat the task as a
classification problem rather than as the prediction of exact values. Many machine learning methods
have been applied on this basis, such as support vector machine (SVM), logistic regression, and
random forest.
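
To make the classification framing concrete, the following is a minimal sketch of how a price
series might be turned into up/down labels. The function name, the one-day horizon, and the
closing prices are illustrative assumptions, not taken from any of the cited studies.

```python
def direction_labels(prices, horizon=1):
    """Label each day 1 if the price is higher `horizon` days later, else 0."""
    labels = []
    for t in range(len(prices) - horizon):
        labels.append(1 if prices[t + horizon] > prices[t] else 0)
    return labels


# Made-up closing prices, used only for illustration.
closes = [100.0, 101.5, 101.0, 102.2, 102.2, 99.8]
labels = direction_labels(closes)
print(labels)
```

A classifier such as SVM or logistic regression would then be trained to predict these labels
from features observed up to day t, instead of predicting the exact price.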

© 2024 The Authors. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0
(https://creativecommons.org/licenses/by/4.0/).


Deep neural networks (DNNs) are an important class of machine learning methods because of their
unique advantages over conventional methods. The strong nonlinearity of a DNN can approximately
simulate and describe the complexity of the influencing factors, since deep nonlinear topologies
that represent complex high-dimensional functions can be built by stacking hidden layers of
nonlinear activation functions. In addition, DNNs are numeric, data-driven, and adaptive, so they
can analyze inaccurate and noisy data, and they have been used extensively to predict time series.
These advantages give DNNs good performance on prediction problems, so it is feasible to apply
deep learning to financial prediction. In this paper, we mainly show the effect of deep learning on
prediction and how its advantages arise. (If possible, we may strengthen the decision-making
capability by combining it with reinforcement learning.) For comparison, we also use some
traditional machine learning methods, such as SVM and logistic regression, and show the differences
in the results.
The rest of this paper is organized as follows. Part 2 introduces related work on this topic.
Part 3 discusses reinforcement learning, including its characteristics and advantages, and compares
it with traditional machine learning methods to show its improvement. Finally, we draw conclusions
and discuss future development.

2. Literature review
In the early years, it was widely believed that predicting future trends in stock market prices
contradicted a basic rule known as the Efficient Market Hypothesis (Fama and Malkiel) [1]. Later,
however, more researchers chose to reject this controversial and disputed theory by using more
advanced algorithms to model the more complex dynamics of the financial system. Representative
work in this field has been done by Lo and MacKinlay [2]. Examples of studies on the predictability
of long-term investment returns are those by De Bondt and Thaler [3-5].
With the rapid development of computers, statistics-based machine learning methods have
contributed a great deal to stock price prediction. Machine learning methods perform better in
complex nonlinear analysis, which enables a wider range of approaches than studies using
traditional statistics only. Representative methods include logistic regression, genetic
algorithms, fuzzy theory, SVM, decision trees, and adaptive boosting (AdaBoost) [6-9]. For example,
L Khaidem et al. successfully used ensemble learning to improve the performance of random forest
for stock prediction. As for logistic regression, J Gong and S Sun introduced innovative feature
index variables into the prediction model and proposed a special optimization process to select the
optimal regression parameters [10-11]. In 2012, A Upadhyay et al. used seven independent financial
ratios to construct a multinomial logistic regression method [12]. SVM has also been widely
studied. J Heo and JY Yang evaluated the stock price predictability of SVM and how long its good
performance lasts, in support of the efficient market hypothesis, while F Wen et al. proposed using
singular spectrum analysis to improve the performance of SVM [13-14].
In the 1980s, IBM applied neural networks (NNs) to predict changes in daily stock prices and
returns. An autoregressive integrated moving average (ARIMA) model was also combined with an
artificial neural network (ANN) to predict time series in stock markets. In recent years, deep
learning has developed rapidly and has been widely applied to predicting stock market prices.
Depending on the problem, different deep learning methods can be chosen, such as the convolutional
neural network (CNN), Long Short-Term Memory (LSTM), and the recurrent neural network (RNN), but
here we mainly focus on the most frequently used methods, namely the deep neural network (DNN) and
reinforcement learning (RL).
DNNs can be used to predict stock trends because they perform well on prediction problems with
large amounts of data and nonlinear mapping relations. For this reason, much work has been done
recently. Shen et al. constructed a deep belief network using a continuous restricted Boltzmann
machine and used it to predict exchange rates. Through a comparison, they found that the deep
learning network was better than conventional feedforward NNs [15]. Song, Y. proposed a plunge
filtering technique for a DNN model to improve its accuracy, and the proposed model was highly
profitable [16]. Naik, N. proposed a DNN using the Boruta feature selection technique to solve the
problem of selecting indicator features and identifying the relevant indicators; it performed much
better than the ANN and SVM models [17]. Chatzis, S.P. proposed a DNN model with boosted approaches
to predict stock market crisis episodes [18]. This research showed that forecasting stock market
crises is meaningful for price prediction. Nakagawa, K. proposed a deep factor model and a shallow
model with DNN; the former performed better than the linear model, which implies a nonlinear
relationship between stock returns and the factors in the financial market [19]. The deep factor
model also performed better than other machine learning methods, including support vector
regression (SVR) and random forest. Chong, E. examined the effects of DNN with three feature
extraction methods, namely PCA, auto-encoder, and the restricted Boltzmann machine, to predict
future market behavior [20]. It was shown that additional information could be extracted from the
residuals of the auto-regressive model to improve the final performance of the DNN.
RL learns locally optimal trading timing from the responses of the stock market by observing the
information around each transaction, which can be regarded as the environment in reinforcement
learning, and much work has been done in this field. Shin, H.-G. proposed an RL model combined with
LSTM and CNN, which generated various charts from stock trading data and used them as input
layers [21]. The features extracted by the CNN were used to construct the LSTM layer. The RL
component defined the agent's policy neural network structure and reward, and provided the final
output. Wu, J. proposed an RL model with an LSTM-based agent to sense the dynamics of the stock
market and reduce the difficulty of designing indicators from massive data [22]. Carapuço, J.
proposed an RL Q-network model, in which three hidden layers of ReLU neurons were trained as RL
agents through the Q-learning algorithm [23]. The framework could consistently induce stable
learning that generalized to out-of-sample data. Kang, Q. proposed applying the Asynchronous
Advantage Actor-Critic (A3C) algorithm to the portfolio management problem and later designed a
deep RL model [24]. H Yang proposed a novel ensemble strategy combining three deep reinforcement
learning algorithms to find the optimal trading strategy in a complex and dynamic stock market; the
strategy can adjust to maximize return subject to a risk constraint [25]. Some methods combine DNN
and RL, namely deep reinforcement learning (DRL), which joins the perception ability of DNN with
the decision-making ability of RL. For example, Y Li proposed introducing DRL into financial
applications and demonstrated its advantage in improving prediction ability [26].

3. Discussion

3.1. Characteristics and advantages


Traditional research methods can be divided mainly into two viewpoints: time series analysis and
factor analysis. The former considers how a single stock evolves over time and uses that trend to
predict the stock's future movement. The characteristics considered in this process are those
significantly influenced by time. The latter, on the other hand, explores the underlying factors
that determine the value of stocks. It does so by comparing the values of different stocks, along
with their main influencing factors, at a fixed time.
Comparing these two traditional views, time series methods may fall short in mining hidden
information from related characteristics, while factor analysis cannot conveniently track how a
trend develops over time. DRL can combine the advantages of the two methods while avoiding their
disadvantages.

3.2. Deep and reinforcement learning


Deep learning can uncover complex, intrinsic, and deep-seated information in data. In stock
prediction, the complex characteristics and factors associated with the stock price can be described
well by a DNN because of its strong nonlinearity. Thanks to these advantages, DNNs achieve high
accuracy on complex prediction problems in many practical fields, such as image classification and
natural language processing. These results also carry over to time series analysis and prediction
with deep learning algorithms. Overall, deep learning models perform well across broad and diverse
research fields, which makes it feasible to predict the future trend of stock markets with deep
learning.
However, although DNNs have many advantages, they also have limitations, and depth cannot be
increased blindly. Firstly, increasing the depth increases the complexity of the model, and the
number of training samples required increases accordingly. Secondly, the feedforward computation of
a single layer of the prediction model is already relatively expensive, so increasing the number of
layers leads to high overall complexity. Thirdly, exploding and vanishing gradients may occur.
Combining the DNN with another machine learning method can help it develop further.
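
The third limitation can be illustrated with a small worked example. The sketch below assumes a
toy network with one scalar unit per layer, so that the gradient of the output with respect to the
input is a product of one factor per layer. The weights, depths, and activation choices are
illustrative assumptions only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def chain_gradient(x, weight, depth, activation, derivative):
    """d(output)/d(input) of a scalar chain h <- activation(weight * h), repeated `depth` times."""
    h, grad = x, 1.0
    for _ in range(depth):
        z = weight * h
        grad *= weight * derivative(z)   # chain rule: one factor per layer
        h = activation(z)
    return grad


sig_grad = lambda z: sigmoid(z) * (1.0 - sigmoid(z))

for depth in (2, 10, 30):
    # Sigmoid layers: each factor is at most 0.25, so the product shrinks with depth.
    vanishing = chain_gradient(0.5, 1.0, depth, sigmoid, sig_grad)
    # Identity activation with weight 1.5: each factor is 1.5, so the product grows with depth.
    exploding = chain_gradient(0.5, 1.5, depth, lambda z: z, lambda z: 1.0)
    print(depth, vanishing, exploding)
```

Because the sigmoid derivative never exceeds 0.25, the first product shrinks geometrically with
depth, while the second grows geometrically. Real networks use matrices rather than scalars, but
the same multiplicative effect is what makes very deep models hard to train.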
Reinforcement learning learns an optimal policy through interaction with the environment and
repeated trials. This process is widely applied to sequential decision-making problems. In such
models, the key concept is the value function, which determines the learning targets; control
algorithms are applied to find optimal policies; and the final goal is to learn behavior strategies
over multiple stages. Reinforcement learning enhances the model's ability to learn from data and
improves feature dimensions. The method also has the advantage of minimizing transaction costs.
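
As a hedged illustration of the value function and policy described above, the following sketch
runs tabular Q-learning on a toy market. Everything specific here is an assumption made for the
example: the repeating return pattern, the state (the last two daily moves), the two actions, and
the hyperparameters. It is not the method of any cited paper.

```python
import random

RETURNS = [1.0, 1.0, -1.0, -1.0]   # made-up repeating daily returns
ACTIONS = (0, 1)                   # 0 = stay out of the market, 1 = hold the stock


def step(t, action):
    """Return (reward, next_state) after acting at time t."""
    nxt = RETURNS[(t + 1) % len(RETURNS)]
    reward = nxt if action == 1 else 0.0
    next_state = (RETURNS[t % len(RETURNS)], nxt)   # the last two moves
    return reward, next_state


def q_learning(steps=5000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}                                  # (state, action) -> value estimate
    state = (RETURNS[-1], RETURNS[0])       # moves at t = -1 and t = 0
    for t in range(steps):
        if rng.random() < epsilon:          # explore with a random action
            action = rng.choice(ACTIONS)
        else:                               # exploit the current value estimate
            action = max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
        reward, next_state = step(t, action)
        best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
        old = q.get((state, action), 0.0)
        # Move the estimate toward reward + discounted value of the next state.
        q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = next_state
    return q


q = q_learning()
# Greedy policy: for each visited state, pick the action with the larger value.
policy = {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0))
          for s in {s for s, _ in q}}
print(policy)
```

In this toy market the next move is fully determined by the last two moves, so the learned policy
holds the stock exactly in the states that are followed by a positive return. Real markets are not
deterministic, which is one reason the table is replaced by a DNN in DRL.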
Each of the two methods above has its strengths and weaknesses. In DRL, a DNN is used to
approximate the components of reinforcement learning, including the value function, the policy, and
the model. Although reinforcement learning can handle decision-making problems, it is weak at
perception, which motivates combining reinforcement learning with DNN.
DRL integrates the perception of DNN with the decision-making ability of reinforcement learning,
with the goal of simulating the cognition and learning mode of human beings. In this method,
different sorts of information can be input, and actions are output directly by the DNN, so control
follows directly from the input data without other outside supervision.
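
The idea of a DNN standing in for the value function can be sketched as follows. This is a
minimal illustration under stated assumptions, not a definitive implementation: the class name
QNetwork, the single tanh hidden layer, the three actions, the input features, and the learning
rate are all hypothetical choices, and the update is a plain one-step temporal-difference rule
written without any deep learning library.

```python
import math
import random


class QNetwork:
    """Tiny one-hidden-layer network mapping a state vector to one value per action."""

    def __init__(self, n_in, n_hidden, n_actions, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_actions)]
        self.b2 = [0.0] * n_actions

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        values = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, values

    def act(self, x):
        """Greedy action: the index of the largest predicted value."""
        _, values = self.forward(x)
        return max(range(len(values)), key=values.__getitem__)

    def td_update(self, x, action, reward, next_x, gamma=0.9, lr=0.05):
        """One gradient step on 0.5 * (Q(x, action) - target)^2; returns the TD error."""
        _, next_values = self.forward(next_x)
        target = reward + gamma * max(next_values)   # treated as a constant
        hidden, values = self.forward(x)
        error = values[action] - target
        for j, h in enumerate(hidden):
            # Backpropagate through the output weight and the tanh unit.
            grad_hidden = error * self.w2[action][j] * (1.0 - h * h)
            self.w2[action][j] -= lr * error * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_hidden * xi
            self.b1[j] -= lr * grad_hidden
        self.b2[action] -= lr * error
        return error


# Made-up states: the last three daily returns. Actions: 0 = sell, 1 = hold, 2 = buy.
net = QNetwork(n_in=3, n_hidden=8, n_actions=3)
state, next_state = [0.01, -0.02, 0.015], [-0.02, 0.015, 0.03]
errors = [abs(net.td_update(state, 2, 0.03, next_state)) for _ in range(1000)]
print(errors[0], errors[-1], net.act(state))
```

The network takes the raw input features and returns one value per action, so the action is read
directly from the network output, as described above. Repeating the update on one transition is
expected to shrink the temporal-difference error; a practical system would instead sample many
transitions from interaction with the market environment.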

4. Conclusion and future development


Generally speaking, models have shortcomings in real applications, and DRL is no exception. One
key point is that current DRL models mainly use the historical data of one stock and process it as
a single modality, so they fail to explore other dimensions that may influence the final result. It
is therefore worth making the model compatible with multimodal data structures to improve its
performance in general situations. Better generalization is important for the model because it is
closely related to its universality.
Apart from that, current DRL models still cannot handle portfolio strategy problems, because they
mainly focus on prediction. In reality, however, it is more common to pursue a portfolio strategy
based on the prediction results, and treating prediction and portfolio construction as two separate
problems may give different results from treating them as one integrated problem. How to connect
the portfolio strategy to the current model and treat the problem as a whole may be a meaningful
topic for prediction models, including the DRL model.
Finally, the attention mechanism has become popular in recent years, and it is especially suitable
for stock prediction problems. Since stock prices at different points in the past influence the
future to different degrees, it is reasonable to add an attention mechanism to current stock
prediction models.
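
How attention weighs different past time points can be sketched with scaled dot-product
attention over a short history. The feature values below are made up, and using the raw features
as queries, keys, and values is a simplifying assumption; a trained model would normally apply
learned projections first.

```python
import math


def softmax(scores):
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over a sequence of time steps."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)                # one weight per past time step
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


# Made-up features for four past days: (return, relative volume).
history = [[0.01, 0.2], [-0.03, 0.9], [0.00, 0.1], [0.02, 0.8]]
query = [0.02, 0.8]                          # today's features, used as the query
weights, context = attend(query, history, history)
print(weights, context)
```

The weights sum to one and are largest for the past days whose features align most with the
query, so the context vector passed to the prediction model emphasizes those days rather than
treating all past prices equally.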

References
[1] Schwartz, R. A. (1970). Efficient capital markets: A review of theory and empirical work:
Discussion. The Journal of Finance, 25(2), 421-423.
[2] Malkiel, B. G. (2003). The efficient market hypothesis and its critics. Journal of economic
perspectives, 17(1), 59-82.
[3] De Bondt, W. F., & Thaler, R. (1985). Does the stock market overreact?. The Journal of
finance, 40(3), 793-805.
[4] Lim, K. P., & Brooks, R. (2011). The evolution of stock market efficiency over time: A survey
of the empirical literature. Journal of economic surveys, 25(1), 69-108.
[5] Sewell, M. (2011). History of the efficient market hypothesis. Rn, 11(04), 04.
[6] Hadavandi, E., Shavandi, H., & Ghanbari, A. (2010). Integration of genetic fuzzy systems and
artificial neural networks for stock price forecasting. Knowledge-Based Systems, 23(8),
800-808.
[7] Jegadeesh, N., & Titman, S. (1993). Returns to buying winners and selling losers: Implications
for stock market efficiency. The Journal of Finance, 48(1), 65-91.
[8] Pai, P. F., & Lin, C. S. (2005). A hybrid ARIMA and support vector machines model in stock
price forecasting. Omega, 33(6), 497-505.
[9] Wu, M. C., Lin, S. Y., & Lin, C. H. (2006). An effective application of decision tree to stock
trading. Expert Systems with applications, 31(2), 270-274.
[10] Khaidem, L., Saha, S., & Dey, S. R. (2016). Predicting the direction of stock market prices
using random forest. arXiv preprint arXiv:1605.00003.
[11] Gong, J., & Sun, S. (2009, June). A new approach of stock price prediction based on logistic
regression model. In 2009 International conference on new trends in information and service
science (pp. 1366-1371). IEEE.
[12] Upadhyay, A., Bandyopadhyay, G., & Dutta, A. (2012). Forecasting stock performance in
indian market using multinomial logistic regression. Journal of Business Studies Quarterly, 3(3),
16.
[13] Heo, J., & Yang, J. Y. (2016). Stock price prediction based on financial statements using
SVM. International Journal of Hybrid Information Technology, 9(2), 57-66.
[14] Wen, F., Xiao, J., He, Z., & Gong, X. (2014). Stock
[15] Shen, F., Chao, J., & Zhao, J. (2015). Forecasting exchange rate using deep belief networks and
conjugate gradient method. Neurocomputing, 167, 243-253.
[16] Song, Y., Lee, J. W., & Lee, J. (2019). A study on novel filtering and relationship between
input-features and target-vectors in a deep learning model for stock price prediction. Applied
Intelligence, 49, 897-911.
[17] Naik, N., & Mohan, B. R. (2019). Stock price movements classification using machine and deep
learning techniques-the case study of indian stock market. In Engineering Applications of
Neural Networks: 20th International Conference, EANN 2019, Xersonisos, Crete, Greece, May
24-26, 2019, Proceedings 20 (pp. 445-452). Springer International Publishing.
[18] Chatzis, S. P., Siakoulis, V., Petropoulos, A., Stavroulakis, E., & Vlachogiannakis, N. (2018).
Forecasting stock market crisis events using deep and statistical machine learning
techniques. Expert systems with applications, 112, 353-371.
[19] Nakagawa, K., Uchida, T., & Aoshima, T. (2018, September). Deep factor model: Explaining
deep learning decisions for forecasting stock returns with layer-wise relevance propagation.
In Workshop on Mining Data for Financial Applications (pp. 37-50). Cham: Springer
International Publishing.
[20] Chong, E., Han, C., & Park, F. C. (2017). Deep learning networks for stock market analysis and
prediction: Methodology, data representations, and case studies. Expert Systems with
Applications, 83, 187-205.

[21] Shin, H. G., Ra, I., & Choi, Y. H. (2019, October). A deep multimodal reinforcement learning
system combined with CNN and LSTM for stock trading. In 2019 International conference on
information and communication technology convergence (ICTC) (pp. 7-11). IEEE.
[22] Wu, J., Wang, C., Xiong, L., & Sun, H. (2019, July). Quantitative
trading on stock market based on deep reinforcement learning. In 2019 International Joint
Conference on Neural Networks (IJCNN) (pp. 1-8). IEEE.
[23] Carapuço, J., Neves, R., & Horta, N. (2018). Reinforcement learning applied to Forex
trading. Applied Soft Computing, 73, 783-794.
[24] Kang, Q., Zhou, H., & Kang, Y. (2018, October). An asynchronous advantage actor-critic
reinforcement learning method for stock selection and portfolio management. In Proceedings of
the 2nd International Conference on Big Data Research (pp. 141-145).
[25] Yang, H., Liu, X. Y., Zhong, S., & Walid, A. (2020, October). Deep reinforcement learning for
automated stock trading: An ensemble strategy. In Proceedings of the first ACM international
conference on AI in finance (pp. 1-8).
[26] Li, Y., Ni, P., & Chang, V. (2020). Application of deep reinforcement learning in stock trading
strategies and stock forecasting. Computing, 102(6), 1305-1322.
