Financial Time Series Forecasting Applying Deep Learning Algorithms
1 Introduction
A time series is an ordered sequence of values that are usually equally spaced over time [9]. Time series are encountered in stock prices, weather forecasts, or historical trends. For instance, Moore's law is an empirical historical forecast about the development of microchips [28]. This law describes the regularity with which the number of transistors on integrated circuits doubles, approximately every two years. In this case, there is a single value describing each time step, so these types of series are called univariate. There are also multivariate time series, where the sequence is composed of multiple values at each time step. An example of a multivariate time series is the register of births versus deaths over a period of time. Multivariate time series are useful for understanding the correlation between variables, allowing analysts to assess the impact of the related data. For example, if the number of deaths exceeds the number of births, this leads to a population decline [22].
Univariate temporal data can contain patterns describing different behaviors. One of these patterns is the trend, in which the time series moves in a specific direction. Another is seasonality, which occurs when patterns repeat at predictable intervals [19]. There are also time series with completely random behavior, producing what is typically called white noise. Another type is the auto-correlated time series, where the value at each time step depends on previous ones. Commonly, time series such as weather forecasts, stock prices, or population statistics are described as a combination of trend, seasonality, auto-correlation, and noise [37].
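As a concrete illustration, such a decomposition can be sketched with a short synthetic series (all parameters here are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.arange(365)                              # one year of daily time steps

trend = 0.05 * t                                # steady upward direction
seasonality = 10 * np.sin(2 * np.pi * t / 30)   # pattern repeating every ~30 steps
noise = rng.normal(0, 2, t.size)                # white-noise component

series = trend + seasonality + noise            # combined univariate series
print(series.shape)  # (365,)
```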
Algorithms focused on forecasting time series are known as sequential models [39]. These models are designed to spot patterns within the data. Once the model spots these patterns, it is possible to make predictions. Traditional sequential models are based on the assumption that patterns that existed in the past will continue in the future. However, this assumption cannot be translated to stock price prediction, since stock market behavior is influenced by interrelated external factors such as economic, industry, company, psychological, and political variables [14]. These variables interact in a very complex manner, leading to the assumption that stock markets cannot be predicted. Therefore, analyzing stock market movements has become an extremely challenging task for both investors and researchers. The complexity of this task stems from the behavior of the stock market, which is characterized by being non-stationary, i.e., unpredictable. In this sense, the efficient market hypothesis states that asset prices reflect all available information at any given moment. This hypothesis implies that it is impossible to predict market behavior consistently, since market prices should only react to new information [20]. Nevertheless, some researchers argue that markets are inefficient, in part due to the psychological variables of market participants and the inability of markets to respond immediately to newly released information [15]. Under this view, financial variables such as stock prices are thought to be predictable. Thus, owing to potential market inefficiencies, market participants have focused on the development of accurate forecasting strategies for financial variables.
To analyze stock markets, statistical and machine learning methods have been explored. On the one hand, statistical approaches often employ the Autoregressive Moving Average (ARMA) [33], the Autoregressive Integrated Moving Average (ARIMA) [30], or Linear Discriminant Analysis (LDA) [1]. On the other hand, the machine learning technique most used to forecast financial variables has been Artificial Neural Networks (ANNs). Conventional ANNs were mostly used in stock market prediction in the latter part of the last century [36]. The following trend of machine learning applications in financial markets focused on applying the Multilayer Perceptron (MLP) [31]. Nevertheless, shallow ANNs are not the most accurate method for dealing with time series that contain noise and complex dimensionality and that extend over long periods [4]; deep architectures, however, can overcome these problems. In this sense, sequential models such as Recurrent Neural Networks (RNNs), and especially Long Short-Term Memory (LSTM) networks, have transformed speech recognition, natural language processing, and other areas focused on the analysis of time series [18]. Other approaches combine deep architectures such as LSTMs, for dealing with sequential data, with Convolutional Neural Networks (CNNs), for identifying features within the data, such as interdependencies among companies, in order to understand market dynamics [29].
This paper presents the capability of deep architectures for forecasting non-stationary time series composed of stock prices. A deep learning model composed of a CNN layer, LSTM layers, and dense layers is proposed for this task.
2 Background
3 Methodology
This section describes the methodology followed in the proposed work. First, we obtain a suitable dataset of time series; after that, the dataset is split into training and test data. Then, the architecture of the model is proposed, experiments are carried out, and finally the results are discussed.
3.1 Dataset
The data used in this work comprises the intraday stock prices of Amazon (ticker: AMZN). The stock prices are collected at intervals of two and five minutes during trading hours over the last sixty days, from January 25 to March 25, 2021. The time series used for building the dataset consist of the intraday opening prices at two- and five-minute intervals, composed of 6425 and 3274 data points, respectively. This data was obtained through yfinance1, a library that collects historical market data from Yahoo! Finance.
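The collection step might look like the following sketch (the `yf.download` call is a hypothetical usage of the yfinance API and needs network access, so a synthetic random-walk stand-in of the same length as the two-minute dataset is generated instead):

```python
# Hypothetical yfinance usage (requires network access):
# import yfinance as yf
# prices = yf.download("AMZN", start="2021-01-25", end="2021-03-25",
#                      interval="2m")["Open"].to_numpy()

import numpy as np

# Synthetic stand-in: a random-walk "opening price" series with the same
# number of points as the paper's two-minute dataset (6425).
rng = np.random.default_rng(0)
prices = 3300 + np.cumsum(rng.normal(0, 1, 6425))
print(len(prices))  # 6425
```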
The data is divided into two time series, one for training and one for testing. For the data with two- and five-minute intervals, 3500 and 2000 observations, respectively, are selected for training, and the leftover observations are used for testing. In order to capture a high diversity in price movements, a window-slicing mechanism is used. A window of 60 data points, corresponding to 2 hours of stock prices, is selected for the two-minute intervals dataset, and a window of 108 data points, corresponding to 9 hours of stock prices, for the five-minute intervals dataset. Choosing an optimal window size helps to overcome the non-stationary character of the time series and thus increases the model's accuracy.
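A minimal NumPy version of this window-slicing step could look as follows (the paper does not give its implementation; function and variable names here are illustrative):

```python
import numpy as np

def make_windows(series, window_size):
    """Slice a 1-D series into (input, target) pairs: each input is
    `window_size` consecutive points and the target is the next point."""
    X = np.lib.stride_tricks.sliding_window_view(series, window_size)[:-1]
    y = series[window_size:]
    return X, y

series = np.arange(10.0)                 # toy stand-in for a price series
X, y = make_windows(series, window_size=4)
print(X.shape, y.shape)  # (6, 4) (6,)
```

For the paper's datasets, `window_size` would be 60 (two-minute intervals) or 108 (five-minute intervals).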
The process of training the model consists of fine-tuning parameters such as the learning rate, the optimizer, the loss function, and the number of training epochs.
1 https://github.com/ranaroussi/yfinance, last access: June 2021
Fig. 1: Model Architecture composed of CNN, LSTMs, dense, and lambda layers.
The loss function used in this work is the Huber function, which tends to be less sensitive to outliers. The Huber loss is therefore able to perform well on intraday stocks, which are characterized by high variance and noisy behavior. For this model, the optimizer used is Stochastic Gradient Descent (SGD), an iterative method for optimizing an objective function with suitable smoothness properties. It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient with an estimate computed on a randomly selected subset of the data.
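For reference, the Huber loss is quadratic for small errors and linear beyond a threshold δ, which is what makes it robust to outliers. A minimal NumPy sketch, assuming the common default δ = 1.0 (the paper does not state its value):

```python
import numpy as np

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it."""
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return np.mean(np.where(err <= delta, quadratic, linear))

print(huber(np.array([0.0]), np.array([0.5])))  # 0.125 (quadratic region)
print(huber(np.array([0.0]), np.array([2.0])))  # 1.5   (linear region)
```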
A learning rate (LR) scheduler is used to find the optimum value for stochastic gradient descent by modulating the learning rate of SGD over time. In this regard, the model is trained for a small number of epochs while the LR is varied, and the resulting loss values are analyzed. Too high an LR will make the learning jump over minima, while too low an LR will either take too long to converge or get stuck in an undesirable local minimum.
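One common form of such a scheduler (an assumption; the paper does not give its exact schedule) increases the LR exponentially over a short run, so the loss can be inspected as a function of the LR:

```python
def lr_schedule(epoch, start=1e-8):
    """Exponentially increasing learning rate: multiplied by 10 every 20 epochs."""
    return start * 10 ** (epoch / 20)

# Over 100 epochs the LR sweeps from 1e-8 up to 1e-3, bracketing the
# 5e-7 to 5e-6 range reported in the paper.
print(f"{lr_schedule(0):.0e}")    # 1e-08
print(f"{lr_schedule(100):.0e}")  # 1e-03
```

In Keras, a function like this can be passed to a `tf.keras.callbacks.LearningRateScheduler` callback during the short exploratory training run.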
The metric used in this study for measuring the error between the values predicted by the model and the observed values is the mean absolute error (MAE). A characteristic of this metric is that it does not penalize large errors as heavily as the MSE or RMSE do. The mean absolute error is a common measure of forecast error in time series analysis, since the loss values tend to be proportional to the size of the error. Two types of prediction were performed over the test dataset to evaluate the performance of the model. The first evaluation consists of one-step forecasting over the whole test data with a stride of one. The second evaluation consists of four-step forecasting given an input sequence. Additionally, in order to test the performance of the model when forecasting extended periods of time, fifteen-step ahead forecasting was performed for the five-minute stock price intervals.
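The MAE metric and the one-step evaluation with a stride of one can be sketched as follows, using a trivial last-value predictor as a stand-in for the trained network (illustrative only, not the paper's model):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return np.mean(np.abs(y_true - y_pred))

def one_step_eval(series, window_size, model):
    """One-step-ahead forecasting over the series with a stride of one."""
    preds = [model(series[i:i + window_size])
             for i in range(len(series) - window_size)]
    return mae(series[window_size:], np.array(preds))

last_value = lambda window: window[-1]   # stand-in "model"
series = np.array([1.0, 2.0, 4.0, 7.0, 11.0])
print(one_step_eval(series, window_size=2, model=last_value))  # 3.0
```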
4 Experiments
This section provides a comparative analysis of different experiments: varying the window size, selecting the learning rate, one-step and four-step forecasting for two-minute and five-minute intervals, and fifteen-step ahead forecasting for five-minute intervals.
4.1 Materials
The implementation of this work was performed using Python 3 routines with the support of Keras and executed on Google Colab. Colab is a hosted Jupyter notebook service providing free access to computing resources, including GPUs, RAM, and disk. The resources available in Colab for this project were an Nvidia K80 GPU, 12.69 GB of RAM, and 68.35 GB of disk.
Fig. 2: One-step ahead forecasting (orange) vs. test dataset (blue) for (a) 2-minute intervals, window size 40; (b) 5-minute intervals, window size 46; (c) 2-minute intervals, window size 60; (d) 5-minute intervals, window size 72; (e) 2-minute intervals, window size 80; (f) 5-minute intervals, window size 108. A window size of 60 provides the best results in the case of two-minute intervals, and a window size of 108 for five-minute intervals.
The experiments show that for time series with two-minute intervals, the optimum window size consists of 60 data points. Figure 2a shows the one-step-ahead prediction over the test dataset with a window size of 40; in this plot the forecasted values tend to be below the actual values. Figure 2e shows the prediction using a window size of 80; in this case, the forecasted values tend to be above the actual ones. Figure 2c shows how, with a window size of 60, the forecasted values are quite close to the actual values. In the case of five-minute intervals, Figure 2f shows the one-step-ahead predictions using a window size of 108.
In order to define the learning rate for training, a learning rate scheduler provides an approximation of the optimum learning rate. This process consists of varying the learning rate over 100 training epochs. The experiments performed show that the optimum learning rate for both datasets lies between 5e−7 and 5e−6. The optimum learning rate depends on multiple factors, such as the number of training epochs, the model parameters, and the dataset.
4.5 Forecasting
Fig. 3: One-step ahead forecasting for (a) the 2-minute intervals dataset and (b) the 5-minute intervals dataset.
For the two-minute intervals dataset, a window size of 60 and 3500 training examples are used; for the five-minute intervals dataset, a window size of 108 and 2000 training examples. Figure 4 shows the model behavior for four-step ahead forecasting. The first row corresponds to the forecasted values for the 2-minute and 5-minute intervals datasets. The second row corresponds to the input sequence followed by the four predicted values.
Fig. 4: Four-step ahead forecasting: (a) forecasted values for 2-minute intervals; (b) forecasted values for 5-minute intervals; (c) input sequence followed by forecasted values for the 2-minute intervals dataset; (d) input sequence followed by forecasted values for the 5-minute intervals dataset.
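Multi-step forecasts of this kind are typically produced recursively: each prediction is appended to the input window and fed back into the model. A sketch with a hypothetical stand-in model:

```python
import numpy as np

def multi_step_forecast(model, window, steps):
    """Recursive multi-step forecasting: feed each prediction back in."""
    window = list(window)
    preds = []
    for _ in range(steps):
        nxt = float(model(np.array(window)))
        preds.append(nxt)
        window = window[1:] + [nxt]      # slide the window forward
    return preds

# Stand-in model: extrapolate the last observed difference.
trend_model = lambda w: w[-1] + (w[-1] - w[-2])

window = [10.0, 11.0, 12.0]
print(multi_step_forecast(trend_model, window, steps=4))  # [13.0, 14.0, 15.0, 16.0]
```

With the trained network in place of `trend_model`, `steps=4` and `steps=15` correspond to the four-step and fifteen-step experiments.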
In order to test the performance of the model when predicting longer periods of time, an additional experiment with fifteen-step ahead forecasting was performed. The results showed that the model trained for forecasting five-minute intervals time series provides much more accurate results than the model for two-minute intervals; Figure 5 therefore only shows the results of the model trained for forecasting 5-minute intervals. In this experiment, the same forecasting approach is applied to two different time series: the first corresponds to the stock prices from the last sixty days ending on March 24, and the second to the last sixty days ending on March 25, 2021. Figures 5a and 5b show the model performance for fifteen-step ahead forecasting. This experiment shows that the model trained on 5-minute intervals can forecast the trend of the non-stationary time series for at least 15 data points ahead.
Fig. 5: Fifteen-step ahead forecasting for March 24 and March 25, 2021. (a), (b) Forecasting (blue) vs. actual values (orange); (c), (d) forecasting (blue) vs. actual values and input sequence (orange).
5 Results
The results of the model for one-step, four-step, and fifteen-step ahead prediction are shown in Table 1. This table describes the most important parameters used to achieve these results and the average error rate returned by the models that performed these experiments. Table 2 shows the experimental results of varying the window size in order to face the drawbacks of dealing with non-stationary time series and to define a period of time long enough to capture a high diversity in price movements.
The experimental results show that the proposed architecture provides the best performance over short periods of time. The best performance was achieved by training the model with the two-minute intervals time series; in this case, the error rate of the model is 6.7. Using the five-minute intervals time series for training, the model yields an error rate of 9.94, which is higher than that of the model trained on two-minute intervals but also provides good performance over a longer period of time. Figures 3a and 3b show the forecasted values compared to the actual values for two- and five-minute intervals. In the case of four-step ahead forecasting, the model trained with the two-minute intervals time series also provides a better error rate: 3.49 for the two-minute intervals time series and 8.07 for the five-minute intervals time series.
Table 1: Performance results in terms of mean absolute error (MAE) for one-step
and multi-step forecasting.
Approach Interval Window Size Training Examples MAE
One-step ahead forecasting 2 minutes 60 3500 6.7
One-step ahead forecasting 5 minutes 108 2000 9.94
Four-step ahead forecasting 2 minutes 60 3500 3.49
Four-step ahead forecasting 5 minutes 108 2000 8.07
Fifteen-step ahead forecasting 5 minutes 108 2000 9.84
Although the models trained on two-minute intervals time series provide better performance, the results show that models trained with five-minute intervals time series forecast longer periods of time more precisely than the two-minute intervals approach. The results show that for fifteen-step ahead forecasting, the model is able to keep the error rate close to 9, while the error rate of models trained with two-minute intervals time series is affected drastically. These results suggest that models trained on longer-interval time series are able to forecast the trend in which the market data behaves for at least fifteen steps ahead.
Regarding the analysis of non-stationary time series, the experimental results show that defining an optimum window size is fundamental for increasing the model's accuracy. The selection of an optimum window size depends on the intervals of the time series. It was found that for two-minute intervals time series, the optimal window size is 60 data points; in the case of five-minute intervals time series, it is 108 data points.
The results show that deep architectures perform accurately when forecasting stock market prices. In this sense, one-step and four-step ahead forecasting can be applied to high-frequency trading strategies. Since high-frequency trading strategies focus on short-term positions, the values forecasted by the model can be used as an indicator for determining a position. An advantage of using deep learning models in high-frequency trading is the speed at which the model provides accurate forecasts; this enables the possibility of exploiting trading opportunities that may open up for only milliseconds or seconds.
Since the financial market is a highly dynamic system, the patterns and dynamics captured by the model will not always correspond to the current dynamics of the financial market. Therefore, a limitation of this approach is that, in order to maintain its performance, the model must be retrained constantly so that it can learn the current behavior of the financial market. In this sense, further research could focus on extending the variables provided to the model in order to identify more complex features and increase the model's accuracy.
6 Conclusion
This work has presented a deep learning model that combines a CNN layer with
two LSTM layers and three regular densely-connected NN layers for intraday
stock price forecasting. The model uses as input a batch dataset built from non-
stationary time series. Results presented in Table 1 show that the model can
perform one-step and multi-step ahead forecasting with a low error rate.
The proposed model uses a sliding window approach that shows the impor-
tance of selecting shorter sequences of data points to train the model and thus
helps to overcome the drawbacks when dealing with non-stationary time series.
In addition, the experimental results show that choosing an optimal window size
can improve the model’s accuracy.
Regarding the model architecture, it can be concluded that the combination of different deep architectures improves the capability of the model to identify interrelations within the time series, allowing it to forecast changes in stock market trends. Furthermore, the results show that deep architectures can be applied successfully to trading strategies due to the speed at which the model performs accurate forecasting of stock prices.
Although the proposed model has satisfactory forecasting performance, there is still room for improvement in future studies. For example, a multivariate dataset could allow the model to identify more complex features within the data and improve its accuracy. Moreover, since the model must be constantly retrained to learn the current behavior of the stock prices, this process becomes time-consuming, a problem that high-performance computing (HPC) techniques can address.
References
1. Altman, E.I., Marco, G., Varetto, F.: Corporate distress diagnosis: Comparisons
using linear discriminant analysis and neural networks (the Italian experience).
Journal of Banking & Finance 18(3), 505–529 (1994)
2. Ariyo, A.A., Adewumi, A.O., Ayo, C.K.: Stock price prediction using the ARIMA model. In: 16th International Conference on Computer Modelling and Simulation, pp. 106–112. IEEE (2014)
3. Atsalakis, G.S., Valavanis, K.P.: Surveying stock market forecasting techniques–
part ii: Soft computing methods. Expert Systems with Applications 36(3), 5932–
5941 (2009)
4. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H., et al.: Greedy layer-wise
training of deep networks. Advances in neural information processing systems 19,
153 (2007)
5. Bollerslev, T., Marrone, J., Xu, L., Zhou, H.: Stock return predictability and vari-
ance risk premia: statistical inference and international evidence. Journal of Fi-
nancial and Quantitative Analysis pp. 633–661 (2014)
6. Cakra, Y.E., Trisedya, B.D.: Stock price prediction using linear regression based on
sentiment analysis. In: International Conference on Advanced Computer Science
and Information Systems. pp. 147–154. IEEE (2015)
7. Chen, Y., Kang, Y., Chen, Y., Wang, Z.: Probabilistic forecasting with temporal convolutional neural network. ArXiv preprint arXiv:1906.04397 (2020)
8. Deboeck, G.J.: Trading on the edge: neural, genetic, and fuzzy systems for chaotic
financial markets, vol. 39. John Wiley & Sons (1994)
9. Durbin, J., Koopman, S.J.: Time series analysis by state space methods. Oxford
University Press (2012)
10. Ferreira, M.A., Santa-Clara, P.: Forecasting stock market returns: The sum of the
parts is more than the whole. Journal of Financial Economics 100(3), 514–537
(2011)
11. Franses, P.H., Ghijsels, H.: Additive outliers, garch and forecasting volatility. In-
ternational Journal of Forecasting 15(1), 1–9 (1999)
12. Heaton, J., Polson, N.G., Witte, J.H.: Deep learning in finance. ArXiv preprint
arXiv:1602.06561 (2016)
13. Huang, C.J., Yang, D.X., Chuang, Y.T.: Application of wrapper approach and com-
posite classifier to the stock trend prediction. Expert Systems with Applications
34(4), 2870–2878 (2008)
14. Huang, N.E., Wu, M.L., Qu, W., Long, S.R., Shen, S.S.: Applications of hilbert–
huang transform to non-stationary financial time series analysis. Applied Stochastic
Models in Business and Industry 19(3), 245–268 (2003)
15. Jensen, M.C.: Some anomalous evidence regarding market efficiency. Journal of
Financial Economics 6(2/3), 95–101 (1978)
16. Jia, H.: Investigation into the effectiveness of long short term memory networks
for stock price prediction. ArXiv preprint arXiv:1603.07893 (2016)
17. Kim, J.H., Shamsuddin, A., Lim, K.P.: Stock return predictability and the adaptive
markets hypothesis: Evidence from century-long us data. Journal of Empirical
Finance 18(5), 868–879 (2011)
18. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
19. MacDonald, J.M.: Demand, information, and competition: why do food prices fall
at seasonal demand peaks? The Journal of Industrial Economics 48(1), 27–45
(2000)
20. Malkiel, B.G.: The efficient market hypothesis and its critics. Journal of Economic
Perspectives 17(1), 59–82 (2003)
21. Malkiel, B.G.: A random walk down Wall Street: the time-tested strategy for successful investing (2021)
22. Mizuno, R.: The male/female ratio of fetal deaths and births in japan. The Lancet
356(9231), 738–739 (2000)
23. Murphy, J.J.: Technical analysis of the financial markets: A comprehensive guide
to trading methods and applications. Penguin (1999)
24. Ou, P., Wang, H.: Prediction of stock market index movement by ten data mining
techniques. Modern Applied Science 3(12), 28–42 (2009)
25. Roman, J., Jameel, A.: Backpropagation and recurrent neural networks in financial
analysis of multiple stock market returns. In: International Conference on System
Sciences. vol. 2, pp. 454–460. IEEE (1996)
26. Sapankevych, N.I., Sankar, R.: Time series prediction using support vector ma-
chines: a survey. IEEE Computational Intelligence Magazine 4(2), 24–38 (2009)
27. Sarantis, N.: Nonlinearities, cyclical behaviour and predictability in stock markets:
international evidence. International Journal of Forecasting 17(3), 459–482 (2001)
28. Schaller, R.R.: Moore’s law: past, present and future. IEEE Spectrum 34(6), 52–59
(1997)
29. Selvin, S., Vinayakumar, R., Gopalakrishnan, E., Menon, V.K., Soman, K.: Stock price prediction using LSTM, RNN and CNN-sliding window model. In: International Conference on Advances in Computing, Communications and Informatics, pp. 1643–1647. IEEE (2017)
30. Siami-Namini, S., Namin, A.S.: Forecasting economics and financial time series: ARIMA vs. LSTM. ArXiv preprint arXiv:1803.06386 (2018)
31. Situngkir, H., Surya, Y.: Neural network revisited: perception on modified Poincaré map of financial time-series data. Physica A: Statistical Mechanics and its Applications 344(1-2), 100–103 (2004)
32. Teixeira, L.A., De Oliveira, A.L.I.: A method for automatic stock trading com-
bining technical analysis and nearest neighbor classification. Expert Systems with
Applications 37(10), 6885–6890 (2010)
33. Tsay, R.S.: Analysis of financial time series, vol. 543. John Wiley & Sons (2005)
34. Valueva, M., Nagornov, N., Lyakhov, P., Valuev, G., Chervyakov, N.: Application
of the residue number system to reduce hardware costs of the convolutional neural
network implementation. Math. Comput. Simul. 177, 232–243 (2020)
35. Wang, B., Huang, H., Wang, X.: A novel text mining approach to financial time
series forecasting. Neurocomputing 83, 136–145 (2012)
36. White, H.: Economic prediction using neural networks: The case of ibm daily stock
returns. In: ICNN. vol. 2, pp. 451–458 (1988)
37. Wu, J., Wei, S.: Time series analysis. Human Science and Technology Press 20,
2018 (1989)
38. Yosinski, J., Clune, J., Nguyen, A.M., Fuchs, T.J., Lipson, H.: Understanding neu-
ral networks through deep visualization. ArXiv preprint arXiv:1506.06579 (2015)
39. Zhang, Q., Luo, R., Yang, Y., Liu, Y.: Benchmarking deep sequential models on
volatility predictions for financial time series. ArXiv preprint arXiv:1811.03711
(2018)