Markowitz Mean-Variance Portfolio Optimization With Predictive Stock Selection Using Machine Learning
Financial Studies
Article
Apichat Chaweewanchon and Rujira Chaysiri *
Abstract: With the advances in time-series prediction, several recent developments in machine learning
have shown that integrating prediction methods into portfolio selection is a great opportunity. In
this paper, we propose a novel approach to portfolio formation strategy based on a hybrid machine
learning model that combines convolutional neural network (CNN) and bidirectional long short-term
memory (BiLSTM) with robust input features obtained from Huber’s location for stock prediction
and the Markowitz mean-variance (MV) model for optimal portfolio construction. Specifically, this
study first applies a prediction method for stock preselection to ensure high-quality stock inputs for
portfolio formation. Then, the predicted results are integrated into the MV model. To comprehensively
demonstrate the superiority of the proposed model, we used two portfolio models, the MV model and
the equal-weight portfolio (1/N) model, with LSTM, BiLSTM, and CNN-BiLSTM, and employed them
as benchmarks. Between January 2015 and December 2020, historical data from the Stock Exchange
of Thailand 50 Index (SET50) were collected for the study. The experiment shows that integrating
preselection of stocks can improve MV performance, and the proposed method outperforms the
comparison models in terms of Sharpe ratio, mean return, and risk.
Keywords: portfolio optimization; mean-variance model; stock prediction; stock selection; machine learning
Citation: Chaweewanchon, Apichat, and Rujira Chaysiri. 2022. Markowitz Mean-Variance Portfolio Optimization with Predictive Stock Selection Using Machine Learning. International Journal of Financial Studies 10: 64. https://doi.org/10.3390/ijfs10030064
Academic Editors: Florian Ielpo and Sabri Boubaker
Received: 1 July 2022
Accepted: 2 August 2022
Published: 8 August 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Portfolio management is an analytical process of selecting and allocating a group of investment assets in which the portion of allocated investment is persistently changed to optimize expected return and risk tolerance (Markowitz 1952). The Markowitz mean-variance (MV) model, first developed in 1952, is the foundation of portfolio theory, which is extensively used and recognized in portfolio management (Sharpe and Markowitz 1989). However, based on the classical MV model, there are two main issues of concern for practical application. The first is that the MV relies on the expected return and risk of asset inputs to produce optimal portfolios for each level of expected return and risk (Beheshti 2018). As a result, by selecting good assets to put into the optimization process, the MV model may achieve improved performance (Mitra Thakur et al. 2018). Another issue is that many high-risk assets often return a large number of small-scale weights in the optimal portfolio, which makes them difficult to implement, particularly for individual investors (Ben Salah et al. 2018; Ortiz et al. 2021; Huang et al. 2021).
In recent years, machine learning has been proven to be advantageous in quantitative finance (Dixon et al. 2020); portfolio optimization is one of the most interesting problems in this regard. Normally, the MV model relies on historical data to generate the optimal portfolio and can only show the optimal portfolio as far as the data input. Therefore, a number of researchers have been applying machine learning for predicting return and volatility in the future (Henrique et al. 2019). Investors in the financial market must evaluate a variety of factors and perspectives to maximize their investment earnings (Rahiminezhad Galankashi et al. 2020). In this regard, including stock price prediction methods in portfolio optimization would be advantageous and profitable for investors (Kolm et al. 2014). Financial time-series prediction has long been a difficult field of study since financial market fluctuations are inherently unstable, complex, and dynamic (Paiva et al. 2019). However, several related studies claim that there is a pattern of asset price movement in financial time-series data and that this pattern may be used to forecast financial time-series data to some extent (Wan et al. 2020; Wang et al. 2020).
The main purpose of this study is to develop a portfolio-formation approach for individual investors in which a hybrid machine learning model that combines a convolutional neural network and bidirectional long short-term memory with robust input features (R-CNN-BiLSTM) is applied to predict future stock closing prices before using the MV model to form the optimal portfolio. In this regard, this study makes two main contributions that fill a gap in the existing literature. First, this study proposes a novel approach for portfolio formation that combines R-CNN-BiLSTM and MV (R-CNN-BiLSTM+MV; see Note 1). This method, which is suitable for capturing the pattern in financial time-series data, leverages robust input instead of the direct stock closing price for machine learning training. Three LSTM-based machine learning models (i.e., LSTM, BiLSTM, and CNN-BiLSTM) are used as comparison models, and their prediction accuracy is compared with that of the R-CNN-BiLSTM model to illustrate the superiority of the proposed method. Second, the method includes a stock selection process to ensure the quality of stock inputs, in which stocks with higher potential returns are selected as candidates before constructing different sizes of optimal portfolios using the MV model to determine the appropriate number of stocks in the optimal portfolio that provide the best return and risk for individual investors.
The remainder of the paper is organized as follows. Section 2 reviews existing studies on stock prediction and portfolio optimization, as well as empirical works that employ traditional statistical and machine learning methods to solve problems in stock prediction and selection. Section 3 briefly explains the underlying knowledge used in this study. Section 4 presents the detailed experimental process. Section 5 reports the experimental results. Finally, Section 6 addresses the work’s key findings, theoretical implications, and limitations.
2. Literature Review
Many studies have been performed on the process of stock selection and portfolio
optimization using various methods. Lozza et al. (2011) proposed the ex-post comparison
of asset preselection strategies using the joint Markovian behavior of the returns in relation
to market stochastic bounds to deal with large-scale portfolio selection with approximately
10,000 stocks from 14 different stock markets and discovered that Markovian strategies
outperformed the classical approach based on maximizing Sharpe ratio. Huang (2012)
proposed a stock selection model using support vector regression (SVR) and genetic
algorithms (GA). This model applies SVR to predict each stock’s future performance, with
GA utilized to optimize model parameters and input characteristics. The highest-ranked
stocks are then weighted equally to build the portfolio. The experimental results show that
the investment performance of the proposed model is better than the benchmarks. Nguyen
(2014) proposed a risk-measurement method for large-scale datasets that includes a stock-
preselecting procedure to remove low-diversification stocks before optimization using the
Sharpe ratio, Stutzer performance index, and the Omega measure. The experimental results
showed that the preselection process improved the performance and diversification of the
proposed portfolio. Rather et al. (2015) proposed a novel robust hybrid model for stock
return prediction. The model consists of two linear models, the autoregressive moving
average and the exponential smoothing models, and a non-linear model, a recurrent neural
network (RNN). The proposed model combined the results of these three prediction-based
methods with the objective to improve the accuracy of the model prediction. The optimiza-
tion model was then used to generate the model’s ideal weight using GA. The proposed
Int. J. Financial Stud. 2022, 10, 64 3 of 19
hybrid prediction model outperformed the RNN model in terms of the prediction accuracy.
Le Caillec et al. (2017) integrated several indicators for stock selection using analysis perfor-
mance evaluation and a behavioral uncertainty framework of human bias to calculate the
cumulative return (CR) of the portfolio. The combined methods, one probabilistic and one
possibilistic, focused on discriminating the common use of multiple technical indicators
(TI) to preselect stocks based on the probabilistic framework. Experiments showed that the
proposed model could raise portfolio performance. Fischer and Krauss (2018) implemented
the LSTM neural network to predict the directional movement of the constituent stocks
of the S&P 500 from 1992 to 2015. The study found that the portfolio based on the LSTM
outperformed the other machine learning models without a memory function (i.e., RF,
DNN, and LR).
These models only apply simple portfolio construction methods that ignore individual
stock risk, such as the equal-weight method, which resulted in a portfolio with unbalanced
risk and expected return. As a result, they are not suitable for individual investors in
practice. Due to the shortcomings of the models, some researchers have adopted the MV
model and used a quantitative method to improve investment decisions.
Tu and Zhou (2010) incorporated Bayesian priors with economic objective functions
in the MV model in which the priors were imposed on the solution rather than primitive
parameters. The study used monthly returns on the Fama-French 25 size and book-to-market
portfolios from January 1965 to December 2004. The results show that portfolio strategies
using objective-based priors outperformed the standard portfolio allocation. Brown and
Smith (2011) studied some heuristic trading strategies in portfolio optimization and devel-
oped a dual approach to examine the quality of the heuristics. The approach considered
several utility functions, i.e., transaction costs, constraint sets, and models of returns. Most
heuristic models performed very close to the optimal solution in the experiment, indicating
that the heuristics model could capture the tradeoff between improving the position of
assets and reducing the transaction costs. Li et al. (2015) proposed a specific portfolio
selection approach using background risk. The study compared a probabilistic portfolio
model with background risk to a probabilistic portfolio without background risk. The
experiment indicated that, for the same expected return, the variance of the portfolio with
background risk is larger than that of the portfolio without it. Bodnar et al. (2017) analyzed the weights
in the optimal portfolio using the Bayesian framework. This approach enabled investor’s
beliefs to be incorporated into portfolio selection. The study derived explicit formulas
for the posterior distributions of linear combinations of global minimum variance (GMV)
using different priors for the return of assets, specifically, the non-informative (diffuse) and
the informative (conjugate and hierarchical) priors. Then, the prior is suggested directly
for the weights of the portfolio. The numerical study results showed that the studies
performed well for the suggested prior. Katsikis et al. (2021) presented an online approach
for time-varying financial problems while removing the limitations of static methods. The
study found that time-varying mean-variance portfolio selection with transaction costs and
a cardinality constraint (TV-MVPSTC-CC) can be made more realistic by using technical
analysis to generate the expected return of a portfolio. Additionally, a beetle antennae
search (BAS) was implemented to automatically adjust the parameters, which resulted in
dramatically improved computational efficiency. The results demonstrated that BAS is
more suitable than the FA, GA, and DE algorithms for portfolio configurations in real-world
data. Khan et al. (2021) developed a meta-heuristic optimization called quantum beetle
antennae search (QBAS) and incorporated it into portfolio selection to generate the optimal
portfolio. The study applied QBAS on real-world data from the Shanghai Stock Exchange
50 Index (SSE50) and compared the performance to conventional algorithms (i.e., particle
swarm optimization (PSO), genetic algorithm (GA), and beetle antennae search (BAS)).
The experimental results showed that QBAS outperformed other algorithms in terms of
time-consumption, especially for extensive data. Although this method is neither computa-
tionally expensive nor time-consuming, the optimal portfolio still relies on historical data.
Khan et al. (2022) proposed a meta-heuristic algorithm called non-linear activated beetle
3. Background Knowledge
3.1. Mean-Variance Optimization
Markowitz (1952) proposed the mean-variance (MV) model and was awarded the
Nobel Prize in Economics in 1990. The MV model makes use of the mean and variance, which
are calculated from historical asset prices to quantify the expected return and risk of the
generated portfolio. The MV model assumes that the investor would like to either maximize
the expected return for a given level of risk or minimize risk for a given return (Kolm et al.
2014). However, in this study, we only show the optimization with minimum variance. The
MV model is described as follows:
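$$\text{Minimize}\quad \sigma^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} w_i w_j C_{ij} \tag{1}$$
$$\text{subject to}\quad \sum_{i=1}^{N} w_i E_i = \gamma \tag{2}$$
$$\sum_{i=1}^{N} w_i = 1 \tag{3}$$
$$w_i \geq 0 \tag{4}$$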
where N represents the total number of assets, which indicates the dimensionality of the
optimization in the portfolio; $w_i$ is the weight of asset i in the portfolio to be optimized;
$\sigma^2$ stands for the variance of the portfolio, which generally refers to portfolio risk; $C_{ij}$ is the
covariance of returns between assets i and j; $\gamma$ is the expected or target return; and $E_i$ is the
average return on an individual asset i.
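As a rough illustration of the minimum-variance problem in Equations (1)–(4), the following Python sketch evaluates the portfolio return and variance and searches a coarse grid of long-only weights for three assets. The grid search stands in for a proper quadratic programming solver, and the mean returns `E`, covariance matrix `C`, and target return are made-up values chosen only for the example; none of them come from the study.

```python
def portfolio_return(weights, mean_returns):
    # Left-hand side of constraint (2): sum of w_i * E_i
    return sum(w * e for w, e in zip(weights, mean_returns))


def portfolio_variance(weights, cov):
    # Objective (1): double sum of w_i * w_j * C_ij
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


def min_variance_grid(mean_returns, cov, target, steps=100, tol=1e-9):
    """Coarse grid search over three-asset weights (illustrative only)."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            # Weights are non-negative and sum to one by construction: (3), (4)
            w = (i / steps, j / steps, (steps - i - j) / steps)
            # Keep only candidates that meet the target return: (2)
            if abs(portfolio_return(w, mean_returns) - target) > tol:
                continue
            var = portfolio_variance(w, cov)
            if best is None or var < best[1]:
                best = (w, var)
    return best


# Hypothetical inputs, not taken from the paper
E = [0.10, 0.06, 0.02]
C = [[0.0400, 0.0060, 0.0000],
     [0.0060, 0.0225, 0.0000],
     [0.0000, 0.0000, 0.0025]]

weights, variance = min_variance_grid(E, C, target=0.06)
print(weights, variance)
```

A grid search scales poorly with the number of assets, so this is only meant to make the objective and constraints concrete for a tiny example.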
3.2. CNN
A convolutional neural network (CNN) is a kind of deep learning model for processing
grid-pattern data and is widely used in image processing and natural language processing. CNN can
also be applied to predict time-series data (Sadouk 2019). CNN can significantly improve the
quality of the learning models by reducing the number of parameters. CNN is mainly
composed of three types of layers: a convolution layer, a pooling layer, and a fully connected
layer (Albawi et al. 2017). The first two layers, the convolution layer and the pooling layer,
execute feature extraction, while the last layer, the fully connected layer, directs the extracted
features into output (Milošević and Racković 2019).
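The three layer types can be sketched on a one-dimensional price series as follows. This is a minimal illustration in plain Python, assuming a single fixed filter and hand-picked dense weights; in a real CNN these values are learned during training, and the study itself uses two-dimensional convolutional layers.

```python
def conv1d(series, kernel):
    # Convolution layer: slide the kernel over the series (no padding, stride 1)
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]


def relu(values):
    # Non-linearity commonly applied after a convolution
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    # Pooling layer: keep the maximum of each non-overlapping block
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(features, weights, bias):
    # Fully connected layer mapping the extracted features to one output
    return sum(f * w for f, w in zip(features, weights)) + bias


prices = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.4, 11.3]  # hypothetical closes
kernel = [-1.0, 0.0, 1.0]  # hypothetical filter (two-step price difference)

feature_map = relu(conv1d(prices, kernel))
pooled = max_pool1d(feature_map, size=2)
output = dense(pooled, weights=[0.5, 0.3, 0.2], bias=0.0)
print(feature_map, pooled, output)
```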
3.3. LSTM
Long short-term memory (LSTM) was proposed by Hochreiter and Schmidhuber
(Hochreiter and Schmidhuber 1997). The model is a class of RNN but has a function of
memory, which enables LSTM to retain information over a longer period of time than a standard RNN
(Fischer and Krauss 2018). The LSTM model filters incoming information through gate
structures composed of an input gate, a forget gate, and an output gate to improve and
maintain memory cells. LSTM is particularly popular in the field of financial time-series
prediction, since the model can effectively handle the redundancy in historical data (Gao
et al. 2021). The operation equation of LSTM is as follows:
Forget gate:
$$f_t = \sigma\left(w_f [h_{t-1}, x_t] + b_f\right) \tag{5}$$
Input gate:
$$i_t = \sigma\left(w_i [h_{t-1}, x_t] + b_i\right) \tag{6}$$
Output gate:
$$o_t = \sigma\left(w_o [h_{t-1}, x_t] + b_o\right) \tag{7}$$
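To make the gate equations concrete, the sketch below performs one LSTM step with scalar states in plain Python. The three gates follow Equations (5)–(7); the candidate state, cell update, and hidden-state update are not reproduced in this excerpt, so the code assumes the standard textbook LSTM formulation for them. The parameter dictionary `params` and its values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One scalar LSTM step; params maps a gate name to (w_h, w_x, b)."""
    def gate(name, activation):
        w_h, w_x, b = params[name]
        # Scalar form of w [h_{t-1}, x_t] + b
        return activation(w_h * h_prev + w_x * x_t + b)

    f_t = gate("f", sigmoid)        # forget gate, Equation (5)
    i_t = gate("i", sigmoid)        # input gate, Equation (6)
    o_t = gate("o", sigmoid)        # output gate, Equation (7)
    c_tilde = gate("c", math.tanh)  # candidate state (assumed standard form)
    c_t = f_t * c_prev + i_t * c_tilde  # cell update (assumed standard form)
    h_t = o_t * math.tanh(c_t)          # hidden state (assumed standard form)
    return h_t, c_t


# Hypothetical parameters and inputs
params = {"f": (0.1, 0.2, 0.0), "i": (0.3, -0.1, 0.0),
          "o": (0.2, 0.4, 0.0), "c": (0.5, 0.5, 0.0)}
h, c = 0.0, 0.0
for x in [0.01, -0.02, 0.015]:
    h, c = lstm_step(x, h, c, params)
print(h, c)
```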
3.4. BiLSTM
Bidirectional long short-term memory (BiLSTM) is an improved version of LSTM
with the ability to access both forward and backward directions of the input feature (Dong
et al. 2014). The key difference between BiLSTM and LSTM is that it uses two hidden
layers. BiLSTM was shown to be better compared to LSTM in terms of time-series data
prediction (Siami-Namini et al. 2019). The hidden layer output of BiLSTM has the activation
function for both forward and backward. The BiLSTM equations (Yang and Wang 2022) are
described as follows:
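$$\overrightarrow{h}_t = \sigma\left(W_{x\overrightarrow{h}}\, x_t + W_{\overrightarrow{h}\overrightarrow{h}}\, \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}}\right) \tag{11}$$
$$\overleftarrow{h}_t = \sigma\left(W_{x\overleftarrow{h}}\, x_t + W_{\overleftarrow{h}\overleftarrow{h}}\, \overleftarrow{h}_{t-1} + b_{\overleftarrow{h}}\right) \tag{12}$$
$$H_t = W_{\overrightarrow{h}y}\, \overrightarrow{h}_t + W_{\overleftarrow{h}y}\, \overleftarrow{h}_t + b_y \tag{13}$$
where $\sigma$ stands for the activation function of the model; W is the weight matrix; $W_{xh}$
is the weight of input (x) to the hidden layer (h); $H_t$ indicates the hidden layer input; and $b_x$
denotes the bias of the respective gates (x). The output is carried out by updating the forward
$\overrightarrow{h}_t$ and backward $\overleftarrow{h}_t$ structures.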
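The bidirectional idea in Equations (11)–(13) can be sketched with scalar states as below. This is a simplified illustration, not a full BiLSTM: each direction uses the plain recurrence of Equations (11) and (12) without gates, the backward pass is implemented by running the same recurrence over the reversed sequence, and all weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def recurrent_pass(xs, w_x, w_h, b):
    # h_t = sigma(w_x * x_t + w_h * h_prev + b), starting from h = 0
    h, states = 0.0, []
    for x in xs:
        h = sigmoid(w_x * x + w_h * h + b)
        states.append(h)
    return states


def bidirectional(xs, fwd, bwd, w_fy, w_by, b_y):
    h_fwd = recurrent_pass(xs, *fwd)               # forward states, Equation (11)
    h_bwd = recurrent_pass(xs[::-1], *bwd)[::-1]   # backward states, Equation (12)
    # Combine both directions at every time step, Equation (13)
    return [w_fy * f + w_by * b + b_y for f, b in zip(h_fwd, h_bwd)]


xs = [0.1, 0.5, 0.1]          # hypothetical inputs
fwd = bwd = (0.8, 0.3, 0.0)   # hypothetical (w_x, w_h, b) for each direction
outputs = bidirectional(xs, fwd, bwd, w_fy=0.5, w_by=0.5, b_y=0.0)
print(outputs)
```

Because each output combines a state that has seen the past with one that has seen the future, a symmetric input with identical weights in both directions yields a symmetric output.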
$$\mu_{m+1} = \mu_m + \frac{\sum_{i=1}^{n} \Psi\left(x_i - \mu_m\right)}{\sum_{i=1}^{n} \Psi'\left(x_i - \mu_m\right)} \tag{15}$$
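The iterative update in Equation (15) can be sketched in Python as follows. Equation (14) is not reproduced in this excerpt, so the code assumes the usual Huber Ψ function, which clips residuals at a threshold `k`; the default `k = 1.345`, the median starting value, and the sample data are assumptions made for illustration, and the residuals are left unscaled as in Equation (15).

```python
import statistics


def huber_psi(r, k):
    # Assumed Huber psi: identity inside [-k, k], clipped outside
    return max(-k, min(k, r))


def huber_psi_prime(r, k):
    # Derivative of the assumed psi: 1 inside the threshold, 0 outside
    return 1.0 if abs(r) <= k else 0.0


def huber_location(xs, k=1.345, tol=1e-10, max_iter=100):
    mu = statistics.median(xs)  # robust starting point (assumption)
    for _ in range(max_iter):
        num = sum(huber_psi(x - mu, k) for x in xs)
        den = sum(huber_psi_prime(x - mu, k) for x in xs)
        if den == 0:
            break  # every point lies outside the threshold; keep current mu
        step = num / den  # update term of Equation (15)
        mu += step
        if abs(step) < tol:
            break
    return mu


window = [10.0, 10.2, 9.9, 10.1, 25.0]  # hypothetical prices with one outlier
estimate = huber_location(window)
print(estimate, statistics.mean(window))
```

For this sample the estimate stays close to the bulk of the data, whereas the arithmetic mean is pulled up to about 13.04 by the single outlier.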
4. Experimental Process
4.1. Data Preparation
One of the greatest challenges in stock prediction is to capture the pattern of financial
time-series data between the past and future (Wang et al. 2020). Hence, it is easier to predict
stable stocks than volatile stocks. The Stock Exchange of Thailand SET 50 index (SET50)
consists of the topmost 50 large-capitalization companies in the stock market of Thailand,
which comprehensively reflect the overall situation of the stock market in Thailand. In this
study, the historical data of the stocks in SET50 are considered as the experimental data set
according to characteristics of stability and large scale of the stocks. Additionally, some
related studies have been conducted by selecting 21–49 stocks as the experimental data
set. Wang et al. (2020) randomly select 21 stocks from FTSE100 as the sample for machine
learning prediction process before optimization. Chen et al. (2021) randomly chose 24 stocks
from SSE50 as candidate assets in stock prediction process before forming a portfolio. Ma
et al. (2021) employed 49 stocks from SSE100 as a dataset for stock prediction using machine
learning before constructing a portfolio. Additionally, numerous researchers agree on
holding around 10 different stocks in the portfolio. For instance, Soeryana et al. (2017) chose
five different stocks in the optimal portfolio. Abrami and Marsoem (2021) constructed an
eight-asset portfolio. Therefore, our study randomly selected 25 stocks that have been fully
trading between 1 January 2015 and 30 December 2020 covering 1462 trading days from
the SET50 index and used closing price as the experimental data set, which is sufficiently
large for individual investors to build a portfolio (Zaimovic et al. 2021). The names of
these stocks are “Airports of Thailand” (AOT), “Bangkok Dusit Medical Services” (BDMS),
“Bangkok Expressway and Metro” (BEM), “Berli Jucker” (BJC), “BTS Group Holdings”
(BTS), “CP ALL” (CPALL), “Central Pattana” (CPN), “Delta Electronics Thailand” (DELTA),
“Total Access Communication” (DTAC), “Energy Absolute” (EA), “Siam Global House”
(GLOBAL), “Intouch Holdings” (INTUCH), “IRPC” (IRPC), “Indorama Ventures” (IVL),
“KCE Electronics” (KCE), “Krungthai Card” (KTC), “Land & Houses Public” (LH), “Minor
International” (MINT), “Muangthai Capital” (MTC), “Petroleum Authority of Thailand”
(PTT), “PTT Exploration and Production” (PTTEP), “PTT Global Chemical” (PTTGC),
“Ratch Group” (RATCH), “Srisawad Corporation” (SAWAD), and “The Siam Cement”
(SCC).
Table 1 presents summary statistics of close prices for the 25 stocks. The stocks with
the highest and lowest returns are DELTA and IRPC, respectively, while the stocks with the highest
and lowest standard deviations are SCC and LH, respectively.
The proposed model consists of three parts: data transformation, feature extraction, and price prediction.
First, the data transformation component converts the stock closing prices into the robust domain, which is the non-noisy version of the data. In this study, the direct stock closing price data are not suitable for machine learning training due to high standard deviations. Therefore, we need to transform the data to make them more suitable for the training process. Stock closing prices are divided into a small time-series size of 4 days, the so-called lag time. The lag times overlap 1 day with each other. The Huber’s location estimator of each lag time is calculated using Equations (14) and (15).
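The windowing step can be sketched as follows, assuming that "overlap 1 day" means consecutive 4-day windows share exactly one day, so each window starts three days after the previous one. The `estimator` argument is a placeholder: the study applies Huber's location to each window, while this self-contained sketch defaults to the median as a simple robust stand-in. The price series is made up.

```python
import statistics


def lag_windows(prices, size=4, overlap=1):
    # Consecutive windows of `size` days that share `overlap` days
    step = size - overlap
    return [prices[i:i + size]
            for i in range(0, len(prices) - size + 1, step)]


def robust_features(prices, estimator=statistics.median, size=4, overlap=1):
    # One robust location value per window; the median is only a stand-in
    # for the Huber location estimator used in the study
    return [estimator(w) for w in lag_windows(prices, size, overlap)]


closes = [10.0, 10.2, 9.9, 10.1, 10.4, 10.3, 10.6, 10.8, 10.7, 11.0]
windows = lag_windows(closes)
features = robust_features(closes)
print(windows)
print(features)
```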
Second, the feature extraction is performed using a CNN network. CNN has the ability to identify important factors in the data, which are called “features”. The purpose of this step is to preserve the historical data in the time-series data and feed them into BiLSTM. Therefore, the input data is converted by performing convolutional operations on the time steps of the time-series data using a sequence folding layer. In the next step, the two-dimensional convolutional layer is used to extract the data features. The filtering size of the first convolutional layer is 3 × 3, and the stride parameter is set to {a = 1, b = 1},
Figure 3. The framework of BiLSTM model.
4.1.2. Process of Training and Testing
One of the most important factors that determine the success of machine learning is the process of training and testing. In this study, we divided the close price of each chosen stock into training and testing sets according to the ratio of 80:20. Therefore, the first 1201 days of data are used in the training process, and the last 262 days are used as the testing set.
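A chronological split of this kind can be sketched as below. The key point is that the series is cut at a single time index rather than shuffled, so the test set contains only observations that come after the training period. The function name, the rounding rule, and the toy series are illustrative assumptions, not the authors' MATLAB code.

```python
def chronological_split(series, train_fraction=0.8):
    # Cut at one time index; earlier observations train, later ones test
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


days = list(range(10))  # stand-in for an ordered series of closing prices
train, test = chronological_split(days)
print(train, test)
```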
4.1.3. Hyperparameter Setting
The training dataset is passed to the proposed model for training. In this step, the
various hyperparameters of the neural network are specified. These include the number
of hidden layers, the number of epochs, and the size of batch inputs. Finding the optimal
hyperparameters is still a major challenge in the field of deep learning. In this study,
hyperparameters are set manually by trial and error with the selection of best parameters
from the experiment. The following is a detailed description of the hyperparameters and
their value settings.
1. The number of epochs: An epoch is one round of full training. In our experiments, we
set the number of epochs to 100 and performed our training. After training, we found
that all training stops at a maximum of 100 to 120 epochs. Therefore, 100 is selected as
the value for this hyperparameter.
2. The number of hidden layers: This is the number of layers between input and output
layers. For the CNN network, we set the hidden convolutional layer counts to 100,
100, and 50. In the BiLSTM network, we set these numbers to 128 and 16.
3. Learning rate: This value controls the step size of the weight updates and thus how
reliably the model converges during training. In our experiment, we set the learning rate to 0.0001.
Many researchers recommend using a learning rate lower than 0.01 (Hastie et al. 2017).
4. Optimizer: This is the optimization function used to obtain the best results. In our
work, we use the Adam optimizer, as it works well for LSTM based networks.
5. Loss function: Mean Squared Error (MSE) was used as the loss function. Our
implementation was written using MATLAB with GPU computing.
$$\hat{R}_t = \frac{\hat{p}_t - \hat{p}_{t-1}}{\hat{p}_{t-1}} \tag{17}$$
where $\hat{R}_t$ is the return of the stock at time t, while $\hat{p}_t$ is the predicted stock price at time t
and $\hat{p}_{t-1}$ is the predicted stock price at time t − 1.
As a result, we select the top (N) number of stocks with a higher potential return
according to the ranking order. Only the selected stocks are qualified for constructing the
portfolio in the next stage. The MV model is used in this process to build the optimal
portfolio with different proportions of asset allocation based on the qualified stocks. The
optimization process is performed using the MS Excel solver in which the minimum
variance is set as the objective function, and the weight of each asset is adjusted using the
Excel solver. Consequently, each of the optimal portfolios with the lowest variance is found
and used for analysis.
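The preselection step can be sketched in Python as follows. Predicted returns follow Equation (17); ranking stocks by their mean predicted return is an assumption made for this example, since the exact ranking criterion is not reproduced in this excerpt. The tickers and prices are hypothetical, and the 1/N weights shown at the end correspond to the equal-weight benchmark rather than the Excel-solver MV step.

```python
def predicted_returns(pred_prices):
    # Equation (17): (p_t - p_{t-1}) / p_{t-1} on the predicted prices
    return [(curr - prev) / prev
            for prev, curr in zip(pred_prices, pred_prices[1:])]


def select_top(pred_by_ticker, n):
    # Rank by mean predicted return (assumed criterion) and keep the top n
    scores = {}
    for ticker, prices in pred_by_ticker.items():
        returns = predicted_returns(prices)
        scores[ticker] = sum(returns) / len(returns)
    return sorted(scores, key=scores.get, reverse=True)[:n]


def equal_weights(tickers):
    # 1/N benchmark: the same weight for every selected stock
    return {t: 1.0 / len(tickers) for t in tickers}


# Hypothetical tickers and predicted prices
predictions = {
    "AAA": [100.0, 102.0, 104.0],
    "BBB": [50.0, 50.0, 49.0],
    "CCC": [20.0, 21.0, 22.0],
}
selected = select_top(predictions, n=2)
weights_1n = equal_weights(selected)
print(selected, weights_1n)
```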
5. Experimental Results
This section first presents the prediction performance of the LSTM, BiLSTM, CNN-
BiLSTM, and R-CNN-BiLSTM models. In the following, this study constructs different
sizes of portfolios using the classical MV model to compare the prediction result of different
machine learning models without a transaction fee.
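$$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left|\hat{p}_i - p_i\right| \tag{19}$$
$$\text{SMAPE} = \frac{1}{n}\sum_{i=1}^{n} \frac{\left|\hat{p}_i - p_i\right|}{\left|p_i\right| + \left|\hat{p}_i\right|} \tag{20}$$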
where p̂i refers to the predicted price, pi represents the true value, and n indicates the total
number of stocks used in the experiment.
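The error metrics can be computed as in the sketch below. MAE and SMAPE follow Equations (19) and (20) as written; the MSE definition is not reproduced in this excerpt, so the standard mean of squared errors is assumed. The sample prices are made up, and no percentage scaling is applied to SMAPE here.

```python
def mae(actual, predicted):
    # Equation (19): mean absolute error
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    # Standard mean squared error (assumed definition)
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    # Equation (20): absolute error relative to |actual| + |predicted|
    return sum(abs(p - a) / (abs(a) + abs(p))
               for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0]      # hypothetical true prices
predicted = [110.0, 190.0]   # hypothetical predicted prices
print(mae(actual, predicted), mse(actual, predicted), smape(actual, predicted))
```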
Mean absolute error (MAE): As can be seen from Tables 2 and 3, the average values of
MAE for the machine learning models, in descending order, are as follows: 1.7219 for LSTM,
1.5350 for CNN-BiLSTM, 1.5222 for BiLSTM, and 1.4582 for R-CNN-BiLSTM. Stock PTTGC
has the highest MAE of 4.7754, which is found for the LSTM model. The lowest MAE is for
the stock BEM, which was predicted using CNN-BiLSTM, with a value of 0.1651.
Mean square error (MSE): According to Tables 2 and 3, the average values of MSE
for each of the machine learning models are reported as follows: 2.9000 for CNN-BiLSTM,
2.5794 for BiLSTM, 2.5570 for LSTM, and 1.8081 for R-CNN-BiLSTM. The largest MSE is
9.3412, which is found for stock MTC using CNN-BiLSTM. The smallest MSE of 0.0523
was obtained for stock BEM using R-CNN-BiLSTM.
Symmetric mean absolute percentage error (SMAPE): From Tables 2 and 3, the average values
for each of the machine learning models are described from high to low as follows: 2.7197
for LSTM, 2.4589 for CNN-BiLSTM, 2.4229 for BiLSTM, and 2.3332 for R-CNN-BiLSTM.
The largest SMAPE is 13.487, which is associated with stock DELTA predicted using
CNN-BiLSTM. The lowest SMAPE is found for stock CPALL with the R-CNN-BiLSTM model, with a
value of 0.5713.
In conclusion, most of the R-CNN-BiLSTM results outperform the LSTM, BiLSTM,
and CNN-BiLSTM models for the stock prediction process in terms of MAE, MSE, and
SMAPE. Specifically, 14 stocks, BEM, BJC, CPALL, CPN, DELTA, EA, GLOBAL, IVL, KCE,
KTC, MINT, MTC, PTT, and PTTEP, which were predicted using R-CNN-BiLSTM, perform
the best in terms of all three metrics, followed by BiLSTM and CNN-BiLSTM. In addition,
a traditional single machine learning model, LSTM, performs the worst, with several
predictive errors in this experiment. Specifically, only the stock RATCH performs the best
in terms of MAE and SMAPE for the LSTM model. It can be seen that the proposed model,
R-CNN-BiLSTM, which uses robust input features instead of the direct stock closing price
in the machine learning training process, achieves a majority of better results than machine
learning models that use direct stock closing price input.
$$\text{Sharpe Ratio} = \frac{E_p - R_f}{\sigma} \tag{21}$$
where $E_p$ denotes the expected (average) return or mean return of the portfolio; $\sigma$ is the
standard deviation or risk of the portfolio; and $R_f$ refers to the risk-free rate. In this study,
we use a risk-free rate of 0.022, according to the 10-year Thai treasury rate.
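Equation (21) translates directly into a one-line function. In the sketch below the default risk-free rate of 0.022 is the value stated above, while the portfolio mean returns and standard deviations are hypothetical numbers used only to show how two portfolios would be compared.

```python
def sharpe_ratio(mean_return, std_dev, risk_free=0.022):
    # Equation (21): excess return per unit of risk
    return (mean_return - risk_free) / std_dev


# Hypothetical annualized figures for two portfolios
portfolio_a = sharpe_ratio(mean_return=0.10, std_dev=0.15)
portfolio_b = sharpe_ratio(mean_return=0.12, std_dev=0.25)
print(portfolio_a, portfolio_b)
```

In this made-up comparison the second portfolio has the higher mean return but the lower Sharpe ratio, because its additional return does not compensate for its additional risk.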
Figure 4. Annualized portfolio performance for different sizes of portfolios.
6. Discussion and Conclusions
6.1. Discussion and Key Findings
The paper aims to extend the existing literature on portfolio optimization with stock selection. The proposed prediction model is developed based on the use of robust statistics theory and the CNN-BiLSTM machine learning model to advance the MV model, which incorporates the advantages of machine learning into stock selection. This study has several findings.
First, this paper compares the predictive performance of LSTM, BiLSTM, and CNN-BiLSTM in stock prediction. The experimental results show that BiLSTM is superior to the other models, which indicates that it is more suitable for financial time-series prediction than the other machine models applied in this experiment, confirming the study by Wang et al. (2020) showing that traditional LSTM was superior in terms of prediction performance.
Second, this study improves the predictive accuracy of the CNN-BiLSTM by transforming the stock closing price into a robust input feature that can effectively reduce the error of the prediction before the model predicts the future price. After comparing the outcomes of the R-CNN-BiLSTM to LSTM, BiLSTM, and CNN-BiLSTM, it was discovered that the robust input is an appropriate input feature for the machine learning training process: it captures the patterns in financial time-series data and outperforms the comparison models that use the direct stock closing price as the input feature.
Finally, the result from the prediction process is incorporated into stock selection for
portfolio optimization; the stocks with higher returns calculated from predicted prices
are chosen to construct the optimal portfolio. The experimental results show that holding
five stocks is appropriate and realistic for individual investors, which differs from
the results of Wang et al. (2020) and Chen et al. (2021). Additionally, most of the results
of R-CNN-BiLSTM+MV, R-CNN-BiLSTM+1/N, CNN-BiLSTM+MV, CNN-BiLSTM+1/N,
BiLSTM+MV, BiLSTM+1/N, LSTM+MV, and LSTM+1/N are superior to both Random+MV
and Random+1/N in terms of the Sharpe ratio, mean return, and standard deviation, which
indicates the importance of selecting high-quality stocks in portfolio optimization. This
finding on stock preselection is consistent with the conclusions of Wang et al. (2020), Ta et al.
(2020), and Chen et al. (2021).
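The preselection-then-optimization pipeline summarized above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the return data are synthetic, `predicted` is a random stand-in for model predictions, and the MV step is simplified to the closed-form global minimum-variance portfolio, which allows short positions and ignores any target-return constraint. The helper names are hypothetical.

```python
import numpy as np

def preselect_top_k(predicted_returns, k=5):
    """Indices (ascending) of the k assets with the highest predicted return."""
    order = np.argsort(predicted_returns)[::-1]
    return np.sort(order[:k])

def min_variance_weights(returns):
    """Closed-form global minimum-variance weights, w = S^-1 1 / (1' S^-1 1).

    Simplification: only the budget constraint sum(w) = 1 is imposed,
    so weights may be negative (short sales allowed).
    """
    cov = np.cov(returns, rowvar=False)
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()

def sharpe_ratio(portfolio_returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = np.asarray(portfolio_returns, dtype=float) - risk_free
    return excess.mean() / excess.std(ddof=1)

# Synthetic data (assumption): 250 days of returns for 10 stocks.
rng = np.random.default_rng(0)
hist = rng.normal(0.0005, 0.01, size=(250, 10))
predicted = rng.normal(0.0005, 0.002, size=10)  # stand-in for model output

chosen = preselect_top_k(predicted, k=5)        # hold five stocks
w_mv = min_variance_weights(hist[:, chosen])    # MV-style weights
w_eq = np.full(len(chosen), 1.0 / len(chosen))  # 1/N benchmark

for name, w in (("MV", w_mv), ("1/N", w_eq)):
    port = hist[:, chosen] @ w
    print(name, "Sharpe:", round(sharpe_ratio(port), 4),
          "std:", round(port.std(ddof=1), 6))
```

On the sample used to estimate the covariance matrix, the minimum-variance weights cannot have a higher standard deviation than the 1/N weights; out-of-sample performance depends on how well the predictions and covariance estimates hold.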
Author Contributions: Conceptualization, A.C. and R.C.; Data curation, A.C.; Formal analysis, A.C.;
Investigation, A.C.; Methodology, A.C. and R.C.; Supervision, R.C.; Validation, R.C.; Visualization,
A.C.; Writing—original draft, A.C.; Writing—review and editing, R.C. All authors have read and
agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Dataset available at https://www.finance.yahoo.com, accessed on
30 January 2022.
Conflicts of Interest: The authors declare no conflict of interest.
Note
1 R-CNN-BiLSTM is used for stock prediction before optimizing the portfolio using the 1/N model.
References
Abrami, Rizkar, and Santoso Marsoem. 2021. Optimal portfolio Formation with Single Index Model Approach on Lq-45 Stocks on
Indonesia Stock Exchange. International Journal of Innovative Science and Research Technology 6: 1301–1309.
Albawi, Saad, Tareq A. Mohammed, and Saad Al-Zawi. 2017. Understanding of a convolutional neural network. Paper presented at
2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey, August 21–23.
Alizadeh, Meysam, Roy Rada, Fariborz Jolai, and Elnaz Fotoohi. 2010. An adaptive neuro-fuzzy system for stock portfolio analysis.
International Journal of Intelligent Systems 26: 99–114. [CrossRef]
Almahdi, Saud, and Steve Y. Yang. 2017. An adaptive portfolio trading system: A risk-return portfolio optimization using recurrent
reinforcement learning with expected maximum drawdown. Expert Systems with Applications 87: 267–79. [CrossRef]
Beheshti, Bijan. 2018. Effective stock selection and portfolio construction within US, International, and emerging markets. Frontiers in
Applied Mathematics and Statistics 4: 17. [CrossRef]
Ben Salah, Hanen, Jan G. De Gooijer, Ali Gannoun, and Mathieu Ribatet. 2018. Mean–variance and mean–semivariance portfolio
selection: A multivariate nonparametric approach. Financial Markets and Portfolio Management 32: 419–36. [CrossRef]
Bodnar, Taras, Stepan Mazur, and Yarema Okhrin. 2017. Bayesian estimation of the global minimum variance portfolio. European
Journal of Operational Research 256: 292–307. [CrossRef]
Brown, David B., and James E. Smith. 2011. Dynamic portfolio optimization with transaction costs: Heuristics and dual bounds.
Management Science 57: 1752–70. [CrossRef]
Chen, Wei, Haoyu Zhang, Mukesh Kumar Mehlawat, and Lifen Jia. 2021. Mean–variance portfolio optimization using machine
learning-based stock price prediction. Applied Soft Computing 100: 106943. [CrossRef]
Dixon, Matthew F., Igor Halperin, and Paul Bilokon. 2020. Machine Learning in Finance. Berlin and Heidelberg: Springer International
Publishing.
Dong, Li, Furu Wei, Chuanqi Tan, Duyu Tang, Ming Zhou, and Ke Xu. 2014. Adaptive Recursive Neural Network for Target-dependent
Twitter Sentiment Classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2:
Short Papers). Baltimore: Association for Computational Linguistics, pp. 49–54. [CrossRef]
Fischer, Thomas, and Christopher Krauss. 2018. Deep learning with long short-term memory networks for financial market predictions.
European Journal of Operational Research 270: 654–69. [CrossRef]
Fox, John, and Sanford Weisberg. 2019. An R Companion to Applied Regression. New York: SAGE Publications, Inc.
Gao, Yo, Rong Wang, and Enmin Zhou. 2021. Stock prediction based on optimized LSTM and GRU models. Scientific Programming
2021: 4055281. [CrossRef]
Hampel, Frank, Christian Hennig, and Elvezio Ronchetti. 2011. A smoothing principle for the Huber and other location M-estimators.
Computational Statistics & Data Analysis 55: 324–37. [CrossRef]
Hastie, Trevor, Jerome Friedman, and Robert Tibshirani. 2017. The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
Berlin and Heidelberg: Springer.
Henrique, Bruno M., Vinicius A. Sobreiro, and Herbert Kimura. 2019. Literature review: Machine learning techniques applied to
financial market prediction. Expert Systems with Applications 124: 226–51. [CrossRef]
Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. Long short-term memory. Neural Computation 9: 1735–80. [CrossRef] [PubMed]
Huang, Chien-Feng. 2012. A hybrid stock selection model using genetic algorithms and support vector regression. Applied Soft
Computing 12: 807–18. [CrossRef]
Huang, Ripeng, Shaojian Qu, Xiaoguang Yang, Fengmin Xu, Zeshui Xu, and Wei Zhou. 2021. Sparse portfolio selection with uncertain
probability distribution. Applied Intelligence 51: 6665–84. [CrossRef]
Huber, Peter J. 1964. Robust estimation of a location parameter. The Annals of Mathematical Statistics 35: 73–101. [CrossRef]
Jierula, Alipujiang, Shuhong Wang, Tae-Min OH, and Pengyu Wang. 2021. Study on accuracy metrics for evaluating the predictions of
damage locations in deep piles using artificial neural networks with acoustic emission data. Applied Sciences 11: 2314. [CrossRef]
Katsikis, Vasilios N., Spyridon D. Mourtas, Predrag S. Stanimirović, Shuai Li, and Xinwei Cao. 2021. Time-varying mean-variance
portfolio selection under transaction costs and cardinality constraint problem via beetle antennae search algorithm (BAS).
Operations Research Forum 2: 18. [CrossRef]
Khan, Ameer Hamza, Xinwei Cao, Vasilios N. Katsikis, Predrag Stanimirovic, Ivona Brajevic, Shuai Li, Seifedine Kadry, and Y. Nam.
2020. Optimal portfolio management for engineering problems using nonconvex cardinality constraint: A computing perspective.
IEEE Access 8: 57437–50. [CrossRef]
Khan, Ameer Tamoor, Xinwei Cao, Ivona Brajevic, Predrag S. Stanimirovic, Vasilios N. Katsikis, and Shuai Li. 2022. Non-linear
activated beetle antennae search: A novel technique for non-convex tax-aware portfolio optimization problem. Expert Systems
with Applications 197: 116631. [CrossRef]
Khan, Ameer Tamoor, Xinwei Cao, Shuai Li, Bin Hu, and Vasilios N. Katsikis. 2021. Quantum beetle antennae search: A novel technique
for the constrained portfolio optimization problem. Science China Information Sciences 64: 152204. [CrossRef]
Kolm, Petter N., Reha Tütüncü, and Frank J. Fabozzi. 2014. 60 years of portfolio optimization: Practical challenges and current trends.
European Journal of Operational Research 234: 356–71. [CrossRef]
Le Caillec, Jean-Marc, Alya Itani, Didier Guriot, and Yves Rakotondratsimba. 2017. Stock picking by probability–possibility approaches.
IEEE Transactions on Fuzzy Systems 25: 333–49. [CrossRef]
Lefebvre, William, Gregoire Loeper, and Huyen Pham. 2020. Mean-variance portfolio selection with Tracking Error Penalization.
Mathematics 8: 1915. [CrossRef]
Li, Ting, Weiguo Zhang, and Weijun Xu. 2015. A fuzzy portfolio selection model with background risk. Applied Mathematics and
Computation 256: 505–13. [CrossRef]
Lozza, Sergio Ortobelli, Enrico Angelelli, and Daniele Toninelli. 2011. Set-portfolio selection with the use of market stochastic bounds.
Emerging Markets Finance and Trade 47: 5–24. [CrossRef]
Ma, Yilin, Ruizhu Han, and Weizhing Wang. 2021. Portfolio optimization with return prediction using Deep Learning and machine
learning. Expert Systems with Applications 165: 113973. [CrossRef]
Markowitz, Harry. 1952. Portfolio selection*. The Journal of Finance 7: 77–91. [CrossRef]
Maronna, Ricardo A., Douglas Martin, and Víctor J. Yohai. 2006. Robust Statistics: Theory and Methods. Hoboken: John Wiley & Sons.
Maronna, Ricardo A., Douglas Martin, Victor J. Yohai, and Matías Salibián-Barrera. 2019. Robust Statistics Theory and Methods (with R).
Hoboken: John Wiley & Sons.
Mba, Jules Clement, Kofi Agyarko Ababio, and Samuel Kwaku Agyei. 2022. Markowitz mean-variance portfolio selection and
optimization under a behavioral spectacle: New empirical evidence. International Journal of Financial Studies 10: 28. [CrossRef]
Milošević, Nemanja, and Milos Racković. 2019. Classification based on missing features in deep convolutional neural networks. Neural
Network World 29: 221–34. [CrossRef]
Mitra Thakur, Gour Sundar, Rupak Bhattacharyya, and Seema Sarkar (Mondal). 2018. Stock portfolio selection using Dempster–Shafer
Evidence theory. Journal of King Saud University-Computer and Information Sciences 30: 223–35. [CrossRef]
Nguyen, Than Thi. 2014. Selection of the right risk measures for portfolio allocation. International Journal of Monetary Economics and
Finance 7: 135. [CrossRef]
Ortiz, Roberto, Mauricio Contreras, and Cristhian Mellado. 2021. Improving the volatility of the optimal weights of the Markowitz
model. Economic Research-Ekonomska Istraživanja, September 29. [CrossRef]
Paiva, Felipe D., Rodrigo T. Cardoso, Gustavo P. Hanaoka, and Wendel M. Duarte. 2019. Decision-making for financial trading: A
fusion approach of machine learning and portfolio selection. Expert Systems with Applications 115: 635–55. [CrossRef]
Rahiminezhad Galankashi, Masoud, Farimah Mokhatab Rafiei, and Maryam Ghezelbash. 2020. Portfolio selection: A fuzzy-ANP
approach. Financial Innovation 6: 17. [CrossRef]
Rather, Akhter M., Arun Agarwal, and V. N. Sastry. 2015. Recurrent neural network and a hybrid model for prediction of Stock returns.
Expert Systems with Applications 42: 3234–41. [CrossRef]
Sadouk, Lamyaa. 2019. CNN approaches for Time Series Classification. Time Series Analysis-Data, Methods, and Applications, November
5. [CrossRef]
Sharpe, William F., and Harry M. Markowitz. 1989. Mean-variance analysis in portfolio choice and capital markets. The Journal of
Finance 44: 531. [CrossRef]
Siami-Namini, Sima, Neda Tavakoli, and Akbar S. Namin. 2019. The performance of LSTM and BiLSTM in forecasting time series.
Paper presented at 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, December 9–12.
Sikalo, Mirza, Almira Arnaut-Berilo, and Azra Zaimovic. 2022. Efficient Asset Allocation: Application of game theory-based model for
superior performance. International Journal of Financial Studies 10: 20. [CrossRef]
Singh, Upma, Mohammad Rizwan, Muhannad Alaraj, and I. Alsaidan. 2021. A machine learning-based gradient boosting regression
approach for wind power production forecasting: A step towards Smart Grid Environments. Energies 14: 5196. [CrossRef]
Soeryana, E., N. Fadhlina, Sukono, E. Rusyaman, and S. Supian. 2017. Mean-variance portfolio optimization by using time series
approaches based on logarithmic utility function. IOP Conference Series: Materials Science and Engineering 166: 012003. [CrossRef]
Ta, Van-Dai, Chuan-Ming Liu, and Direselign A. Tadesse. 2020. Portfolio optimization-based stock prediction using long-short term
memory network in quantitative trading. Applied Sciences 10: 437. [CrossRef]
Tu, Juntu, and Guofu Zhou. 2010. Incorporating economic objectives into bayesian priors: Portfolio choice under parameter uncertainty.
Journal of Financial and Quantitative Analysis 45: 959–86. [CrossRef]
Wan, Yuqing, Raymond Y. Lau, and Yain-Whar Si. 2020. Mining subsequent trend patterns from financial time series. International
Journal of Wavelets, Multiresolution and Information Processing 18: 2050010. [CrossRef]
Wang, Wuyu, Weizi Li, Ning Zhang, and Kecheng Liu. 2020. Portfolio formation with preselection using deep learning from long-term
financial data. Expert Systems with Applications 143: 113042. [CrossRef]
Yang, Mo, and Jing Wang. 2022. Adaptability of financial time series prediction based on BiLSTM. Procedia Computer Science 199: 18–25.
[CrossRef]
Zaimovic, Azra, Adna Omanovic, and Almira Arnaut-Berilo. 2021. How many stocks are sufficient for equity portfolio diversification?
A review of the literature. Journal of Risk and Financial Management 14: 551. [CrossRef]