This study compares various machine learning approaches for portfolio optimization, focusing on regularization and cross-validation techniques. It highlights the limitations of traditional mean-variance optimization in high-dimensional settings and explores methods like LASSO and ridge regression to enhance prediction accuracy and stability. The research emphasizes the importance of machine learning models in improving portfolio diversification and dynamic asset allocation.


Machine Learning Approaches for Enhanced Portfolio Optimization: A Comparative Study of Regularization and Cross-Validation Techniques

Khalid Ul Islam
ICFAI Business School
Bangalore, India
[email protected]

Bilal Ahmad Pandow
School of Business
Bahrain Polytechnic
Manama, Bahrain
[email protected]

Abstract— One of the ways to optimize an investment portfolio is by way of diversification across multiple asset classes including stocks, bonds, mutual funds, etc. Machine Learning is becoming an important tool for portfolio optimization given the dynamics and the non-linearities inherent in financial markets, besides providing computational speed and accuracy. In the present study, we evaluate various machine learning models utilizing cross-validation and regularization for portfolio optimization. We begin with the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) and Autoregressive Integrated Moving Average (ARIMA) models and machine learning models like long short-term memory networks (LSTMs) and recurrent neural networks (RNNs). We also utilize the mean-variance and mean-C-VaR optimization within the Least Absolute Shrinkage and Selection Operator (LASSO) and ridge regression. Our study has important implications for investors and professional wealth managers including enhanced prediction accuracy, dynamic asset allocation, and portfolio diversification.

Keywords— Portfolio Optimization, Machine Learning, Financial Markets, Regularization, Cross Validation

I. INTRODUCTION

In his seminal work [1], Markowitz propounded a theoretical framework for optimum portfolio choice in the form of an optimal risk-return trade-off rather than the return maximization-only objective. The theory has significantly influenced both the theoretical foundations and practice in investment, risk management, capital allocation, indexing, and other related fields. One of the basic underpinnings of the theory lies in combining non-correlated assets and benefiting from the cancellation between their idiosyncratic fluctuations. However, the mean-variance optimization calls for estimating the variance-covariance matrix and taking its inverse. Given the problem of parameter uncertainty, averages are replaced by the sums over the sample period. If T (the number of periods observed for each security) is sufficiently large relative to N (the number of securities), the sample averages, as per the central limit theorem, converge toward the true averages. In that case the above procedure is well justified and does not lead to estimation errors. However, due to the existence of transaction costs and the non-stationarity of large-sample time-series data, the optimization has to be achieved with large N but limited T. The result is an estimation error, and thus the portfolio optimization problem in the form of mean-variance optimization fails to estimate the optimum weights.

This problem of large dimensionality and inadequate data availability regarding the system underlies every optimization problem, and it is present in portfolio optimization as well. The best that institutional investors can have for N and T is to be of the same order of magnitude. In portfolio optimization, the situation that has been called in [2] the ‘thermodynamic limit’ is where T and N approach infinity while maintaining a constant ratio between them. Many studies have been conducted to overcome the difficulty in optimizing portfolios in the mean-variance context of Markowitz, including the factor models approach [3], Bayesian estimators [4]–[8], and the shrinkage method [6], [7], in which the variance-covariance matrix is replaced by a weighted combination of the sample covariance and a structured matrix. Additionally, [9] has used the Lasso (L-1) approach, which entails placing a restriction on the total sum of the absolute values of the portfolio weights. This leads to the creation of a sparse portfolio, and the level of sparsity is contingent upon a tuning parameter.

In the present study, we will analyze and discuss various regularization methods developed over time on inverse problems. Regularization methods in the context of inverse problems are employed to mitigate issues related to instability, overfitting, or ill-posedness in the solutions. These techniques aim to impose constraints or penalties on the solution space, preventing the occurrence of undesirable behaviors or enhancing the stability and reliability of the results.

II. INSTABILITY AND REGULARISATION METHODS IN LITERATURE

A number of studies have attempted to analyze finance and ML, such as [10], [11]. Following [12], when institutions optimize portfolios of a large number of assets and their corresponding data points, the challenge in estimating the mean becomes so significant that it becomes difficult to effectively balance the trade-off between risk and return. Even when the expected return (sample mean) constraint is ignored, the problem of estimating correlations with a finite sample remains. If N > T, the problem of the singular covariance matrix

978-93-80544-51-9/24/©BVICAM, New Delhi, India 1440


Authorized licensed use limited to: SRM University Amaravathi. Downloaded on December 12,2024 at 13:51:20 UTC from IEEE Xplore. Restrictions apply.
occurs, which makes the portfolio selection meaningless. The only condition that leads to a solution to the portfolio problem is when T > N; however, when T is not much larger than N, portfolio optimization is not possible because of the instability of the solution to the optimization problem [13], [14].

A. The Inverse Problem

Generally, portfolio optimization revolves around the problem of optimizing the objective function subject to a set of given constraints. It involves either maximization of expected returns subject to the risk tolerance level and some other constraints, or minimization of some measure of risk subject to a minimum expected return and some other constraints. When a particular set is optimal, we say that there is no other set that would provide a better solution for the objective function.

Inverse problems work in reverse to what we have discussed above. Basically, in an inverse problem, the investor's utility function and a set of constraints are taken along with the investor's portfolio such that we solve for a probability distribution function that makes the portfolio optimal. In the seminal works of [13] and [14], the optimal allocation can be rewritten with respect to the tangency portfolio as:

$$\hat{x}^{*} = \frac{\hat{\beta}}{1_N' \hat{\beta}}$$

Where $\hat{\beta}$ is the ordinary least squares (OLS) estimate of β in the following regression

$$1 = \beta' r_{t+1} + u_{t+1} \quad \text{or} \quad 1_T = R\beta + u \tag{1}$$

Where R is the T×N matrix of asset returns.

β is the minimum least-squares solution for the following equation:

$$R\beta = 1_T$$

Which is a typical inverse problem.

As per [17], the sample covariance matrix is nearly singular for two reasons: the assets in the population may be highly correlated, or the quantity of assets within the population might be excessively high compared to the sample size [18], [19]. As such, inverse problems tend to be ill-posed and generally lack a unique solution.

B. Regularization Methods

The ridge regularization introduced in [20], and utilized for portfolio optimization in [21], [22], adds a diagonal matrix so that the multicollinearity which arises when the number of assets in the population is large relative to the sample size is taken care of. The ridge regularization approach, also called the L2 or Tikhonov regularization, resolves this problem by incorporating a penalty term to maximize the subsequent objective function:

$$\hat{\beta}_T = (R'R + \tau I)^{-1} R' 1_T$$

$$\text{Max:}\quad w^{T}\mu - \lambda\,(w^{T}\Sigma w)$$

In the equation, w represents the vector containing asset weights, μ stands for the vector representing expected returns, Σ denotes the covariance matrix of asset returns, and λ is the parameter for regularization.

Another alternative, called shrinkage, is the regularization of the covariance matrix of the asset returns or its inverse [5]–[7]. The name shrinkage is given to this method since it shrinks the sample covariance matrix towards the structured estimator. This approach estimates the variance-covariance matrix by a weighted average of the sample covariance estimate and a structured matrix. The shrinkage estimator is expressed as a convex linear combination

$$\delta F + (1 - \delta)\hat{\Sigma}$$

Where F is the structured matrix, $\hat{\Sigma}$ is the sample covariance estimate, and δ is the shrinkage constant, which takes a value between 0 and 1.

A more effective way to implement the shrinkage approach in [23] is to use the expected returns instead of the sample mean, and the former is established to be more appropriate than the latter.

Another regularization method involves imposing a constraint for enhancing the performance of the portfolio given the investment strategy. One way to do so is to impose the short-sale constraint [4] such that the estimation risk is reduced while estimating the weights of the optimum portfolio. In the portfolio optimization studies [9], [24], the short-sale constraint has been generalized by using the Least Absolute Shrinkage and Selection Operator (LASSO) regularization, also called the L1 regularization, introduced in [25]. This method generalizes the short-sale constraint by bounding the sum of absolute portfolio weights. The maximization function within the LASSO approach takes the following form:

$$\text{Max:}\quad w^{T}\mu - \lambda\,(w^{T}\Sigma\,|w|)$$

$$\text{Subject to:}\quad \sum |w| \le B$$

Where B is the constraint on the sum of absolute weights.

This approach is beneficial as it generates portfolios with varying degrees of sparsity defined by a tuning parameter. This can be particularly advantageous in the investment process as it minimizes transaction costs. In a study [26], an information-based improvement of the L1 regularization is proposed. Additionally, [27] provides a non-constrained approach to the optimization problem.

III. RISK IN OPTIMIZATION PROBLEMS

Traditionally, the mean-variance optimization in [1] assumes a single period only, and as such no focus is given to the investor's adjustment of their investments with the arrival of new information. Later, Mossin, Samuelson, and Merton [28] extended the portfolio optimization problem to a multi-period scenario. Primarily, the risk measure was taken as standard deviation or variance within the minimization of risk as an objective function, which is assumed to be constant over time.
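To make the ridge and shrinkage estimators of Section II-B concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The function names `ridge_tangency_weights` and `shrink_covariance`, the simulated return matrix, the scaled-identity choice for the structured matrix F, and the values of τ and δ are all hypothetical choices made for illustration. The example deliberately uses N > T, the case in which the sample covariance matrix is singular.

```python
import numpy as np


def ridge_tangency_weights(R, tau):
    """Ridge solution of R @ beta = 1_T, rescaled so the weights sum to one."""
    T, N = R.shape
    ones_T = np.ones(T)
    # beta_hat = (R'R + tau * I)^{-1} R' 1_T
    beta_hat = np.linalg.solve(R.T @ R + tau * np.eye(N), R.T @ ones_T)
    # x_hat = beta_hat / (1_N' beta_hat)
    return beta_hat / beta_hat.sum()


def shrink_covariance(S, delta):
    """Convex combination delta * F + (1 - delta) * S.

    F is taken to be a scaled identity matrix; this target is an assumption.
    """
    N = S.shape[0]
    F = (np.trace(S) / N) * np.eye(N)
    return delta * F + (1.0 - delta) * S


# Simulated returns with more assets than observations (N > T); illustrative only.
rng = np.random.default_rng(0)
T, N = 30, 40
R = 0.01 + 0.05 * rng.standard_normal((T, N))

# tau and delta are arbitrary illustrative values, not tuned.
w_ridge = ridge_tangency_weights(R, tau=0.1)

S = np.cov(R, rowvar=False)  # sample covariance, rank-deficient because N > T
Sigma_shrunk = shrink_covariance(S, delta=0.5)

print("rank of sample covariance:", np.linalg.matrix_rank(S), "of", N)
print("smallest eigenvalue after shrinkage:", np.linalg.eigvalsh(Sigma_shrunk).min())
print("ridge weights sum to:", w_ridge.sum())
```

In practice τ and δ would be chosen from the data, for example by the cross-validation procedures the paper discusses, rather than fixed by hand as in this sketch.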

2024 11th International Conference on Computing for Sustainable Global Development (INDIACom) 1441
However, the autoregressive conditional heteroscedasticity (ARCH) model given in [29] assumes variance to be a function of past values. To incorporate a large number of past lags within the conditional model of volatility, the generalized parameterization of ARCH, known as GARCH, was introduced (see [28]).

In its basic form, the GARCH model estimates the returns and the conditional volatility together as:

$$r_t = m_t + \sqrt{h_t}\,\varepsilon_t$$

$$h_{t+1} = w + \alpha (r_t - m_t)^2 + \beta h_t$$

$$h_{t+1} = w + \alpha h_t \varepsilon_t^2 + \beta h_t$$

Where $r_t$ is the return on an asset over time, $m_t$ is the mean return, $\varepsilon_t$ is the standardized unsystematic shock, $h_t$ is the conditional variance (so that $\sqrt{h_t}$ is the conditional volatility), and w here is a constant term, distinct from the weight vector w used earlier.

Most investors are worried about the downside risk measure, and their optimization problems are as such constrained by the minimization of the downside risk. One such measure that has been utilized in literature [31] is the value at risk (VaR). The VaR provides a better measure of risk in the portfolio optimization problem. Moreover, better measures of downside risk, namely the conditional VaR (C-VaR) and robust C-VaR (R-CVaR) as in [32], [33], have been established to be more stable estimates for the portfolio problem as compared to the conventional mean-variance optimization.

IV. CROSS-VALIDATION

Machine learning models undergo cross-validation to evaluate their performance, which aids in predicting their effectiveness with new data. Typically, two methods are employed:

One method involves data splitting, where the dataset is partitioned into a training set and a validation set. The model is trained on one set and then assessed on the other. This process is iterated multiple times, and the average performance is computed.

Another method is k-fold cross-validation. Here, the dataset is divided into k subsets, and the model is trained on k-1 subsets and tested on the remaining subset. This process is repeated k times, so that each subset serves once as the test set, and the average performance is determined.

V. MACHINE LEARNING MODELS FOR PORTFOLIO OPTIMIZATION

In a study for predicting net asset values (NAV) of mutual funds, [34] have established that non-linear techniques like artificial neural networks (ANN) showcase better model fitting capabilities. Different ML models including particle swarm optimization (PSO) [35], k-NN regression [36], combined attention mechanism, deep multilayer perceptron (DMLP), and bidirectional long short-term memory neural network (LSTM) [37] have been used for analyzing the technical indicators for stock price prediction. Overall, these ML models are better than other ML models. Others [38], [39], [40] have established that ML models like random forest (RF), support vector regression (SVR), LSTM neural network, and convolutional neural network (CNN) outperform linear regression in stock price predictions [41].

VI. CONCLUSION

Markowitz's seminal work on portfolio optimization revolutionized investment theory by introducing the concept of attaining an ideal equilibrium between risk and return. However, the mean-variance optimization technique faces constraints, notably when confronted with numerous assets and limited data points. The "thermodynamic limit" scenario, where the quantity of assets (N) and data points (T) are comparable, adds further complexity to portfolio optimization. Various regularization methods have been examined in scholarly literature to tackle this instability. Ridge regularization, also termed L2 regularization, combats multicollinearity through the addition of a penalty term to the objective function. Shrinkage methods, expressed as a convex linear combination, strike a balance between the sample covariance and a structured matrix. L1 regularization, known as LASSO, imposes restrictions on the sum of absolute weights, encouraging sparsity within the portfolio. These approaches aim to enhance stability and reliability, crucial for addressing estimation errors and ambiguous solutions. Additionally, the research explores the relevance of machine learning techniques like artificial neural networks and random forests in portfolio optimization, acknowledging their superior performance in forecasting net asset values and stock prices.

REFERENCES

[1] H. Markowitz, “Portfolio Selection,” J Finance, vol. 7, no. 1, pp. 77–91, 1952, doi: 10.1111/j.1540-6261.1952.tb01525.x.
[2] S. Still and I. Kondor, “Regularizing portfolio optimization,” New J Phys, vol. 12, Jul. 2010, doi: 10.1088/1367-2630/12/7/075034.
[3] E. J. Elton and M. J. Gruber, “Modern portfolio theory and investment analysis,” Language, vol. 14, no. 705, 2003.
[4] R. Jagannathan and T. Ma, “Risk Reduction In Large Portfolios: Why Imposing The Wrong Constraints Helps,” The Journal of Finance, vol. 58, no. 4, pp. 1651–1683, 2002. [Online]. Available: http://www.nber.org/papers/w8922
[5] O. Ledoit and M. Wolf, “Improved Estimation of The Covariance Matrix Of Stock Returns With An Application To Portfolio Selection,” Journal of Empirical Finance, vol. 10, no. 5, pp. 603–621, 2003.
[6] O. Ledoit and M. Wolf, “A well-conditioned estimator for large-dimensional covariance matrices,” J Multivar Anal, vol. 88, no. 2, pp. 365–411, 2004, doi: 10.1016/S0047-259X(03)00096-4.
[7] O. Ledoit and M. Wolf, “Honey, I Shrunk the Sample Covariance Matrix,” UPF Economics and Business Working Paper, pp. 691, 2003.
[8] V. Demiguel, L. Garlappi, F. J. Nogales, and R. Uppal, “A Generalized Approach to Portfolio Optimization: Improving Performance by Constraining Portfolio Norms,” Management Science, vol. 55, no. 5, pp. 798–812, 2009, doi: 10.1287/mnsc.1080.0986.
[9] J. Brodie, I. Daubechies, C. De Mol, D. Giannone, and I. Loris, “Sparse and stable Markowitz portfolios,” Proc Natl Acad Sci U S A, vol. 106, no. 30, pp. 12267–12272, Jul. 2009, doi: 10.1073/pnas.0904287106.
[10] K. Ali Ganai and B. Ahmad Pandow, “Understanding Financial Impact of Machine and Deep Learning in Healthcare: An Analysis,” in Applications of Machine Learning and Deep Learning on Biological

Data, Auerbach Publications, pp. 41–56, 2023, doi: 10.1201/9781003328780-3.
[11] B. A. Pandow, A. M. Bamhdi, and F. Masoodi, “Internet of Things: Financial Perspective and Associated Security Concerns,” International Journal of Computer Theory and Engineering, vol. 12, no. 5, pp. 123–127, 2020, doi: 10.7763/ijcte.2020.v12.1276.
[12] L. D. Brown and L. H. Zhao, “A Geometrical Explanation of Stein Shrinkage,” Statistical Science, vol. 27, no. 1, pp. 24–30, Feb. 2012, doi: 10.1214/11-STS382.
[13] S. Pafka and I. Kondor, “Noisy Covariance Matrices and Portfolio Optimization,” The European Physical Journal B-Condensed Matter and Complex Systems, vol. 27, pp. 277–280, 2002.
[14] S. Pafka and I. Kondor, “Noisy Covariance Matrices and Portfolio Optimization II,” Physica A: Statistical Mechanics and its Applications, vol. 319, pp. 487–494, 2003.
[15] J. D. Jobson and B. Korkie, “Statistical Inference in Two-Parameter Portfolio Theory with Multiple Regression Software,” 1983.
[16] M. Britten-Jones, “The Sampling Error in Estimates of Mean-Variance Efficient Portfolio Weights,” The Journal of Finance, vol. 54, no. 2, pp. 655–671, 1999.
[17] H. W. Engl and P. Kügler, “Nonlinear Inverse Problems: Theoretical Aspects and Some Industrial Applications,” in Multidisciplinary Methods for Analysis Optimization and Control of Complex Systems, Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 3–47.
[18] F. Masoodi, “Machine learning for classification analysis of intrusion detection on NSL-KDD dataset,” Turkish Journal of Computer and Mathematics Education, vol. 12, no. 10, pp. 2286–2293, 2021.
[19] Z. Bai, H. Liu, and W. K. Wong, “Enhancement of the applicability of Markowitz's portfolio optimization by utilizing random matrix theory,” Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, vol. 19, no. 4, pp. 639–667, 2009.
[20] A. E. Hoerl and R. W. Kennard, “Ridge Regression: Applications to Nonorthogonal Problems,” 1970.
[21] Hou-Duo Qi, “Geometric Characterization of Maximum Diversification Return Portfolio via Rao’s Quadratic Entropy,” SIAM Journal on Financial Mathematics, vol. 14, no. 2, 2023.
[22] D. Bertsimas and R. Cory-Wright, “A Scalable Algorithm For Sparse Portfolio Selection,” INFORMS Journal on Computing, vol. 34, no. 3, pp. 1489–1511, Oct. 2018, doi: 10.1287/ijoc.2021.1127.
[23] P. Jorion, “Bayes-Stein Estimation for Portfolio Analysis,” Journal of Financial and Quantitative Analysis, vol. 21, no. 3, pp. 279–292, 1986.
[24] J. Fan, J. Zhang, and K. Yu, “Asset Allocation and Risk Assessment with Gross Exposure Constraints for Vast Portfolios,” Dec. 2008. [Online]. Available: http://arxiv.org/abs/0812.2604
[25] R. Tibshirani, “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society Series B: Statistical Methodology, vol. 58, no. 1, pp. 267–288, 1996.
[26] B. Fastrich, S. Paterlini, and P. Winker, “Constructing optimal sparse portfolios using regularization methods,” Computational Management Science, vol. 12, no. 3, pp. 417–434, Jul. 2015, doi: 10.1007/s10287-014-0227-5.
[27] M. Ao, Y. Li, and X. Zheng, “Approaching Mean-Variance Efficiency for Large Portfolios,” vol. 32, no. 7, pp. 2890–2919, 2019, doi: 10.2307/48568741.
[28] F. S. Masoodi, I. Abrar, and A. M. Bamhdi, “An Effective Intrusion Detection System Using Homogeneous Ensemble Techniques,” Int. J. Inf. Secur. Priv., vol. 16, no. 1, pp. 1–18, 2021, doi: 10.4018/ijisp.2022010112.
[29] T. A. Teli, F. S. Masoodi, and A. M. Bahmdi, “HIBE: hierarchical identity-based encryption,” in Functional Encryption, Cham: Springer International Publishing, 2021, pp. 187–203.
[30] R. Engle, “GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics,” 2001.
[31] P. Krokhmal, J. Palmquist, and S. Uryasev, “Portfolio Optimization with Conditional Value-at-Risk Objective and Constraints,” 2001. [Online]. Available: http://www.gloriamundi.org/
[32] S. Hashemkhani Zolfani, H. Mehtari Taheri, M. Gharehgozlou, and A. Farahani, “An asymmetric PROMETHEE II for cryptocurrency portfolio allocation based on return prediction,” Appl Soft Comput, vol. 131, p. 109829, Dec. 2022, doi: 10.1016/j.asoc.2022.109829.
[33] A. S. Ahanger, S. M. Khan, and F. Masoodi, “Building an intrusion detection system using supervised machine learning classifiers with feature selection,” in Inventive Systems and Control: Proceedings of ICISC 2022, Singapore: Springer Nature Singapore, 2022, pp. 811–821.
[34] E. Priyadarshini, “A Comparative Analysis of Prediction Using Artificial Neural Network and Auto Regressive Integrated Moving Average,” vol. 10, no. 7, 2015. [Online]. Available: www.arpnjournals.com
[35] O. Hegazy, O. S. Soliman, and M. A. Salam, “A Machine Learning Model for Stock Market Prediction,” International Journal of Computer Science and Telecommunications, vol. 2, no. 12, 2013.
[36] M. Ananthi and K. Vijayakumar, “Retraction Note to: Stock market analysis using candlestick regression and market trend prediction (CKRM),” J Ambient Intell Humaniz Comput, vol. 14, no. 1, pp. 285–285, Apr. 2023, doi: 10.1007/s12652-022-04067-6.
[37] Q. Chen, W. Zhang, and Y. Lou, “Forecasting Stock Prices Using a Hybrid Deep Learning Model Integrating Attention Mechanism, Multi-Layer Perceptron, and Bidirectional Long-Short Term Memory Neural Network,” IEEE Access, vol. 8, pp. 117365–117376, 2020, doi: 10.1109/ACCESS.2020.3004284.
[38] G. V. Navin, “Big Data Analytics for Gold Price Forecasting Based on Decision Tree Algorithm and Support Vector Regression (SVR),” International Journal of Science and Research (IJSR), vol. 4, no. 3, pp. 2026–2030, 2015.
[39] I. Abrar, Z. Ayub, F. Masoodi, and A. M. Bamhdi, “A machine learning approach for intrusion detection system on NSL-KDD dataset,” in 2020 International Conference on Smart Electronics and Communication (ICOSEC), IEEE, Sep. 2020, pp. 919–924.
[40] S. Yasmeen, M. Meera, K. Gowthami, and D. Lakshmi, “Forecasting Stock Market Future Movement Direction: Supervised Machine Learning Algorithm,” International Journal of Research in Engineering, Science and Management, vol. 2, no. 2, 2019.
[41] Y. Ma, R. Han, and W. Wang, “Portfolio optimization with return prediction using deep learning and machine learning,” Expert Syst Appl, vol. 165, p. 113973, Mar. 2021, doi: 10.1016/j.eswa.2020.113973.
