Machine Learning Asset Allocation
$$\min_{\omega} \; \frac{1}{2}\,\omega' V \omega$$
$$\text{s.t.: } \omega' a = \bar{a}$$
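$$\begin{vmatrix} \dfrac{\partial^2 L[\omega,\lambda]}{\partial\omega^2} & \dfrac{\partial^2 L[\omega,\lambda]}{\partial\omega\,\partial\lambda} \\ \dfrac{\partial^2 L[\omega,\lambda]}{\partial\lambda\,\partial\omega} & \dfrac{\partial^2 L[\omega,\lambda]}{\partial\lambda^2} \end{vmatrix} = \begin{vmatrix} V' & -a' \\ a & 0 \end{vmatrix} = a'a \geq 0$$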
• The closed-form solution is mathematically correct but impractical, among other
reasons because of numerical instabilities
$$\hat{\omega}^* = \bar{a}\,\frac{\hat{V}^{-1}\hat{a}}{\hat{a}'\hat{V}^{-1}\hat{a}}$$
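The numerical instability can be illustrated with a small sketch. It assumes NumPy, a made-up 3-asset covariance matrix in which two assets are almost perfectly correlated, a vector of ones for the characteristic vector, and a target of 1; the function name `optimal_weights` and all inputs are hypothetical and not taken from the slides.

```python
import numpy as np

def optimal_weights(cov, a, a_bar=1.0):
    """Closed-form weights a_bar * V^{-1} a / (a' V^{-1} a)."""
    # Solve V x = a instead of forming the explicit inverse.
    x = np.linalg.solve(cov, a)
    return a_bar * x / (a @ x)

# Hypothetical inputs: assets 1 and 2 are almost perfectly correlated.
cov = np.array([[1.0, 0.999, 0.2],
                [0.999, 1.0, 0.2],
                [0.2, 0.2, 1.0]])
ones = np.ones(3)
w = optimal_weights(cov, ones)

# Change a single correlation by 0.01 to mimic estimation error.
cov_noisy = cov.copy()
cov_noisy[0, 2] = cov_noisy[2, 0] = 0.21
w_noisy = optimal_weights(cov_noisy, ones)

print("condition number:", np.linalg.cond(cov))
print("weights:", w)
print("weights after perturbation:", w_noisy)
```

In this toy case a 0.01 change in one input is enough to move the first two weights by more than one full unit each, which is the kind of instability the bullet refers to.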
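$$f[\lambda] = \begin{cases} \dfrac{T}{N}\,\dfrac{\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}}{2\pi\lambda\sigma^2} & \text{if } \lambda \in [\lambda_-, \lambda_+] \\ 0 & \text{if } \lambda \notin [\lambda_-, \lambda_+] \end{cases}$$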
• In fact, we can fit the function $f[\lambda]$ to the empirical distribution of the eigenvalues
to derive the implied $\sigma^2$
That will give us the variance that is explained by the random
eigenvectors present in the correlation matrix, and it will determine
the cut-off level $\lambda_+$, adjusted for the presence of non-random
eigenvectors.
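A minimal sketch of this fitting step follows, assuming NumPy only. It uses the standard Marcenko-Pastur bounds $\lambda_\pm = \sigma^2 (1 \pm \sqrt{N/T})^2$, which are not shown in the text above, and swaps in a simple grid search against a histogram in place of whatever fitting procedure the author uses. The function names, the grid, the bin count and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def mp_bounds(sigma2, q):
    """Support [lambda_minus, lambda_plus] of the Marcenko-Pastur law, q = T/N."""
    lam_minus = sigma2 * (1.0 - np.sqrt(1.0 / q)) ** 2
    lam_plus = sigma2 * (1.0 + np.sqrt(1.0 / q)) ** 2
    return lam_minus, lam_plus

def mp_pdf(lam, sigma2, q):
    """Marcenko-Pastur density f[lambda]; zero outside the support."""
    lam = np.asarray(lam, dtype=float)
    lam_minus, lam_plus = mp_bounds(sigma2, q)
    pdf = np.zeros_like(lam)
    inside = (lam >= lam_minus) & (lam <= lam_plus)
    x = lam[inside]
    pdf[inside] = q * np.sqrt((lam_plus - x) * (x - lam_minus)) / (2.0 * np.pi * x * sigma2)
    return pdf

def fit_sigma2(eigvals, q, grid=None, bins=50):
    """Grid search for the sigma2 whose density best matches the eigenvalue histogram."""
    if grid is None:
        grid = np.linspace(0.05, 1.5, 146)  # candidate values in steps of 0.01
    density, edges = np.histogram(eigvals, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    sse = [np.sum((mp_pdf(centers, s2, q) - density) ** 2) for s2 in grid]
    return float(grid[int(np.argmin(sse))])

# Hypothetical data: pure-noise returns, so the implied sigma2 should be near 1.
rng = np.random.default_rng(0)
T, N = 5000, 500
returns = rng.standard_normal((T, N))
corr = np.corrcoef(returns, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)

q = T / N
sigma2_hat = fit_sigma2(eigvals, q)
lam_plus_hat = mp_bounds(sigma2_hat, q)[1]
print("implied sigma2:", sigma2_hat, "cut-off lambda_plus:", lam_plus_hat)
```

With real data that contains signal, the fitted $\sigma^2$ would be expected to fall below 1, and eigenvalues above the fitted cut-off would be the candidates for non-random components.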
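$$\tilde{C}_1 = W \tilde{\Lambda} W'$$
$$C_1 = \left(\mathrm{diag}[\tilde{C}_1]\right)^{-1/2} \tilde{C}_1 \left(\mathrm{diag}[\tilde{C}_1]\right)^{-1/2}$$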
where $W_M$ and $\Lambda_M$ are the eigenvectors and eigenvalues associated with market
components (usually only one, but possibly more), and $W_D$ and $\Lambda_D$ are the
eigenvectors and eigenvalues associated with non-market components.
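The two de-noising equations for $\tilde{C}_1$ and $C_1$ can be sketched as follows, assuming NumPy. The text here does not define $\tilde{\Lambda}$, so this sketch assumes it replaces every eigenvalue below the cut-off $\lambda_+$ with the average of those eigenvalues, which preserves the trace. The function name `denoise_corr`, the simulated one-factor data, and the simplification $\sigma^2 = 1$ for the cut-off are all assumptions. It covers only the de-noising step, not the split into market and non-market components.

```python
import numpy as np

def denoise_corr(corr, lam_plus):
    """De-noise a correlation matrix by flattening eigenvalues below lam_plus."""
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    adjusted = eigvals.copy()
    noise = eigvals < lam_plus
    if noise.any():
        # Assumed form of Lambda-tilde: noise eigenvalues set to their average.
        adjusted[noise] = eigvals[noise].mean()
    c_tilde = eigvecs @ np.diag(adjusted) @ eigvecs.T
    # Rescale so the main diagonal is exactly 1 again.
    scale = 1.0 / np.sqrt(np.diag(c_tilde))
    return c_tilde * np.outer(scale, scale)

# Hypothetical data: one common factor plus idiosyncratic noise.
rng = np.random.default_rng(1)
T, N = 500, 100
factor = rng.standard_normal((T, 1))
returns = 0.5 * factor + rng.standard_normal((T, N))
corr = np.corrcoef(returns, rowvar=False)

lam_plus = (1.0 + np.sqrt(N / T)) ** 2  # cut-off assuming sigma2 = 1
c1 = denoise_corr(corr, lam_plus)

print("condition number before:", np.linalg.cond(corr))
print("condition number after: ", np.linalg.cond(c1))
```

In practice the cut-off would come from the fitted $\sigma^2$ rather than from the value 1 assumed here.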
De-noising is much more effective than shrinkage: the de-noised minimum variance portfolio incurs only 40.15% of the RMSE incurred by the minimum variance portfolio without de-noising. That is a 59.85% reduction in RMSE from de-noising alone, compared to a 30.22% reduction using Ledoit-Wolf shrinkage. Shrinkage adds little benefit beyond what de-noising contributes. The reduction in RMSE from combining de-noising with shrinkage is 65.63%, which is not much better than the result from using de-noising only.

The de-noised maximum Sharpe ratio portfolio incurs only 0.04% of the RMSE incurred by the maximum Sharpe ratio portfolio without de-noising. That is a 94.44% reduction in RMSE from de-noising alone, compared to a 70.77% reduction using Ledoit-Wolf shrinkage. While shrinkage is somewhat helpful in absence of de-noising, it adds no benefit in combination with de-noising. This is because shrinkage dilutes the noise at the expense of diluting some of the signal as well.
But because the trace of the correlation matrix is exactly 𝑁, that means
that an eigenvalue can only increase at the expense of the other 𝑁 − 𝐾
eigenvalues, resulting in a condition number greater than 1.
Accordingly, the greater the intra-cluster correlation is, the higher the
condition number becomes.
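The link between intra-cluster correlation and the condition number can be checked with a short sketch, assuming NumPy. It builds a hypothetical block-diagonal correlation matrix with two clusters of three securities each; the helper name `block_corr` and the chosen correlation levels are illustrative only.

```python
import numpy as np

def block_corr(n_blocks, block_size, rho):
    """Block-diagonal correlation matrix with intra-cluster correlation rho."""
    block = np.full((block_size, block_size), rho)
    np.fill_diagonal(block, 1.0)
    return np.kron(np.eye(n_blocks), block)

rhos = (0.0, 0.3, 0.6, 0.9)
traces, conds = [], []
for rho in rhos:
    corr = block_corr(n_blocks=2, block_size=3, rho=rho)
    eigvals = np.linalg.eigvalsh(corr)
    traces.append(eigvals.sum())               # stays equal to N = 6
    conds.append(eigvals.max() / eigvals.min())  # condition number
    print(f"rho={rho:.1f}  trace={traces[-1]:.2f}  condition number={conds[-1]:.2f}")
```

The trace stays at N for every value of the correlation, so the largest eigenvalues grow only as the remaining ones shrink, and the ratio between them rises with the intra-cluster correlation.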
NCO computes the minimum variance portfolio with 52.98% of Markowitz’s RMSE, i.e. a 47.02% reduction in RMSE. Ledoit-Wolf shrinkage is detrimental. Combining shrinkage and NCO yields a 15.30% reduction in RMSE, which is better than shrinkage but worse than NCO alone. The implication is that NCO delivers substantially lower RMSE than Markowitz’s solution, even for a small portfolio of only 50 securities, and that shrinkage adds no value.

NCO computes the maximum Sharpe ratio portfolio with 45.17% of Markowitz’s RMSE, i.e. a 54.83% reduction in RMSE. The combination of shrinkage and NCO yields an 18.52% reduction in the RMSE of the maximum Sharpe ratio portfolio, which is better than shrinkage but worse than NCO. Once again, NCO delivers substantially lower RMSE than Markowitz’s solution, and shrinkage adds no value. It is easy to test that NCO’s advantage widens for larger portfolios.
[Figure: flowchart of the simulation procedure. Recoverable labels: "Sim_Num = 0"; "Draw"; "De-noise"; "Optimize using"; "True"; "Compute estimation error for each method"; "Report best performing method for that particular set of inputs".]
www.QuantResearch.org