Simple Portfolio Optimization That Works
Abstract
We first show that the common “mean-variance” portfolio method fails because variance is a
horrible risk-measure for investing, and also because estimation errors may cause that method
to concentrate the portfolio in losing assets that are highly correlated. We then present a new
so-called “filter-diversify” method for portfolio optimization. The filtering process is trivial as
it only allows assets into the portfolio if they have sufficiently high estimated returns. The
diversification process is based on a new algorithm with several benefits: The algorithm is
fairly simple. It allows both positive and negative portfolio weights. It is extremely fast and
only takes a few milli-seconds to compute for a portfolio of 1000 assets. It is guaranteed to
converge to the optimal solution. It is very robust to estimation errors, because it will only
decrease the portfolio weights, so the worst that can happen is that it moves too much of the
portfolio into cash (or another low-risk asset of your choice). We perform numerous tests of
the new portfolio method on real-world stock-data from the USA, and find that the new method
performs extremely well on all performance metrics, and is very robust to estimation errors.
Simple Portfolio Optimization That Works! Page 2 / 164
Table of Contents
1 Introduction
2 Mean-Variance Optimization
3 Variance is NOT Risk!
4 Naive Forecasting
5 Conditional Forecasting
6 Random Walks
7 Filtering Methods
8 Diversification Method
9 Test Settings
10 Test A – Full Data Period (Omniscient)
11 Test B – Data Until 2010 (Omniscient)
12 Test C – Data From 2010 (Omniscient)
13 Test D – Noisy Returns (Robustness)
14 Test E – Noisy Correlations (Robustness)
15 Test F – Noisy Returns & Correlations (Robustness)
16 Test G – Parameter Tuning (Omniscient)
17 Future Research
18 Conclusion
19 Data & Computer Code
20 Bibliography
1 Introduction
In the 1950s a young man named Harry Markowitz invented a new method for optimizing investment
portfolios of multiple assets, which is known as “mean-variance” portfolio optimization, because it
maximizes the estimated mean return of the portfolio while simultaneously minimizing its variance.
Nearly 40 years later in 1990, Markowitz was awarded the Nobel Memorial Prize in Economic Sciences for his work. See
[Markowitz 1952] and [Markowitz 1959] for the original paper and book by Markowitz. An easier and
more concise description is found in [Luenberger 1998], and there are perhaps more recent descriptions
that are even easier to understand.
The “mean-variance” method is still the standard portfolio method used in academia to this day, nearly
70 years after it was first invented. According to Google Scholar, since the beginning of the year 2020
until the time of this writing in early October 2021, there were more than 3000 papers published with
the words “portfolio optimization” and Markowitz in the title or abstract. Unfortunately there are no
good survey papers for getting a quick overview of the state-of-the-art research in that field. But sampling
some of the papers from the most well-known researchers in the field shows that they are still using
the mean-variance method as their basic framework, and they still believe it is fundamentally sound.
In this paper we first show that the mean-variance method is inherently broken because variance is a
horrible risk-measure for investing, and because the mean-variance method optimizes both the
portfolio’s mean and variance simultaneously, so it may concentrate the portfolio in losing assets that
are highly correlated, if there are estimation errors in the mean returns and correlations for the assets.
We then present our new “filter-diversify” portfolio method which has two separate phases: First is the
filtering process which is almost trivial, as it only allows assets into the portfolio if the estimated future
returns are sufficiently high, and the portfolio weights are made higher when the estimated returns are
higher. These portfolio weights are then fed into the diversification process, which is based on a new
algorithm that we have dubbed “Hvass Diversification” for easy reference, and which has several
advantages: The algorithm is fairly simple to describe and implement. It is guaranteed to converge to
the optimal solution in just a few iterations. It has quadratic time-complexity so it is very fast and can
diversify a portfolio with 1000 assets in just a few milli-seconds on a normal household computer. It
greatly improves the portfolio on several performance metrics. And it only allows the portfolio weights
to decrease, so the worst that can happen is that it moves too much of the portfolio into cash.
The new portfolio method is tested extensively on many thousands of portfolios of varying sizes, that
are selected randomly from nearly 1000 U.S. stocks between the years 2007 and 2021. Although this
paper is nearly 170 pages long, most of the pages actually consist of these tests with plots and analysis
of the results. The portfolio method is first tested using the actual future 2-3 year average stock-returns
and the actual future 10-day stock-correlations. We call this “omniscient” testing and it shows that the
new portfolio method works extremely well when given correct input-data for the future stock-returns
and correlations. We then add various kinds of heavy noise to the future stock-returns and correlations
to test for robustness, and this shows the new portfolio method is very robust to estimation errors. The
diversification algorithm can even handle completely malformed correlation matrices without harming
the portfolio’s performance, probably because it only allows the portfolio weights to decrease.
2 Mean-Variance Optimization
Because academia still considers the mean-variance method to be the standard way of optimizing
investment portfolios, it is important to have a basic understanding of how it works. The academic
literature is often very dense and mathematical, and that is perhaps why the mean-variance method is
still believed to be correct after 70 years, because it is hard for people to connect the abstract
mathematics to the real world. In this section we will therefore give a brief explanation of the mean-
variance method that is hopefully a bit easier to understand, so we can see its flaws more clearly.
Say we want to invest in a portfolio of different assets, which can be stocks, bonds, currencies, real
estate, etc. If we are only considering a few assets, we often denote them by letters such as Asset A and
Asset B, or if we are considering many assets we may list them by numbers: Asset 1, Asset 2, etc.
The investment return on an asset is typically measured in percentages. For example, if we buy an asset
for $1000 and sell it for $1200 we have made $200 profit, which corresponds to a +20% gain because
$1200 / $1000 − 1 = +20%. If instead we sold the asset for only $800 we would have incurred a -20%
loss because $800 / $1000 − 1 = −20%. It is sometimes useful to write the returns as “plus one”, so in
the two examples above we would write 1.2 instead of +20% and 0.8 instead of -20%.
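The two return calculations above can be written as a short Python sketch. The function names `simple_return` and `to_plus_one` are illustrative assumptions and are not names used elsewhere in this paper.

```python
def simple_return(buy_price: float, sell_price: float) -> float:
    """Fractional gain (positive) or loss (negative) on an asset."""
    return sell_price / buy_price - 1.0


def to_plus_one(ret: float) -> float:
    """Convert a fractional return such as 0.2 into "plus one" form such as 1.2."""
    return 1.0 + ret


# The two examples from the text: buy at $1000, sell at $1200 or at $800.
gain = simple_return(buy_price=1000.0, sell_price=1200.0)
loss = simple_return(buy_price=1000.0, sell_price=800.0)

print(f"gain: {gain:+.0%}, plus one: {to_plus_one(gain):.1f}")
print(f"loss: {loss:+.0%}, plus one: {to_plus_one(loss):.1f}")
```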
We are interested in forecasting or predicting the future asset returns, so we can invest in the one asset
that has the highest future return. But our predictions about future returns are often inaccurate, and it is
possible that we could experience losses instead of gains. If we cannot somehow improve our estimates
of the future asset returns so as to make them more accurate, then we would instead like a method that
combines multiple assets into a portfolio, so as to maximize the portfolio’s return while minimizing its
risk. And that is exactly what the mean-variance method claims to do, as we will see in a moment.
But let us first explain a few basics about so-called random variables. Because the future asset returns
are uncertain, we say that they are random variables. Let us denote the future return on Asset A as a
random variable named $\mathrm{Return}_A$ and sometimes just as $R_A$. In a simple example, we could have
estimated that Asset A could have a future $\mathrm{Return}_A$ of either +20% or -20%.
In the real world, asset returns are rarely so simple that they only have two possible outcomes. The
outcomes are usually a continuous spectrum of possible returns with varying probabilities. This is
called the probability distribution of the random variable. Instead of listing all possible outcomes and
each of their probabilities, we usually summarize the probability distribution by only writing its centre-
point and the dispersion around this centre-point. This summarizes the whole probability distribution
into just two numbers. There are different ways of calculating these two numbers, and perhaps the most
common way is to use the mean as the centre-point and the standard deviation as the dispersion.
For a so-called discrete probability distribution, there is only a finite number of possible values for the
random variable. If they all have equal probability of occurring, then the mean of the random variable
is simply the sum of all its possible values, divided by the number of possible values. This is also called
the arithmetic mean and is often denoted $E[\mathrm{Return}_A]$ for the random variable $\mathrm{Return}_A$, or more briefly
as $\mu_A$ (pronounced “mu”).
For example, if the return on Asset A could be either -20%, -10%, 0%, +5%, +15%, or +25%, then its
arithmetic mean would be calculated as follows:
$$\mu_A = E[\mathrm{Return}_A] = \frac{1}{6} \cdot (-20\% - 10\% + 0\% + 5\% + 15\% + 25\%) = 2.5\% \tag{1}$$
If the possible outcomes have different probabilities of occurring, we would have to modify the above
formula to weigh each possible return according to its probability. And for a continuous probability
distribution we would have to calculate the arithmetic mean using an integral instead.
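As a minimal sketch of the discrete case, the function below reproduces Eq. (1) and also handles outcomes with unequal probabilities. The function name `mean_return` and the probabilities in `probs` are made up for illustration and do not come from the paper.

```python
def mean_return(outcomes, probabilities=None):
    """Arithmetic mean of discrete outcomes.

    If probabilities is None, all outcomes are assumed equally likely.
    """
    if probabilities is None:
        return sum(outcomes) / len(outcomes)
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * r for p, r in zip(probabilities, outcomes))


# The six equally likely outcomes for Asset A from Eq. (1).
returns_a = [-0.20, -0.10, 0.00, 0.05, 0.15, 0.25]
mu_a = mean_return(returns_a)

# Hypothetical unequal probabilities (NOT from the paper), for illustration only.
probs = [0.05, 0.10, 0.20, 0.30, 0.25, 0.10]
mu_a_weighted = mean_return(returns_a, probs)

print(f"equal probabilities:   {mu_a:.2%}")
print(f"unequal probabilities: {mu_a_weighted:.2%}")
```

For a continuous distribution the sum would be replaced by an integral, which this sketch does not attempt.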
Now that we have a centre-point for the probability distribution, let us define a measure of its spread or
dispersion around this centre-point. A common measure of dispersion is the so-called variance, which is
defined mathematically as follows:
$$\mathrm{Var}[\mathrm{Return}_A] = E\left[ (\mathrm{Return}_A - \mu_A)^2 \right] \tag{2}$$
This formula is very important for the mean-variance portfolio method, because it uses the variance as
its risk-measure. It is therefore important that you understand what this formula really means. The
inner-part of the formula $(\mathrm{Return}_A - \mu_A)^2$ is basically just the difference between each possible return
on Asset A and their average return, which is then squared to get a positive number. Then we take the
average of all those squared differences, and that is the variance of the random variable $\mathrm{Return}_A$. So the
variance is a measure of how far from the average all the possible return values are.
The reason that we are calculating the squared differences instead of just taking the absolute value (i.e.
removing the sign of the difference), is that it is mathematically convenient in other regards. But it also
means that the variance is no longer on the same scale as the original numbers. This is easily corrected
by taking the square-root of the variance, which gives us the so-called standard deviation:
$$\sigma_A = \mathrm{Std}[\mathrm{Return}_A] = \sqrt{\mathrm{Var}[\mathrm{Return}_A]}$$
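As a numerical check of the variance in Eq. (2) and its square-root, the sketch below uses the six equally likely outcomes for Asset A from Eq. (1). Because every outcome is assumed to be equally likely, the expectation is a plain average that divides by the number of outcomes; the variable names are illustrative.

```python
import math

# The six equally likely outcomes for Asset A from Eq. (1).
returns_a = [-0.20, -0.10, 0.00, 0.05, 0.15, 0.25]

# Mean of the outcomes, as in Eq. (1).
mu_a = sum(returns_a) / len(returns_a)

# Eq. (2): the average of the squared differences from the mean.
var_a = sum((r - mu_a) ** 2 for r in returns_a) / len(returns_a)

# The standard deviation is the square-root of the variance,
# which brings it back to the same scale as the returns.
std_a = math.sqrt(var_a)

print(f"mean = {mu_a:.2%}, variance = {var_a:.6f}, standard deviation = {std_a:.2%}")
```

The resulting standard deviation of roughly 14.93% is the value used for Asset A in the portfolio example later in this section.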
Let $\mathrm{Weight}_i$ denote the portfolio’s weight for asset number i. We typically assume that all the portfolio
weights sum to 1, so our entire capital is invested in different assets. If the weights sum to more than 1,
then we would be investing with borrowed money, and if the weights sum to less than 1, then some of
the portfolio would be held in cash (or other low-risk investments such as a short-term bond fund).
The return on the portfolio is then another random variable denoted $\mathrm{Return}_p$, which is just defined as
the weighted sum of the returns on the N individual assets in the portfolio:
$$\mathrm{Return}_p = \sum_{i=1}^{N} \mathrm{Weight}_i \cdot \mathrm{Return}_i \tag{6}$$
The portfolio’s mean return $\mu_p$ is very easy to calculate, as it is just the weighted sum of the mean
returns for the individual assets, because of how the arithmetic mean is defined mathematically:
$$\mu_p = E[\mathrm{Return}_p] = \sum_{i=1}^{N} \mathrm{Weight}_i \cdot E[\mathrm{Return}_i] = \sum_{i=1}^{N} \mathrm{Weight}_i \cdot \mu_i \tag{7}$$
But the portfolio’s variance is more complicated to calculate, because the definition of variance from
Eq. (2) contains a non-linearity in the form of the squared differences. Although it is not too difficult to
derive the variance for a portfolio of only two assets, it is a slightly long derivation so it has been
omitted here for brevity, but it would be a very useful exercise for you to do, so you understand why
the formula is defined this way. For a portfolio of N assets, the variance of their weighted sum is:
$$\sigma_p^2 = \mathrm{Var}[\mathrm{Return}_p] = \sum_{i=1}^{N} \sum_{j=1}^{N} \mathrm{Weight}_i \cdot \mathrm{Weight}_j \cdot \mathrm{Cov}[\mathrm{Return}_i, \mathrm{Return}_j] \tag{8}$$
The standard deviation of the portfolio’s return is simply the square-root of the variance:
$$\sigma_p = \mathrm{Std}[\mathrm{Return}_p] = \sqrt{\mathrm{Var}[\mathrm{Return}_p]} \tag{9}$$
The so-called covariance in Eq. (8) is defined as follows:
$$\mathrm{Cov}[\mathrm{Return}_i, \mathrm{Return}_j] = E\left[ (\mathrm{Return}_i - E[\mathrm{Return}_i]) \cdot (\mathrm{Return}_j - E[\mathrm{Return}_j]) \right] \tag{10}$$
So the covariance is defined quite similarly to the variance in Eq. (2), but it uses the returns for two
different assets instead of just one. A special case of the covariance formula is when we use it with a
single asset, in which case it equals the variance: $\mathrm{Cov}[\mathrm{Return}_i, \mathrm{Return}_i] = \mathrm{Var}[\mathrm{Return}_i]$.
Like the variance, the covariance also has a strange scale that is difficult for a human to interpret.
So we will often use the so-called correlation instead of the covariance. It basically just normalizes the
covariance so it ranges between -1 and 1 using this formula:
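$$\mathrm{Corr}[\mathrm{Return}_i, \mathrm{Return}_j] = \frac{\mathrm{Cov}[\mathrm{Return}_i, \mathrm{Return}_j]}{\mathrm{Std}[\mathrm{Return}_i] \cdot \mathrm{Std}[\mathrm{Return}_j]} \tag{11}$$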
The advantage of using the correlation is that it is always a value between -1 and 1, with 1 meaning that
the two random variables always move in the same direction (but they may not have the same values),
and a correlation of -1 meaning that the two random variables always move in opposite directions to each
other. Note that an asset always has correlation 1 with itself: $\mathrm{Corr}[\mathrm{Return}_i, \mathrm{Return}_i] = 1$. And the
correlation is symmetrical so that $\mathrm{Corr}[\mathrm{Return}_i, \mathrm{Return}_j] = \mathrm{Corr}[\mathrm{Return}_j, \mathrm{Return}_i]$. If the correlation is
zero, then the two random variables are said to be uncorrelated.
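The covariance in Eq. (10) and the correlation in Eq. (11) can be checked numerically for paired outcomes that are all equally likely. In the sketch below, the two versions of Asset B are hypothetical constructions chosen to produce the extreme correlations of 1 and -1; they are not data from the paper.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Eq. (10) for paired outcomes that are all equally likely."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Eq. (11): the covariance normalized by both standard deviations."""
    std_x = math.sqrt(covariance(xs, xs))  # Cov[X, X] equals Var[X]
    std_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (std_x * std_y)


# The six equally likely outcomes for Asset A from Eq. (1).
returns_a = [-0.20, -0.10, 0.00, 0.05, 0.15, 0.25]

# Hypothetical Asset B: same mean as A, but half the deviation in every scenario.
returns_b_same = [0.025 + 0.5 * (r - 0.025) for r in returns_a]

# Hypothetical Asset B that moves against A: the same outcomes in reverse order.
returns_b_opposite = list(reversed(returns_b_same))

print(f"same direction:     {correlation(returns_a, returns_b_same):+.3f}")
print(f"opposite direction: {correlation(returns_a, returns_b_opposite):+.3f}")
print(f"asset with itself:  {correlation(returns_a, returns_a):+.3f}")
```

Note that the first hypothetical Asset B has different return values than Asset A yet is still perfectly correlated with it, which illustrates that a correlation of 1 only requires the returns to move in the same direction.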
Using the correlation instead of the covariance in Eq. (8), and denoting the correlation as
$\rho_{i,j} = \mathrm{Corr}[\mathrm{Return}_i, \mathrm{Return}_j]$ (pronounced rho), and using the shorter notation for standard deviation
$\sigma_i = \mathrm{Std}[\mathrm{Return}_i]$ gives the following formula for the variance of a portfolio of weighted assets:
$$\sigma_p^2 = \mathrm{Var}[\mathrm{Return}_p] = \sum_{i=1}^{N} \sum_{j=1}^{N} \mathrm{Weight}_i \cdot \mathrm{Weight}_j \cdot \sigma_i \cdot \sigma_j \cdot \rho_{i,j} \tag{12}$$
The inner-part of this formula is $\mathrm{Weight}_i \cdot \mathrm{Weight}_j \cdot \sigma_i \cdot \sigma_j \cdot \rho_{i,j}$ which is just a product of the portfolio
weights for the two assets, their standard deviations, and their correlation. And Eq. (12) then calculates
the variance of the portfolio by summing over all the possible combinations of assets in the portfolio.
Let us now give an example of how to calculate the variance of a portfolio using Eq. (12). Consider the
same example as before in this section, with two assets A and B with equal means $\mu_A = \mu_B = 2.5\%$, and
the standard deviations $\sigma_A \simeq 14.93\%$ and $\sigma_B \simeq 7.465\%$. Let us say that the portfolio weights are
$\mathrm{Weight}_A = 0.4$ and $\mathrm{Weight}_B = 0.6$ so they sum to 1. And let us say that the asset returns are positively
correlated with a coefficient of 0.5. Because there are only two assets A and B, there are four possible
pairs of assets in the summation in Eq. (12): Assets A & B, Assets B & A, Assets A & A, and finally
Assets B & B. Let us calculate the inner-part of Eq. (12) for these four asset pairs:
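$$\begin{aligned}
A,B:\quad & \mathrm{Weight}_A \cdot \mathrm{Weight}_B \cdot \sigma_A \cdot \sigma_B \cdot \rho_{A,B} = 0.4 \cdot 0.6 \cdot 14.93\% \cdot 7.465\% \cdot 0.5 \simeq 0.001337 \\
B,A:\quad & \mathrm{Weight}_B \cdot \mathrm{Weight}_A \cdot \sigma_B \cdot \sigma_A \cdot \rho_{B,A} = 0.6 \cdot 0.4 \cdot 7.465\% \cdot 14.93\% \cdot 0.5 \simeq 0.001337 \\
A,A:\quad & \mathrm{Weight}_A \cdot \mathrm{Weight}_A \cdot \sigma_A \cdot \sigma_A \cdot \rho_{A,A} = 0.4 \cdot 0.4 \cdot 14.93\% \cdot 14.93\% \cdot 1 \simeq 0.003566 \\
B,B:\quad & \mathrm{Weight}_B \cdot \mathrm{Weight}_B \cdot \sigma_B \cdot \sigma_B \cdot \rho_{B,B} = 0.6 \cdot 0.6 \cdot 7.465\% \cdot 7.465\% \cdot 1 \simeq 0.002006
\end{aligned} \tag{13}$$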
We then sum these four intermediate calculations to get the variance of the portfolio using Eq. (12):
$$\sigma_p^2 = \mathrm{Var}[\mathrm{Return}_p] \simeq 0.001337 + 0.001337 + 0.003566 + 0.002006 \simeq 0.008246 \tag{14}$$
And finally we take the square-root to obtain the standard deviation of the portfolio’s return:
$$\sigma_p = \mathrm{Std}[\mathrm{Return}_p] = \sqrt{\mathrm{Var}[\mathrm{Return}_p]} \simeq \sqrt{0.008246} \simeq 9.1\% \tag{15}$$
If instead the portfolio weights were $\mathrm{Weight}_A = 0.2$ and $\mathrm{Weight}_B = 0.8$, and we go through all the same
calculations as above, then we would get a standard deviation of only 7.9% for the portfolio’s return.
The goal of the mean-variance method is to find the portfolio weights that maximize the mean while
also minimizing the variance (or standard deviation) of the portfolio’s returns. Because these two
objectives may be in conflict, there is no single choice of portfolio weights that is optimal on both
objectives, and we instead get a so-called Efficient or Pareto Frontier of mutually optimal compromises
between these two objectives. The proponents of the mean-variance method claim that this maximizes
the portfolio’s return while also minimizing the portfolio’s risk.
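To make the trade-off concrete, the sketch below scans a coarse grid of weights for the two-asset example from this section and reports the mean and standard deviation of each candidate portfolio. It is a brute-force illustration under stated assumptions (weights between 0 and 1 that sum to 1, no borrowing or short-selling), not an implementation of a mean-variance optimizer.

```python
import math

# Inputs from the two-asset example in this section.
MU_A = MU_B = 0.025
SIGMA_A, SIGMA_B, RHO = 0.1493, 0.07465, 0.5


def portfolio_mean(w_a):
    """Eq. (7) for two assets with weights w_a and 1 - w_a."""
    return w_a * MU_A + (1.0 - w_a) * MU_B


def portfolio_std(w_a):
    """Eq. (12) for two assets, followed by the square-root."""
    w_b = 1.0 - w_a
    var = ((w_a * SIGMA_A) ** 2 + (w_b * SIGMA_B) ** 2
           + 2.0 * w_a * w_b * SIGMA_A * SIGMA_B * RHO)
    return math.sqrt(var)


# Grid of candidate weights for Asset A: 0.0, 0.1, ..., 1.0 (an assumed resolution).
grid = [i / 10 for i in range(11)]
candidates = [(portfolio_std(w), portfolio_mean(w), w) for w in grid]

for std, mu, w in candidates:
    print(f"Weight_A = {w:.1f}: mean = {mu:.2%}, std = {std:.2%}")

# Tuples compare on their first element, so min() picks the lowest std.
min_std, mean_at_min, best_w_a = min(candidates)
print(f"lowest std on the grid: Weight_A = {best_w_a:.1f}, std = {min_std:.2%}")
```

Because both assets in this particular example have the same mean of 2.5%, every portfolio on the grid has the same mean return and only the standard deviation changes, so the frontier collapses to the single lowest-variance portfolio. With different means, the same scan would trace out a curve of compromises between the two objectives.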
When I first studied the mean-variance portfolio method, it puzzled me that it was using the variance as
a risk-measure. If you come from a school of thought where investment risk has to do with the future
prospects of a company, a stock’s valuation ratio, etc., it seems very odd to define risk from the spread
or dispersion of the return distribution, because that only measures the degree of uncertainty we have
about the future returns, and not whether some of those could be big losses. This criticism is formalized
in the next section with numerous examples that show the variance is in fact a very bad risk-measure.
Figure 2: Compare the probability of loss for normal return distributions with negative mean and
different standard deviations.
Figure 3: Compare the probability of loss for normal return distributions with positive mean and
different standard deviations.
Figure 4: Compare the probability of loss for normal return distributions with negative mean (blue)
and positive mean (yellow) and different standard deviations.
Figure 5: (1) The Efficient / Pareto Front for two assets. (2) Return distribution for the
"Minimum Risk" portfolio. (3) Return distribution for Asset A. (4) Return distribution for Asset B.
In the other extreme, we again assume that the values a and b remain fixed, so the future range of
P/Sales ratios, growth in Sales Per Share, and Dividend Yield do not change. But now the current
P/Sales ratio is assumed to be very large and approaching infinity. Then both the mean and standard
deviation for the future annualized return will approach zero, so the stock-return will be nearly a
complete loss and with very low standard deviation.
This again shows that the standard deviation is a horrible risk measure, because it is inversely
related to the probability and magnitude of loss. According to the formulas above, when the future
stock-return is very large, the standard deviation is also very large, and conversely, when the future
stock-return is nearly a complete loss, the standard deviation is very small and approaches zero. This is
the opposite of how the standard deviation should behave if it were a good risk measure for investing.
3.6 Summary
In this section we saw numerous examples that the variance (or standard deviation) is a horrible risk
measure for investing. The variance often measures the exact opposite of what we are interested in. The
variance does not measure if one asset is likely to perform better or worse than another (Section 3.1).
The variance does not measure the probability and magnitude of losses (Section 3.2). And the
"Minimum Risk" portfolio found with the mean-variance method can result in losses for all possible
outcomes, even though some of the individual assets may have only positive returns for all outcomes
(Section 3.3). All of these problems exist for "omniscient" return distributions where we know the
future return distributions with complete accuracy. In practice we will also make estimation errors
when trying to predict the future mean, variance and correlation of asset returns, and this may cause the
mean-variance method to concentrate the portfolio in losing assets that are highly correlated, so in
reality the mean-variance method is a “double-whammy” of risk-taking (Section 3.4).
The mean-variance method is horrible at optimizing investment portfolios, and it is absurd that after it
has existed for 70 years, academia has not only failed to properly analyze the method and rectify
its flaws, but has also awarded its author the highest academic accolades and prizes.
If you are still not convinced after having read this section, and you insist the mean-variance method is
a robust investment tool, then please contact my trusted business partner Dr. Augustus Kwembe as we
would love to offer you some "zero risk" investments!
4 Naive Forecasting
In the academic research literature, the performance of portfolio methods is often tested using “naive”
forecasts of the future stock-returns and their correlations, where the recent past is merely assumed to
continue into the future. We now show that such naive forecasts are highly inaccurate and make it
impossible to determine which portfolio method actually works best, because some methods may be
more robust than others to noisy estimates of the future stock-returns and their correlations.
Figure 6 shows the mean daily returns for three stocks. The top-plot shows it for AAPL, the middle-
plot shows it for BBBY, and the bottom-plot shows it for FL. In each of these sub-plots, the mean daily
returns are shown for rolling periods of 20, 60 and 250 days, which roughly correspond to 1, 3 and 12
months. The shortest periods of 20 days are shown in blue and are the most erratic. The periods of 60
days are shown in orange and are a bit more smooth. The periods of 250 days are shown in green and
are even more smooth. The important thing to note here, is that the moving averages for the daily
stock-returns are unstable for all of these three different window-lengths, which means that we cannot
simply make a naive forecast which tries to predict the future mean daily stock-return from the recent
past, as it will be highly inaccurate. If it were possible, these plots would have been much more stable.
Figure 7 shows the same three stocks and window-lengths, but for the standard
deviations of the daily stock-returns. For 20 day periods these are also highly unstable so the recent
past cannot be used to predict the near future. For 60 day periods and especially for 250 day periods the
lines are quite smooth, but they still fluctuate significantly over longer periods of time. Perhaps these
could be used as a rough forecast for the near future, but it depends on the portfolio method and how
robust it is to estimation errors. The method presented in this paper does not use the standard deviation.
Figure 8 shows the correlations between pairs of stocks: AAPL vs. BBBY, AAPL vs. FL, and FL vs.
BBBY. For 20 day periods the correlations are highly unstable and may even change sign in a short
period of time, so two stocks may be highly positively correlated in one 20 day period, but highly
negatively correlated shortly thereafter. For 60 day periods and especially 250 day periods the
correlations are a bit more stable, but they do still fluctuate significantly over longer periods of time.
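The rolling statistics described above can be sketched as follows. The data here is synthetic random noise standing in for two stocks, because the real stock-data is not reproduced in this section, and the helper names `rolling` and `rolling_corr` are illustrative assumptions. The sketch only shows how the rolling mean, standard deviation and correlation are computed for window-lengths of 20, 60 and 250 days.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def std(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))


def rolling(values, window, func):
    """Apply func to every trailing window of the given length."""
    return [func(values[i - window:i]) for i in range(window, len(values) + 1)]


def rolling_corr(xs, ys, window):
    """Correlation between xs and ys for every trailing window."""
    out = []
    for i in range(window, len(xs) + 1):
        a, b = xs[i - window:i], ys[i - window:i]
        ma, mb = mean(a), mean(b)
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / window
        out.append(cov / (std(a) * std(b)))
    return out


# Synthetic daily returns (NOT real market data), with a fixed seed for repeatability.
rng = random.Random(0)
stock_1 = [rng.gauss(0.0005, 0.02) for _ in range(1000)]
stock_2 = [0.5 * r + rng.gauss(0.0005, 0.02) for r in stock_1]

means_20 = rolling(stock_1, 20, mean)
corrs_20 = rolling_corr(stock_1, stock_2, 20)

for window in (20, 60, 250):
    means = rolling(stock_1, window, mean)
    stds = rolling(stock_1, window, std)
    corrs = rolling_corr(stock_1, stock_2, window)
    print(f"{window:3d} days: mean in [{min(means):+.4f}, {max(means):+.4f}], "
          f"std in [{min(stds):.4f}, {max(stds):.4f}], "
          f"corr in [{min(corrs):+.2f}, {max(corrs):+.2f}]")
```

A naive forecast would take the most recent value of one of these rolling statistics and assume it continues into the future, so the spread between the minimum and maximum of each series gives a rough sense of how far off such a forecast could be.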
Because the mean-variance portfolio method adjusts stock-weights based on your estimates for the
future mean stock-returns, their variance, and their correlations all at once, it is particularly vulnerable
to estimation errors. If your estimates for both the mean stock-returns and their correlations are grossly
incorrect, then the mean-variance portfolio may become concentrated in losing stocks that are also
highly correlated, which would likely perform much worse than a simple equal-weighted portfolio.
The portfolio method presented in this paper is much more robust, because it separates the calculation
of portfolio weights into two steps: 1) The filtering method which only allows stocks into the portfolio
if their estimated future returns are sufficiently high; and 2) the diversification method which lowers
those stock-weights to try and minimize the correlation between stocks. The diversification method will
never increase the stock-weights, so the worst that can happen, is that it causes the portfolio to under-
invest in some stocks because they were falsely estimated to be highly correlated with other stocks in
the portfolio.
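The two-step structure can be illustrated with a hedged sketch. The paper's actual filtering and diversification methods are described in later sections and are not reproduced here: the threshold rule, the proportional weighting, the ticker names, the estimated returns, and the hand-picked "diversified" weights below are all hypothetical. The sketch only demonstrates the stated property that the second step may lower the weights but never raise them, so any shortfall ends up in cash.

```python
def filter_weights(estimated_returns, min_return):
    """Hypothetical filter: keep assets whose estimated return exceeds min_return.

    The surviving assets get weights proportional to their estimated returns,
    normalized to sum to 1. Both rules are illustrative assumptions.
    """
    kept = {name: r for name, r in estimated_returns.items() if r > min_return}
    total = sum(kept.values())
    return {name: (kept[name] / total if name in kept else 0.0)
            for name in estimated_returns}


def only_decreased(before, after):
    """Check the stated property: no weight increased and none became negative."""
    return all(0.0 <= after[name] <= before[name] for name in before)


# Made-up estimated returns for four made-up tickers.
estimates = {"AAA": 0.12, "BBB": 0.08, "CCC": 0.02, "DDD": -0.05}
w_filter = filter_weights(estimates, min_return=0.05)

# Stand-in for the output of a diversification step. These numbers are
# hand-picked for illustration and are NOT computed by the paper's algorithm.
w_diversified = {"AAA": 0.45, "BBB": 0.30, "CCC": 0.0, "DDD": 0.0}

ok = only_decreased(w_filter, w_diversified)
cash = 1.0 - sum(w_diversified.values())

print(f"filter weights:      {w_filter}")
print(f"diversified weights: {w_diversified}")
print(f"weights only decreased: {ok}, held in cash: {cash:.0%}")
```

The sketch is restricted to non-negative weights for simplicity.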
Figure 6: Rolling windows of different lengths for the mean of daily returns. These are also known as
“moving averages”.
Figure 7: Rolling windows of different lengths for the standard deviation of daily returns.
Figure 8: Rolling windows of different lengths for the correlations of daily returns.
5 Conditional Forecasting
In the previous section we saw that “naive” forecasts of the mean daily stock-returns and their
correlations were highly inaccurate, because the recent past does not simply repeat into the future. You
need to have some kind of method that can make reasonable forecasts for the future, based on so-called
conditional probability distributions, so the future stock-returns and their correlations are estimated
from current observations of some predictive signals and variables. This is a very hard problem and if
there are errors in the forecasts, we cannot distinguish whether the portfolios performed poorly because
the portfolio method is deficient or because the stock-forecasts are inaccurate.
In this paper we will bypass this problem by using the actual future stock-returns averaged for 2-3 year
investment periods to determine the weights of stocks in the portfolio. This is of course a form of
cheating which we call “omniscient testing” because we now have perfect knowledge about the future
stock-returns. But this allows us to clearly test whether the portfolio method is working correctly, when
we are using the actual future returns to determine the stock-weights.
We could use even longer prediction periods than just 2-3 years of stock-returns, but this would make
our dataset quite limited for testing purposes, because it only contains about 14 years of data for most
stocks. We could also use shorter forecasting periods, maybe even just a few days into the future, which
would make the portfolios perform much better. But we will now show that short-term stock
forecasting is extremely difficult and probably impossible, while it is sometimes possible to make
reasonable forecasts for e.g. 2-3 year stock-returns by using certain predictor variables.
We often work with stock-returns that are “plus one” so instead of writing 0.2 for 20% we often write
the number 1.2, and similarly instead of writing 0.00073 for 0.073% we often write it “plus one” so it is
1.00073. The daily returns on the y-axis in Figure 9 are written “plus one” like this. The reason is that it
avoids the need to add and subtract 1 in many of the calculations, e.g. as done in Eq. (19) above.
The middle plot in Figure 9 shows the actual daily returns on the AAPL stock between the years 2007
and 2021. The x-axis again shows the P/Sales ratio and there is clearly no relation between the P/Sales
ratio and the daily returns, as they seem to be completely independent of each other. But the top-plot
showed that there was such a clear relation when considering 2-3 year average stock-returns, so why
does it not show up in the middle plot for the daily returns? The reason is that the daily returns are
much more volatile: the AAPL stock often goes up or down several percent in a single day, so the 2-3
year returns become tiny in comparison when they are converted into daily returns. You can see this by
the different scales on the y-axis for the top and middle plots in Figure 9. The bottom plot compares the
daily and 2-3 year returns in the same plot, where you can clearly see that the daily returns completely
dominate the 2-3 year annualized returns that have been converted into daily returns.
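The difference in scale can be checked with a small calculation that converts an annualized "plus one" return into its daily equivalent by taking a root. The choice of 250 trading days per year is an assumption made here to match the 250-day window used for roughly 12 months in the previous section, and the 3% daily move is a made-up example of a volatile day.

```python
TRADING_DAYS_PER_YEAR = 250  # assumption: roughly 12 months of trading days


def annual_to_daily_plus_one(annual_plus_one, days=TRADING_DAYS_PER_YEAR):
    """Daily "plus one" return that compounds to the given annual "plus one" return."""
    return annual_plus_one ** (1.0 / days)


# A 20% annualized return, written "plus one" as 1.2.
daily = annual_to_daily_plus_one(1.2)

# Hypothetical single-day move of +3%, written "plus one" as 1.03.
volatile_day = 1.03

print(f"daily equivalent of 1.2 per year: {daily:.5f}")
print(f"one volatile day:                 {volatile_day:.5f}")
print(f"ratio of the two daily gains:     {(volatile_day - 1.0) / (daily - 1.0):.0f}x")
```

A 20% annualized return therefore corresponds to a daily gain of only about 0.073%, which is dwarfed by a single-day move of a few percent.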
5.4 Summary
The purpose of this section is to demonstrate that for some stocks it is much easier to predict their long-
term returns than their daily returns. It is still not easy to predict long-term stock returns – but it is
certainly easier than predicting short-term returns. When you are trying to predict what a stock will do
tomorrow, you are essentially trying to predict what other people and computers will do next in the
stock-market, while they are trying to predict what you are going to do next. This is a very silly game,
like a group of near-sighted people inside a “house of mirrors”. When doing long-term investing, we
want to make investments that perform well over several years, while allocating our portfolio so we can
take advantage of short-term volatility. That is what a portfolio method should help us achieve.
Figure 9: The P/Sales ratio versus daily and 2-3 year future returns for the AAPL stock.
Figure 10: The P/Sales ratio versus daily and 2-3 year future returns for the FL stock.
Figure 11: The P/Sales ratio versus daily and 2-3 year future returns for the BBBY stock.
6 Random Walks
In the academic research literature, it is common to assume that stock-returns are IID, which
means that they are Independent and Identically Distributed random variables, so for each time-step the
random stock-returns are drawn from the exact same distribution. This is a convenient
assumption that allows for some elegant mathematical theory. Unfortunately, the previous section
showed that this assumption does not hold for long-term returns. Although it seems impossible to predict
the daily stock-returns from a predictor variable such as the P/Sales ratio, we can sometimes predict
long-term stock-returns when considering investment periods of a few years or more.
It follows from the belief that stock-returns are IID random variables, that the stock-prices are so-called
“random walks”. Finance professors often seem to believe that stocks don’t have any connection to the
real world, but are merely random variables that follow their own random course and are also somehow
correlated with each other. Once again, this is incorrect when considering long-term investing, where
the stock returns tend to follow the growth of the company’s Sales and Earnings Per Share, while the
short-term fluctuations do indeed seem to be random, caused by the daily “tug-of-war” between
speculators who think they can outsmart each other.
If stock-prices were completely random walks then they would be unbounded both towards zero and
infinity. This means that you would sometimes be able to buy shares in great companies such as Coca-
Cola or Microsoft for a tiny fraction of their annual Earnings Per Share, and at other times the stock-
prices might be higher than the total amount of money in circulation world-wide. We do not see this in
practice, because stock-prices tend to fluctuate within certain ranges of valuation ratios.
A much more useful way of thinking about the progression of stock-prices is as a “semi-random walk”
where the short-term volatility is indeed random, but over time the stock-prices tend to converge to
their “intrinsic value”, as determined by their change in Sales or Earnings Per Share and their change in
valuation ratio. The goal of a long-term investor is to make a good estimate of the stock’s “intrinsic
value”, and then allocate the portfolio to maximize expected future returns, while also being able to
take advantage of short-term market fluctuations.
Note that the random stock-returns in Eq. (21) above are IID because they are always drawn from the
same normal distribution with mean μ and standard deviation σ.
In the “semi-random walk” we instead use a new mean μt for each time-step t:
$$
\text{Random Return}_t \sim \mathcal{N}(\mu_t, \sigma^2) \tag{22}
$$
We then calculate the mean μ_t from the future “intrinsic value” of the stock at time-step t+K, divided by
the current random stock-price at time-step t, and take this fraction to the power of 1/K so as to get the
so-called geometric mean, which is the return required in each time-step, in order to go from the
current Random Price_t to the future Intrinsic Value_{t+K} in K time-steps, through compounded returns.
$$
\mu_t = \left( \frac{\text{Intrinsic Value}_{t+K}}{\text{Random Price}_t} \right)^{1/K} \tag{23}
$$
In practice we would have to estimate what the future Intrinsic Value_{t+K} is, but we can “cheat” in these
examples by simply using the actual future stock-price at that time-step, so the formula for μ_t becomes:
$$
\mu_t = \left( \frac{\text{Price}_{t+K}}{\text{Random Price}_t} \right)^{1/K} \tag{24}
$$
We can use random numbers for the stock-returns drawn from the standard-normal distribution N(0,1),
and then merely scale and shift those random numbers differently for the random walks and the semi-
random walks, by using the mean μ for the random walks and using μt for the semi-random walks, and
using the same σ for both. This allows for a completely fair comparison between the two methods,
because their random stock-prices are generated from the same underlying random numbers. See the
computer code in Section 19.
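The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration and not the code from Section 19. The function name `simulate_walks`, the clamping of t+K at the end of the price series, and the synthetic price series are all assumptions made here. Returns are written “plus one”, so each new price is the previous price multiplied by the random return.

```python
import random


def simulate_walks(prices, mu, sigma, K, seed=0):
    """Return (random_walk, semi_random_walk), each as long as `prices`."""
    rng = random.Random(seed)
    n = len(prices)
    # One shared set of standard-normal draws N(0, 1) for both walks.
    z = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]

    random_walk = [prices[0]]
    semi_walk = [prices[0]]
    for t in range(n - 1):
        # Completely random walk: the same mean mu in every time-step.
        random_walk.append(random_walk[-1] * (mu + sigma * z[t]))

        # Semi-random walk, Eq. (24): the mean return that would carry the
        # current random price to the actual price K time-steps ahead.
        target = prices[min(t + K, n - 1)]  # assumption: clamp at series end
        mu_t = (target / semi_walk[-1]) ** (1.0 / K)
        semi_walk.append(semi_walk[-1] * (mu_t + sigma * z[t]))
    return random_walk, semi_walk


# Hypothetical "actual" prices growing 0.05% per day; not real market data.
actual = [100.0 * 1.0005 ** t for t in range(1000)]
random_walk, semi_walk = simulate_walks(actual, mu=1.0005, sigma=0.02, K=250)
print(random_walk[-1], semi_walk[-1], actual[-1])
```

Because both walks reuse the same draws `z`, any difference between the two paths comes only from using μ versus μ_t.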
At the opposite end, the random walks also suggest the market-cap of AAPL could have been only
1/100 of its actual market-cap of around USD 1.5 trillion, so around USD 15 billion, which would also be
absurd considering the company had earnings of around USD 57 billion in the year 2020. This again
shows how absurd the notion of completely random walks is in the stock-market.
The semi-random walks which are shown as green lines in Figure 12, are much closer to the actual
stock-prices. This is a far more reasonable way of thinking about stock-market randomness, namely
that the stock-prices are indeed random in the short-term, because they arise from the daily “tug-of-
war” between speculators, but in the long-term the stock-prices tend to converge to their “intrinsic
value”. This results in the range of the semi-random walks being much more reasonable than the
completely random walks.
Of course, we are “cheating” in these examples by using the actual future stock-prices to generate the
semi-random walks. In reality you would have to estimate the future “intrinsic value” of the stock. But
the point is that you cannot just calculate the mean and standard deviation from the historical stock-
returns and assume the future stock-price is a completely random walk with the same parameters. You
need to make a reasonably good estimate of the future stock-returns based on one or more predictor
variables, such as demonstrated in Section 5 and studied in great detail in [Pedersen 2020]. And every
time the predictor variables change, you need to recalculate the estimates for the future stock-returns,
and then you also need to change the stock-weights in your portfolio.
6.4 Summary
In this section we saw that completely random walks where stock-returns for each time-step are IID,
generate random stock-prices that are often absurd, because they are several orders of magnitude above
or below the actual stock-prices. One might think that the problem is that real-world stock-returns are
actually not normal-distributed as assumed here, but the main problem is in fact that the random stock-
returns are assumed to be IID, as if the stock-prices don’t have any relation to the company in which
the stocks represent part-ownership. That is why a semi-random walk is much better at modelling the
short-term randomness that arises from the daily “tug-of-war” between speculators, combined with the
long-term reality of the actual company and how its sales and earnings grow or shrink over time.
Using this understanding of short-term randomness and long-term predictability, we want a portfolio
method that will only invest in stocks whose long-term returns are estimated to be sufficiently high,
while allocating the portfolio between multiple stocks, so as to take advantage of short-term volatility.
If stock A goes down tomorrow, we would ideally like to have another stock B that goes up, so we can
rebalance the portfolio by selling some of stock B and buying more of stock A, to take advantage of the
lower price of stock A. We cannot do this if both stocks A and B go down at the same time.
Figure 12: Random and Semi-Random Walks for the AAPL stock.
Figure 14: Random and Semi-Random Walks for the BBBY stock.
7 Filtering Methods
In this section we present the first part of our new portfolio method, which is a very basic and almost
trivial filtering process, so we only allow assets into the portfolio, if the assets are estimated to have
sufficiently high future returns. This should be an obvious requirement, because why would you want
to include assets in your portfolio, if you have estimated they will result in a loss? That seems moronic!
But we saw in Section 3.3 that the mean-variance method can do just that, as it may include assets in
the portfolio that are guaranteed to result in a loss, if the assets have negative correlations with other
assets in the portfolio, because that would lower the variance (or standard deviation) of the portfolio’s
return. We avoid this by splitting our new portfolio method into two parts: The filtering part that we
describe in this section, and the diversification part that we describe in the next Section 8.
$$
\text{Weight}_i =
\begin{cases}
\text{Weight}_{\max} & \text{if } \mu_i \ge \mu_{\min} \\
0 & \text{else}
\end{cases}
\tag{25}
$$
For example, we might set the threshold to μ_min = 10%, so if we have estimated that Asset 1 has a
future mean return of μ_1 = 17.5% then it is sufficiently high to be included in the portfolio. We then set
its portfolio weight to Weight_max which could be e.g. 5%, in which case we can have a maximum of 20
assets in the portfolio before the sum of their weights would exceed 100% of the portfolio’s capacity,
unless we invest with borrowed money, which we will not consider in this paper. So we need to ensure
the portfolio weights sum to 100% or less. We do this by first calculating their sum:
$$
\text{Weight}_{\text{sum}} = \sum_{i=1}^{N} \text{Weight}_i \tag{26}
$$
If the sum is greater than 1 (or 100%), then we divide all the portfolio weights by the sum. This is
also a simple if-else statement in a computer program, which can be written mathematically as follows:
$$
\text{Weight}_i =
\begin{cases}
\text{Weight}_i / \text{Weight}_{\text{sum}} & \text{if } \text{Weight}_{\text{sum}} > 1 \\
\text{Weight}_i & \text{else}
\end{cases}
\tag{27}
$$
Once we have estimated the future mean returns for all assets, and we have used the above formulas to
determine their portfolio weights, we have a portfolio of assets that have all been estimated to have
sufficiently high future returns. As we will see in the real-world experiments later in this paper, if you
can make a reasonably good estimate of the future 2-3 year returns on stocks, then you can do
extremely well with this simple filtering method. But we can do even better with a slightly more
sophisticated filtering method.
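The simple threshold filter and the normalization in Eq. (25) to (27) can be sketched as follows. The function names and the example numbers for Assets 2 and 3 are hypothetical and only serve as illustration.

```python
def threshold_filter(mean_returns, mu_min, weight_max):
    """Eq. (25): give each asset weight_max if its estimated return is high enough."""
    return [weight_max if mu >= mu_min else 0.0 for mu in mean_returns]


def normalize_weights(weights):
    """Eq. (26)-(27): scale the weights down only if they sum to more than 1."""
    weight_sum = sum(weights)
    if weight_sum > 1.0:
        return [w / weight_sum for w in weights]
    return list(weights)


# Asset 1 uses the 17.5% estimate from the text; the other two are made up.
weights = threshold_filter([0.175, 0.08, 0.12], mu_min=0.10, weight_max=0.05)
weights = normalize_weights(weights)
print(weights)
```

Note that the normalization never scales the weights up, so a portfolio with few qualifying assets simply keeps the remainder in cash.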
Figure 15: Adaptive filter for calculating portfolio weights from estimated mean returns.
The linear filter adapts to the estimated mean return of each asset, so assets that are expected to have a
higher future return will get a larger portfolio weight. It can be written mathematically as follows:
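$$
\text{Weight}_i =
\begin{cases}
0 & \text{if } \mu_i < \mu_{\min} \\
\mu_i \cdot a + b & \text{if } \mu_{\min} \le \mu_i \le \mu_{\max} \\
\text{Weight}_{\max} & \text{if } \mu_i > \mu_{\max}
\end{cases}
\tag{28}
$$
Where the parameters a and b for the linear function are:
$$
a = \frac{\text{Weight}_{\max}}{\mu_{\max} - \mu_{\min}} \qquad b = -a \cdot \mu_{\min} \tag{29}
$$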
For example, if an asset is estimated to have a future mean return of μ_i = 15% and we need at least
μ_min = 10% to allow an asset into the portfolio, and when it reaches μ_max = 30% the portfolio weight is
Weight_max = 5%, then using these numbers in Eq. (28) and (29) gives Weight_i = 1.25% for the asset.
As with the simple threshold filter in Section 7.1, we can also have the portfolio weights sum to more
than 1 for this adaptive filter, which would require us to invest with borrowed money. So if the weights
sum to more than 1 we again need to normalize them so they only sum to 1 using Eq. (26) and (27).
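The linear filter of Eq. (28) and (29) can be sketched for a single asset as below, using the numbers from the example above. The function name `linear_filter` is a hypothetical choice.

```python
def linear_filter(mu, mu_min, mu_max, weight_max):
    """Eq. (28)-(29): portfolio weight as a clipped linear function of mu."""
    if mu < mu_min:
        return 0.0
    if mu > mu_max:
        return weight_max
    a = weight_max / (mu_max - mu_min)  # slope, Eq. (29)
    b = -a * mu_min                     # intercept, Eq. (29)
    return mu * a + b


# Here a = 0.05 / 0.20 = 0.25 and b = -0.025, so the weight is
# 0.15 * 0.25 - 0.025 = 0.0125, which is 1.25%.
print(linear_filter(0.15, mu_min=0.10, mu_max=0.30, weight_max=0.05))
```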
8 Diversification Method
In this section we present the second part of our new portfolio method, which is a diversification
method that takes the portfolio weights that were calculated by the filtering method in Section 7, and
adjusts those weights to create a more diversified portfolio where the assets have lower correlation.
The diversification method is only allowed to lower the portfolio weights. This is an important
distinction from the mean-variance portfolio method, which may include assets in the portfolio that are
guaranteed to result in a loss, if those assets have negative correlations with other assets in the
portfolio, because that would lower the portfolio’s overall variance, as shown in Section 3.3.
Because our new diversification method only allows the portfolio weights to decrease, the worst that
can happen is that too much of the portfolio is being held in cash with zero investment return. The
diversification method will not over-invest in some assets just because they have low correlations.
Compared to the various algorithms for optimizing mean-variance portfolios, this new diversification
algorithm is also much simpler, it can be computed much faster, and it works much better in practice.
8.1 Motivation
There are mainly two reasons why we want to diversify an investment portfolio: The first reason is to
protect ourselves from making estimation errors in the future returns of the assets, so we hopefully
aren’t wrong about the future prospects of all the assets in our portfolio. The second reason is to try to
avoid all the assets in our portfolio experiencing losses at the same time. We would prefer that some
assets go up while others go down in price, so we can take advantage of short-term volatility to
rebalance our portfolio, by selling some of the assets that have increased in price, and buying more of the
assets that have decreased in price, if we still believe those assets have good long-term prospects.
A good example of such a situation was the “Corona-Virus Panic” in early 2020, where the stock-
markets were extremely volatile, and some stocks were highly correlated while others were not. If you
had invested your entire portfolio in stocks that all lost half or more of their market-value at the same
time, then you would not be able to take advantage of the lower prices and buy more of those assets,
because your entire portfolio had suffered big losses. There were several days during the spring of
2020, where many stocks went up or down 10-20% and some stocks even more. So we would like to
diversify our portfolio to be able to take advantage of such short-term market volatility.
We define the “Full Exposure” of each asset to be its own portfolio weight plus the entire portfolio’s
indirect exposure to that same asset through its correlation with the other assets in the portfolio. A
simple and intuitive (but also slightly incorrect) way of calculating the Full Exposure of Asset A is to
take the weight of Asset A plus the correlated weight of Asset B:
$$
\text{Full Exposure}_A = \text{Weight}_A + \rho_{A,B} \cdot \text{Weight}_B = 9\% + 0.5 \cdot 12\% = 15\% \tag{30}
$$
And similarly for the Full Exposure of Asset B, where the correlation is symmetric ρ_{A,B} = ρ_{B,A}:
$$
\text{Full Exposure}_B = \text{Weight}_B + \rho_{B,A} \cdot \text{Weight}_A = 12\% + 0.5 \cdot 9\% = 16.5\% \tag{31}
$$
So we thought we had only invested 9% of the portfolio in Asset A, but through its correlation with
Asset B, the portfolio’s Full Exposure to Asset A is in fact 15% of the portfolio. Similarly, we thought
we had only invested 12% of the portfolio in Asset B, but through its correlation with Asset A, the
portfolio’s Full Exposure to Asset B is in fact 16.5% of the portfolio.
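The arithmetic of Eq. (30) and (31) can be reproduced with a few lines of Python. The function below also accepts more than two assets, which is an assumption made here for illustration, as the text only gives the two-asset case. The name `simple_full_exposure` is hypothetical.

```python
def simple_full_exposure(weights, corr):
    """Simple (slightly incorrect) Full Exposure: own weight plus correlated weights."""
    n = len(weights)
    return [
        weights[i] + sum(corr[i][j] * weights[j] for j in range(n) if j != i)
        for i in range(n)
    ]


# Asset A has weight 9%, Asset B has weight 12%, and their correlation is 0.5.
corr = [[1.0, 0.5],
        [0.5, 1.0]]
print(simple_full_exposure([0.09, 0.12], corr))
```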
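We want to find new weights for the two assets that are denoted Weight*_A and Weight*_B (marked with
an asterisk * to indicate they are the new or adjusted weights), so that both of the Full Exposures that
are calculated using these new weights, will be equal to the originally desired Weight_A and Weight_B.
That is, we want to find Weight*_A and Weight*_B that solve these two equations:
$$
\begin{aligned}
\text{Full Exposure}^*_A &= \text{Weight}^*_A + \rho_{A,B} \cdot \text{Weight}^*_B = \text{Weight}_A \\
\text{Full Exposure}^*_B &= \text{Weight}^*_B + \rho_{B,A} \cdot \text{Weight}^*_A = \text{Weight}_B
\end{aligned}
\tag{32}
$$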
term price-volatility and rebalance the portfolio to our advantage. It also happens to lower the variance
of the portfolio’s returns. So we should not include this negative correlation in the calculation of the
Full Exposure, because we want both these assets to be included in the portfolio at their originally
desired weights. This case is shown in the second row of Table 1.
Another reason that we should not include such negative correlations in the calculation of the Full
Exposure, is that the Full Exposure would then be lower than the original portfolio weights. To see this,
try calculating Eq. (30) and (31) with a negative correlation. The new portfolio weights that solve
Eq. (32) would then have to be greater than the originally desired portfolio weights, so
we would increase the portfolio weights simply because the two assets have negative correlations. This
is one of the major flaws of the mean-variance method as shown in Section 3.3.
For negative (or “short”) portfolio weights, the two cases for positive and negative correlations are
actually the same as for positive portfolio weights. The reason is that even though the portfolio weights
are now negative, if the correlation is still positive, then the two assets will tend to move up or down in
price at the same time, and therefore have a similar effect on the portfolio. And conversely if the
correlation is negative then the two asset prices will tend to move in opposite directions and have
opposite effects on the overall portfolio, even though both asset weights are negative. These two cases
are shown in rows 5 and 6 in Table 1.
In case one portfolio weight is positive and the other is negative, and the correlation between the assets
is also negative, then the two negatives cancel each other out, so the two assets tend to have a similar
effect on the overall portfolio. Although one asset is “long” and the other is “short”, because they are
negatively correlated, this is effectively the same as two “long” positions or two “short” positions with
regard to their correlation, so the correlation needs to be included in the calculation of the Full
Exposure, so the portfolio weights can be adjusted accordingly. This case is shown in row 3 of Table 1.
In case one portfolio weight is positive and the other is negative, but the correlation between the assets
is now positive, then the two asset prices will tend to move up or down together, but because one asset
weight is “long” and the other asset weight is “short”, the two assets will have opposite effects on the
overall portfolio. So in this case the correlation should not be included in the calculation of the Full
Exposure. This case is shown in row 4 of Table 1.
Weight Types | sign(Weight_i) | sign(Weight_j) | sign(ρ_{i,j}) | sign(W_i⋅W_j⋅ρ_{i,j}) | Adjust Weights?
Long         | +              | +              | +             | +                     | Yes
Long         | +              | +              | -             | -                     | No
Long & Short | +              | -              | -             | +                     | Yes
Long & Short | +              | -              | +             | -                     | No
Short        | -              | -              | +             | +                     | Yes
Short        | -              | -              | -             | -                     | No
Table 1: Summary of the different cases for signs of portfolio weights and correlations, and whether
each case should be included in the calculation of the Full Exposure to adjust the portfolio weights.
Also shown in Table 1 is a column with the sign of the product of the portfolio weights and correlation.
This turns out to be a plus whenever the correlation should be included in the calculation of the Full
Exposure, and it turns out to be a minus whenever the correlation should not be included in the Full
Exposure. This means we can make a simple mathematical formula to decide whether or not to include
the correlation between Assets i and j in the calculation of the Full Exposure. It is denoted Use i , j
which is just short for “Use in calculation of the Full Exposure”:
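$$
\text{Use}_{i,j} =
\begin{cases}
1 & \text{if } \operatorname{sign}(\text{Weight}_i \cdot \text{Weight}_j \cdot \rho_{i,j}) \text{ is } + \\
0 & \text{else}
\end{cases}
\tag{33}
$$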
Note that both Eq. (33) and Table 1 are symmetrical because the correlation is symmetrical: ρ_{i,j} = ρ_{j,i}.
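Eq. (33) translates directly into a small function that builds the whole matrix of Use values. This is a minimal sketch with a hypothetical function name. The diagonal entries come out as 1 for every non-zero weight, because an asset’s correlation with itself is 1.

```python
def use_matrix(weights, corr):
    """Eq. (33): 1 where Weight_i * Weight_j * rho_ij is positive, else 0."""
    n = len(weights)
    return [
        [1 if weights[i] * weights[j] * corr[i][j] > 0 else 0 for j in range(n)]
        for i in range(n)
    ]


# Row 3 of Table 1: one long and one short weight with negative correlation.
print(use_matrix([0.09, -0.12], [[1.0, -0.5], [-0.5, 1.0]]))
```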
Figure 16: The concept of Full and Correlated Exposure, and how to adjust the portfolio weights.
My first definition of the Full Exposure was the simple and intuitive one used in Section 8.2, but that
resulted in problems if some of the portfolio weights were zero. After some more attempts at making a
meaningful definition of the Full Exposure, it became clear that it should at least satisfy these criteria:
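$$
\begin{aligned}
&(1)\quad \text{Full Exposure}_i = 0 && \text{if } \text{Weight}_i = 0 \\
&(2)\quad \text{Full Exposure}_i = \text{Weight}_i && \text{if } \text{Weight}_j = 0 \text{ or } \rho_{i,j} = 0 \text{ for all } j \ne i \\
&(3)\quad \text{Full Exposure}_i \ge \text{Weight}_i && \text{if } \text{Weight}_i > 0 \\
&(4)\quad \text{Full Exposure}_i \le \text{Weight}_i && \text{if } \text{Weight}_i < 0
\end{aligned}
\tag{37}
$$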
Let us explain each of these criteria and why they are necessary:
(1) If an asset’s portfolio weight is zero, Weight_i = 0, then we must also have Full Exposure_i = 0,
because we cannot decrease the updated portfolio weight Weight*_i any further than zero, but we
need Full Exposure*_i to equal the original Weight_i = 0, so the only way to achieve that would
be to lower the weights of the other assets in the portfolio to get Full Exposure*_i = Weight_i = 0.
So without this criterion it would cause many of the other portfolio weights to become zero.
(2) If an asset is either not correlated with any other asset in the portfolio, ρ_{i,j} = 0 for all j≠i, or if
the other portfolio weights are all zero, Weight_j = 0 for all j≠i, or a mix of these two cases,
then Full Exposure_i should equal Weight_i, because the asset has no correlated exposure to
other assets in the portfolio, so we don’t want the portfolio weight to change: Weight*_i = Weight_i.
(3) If a portfolio weight is positive, then its Full Exposure should be greater than or equal to the weight.
(4) If a portfolio weight is negative, then its Full Exposure should be less than or equal to the weight.
Together with criterion (3) this ensures the Full Exposure is never smaller in magnitude than the weight,
while having the same sign as the portfolio weight.
There are other necessary requirements for the definition of Full Exposure to be sensible, such as it
being a strictly increasing function when one or more of the portfolio weights are increased. And when
changing Weight *i it should have a much greater impact on Full Exposure*i for that particular Asset i
compared to the Full Exposure for all the other assets. These requirements will be clarified in the
convergence proof in Section 8.12.
Through experimentation it was found that the following definition of the Full Exposure works very
well in practice, and you should check that it satisfies all the criteria in Eq. (37) above:
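$$
\text{Full Exposure}_i = \operatorname{sign}(\text{Weight}_i) \cdot \sqrt{ \sum_{j=1}^{N} \left| \text{Weight}_i \cdot \text{Weight}_j \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| } \tag{38}
$$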
Let us explain the reasoning behind the different parts of this formula. We of course want the Full
Exposure to somehow include the weights of the portfolio’s other correlated assets. So we sum over all
other Assets j. Inside the summation, we first multiply with Weight_i so the Full Exposure is also zero if
Weight_i = 0. We then multiply with Weight_j for the other Asset j and the correlation between the two
assets ρ_{i,j}, but we use the squared correlation ρ_{i,j}^2 instead, because the surrounding square-root would
otherwise amplify the correlation. We then multiply with Use_{i,j} which is either 0 or 1 to only include
the “bad” correlations that require adjustments to their portfolio weights, as explained in Section 8.3.
From its definition in Eq. (33), Use_{i,j} ensures the product of Weight_i, Weight_j and ρ_{i,j} is positive,
but because we are using the squared correlation ρ_{i,j}^2 instead, it is possible that the overall product is
negative, so to ensure the product is positive, we simply take its absolute value. We calculate this for all
combinations of Assets i and j, sum the results, take the square-root, and finally multiply with the sign
of the original portfolio weight, to ensure the Full Exposure is also negative if Weight_i is negative.
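The definition in Eq. (38) can be sketched in plain Python as below, which also makes it easy to check the criteria in Eq. (37) numerically. The function name `full_exposure` and the nested-list representation of the correlation matrix are choices made here for illustration.

```python
import math


def full_exposure(weights, corr):
    """Full Exposure of every asset according to Eq. (38)."""
    n = len(weights)
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            # Use_{i,j} from Eq. (33): only include the "bad" correlations.
            use = 1 if weights[i] * weights[j] * corr[i][j] > 0 else 0
            total += abs(weights[i] * weights[j] * corr[i][j] ** 2 * use)
        sign = (weights[i] > 0) - (weights[i] < 0)
        result.append(sign * math.sqrt(total))
    return result


# Two assets with weights 9% and 12% and correlation 0.5.
print(full_exposure([0.09, 0.12], [[1.0, 0.5], [0.5, 1.0]]))
```

The term for j = i contributes the squared weight, because an asset’s correlation with itself is 1.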
You may note that the way we have defined the Full Exposure in Eq. (38) is different from how we
originally defined the Full Exposure from the Correlated Exposure in Eq. (34). We can also define the
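Correlated Exposure somewhat similar to Eq. (38) where the summation just excludes Asset i:
$$
\text{Correlated Exposure}_i = \operatorname{sign}(\text{Weight}_i) \cdot \sqrt{ \sum_{j \ne i} \left| \text{Weight}_i \cdot \text{Weight}_j \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| } \tag{39}
$$
If we add this to Weight_i then we get the following definition of the Full Exposure:
$$
\text{Full Exposure}_i = \text{Weight}_i + \operatorname{sign}(\text{Weight}_i) \cdot \sqrt{ \sum_{j \ne i} \left| \text{Weight}_i \cdot \text{Weight}_j \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| } \tag{40}
$$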
But that is not exactly the same as the Full Exposure defined in Eq. (38), which can be rewritten
slightly to separate Weight i from the summation:
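$$
\text{Full Exposure}_i = \operatorname{sign}(\text{Weight}_i) \cdot \sqrt{ \text{Weight}_i^2 + \sum_{j \ne i} \left| \text{Weight}_i \cdot \text{Weight}_j \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| } \tag{41}
$$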
Which one of these two definitions of the Full Exposure is the correct one? They both satisfy all the
criteria in Eq. (37) so they are both valid in that sense. But they are in fact different and Eq. (40)
generally results in higher values for the Full Exposure compared to the ones calculated in Eq. (41),
because Weight i is inside the square-root in Eq. (41). This means that Eq. (40) will generally result in
lower adjusted portfolio weights Weight *i and therefore a more conservative portfolio allocation.
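To see why the first definition gives the larger magnitude, write C_i for the sum over j≠i that appears under the square-root. This shorthand is introduced here only for illustration. The sketch assumes, as the text describes, that Eq. (40) adds Weight_i outside the square-root while Eq. (41) keeps it inside. Squaring the magnitude of Eq. (40) then gives:

```latex
\left( |\text{Weight}_i| + \sqrt{C_i} \right)^2
  = \text{Weight}_i^2 + 2 \cdot |\text{Weight}_i| \cdot \sqrt{C_i} + C_i
  \;\ge\; \text{Weight}_i^2 + C_i
```

The right-hand side is the squared magnitude of Eq. (41), and the two are equal only when Weight_i or C_i is zero. For Asset A with weights of 9% and 12% and a correlation of 0.5, we get C_A = 0.09⋅0.12⋅0.5² = 0.0027, so this reading of Eq. (40) gives about 9% + 5.2% = 14.2%, while Eq. (41) gives about 10.4%.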
In this paper we will use the Full Exposure defined in Eq. (41), which is the same as Eq. (38). But it is
possible that Eq. (40) works even better, or perhaps that you can find a completely different definition
of the Full Exposure that works better still. For example, you may note that these definitions have some
similarity to the definition of a portfolio’s standard deviation as defined in Eq. (12), but we don’t use
the standard deviations of the assets σi and σj in the above definitions of the Full Exposure. Perhaps it
would improve the diversification even more if we include these in the definition of the Full Exposure?
It would be interesting to see a performance comparison of many different definitions of the Full
Exposure, but it is beyond the scope of this paper, so hopefully you feel inspired to write such a paper.
$$
\text{MSE} = \frac{1}{N} \cdot \sum_{i=1}^{N} \left( \text{Weight}_i - \text{Full Exposure}^*_i \right)^2 \tag{42}
$$
The goal is to find Weight *i that minimize the MSE. Because the MSE is a continuous function it can
be minimized by common methods such as the L-BFGS-B method which is implemented in many
software packages including the SciPy package for the Python programming language. It is important
that the boundaries are set correctly to ensure the sign of Weight *i is the same as the sign of Weight i .
For portfolios of only 100 assets, minimizing the MSE using L-BFGS-B takes about half a second,
which is 1000x slower than the custom algorithm in Section 8.7 below. For larger portfolios of 1000
assets it becomes very slow and impractical to minimize the MSE using L-BFGS-B, while the custom
algorithm in Section 8.7 still only needs 20 milli-seconds to adjust the portfolio weights of 1000 assets!
It might be possible to improve the computation speed by deriving the gradient of the MSE, but this
depends on the exact definition of the Full Exposure, so we would need to derive the gradient for each
variant of the Full Exposure that we might want to experiment with, while the custom algorithm in
Section 8.7 should be able to handle all reasonable definitions of the Full Exposure.
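The generic optimization approach can be sketched with SciPy’s `minimize` function and the L-BFGS-B method. This is only a minimal illustration under stated assumptions and not the code used for the timings above. It uses the Full Exposure from Eq. (38), lets SciPy approximate the gradient numerically, and bounds each adjusted weight between zero and its original weight. The function names and the tolerance options are choices made here.

```python
import numpy as np
from scipy.optimize import minimize


def full_exposure(weights, corr):
    """Full Exposure of Eq. (38) for all assets at once."""
    prod = np.outer(weights, weights)        # Weight_i * Weight_j
    use = (prod * corr > 0).astype(float)    # Use_{i,j} from Eq. (33)
    total = np.sum(np.abs(prod * corr ** 2 * use), axis=1)
    return np.sign(weights) * np.sqrt(total)


def adjust_weights_lbfgsb(weights_org, corr):
    """Find adjusted weights that minimize the MSE of Eq. (42)."""
    weights_org = np.asarray(weights_org, dtype=float)
    corr = np.asarray(corr, dtype=float)

    def mse(weights_new):
        return np.mean((weights_org - full_exposure(weights_new, corr)) ** 2)

    # Assumption: each adjusted weight lies between 0 and its original weight,
    # which preserves the sign and only allows the weight to shrink.
    bounds = [(min(0.0, w), max(0.0, w)) for w in weights_org]
    result = minimize(mse, x0=weights_org, method="L-BFGS-B", bounds=bounds,
                      options={"ftol": 1e-15, "gtol": 1e-12})
    return result.x


weights_new = adjust_weights_lbfgsb([0.09, 0.12], [[1.0, 0.5], [0.5, 1.0]])
print(weights_new)
```

The tight tolerances are used because the MSE values are very small when the weights are expressed as fractions, so the default stopping criteria may otherwise trigger early.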
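$$
\Leftrightarrow \quad \left( \text{Weight}^*_i \right)^2 + \left| \text{Weight}^*_i \right| \cdot \sum_{j \ne i} \left| \text{Weight}^*_j \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| - \text{Weight}_i^2 = 0 \tag{43}
$$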
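Let s_i denote the sum in the last line of Eq. (43):
$$
s_i = \sum_{j \ne i} \left| \text{Weight}^*_j \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| \tag{44}
$$
So the equation from the last line of Eq. (43) can be written as:
$$
\left| \text{Weight}^*_i \right|^2 + \left| \text{Weight}^*_i \right| \cdot s_i - \text{Weight}_i^2 = 0 \tag{45}
$$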
We then want to find |Weight*_i| that satisfies this equation, as it will give us the solution to the original
equation where the Full Exposure equals the originally desired weight, |Full Exposure*_i| = |Weight_i|, and
the sign can easily be copied from the original Weight_i. To solve this note that Eq. (45) is actually a 2nd
degree polynomial in the variable |Weight*_i| whose solution is well-known:
$$
\left| \text{Weight}^*_i \right| = \frac{ -s_i \pm \sqrt{ s_i^2 + 4 \cdot \text{Weight}_i^2 } }{2} \tag{46}
$$
Because the variable |Weight*_i| is an absolute value it cannot be negative, so the solution to the 2nd
degree polynomial must also be non-negative, so we only use the + case of the ± operator in Eq. (46).
When we have found the new portfolio weight |Weight*_i| that makes |Full Exposure*_i| = |Weight_i| it
also impacts the Full Exposure of the other assets that use Weight*_i in their calculations, and this
causes |Full Exposure*_j| ≠ |Weight_j| for those other assets. So we may need to perform several iterations
of the weight-update in Eq. (46) before |Full Exposure*_i| converges to |Weight_i| for all Assets i.
So that we can distinguish the updated portfolio weights for different iterations of the algorithm, let us
denote the portfolio weight for Asset i in the k’th iteration of the algorithm as Weight*_{i,k}, and the
starting weight for iteration k = 1 is denoted Weight*_{i,1}.
The algorithm is then:
• Initialize the new weights by setting Weight*_{i,1} = Weight_i
• Repeat the following for a given number of iterations from k = 1 to some upper limit:
  ◦ For each Asset i do the following:
    • Calculate the sum in Eq. (44) for the k’th iteration:
$$
s_{i,k} = \sum_{j \ne i} \left| \text{Weight}^*_{j,k} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right| \tag{47}
$$
    • Calculate the new weight using the original sign and the positive case of Eq. (46):
$$
\text{Weight}^*_{i,k+1} = \operatorname{sign}(\text{Weight}_i) \cdot \frac{ -s_{i,k} + \sqrt{ s_{i,k}^2 + 4 \cdot \text{Weight}_i^2 } }{2} \tag{48}
$$
This algorithm works extremely well and converges to the correct solution in just a few iterations, even
for portfolios of 1000 assets. The only problem is that it is specifically made for the definition of Full
Exposure in Eq. (38) (and slightly rewritten in Eq. (41)), which means that a completely new algorithm
has to be implemented if you want to use another definition of the Full Exposure.
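The algorithm with Eq. (47) and (48) can be sketched in plain Python as follows. This is a minimal illustration and not the implementation behind the timings reported in this paper. It updates the weights one asset at a time, so each update immediately uses the latest weights of the other assets. The function name and the fixed number of iterations are assumptions.

```python
import math


def adjust_weights_quadratic(weights_org, corr, num_iterations=20):
    """Adjust weights so the Full Exposure of Eq. (38) matches the original weights."""
    n = len(weights_org)
    weights_new = list(weights_org)
    for _ in range(num_iterations):
        for i in range(n):
            w_i = weights_org[i]
            if w_i == 0.0:
                continue  # a zero weight stays zero
            # Eq. (47): sum over the other assets whose correlation is "bad".
            s = 0.0
            for j in range(n):
                if j != i and w_i * weights_new[j] * corr[i][j] > 0:
                    s += abs(weights_new[j] * corr[i][j] ** 2)
            # Eq. (48): positive root of the polynomial, with the original sign.
            magnitude = (-s + math.sqrt(s * s + 4.0 * w_i * w_i)) / 2.0
            weights_new[i] = math.copysign(magnitude, w_i)
    return weights_new


# Two assets with weights 9% and 12% and correlation 0.5.
print(adjust_weights_quadratic([0.09, 0.12], [[1.0, 0.5], [0.5, 1.0]]))
```

When an asset has no “bad” correlations the sum is zero, and Eq. (48) returns the original weight unchanged.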
diverge towards infinity. So we should only adjust Weight *i with the part of Dif Weight *i that can be
directly attributed to the influence of Weight *i on Full Exposure*i , which is the fraction in Eq. (50).
Because Full Exposure*i changes when the other portfolio weights Weight *j change, it is still possible
that Full Exposure*i will be slightly over-adjusted. That is why Eq. (51) allows a smaller step-size.
So that we can distinguish the updated portfolio weights and their corresponding Full Exposure for
different iterations of the algorithm, let us again denote the portfolio weight for Asset i in the k’th
iteration of the algorithm as Weight *i, k , and the starting weight for iteration k =1 is denoted Weight *i, 1 .
The Full Exposure that is calculated using Weight *i , k is denoted Full Exposure*i , k .
The algorithm is then:
• Initialize the new weights by setting Weight i*, 1=Weight i
• Repeat the following for a given number of iterations from k =1 to some upper limit:
◦ Calculate the difference between the new Full Exposure*i , k and the original Weight i :
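$$
\begin{aligned}
&= \text{Weight}^*_{i,k} \cdot \left( 1 - \frac{ \text{Full Exposure}^*_{i,k} - \text{Weight}_i }{ \text{Full Exposure}^*_{i,k} } \right) \\
&= \text{Weight}^*_{i,k} \cdot \frac{ \text{Weight}_i }{ \text{Full Exposure}^*_{i,k} }
\end{aligned}
$$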
It means that we can update the portfolio weight for Asset i with this simple formula:
$$
\text{Weight}^*_{i,k+1} = \text{Weight}^*_{i,k} \cdot \frac{ \text{Weight}_i }{ \text{Full Exposure}^*_{i,k} } \tag{54}
$$
So the updated portfolio weight Weight*_{i,k+1} is simply the portfolio weight from the previous iteration
Weight*_{i,k} multiplied by the ratio between the originally desired Weight_i and Full Exposure*_{i,k}. We are
merely changing Weight*_{i,k} by how far Full Exposure*_{i,k} is from the desired Weight_i.
The simplified algorithm is:
• Initialize the new weights by setting Weight i*, 1=Weight i
• Repeat the following for a given number of iterations from k =1 to some upper limit:
Another loop is required inside the above algorithm for iterating over all the Assets i. The inner-loop
could either iterate over the assets “element-wise” or “vectorized”, as we also discussed in Section 8.7.
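The update in Eq. (54) can be sketched as below. Because the update only needs the value of the Full Exposure, the sketch takes the Full Exposure as a function argument, with Eq. (38) as the default. It updates all weights from the values of the previous iteration, which corresponds to the “vectorized” style. All function and parameter names are hypothetical.

```python
import math


def full_exposure_eq38(weights, corr):
    """Full Exposure of every asset according to Eq. (38)."""
    n = len(weights)
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            use = 1 if weights[i] * weights[j] * corr[i][j] > 0 else 0
            total += abs(weights[i] * weights[j] * corr[i][j] ** 2 * use)
        sign = (weights[i] > 0) - (weights[i] < 0)
        result.append(sign * math.sqrt(total))
    return result


def adjust_weights(weights_org, corr, full_exposure_fn=full_exposure_eq38,
                   num_iterations=50):
    """Repeat the update of Eq. (54) for a fixed number of iterations."""
    weights_new = list(weights_org)
    for _ in range(num_iterations):
        exposure = full_exposure_fn(weights_new, corr)
        # Eq. (54). A zero Full Exposure means the weight is zero and stays zero.
        weights_new = [
            w_new * w_org / fe if fe != 0.0 else 0.0
            for w_new, w_org, fe in zip(weights_new, weights_org, exposure)
        ]
    return weights_new


# Two assets with weights 9% and 12% and correlation 0.5.
print(adjust_weights([0.09, 0.12], [[1.0, 0.5], [0.5, 1.0]]))
```

Passing another function as `full_exposure_fn` lets you experiment with other definitions of the Full Exposure without changing the update loop.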
$$
\begin{aligned}
\text{Weight}^*_{A,1} &= \text{Weight}_A = 9\% \\
\text{Weight}^*_{B,1} &= \text{Weight}_B = 12\%
\end{aligned}
\tag{57}
$$
We then calculate the Full Exposure for these two weights using Eq. (38). Because the two weights and
their correlation are all positive, we know from Eq. (33) that Use A , B =Use B , A =1 , so we have:
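$$
\begin{aligned}
\text{Full Exposure}^*_{A,1} &= \sqrt{ 9\% \cdot 9\% + 9\% \cdot 12\% \cdot 0.5^2 } \simeq 10.4\% \\
\text{Full Exposure}^*_{B,1} &= \sqrt{ 12\% \cdot 12\% + 12\% \cdot 9\% \cdot 0.5^2 } \simeq 13.1\%
\end{aligned}
\tag{58}
$$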
We then update the portfolio weights using Eq. (55) from the simplified algorithm in Section 8.8:
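$$
\begin{aligned}
\text{Weight}^*_{A,2} &= \text{Weight}^*_{A,1} \cdot \text{Weight}_A / \text{Full Exposure}^*_{A,1} = 9\% \cdot 9\% / 10.4\% \simeq 7.8\% \\
\text{Weight}^*_{B,2} &= \text{Weight}^*_{B,1} \cdot \text{Weight}_B / \text{Full Exposure}^*_{B,1} = 12\% \cdot 12\% / 13.1\% \simeq 11.0\%
\end{aligned}
\tag{59}
$$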
We then calculate the Full Exposure for the adjusted portfolio weights, which are now very close to the
originally desired portfolio weights of Weight A =9 % and Weight B =12% :
*
Full Exposure A , 2 = √ 7.8 %⋅7.8 %+7.8 %⋅11.0 %⋅0.52 ≃ 9.1 %
(60)
*
Full Exposure B , 2 = √ 11.0 %⋅11.0 %+11.0 %⋅7.8 %⋅0.52 ≃ 11.9 %
Let us try another iteration of the algorithm by inserting the updated weights into Eq. (55) again:
$$
\begin{aligned}
\text{Weight}^*_{A,3} &= \text{Weight}^*_{A,2} \cdot \text{Weight}_A / \text{Full Exposure}^*_{A,2} = 7.8\% \cdot 9\% / 9.1\% \simeq 7.7\% \\
\text{Weight}^*_{B,3} &= \text{Weight}^*_{B,2} \cdot \text{Weight}_B / \text{Full Exposure}^*_{B,2} = 11.0\% \cdot 12\% / 11.9\% \simeq 11.1\%
\end{aligned} \tag{61}
$$
These portfolio weights are very close to the weights in Eq. (59) so the algorithm has nearly converged.
We can repeat the above steps of the algorithm, to find portfolio weights whose Full Exposure gets
arbitrarily close to the originally desired portfolio weights. After only a few more iterations, the
algorithm converges to approximately $\text{Weight}^*_A \simeq 7.72\%$ and $\text{Weight}^*_B \simeq 11.07\%$ whose Full Exposure
is very close to the originally desired portfolio weights of $\text{Weight}_A = 9\%$ and $\text{Weight}_B = 12\%$.
This means that if the two assets are correlated with a coefficient of 0.5, and we want to invest 9% of
the portfolio in Asset A and 12% of the portfolio in Asset B, then we should actually only invest about
7.72% of the portfolio in Asset A and only about 11.07% of the portfolio in Asset B, in order for the
Full Exposure of Asset A to be 9%, and the Full Exposure of Asset B to be 12%, as originally desired.
This is because the two assets are positively correlated, so an investment in one asset is also an indirect
investment in the other asset through this correlation.
In this basic example, there is only a small difference between the original and adjusted portfolio
weights. But for larger portfolios where some assets are highly correlated and other assets are perhaps
negatively correlated, the adjusted portfolio weights can be very different from the original portfolio
weights. And it would be very difficult to adjust the portfolio weights for large portfolios by hand
without a computer algorithm such as the one above.
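The two-asset calculation above can be repeated with a few lines of Python. This is a sketch that hard-codes the case where both weights and the correlation are positive, so both cross-terms are included in the Full Exposure. The variable names are illustrative only.

```python
import math

rho = 0.5                              # correlation between Asset A and B
desired_a, desired_b = 0.09, 0.12      # originally desired weights

w_a, w_b = desired_a, desired_b        # iteration 1 starts at the desired weights
history = []
for k in range(1, 11):
    # Full Exposure of each asset: its own weight squared plus the cross-term.
    fe_a = math.sqrt(w_a * w_a + w_a * w_b * rho ** 2)
    fe_b = math.sqrt(w_b * w_b + w_b * w_a * rho ** 2)
    history.append((k, w_a, fe_a, w_b, fe_b))
    print(f"k={k}: W*_A={w_a:.2%} FE*_A={fe_a:.2%}  W*_B={w_b:.2%} FE*_B={fe_b:.2%}")
    # Weight-update: scale each weight by desired weight / Full Exposure.
    w_a = w_a * desired_a / fe_a
    w_b = w_b * desired_b / fe_b
```

The first rows printed correspond to the hand calculations in Eq. (58) through Eq. (61), and the later rows settle at the converged weights quoted in the text.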
Example 1
In the first example, the portfolio has two assets whose desired portfolio weights and correlation are:
$$
\begin{aligned}
&\text{Weight}_1 = -9\%, \quad \text{Weight}_2 = +12\% \\
&\rho_{1,2} = +0.5
\end{aligned} \tag{62}
$$
Because the two weights have different signs (one is negative and the other is positive), and because
their correlation is positive, we know from Eq. (33) that $\text{Use}_{1,2} = 0$ so this is a “good” correlation and
the two portfolio weights should not be adjusted. As the Full Exposure already equals the desired
portfolio weights, the diversification method terminates immediately, as shown in Table 2.
Example 2
In the second example, the portfolio still has two assets whose desired portfolio weights are the same as
before, but now their correlation is negative instead of positive, so we have:
$$
\begin{aligned}
&\text{Weight}_1 = -9\%, \quad \text{Weight}_2 = +12\% \\
&\rho_{1,2} = -0.5
\end{aligned} \tag{63}
$$
From Eq. (33) we know that $\text{Use}_{1,2} = 1$ because this is a “bad” correlation so we need to adjust the two
portfolio weights. Table 3 shows the iterations of the diversification method, which converges very
quickly to new portfolio weights, thus making their new Full Exposure almost exactly equal to the
originally desired portfolio weights. These numbers are shown in bold for Iteration 1, because we
initialized the diversification method with the originally desired weights from Eq. (63). Compare these
to the final values for the Full Exposure, which are shown in bold in the row of the last iteration. Note
how the Mean Squared Error (MSE) between $\text{FullExp}^*_{i,k}$ and $\text{Weight}_i$ decreases by several orders of
magnitude in each iteration. For such a small portfolio of only two assets, the diversification method
appears to be very efficient, as it only requires a few iterations to converge to the correct solution.
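The difference between Example 1 and Example 2 comes entirely from the Use indicator. The sketch below assumes that Use is 1 when the product of the two weights and their correlation is positive (a “bad” correlation) and 0 otherwise. This rule is inferred from the two examples rather than copied from Eq. (33), and the function names are hypothetical.

```python
import math

def use_indicator(w_i, w_j, rho):
    """Assumed rule: 1 for a "bad" correlation, 0 for a "good" one."""
    return 1 if w_i * w_j * rho > 0 else 0

def full_exposure_pair(w_1, w_2, rho):
    """Signed Full Exposure of both assets in a two-asset portfolio."""
    cross = abs(w_1 * w_2) * rho ** 2 * use_indicator(w_1, w_2, rho)
    fe_1 = math.copysign(math.sqrt(w_1 ** 2 + cross), w_1)
    fe_2 = math.copysign(math.sqrt(w_2 ** 2 + cross), w_2)
    return fe_1, fe_2

# Example 1: opposite signs, positive correlation, so the pair is ignored.
ex1 = full_exposure_pair(-0.09, 0.12, +0.5)
# Example 2: opposite signs, negative correlation, so the pair is included.
ex2 = full_exposure_pair(-0.09, 0.12, -0.5)

print("Example 1:", [f"{fe:.2%}" for fe in ex1])
print("Example 2:", [f"{fe:.2%}" for fe in ex2])
```

In Example 1 the Full Exposure already equals the desired weights, so no update is needed. In Example 2 the magnitudes exceed the desired weights, so the weight-update has to shrink both weights.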
Example 3
The next example has 3 assets instead of only 2. Their weights and correlations are all positive:
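$$
\begin{aligned}
&\text{Weight}_1 = +10\%, \quad \text{Weight}_2 = +15\%, \quad \text{Weight}_3 = +20\% \\
&\rho_{1,2} = +0.8, \quad \rho_{1,3} = +0.5, \quad \rho_{2,3} = +0.2
\end{aligned} \tag{64}
$$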
Table 4 shows the iterations of the diversification method, which converges after only a few iterations,
so the new portfolio weights are very close to the originally desired portfolio weights from Eq. (64).
Note how $\text{FullExp}^*_{1,k}$ decreases in every iteration and converges from above towards its target value of
10%, while both $\text{FullExp}^*_{2,k}$ and $\text{FullExp}^*_{3,k}$ get over-adjusted in the first iteration and then converge
from below towards their target values of 15% and 20%. But they all converge very quickly. Also note
that $\text{Weight}^*_{1,k}$ converges to around 5.4% which is nearly half of the original weight of 10%, while
$\text{Weight}^*_{3,k}$ converges to around 19.1% which is much closer to the original weight of 20%. This is
because Asset 1 has comparatively much higher correlation with both of the other assets.
Iteration k  Weight*_{1,k}  FullExp*_{1,k}  Weight*_{2,k}  FullExp*_{2,k}  Weight*_{3,k}  FullExp*_{3,k}  MSE
1            10.000%        15.684%         15.000%        18.248%         20.000%        21.494%         1.5E-03
2            6.376%         10.983%         12.330%        14.544%         18.610%        19.626%         4.4E-05
3            5.805%         10.415%         12.717%        14.786%         18.965%        19.921%         7.5E-06
4            5.574%         10.180%         12.901%        14.909%         19.040%        19.972%         1.4E-06
5            5.476%         10.078%         12.980%        14.962%         19.067%        19.989%         2.6E-07
6            5.433%         10.034%         13.013%        14.984%         19.078%        19.995%         4.8E-08
7            5.415%         10.015%         13.027%        14.993%         19.082%        19.998%         9.1E-09
8            5.407%         10.006%         13.033%        14.997%         19.084%        19.999%         1.7E-09
9            5.403%         10.003%         13.036%        14.999%         19.085%        20.000%         3.2E-10
Table 4: Iterations of the diversification algorithm for the portfolio weights in Eq. (64).
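The rows of Table 4 can be regenerated with the short Python sketch below. Because all weights and correlations in Eq. (64) are positive, every pair is included in the Full Exposure, so the sketch leaves out the Use indicator. The variable names and the print layout are illustrative only.

```python
import math

desired = [0.10, 0.15, 0.20]
corr = [[1.0, 0.8, 0.5],
        [0.8, 1.0, 0.2],
        [0.5, 0.2, 1.0]]

def full_exposure(weights):
    """Full Exposure when all weights and correlations are positive."""
    return [math.sqrt(sum(w_i * w_j * corr[i][j] ** 2
                          for j, w_j in enumerate(weights)))
            for i, w_i in enumerate(weights)]

rows = []
adjusted = list(desired)
for k in range(1, 10):
    exposures = full_exposure(adjusted)
    # Mean Squared Error between the Full Exposure and the desired weights.
    mse = sum((fe - w) ** 2 for fe, w in zip(exposures, desired)) / len(desired)
    rows.append((k, list(adjusted), exposures, mse))
    # Weight-update for the next iteration.
    adjusted = [a * w / fe for a, w, fe in zip(adjusted, desired, exposures)]

for k, weights, exposures, mse in rows:
    cells = "  ".join(f"{w:8.3%} {fe:8.3%}" for w, fe in zip(weights, exposures))
    print(f"{k}  {cells}  {mse:.1E}")
```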
Example 4
This example has the same weights and correlations as the previous example, except one of the weights
and one of the correlations are now negative:
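$$
\begin{aligned}
&\text{Weight}_1 = -10\%, \quad \text{Weight}_2 = +15\%, \quad \text{Weight}_3 = +20\% \\
&\rho_{1,2} = -0.8, \quad \rho_{1,3} = +0.5, \quad \rho_{2,3} = +0.2
\end{aligned} \tag{65}
$$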
Table 5 shows the iterations of the diversification method, which requires a few iterations less than
when all the weights and correlations were positive in the previous example. This is probably because
the single negative weight and correlation means that not all the correlations are “bad” according to
Table 1 and Eq. (33), so they are not all included in the calculation of the Full Exposure. The
diversification method can therefore converge faster to the new portfolio weights whose Full Exposure
is equal to the originally desired portfolio weights from Eq. (65).
Note that the adjusted portfolio weights from the last row in Table 5 are not the same as the weights
from the last row in Table 4. Although the original portfolio weights and correlations are all the same in
magnitude, the ones in Eq. (64) are all positive while the ones in Eq. (65) are both positive and
negative. This causes them to have different Full Exposure so the adjusted weights are also different.
Example 5
This example has the same weights and correlations as the previous example, except two of the weights
and two of the correlations are now negative:
$$
\begin{aligned}
&\text{Weight}_1 = -10\%, \quad \text{Weight}_2 = +15\%, \quad \text{Weight}_3 = -20\% \\
&\rho_{1,2} = -0.8, \quad \rho_{1,3} = +0.5, \quad \rho_{2,3} = -0.2
\end{aligned} \tag{66}
$$
Table 6 shows the iterations of the diversification method which again converges very quickly, so the
Full Exposure of the adjusted portfolio weights is close to the originally desired weights in Eq. (66).
Compare the last row in Table 6 to the last row in Table 5 and note how the final adjusted weights have
slightly different magnitudes. Then compare the last row in Table 6 to the last row in Table 4 and note
that the final adjusted portfolio weights are the same except for their signs. This is because all the
correlations in both Eq. (64) and Eq. (66) are considered “bad” according to Eq. (33), so all correlations
are included in the calculation of the Full Exposure using Eq. (38), and therefore the adjusted portfolio
weights converge to the same values – they just have different signs to match the original weights.
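The comparison between Examples 3, 4 and 5 can be checked numerically. The sketch below assumes the Use rule that a pair is included only when the weight product and the correlation have the same sign, which is inferred from the examples and not copied from Eq. (33). The helper names are hypothetical.

```python
import math

def corr_matrix(r12, r13, r23):
    """Symmetric 3x3 correlation matrix from the three pairwise values."""
    return [[1.0, r12, r13], [r12, 1.0, r23], [r13, r23, 1.0]]

def full_exposure(weights, corr):
    exposures = []
    for i, w_i in enumerate(weights):
        # Sum only the "bad" pairs (assumed Use rule, see the lead-in).
        total = sum(abs(w_i * w_j) * corr[i][j] ** 2
                    for j, w_j in enumerate(weights)
                    if w_i * w_j * corr[i][j] > 0.0)
        exposures.append(math.copysign(math.sqrt(total), w_i))
    return exposures

def diversify(desired, corr, num_iterations=50):
    adjusted = list(desired)
    for _ in range(num_iterations):
        exposures = full_exposure(adjusted, corr)
        adjusted = [a * w / fe for a, w, fe in zip(adjusted, desired, exposures)]
    return adjusted

ex3 = diversify([+0.10, +0.15, +0.20], corr_matrix(+0.8, +0.5, +0.2))  # Eq. (64)
ex4 = diversify([-0.10, +0.15, +0.20], corr_matrix(-0.8, +0.5, +0.2))  # Eq. (65)
ex5 = diversify([-0.10, +0.15, -0.20], corr_matrix(-0.8, +0.5, -0.2))  # Eq. (66)

for name, result in (("Example 3", ex3), ("Example 4", ex4), ("Example 5", ex5)):
    print(name, [f"{w:+.3%}" for w in result])
```

Under this assumption, Examples 3 and 5 give the same magnitudes because every pair counts as “bad” in both, while Example 4 ignores one pair and so ends at different weights.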
Example 6
Let us repeat Example 3 from above with all positive weights and correlations. But the starting point
for the diversification method is not the originally desired portfolio weights, but instead their negation:
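$$
\begin{aligned}
&\text{Weight}_1 = +10\%, \quad \text{Weight}_2 = +15\%, \quad \text{Weight}_3 = +20\% \\
&\text{Weight}^*_{1,1} = -10\%, \quad \text{Weight}^*_{2,1} = -15\%, \quad \text{Weight}^*_{3,1} = -20\% \\
&\rho_{1,2} = +0.8, \quad \rho_{1,3} = +0.5, \quad \rho_{2,3} = +0.2
\end{aligned} \tag{67}
$$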
Table 7 shows the diversification method still converges very quickly even though its starting point had
incorrect signs. This is because in the first iteration of the diversification algorithm, the weight-update
in Eq. (54) flips the sign of the adjusted weight to match the originally desired portfolio weight.
Comparing Table 7 and Table 4 shows that it is only the weights in the initial iteration 1 that have the
wrong signs, and the adjusted weights are otherwise completely identical for all other iterations.
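The sign-flip in the first iteration can be seen in a small Python sketch. It assumes that the Full Exposure carries the sign of the current adjusted weight, as in the signed square-root form used in this section, and that a pair is included only when the weight product and the correlation have the same sign. The function names are hypothetical.

```python
import math

desired = [0.10, 0.15, 0.20]
corr = [[1.0, 0.8, 0.5],
        [0.8, 1.0, 0.2],
        [0.5, 0.2, 1.0]]

def full_exposure(weights):
    exposures = []
    for i, w_i in enumerate(weights):
        total = sum(abs(w_i * w_j) * corr[i][j] ** 2
                    for j, w_j in enumerate(weights)
                    if w_i * w_j * corr[i][j] > 0.0)   # assumed Use rule
        # The Full Exposure takes the sign of the current adjusted weight.
        exposures.append(math.copysign(math.sqrt(total), w_i))
    return exposures

def iterate(start, num_updates):
    """Return the adjusted weights of every iteration, beginning at `start`."""
    history = [list(start)]
    for _ in range(num_updates):
        current = history[-1]
        exposures = full_exposure(current)
        history.append([a * w / fe for a, w, fe in zip(current, desired, exposures)])
    return history

normal = iterate(desired, 5)
negated = iterate([-w for w in desired], 5)

for k, (a, b) in enumerate(zip(normal, negated), start=1):
    print(k, [f"{w:+.3%}" for w in a], [f"{w:+.3%}" for w in b])
```

A negative adjusted weight gives a negative Full Exposure, so their ratio is positive and the updated weight takes the sign of the desired weight.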
Example 7
Let us repeat Example 5 above, but with initial guesses for the adjusted weights that are just “crazy”:
Iteration k  Weight*_{1,k}  FullExp*_{1,k}  Weight*_{2,k}  FullExp*_{2,k}  Weight*_{3,k}  FullExp*_{3,k}  MSE
1            -123456%       -297103%        567890%        622972%         -912345%       -938753%        4.5E+07
2            4.155%         8.592%          -13.674%       -15.296%        19.437%        20.215%         7.1E-05
3            4.836%         9.389%          -13.409%       -15.219%        19.231%        20.085%         1.4E-05
4            5.151%         9.735%          -13.216%       -15.111%        19.150%        20.037%         2.8E-06
5            5.292%         9.885%          -13.119%       -15.052%        19.114%        20.017%         5.4E-07
6            5.353%         9.950%          -13.073%       -15.023%        19.098%        20.007%         1.0E-07
7            5.380%         9.978%          -13.053%       -15.010%        19.091%        20.003%         1.9E-08
8            5.392%         9.991%          -13.045%       -15.004%        19.088%        20.001%         3.7E-09
9            5.397%         9.996%          -13.041%       -15.002%        19.087%        20.001%         6.9E-10
Table 8: Iterations of the diversification algorithm for the portfolio weights in Eq. (68).
8.12 Convergence
We have now seen several small examples of using the diversification algorithm, which all converged
to the optimal solution in just a few iterations of updating the portfolio weights. The question is
whether we can be certain that the algorithm will always converge to the optimal solution?
This section is more mathematical and although it can be skipped completely, it is recommended that
you at least try to grasp the ideas for proving that the diversification method converges.
We want to prove that the Full Exposure $FE^*_{i,k}$ converges to the originally desired weight $W_i$ if we
just calculate enough iterations of the adjusted portfolio weights $W^*_{i,k}$ which are used to calculate the
Full Exposure $FE^*_{i,k}$. The weights are updated using Eq. (54) from the simple algorithm in Section 8.8.
The Full Exposure from Eq. (38) can be written as follows using the short notation:
$$
FE^*_{i,k} = \operatorname{sign}(W^*_{i,k}) \cdot \sqrt{\sum_{j=1}^{N} \left| W^*_{i,k} \cdot W^*_{j,k} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right|} \tag{69}
$$
For simplicity, we consider the absolute value of the Full Exposure $|FE^*_{i,k}|$ which removes the sign:
$$
|FE^*_{i,k}| = \sqrt{\sum_{j=1}^{N} \left| W^*_{i,k} \cdot W^*_{j,k} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right|} \tag{70}
$$
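Using the absolute value makes the convergence proof easier, as we would otherwise have to keep
track of the sign and flip the inequalities for the boundaries when the sign is negative. We can easily
restore the sign of $|FE^*_{i,k}|$ when needed, because we know that the signs are always identical (except
for the very first iteration, if we have initialized the new weights $W^*_{i,1}$ with other values than the
original weights $W_i$ and they have different signs, as explained in Section 8.9):
$$
\operatorname{sign}(FE^*_{i,k}) = \operatorname{sign}(W^*_{i,k}) = \operatorname{sign}(W_i) \tag{71}
$$
The weight-update from Eq. (54) is written as follows using the short notation:
$$
W^*_{i,k+1} = W^*_{i,k} \cdot \frac{W_i}{FE^*_{i,k}} \tag{72}
$$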
We are interested in knowing how the Full Exposure for the next iteration $|FE^*_{i,k+1}|$ changes when all
of the portfolio weights change from $W^*_{i,k}$ to $W^*_{i,k+1}$, so inserting Eq. (72) into Eq. (70) gives:
$$
|FE^*_{i,k+1}| = \sqrt{\sum_{j=1}^{N} \left| W^*_{i,k} \cdot \frac{W_i}{FE^*_{i,k}} \cdot W^*_{j,k} \cdot \frac{W_j}{FE^*_{j,k}} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right|} \tag{73}
$$
The fraction $W_i / FE^*_{i,k}$ can be pulled out of the summation because it is the exact same value for all the
summation indices j. The fraction can also be pulled further outside the square-root, so we get:
$$
|FE^*_{i,k+1}| = \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot \sqrt{\sum_{j=1}^{N} \left| W^*_{i,k} \cdot W^*_{j,k} \cdot \frac{W_j}{FE^*_{j,k}} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right|} \tag{74}
$$
Now find the index L that minimizes the fraction $W_j / FE^*_{j,k}$ so that for all indices j we have:
$$
\left| \frac{W_j}{FE^*_{j,k}} \right| \ge \left| \frac{W_L}{FE^*_{L,k}} \right| \tag{75}
$$
And similarly find the index U that maximizes the fraction $W_j / FE^*_{j,k}$ so that for all indices j we have:
$$
\left| \frac{W_j}{FE^*_{j,k}} \right| \le \left| \frac{W_U}{FE^*_{U,k}} \right| \tag{76}
$$
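Then first using the lower-bound from Eq. (75) to replace the fraction $W_j / FE^*_{j,k}$ in Eq. (74) we get:
$$
|FE^*_{i,k+1}| \ge \sqrt{\left| \frac{W_L}{FE^*_{L,k}} \right|} \cdot \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot \sqrt{\sum_{j=1}^{N} \left| W^*_{i,k} \cdot W^*_{j,k} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right|} = \sqrt{\left| \frac{W_L}{FE^*_{L,k}} \right|} \cdot \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot |FE^*_{i,k}| \tag{77}
$$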
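And then using the upper-bound from Eq. (76) to replace the fraction $W_j / FE^*_{j,k}$ in Eq. (74) we get:
$$
|FE^*_{i,k+1}| \le \sqrt{\left| \frac{W_U}{FE^*_{U,k}} \right|} \cdot \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot \sqrt{\sum_{j=1}^{N} \left| W^*_{i,k} \cdot W^*_{j,k} \cdot \rho_{i,j}^2 \cdot \text{Use}_{i,j} \right|} = \sqrt{\left| \frac{W_U}{FE^*_{U,k}} \right|} \cdot \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot |FE^*_{i,k}| \tag{78}
$$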
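Combining the lower bound from Eq. (77) with the upper bound from Eq. (78) gives:
$$
\sqrt{\left| \frac{W_L}{FE^*_{L,k}} \right|} \cdot \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot |FE^*_{i,k}| \;\le\; |FE^*_{i,k+1}| \;\le\; \sqrt{\left| \frac{W_U}{FE^*_{U,k}} \right|} \cdot \sqrt{\left| \frac{W_i}{FE^*_{i,k}} \right|} \cdot |FE^*_{i,k}| \tag{79}
$$
Because $\sqrt{|W_i / FE^*_{i,k}|} \cdot |FE^*_{i,k}| = \sqrt{|W_i \cdot FE^*_{i,k}|}$, this simplifies to:
$$
\sqrt{\left| \frac{W_L}{FE^*_{L,k}} \right|} \cdot \sqrt{|W_i \cdot FE^*_{i,k}|} \;\le\; |FE^*_{i,k+1}| \;\le\; \sqrt{\left| \frac{W_U}{FE^*_{U,k}} \right|} \cdot \sqrt{|W_i \cdot FE^*_{i,k}|} \tag{80}
$$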
So the Full Exposure $|FE^*_{i,k+1}|$ for the updated portfolio weight $W^*_{i,k+1}$ is in a neighbourhood of
$\sqrt{|W_i \cdot FE^*_{i,k}|}$ which is roughly a mid-point between the originally desired portfolio weight $|W_i|$ and the
Full Exposure from the previous iteration $|FE^*_{i,k}|$. Figure 17 shows an example of these boundaries.
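The boundaries from Eq. (80) can be checked numerically. The sketch below uses the all-positive portfolio from Eq. (64), so every pair is included in the Full Exposure and no Use indicator is needed. For each iteration it computes the two boundary ratios and the mid-point, and then tests whether the next Full Exposure lies between the bounds. The variable names are illustrative and the small tolerance only guards against floating-point rounding.

```python
import math

desired = [0.10, 0.15, 0.20]
corr = [[1.0, 0.8, 0.5],
        [0.8, 1.0, 0.2],
        [0.5, 0.2, 1.0]]

def full_exposure(weights):
    """Full Exposure when all weights and correlations are positive."""
    return [math.sqrt(sum(w_i * w_j * corr[i][j] ** 2
                          for j, w_j in enumerate(weights)))
            for i, w_i in enumerate(weights)]

adjusted = list(desired)
exposures = full_exposure(adjusted)
bounds_hold = True
for k in range(1, 9):
    ratios = [abs(w / fe) for w, fe in zip(desired, exposures)]
    lower_ratio = math.sqrt(min(ratios))      # boundary ratio for index L
    upper_ratio = math.sqrt(max(ratios))      # boundary ratio for index U
    midpoints = [math.sqrt(abs(w * fe)) for w, fe in zip(desired, exposures)]

    # Weight-update, followed by the Full Exposure of the next iteration.
    adjusted = [a * w / fe for a, w, fe in zip(adjusted, desired, exposures)]
    new_exposures = full_exposure(adjusted)

    for i, (mid, fe_new) in enumerate(zip(midpoints, new_exposures)):
        lower, upper = lower_ratio * mid, upper_ratio * mid
        bounds_hold = bounds_hold and (lower - 1e-12 <= fe_new <= upper + 1e-12)
        print(f"k={k} asset={i + 1}: {lower:.4%} <= {fe_new:.4%} <= {upper:.4%}")
    exposures = new_exposures

print("Bounds hold in every iteration:", bounds_hold)
```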
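[Figure 17: a number line marking the points 0, $|W_i|$, $\sqrt{|W_i \cdot FE^*_{i,k}|}$ and $|FE^*_{i,k}|$, together with the lower boundary $\sqrt{|W_L / FE^*_{L,k}|} \cdot \sqrt{|W_i \cdot FE^*_{i,k}|}$ and the upper boundary $\sqrt{|W_U / FE^*_{U,k}|} \cdot \sqrt{|W_i \cdot FE^*_{i,k}|}$.]
Figure 17: Boundaries for the updated Full Exposure $|FE^*_{i,k+1}|$ from Eq. (80).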
There are several other possibilities for the boundaries of $|FE^*_{i,k+1}|$ than those shown in Figure 17, such
as both boundaries being either below or above $\sqrt{|W_i \cdot FE^*_{i,k}|}$ if the boundary ratios $\sqrt{|W_L / FE^*_{L,k}|}$ and
$\sqrt{|W_U / FE^*_{U,k}|}$ are either both below or above 1. It is also possible that the boundaries for $|FE^*_{i,k+1}|$ can […]
Explanation
What does all this mean? We can think of this as the Full Exposure $|FE^*_{i,k+1}|$ for the updated portfolio
weight $W^*_{i,k+1}$ being the value $\sqrt{|W_i \cdot FE^*_{i,k}|}$ instead of the value $|FE^*_{i,k}|$ from the previous iteration.
The value $\sqrt{|W_i \cdot FE^*_{i,k}|}$ is a mid-point somewhere between the previous value $|FE^*_{i,k}|$ and the
originally desired portfolio weight $|W_i|$, so the update from the previous weight $W^*_{i,k}$ to the new
portfolio weight $W^*_{i,k+1}$ using the update from Eq. (72) has indeed brought the Full Exposure $|FE^*_{i,k+1}|$
much closer to the originally desired weight $|W_i|$ compared to the previous Full Exposure $|FE^*_{i,k}|$.
However, all the other portfolio weights $W^*_{j,k+1}$ for assets $j \neq i$ have also been updated using Eq. (72),
which may pull the new Full Exposure $|FE^*_{i,k+1}|$ away from the value $\sqrt{|W_i \cdot FE^*_{i,k}|}$. The impact from […]
two ratios are square-roots, they will usually only have a very minor effect on $\sqrt{|W_i \cdot FE^*_{i,k}|}$. Even in
case they pull $|FE^*_{i,k+1}|$ further away from its goal $|W_i|$ than the previous $|FE^*_{i,k}|$, in the next iteration
$|FE^*_{i,k+2}|$ will be moved back towards the goal $|W_i|$ again, because $\sqrt{|W_i \cdot FE^*_{i,k+1}|}$ is now somewhere
between $|FE^*_{i,k+1}|$ and the goal $|W_i|$. As the move of the Full Exposure towards the goal $|W_i|$ is
always roughly a mid-point between the Full Exposure and the goal, and the pull from the adjustments
to the other portfolio weights is at maximum the square-root of the biggest adjustment made to the
other portfolio weights; over a few iterations the portfolio weights will all be moved much closer to
their individual goals than their mutual pulling on each other.
Normal Initialization
If all the adjusted portfolio weights are initialized with the originally desired weights $W^*_{i,1} = W_i$, then
their Full Exposures all equal or exceed the originally desired weight $|FE^*_{i,1}| \ge |W_i|$ according to the criteria from
Eq. (37) that the Full Exposure must satisfy. So the weight-updates calculated using Eq. (72) will
decrease all the portfolio weights after the first iteration so that $|W^*_{i,2}| \le |W^*_{i,1}|$. Because the portfolio
weights are all being adjusted in the same direction, the Full Exposure $|FE^*_{i,2}|$ will get close to the goal
$|W_i|$ for most assets i. Some $|FE^*_{i,2}|$ may be a bit too low and others may be a bit too high compared
to their goal $|W_i|$, but they will be close already after one weight-update. In the next iteration they
move even closer around $\sqrt{|W_i \cdot FE^*_{i,2}|}$ while only being pulled slightly away from this point, because
the two boundary ratios $\sqrt{|W_L / FE^*_{L,2}|}$ and $\sqrt{|W_U / FE^*_{U,2}|}$ in Eq. (80) are already close to 1 as the
weights are already close to their goal. So when we initialize the new portfolio weights with the
original weights $W^*_{i,1} = W_i$, we get rapid convergence of $|FE^*_{i,k}|$ towards $|W_i|$ in just a few iterations.
“Crazy” Initialization
What if we initialize the adjusted portfolio weights $W^*_{i,1}$ with some “crazy” values as we did in some
of the examples above? It actually only takes a single update of the portfolio weights using Eq. (72) to
bring the “crazy” values back into the normal range again. To see this, first note that the adjusted
portfolio weight $|W^*_{i,k}|$ is never greater than the Full Exposure $|FE^*_{i,k}|$, because we have defined the
Full Exposure to satisfy this criterion from Eq. (37). This means that their ratio is never greater than one:
$$
|W^*_{i,k}| \le |FE^*_{i,k}| \;\Leftrightarrow\; \left| \frac{W^*_{i,k}}{FE^*_{i,k}} \right| \le 1 \tag{81}
$$
Using this fact in the weight-update formula from Eq. (72), we see that the updated weight $|W^*_{i,k+1}|$ is
upper-bounded by the originally desired portfolio weight $|W_i|$ as follows:
$$
|W^*_{i,k+1}| = \left| W_i \cdot \frac{W^*_{i,k}}{FE^*_{i,k}} \right| \le |W_i| \tag{82}
$$
The lower bound for the updated weight is zero, and it can only be equal to zero if either the previous
weight was zero $|W^*_{i,k}| = 0$ or if the originally desired weight is zero $|W_i| = 0$. So the updated weight is
lower-bounded by zero and upper-bounded by the originally desired weight:
$$
0 \le |W^*_{i,k+1}| \le |W_i| \tag{83}
$$
If we update all the portfolio weights using Eq. (72) so they are all bounded as in Eq. (83), then their
Full Exposure is also lower-bounded by zero and upper-bounded by the Full Exposure of the originally
desired portfolio weights:
$$
0 \le |FE^*_{i,k+1}| \le |FE_i| \tag{84}
$$
So no matter how “crazy” the initial portfolio weights were, we just need to perform a single weight-
update using Eq. (72) to bring the weights and their Full Exposure back into a reasonable range again.
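The bounds in Eq. (83) and Eq. (84) can be illustrated with a single weight-update from “crazy” starting values. The sketch below uses the desired weights and correlations from Eq. (64) and made-up starting weights with mixed signs. It assumes that a pair is included in the Full Exposure only when the weight product and the correlation have the same sign, which is inferred from the examples and not copied from Eq. (33).

```python
import math

desired = [0.10, 0.15, 0.20]
corr = [[1.0, 0.8, 0.5],
        [0.8, 1.0, 0.2],
        [0.5, 0.2, 1.0]]

def full_exposure(weights):
    exposures = []
    for i, w_i in enumerate(weights):
        total = sum(abs(w_i * w_j) * corr[i][j] ** 2
                    for j, w_j in enumerate(weights)
                    if w_i * w_j * corr[i][j] > 0.0)   # assumed Use rule
        exposures.append(math.copysign(math.sqrt(total), w_i))
    return exposures

# Hypothetical "crazy" starting weights with wrong signs and magnitudes.
crazy = [-1234.56, 5678.90, -9123.45]

# One weight-update brings every weight into the range from Eq. (83).
crazy_exposures = full_exposure(crazy)
updated = [a * w / fe for a, w, fe in zip(crazy, desired, crazy_exposures)]

# Full Exposure after the update, compared to that of the desired weights.
updated_exposures = full_exposure(updated)
desired_exposures = full_exposure(desired)

for w_new, w, fe_new, fe in zip(updated, desired, updated_exposures, desired_exposures):
    print(f"0 <= {abs(w_new):.3%} <= {abs(w):.3%}   0 <= {abs(fe_new):.3%} <= {abs(fe):.3%}")
```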
“Crazy” Example 1
Let us consider some more examples with “crazy” initial values for the new portfolio weights. The
portfolio has only two assets. The first asset weight is initialized to +4.876% which is the value that it
will ultimately converge to, so this initial guess is actually the correct value. But the second portfolio
weight is initialized to a “crazy” value of +100000%, which would of course require a completely
unrealistically leveraged investment, but we merely want to see what happens with the diversification
algorithm when using such a “crazy” value for the initial portfolio weight. The desired and initial
portfolio weights, and the correlations between the two assets, are summarized as follows:
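$$
\begin{aligned}
&\text{Weight}_1 = +8\%, \quad \text{Weight}_2 = +12\% \\
&\text{Weight}^*_{1,1} = +4.876\%, \quad \text{Weight}^*_{2,1} = +100000\% \\
&\rho_{1,2} = 0.9
\end{aligned} \tag{85}
$$
Table 9 shows the iterations of the diversification method. Note that the Full Exposure in the 1st initial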
iteration is about +628% for the first asset and about +100000% for the second asset, because the initial
guess for the second asset weight is +100000%. Both of these Full Exposures are just “crazy” wrong
compared to the originally desired portfolio weights of +8% and +12%, respectively.
But already in the 2nd iteration we get much more reasonable values, so the Full Exposure is around
+12% for the second asset, which is the same as the originally desired portfolio weight, as shown in
Eq. (85), so the second asset has already reached its goal. But now the first asset has a much too low
Full Exposure of only around +0.088% where its goal is +8%. So even though the initial guess for the
portfolio weight of the first asset was actually the correct value, because the weight of the second asset
was “crazy” wrong, it has pulled the first weight far away from its goal. So we need to increase the
weight for the first asset and decrease the weight for the second asset, which is done in the 3rd iteration.
Why do we not move back and forth forever between “crazy” wrong portfolio weights? It is
because the most “crazy” wrong portfolio weights get moved much closer to their correct values when
updating the weights using Eq. (72). This also impacts the portfolio weights of the other assets as we
have just seen, but it has a relatively minor effect on the other asset weights, because it is at most the
square-root of the weight-adjustments that are propagated to the other assets, which is what Eq. (80)
shows. In the next iteration those assets have their portfolio weights moved back again, and after a few
more iterations, all the portfolio weights start to converge to their correct values, as shown in Table 9.
“Crazy” Example 2
In the second “crazy” example, the portfolio still only has two assets whose desired portfolio weights
are the same as before, and the initial guess for the first portfolio weight is still +4.876% as before, but
now the initial guess for the second portfolio weight is only +0.01%, so we have:
$$
\begin{aligned}
&\text{Weight}_1 = +8\%, \quad \text{Weight}_2 = +12\% \\
&\text{Weight}^*_{1,1} = +4.876\%, \quad \text{Weight}^*_{2,1} = +0.01\% \\
&\rho_{1,2} = 0.9
\end{aligned} \tag{86}
$$
Table 10 shows the iterations of the diversification method. In the 1st initial iteration, the Full Exposure
of the first asset is around 4.88% which is much below its goal of 8%, while the second asset has an
even lower Full Exposure of only around 0.2% with a goal of 12%. In the 2nd iteration the Full
Exposure of the second asset has increased to its goal of 12%, while the Full Exposure of the first asset
has also increased to about 11.1% but that is now significantly higher than its goal of only 8%.
The weight of the second asset has increased about 60-fold while the weight of the first asset has only
roughly doubled. This is again because the weight-update in Eq. (72) moves each portfolio weight
much closer to its correct value, and while this weight-change also propagates to the Full Exposure of
the other assets in the portfolio, it only has a minor impact on those, which is at most the square-root of
the weight-adjustment, as shown in Eq. (80).
The result is that the diversification algorithm takes some major jumps in the first few iterations to
correct the initial “crazy” wrong guesses for the portfolio weights, and then the algorithm quickly
converges to the correct values.
“Crazy” Example 3
In the third “crazy” example, the portfolio still only has two assets whose desired portfolio weights are
the same as before, but now both the initial guesses for the adjusted weights are “crazy” wrong. The
first portfolio weight is guessed to be +100000% and the second weight is guessed to be only +0.01%:
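$$
\begin{aligned}
&\text{Weight}_1 = +8\%, \quad \text{Weight}_2 = +12\% \\
&\text{Weight}^*_{1,1} = +100000\%, \quad \text{Weight}^*_{2,1} = +0.01\% \\
&\rho_{1,2} = 0.9
\end{aligned} \tag{87}
$$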
Table 11 shows the iterations of the diversification method. Because the initial guess for the first weight
is “crazy” high, and the initial guess for the second weight is “crazy” low, when we update these two
weights using Eq. (72), we expect the weights to pull each other in opposite directions. So the first
weight that is “crazy” high gets moved down towards its correct value by its own weight-update, but it
also gets pulled upwards to compensate for the other weight being “crazy” low. We expect the opposite
for the second weight, which gets moved up towards its correct value by its own weight-update, but at
the same time it gets pulled downwards to compensate for the other weight being “crazy” high.
We might imagine that these weight-updates and their pulling on one another would cancel each other
out, but because each weight-adjustment only has a comparatively minor impact on the other weights,
the combined effect is that the weight-update still brings the weights much closer to their correct
values, even though all of the portfolio weights were “crazy” to begin with. This is again what Eq. (80)
shows, that the most important effect of a weight-update using Eq. (72) is that the weight is moved
much closer to its own goal, while only having a negligible impact on the other portfolio weights.
As usual, we see that after a few iterations of correcting the “crazy” wrong initial guesses for the
portfolio weights, the diversification algorithm quickly converges to the correct values that make the
Full Exposure of each asset equal to its originally desired portfolio weight, as shown in Table 11.
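The three “crazy” examples can be run side by side to see that they end at the same adjusted weights. The sketch below hard-codes the two-asset case where both weights and the correlation are positive, so the cross-term is always included in the Full Exposure. The number of iterations is deliberately generous and the variable names are illustrative only.

```python
import math

rho = 0.9
desired = [0.08, 0.12]

def full_exposure(weights):
    """Full Exposure of two positively weighted, positively correlated assets."""
    cross = weights[0] * weights[1] * rho ** 2
    return [math.sqrt(weights[0] ** 2 + cross),
            math.sqrt(weights[1] ** 2 + cross)]

# Initial guesses from Eq. (85), Eq. (86) and Eq. (87), written as fractions.
starts = {
    "Crazy Example 1": [0.04876, 1000.0],
    "Crazy Example 2": [0.04876, 0.0001],
    "Crazy Example 3": [1000.0, 0.0001],
}

first_exposures = {}
results = {}
for name, start in starts.items():
    adjusted = list(start)
    first_exposures[name] = full_exposure(adjusted)
    for _ in range(200):
        exposures = full_exposure(adjusted)
        adjusted = [a * w / fe for a, w, fe in zip(adjusted, desired, exposures)]
    results[name] = adjusted
    print(name, [f"{w:.3%}" for w in adjusted])
```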
One difficulty in making such a convergence proof, is that the difference $|FE^*_{i,k} - W_i|$ for an individual
Asset i can increase for a few iterations before it starts to converge again, as we saw in some of the
previous examples. So it may be easier to prove convergence for the Mean Absolute Error of all assets:
$$
\lim_{k \to \infty} \frac{1}{N} \cdot \sum_{i=1}^{N} \left| \text{Full Exposure}^*_{i,k} - \text{Weight}_i \right| = 0 \tag{89}
$$
But due to the self-referential (or recursive) nature of how the Full Exposure is defined, this is very
challenging to prove directly, without resorting to slightly “hand-wavy” arguments similar to the ones
we used in the convergence proof above.
Perhaps it is possible to use more sophisticated mathematics, such as the Banach fixed-point theorem or
Cauchy sequences to prove that if the Full Exposure satisfies certain criteria, then the diversification
algorithm is guaranteed to converge, and the solution always exists and it is unique. If you are able to
make such a proof, then it would be a very valuable contribution to this work.
Remarkably, the number of iterations required to reach a sufficiently precise solution does not depend
on the number of portfolio assets N, but only depends on the level of precision that is required, and the
difference between the initial Full Exposure $|FE^*_{i,1}|$ and the originally desired portfolio weight $|W_i|$.
We know from Eq. (80) that for each iteration of the algorithm, the updated portfolio weight $W^*_{i,k+1}$
causes the Full Exposure $|FE^*_{i,k+1}|$ to be approximately $\sqrt{|W_i \cdot FE^*_{i,k}|}$ which is much closer to the goal
of $|W_i|$ compared to $|FE^*_{i,k}|$. For simplicity, let us say that $\sqrt{|W_i \cdot FE^*_{i,k}|}$ is about half-way between
$|W_i|$ and $|FE^*_{i,k}|$. So after each weight-update the Full Exposure is brought about half-way closer to
its goal. The required number of iterations K is therefore related to the required precision as follows:
$$
K \simeq \log_2 \left( \frac{\max_i |FE^*_{i,1} - W_i|}{\text{Required Precision}} \right) \tag{91}
$$
For example, if $\max_i |FE^*_{i,1} - W_i| = 1000$ and $\text{Required Precision} = 0.001$ as used in all the experiments
in this paper, then we only need K ≃20 iterations of the algorithm to find portfolio weights with that
precision. But this is a quite conservative estimate because it is calculated from the boundaries in
Eq. (80). In practice the algorithm converges much faster and usually only needs around 7 iterations to
achieve a precision of 0.001 (or 0.1%) for the portfolio weights, regardless of the number of assets.
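The estimate of the required number of iterations is a one-line calculation. The sketch below simply evaluates the base-2 logarithm of the largest initial error divided by the required precision, under the half-way assumption described above. The function name is hypothetical.

```python
import math

def estimated_iterations(max_initial_error, required_precision):
    """Iterations needed if every weight-update halves the remaining error."""
    return math.log2(max_initial_error / required_precision)

k_estimate = estimated_iterations(max_initial_error=1000, required_precision=0.001)
print(f"K is approximately {k_estimate:.2f}, rounded up to {math.ceil(k_estimate)}")
```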
The factor K in the time-complexity is therefore negligible and remains a constant low number if you
always use the same precision requirement. So the time-complexity of the diversification algorithm is
dominated by the quadratic number of assets N in the portfolio, that is, the time-complexity is $O(N^2)$.
Figure 18 shows the time-usage of the simple diversification algorithm from Section 8.8 for portfolios
of different sizes and with a required precision of 0.001 (or 0.1%) for the adjusted portfolio weights.
As can be seen, the time-usage is indeed quadratic in the number of portfolio assets N. For example, the
algorithm uses about 0.5 second for a portfolio of N = 5,000 assets, while it uses about 2 seconds for a
portfolio of N = 10,000 assets, that is, a 4-fold increase in time-usage for a 2-fold increase in portfolio
size. There is also roughly a quadratic difference in time-usage between portfolios with N = 10,000
assets and portfolios with N = 20,000 assets. And similarly between portfolios with N = 15,000 and
N = 30,000 assets. So these experiments confirm that the diversification algorithm has quadratic time-
complexity in the number of assets N, provided we hold constant the required precision.
The algorithm is extremely efficient and only takes 20 (twenty) milli-seconds to compute for a
portfolio of 1000 assets! The time-usage can be further improved by enabling parallelism (this is a
simple toggle in the computer code), and removing all of the research options and error-checking to
simplify the computer code. The implementation of the Full Exposure in Eq. (38) has been optimized
for speed, because it is the most expensive part of the algorithm. The diversification algorithm is
already so fast that it can easily be used in back-testing of investment strategies, as well as real-time
High-Frequency Trading. Perhaps a highly optimized C++ implementation can be made even faster.
In Figure 18 the original portfolio weights and their correlations are randomly generated using various
normal-distributions. For the smaller portfolio sizes, many thousands of random portfolios are
generated. For the larger portfolio sizes only 10 random portfolios are generated. See the computer
code in Section 19 for details. It is run on a laptop computer with a 2.6 GHz CPU (boost 3.5 GHz).
The space-complexity for the diversification algorithm is linear in the number of portfolio assets N,
provided it has been properly implemented so it does not need to allocate memory for extra matrices.
However, the correlation matrix itself requires quadratic storage in computer memory, which becomes
very large for portfolios with many assets. For example, a portfolio with N = 10,000 assets requires a
correlation matrix with $N^2 = 100{,}000{,}000$ elements to be stored in memory. If you have a very sparse
correlation matrix for very large portfolios with many assets, or if the portfolio weights are zero for
many assets, then it is possible to improve the diversification algorithm to take advantage of this
sparsity and achieve both lower time and space complexity.
Figure 18: Time usage for the diversification method from Section 8.8 with different portfolio sizes.
8.15 Summary
This section presented a new method for diversifying an investment portfolio, which takes as input the
desired portfolio weights that were created by another process, such as the filtering methods from
Section 7. The diversification method also needs an estimate of the future correlations of the asset
returns. Then the diversification method adjusts the original portfolio weights downwards, so that the
so-called Full Exposure of each asset becomes equal to the originally desired portfolio weights.
The Full Exposure measures how much the portfolio is exposed to each asset, both directly through the
investment in that particular asset, but also indirectly through its correlations with other assets in the
portfolio. The mathematical formula for the Full Exposure must satisfy certain criteria so it is sensible.
Because of how the Full Exposure is defined, the portfolio weights are only allowed to decrease. This
makes the diversification method very robust to estimation errors in the correlation matrix, because the
worst that can happen, is that too much of the portfolio is moved into cash with zero returns.
A few algorithms were also presented in this section, for adjusting the portfolio weights to find their
correct values. The simplest algorithm was proven to always converge, no matter how “crazy” the
initial guesses for the portfolio weights are. The time-complexity of the algorithm is roughly quadratic
in the number of assets, and a portfolio of 1000 assets can be optimized in about 20 milli-seconds.
9 Test Settings
This section gives an overview of how the tests are performed in the following sections.
9.1 Stock-Data
We use daily share-price data for 949 U.S. stocks between the years 2007 and 2021. The stock-returns
are calculated from the so-called Total Return, which is the daily closing share-price adjusted for both
stock-splits and reinvestment of dividends, and assuming there were no taxes.
The stock-data is processed and cleaned before it is used, in order to remove problematic data-points
and make the analysis easier and hopefully more reliable. The full data-set contains nearly 3000
stocks, but we remove stocks where the median daily trading-volume is less than USD 1 million, or the
max daily return is greater than 100%, or more than 20% of the days have missing data. This results in
only 949 stocks remaining in the data-set, which are listed in Section 19.2.
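The three cleaning rules above can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name, the use of None for missing days, and the input layout are all hypothetical, and the paper's actual data-pipeline may differ.

```python
from statistics import median


def keep_stock(daily_returns, daily_volume_usd,
               min_median_volume=1_000_000,
               max_daily_return=1.0,
               max_missing_fraction=0.2):
    """Return True if a stock passes all three cleaning rules.

    Both lists have one entry per trading day, with None marking a day
    with missing data (this representation is an assumption).
    """
    num_days = len(daily_returns)
    if num_days == 0:
        return False

    # Rule 3: remove if more than 20% of the days have missing data.
    num_missing = sum(1 for r in daily_returns if r is None)
    if num_missing / num_days > max_missing_fraction:
        return False

    # Rule 1: remove if the median daily trading-volume is below USD 1 million.
    volumes = [v for v in daily_volume_usd if v is not None]
    if not volumes or median(volumes) < min_median_volume:
        return False

    # Rule 2: remove if the max daily return is greater than 100%.
    returns = [r for r in daily_returns if r is not None]
    if returns and max(returns) > max_daily_return:
        return False

    return True
```

For example, a stock whose returns contain a single daily gain of 150% would be rejected by the second rule, even if its trading-volume and data coverage are fine.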
Figure 19 shows the number of stocks available on each day between the years 2007 and 2021. To
ensure all stocks have price-data available for the same days, we could either shorten the data-period,
or we could remove the stocks that don’t have data for the entire period. Either way, this would remove
a large part of the data-set, so we instead fill in the missing share-prices using the nearest value. This
means that an investment in a stock with missing data simply corresponds to a cash-position during that
period. This should not be a problem for comparing the portfolio methods, because it just means that
those particular stocks have zero returns during the periods with missing data.
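A minimal sketch of this filling step is shown below. It assumes that missing prices are marked with None and that ties between two equally near values go to the earlier one; both are assumptions for illustration, as are the function names.

```python
def fill_nearest(prices):
    """Replace None entries with the nearest known price.

    Ties between an earlier and a later price go to the earlier one
    (an assumption). This simple version is quadratic in the worst case.
    """
    known = [i for i, p in enumerate(prices) if p is not None]
    if not known:
        raise ValueError("no known prices to fill from")

    filled = []
    for i, p in enumerate(prices):
        if p is None:
            nearest = min(known, key=lambda k: (abs(k - i), k))
            p = prices[nearest]
        filled.append(p)
    return filled


def simple_returns(prices):
    """Period-to-period returns for a list of prices."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


# A stock with no data on the first two days and the last day.
prices = [None, None, 50.0, 51.0, None]
filled = fill_nearest(prices)
print(filled)
print(simple_returns(filled))
```

The filled prices are constant before the first and after the last known price, so the returns are zero there, which is the same as holding cash during those periods.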
Figure 19: Number of stocks available in the data-set for each day between 2007 and 2021.
9.2 Log-Returns
Whenever we use the future 2-3 year average stock-returns in this paper, it is actually the average
log-returns because they can be calculated more efficiently by the computer, and they are fairly close
to the returns for values between ±30%. For example, ln(1 − 20%) ≃ −22.3% and ln(1 + 20%) ≃ +18.2%.
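The following sketch reproduces the two numbers above. It also shows one plausible reason why log-returns are cheap to average: the sum of the per-period log-returns telescopes, so the average only needs the first and last price. The function names are hypothetical, and the paper does not state that this is the exact computation it uses.

```python
import math


def log_return(simple_return):
    """Convert a simple return (e.g. 0.2 for +20%) to a log-return."""
    return math.log(1.0 + simple_return)


def mean_log_return(prices):
    """Average per-period log-return, using only the first and last price."""
    return math.log(prices[-1] / prices[0]) / (len(prices) - 1)


print(f"{log_return(-0.20):+.1%}")  # log-return for a 20% loss
print(f"{log_return(+0.20):+.1%}")  # log-return for a 20% gain

# The shortcut agrees with averaging the individual log-returns.
prices = [100.0, 110.0, 99.0, 120.0]
steps = [math.log(b / a) for a, b in zip(prices, prices[1:])]
print(sum(steps) / len(steps), mean_log_return(prices))
```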
Figure 20: The three basic parts of using a portfolio method for investing.
These three parts are often conflated in academic research papers, where it is assumed that previous
financial data is predictive of the future. For example, the mean, variance and covariance for the
preceding year may be used to forecast the future when testing the mean-variance portfolio method.
Often the assumptions are poorly described so you have to guess exactly how the historical data is
being used to forecast the future. This is probably because academia generally has very strong beliefs
about the efficiency and randomness of financial markets, so the academic researchers tend to use the
same assumptions and therefore do not describe them so carefully. But we saw in Sections 4, 5 and 6
that their beliefs about “naive” forecasting and random walks in the financial markets are incorrect.
Furthermore, some portfolio methods are very vulnerable to estimation errors, while other methods are
very robust. The mean-variance method is very vulnerable because it tries to maximize the mean return
while simultaneously minimizing the variance, and this can greatly amplify portfolio-weights if you
make estimation errors in both the mean return of an asset and its correlation with other assets in the
portfolio. This may cause the portfolio to get concentrated in losing assets that are highly correlated.
When we are using some kind of forecasting model for step (A) in Figure 20, we are actually testing
both the forecasting model and the portfolio method at the same time, thus conflating the testing of two
separate parts of the investment system. So in this paper we will first test the portfolio method using the
actual future stock-data. We call this “omniscient” testing, which is of course a form of cheating.² This
allows us to properly test if the portfolio method works as intended when using the correct predictions
about the future stock-returns and their correlations. The omniscient testing is done in Sections 10-12.
Then we test the robustness of the portfolio methods, by adding noise to the omniscient stock-data, to
see how well the methods cope with estimation errors in the data. This is done in Sections 13-15.
The last part of the investment system is the buying and selling of assets in the financial markets. This
can also interfere with the testing of the portfolio method, if for example one portfolio method is better
suited than another method for using stop-loss orders. So to keep this testing as fair as possible, all the
portfolio methods are tested by simply buying and selling the stocks at their daily closing prices.
2 The academic term for testing with the actual historical data is “ex-post” testing, but I prefer the term “omniscient”
which is also Latin and means “all-knowing”.
Figure 21: Computation flowchart used for testing the portfolio methods.
Figure 30 and is that the Rebalanced+ portfolios had on average more than 60% of the portfolio placed
in cash. This is typical for small portfolios because the diversification method needs quite large
portfolios to function properly. So the under-performance of the diversified Rebalanced+ portfolios did
not arise from a bad methodology, but simply from having placed a majority of the portfolio in cash.
The 3rd and middle plot in Figure 22 compares the portfolio values of the Threshold and Rebalanced
portfolio methods. We see a similar pattern as before, namely that the Threshold method performed
much better during the 2009 stock-market crash, but in the following years it started to perform worse
than the simple Rebalanced portfolios. The reason is also similar, namely that the Threshold portfolio
only invests when the future 2-3 year average stock-returns would give an annualized return of at least
10%. Because the portfolio only contains 5 stocks, there are long periods where the Threshold method
simply cannot invest in any stocks and therefore holds a lot of the portfolio in cash, which is again
shown in Figure 30.
The 4th plot in Figure 22 compares the portfolio values of the Adaptive and Threshold portfolio
methods, neither of which employs the diversification method. The only difference between these two
portfolio methods, is that the Adaptive method adjusts or adapts the portfolio weights to the magnitude
of the future 2-3 year stock-returns, while the Threshold method simply gives the stocks equal weights
in the portfolio if their future 2-3 year stock-returns exceed 10%. We once again see from the plot, that
the Adaptive method performed much better during the 2009 crash, but under-performed the Threshold
method in the following years. Figure 30 shows that the Adaptive portfolios had on average more than
90% of their portfolios in cash. So a portfolio with only 5 stocks is simply too small for the Adaptive
portfolio method to work properly and keep the portfolio fully invested.
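The Threshold rule described above can be sketched as follows. The exact weighting rule is defined elsewhere in the paper, so this sketch makes a loud assumption: each of the N stocks in the universe has an equal slot of 1/N, and the slot stays in cash when the stock's estimated return is below the threshold. The function name is hypothetical.

```python
def threshold_weights(estimated_returns, threshold=0.10):
    """Equal-weight the stocks that pass the return threshold.

    ASSUMPTION: every stock has a slot of 1/N in the portfolio, and a slot
    is left in cash if the estimated annualized return is below the
    threshold. The paper's actual rule may differ.
    """
    slot = 1.0 / len(estimated_returns)
    weights = [slot if r >= threshold else 0.0 for r in estimated_returns]
    cash = 1.0 - sum(weights)
    return weights, cash


# A universe of only 5 stocks, where 2 stocks pass the 10% threshold.
weights, cash = threshold_weights([0.15, 0.02, -0.05, 0.12, 0.08])
print(weights, cash)
```

Under this assumption only two of the five slots are invested in the example, so most of the portfolio sits in cash, which is consistent with the large cash-positions reported for small portfolios.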
The 5th and final plot in Figure 22 compares the portfolio values of the Adaptive+ and Adaptive
portfolios. The only difference between these two portfolio methods is that the Adaptive+ also uses the
diversification method from Section 8.8, which gave a small advantage in the 2009 crash, but in the
following years the Adaptive+ method significantly under-performed the Adaptive method. Figure 30
shows the likely cause was again that the Adaptive+ portfolios had even bigger cash-positions.
nearly 12 years the Threshold portfolios had roughly 5-9 times higher portfolio values. The Adaptive
portfolios also performed much better than the Threshold portfolios, except during the 2009 crash, and
after nearly 12 years the Adaptive portfolios had roughly 2-3 times higher portfolio values. The
Adaptive+ portfolios were generally better than the Adaptive portfolios, but this seems to be mostly
between the years 2007 and 2013, and after 2013 the Adaptive+ portfolios seem to have been worse
than the Adaptive portfolios, which can be seen from the declining ratios of the portfolio values. So
portfolios with 100 stocks are sufficient for the Threshold and Adaptive portfolio methods, but still
insufficient for the Adaptive+ method to function properly.
any of them. The words “portfolio size” really mean “the size of the universe available for investing”,
and when that is e.g. 300 it means that there are 300 random stocks available for investing.
more remarkable is that the diversification method can greatly improve on that performance, simply by
adjusting the portfolio weights according to the stock-correlations in the next 10 days.
The reason why the simple filtering methods work so well when using the future 2-3 year returns
is that even though some or all of the stocks might crash in the interim period, we know that they will
recover again within a few years, because we are using the actual future 2-3 year stock-returns when
making the investments. So any interim losses will only be temporary. And if some stocks crash deeper
than others, it will be beneficial to move more of the portfolio into other stocks that will have a higher
return in the following 2-3 year period. This is further improved by the diversification method, which
tries to prevent all the stocks in the portfolio from crashing at the same time (but this also prevents
them from all increasing at the same time). As we have just seen, this works extremely well when the portfolio methods
are using the actual future stock-returns and correlations. In the robustness tests further below, we will
see how well the methods cope with noisy estimates of the future stock-returns and correlations.
For small portfolio sizes this is partly because the Threshold, Adaptive and Adaptive+ portfolios have a
large part of their portfolios in cash. But it shows that in general these portfolio methods are very
effective at improving the mean daily return of the portfolio, without incurring a higher daily volatility.
a stock-market crash in the interim period. If you can predict a stock-market crash, you can of course
improve the Max Drawdown significantly, but in this test we only use the future 2-3 year mean return.
10.3 Summary
These experiments have shown on several different performance metrics, that in order for the filtering
and diversification methods to become truly effective, the investment universe needs to contain at least
a few hundred assets, so the filtering method can select enough assets for inclusion into the portfolio,
and the diversification method can lower their desired portfolio weights without moving a significant
part of the portfolio into cash, which is what often happens for portfolios with only a few assets.
Although the tests have shown that there is a benefit to using the diversification method alone, it works
best in combination with the filtering method, which only allows assets into the portfolio if their future
returns are sufficiently high. And this filtering works best when it adapts to the magnitude of the future
return estimates, so an asset with a higher estimated return is given a bigger position in the portfolio.
In this section we used so-called omniscient data, which is the actual future 2-3 year mean returns, and
the actual future 10-day stock-correlations. We will make a few more tests using omniscient data in the
following sections, and then we will make several robustness tests to see how well the portfolio
methods work when the estimates of the future stock-returns and correlations are very noisy.
Figure 22: Test A – Compare the values of 128 random portfolios with 5 stocks each.
Figure 23: Test A – Compare the values of 128 random portfolios with 30 stocks each.
Figure 24: Test A – Compare the values of 128 random portfolios with 100 stocks each.
Figure 25: Test A – Compare the values of 128 random portfolios with 300 stocks each.
Figure 26: Test A – Compare Arithmetic Mean daily return of different portfolio methods and sizes.
Figure 27: Test A – Compare Geometric Mean daily return of different portfolio methods and sizes.
Figure 28: Test A – Compare Std.Dev. of the daily return of different portfolio methods and sizes.
Figure 29: Test A – Compare Sharpe Ratios for daily returns of different portfolio methods and sizes.
Figure 30: Test A – Compare Cash Mean Daily Position of different portfolio methods and sizes.
Figure 31: Test A – Compare Months With Losses of different portfolio methods and sizes.
Figure 32: Test A – Compare Max Drawdown of different portfolio methods and sizes.
Figure 33: Test A – Compare Max Pullup of different portfolio methods and sizes.
Figure 34: Test B – Compare values of 128 random portfolios with 30 stocks each.
Figure 35: Test B – Compare values of 128 random portfolios with 100 stocks each.
Figure 36: Test B – Compare values of 128 random portfolios with 300 stocks each.
Figure 37: Test B – Compare Arithmetic Mean daily return of different portfolio methods and sizes.
Figure 38: Test B – Compare Std.Dev. of the daily return of different portfolio methods and sizes.
Figure 39: Test B – Compare Sharpe Ratios for daily returns of different portfolio methods and sizes.
Figure 40: Test B – Compare Cash Mean Daily Position of different portfolio methods and sizes.
Figure 41: Test B – Compare Months With Losses of different portfolio methods and sizes.
Figure 42: Test B – Compare Max Drawdown of different portfolio methods and sizes.
Figure 43: Test B – Compare Max Pullup of different portfolio methods and sizes.
Figure 44: Test C – Compare values of 128 random portfolios with 30 stocks each.
Figure 45: Test C – Compare values of 128 random portfolios with 100 stocks each.
Figure 46: Test C – Compare values of 128 random portfolios with 300 stocks each.
Figure 47: Test C – Compare Arithmetic Mean daily return of different portfolio methods and sizes.
Figure 48: Test C – Compare Std.Dev. of the daily return of different portfolio methods and sizes.
Figure 49: Test C – Compare Sharpe Ratios for daily returns of different portfolio methods and sizes.
Figure 50: Test C – Compare Cash Mean Daily Position of different portfolio methods and sizes.
Figure 51: Test C – Compare Months With Losses of different portfolio methods and sizes.
Figure 52: Test C – Compare Max Drawdown of different portfolio methods and sizes.
Figure 53: Test C – Compare Max Pullup of different portfolio methods and sizes.
20% then you are going to get an investment return that is always 20% lower than your forecasts. But
for the purpose of robustness testing, it seems adequate to use heavy noise without a bias.
13.4 Summary
These tests showed that the portfolio methods still work very well with heavy and unbiased noise in the
estimated stock-returns. This means that if you can create a forecasting model that is just broadly
correct at predicting 2-3 year stock-returns, and it is also roughly unbiased so the forecasting model
neither over-predicts nor under-predicts all the future returns in the same way, then with sufficient
diversification in enough stocks, the new portfolio methods should still work really well.
Figure 54: Test D – Compare values of 128 random portfolios with 30 stocks each.
Figure 55: Test D – Compare values of 128 random portfolios with 100 stocks each.
Figure 56: Test D – Compare values of 128 random portfolios with 300 stocks each.
Figure 57: Test D – Compare Arithmetic Mean daily return of different portfolio methods and sizes.
Figure 58: Test D – Compare Std.Dev. of the daily return of different portfolio methods and sizes.
Figure 59: Test D – Compare Sharpe Ratios for daily returns of different portfolio methods and sizes.
Figure 60: Test D – Compare Cash Mean Daily Position of different portfolio methods and sizes.
Figure 61: Test D – Compare Months With Losses of different portfolio methods and sizes.
Figure 62: Test D – Compare Max Drawdown of different portfolio methods and sizes.
Figure 63: Test D – Compare Max Pullup of different portfolio methods and sizes.
diversification method moves a part of the portfolio into cash as expected, but it is remarkable that even
when the correlation matrix is e.g. pure noise or completely inverted, it still improves the performance.
Figure 69 shows the Sharpe Ratios, where some portfolio methods had consistently higher and
therefore better Sharpe Ratios than other portfolio methods. For all portfolio sizes the highest Sharpe
Ratios were for the Adaptive+ portfolios which use the actual future 10-day stock-correlations. The 2nd
best Sharpe Ratio is for the Adaptive+ portfolios which add heavy noise to the actual future 10-day
stock-correlations. And the 3rd best Sharpe Ratio is for the Adaptive+ portfolios which use the previous
10-day stock-correlations as a “naive” forecast of their future correlations. Remarkably, the Adaptive+
portfolios with completely random or malformed correlation matrices still have slightly higher Sharpe
Ratios than the Adaptive portfolios which do not use the diversification method at all. This again shows
how extremely robust the diversification method is to noise and errors in the correlation estimates.
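For reference, a Sharpe Ratio for daily returns can be sketched as below. It assumes a risk-free rate of zero and no annualization, which may differ from the exact definition used for the figures in this paper. The function name is hypothetical.

```python
from statistics import mean, stdev


def daily_sharpe_ratio(daily_returns, daily_risk_free=0.0):
    """Mean excess daily return divided by its sample standard deviation.

    ASSUMPTIONS: zero risk-free rate by default, and no annualization.
    """
    excess = [r - daily_risk_free for r in daily_returns]
    return mean(excess) / stdev(excess)


returns = [0.01, -0.01, 0.02, 0.0]
print(daily_sharpe_ratio(returns))

# Moving half of the portfolio into cash with zero return scales every
# daily return by 0.5, which leaves the ratio unchanged.
half_in_cash = [0.5 * r for r in returns]
print(daily_sharpe_ratio(half_in_cash))
```

With a zero risk-free rate, holding part of the portfolio in cash scales the mean and the standard deviation by the same factor, so a large cash-position does not by itself change this metric.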
Figure 71 compares the percentages of months with losses for the different portfolio methods. In all
cases the Adaptive and Adaptive+ variants have much better performance than the trivial Rebalanced
portfolios. For smaller portfolio sizes all the Adaptive and Adaptive+ variants have roughly the same
performance, probably because they all hold so much of their portfolios in cash. For portfolios with 50
stocks and above, the Adaptive+ portfolios had significantly better performance on this metric, and the
Adaptive+ portfolios with heavy noise had the second best performance. Even the Adaptive+ variants
with purely random or malformed correlation matrices performed roughly the same as the Adaptive
portfolios which did not adjust the portfolio weights for diversification, again showing the robustness
of the diversification method on this performance metric.
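The Months With Losses metric can be sketched as the fraction of calendar months in which the portfolio value declined. The sketch assumes that months are measured from one month-end value to the next and that the input is in chronological order; the paper may delimit the months differently. The function name is hypothetical.

```python
from datetime import date


def fraction_of_losing_months(dates, portfolio_values):
    """Fraction of months whose month-end value is below the previous one.

    ASSUMPTIONS: dates are in chronological order, and a month's return is
    measured between consecutive month-end portfolio values.
    """
    month_end = {}
    for d, v in zip(dates, portfolio_values):
        month_end[(d.year, d.month)] = v  # keeps the last value of each month

    ends = [month_end[key] for key in sorted(month_end)]
    monthly_returns = [b / a - 1.0 for a, b in zip(ends, ends[1:])]
    if not monthly_returns:
        raise ValueError("need data from at least two different months")

    num_losses = sum(1 for r in monthly_returns if r < 0.0)
    return num_losses / len(monthly_returns)


dates = [date(2020, 1, 15), date(2020, 1, 31), date(2020, 2, 28),
         date(2020, 3, 31), date(2020, 4, 30)]
values = [98.0, 100.0, 95.0, 105.0, 100.0]
print(fraction_of_losing_months(dates, values))  # 2 of the 3 months had losses
```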
Figure 72 compares the Max Drawdown for the different portfolio methods. In all cases the Adaptive
and Adaptive+ variants performed significantly better than the Rebalanced portfolios on this metric,
and for smaller portfolio sizes the performance was much better, because the Adaptive and Adaptive+
variants held a large part of their portfolios in cash. For larger portfolio sizes of 150, 200 and 300
stocks, the Adaptive+ portfolios with heavy noise in the correlation matrix performed slightly better
than all other portfolios, and the Adaptive+ variants with all correlations either being equal or inverted
performed roughly on par with the Adaptive portfolios that did not adjust for correlation at all. Recall
that the originally desired portfolio weights from the filtering process were calculated using the actual
future 2-3 year average stock-returns, which means that all these portfolio methods are completely
“blind” to any impending stock-market crashes, which is the reason all the portfolio methods have
fairly high Max Drawdowns. If you can somehow predict an impending stock-market crash, you can
incorporate this into the filtering process when calculating the desired portfolio-weights, and thereby
dramatically increase the performance. But it is very hard to predict stock-market crashes.
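For readers unfamiliar with the metric, the Max Drawdown can be sketched with its common definition, namely the worst decline from a previous peak in the portfolio value. The function name is hypothetical, and the paper's own implementation may differ in details such as the sign convention.

```python
def max_drawdown(values):
    """Worst decline from a running peak, as a negative fraction.

    For example, -0.25 means the portfolio at some point was 25% below
    its previous highest value.
    """
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1.0)
    return worst


# The peak of 120 is followed by a low of 80, so the result is 80/120 - 1.
values = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
print(f"{max_drawdown(values):.1%}")
```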
Figure 73 compares the Max Pullup for the different portfolio methods, which measures how well the
portfolios performed during the recovery-phases of stock-market crashes. For small portfolio sizes of
only 5 or 10 stocks the Adaptive and Adaptive+ variants had worse Max Pullups than the trivial
Rebalanced portfolios, but already for portfolios with 30 stocks they start to perform better, and for
portfolios with 50 stocks or more, the Adaptive and Adaptive+ portfolios all performed much better
than the trivial Rebalanced portfolios. For portfolios with 100 stocks and above, all the Adaptive+
variants performed much better than the Adaptive portfolios which did not adjust for correlation. Even
the Adaptive+ variants that use equal or inverted correlations, performed better than the Adaptive
portfolios. Interestingly, the best performance was for the Adaptive+ portfolios whose correlations had
heavy noise, and the portfolios that used pure noise for their correlations had roughly the same Max
Pullup as the Adaptive+ portfolios that used the actual future 10-day stock-correlations. So the
diversification method is often very beneficial for a portfolio’s performance during stock-market
recoveries, and it is extremely robust to noisy and malformed estimates of the correlation matrix.
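The Max Pullup is not defined in this part of the paper, so the sketch below assumes that it mirrors the Max Drawdown: the largest gain from a previous trough in the portfolio value. Both this definition and the function name are assumptions for illustration.

```python
def max_pullup(values):
    """Largest gain from a running trough, as a positive fraction.

    ASSUMPTION: this mirrors the usual Max Drawdown definition. For
    example, 2.0 means the portfolio at some point was 200% above its
    previous lowest value.
    """
    trough = values[0]
    best = 0.0
    for v in values:
        trough = min(trough, v)
        best = max(best, v / trough - 1.0)
    return best


# The low of 80 is followed by a high of 130, so the result is 130/80 - 1.
values = [100.0, 120.0, 90.0, 110.0, 80.0, 130.0]
print(f"{max_pullup(values):.1%}")
```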
14.3 Summary
In this section we tested the diversification method from Section 8.8 when using various kinds of noisy
and malformed correlation matrices. For smaller portfolio sizes it performed worse than the portfolios
whose weights were not adjusted for correlation, but this was merely because the diversification
method moves a lot of the portfolio into cash when there are too few assets available for investing.
For larger portfolio sizes the diversification method works extremely well. It performs significantly
better when the correlation matrix accurately represents the future stock-correlations. But it is truly
remarkable that the diversification method still works so well in the presence of heavy noise in the
correlation matrix – it even works with a correlation matrix that is completely random, or where all the
correlation values are set to 0.1, or all the correlations are inverted. That is really quite stunning!
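The malformed correlation matrices mentioned above can be sketched as follows. The paper's exact constructions are not given in this section, so the sketch makes two assumptions: "inverted" means flipping the sign of every off-diagonal correlation (not the matrix inverse), and "completely random" means uniform draws between -1 and +1. The function names are hypothetical.

```python
import random


def equal_correlations(n, value=0.1):
    """All off-diagonal correlations set to the same value."""
    return [[1.0 if i == j else value for j in range(n)] for i in range(n)]


def inverted_correlations(corr):
    """ASSUMPTION: 'inverted' means the sign of each correlation is flipped."""
    n = len(corr)
    return [[1.0 if i == j else -corr[i][j] for j in range(n)]
            for i in range(n)]


def random_correlations(n, rng):
    """ASSUMPTION: symmetric matrix of uniform random values in [-1, 1]."""
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            corr[i][j] = corr[j][i] = rng.uniform(-1.0, 1.0)
    return corr


actual = [[1.0, 0.8, -0.3],
          [0.8, 1.0, 0.5],
          [-0.3, 0.5, 1.0]]
print(equal_correlations(3))
print(inverted_correlations(actual))
print(random_correlations(3, random.Random(0)))
```

Note that matrices built this way are generally not positive semi-definite, so they are not valid correlation matrices in the strict mathematical sense, which makes them a harsh test of robustness.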
It would require further research to establish the exact reason why the diversification method is so
extremely robust to correlation estimates that are completely malformed. But a brief explanation is
perhaps that correlations actually change continually over time, and they are merely serving as a rough
guide to whether different assets tend to move up or down together in price. And because our new
diversification method is only allowed to decrease the portfolio-weights, the worst that can happen is
that it moves too much of the portfolio into cash, as we indeed saw for the smaller portfolio-sizes in the
tests above. So nothing bad happens when the correlation estimates are wrong, but when the
correlations are roughly correct, the diversification method still benefits the portfolio’s performance.
When using completely random or malformed correlation estimates, it is almost as if the diversification
method is like the proverbial blind hen that sometimes gets lucky and finds a grain of corn. In the next
section we will test if the diversification method still manages to “get lucky” when there is also heavy
noise in the future stock-returns that are used in the filtering process to calculate the portfolio weights.
Figure 64: Test E – Compare values of 128 random portfolios with 30 stocks each.
Figure 65: Test E – Compare values of 128 random portfolios with 100 stocks each.
Figure 66: Test E – Compare values of 128 random portfolios with 300 stocks each.
Figure 67: Test E – Compare Arithmetic Mean daily return of different portfolio methods and sizes.
Figure 68: Test E – Compare Std.Dev. of the daily return of different portfolio methods and sizes.
Figure 69: Test E – Compare Sharpe Ratios for daily returns of different portfolio methods and sizes.
Figure 70: Test E – Compare Cash Mean Daily Position of different portfolio methods and sizes.
Figure 71: Test E – Compare Months With Losses of different portfolio methods and sizes.
Figure 72: Test E – Compare Max Drawdown of different portfolio methods and sizes.
Figure 73: Test E – Compare Max Pullup of different portfolio methods and sizes.
Figure 76 compares the portfolio values when each portfolio has 300 random stocks. This is very
similar to Figure 75 that we just discussed for portfolios of 100 stocks. All the Adaptive+ variants
performed significantly better than the Adaptive portfolios which did not adjust the portfolio weights
for correlations. Once again it seems that the “Ad+ Corr. Naive” portfolios were slightly better than the
others, and the Adaptive+ variants that used equal or inverted correlations were the worst, although
these also managed to significantly improve the portfolio values compared to the Adaptive portfolios.
These plots have shown that for small portfolios with only 30 stocks, the diversification method from
Section 8.8 has a seemingly random effect on the investment return over time. This is probably because
portfolios with only 30 stocks are too small for the diversification method to function properly. Already
for portfolios with 100 stocks, the diversification method significantly improves the investment return
over time, and it works even better for portfolios with 300 stocks. It is truly remarkable that the
diversification method works so well in the presence of very noisy and malformed correlations, as well
as heavy noise in the estimates for the future stock-returns.
Adaptive+ variants always had lower standard deviation than even the Rebalanced portfolios, including
the portfolios that were using the “naive” forecasting of the stock-correlations.
Figure 79 compares the Sharpe Ratios for the different portfolios. Regardless of the portfolio size, the
following portfolio types had the highest and therefore best Sharpe Ratios: The Adaptive+ portfolios
which used the actual future 10-day stock-correlations, the portfolios that added heavy noise to these
correlations, and the “naive” portfolios that used the previous 10-day stock-correlations. For larger
portfolio sizes of 150, 200 and especially 300 stocks, the Adaptive+ portfolios with the “naive”
correlation forecasts usually performed significantly better than all the others. Even the Adaptive+
portfolios with equal or inverted correlations performed on par with (or perhaps slightly better than) the
Adaptive portfolios which did not adjust the portfolio weights with regard to stock-correlations.
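The “naive” correlation forecast can be sketched as the ordinary Pearson correlation of the previous 10 daily returns, which is then used as the estimate for the next 10 days. The function names, the input layout, and the handling of zero variance are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # ASSUMPTION: treat a constant series as uncorrelated
    return cov / sqrt(var_x * var_y)


def naive_correlation_forecast(returns_a, returns_b, today, window=10):
    """Use the correlation of the previous `window` daily returns (the days
    before index `today`) as the forecast for the following days."""
    start = today - window
    if start < 0:
        raise ValueError("not enough history for the requested window")
    return pearson(returns_a[start:today], returns_b[start:today])


a = [0.01, -0.02, 0.015, 0.0, 0.005, -0.01, 0.02, -0.005, 0.01, -0.015, 0.03]
b = [2.0 * r for r in a]  # moves exactly with stock a
c = [-r for r in a]       # moves exactly opposite to stock a
print(naive_correlation_forecast(a, b, today=10))
print(naive_correlation_forecast(a, c, today=10))
```

Only returns before the index `today` are used, so the forecast contains no future data, in contrast to the omniscient correlations.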
Figure 81 compares the percentages of months with losses for the different portfolio methods. For
portfolio sizes of 30 stocks or more, the performance patterns are roughly the same, namely that the
Rebalanced portfolios had losses in roughly 35% of months, while the Adaptive portfolios had losses in
roughly 32% of months so they were slightly better, and the Adaptive+ portfolios using the actual
future 10-day stock-correlations had losses in roughly 30% of months, and slightly less for the larger
portfolios with 150, 200 or 300 stocks. The Adaptive+ variants with noisy or malformed correlations
performed roughly on par with this, with the “naive” forecasts for the stock-correlations having almost
exactly the same performance as the Adaptive+ portfolios using the actual future 10-day correlations.
The worst Adaptive+ variants were the ones using equal or inverted correlations, but they still just had
roughly the same performance as the Adaptive portfolios which did not adjust for correlations at all.
Figure 82 compares the Max Drawdowns of the different portfolios. For smaller portfolios the Adaptive
and Adaptive+ variants performed much better than the Rebalanced portfolios, but that was simply
because they held a lot of their portfolios in cash. For portfolios with 50 stocks or more, the Adaptive
and Adaptive+ variants had Max Drawdowns that were roughly on par with each other, and they were
mostly better than for the Rebalanced portfolios.
Figure 83 compares the Max Pullups of the different portfolios, to see how well they recovered from
stock-market crashes. For small portfolios with only 5 or 10 stocks, all the Adaptive and Adaptive+
variants performed much worse than the Rebalanced portfolios, but that was probably just because they
held a lot of their portfolios in cash. Already for portfolios with 30 stocks, the Adaptive and Adaptive+
variants were usually better than the Rebalanced portfolios. For the larger portfolio sizes with 150, 200
and especially 300 stocks, the Adaptive portfolios were nearly always better than the Rebalanced
portfolios, and the Adaptive+ variants were nearly always better than the Adaptive portfolios, perhaps
with the exception of the Adaptive+ variants using equal or inverted correlations. The other Adaptive+
variants had roughly the same performance with Max Pullups around 200%.
15.3 Summary
This section used heavy noise in the estimates for the future stock-returns, as well as very noisy and
malformed stock-correlations. For smaller portfolios the diversification method under-performed
because it moved a lot of the portfolios into cash. But for larger portfolios it worked extremely well,
which is very impressive considering how noisy the estimated stock-returns and correlations were!
Figure 74: Test F – Compare values of 128 random portfolios with 30 stocks each.
Figure 75: Test F – Compare values of 128 random portfolios with 100 stocks each.
Figure 76: Test F – Compare values of 128 random portfolios with 300 stocks each.
Figure 77: Test F – Compare Arithmetic Mean daily return of different portfolio methods and sizes.
Figure 78: Test F – Compare Std.Dev. of the daily return of different portfolio methods and sizes.
Figure 79: Test F – Compare Sharpe Ratios for daily returns of different portfolio methods and sizes.
Figure 80: Test F – Compare Cash Mean Daily Position of different portfolio methods and sizes.
Figure 81: Test F – Compare Months With Losses of different portfolio methods and sizes.
Figure 82: Test F – Compare Max Drawdown of different portfolio methods and sizes.
Figure 83: Test F – Compare Max Pullup of different portfolio methods and sizes.
16.1 Parameters
The Adaptive filter is just a mathematical formula with a few parameters that determine how the filter
behaves for different estimates of the future stock-returns, so the formula makes a portfolio weight
larger when the future stock-returns are estimated to be higher, and vice versa.
We can change the parameters for the Adaptive filter and thereby change how it behaves for different
estimates of the future stock-returns. There are three such parameters for the Adaptive filter in Eq. (28)
and Eq. (29). In all the previous tests, the parameters were set to these values:
$\mu_{\min} = 10\%, \quad \mu_{\max} = 30\%, \quad \text{Weight}_{\max} = 5\% \qquad (92)$
These parameters were chosen according to what I personally thought was reasonable: The minimum
future stock-return should be 10% per year, for which the portfolio weight would be zero. The portfolio
weight then increases linearly until the future stock-return is estimated to be 30%, for which the
portfolio weight is set to 5%, which is the maximum portfolio weight that is allowed. Using these
parameters, the Adaptive filter worked extremely well for larger portfolio sizes of 100-300 stocks, but
it caused the smaller portfolios to hold too much cash.
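The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the prose description (zero weight at the minimum return, a linear increase, and a cap at the maximum weight); it is not a transcription of Eq. (28) and Eq. (29), it only covers positive weights, and the function name `adaptive_weight` is hypothetical.

```python
def adaptive_weight(mu, mu_min=0.10, mu_max=0.30, weight_max=0.05):
    """Portfolio weight for one stock given its estimated annual return `mu`.

    Assumed shape: 0 at or below mu_min, linear in between,
    and capped at weight_max at or above mu_max.
    """
    if mu <= mu_min:
        return 0.0
    if mu >= mu_max:
        return weight_max
    return weight_max * (mu - mu_min) / (mu_max - mu_min)


# Estimated annual returns (hypothetical) and the resulting weights.
for mu in [0.05, 0.10, 0.20, 0.30, 0.40]:
    print(f"estimated return {mu:.0%} -> weight {adaptive_weight(mu):.2%}")
```

With these default parameters, a portfolio of 5 stocks can be at most 5 × 5% = 25% invested, and a portfolio of 10 stocks at most 50%, which is consistent with the smaller portfolios holding a lot of cash.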
16.2 Optimization
We now want to find parameters for the Adaptive filter that are better suited for portfolios of only 30
stocks. The question is how to find such parameters. In this case only 3 parameters need tuning, but a
more sophisticated filtering process could have many more. The number of possible parameter
combinations grows exponentially with the number of parameters, which quickly makes it impractical
to try them all. So we need a clever way of tuning the parameters automatically using a computer.
This is essentially an optimization problem, where the search-space consists of all valid parameter
combinations, and we want to find the parameters that perform best on some metric, such as the
arithmetic mean daily return. But we need a kind of optimization method that does not require the
mathematical gradient of the problem being optimized, because the performance metric is calculated
from simulation results, e.g. from using the portfolio method on 128 random portfolios with a given
set of parameters for the Adaptive filter.
If there is only a single objective that needs optimization, such as the mean daily return of the portfolio,
then we could use one of many so-called Evolutionary Algorithms, which usually work quite well.
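As a rough illustration of gradient-free, single-objective tuning, the following Python sketch uses a very simple (1+1) evolution strategy: it mutates the current best parameters with random noise and keeps the mutation if the fitness does not get worse. The function `toy_fitness`, its optimum, and the search bounds are arbitrary placeholders that stand in for a real portfolio simulation; they are not results or settings from this paper.

```python
import random

# Hypothetical search bounds for (mu_min, mu_max, weight_max).
BOUNDS = [(0.0, 0.2), (0.2, 0.5), (0.01, 0.2)]


def toy_fitness(params):
    """Placeholder for a simulation result (higher is better).

    A real fitness function would run the portfolio method on e.g. 128
    random portfolios and return a metric such as the mean daily return.
    The optimum used here is arbitrary.
    """
    mu_min, mu_max, weight_max = params
    return -((mu_min - 0.05) ** 2 + (mu_max - 0.25) ** 2 + (weight_max - 0.10) ** 2)


def evolve(fitness, bounds, iterations=2000, step=0.1, seed=0):
    """(1+1) evolution strategy: only fitness values are used, no gradients."""
    rng = random.Random(seed)
    best = [rng.uniform(lo, hi) for lo, hi in bounds]
    best_fit = fitness(best)
    for _ in range(iterations):
        # Mutate every parameter with Gaussian noise and clip to its bounds.
        cand = [min(hi, max(lo, x + rng.gauss(0.0, step * (hi - lo))))
                for x, (lo, hi) in zip(best, bounds)]
        cand_fit = fitness(cand)
        if cand_fit >= best_fit:  # keep the candidate if it is not worse
            best, best_fit = cand, cand_fit
    return best, best_fit


best_params, best_value = evolve(toy_fitness, BOUNDS)
print(best_params, best_value)
```

The optimizer only ever calls `fitness(params)`, so the placeholder can be swapped for a full simulation without changing the search loop.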
[Figure labels: NSGA-2 Optimizer; Fitness 1; Fitness 2]
Figure 85: The Pareto Front for parameters of the Adaptive portfolio method.
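The idea of a Pareto front can be sketched in Python as follows. A parameter set is on the front if no other set is at least as good on every objective and strictly better on at least one. The two objectives and all numbers below are hypothetical and both are maximized. This sketch only shows the dominance test, not the NSGA-II algorithm [Deb 2000] itself.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (fitness 1, fitness 2) pairs, e.g. (mean return, max drawdown),
# where a less negative drawdown is better.
candidates = [
    (0.08, -0.30),
    (0.10, -0.45),
    (0.06, -0.20),
    (0.07, -0.35),  # dominated by (0.08, -0.30)
    (0.10, -0.50),  # dominated by (0.10, -0.45)
]
print(pareto_front(candidates))
# [(0.08, -0.3), (0.1, -0.45), (0.06, -0.2)]
```

Every point on the front is a different trade-off between the two objectives, so the final choice of parameters is left to the user.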
Figure 86: Test G-1 – Compare performance statistics for the different portfolio methods, where some
of them use the tuned parameters from Eq. (93).
Figure 87: Test G-2 – Compare performance statistics for the different portfolio methods, where some
of them use the tuned parameters from Eq. (94).
Figure 88: Test G-3 – Compare performance statistics for the different portfolio methods, where some
of them use the tuned parameters from Eq. (95).
17 Future Research
Suggestions for future research have been made throughout this paper and the more important ones are
summarized here along with a few more. You are encouraged to do this research and write a paper.
Some of these suggestions are fairly easy and would just require minor modifications to the computer
code provided in Section 19, while other suggestions might require significant effort and ingenuity.
The suggestions for future research are:
• Further analysis of the diversification method, e.g.: Why is it so robust on noisy and malformed
correlation matrices? How does it change the portfolio weights? Are there any weaknesses?
• Find a better or perhaps more general convergence proof for the diversification method. This
would be a very big contribution. What are the requirements for the Full Exposure function that
makes the algorithm converge, how fast does it converge, and is the solution always unique?
• Try other definitions of the Full Exposure that is used in the diversification algorithm. Can the
performance be improved in some way? Or is the Full Exposure already optimal? Why?
• Try other variants of the filtering process. Perhaps an exponentially increasing function is
better, because it would give exponentially more weight to assets with higher expected returns?
Can other financial data for a company or stock be used successfully in the filtering process?
• Use low-risk bonds or a bond ETF (Exchange-Traded Fund) instead of cash in the portfolios.
Does that improve or worsen the portfolio’s performance on some metrics?
• The data-set with U.S. stocks has gone through several selection processes where a lot of stocks
have been eliminated, including stocks that became worthless. This means there is so-called
“survivorship bias” in all the tests in this paper. Try using the new portfolio method with a
larger data-set that contains more stocks that became worthless to see how it performs.
• Try other variants of the correlation matrix, e.g.: Which correlation measure works best? How is
the correlation matrix best forecast, e.g. what is the optimal number of days X that makes the
diversification method work best, when using the previous X days of stock-correlations as a
“naive” forecast for the future correlations? Is it useful to combine correlations from historic
periods, such as the big stock-market crashes in the years 2009 and 2020?
• A big open problem is of course the forecasting of future stock-returns. We have seen in this
paper that if we can make reasonable but also very noisy predictions of the future 2-3 year
average stock-returns, then that is sufficient to make good portfolio allocations with high
returns. The big question is how we can predict the future 2-3 year stock-returns? A good
starting point is my previous paper on Long-Term Stock Forecasting [Pedersen 2020].
• On a theoretical level, what would happen to asset-prices if all market-participants used the
same filtering and diversification methods with the same parameters and assumptions?
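As a possible starting point for the suggestion about correlation matrices, the “naive” forecast can be sketched in Python by computing the Pearson correlation of the previous X days of returns and using that as the forecast. The ticker names, the return data, and the function names are hypothetical, and Pearson correlation is only one of the correlation measures that could be tried.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists of returns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: fails with a division by zero if either series is constant.
    return cov / math.sqrt(var_x * var_y)


def naive_correlation_forecast(returns, window):
    """Use the correlations of the last `window` days as the forecast.

    `returns` maps a ticker to its daily returns, oldest first.
    """
    tickers = sorted(returns)
    recent = {t: returns[t][-window:] for t in tickers}
    return {(a, b): 1.0 if a == b else pearson(recent[a], recent[b])
            for a in tickers for b in tickers}


# Hypothetical daily returns; only the last 4 days are used below.
returns = {
    "AAA": [0.05, 0.01, -0.02, 0.03, 0.00],
    "BBB": [-0.04, 0.02, -0.04, 0.06, 0.00],   # moves with AAA in the window
    "CCC": [0.00, -0.01, 0.02, -0.03, 0.00],   # moves against AAA in the window
}
forecast = naive_correlation_forecast(returns, window=4)
print(forecast[("AAA", "BBB")], forecast[("AAA", "CCC")])
```

Repeating the tests in this paper for different values of `window` would be one way to look for the number of days X that works best.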
Godspeed!
18 Conclusion
In this paper we first showed that so-called “mean-variance” portfolio optimization is inherently broken
because variance is a horrible risk-measure for investing, and because the portfolio’s mean and
variance are being optimized simultaneously, so estimation errors can cause that method to concentrate
its portfolio in losing assets that are highly correlated. We also dispelled some other common academic
misbeliefs about the randomness and predictability of future stock-returns and correlations.
We then presented our new so-called “filter-diversify” portfolio method which has two separate phases:
The filtering process only allows assets into the portfolio if they have sufficiently high estimated
returns. The portfolio weights created from the filtering process are then fed into the diversification
process, which uses a new algorithm to minimize the correlations between assets in the portfolio.
The new diversification algorithm can be dubbed “Hvass Diversification” for easy reference, and it has
several benefits: It is fairly simple. It supports both long and short portfolios. It is very fast as it only
takes a few milli-seconds to compute for a portfolio of 1000 assets. It has quadratic time-complexity so
it can be used with even larger portfolios. It is guaranteed to converge to the optimal solution in a small
number of iterations. And it is extremely robust to estimation errors in the correlation matrix.
Both the filtering and diversification algorithms have been extensively tested using real-world data for
nearly 1000 U.S. stocks. It is common in academic research that testing of portfolio optimization is
conflated with testing of stock-prediction models, so it is impossible to tell which part was responsible
for the poor performance. In this paper we instead made so-called “omniscient” tests which used the
actual future 2-3 year average stock-returns in the filtering process, and then we used the actual future
10-day stock-correlations in the diversification algorithm. This showed that our new portfolio method
worked extremely well when it was given correct predictions about the future stock-returns and
correlations. We then tested our new portfolio method for robustness by adding heavy noise to the
future stock-returns and correlations. We even tested the diversification method with correlations that
were completely malformed, and the diversification method still performed very well.
The reason our new diversification algorithm performed so well in the presence of heavy noise or
completely malformed correlation matrices is probably that the diversification algorithm only allows
the portfolio weights to decrease, so the worst that can happen is that it moves too much of the portfolio
into cash. And even with very noisy correlations, the diversification algorithm was still able to improve
on some performance metrics, probably because stock-correlations vary greatly over time, so even
though the correlation matrix is very noisy, sometimes it is approximately correct; much like the
proverbial broken clock that is still correct twice a day. But more research would be needed to
understand exactly why the diversification method works so well under heavy noise. It is generally
recommended that you extensively test the new portfolio method before using it in your own investing.
The diversification method does not have any user-adjustable parameters, but the filtering process does.
For most of the tests we saw that our new portfolio method placed too much of the portfolio into cash
when the portfolio size was small. This was because of the particular parameters used in the filtering
process. We then showed how to use a so-called “Multi-Objective” optimizer to tune the parameters, so
our new portfolio method could also be used effectively for smaller portfolios with only 30 stocks.
19.2 Stock-Tickers
The following are all 949 stock-tickers from USA whose daily stock-prices were used in these tests.
A, AAL, AAP, AAPL, AAWW, ABC, ABG, ABMD, ABT, ACC, ACHC, ACM, ACN, ADBE, ADI, ADM, ADP,
ADS, ADSK, ADTN, ADXS, AEE, AEO, AEP, AES, AET, AFG, AFL, AGCO, AGN, AGNC, AGO, AHL, AIG,
AIZ, AJG, AKAM, AKS, ALB, ALE, ALGN, ALGT, ALK, ALKS, ALL, ALNY, ALR, ALV, ALXN, AMAT,
AMD, AME, AMED, AMG, AMGN, AMKR, AMP, AMT, AMTD, AMZN, AN, ANDV, ANF, ANSS, ANTM,
AON, AOS, APA, APC, APD, APH, ARE, ARG, ARNA, ARO, ARRS, ARW, ASB, ASH, ATI, ATO, ATR, ATVI,
ATW, AVB, AVGO, AVNT, AVP, AVT, AVY, AWI, AWK, AXON, AXP, AXS, AYI, AZO, AZPN, BA, BAC,
BAX, BB, BBBY, BBWI, BBY, BC, BCO, BCR, BDC, BDN, BDX, BEBE, BEN, BG, BGS, BHC, BHI, BID,
BIG, BIIB, BIO, BJRI, BK, BKD, BKE, BKH, BKNG, BKS, BLK, BLL, BMRN, BMS, BMY, BOH, BPL,
BPOP, BR, BRCD, BRK-A, BRO, BRS, BSX, BWA, BX, BXP, BYD, BZH, C, CA, CACI, CAG, CAH, CAKE,
CAR, CASY, CAT, CAVM, CB, CBB, CBI, CBRE, CBRL, CBSH, CBT, CCE, CCI, CCK, CCL, CCOI, CDE,
CDNS, CDR, CE, CELG, CEQP, CERN, CF, CFR, CFX, CGRN, CHD, CHE, CHH, CHRW, CI, CIEN, CINF,
CL, CLC, CLF, CLGX, CLH, CLI, CLR, CLX, CMA, CMC, CMCSA, CME, CMG, CMI, CMP, CMPR, CMS,
CNC, CNK, CNO, CNP, CNVR, CNX, COF, COG, COL, COLM, COO, COP, COST, CP, CPB, CPE, CPN,
CPT, CR, CREE, CRI, CRL, CRM, CROX, CRR, CRS, CRUS, CRZO, CSC, CSCO, CSL, CSX, CTAS, CTSH,
CTXS, CUBE, CUZ, CVA, CVLT, CVS, CVX, CW, CXO, CXW, CY, D, DAL, DAR, DBD, DBI, DCI, DD,
DDS, DE, DECK, DEI, DFODQ, DFS, DFT, DG, DGX, DHI, DHR, DIS, DISCA, DISH, DK, DKS, DLB,
DLR, DLTR, DLX, DOV, DPZ, DRE, DRI, DRQ, DTE, DUK, DVA, DVN, DXCM, DY, EA, EAT, EBAY, ECA,
ECL, ED, EEFT, EEP, EFX, EGP, EIX, EL, EMC, EME, EMN, EMR, ENDP, ENR, ENS, EOG, EPAC, EPD,
EPR, EQIX, EQR, EQT, ES, ESL, ESRX, ESS, ETFC, ETN, ETR, EV, EVR, EW, EWBC, EXAS, EXC, EXEL,
EXP, EXPD, EXPE, EXR, F, FAST, FCN, FCX, FDS, FDX, FE, FFIV, FHI, FHN, FICO, FIS, FISV, FITB, FL,
FLEX, FLIR, FLO, FLR, FLS, FMC, FNB, FNSR, FOSL, FR, FRT, FRX, FSLR, FTI, FTNT, FUL, FULT, G,
GBX, GCO, GD, GE, GEO, GHC, GILD, GIS, GL, GLW, GNTX, GNW, GPC, GPN, GPS, GRA, GRMN, GS,
GT, GWR, GWW, GXP, H, HAIN, HAL, HAS, HBAN, HBI, HD, HE, HES, HFC, HIBB, HIW, HL, HLF,
HMC, HOG, HOLX, HON, HP, HPQ, HR, HRB, HRC, HRL, HSIC, HST, HSY, HUM, HUN, HWC, HWM,
HXL, IAC, IBKC, IBKR, IBM, ICE, IDA, IDCC, IDXX, IEX, IFF, IGT, ILMN, INCY, INFN, INGR, INT,
INTC, INTU, IO, IONS, IP, IPG, IPGP, IPI, IPXL, IRBT, IRET, IRM, ISIL, ISRG, IT, ITRI, ITT, ITW, IVR,
IVZ, J, JACK, JAKK, JBHT, JBL, JBLU, JCI, JCOM, JEF, JKHY, JLL, JNJ, JNPR, JOY, JPM, JWN, K, KATE,
KBH, KBR, KDP, KEX, KEY, KIM, KLAC, KMB, KMT, KMX, KO, KR, KRC, KSS, KSU, L, LAMR, LAZ,
LBTYA, LDOS, LEA, LECO, LEG, LEN, LH, LHX, LII, LKQ, LL, LLL, LLY, LMT, LNC, LNG, LNT, LOGI,
LOGM, LOPE, LOW, LPX, LRCX, LSTR, LULU, LUMN, LUV, LVLT, LVS, LXP, LYV, M, MA, MAA, MAC,
MAN, MANH, MAR, MAS, MASI, MAT, MCD, MCHP, MCK, MCO, MD, MDLZ, MDP, MDR, MDRX,
MDSO, MDT, MDU, MELI, MET, MGLN, MGM, MHK, MIC, MIDD, MJN, MKC, MKL, MKSI, MKTX,
MLHR, MLM, MMC, MMM, MMS, MNKD, MNRO, MNST, MO, MOH, MOS, MPW, MPWR, MRK, MRO,
MRVL, MS, MSCC, MSCI, MSFT, MSI, MSM, MTB, MTD, MTG, MTN, MTOR, MTZ, MU, MUR, MWW,
MXIM, MYGN, MYL, NATI, NAV, NBIX, NBL, NCR, NDAQ, NDSN, NEE, NEM, NEU, NFG, NFLX, NFX,
NI, NKE, NKTR, NLOK, NLY, NNN, NOC, NOK, NOV, NRG, NS, NSC, NTAP, NTGR, NTRS, NUAN, NUE,
NUVA, NVAX, NVDA, NVR, NWE, NWL, NXST, NYT, O, OA, OC, OCN, ODFL, ODP, OFC, OGE, OHI,
OI, OII, OKE, OLED, OLN, OMC, OMI, ON, OPI, ORCL, ORI, ORLY, OSK, OVV, OXY, PAA, PACW,
PAYX, PBCT, PBI, PCAR, PCG, PCH, PDCO, PDLI, PEAK, PEG, PENN, PEP, PETM, PFE, PFG, PG, PGR,
PH, PHH, PHM, PII, PKG, PKI, PLCE, PLD, PM, PNC, PNM, PNR, PNRA, PNW, PODD, POM, POOL, PPC,
PPG, PPL, PRGO, PRU, PRXL, PSA, PTC, PTEN, PVH, PWR, PX, PXD, PZZA, QCOM, QRTEA, R, RAD,
RAI, RAX, RBC, RCII, RCL, RDC, RDN, RE, REGN, REN, RES, RF, RGA, RGLD, RHI, RHT, RIG, RJF, RL,
RMD, RNR, ROK, ROP, ROST, RPM, RRC, RS, RSG, RTN, RTX, RYN, SAFM, SAM, SANM, SBAC, SBGI,
SBH, SBUX, SCG, SCHW, SCI, SEE, SEIC, SF, SFLY, SGEN, SGMS, SGY, SHO, SHW, SIG, SIRI, SITC,
SIVB, SJM, SKT, SKX, SLAB, SLB, SLG, SLM, SM, SMG, SMTC, SNA, SNBR, SNDK, SNH, SNI, SNPS,
SO, SOHU, SON, SONC, SPB, SPG, SPGI, SPLS, SPR, SPWR, SPXC, SPY, SRCL, SRE, SSYS, STE, STI,
STJ, STLD, STRA, STT, STX, STZ, SUI, SVU, SWK, SWKS, SWN, SWX, SWY, SXT, SYK, SYNA, SYY, T,
TAP, TCBI, TCO, TDC, TDG, TDW, TDY, TECD, TECH, TEL, TEN, TER, TEVA, TFC, TFX, TGI, TGNA,
TGT, THC, THO, THS, TIF, TIVO, TJX, TKR, TLRD, TM, TMO, TMUS, TNL, TOL, TPR, TPX, TRN,
TROW, TRUE, TRV, TSCO, TSM, TSN, TSS, TT, TTEK, TTWO, TUP, TWX, TXN, TXRH, TXT, TYL, UAL,
UDR, UFS, UGI, UHS, UIS, ULTA, ULTI, UMPQ, UNFI, UNH, UNM, UNP, UPS, URBN, URI, USB, USG,
UTHR, V, VAR, VFC, VIA, VIAC, VLO, VMC, VMI, VMW, VNO, VR, VRSK, VRSN, VRTX, VSAT, VSH,
VTR, VZ, WAB, WAT, WBA, WBC, WBMD, WCC, WCG, WDC, WEC, WELL, WEN, WERN, WEX, WFC,
WFM, WFT, WGL, WHR, WLK, WM, WMB, WMT, WOR, WPC, WR, WRB, WRE, WRI, WSM, WSO, WST,
WTRG, WU, WWD, WWW, WY, WYNN, X, XCO, XEC, XEL, XLNX, XOM, XPO, XRAY, XRX, Y, YHOO,
YUM, ZBH, ZBRA, ZION
20 Bibliography
[Deb 2000] K. Deb, S. Agrawal, A. Pratap, T. Meyarivan, “A Fast Elitist Non-dominated Sorting
Genetic Algorithm for Multi-objective Optimization: NSGA-II”,
Parallel Problem Solving from Nature VI, 2000.
[Luenberger 1998] D.G. Luenberger, “Investment Science”, 1998.
[Markowitz 1952] H. Markowitz, “Portfolio Selection”, Journal of Finance, 1952.
[Markowitz 1959] H. Markowitz, “Portfolio Selection: Efficient Diversification of Investments”, 1959.
[Pedersen 2014] M.E.H. Pedersen, “Portfolio Optimization & Monte Carlo Simulation”, 2014.
[Pedersen 2020] M.E.H. Pedersen, “Long-Term Stock Forecasting”, 2020.
[Pedersen 2021] M.E.H. Pedersen, “Does Volatility Harvesting Really Work?”, 2021.
[Thorp 1975] E.O. Thorp, “Portfolio Choice and the Kelly Criterion”,
Stochastic Optimization Models in Finance, 1975.
My previous papers and books can all be downloaded through SSRN and GitHub.