ROBUST ESTIMATION OF RISK FACTOR

MODEL COVARIANCE MATRIX


Dmitri Mossessian

Viviana Vieli

Contents
Introduction

Covariance matrix of a risk factor model

Covariance matrix estimation using random matrix theory (RMT)

Simple example

Introduction

Every multifactor risk model is based on a factor covariance matrix that is used to reproduce, model, and analyze the
joint distribution of risk factor returns. The moments of that distribution are widely used by market practitioners as measures
of risk. The factor covariance matrix is computed from historical time series of risk factor returns, and the limited number of
samples in these time series always leads to estimation errors in the covariance matrix itself. If a relatively small number of
historical samples (comparable to the number of factors in the model) is used to estimate the matrix, this error may become
very large. The presence of this noise in the matrix not only decreases the accuracy of the risk forecasts, but can also cause
the matrix estimator to have undesired properties that prevent such forecasts altogether. In particular, the estimator of the
covariance matrix may not be positive definite, rendering the matrix useless for Monte Carlo simulation of risk. This note
addresses the problem of noise in a factor model covariance matrix and outlines a method of finding the optimal matrix
estimator that is based on random matrix theory. The method is similar to the simple regularization method proposed by
Rebonato and Jäckel in [1]. The advantage of the proposed method is that it not only makes the resulting estimator positive
definite, but also reduces the amount of noise in the estimator and minimizes the differences between the estimator and the
(generally unknown) true covariance matrix of the model.
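
The loss of positive definiteness mentioned above can be illustrated with a minimal sketch. It assumes NumPy, simulated i.i.d. normal returns, and hypothetical sizes (50 factors, 30 samples); the helper name `sample_covariance` is illustrative and not part of any risk model. With fewer samples than factors the sample covariance matrix is rank deficient, so some of its eigenvalues are zero up to rounding error.

```python
import numpy as np

def sample_covariance(returns: np.ndarray) -> np.ndarray:
    """Sample covariance of an (N samples x M factors) array of returns."""
    centered = returns - returns.mean(axis=0)
    return centered.T @ centered / (returns.shape[0] - 1)

rng = np.random.default_rng(0)
n_factors = 50   # M, hypothetical number of factors
n_samples = 30   # N, deliberately smaller than M

returns = rng.standard_normal((n_samples, n_factors))
cov = sample_covariance(returns)

# With N < M the matrix cannot have full rank, so it is at best
# positive semidefinite rather than positive definite.
rank = np.linalg.matrix_rank(cov)
smallest_eigenvalue = np.linalg.eigvalsh(cov).min()
print(f"rank = {rank} of {n_factors}, smallest eigenvalue = {smallest_eigenvalue:.2e}")
```

A matrix of this kind has no reliable Cholesky factor, which is one common way correlated Monte Carlo draws are generated.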

Covariance matrix of a risk factor model

Factor models are widely used to describe the return process of financial securities and to analyze the risks of portfolios. They
reduce the dimensionality of the problem and at the same time help to identify the major economic sources of portfolio
risk. A factor model decomposes the total return of a security into a sum of systematic returns, which are due to movements
of common factors, and an idiosyncratic component. The idiosyncratic returns are generally assumed to be independent of the
systematic returns and uncorrelated with each other, so that correlations between securities are defined by the correlations
between systematic risk factors. The following equations describe the components of security return in a linear factor model
(a nonlinear model might have a more complex relationship between factor returns and security returns, but the basic
relationship between factor covariances and portfolio risk remains the same). A linear factor model expresses the return $r_j$
of a security $S_j$ as a linear combination of factor returns $f_i$, $i = 1, \dots, M$, where $M$ is the number of factors in the model:

$$r_j = \sum_{i=1}^{M} l_{ij} f_i + \varepsilon_j \tag{1}$$

Here $l_{ij}$ is the loading (sensitivity) of security $j$ to the factor $f_i$, and $\varepsilon_j$ is the idiosyncratic component of the security return
(the portion of the return not explained by systematic factors). If we introduce the vector of loadings $l_j^T = (l_{1j}, \dots, l_{Mj})$ for
each security, and define the matrix of loadings $L$ whose columns are the vectors $l_j$ (an $M \times K$ matrix), the vector of security
returns $r^T = (r_1, \dots, r_K)$ for all $K$ securities in the portfolio is:

$$\underset{K \times 1}{r} = \underset{K \times M}{L^T} \, \underset{M \times 1}{f} + \varepsilon \tag{2}$$

Copyright © 2017 FactSet Research Systems Inc. All rights reserved. 2 FactSet Research Systems Inc. | www.factset.com
where $\varepsilon$ is the vector of the idiosyncratic components $\varepsilon_j$. Thus, a portfolio return can be written as:

$$r_p = w^T (L^T f + \varepsilon) = (L w)^T f + w^T \varepsilon \tag{3}$$

where the vector $w$ consists of the weights of the securities in the portfolio. Each factor return $f_i$ is a random variable,
and a factor model of returns is basically a model that forecasts the joint distribution of all the factors in the model at a
certain time horizon $T$. This forecast allows us to use Eq. (3) to model the distribution of the portfolio return at that horizon,
which is fully described by the covariance matrix $\Sigma$ of the factor returns. For example, the variance of the portfolio return
(the square of its volatility) can be computed as:

$$\sigma_p^2 = (L w)^T \, \Sigma \, (L w) + w^T E \, w \tag{4}$$

where $E$ is the diagonal matrix of idiosyncratic return variances.
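
As an illustration of Eq. (4), the following sketch computes the portfolio variance from loadings, a factor covariance matrix, and idiosyncratic terms. It assumes NumPy and assumes that the diagonal of E holds idiosyncratic variances, so that both terms are in variance units. The function name `portfolio_variance` and all numerical inputs are hypothetical toy values, not taken from any model.

```python
import numpy as np

def portfolio_variance(loadings, factor_cov, idio_var, weights):
    """Portfolio variance (L w)^T Sigma (L w) + w^T E w, following Eq. (4).

    loadings:   (M x K) matrix L, one column of factor loadings per security
    factor_cov: (M x M) factor covariance matrix Sigma
    idio_var:   (K,) diagonal of E, assumed to hold idiosyncratic variances
    weights:    (K,) portfolio weights w
    """
    exposure = loadings @ weights                    # L w: portfolio factor exposures
    systematic = exposure @ factor_cov @ exposure    # (L w)^T Sigma (L w)
    idiosyncratic = np.sum(weights**2 * idio_var)    # w^T E w for a diagonal E
    return systematic + idiosyncratic

# Hypothetical toy inputs: M = 2 factors, K = 3 securities.
L = np.array([[1.0, 0.8, 1.2],
              [0.2, -0.1, 0.5]])
Sigma = np.array([[0.040, 0.006],
                  [0.006, 0.010]])
idio_var = np.array([0.02, 0.03, 0.01])
w = np.array([0.5, 0.3, 0.2])

var_p = portfolio_variance(L, Sigma, idio_var, w)
print(f"portfolio variance = {var_p:.6f}, volatility = {np.sqrt(var_p):.6f}")
```

Working with the M-dimensional exposure vector rather than the full K × K security covariance matrix is what makes the factor representation economical when K is much larger than M.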


Thus, estimating the factor model is equivalent to building an estimator for the factor covariance matrix. The only
information available for that purpose is the historical time series of factor returns, which effectively represent samples of the
random processes $f_i(t)$, and the problem at hand is that of building the best estimator of the covariance matrix from a finite
sample. This problem is best addressed from the point of view of random matrix theory.

Covariance matrix estimation using random matrix theory (RMT)

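Random matrix theory (a good review of the theory can be found in [2]) states that if $X$ is an $N \times M$ random matrix with i.i.d.
standard normal elements, then in the limit of large $N$ and $M$ the eigenvalues $\lambda$ of the correlation matrix $\hat{\Sigma} = X^T X$ (here
and below the product $X^T X$ is understood to be normalized by the number of samples $N$) are distributed with the following
probability density function:

$$\rho(\lambda) = \frac{c}{2\pi\lambda} \sqrt{(\lambda - \lambda_{\min})(\lambda_{\max} - \lambda)} \tag{5}$$

where $c = N/M$ is the ratio of the number of samples to the number of variables, and the boundaries for the eigenvalues
are

$$\lambda_{\max,\min} = \left(1 \pm \sqrt{\frac{1}{c}}\right)^2 \tag{6}$$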
All eigenvalues of the matrix are concentrated in the region between $\lambda_{\min}$ and $\lambda_{\max}$, the support region of the eigenvalue
distribution. Since the matrix $X^T X$ is the sample estimator of the correlation matrix of independently
distributed standard normals, it converges to the identity matrix as the number of samples goes to infinity. That corresponds
to the case $c \to \infty$, when all eigenvalues are equal to one and $\hat{\Sigma}$ coincides with the true correlation matrix
of $M$ independent normals. When the number of samples is limited, $c$ becomes smaller, the support region
widens, and the matrix $\hat{\Sigma}$ develops random off-diagonal elements. Thus the width of the support region effectively identifies
the range of errors in the matrix estimator due to the limited sample size. Let us now look at an $M$-factor risk model that
has a true correlation matrix $\Sigma$ ($M \times M$) (we can always normalize the covariance matrix by the factor volatilities to obtain
the correlation matrix). The sample correlation matrix of the factor returns can then be written as (assuming that the factors
have zero mean):

$$\hat{\Sigma} = \Sigma^{1/2} X^T X \, \Sigma^{1/2} \tag{7}$$
It is clear from this equation that the sample correlation matrix $\hat{\Sigma}$ is equal to the actual matrix $\Sigma$ only if the product $X^T X$
equals the identity matrix. In the finite sample size case, the noise in the matrix $X^T X$ determines the errors in the full model
covariance matrix estimator $\hat{\Sigma}$. The distribution of eigenvalues of the sample correlation matrix is wider than in the
case of a pure random matrix, as there are a number of large eigenvalues lying outside of the support region of a random
matrix. A simple and intuitive assumption to make is that the components of the correlation matrix that are defined by the
small eigenvalues within the random matrix support region (i.e. that are orthogonal to the space spanned by the eigenvectors
of the large eigenvalues) are dominated by noise. In other words, only the eigenvalues of the sample matrix $\hat{\Sigma}$ that lie
outside of the support region $[\lambda_{\min}, \lambda_{\max}]$ contain information relevant to the actual correlation matrix.
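
The support region of Eqs. (5) and (6) can be checked numerically with a short sketch. It assumes NumPy, a purely random $N \times M$ matrix with $N = 1000$ and $M = 100$, and the $1/N$ normalization of $X^T X$; the function names `mp_bounds` and `mp_density` are illustrative only.

```python
import numpy as np

def mp_bounds(n_samples: int, n_variables: int) -> tuple[float, float]:
    """Edges of the RMT support region, Eq. (6), with c = N / M."""
    c = n_samples / n_variables
    root = np.sqrt(1.0 / c)
    return float((1.0 - root) ** 2), float((1.0 + root) ** 2)

def mp_density(lam: np.ndarray, n_samples: int, n_variables: int) -> np.ndarray:
    """Eigenvalue density of Eq. (5); zero outside [lambda_min, lambda_max]."""
    c = n_samples / n_variables
    lam_min, lam_max = mp_bounds(n_samples, n_variables)
    lam = np.asarray(lam, dtype=float)
    rho = np.zeros_like(lam)
    inside = (lam > lam_min) & (lam < lam_max)
    rho[inside] = c / (2.0 * np.pi * lam[inside]) * np.sqrt(
        (lam[inside] - lam_min) * (lam_max - lam[inside])
    )
    return rho

rng = np.random.default_rng(0)
n_samples, n_variables = 1000, 100           # N and M, so c = 10
x = rng.standard_normal((n_samples, n_variables))
sample_corr = x.T @ x / n_samples            # assumed 1/N normalization
eigenvalues = np.linalg.eigvalsh(sample_corr)

lam_min, lam_max = mp_bounds(n_samples, n_variables)
print(f"support region:     [{lam_min:.3f}, {lam_max:.3f}]")
print(f"sample eigenvalues: [{eigenvalues.min():.3f}, {eigenvalues.max():.3f}]")
```

For $c = 10$, Eq. (6) gives $\lambda_{\min} \approx 0.468$ and $\lambda_{\max} \approx 1.732$. Because Eq. (5) is an asymptotic result, eigenvalues of a finite sample can fall slightly outside these edges.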
The underlying assumption we use to construct the optimal estimator is that the eigenvectors of the optimal
estimator are the same as the eigenvectors of the sample estimator $\hat{\Sigma}$ (for a detailed discussion see [3]). This assumption
allows us to construct the optimal matrix estimator by performing a spectral decomposition of the original estimator, adjusting
the eigenvalues, and reconstructing the optimal estimator using the same eigenvector matrix. This, in fact, is similar to the
method proposed by Rebonato and Jäckel in [1], with the only difference being the algorithm for eigenvalue adjustment. The
adjustment we use is similar in nature to principal component analysis, where only the principal components that carry
information are used and the ones that carry only noise are discarded. In the same spirit we construct the optimal estimator of
the correlation matrix using only the eigenvalues outside of the RMT support region, $\lambda_i \in U \equiv \{\lambda_i > \lambda_{\max}\}$:

$$\hat{\Sigma} = \sum_{\lambda_i \in U} \lambda_i q_i q_i^T + \mathrm{diag}\left(I - \sum_{\lambda_i \in U} \lambda_i q_i q_i^T\right) \tag{8}$$

where $q_i$ are the eigenvectors of the unadjusted correlation matrix. The second term in Eq. (8) replaces the diagonal
elements of the new estimator with ones. This ensures that the estimator can be used as a correlation matrix: it is
positive definite and its diagonal elements are all equal to 1.
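
A minimal sketch of Eq. (8) is given below. It assumes NumPy; the function name `rmt_regularize`, the one-factor toy data, and the use of `np.corrcoef` to build the sample correlation matrix are illustrative choices rather than the implementation used in the risk model.

```python
import numpy as np

def rmt_regularize(sample_corr: np.ndarray, n_samples: int) -> np.ndarray:
    """Keep only eigenvalues above lambda_max and reset the diagonal, Eq. (8)."""
    n_variables = sample_corr.shape[0]
    # Upper edge of the support region, Eq. (6); sqrt(M / N) equals sqrt(1 / c).
    lam_max = (1.0 + np.sqrt(n_variables / n_samples)) ** 2

    eigenvalues, eigenvectors = np.linalg.eigh(sample_corr)
    keep = eigenvalues > lam_max                 # the set U in Eq. (8)
    q = eigenvectors[:, keep]

    # First term of Eq. (8): sum of lambda_i q_i q_i^T over the kept eigenvalues.
    cleaned = (q * eigenvalues[keep]) @ q.T
    # Adding diag(I - first term) is the same as setting the diagonal to one.
    np.fill_diagonal(cleaned, 1.0)
    return cleaned

# Hypothetical toy data: 20 variables driven by one common factor, 200 samples.
rng = np.random.default_rng(1)
n_samples, n_variables = 200, 20
common = rng.standard_normal((n_samples, 1))
returns = 0.6 * common + 0.8 * rng.standard_normal((n_samples, n_variables))

sample_corr = np.corrcoef(returns, rowvar=False)
cleaned = rmt_regularize(sample_corr, n_samples)
print("smallest eigenvalue after regularization:", np.linalg.eigvalsh(cleaned).min())
```

If no eigenvalue exceeds $\lambda_{\max}$, the first term vanishes and the sketch returns the identity matrix, which corresponds to treating the whole sample matrix as noise.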

Simple example

In this section we discuss a simple example that illustrates the approach and its results, and consider the implications of the
proposed matrix regularization method (RMT method) for portfolio risk measures. To illustrate the effect of finite sampling
and subsequent regularization on a correlation matrix, let us look at a (100 × 100) matrix with all correlations equal to 0.1. We
use this matrix to construct the estimator $\hat{\Sigma}$ from a (100 × 1000) set of random samples $X$ (1000 samples for each of the 100
variables) drawn from the standard normal distribution, using Eq. (7). Because of the finite number of samples, the resulting
correlation coefficients vary between −0.1 and 0.27.

The top panel of Figure 1 shows the distribution of the elements of the matrix $\hat{\Sigma}$ (excluding diagonal elements) before
regularization is applied. While the distribution for the true correlation matrix would be a single vertical line at 0.1, the
estimation noise results in a distribution approximately centered around the true value but having a finite width (standard
deviation) of about 0.05. The bottom panel of the figure shows the same distribution after we applied the regularization
procedure described in the previous section. Clearly, the amount of noise in the correlation matrix (the width of the
distribution) is significantly reduced.

Figure 1: Distributions of correlation values in the sample matrix before (top) and after (bottom) regularization
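
The experiment described above can be sketched as follows. The sketch assumes NumPy, a fixed random seed, the $1/N$ normalization of $X^T X$ in Eq. (7), and a symmetric matrix square root for $\Sigma^{1/2}$; the helper names are hypothetical. Its printed statistics depend on the random draw and are not expected to reproduce the reported values exactly.

```python
import numpy as np

def symmetric_sqrt(matrix: np.ndarray) -> np.ndarray:
    """Symmetric square root of a positive definite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    return (eigenvectors * np.sqrt(eigenvalues)) @ eigenvectors.T

def clean_correlation(sample_corr: np.ndarray, n_samples: int) -> np.ndarray:
    """Eq. (8): keep eigenvalues above lambda_max, then set the diagonal to one."""
    n_variables = sample_corr.shape[0]
    lam_max = (1.0 + np.sqrt(n_variables / n_samples)) ** 2
    eigenvalues, eigenvectors = np.linalg.eigh(sample_corr)
    keep = eigenvalues > lam_max
    cleaned = (eigenvectors[:, keep] * eigenvalues[keep]) @ eigenvectors[:, keep].T
    np.fill_diagonal(cleaned, 1.0)
    return cleaned

def off_diagonal(matrix: np.ndarray) -> np.ndarray:
    """All elements of a square matrix except the diagonal."""
    return matrix[~np.eye(matrix.shape[0], dtype=bool)]

rng = np.random.default_rng(42)
n_variables, n_samples, rho = 100, 1000, 0.1

# True correlation matrix: ones on the diagonal, 0.1 everywhere else.
true_corr = np.full((n_variables, n_variables), rho)
np.fill_diagonal(true_corr, 1.0)

# Sample estimator following Eq. (7), with X^T X normalized by N (assumption).
x = rng.standard_normal((n_samples, n_variables))
root = symmetric_sqrt(true_corr)
sample_corr = root @ (x.T @ x / n_samples) @ root

cleaned_corr = clean_correlation(sample_corr, n_samples)

stats = {}
for label, matrix in (("before", sample_corr), ("after", cleaned_corr)):
    values = off_diagonal(matrix)
    stats[label] = (values.mean(), values.std())
    print(f"{label}: mean = {values.mean():.3f}, std = {values.std():.3f}")
```

In this setup the true matrix has a single large eigenvalue, $1 + 99 \times 0.1 = 10.9$, well above the support region, so the regularized matrix is typically built from one eigenvector only.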
The parameters of the distributions of the values of matrix elements before and after regularization are summarized in
Table 1. When the correlation matrix is reconstructed using the regularized eigenvalues, the scatter of the correlation values
around their true value of 0.1 is reduced by more than a factor of two. The method moves the sample correlations much closer
to their true value by eliminating random noise in the eigenvalues.

        Before   After
µ       0.11     0.091
σ       0.047    0.021

Table 1: Mean and standard deviation of all elements of a constant correlation matrix before and after regularization

Applying a new regularization method to the risk model covariance matrix will inevitably change the characteristics of the
resulting joint distribution of the factors and, ultimately, affect the estimated portfolio risk measures. To gauge the effect of the
new regularization method on actual portfolio VaR we ran the risk model on several test portfolios. Table 2 shows the
results of the tests for the Dow Jones Industrial Index (with the S&P 500 index as a benchmark) and for the Barclays Global
Aggregate Index run against USD cash. We show the estimates of 95% daily VaR for three cases: the factor covariance matrix
estimated without any regularization, the covariance matrix regularized using simple shrinkage (the approach discussed in the
FactSet MAC white paper [4]), and the covariance matrix regularized using the new method.

The results show that the random noise present in the non-regularized estimator of the covariance matrix in general
leads to slightly lower values of VaR. Intuitively, this can be understood as a diversification effect caused by the
presence of uncorrelated noise in the model factors, just as the presence of a large number of assets with uncorrelated
returns leads to a reduction of portfolio risk through diversification. Application of the RMT regularization method
results in a partial reduction of the covariance matrix noise, thus reducing the diversification effect and increasing the
estimated VaR value.

                     DJI vs SP500   Barclays Agg
No Regularization    1.92           3.15
Shrinkage            1.91           3.20
RMT Method           2.13           3.42

Table 2: MAC 1 day 95% VaR estimated using covariance matrices with different regularization methods applied

References

[1] Rebonato R., Jäckel P. (1999). The most general methodology to create a valid correlation matrix for risk management
and option pricing purposes. Quantitative Research Center Papers, http://www.quarchome.org/correlationmatrix.pdf.
[2] Bai Z. (1999). Methodologies in spectral analysis of large dimensional random matrices, a review. Statistica Sinica, 9,
611-677.
[3] Ledoit O., Wolf M. (2012). Nonlinear shrinkage estimation of large-dimensional covariance matrices. The Annals of
Statistics, 40(2), 1024–1060.
[4] Greiner S. P., McCoy W., Imennov N. S., Carpentier C. (2014). FactSet Multi-Asset Class Risk Model. FactSet White
Paper.
