Variance Estimation
The sample mean $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$ is an estimate of the
mean $\mu$. The expectation value of the sample mean is the population mean,
$E(\bar{x}) = \mu$, and the variance of the sample mean is
$\operatorname{var}(\bar{x}) = \sigma^2/N$. Since the expectation value of the
sample mean is the population mean, the sample mean is said to be an unbiased
estimator of the population mean. And since the variance of the sample mean
approaches zero as the sample size increases (i.e., fluctuations of the sample
mean about the population mean decay to zero with increasing sample size), the
sample mean is said to be a consistent estimator of the population mean.
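As an informal numerical check of these two properties, the following sketch draws many samples of size N and compares the mean and variance of the resulting sample means with the population mean and with sigma^2/N. The normal distribution, the values mu = 2 and sigma = 3, the number of trials, and the function name `sample_mean_stats` are assumptions made for illustration only; none of them come from the text.

```python
import random
import statistics


def sample_mean_stats(n, trials=4000, mu=2.0, sigma=3.0, seed=0):
    """Return (mean, variance) of the sample mean over many samples of size n.

    Illustrative assumption: the x_i are independent normal variables with
    mean mu and standard deviation sigma.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    # pvariance divides by the number of trials; with thousands of trials the
    # choice of denominator is immaterial for this check.
    return statistics.fmean(means), statistics.pvariance(means)


for n in (4, 16, 64):
    avg, var = sample_mean_stats(n)
    print(f"N={n:3d}  mean of sample means={avg:.3f}  "
          f"variance of sample means={var:.3f}  sigma^2/N={9.0 / n:.3f}")
```

The mean of the sample means should stay close to mu for every N (unbiasedness), while their variance should shrink roughly in proportion to 1/N (consistency).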
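These properties of the sample mean are a consequence of the fact that if
$x_1, \dots, x_N$ are mutually uncorrelated random variables with variances
$\sigma_1^2, \dots, \sigma_N^2$, the variance of their sum
$z = x_1 + \cdots + x_N$ is
$$
\sigma_z^2 = \sigma_1^2 + \cdots + \sigma_N^2. \tag{1}
$$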
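For the case in which all variables share a common variance $\sigma^2$, the step from Eq. (1) to the variance of the sample mean can be written out as follows. This is a sketch that additionally uses the standard scaling rule $\operatorname{var}(a z) = a^2 \operatorname{var}(z)$ for a constant $a$, which is not stated explicitly in the text.

```latex
\begin{align*}
\operatorname{var}(\bar{x})
  &= \operatorname{var}\!\left( \frac{1}{N} \sum_{i=1}^{N} x_i \right)
   = \frac{1}{N^2} \operatorname{var}\!\left( \sum_{i=1}^{N} x_i \right) \\
  &= \frac{1}{N^2} \left( \sigma_1^2 + \cdots + \sigma_N^2 \right)
   = \frac{1}{N^2} \, N \sigma^2
   = \frac{\sigma^2}{N}.
\end{align*}
```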
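The sample variance $s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2$ is an
unbiased estimator of the population variance $\sigma^2$. To see this, we
calculate
$$
\begin{aligned}
E\left[(x_i - \mu)(\bar{x} - \mu)\right]
  &= \frac{1}{N} \sum_{j=1}^{N} E\left[(x_i - \mu)(x_j - \mu)\right] \\
  &= \frac{1}{N} E\left[(x_i - \mu)^2\right] \\
  &= \frac{\sigma^2}{N},
\end{aligned}
$$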
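where we have used the assumption that the $x_i$ are mutually uncorrelated. With
$\operatorname{var}(\bar{x}) = E\left[(\bar{x} - \mu)^2\right] = \sigma^2/N$, it
then follows that
$$
\begin{aligned}
E\left[(x_i - \bar{x})^2\right]
  &= E\left\{ \left[(x_i - \mu) - (\bar{x} - \mu)\right]^2 \right\} \\
  &= \sigma^2 + \frac{\sigma^2}{N} - \frac{2\sigma^2}{N} \\
  &= \frac{N-1}{N} \, \sigma^2.
\end{aligned}
$$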
Thus,
$$
E(s^2) = \frac{1}{N-1} \sum_{i=1}^{N} E\left[(x_i - \bar{x})^2\right] = \sigma^2.
$$
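The effect of the denominator can also be checked numerically. The sketch below averages the sum of squared deviations about the sample mean over many simulated samples, once divided by N - 1 and once divided by N. The normal distribution, sigma = 2, N = 5, the number of trials, and the function name `mean_variance_estimates` are illustrative assumptions, not part of the derivation.

```python
import random


def mean_variance_estimates(n, trials=20000, sigma=2.0, seed=1):
    """Average the variance estimates with denominators n - 1 and n.

    Illustrative assumption: the x_i are independent normal variables with
    mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)
    sum_unbiased = 0.0
    sum_biased = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        # Sum of squared deviations about the sample mean, not the true mean.
        ss = sum((x - xbar) ** 2 for x in xs)
        sum_unbiased += ss / (n - 1)
        sum_biased += ss / n
    return sum_unbiased / trials, sum_biased / trials


unbiased, biased = mean_variance_estimates(5)
print(f"denominator N-1: {unbiased:.3f}")
print(f"denominator N:   {biased:.3f}")
print(f"sigma^2 = {2.0 ** 2:.3f}, (N-1)/N * sigma^2 = {4 / 5 * 2.0 ** 2:.3f}")
```

With these settings, the average using N - 1 should land near sigma^2 = 4, while the average using N should land near (N - 1)/N times sigma^2 = 3.2, in line with the expectation derived above.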
The denominator $N - 1$ in the sample variance is necessary to ensure
unbiasedness of the variance estimator. The denominator $N$ would only be
correct if fluctuations about the population mean $\mu$, and not about the
sample mean $\bar{x}$, appeared in the expression for the sample variance. With
the denominator $N - 1$, one obtains an undefined sample variance (0/0) for a
sample of size $N = 1$, as expected. With the denominator $N$, the sample
variance for $N = 1$ would vanish, yielding an obviously incorrect estimate of
the population variance. The denominator $N - 1$ appears because, after
estimation of the sample mean, only $N - 1$ degrees of freedom are available for
the estimation of the variance, since the variables $x_1, \dots, x_N$ and the
sample mean satisfy the constraint
$$
\sum_{i=1}^{N} (x_i - \bar{x}) = 0.
$$
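The constraint and the N = 1 case can be illustrated with Python's standard `statistics` module, in which `variance` uses the denominator N - 1 and `pvariance` uses the denominator N. The data values below are made up for illustration.

```python
import statistics

xs = [2.0, 3.5, 5.0, 9.5]  # hypothetical data, chosen only for illustration
xbar = statistics.fmean(xs)
deviations = [x - xbar for x in xs]

# The deviations about the sample mean sum to zero (up to rounding), so any
# one of them is fixed by the other N - 1.
print("sum of deviations:", sum(deviations))
print("last deviation:", deviations[-1],
      "minus sum of the others:", -sum(deviations[:-1]))

# For a single observation, the N - 1 denominator leaves the sample variance
# undefined; statistics.variance signals this by raising StatisticsError.
try:
    statistics.variance([4.2])
except statistics.StatisticsError as err:
    print("N = 1 with denominator N - 1:", err)

# The denominator N instead returns zero for a single observation.
print("N = 1 with denominator N:", statistics.pvariance([4.2]))
```

Because the last deviation is determined by the others, only N - 1 of the deviations carry independent information about the spread.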