
Degrees of Freedom


Joseph G. Eisenhauer
Wright State University, Dayton, Ohio, USA
e-mail: [email protected]

KEYWORDS: Teaching; Degrees of freedom; Effective sample size.
Summary
This article reviews several attempts to define
degrees of freedom, and offers some simple
explanations of how they are derived and why they
are used in various contexts.

INTRODUCTION

'Degrees of freedom: An elusive concept that occurs throughout statistics.'
–B.S. Everett, The Cambridge Dictionary of Statistics, 2nd edition, 2002, p. 111.

Although most of the statistical tests encountered during a course on inferential statistics depend on degrees of freedom, many introductory textbooks present degrees of freedom in a strictly formulaic manner, often without a useful definition or insightful explanation. Indeed, some otherwise comprehensive volumes simply abandon any attempt at discussion; Black (1994, p. 306), for example, concedes 'the concept of degrees of freedom is difficult and beyond the scope of this text'. This shortcoming leaves students without much intuition regarding the issue, and reinforces their all-too-common perception of statistical methods as a 'black box' whose inner workings they are incapable of comprehending. After a brief discussion of definitions, the present note addresses the calculation and use of degrees of freedom, to help develop and strengthen students' understanding of this fundamental notion.

DEFINITIONS

It is perhaps not surprising that many texts offer little in the way of explanation; the reference literature is almost equally opaque. Writing for a dictionary of mathematics, Daintith and Rennie (2005, p. 60) define the number of degrees of freedom as 'the number of independent parameters that are needed to specify the configuration of a system' and proceed to outline the concept in the context of physics rather than statistics. Schwartzman (1994, p. 96) states only that 'in mathematics the term degrees of freedom refers to the number of independent variables involved in a statistic'.

While such definitions emphasize the notion of independence, Mayhew (2004) relates degrees of freedom rather vaguely to sample size and significance, as follows:

'A number which in some way represents the size of the sample or samples used in a statistical test. In some cases, it is the sample size, in others it is a value which has to be calculated. Each test has its specific calculation, and the correct value for each test must be calculated before the result of the test can be checked for statistical significance.'

Upton and Cook (2002) refer to 'a parameter that appears in some probability distributions used in statistical inference, particularly the t distribution, the chi-squared distribution, and the F distribution' and note that 'the phrase "degrees of freedom" was introduced by Sir Ronald Fisher in 1922' without mentioning its purpose. This is then followed by several formulae for computing degrees of freedom, without any explanation of how the formulae are derived.

Indeed some of the definitions offered in the literature are inconsistent with one another. Clapham (1996, pp. 65–66) states that the number of degrees of freedom is 'a positive integer normally equivalent to the number of independent observations in a sample, minus the number of population parameters to be estimated from the sample'. In contrast, Kotz and Johnson (1982, pp. 293–294) point out that, technically, 'although the number of degrees of freedom is usually a positive integer, fractional numbers occur in some approximations, and one can, for example, have a non-central chi-squared distribution with zero degrees of freedom, obtained by taking this value for the degrees of freedom parameter'.
© 2008 The Author Teaching Statistics. Volume 30, Number 3, Autumn 2008 • 75
Journal compilation © 2008 Teaching Statistics Trust
A somewhat clearer definition is offered by Everett (2002). After describing degrees of freedom as 'an elusive concept', Everett (2002, p. 111) explains 'essentially the term means the number of independent units of information in a sample relevant to the estimation of a parameter or the calculation of a statistic. For example, in a 2 × 2 contingency table with a given set of marginal totals, only one of the four cell frequencies is free and the table has therefore a single degree of freedom'. Even more helpful is the explanation given by Glenn and Littler (1984, p. 46), which addresses both independence and sample size:

'In statistics it is the number of independent items of information given by the data; that is, the total number of items less the number of relevant summary statistics or restraints. Thus a set of independent results x1, x2, . . . xn has n degrees of freedom, but n – 1 if the mean x̄ is known, since any one of the xi is now dependent on the sum of the others. Note that a sample of size n retains n degrees of freedom if the population mean μ is known, since this does not determine xi for i = 1 . . . n if the other (n – 1) values are known. The concept is of importance in statistical inference since it defines the effective size of a sample.'

Thus in the most general terms and on the most elementary level, we may think of degrees of freedom as the number of pieces of information that can be freely varied without violating any given restrictions. And once the distinction between the nominal sample size and the effective sample size is recognized, it becomes clear why some types of averages are based on degrees of freedom rather than on the number of observations per se. The following sections offer a few simple examples in various contexts.

DEGREES OF FREEDOM IN EVERYDAY LIFE

Illustrations taken from everyday life can often provide students with a valuable intuitive grasp of an issue before formal statistical applications are studied. Consider, for example, having to complete three different hour-long tasks (read, eat and nap) between the hours of 1 p.m. and 4 p.m. This scheduling problem has two degrees of freedom: any two tasks can be scheduled at will, but once two of them have been placed in time slots, the time slot for the third is determined by default. Moreover if an additional constraint is imposed (e.g. the nap must be completed first), then only one other task can be scheduled freely; the extra constraint has removed a degree of freedom. Employing such examples with a little creativity, students will begin to observe degrees of freedom virtually everywhere in the world around them.

DEGREES OF FREEDOM FOR SAMPLE VARIANCE

Many textbooks introduce the formula for a sample variance without specifically designating the divisor (n – 1, where n is the sample size) as the degrees of freedom, and few provide an intuitive justification for dividing the sum of squared deviations by n – 1 rather than n. But the initial discussion of sample variance early in the course is perhaps the most convenient place to emphasize both the meaning and purpose of degrees of freedom, because the concept resurfaces so often. Consider, for example, a sample of five observations. If nothing else is known about the sample, there are no restrictions on the values taken by any of the observations; any five values will suffice to form a sample of n = 5. Indeed all five observations could be freely discarded and replaced by others drawn from the population. But if we wish to calculate the sample variance, it is first necessary to compute the sample mean, x̄ = (Σxi)/n. Suppose we find x̄ = 10. Then it is no longer true that all five observations are free to be replaced by random draws from the larger population. Since nx̄ = 50, the sum of all five observations must now also equal 50; thus four (or fewer) observations could be freely altered, but once any four of the observations are fixed, the final observation is determined by default. Consequently there are only four degrees of freedom (df) for use in calculating the sample variance; the effective sample size has been reduced to df = n – 1. This helps explain why the sample variance is calculated by averaging squared deviations from the mean over the degrees of freedom rather than over n. It also illustrates the general principle that a degree of freedom is lost for each parameter that must be estimated from sample data.

This is, undoubtedly, obvious to instructors of statistics, but it is assuredly not obvious to most students (just ask them!); thus despite its simplicity, this discussion is well worthwhile. And knowing why there are n – 1 degrees of freedom for calculating a sample variance, students can readily appreciate the fact that the same degrees of freedom carry over to the sample standard deviation, and hence to Student's t statistic. Moreover it becomes clear that as the effective sample size increases, the sample becomes increasingly
representative of the general population, so in the limit, t approaches the standard Normal variable Z. In addition, having n – 1 degrees of freedom for each sample variance explains why a t test for comparing two population means using a pooled variance has (n1 – 1) + (n2 – 1) = n1 + n2 – 2 degrees of freedom, and why the F ratio for comparing two population variances using samples of n1 and n2 observations has n1 – 1 numerator degrees of freedom and n2 – 1 denominator degrees of freedom. In this respect, emphasizing degrees of freedom early on pays benefits throughout the course.

DEGREES OF FREEDOM FOR ANOVA AND REGRESSION

A further application of the same principle occurs in analysis of variance (ANOVA): since the degrees of freedom represent the effective sample size, it makes perfect sense to divide the sums of squares by their respective df to get mean squares. In particular, for a total of n observations, the overall variance is simply the sum of squares total (SST) divided by the total degrees of freedom, n – 1. For k treatment categories, the sum of squares due to treatments is given by SSTR = Σ(i=1 to k) ni(x̄i − x̄)², where ni and x̄i denote the number of observations and the mean in the ith treatment category, respectively. Once SSTR has been calculated, the final term in the sum, nk(x̄k – x̄)², is determined by the value of SSTR and the preceding k – 1 terms; hence for the purpose of computing the mean square due to treatments there are only k – 1 degrees of freedom. And because the sum of squares due to error (SSE) is based on squared deviations within each of the k categories, the mean square due to error has (n1 – 1) + (n2 – 1) + . . . + (nk – 1) = n – k degrees of freedom. Of course, because they are additive, the degrees of freedom associated with treatments and those associated with error sum to the total: (k – 1) + (n – k) = n – 1.

Exactly the same degrees of freedom reappear in the context of a multiple linear regression if k is taken to represent the number of regression coefficients including the constant term. The rationale for replacing the coefficient of determination, R² = 1 – (SSE/SST), with its adjusted counterpart, adjusted R² = 1 – [(SSE/(n – k))/(SST/(n – 1))], is then a straightforward application of the argument used above – that is, it is proper to average SSE and SST over their effective sample sizes, n – k and n – 1, respectively.

DEGREES OF FREEDOM FOR TESTS OF INDEPENDENCE

Perhaps the most engaging illustration comes from contemplating the degrees of freedom for the contingency table used in a chi-squared test of independence. Consider, for example, a 2 × 3 matrix with known row and column totals, such as that given in table 1.

          C1         C2         C3         Total
R1        (R1, C1)   (R1, C2)   (R1, C3)   100
R2        (R2, C1)   (R2, C2)   (R2, C3)   80
Total     20         70         90         180

Table 1. A 2 × 3 contingency table

Clearly the table is impossible to complete without further information. Of course, if a single frequency is inserted, such as (R1, C1) = 16, one other cell becomes determined or redundant; in particular, it must be the case that (R2, C1) = 4. But once a second nonredundant frequency is given, the rest of the table is determined by default: if (R1, C2) = 44, then (R1, C3) = 40, (R2, C2) = 26, and (R2, C3) = 50. Thus for r = 2 rows and c = 3 columns, the table is uniquely determined by 2 nonredundant pieces of information; that is, there are (r – 1)(c – 1) = 2 degrees of freedom. Of course, this exercise can easily be replicated and even extrapolated to larger contingency tables, so that with sufficient practice students will convince themselves that the formula for degrees of freedom is intuitively reasonable.

Alternatively the general formula for the degrees of freedom in a contingency table can be derived as follows. Any table of r rows and c columns has rc total cells. The table is effectively completed after entries are made in all but one row and one column. Since each row has c cells, each column has r cells, and each row intersects each column once, the number of redundant cells (i.e. the number of cells in the final row and final column) is r + c – 1. Therefore the degrees of freedom can be calculated as df = rc – (r + c – 1) = (r – 1)(c – 1).

DEGREES OF FREEDOM FOR GOODNESS-OF-FIT

As a final application, consider a chi-squared goodness-of-fit test of the hypothesis that a sample of 100 observations was drawn from a population
having a (truncated) Poisson distribution. Initially, let us assume that the population mean is known to be μ = 1.4, and for simplicity let us assume that no observation ever exceeds 5 (for context, one can think of a physical or legal limitation which precludes x ≥ 6; suppose, for example, that a local ordinance bars tavern patrons from purchasing more than five alcoholic beverages per hour). The sample data can then be divided into six categories (k = 6), such that the random variable X takes the values 0, 1, 2, 3, 4 or 5. The frequencies with which the sample observations fall into these categories represent the information used in the chi-squared test, so there are nominally six such pieces of information. But since the six frequencies must sum to 100, only five of them can be freely varied, the remaining frequency being determined by default. Hence the chi-squared statistic in this case has df = k – 1 = 5.

Now suppose instead that μ is unknown, and must be estimated from the sample data. The estimation of the mean from the sample data will 'cost' another degree of freedom. To see why, suppose the sample mean is calculated to be x̄ = 1.40. Now not only must the frequencies sum to 100, that is,

f1 + f2 + f3 + f4 + f5 + f6 = 100,

but also the sum of all observations – or equivalently, the sum of the multiplicative products of the x values and their respective frequencies – must be 140; that is,

0f1 + 1f2 + 2f3 + 3f4 + 4f5 + 5f6 = 140.

Clearly these last two expressions represent simultaneous equations in the six frequencies. Thus we have two equations in six unknowns, leaving k – 2 = 4 degrees of freedom. In particular, we can manipulate these simultaneous equations algebraically to derive the observed frequencies for the final two categories as functions of the frequencies in the preceding four categories:

f5 = 360 − 5f1 − 4f2 − 3f3 − 2f4

and

f6 = 4f1 + 3f2 + 2f3 + 1f4 − 260,

so that only four frequencies are free to be varied. This extension of the simpler example reinforces the notion that a degree of freedom is lost for each parameter that must be estimated from the sample.

Although most textbooks state this rule in the goodness-of-fit context, many do so without providing a convincing rationale.

CONCLUSION

Degrees of freedom are nearly ubiquitous in statistical inference, yet they are often ill-defined. Remarkably, this is as true in the reference literature as it is in the pedagogical literature. As a consequence, students taking courses in statistics – especially those students with high levels of math anxiety or outright math phobia – tend to view degrees of freedom as yet another set of inexplicable formulas. This need not be the case, however; some intuitive discussions and elementary exercises can provide reassurance that the concept is not nearly as complex or forbidding as it may at first appear, and students can at least glimpse the relationships among the degrees of freedom in various statistical procedures.

References

Black, K. (1994). Business Statistics: Contemporary Decision Making. Minneapolis, MN: West Publishing Company.
Clapham, C. (1996). The Concise Oxford Dictionary of Mathematics (2nd edn). Oxford, UK: Oxford University Press.
Daintith, J. and Rennie, R. (2005). The Facts on File Dictionary of Mathematics (4th edn). New York: Market House Books.
Everett, B.S. (2002). The Cambridge Dictionary of Statistics (2nd edn). Cambridge, UK: Cambridge University Press.
Glenn, J.A. and Littler, G.H. (eds) (1984). A Dictionary of Mathematics. Totowa, NJ: Barnes and Noble Books.
Kotz, S. and Johnson, N.L. (1982). Encyclopedia of Statistical Sciences (vol. 2). New York: John Wiley and Sons.
Mayhew, S. (2004). A Dictionary of Geography (3rd edn). Oxford, UK: Oxford University Press, http://www.oxfordreference.com/.
Schwartzman, S. (1994). The Words of Mathematics: An Etymological Dictionary of Mathematical Terms Used in English. Washington, DC: Mathematical Association of America.
Upton, G. and Cook, I. (2002). A Dictionary of Statistics. Oxford, UK: Oxford University Press, http://www.oxfordreference.com/.
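As a brief computational postscript, the two worked examples above can be checked mechanically. The plain-Python sketch below (the particular free frequencies chosen for the goodness-of-fit check are my own illustrative values) completes table 1 from two freely chosen cells and verifies that the derived formulas for f5 and f6 satisfy both constraints.

```python
# Table 1: row totals (100, 80), column totals (20, 70, 90).
# Choosing the (r-1)(c-1) = 2 cells (R1,C1) = 16 and (R1,C2) = 44
# forces every remaining cell, as in the text.
row_totals, col_totals = [100, 80], [20, 70, 90]
r1 = [16, 44]
r1.append(row_totals[0] - sum(r1))            # (R1,C3) = 40
r2 = [c - x for c, x in zip(col_totals, r1)]  # [4, 26, 50]
assert r1 == [16, 44, 40] and r2 == [4, 26, 50]
assert sum(r2) == row_totals[1]               # row 2 total checks out

# Goodness-of-fit: with sum(f) = 100 and the weighted sum equal to 140,
# any four frequencies determine the other two via the formulas above.
f1, f2, f3, f4 = 25, 35, 24, 11               # illustrative free values
f5 = 360 - 5*f1 - 4*f2 - 3*f3 - 2*f4          # = 1
f6 = 4*f1 + 3*f2 + 2*f3 + f4 - 260            # = 4
assert f1 + f2 + f3 + f4 + f5 + f6 == 100
assert 0*f1 + 1*f2 + 2*f3 + 3*f4 + 4*f5 + 5*f6 == 140
```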
