Spectral Analysis
4 Spectrum
The spectrum of a time series is the distribution of variance of the series as a function of
frequency. The object of spectral analysis is to estimate and study the spectrum. The spectrum
contains no new information beyond that in the autocovariance function (acvf), and in fact the
spectrum can be computed mathematically by transformation of the acvf. But the spectrum and
acvf present the information on the variance of the time series from complementary viewpoints.
The acf summarizes information in the time domain and the spectrum in the frequency domain.
Why analyze the spectrum?
The spectrum is of interest because many natural phenomena have variability that is
frequency-dependent, and understanding the frequency dependence may yield information about
the underlying physical mechanisms. Spectral analysis can help in this objective. A classic
example from dendroclimatology is Val LaMarche's (1974) application of spectral analysis to
study differences in frequency properties of tree growth at the upper treeline and lower forest
border in eastern Nevada. LaMarche found that lower frequencies dominated the variance at the
upper treeline, and that higher frequencies were more important at the lower treeline (Figure 4.1).
From correlation analysis of frequency-stratified components of variability, he concluded that the
low-frequency variations reflected temperature fluctuations and the high-frequency fluctuations
precipitation variations. A couple of years earlier, LaMarche and Fritts (1972) had first applied
modern spectral analysis methods to study a possible solar-variability signal in tree rings. This
topic received much attention from dendroclimatologists in subsequent years. Ed Cook followed
up on the solar-tree ring connection by applying spectral analysis methods to relate tree-ring
variations at hundreds of sites in North America to an index of area covered by drought (Cook et
al. 1997) and concluded that the data support a bi-decadal rhythm in drought area possibly
driven by interacting solar and lunar influences near the double sunspot and lunar nodal periods.
Figure 4.1. Spectra of upper treeline (solid) and lower forest border
(dashed) chronologies (LaMarche 1974).
Notes_4, GEOS 585A, Spring 2011 2
4.1 The frequency domain
In the time domain, variations are studied as a function of time. For example, the time series
plot of an annual tree-ring index displays variations in tree-growth from year to year, and the acf
summarizes the persistence of a time series in terms of correlation between lagged values for
different numbers of years of lag. In the frequency domain, the variance of a time series is
studied as a function of frequency or wavelength. The main building blocks of variation in the
frequency domain are sinusoids, or sines and cosines. In discussing the frequency domain, it is
helpful to start with definitions pertaining to waves. For simplicity, we will use a time increment
of one year. Consider the simple example of an annual time series $y_t$ generated by superimposing
random normal noise on a cosine wave:
$$y_t = R\cos(\omega t + \phi) + z_t = w_t + z_t \qquad (1)$$
where $t$ is time (years, for this example), $z_t$ is the random normal component in year $t$, $w_t$ is the
sinusoidal component, and $R$, $\omega$ and $\phi$ are the amplitude, angular frequency (radians per year),
and phase of the sinusoidal component.
The plot in Figure 4.2 shows a time series generated by model (1) with the following settings:
-time series length of 201 years
-sinusoidal component with wavelength 100 years, amplitude 1.0, and phase 0 degrees
relative to time 0
-noise component from normal distribution with mean 0 and variance 0.01
The peaks are the high points in the wave; the troughs are the low points. The wave varies
around a mean of zero. The vertical distance from zero to the peak is called the amplitude. The
variance of a sinusoid is proportional to the square of the amplitude: $\mathrm{var}(w_t) = R^2/2$. The
phase $\phi$ describes the offset in time of the peaks or troughs from some fixed point in time. From
the relationship between variance and amplitude, the sinusoidal component in this example has a
variance of 50 times that of the noise (0.5 is 50 times 0.01).
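As a rough illustration, a series like the one described above can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the function names `simulate_series` and `variance`, the random seed, and the choice of letting time run from 0 to 200 are made up for illustration and are not part of the original notes.

```python
import math
import random


def simulate_series(n=201, wavelength=100.0, amplitude=1.0, phase=0.0,
                    noise_var=0.01, seed=0):
    """Return (y, w, z) for model (1): y_t = R*cos(omega*t + phi) + z_t."""
    rng = random.Random(seed)                # seed is an arbitrary choice
    omega = 2.0 * math.pi / wavelength       # angular frequency, radians/yr
    noise_sd = math.sqrt(noise_var)
    w = [amplitude * math.cos(omega * t + phase) for t in range(n)]
    z = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    y = [wt + zt for wt, zt in zip(w, z)]
    return y, w, z


def variance(x):
    """Population variance (divide by n)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


y, w, z = simulate_series()
print("variance of sinusoid:", round(variance(w), 3))   # close to R**2 / 2
print("variance of noise:   ", round(variance(z), 4))   # close to 0.01
```

The sample variance of the sinusoidal component differs slightly from 0.5 because 201 points do not cover an exact whole number of cycles.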
The angular frequency $\omega$ describes the number of radians of the wave in a unit of time, where
$2\pi$ radians corresponds to a complete cycle of the wave (peak to peak). In practical applications,
the frequency is often expressed by $f$, the number of cycles per time interval. The relationship
between the two frequency measures is given by
$$f = \omega / (2\pi) \qquad (2)$$
The wavelength, or period, of the cosine wave is the distance from peak to peak, and is the
inverse of the frequency
$$\lambda = 1/f \qquad (3)$$
Thus a frequency of one cycle per year corresponds to an angular frequency of $2\pi$ radians per
year and a wavelength of 1 year.
From the relationship (3), the frequency of the cosine wave in Figure 4.2 is
$$f = 1/100 = 0.01 \text{ cycles per year} \qquad (4)$$
Since 0.01 cycle is covered in one year, 100 years is required for a full cycle. The angular
frequency is
$$\omega = 2\pi f = 0.0628 \text{ radians per year} \qquad (5)$$
A frequency of one cycle every two years corresponds to an angular frequency of $\pi$ radians
per year, or a wavelength of 2 years. In the analysis of annual time series, this frequency of
$f = 0.5$ cycles/yr or $\omega = \pi$ radians/yr corresponds to what is called the Nyquist frequency,
which is the highest frequency for which information is given by the spectral analysis.
Another important frequency in spectral analysis is the fundamental frequency, also referred to
as the first harmonic. If the length of a time series is $N$ years, the fundamental frequency is $1/N$.
The corresponding fundamental period is $N$ years, or the length of the time series. For example,
the fundamental period of a time series of length 500 years is 500 years: a wave that undergoes a
complete cycle over the full length of the time series.
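The conversions among frequency, angular frequency, and wavelength can be written out as a small Python sketch. The helper names below (`angular_frequency`, `wavelength`, `nyquist_frequency`, `fundamental_frequency`) and the `dt` sampling-interval parameter are hypothetical, chosen only for illustration.

```python
import math


def angular_frequency(f):
    """Convert frequency f (cycles per unit time) to radians per unit time."""
    return 2.0 * math.pi * f


def wavelength(f):
    """Wavelength (period) is the inverse of frequency."""
    return 1.0 / f


def nyquist_frequency(dt=1.0):
    """Highest resolvable frequency for sampling interval dt."""
    return 1.0 / (2.0 * dt)


def fundamental_frequency(n, dt=1.0):
    """Lowest Fourier frequency for a series of n observations."""
    return 1.0 / (n * dt)


f = 1.0 / 100.0                       # cosine wave with 100-year wavelength
print(round(angular_frequency(f), 4))
print(nyquist_frequency(), wavelength(nyquist_frequency()))
print(fundamental_frequency(500))
```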
4.2 Sinusoidal model of a time series
The model given by equation (1) is extremely simple, consisting of just a single sinusoidal
component with superposed noise. Following Percival and Walden (1993), the spectrum can be
defined in terms of a more complicated model in which a time series $X_t$ of length $N$ consists of a
linear combination of many sinusoids at fixed frequencies $\{f_j\}$ and random amplitudes $\{A_j\}$ and
$\{B_j\}$:
[Figure 4.2 plot: $y(t)$ against $t$ (yr), for $t$ from 0 to 200. Annotations: amplitude $R = 1.0$; wavelength $\lambda = 100$ yr; frequency $f = 1/\lambda = 1/100$ yr$^{-1}$; angular frequency $\omega = 2\pi f$; phase $\phi = 0$; $z_t$ = random normal noise; $y_t = R\cos(\omega t + \phi) + z_t$.]
Figure 4.2. Simple example of periodic series with superimposed noise.
$$X_t = \mu + \sum_{j=1}^{[N/2]} \left[ A_j \cos(2\pi f_j t) + B_j \sin(2\pi f_j t) \right], \quad t = 1, 2, \ldots, N \qquad (6)$$
where $\mu$ is a constant term, the notation $[N/2]$ refers to the greatest integer less than or equal to
$N/2$, and the frequencies $f_j$ are related to the sample size $N$ by
$$f_j \equiv j/N, \quad 1 \le j \le [N/2] \qquad (7)$$
The frequencies of the sinusoids are at intervals of $1/N$ and are the Fourier frequencies, or
standard frequencies. The frequency $f_j$ is the $j$th standard frequency (Figure 4.3). The standard
frequencies clearly depend on the sample size. For example, for a 500-year tree-ring series, the
standard frequencies are at 1/500, 2/500, ... cycles per year. The highest standard frequency
is $f = (N/2)/N = 1/2 = 0.5$, which corresponds to a wavelength of two years.
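The standard frequencies in equation (7) are easy to list for any series length. The sketch below uses hypothetical helper names (`fourier_frequencies`, `fourier_wavelengths`), and applies them to the 500-year example and to the 108-year series length quoted in the caption of Figure 4.3.

```python
def fourier_frequencies(n):
    """Standard (Fourier) frequencies f_j = j/n for j = 1, ..., floor(n/2)."""
    return [j / n for j in range(1, n // 2 + 1)]


def fourier_wavelengths(n):
    """Wavelengths n/j corresponding to the standard frequencies."""
    return [n / j for j in range(1, n // 2 + 1)]


freqs = fourier_frequencies(500)
print(len(freqs), freqs[0], freqs[1], freqs[-1])

# Series length of the tree-ring example in Figure 4.3
waves = fourier_wavelengths(108)
print(waves[0], waves[1], waves[9])   # harmonics 1, 2 and 10
```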
In developing the definition of the spectrum with this model, the additional assumptions must
be made that the amplitudes $\{A_j\}$ and $\{B_j\}$ are random variables with expected values
$$E\{A_j\} = E\{B_j\} = 0 \qquad (8)$$
and
$$E\{A_j^2\} = E\{B_j^2\} = \sigma_j^2 \qquad (9)$$
From the relationship between variance and amplitude of a sinusoid, equation (9) implies that
the variance associated with the $j$th standard frequency is $\sigma_j^2$. It must also be assumed that the
amplitudes associated with various standard frequencies are uncorrelated:
$$E\{A_j A_k\} = E\{B_j B_k\} = 0 \quad \text{for } j \ne k \qquad (10)$$
and
$$E\{A_j B_k\} = 0 \quad \text{for all } j, k. \qquad (11)$$
With the assumptions above, it can then be shown that the expected value of $X_t$ is
$$E\{X_t\} = \mu \qquad (12)$$
the variance of $X_t$ is
$$\sigma^2 \equiv E\{(X_t - \mu)^2\} = \sum_{j=1}^{[N/2]} \sigma_j^2 \qquad (13)$$
and the autocorrelation function of $X_t$ is
$$\rho_k = \frac{\sum_{j=1}^{[N/2]} \sigma_j^2 \cos(2\pi f_j k)}{\sum_{j=1}^{[N/2]} \sigma_j^2} \qquad (14)$$
Equation (13) says that the variance of the series $X_t$ is the sum of the variances
associated with the sinusoidal components at the different standard frequencies. Thus the
variance of the series can be decomposed into components at the standard frequencies -- the
variance can be expressed as a function of frequency.
For the model given above, the spectrum can be defined as
$$S_j \equiv \sigma_j^2, \quad 1 \le j \le [N/2] \qquad (15)$$
A plot of $S_j$ against frequencies $f_j$ shows the variance contributed by the sinusoidal terms at
each of the standard frequencies. From equation (13), the variance of $X_t$ can then be expressed as
the sum of the spectral components
$$\sum_{j=1}^{[N/2]} S_j = \sigma^2 \qquad (16)$$
The variance contributed at frequency $f_j$ is the spectrum $S_j$ at that frequency. The shape of
the spectral values $S_j$ plotted against $f_j$ indicates which frequencies are most important to the
variability of the time series.
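The variance decomposition in equations (13) and (16) can be checked numerically with a small Monte Carlo sketch. Several things here are assumptions made only for illustration: the amplitudes are drawn from normal distributions (the model itself only requires zero means, variances $\sigma_j^2$, and zero correlations), the spectral values in `sigma2` are invented, and the names `simulate_xt` and `theoretical_acf` are hypothetical.

```python
import math
import random


def simulate_xt(n, sigma2, t, mu=0.0, rng=random):
    """One realization of X_t from model (6) at time t.

    sigma2[j-1] is the variance of A_j and B_j at frequency f_j = j/n.
    """
    x = mu
    for j, s2 in enumerate(sigma2, start=1):
        sd = math.sqrt(s2)
        a = rng.gauss(0.0, sd)   # random amplitude A_j (normality is an assumption)
        b = rng.gauss(0.0, sd)   # random amplitude B_j
        arg = 2.0 * math.pi * (j / n) * t
        x += a * math.cos(arg) + b * math.sin(arg)
    return x


def theoretical_acf(n, sigma2, k):
    """Equation (14): acf at lag k implied by the variances sigma2."""
    num = sum(s2 * math.cos(2.0 * math.pi * (j / n) * k)
              for j, s2 in enumerate(sigma2, start=1))
    return num / sum(sigma2)


n = 16
sigma2 = [4.0, 2.0, 1.0, 0.5, 0.25, 0.1, 0.1, 0.05]   # S_j, made-up values
rng = random.Random(1)

# Each draw uses fresh amplitudes, so the draws sample the ensemble at fixed t
draws = [simulate_xt(n, sigma2, t=3, rng=rng) for _ in range(5000)]
mean = sum(draws) / len(draws)
sample_var = sum((d - mean) ** 2 for d in draws) / len(draws)

print("sum of S_j:     ", sum(sigma2))
print("sample variance:", round(sample_var, 2))
print("acf at lag 1:   ", round(theoretical_acf(n, sigma2, 1), 3))
```

Because the invented spectrum is concentrated at the lowest frequencies, the implied acf at lag 1 is strongly positive.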
Figure 4.3. Illustration of fundamental and Fourier frequencies. (A) Tree-ring index
MEAF, with length 108 years. (B) Sinusoidal time series at fundamental frequency
(wavelength 108 yr) of the tree-ring series. (C) Sinusoidal time series at Fourier
frequencies 2/N and 10/N (wavelengths 54 yr and 10.8 yr), where N=108 years. Series
in B and C are scaled to same mean and variance as tree-ring series. As the tree-ring
series is not periodic, its peaks and troughs are irregularly spaced, unlike those of the
sinusoids. Theoretically, the variance of the tree-ring series in (A) could be
decomposed into contributions from all Fourier frequencies.
Considering that the $\sigma_j^2$ are spectral values, equation (14) gives an important relationship between
the spectrum and the autocorrelation function: the acf is expressed as a cosine transform of the
spectrum. Similarly the spectrum can be shown to be the Fourier transform of the acf. The
spectrum and acf are therefore different characterizations of the same time series information.
The acf is a time-domain characterization and the spectrum is a frequency-domain
characterization. From a practical point of view, the spectrum and acf are complementary to each
other. Which is most useful depends on the data and the objective of analysis.
4.3 Harmonic analysis = periodogram analysis = Fourier analysis
For a given time series, it is possible to apply the above model to mathematically estimate the
parameters of the sinusoidal terms at each Fourier frequency. Such an analysis is called Fourier
analysis, or harmonic analysis (Chatfield 2004; Panofsky and Brier 1958). Harmonic analysis is
most appropriate for phenomena with known periodic components. For example, a single time
series made up of 12 values of the long-term mean of monthly temperature will likely have a
well-defined annual (12-month) component at the fundamental frequency. Harmonic analysis can be
used to quantify the importance of this annual wave relative to other components of the
variability of the annual distribution of monthly means.
In harmonic analysis, the frequencies $j/N$, $j = 1, \ldots, N/2$ are referred to as the harmonics:
1/N is the first harmonic, 2/N the second harmonic, etc. Any series can be decomposed
mathematically into its N/2 harmonics. The sinusoidal components at all the harmonics
effectively describe all the variance in a series. A plot of the variance associated with each
harmonic as a function of frequency has been referred to above as the spectrum, for the
hypothesized model. Such a plot of variance (sometimes scaled in different ways) against
frequency is also called the periodogram of the series, and the analysis is called periodogram
analysis (Chatfield 2004). Spectral analysis, to be described next, departs from periodogram
analysis in an important way: in spectral analysis, the time series is regarded as just one possible
realization from a random process, and the objective is to estimate the spectrum of that process
using just the observed time series.
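A minimal sketch of harmonic analysis for the monthly-temperature example is given below. The temperature values are invented for illustration, the function name `harmonic_analysis` is hypothetical, and the amplitude formulas are the usual least-squares expressions for the Fourier frequencies rather than anything taken from these notes.

```python
import math


def harmonic_analysis(x):
    """Return a list of (j, A_j, B_j, variance_j) for harmonics j = 1..floor(n/2)."""
    n = len(x)
    mean = sum(x) / n
    out = []
    for j in range(1, n // 2 + 1):
        c = sum((xt - mean) * math.cos(2.0 * math.pi * j * t / n)
                for t, xt in enumerate(x))
        s = sum((xt - mean) * math.sin(2.0 * math.pi * j * t / n)
                for t, xt in enumerate(x))
        if n % 2 == 0 and j == n // 2:
            # Highest harmonic of an even-length series: cosine term only
            a, b = c / n, 0.0
            var_j = a ** 2
        else:
            a, b = 2.0 * c / n, 2.0 * s / n
            var_j = (a ** 2 + b ** 2) / 2.0
        out.append((j, a, b, var_j))
    return out


# Hypothetical long-term monthly mean temperatures (deg C), Jan..Dec
temps = [5, 7, 11, 15, 20, 25, 28, 27, 23, 17, 10, 6]
harmonics = harmonic_analysis(temps)

mean = sum(temps) / len(temps)
total_var = sum((v - mean) ** 2 for v in temps) / len(temps)
for j, a, b, var_j in harmonics:
    print(j, round(var_j / total_var, 4))   # fraction of variance per harmonic
```

The variances of the six harmonics add up to the variance of the 12 values, and for data like these the first harmonic (the annual wave) accounts for nearly all of it.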
4.4 Spectral analysis
In analyzing the spectrum, we want to acknowledge the uncertainty in trying to understand a
process from a single sample. Spectral analysis is therefore concerned with estimating the
unknown spectrum of the process from the data and with quantifying the relative importance of
different frequency bands to the variance of the process. The spectrum that is being estimated in
a sense then is not really the spectrum of the observed series, but the spectrum of the unknown
infinitely long series from which the observed series is assumed to have come.
Various methods have been developed to estimate the spectrum from an observed time series.
For an overview and comparisons of different methods, see Percival and Walden (1993),
Chatfield (2004), and Bloomfield (1976). In this chapter, we use the Blackman-Tukey method,
one of several available nonparametric methods (Percival and Walden 1993). (Later in the
semester we will also use another spectral estimation method -- the smoothed periodogram.)
The presentation on the Blackman-Tukey method closely follows Chatfield (2004). Note the
use of angular frequency in the equations.
Spectral distribution function
We previously defined the sample autocorrelation function $r_k$ and autocovariance function $c_k$. We will
refer to these functions also as the acf and the acvf. In the notation of Chatfield (2004), the
corresponding population statistics are the population acvf $\gamma(k)$ and population acf $\rho(k)$. An
important time series theorem, called the Wiener-Khintchine theorem, says that for any stationary
stochastic process with acvf $\gamma(k)$, there exists a monotonically increasing function, $F(\omega)$, such
that
$$\gamma(k) = \int_0^{\pi} \cos \omega k \; dF(\omega) \qquad (17)$$
Equation (17) is called the spectral representation of the autocovariance function, and $F(\omega)$ is
called the spectral distribution function. $F(\omega)$ has a direct physical interpretation:
$$F(\omega) = \text{contribution to the variance of the series which is accounted for by frequencies in the range } (0, \omega) \qquad (18)$$
Note the similarity of equation (17) to the relationship between autocorrelation function and
spectrum for the simple sinusoidal model discussed previously (14). Why is the range restricted
to angular frequencies $(0, \pi)$? First, there is no variation at negative frequencies, so the lower
limit on the range is 0. Second, if a process is measured at unit intervals, the highest possible
frequency that can be studied corresponds to a wave that undergoes a complete cycle in two
intervals, or an angular frequency of $\omega = \pi$. Thus, all the variation is accounted for by
frequencies less than $\pi$:
$$F(\pi) = \mathrm{var}(X_t) = \sigma_X^2 \qquad (19)$$
The function $F(\omega)$ increases monotonically between $\omega = 0$ and $\omega = \pi$, and in this way
$F(\omega)$ is similar to a cumulative distribution function, or cdf. In fact, by scaling $F(\omega)$ by the
variance, we get what is called the normalized spectral distribution function
$$F^*(\omega) = F(\omega) / \sigma_X^2 \qquad (20)$$
which gives the proportion of variance accounted for by frequencies in the range $(0, \omega)$, and like
a cdf, reaches a maximum of 1.0, since $F^*(\pi) = 1$.
Spectral density function, or spectrum
The spectral distribution function gives the variance of the process at frequencies less than
some frequency. By differentiating the spectral distribution function we get the spectral density
function, which gives the variance associated with each frequency
$$f(\omega) = \frac{dF(\omega)}{d\omega} \equiv \text{(power) spectral density function} \qquad (21)$$
The term spectral density function is often shortened to spectrum. The adjective power is
often omitted. Power comes from the application of spectral analysis in engineering, and is
related to the passage of an electric current through a resistance. For a sinusoidal input, the power
is directly proportional to the squared amplitude of the oscillation. We have seen that for a time
series the variance of a sinusoid is proportional to its squared amplitude. Thus power is
equivalent to variance.
Relationship between variance and spectrum of a time series
A plot of the spectrum $f(\omega)$ against frequency $\omega$ is essentially a plot of the variance of a time
series as a function of frequency. More precisely, a given value of $f(\omega)$ is a variance per unit
frequency, such that by integrating $f(\omega)$ over some increment of frequency we get the variance associated
with that range of frequencies. In other words, if $d\omega$ is some increment of frequency, $f(\omega)\,d\omega$ is
the contribution to the variance of components with frequencies in the range $(\omega, \omega + d\omega)$.
In a graph of the spectrum, therefore, the area under the curve bounded by two frequencies
represents the variance in that frequency range, and the total area underneath the curve represents
the variance of the series. A peak in the spectrum represents relatively high variance in a
frequency band centered on the peak. The Wolf sunspot series appears to have about 10 peaks
per century (Figure 4.4). The spectrum of this series peaks at a wavelength somewhat longer than
10 years, and indicates a large fraction of the variance is contributed by wavelengths between 10
and 12 years (Figure 4.5).
A flat spectrum indicates that variance is evenly distributed over frequencies. Theoretical
white noise is characterized by such even distribution of variance, and is represented by a
horizontal line in the spectral plot (see dashed line in Figure 4.5). A sampled time series referred
to as white noise series therefore has no tendency for amplified variance at any particular
frequency, and has no significant spectral peaks.
Figure 4.4. Time series of Wolf Sunspot Number. Source:
ftp://ftp.ngdc.noaa.gov/STP/SOLAR_DATA/SUNSPOT_NUMBERS/YEARLY.PLT
Relationship between acvf and spectrum
In equation (17) the acvf was expressed in terms of a cosine transform of increments of the
spectral distribution function. That relationship also means that the acvf can be expressed as a
cosine transform of the spectral density function, or spectrum. An inverse relationship can also
be shown, such that the spectrum is the Fourier transform of the acvf
$$f(\omega) = \frac{1}{\pi} \left[ \gamma(0) + 2 \sum_{k=1}^{\infty} \gamma(k) \cos \omega k \right] \qquad (22)$$
The Blackman-Tukey method of spectral estimation exploits this relationship to estimate the
spectrum by way of the sample acvf.
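Equation (22) can be evaluated numerically once an acvf is specified. The sketch below assumes, purely for illustration, an acvf of the form $\gamma(k) = \sigma^2 \phi^k$ (the form associated with a first-order autoregressive process), truncates the infinite sum at a lag where the terms are negligible, and checks that the area under the resulting spectrum over $(0, \pi)$ recovers the variance $\gamma(0)$. The function names are hypothetical.

```python
import math


def spectrum_from_acvf(acvf, omega):
    """Equation (22) with the sum truncated at the last lag supplied.

    acvf[k] is the autocovariance at lag k; omega is in radians per unit time.
    """
    total = acvf[0] + 2.0 * sum(g * math.cos(omega * k)
                                for k, g in enumerate(acvf) if k > 0)
    return total / math.pi


def integrate_0_pi(func, n=2000):
    """Midpoint-rule integral of func over (0, pi)."""
    h = math.pi / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


# Assumed example acvf: gamma(k) = var * phi**k
var, phi, max_lag = 2.0, 0.5, 200
acvf = [var * phi ** k for k in range(max_lag + 1)]

area = integrate_0_pi(lambda w: spectrum_from_acvf(acvf, w))
print("area under spectrum:", round(area, 6), " variance:", var)

# White noise: acvf is zero beyond lag 0, so the spectrum is flat at var / pi
flat = [spectrum_from_acvf([var], w) for w in (0.1, 1.0, 3.0)]
print(flat)
```

The white-noise case at the end reproduces the flat spectrum described earlier: with no autocovariance beyond lag 0, every frequency receives the same variance per unit frequency.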
The normalized spectrum
Because the area under the spectrum equals the variance, scaling the spectrum by dividing it
by the variance $\sigma_X^2$ yields a plot for which the area under the curve is 1. The normalized
spectrum is accordingly defined as
$$f^*(\omega) = f(\omega) / \sigma_X^2 \qquad (23)$$
Figure 4.5. Spectrum of Wolf Sunspot Cycle, 1700-2007.
Shading covers frequency range (1/12) yr$^{-1}$ to (1/10) yr$^{-1}$,
corresponding to wavelengths between 10 and 12 years. Ratio
of shaded area to total area under curve is fraction of series
variance contributed by wavelength-band 10-12 years.
Horizontal dashed line is at the mean of the spectrum; its
ordinate is the variance of the time series (here,
variance 0.1635E4). Area under dashed line equals area
under solid curve, and is proportional to the series variance.
The coefficient of proportionality follows from the fact that the
frequency axis extends from 0 to 0.5.
Recall that the acf is just the acvf divided by the variance. The normalized spectrum can
therefore be written as the Fourier transform of the acf
$$f^*(\omega) = \frac{1}{\pi} \left[ 1 + 2 \sum_{k=1}^{\infty} \rho(k) \cos \omega k \right] \qquad (24)$$
The area beneath a portion of the normalized spectrum, $f^*(\omega)\,d\omega$, is the proportion of
variance in the frequency range $(\omega, \omega + d\omega)$. Plots of the normalized spectrum are often
preferable to plots of the spectrum for comparing spectral properties of time series with greatly
differing variances, or different units of measurement. For example, a comparison of spectral
properties of two different segments of the Wolf Sunspot series is distorted by gross differences
in the total series variance for the sub-periods (Figure 4.6). In this case the normalized spectra
show that the segments are really quite similar in the fraction of variance near wavelength 11
years.
4.5 Estimating the spectrum from data
The discussion above suggests that an obvious estimator for the spectrum is the Fourier
transform of the complete sample acvf. Why not simply substitute the sample acvf for the
population acvf in equation (22) and have the summation run out to the maximum possible lag,
which is lag $k = N - 1$? In fact this approach is one way of estimating the periodogram, which as
we have seen is a spectrum with spectral estimates at the standard frequencies. The estimation of
the periodogram in this way is described by Chatfield (2004). A problem with estimating the
spectrum in this seemingly obvious way is that the estimator is not consistent, meaning that
variance of the estimate does not decrease as the sample size N increases. One reason is that the
method entails estimating N parameters from N observations, no matter how long the series is.
Another problem is that the acvf at high lags is uncertain, and the method does not discount the
higher lags. As a result, the spectrum (or periodogram) estimated in this way fluctuates wildly
from one standard frequency to another and is extremely difficult to interpret. The Blackman-
Tukey method circumvents this problem by applying the Fourier transform to a truncated,
smoothed acvf rather than to the entire acvf.
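The equivalence between the "obvious" estimator and the periodogram can be sketched numerically. In the example below, the sample acvf uses divisor $N$ at every lag, the periodogram is scaled by $1/(N\pi)$ so that it matches the scaling of equation (22), and the test series is simulated white noise. These choices, along with the function names, are assumptions made for illustration.

```python
import cmath
import math
import random


def sample_acvf(x):
    """Sample autocovariances c_0 .. c_{N-1} (divisor N at every lag)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n for k in range(n)]


def acvf_transform(c, omega):
    """Equation (22) with sample acvf c and the sum run out to lag N-1."""
    return (c[0] + 2.0 * sum(c[k] * math.cos(omega * k)
                             for k in range(1, len(c)))) / math.pi


def periodogram(x, omega):
    """Squared modulus of the Fourier transform of the centered series, over N*pi."""
    n = len(x)
    mean = sum(x) / n
    ft = sum((v - mean) * cmath.exp(-1j * omega * t) for t, v in enumerate(x))
    return abs(ft) ** 2 / (n * math.pi)


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(64)]     # white noise, flat true spectrum
c = sample_acvf(x)

# Evaluate at the Fourier frequencies omega_j = 2*pi*j/N
n = len(x)
values = [acvf_transform(c, 2.0 * math.pi * j / n) for j in range(1, n // 2)]
print("smallest and largest raw estimates:", min(values), max(values))
```

Even though the true spectrum of white noise is flat, the raw estimates typically range over more than an order of magnitude from one standard frequency to the next, which is the erratic behavior described above.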
Figure 4.6. Comparison of spectra and normalized spectra. (A) Spectra for two different
sub-periods of the Wolf Sunspot series. (B) Normalized spectra for the same two sub-periods.
Series variance for 1801-1900 is only about half that for 1901-2000. From the time series plot
(Figure 4.4), it can be seen that the variance for 1801-1900 is lower than for 1901-2000. The
greater area under the spectrum for 1901-2000 reflects this difference (A). The differences in
total variance have been removed in the normalized spectra (B), and the areas under the two
spectra are equal.
The lag-window estimate of the spectrum
The Blackman-Tukey estimation method consists of taking a Fourier transform of the
truncated sample acvf using a weighting procedure. Because the precision of the acvf estimates
$c_k$ decreases as lag $k$ increases, it seems reasonable to give less weight to the values of the acvf at
high lags. Such an estimator is given by
$$\hat{f}(\omega) = \frac{1}{\pi} \left\{ \lambda_0 c_0 + 2 \sum_{k=1}^{M} \lambda_k c_k \cos \omega k \right\} \qquad (25)$$
where $\{\lambda_k\}$ are a set of weights called the lag window, and $M$ $(< N)$ is called the truncation
point. From equation (25) we see that the acvf values at lags $M < k < N$ are no longer used, and that the
acvf estimates at lower lags are weighted by a weighting function $\lambda_k$. Various weighting
functions have been used to estimate the spectrum by equation (25); all have decreasing weight
toward higher lags, such that the higher-lag acvf values are discounted. One popular form of lag
window is the Tukey window.
Tukey window
The Tukey window, also called the Tukey-Hanning and Blackman-Tukey window, is given by
$$\lambda_k = 0.5 \left( 1 + \cos \frac{\pi k}{M} \right), \quad k = 0, 1, \ldots, M \qquad (26)$$
where $k$ is the lag, $M$ is the width of the lag window (also called the truncation point), and $\lambda_k$ is
the weight at lag $k$. The Tukey window for a window-width of 30 lags is shown in Figure 4.7.
Notice that the weight decreases in the form of a bell-shaped curve from a maximum weight of 1 at
lag 0 to a minimum weight of 0 at lag $M$. In estimating the spectrum by smoothing the acvf, you
must choose the truncation point $M$. This is generally done by trial and error, with a subjective
evaluation of which window best displays the important spectral features. The choice of $M$ affects the
bias, variance, and bandwidth of the spectral estimates.
[Figure 4.7 plot: weight (0 to 1) against lag (yr), 0 to 30, for the Tukey window with $M = 30$; curve $\lambda_k = 0.5[1 + \cos(\pi k / M)]$.]
Figure 4.7. Weights in Tukey Window of width 30.
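The lag-window estimator of equation (25) with the Tukey weights of equation (26) can be sketched in plain Python as follows. The function names, the divisor-$N$ form of the sample acvf, and the simulated test series are assumptions made for illustration; this is not the course's own implementation.

```python
import math
import random


def tukey_window(m):
    """Equation (26): weights lambda_k = 0.5 * (1 + cos(pi*k/m)), k = 0..m."""
    return [0.5 * (1.0 + math.cos(math.pi * k / m)) for k in range(m + 1)]


def sample_acvf(x, max_lag):
    """Sample autocovariances c_0 .. c_max_lag (divisor N at every lag)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def blackman_tukey(x, m, omega):
    """Equation (25): lag-window spectral estimate at angular frequency omega."""
    lam = tukey_window(m)
    c = sample_acvf(x, m)
    total = lam[0] * c[0] + 2.0 * sum(lam[k] * c[k] * math.cos(omega * k)
                                      for k in range(1, m + 1))
    return total / math.pi


# Hypothetical test series: 100-year cosine wave plus noise, as in model (1)
rng = random.Random(0)
x = [math.cos(2.0 * math.pi * t / 100.0) + rng.gauss(0.0, 0.1) for t in range(201)]

m = 30
for wavelength in (100.0, 10.0, 2.0):
    omega = 2.0 * math.pi / wavelength
    print(wavelength, round(blackman_tukey(x, m, omega), 4))
```

Because the weight at lag 0 is 1 and the cosine terms integrate to zero over $(0, \pi)$, the area under the estimated spectrum equals the sample variance $c_0$ regardless of the choice of $M$; the truncation point only changes how that variance is spread across frequencies.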
Smaller M → increased bias
Bias refers to the tendency of spectral estimates to be less extreme (both highs and lows) than
the true spectrum. Increased bias is manifested in a flattening out of the estimated spectrum,
such that peaks are not as high as they should be, and troughs not as low. This bias is
acknowledged, but not explicitly expressed as a function of M.
Smaller M → smaller variance of spectral estimates (narrower confidence bands)
Chatfield (2004) references Jenkins and Watts (1968, Section 6.4.2) for the general
relationship between the truncation point, $M$, of the Tukey window and the variance of the
spectral estimates. The relationship is that the quantity $\nu \hat{f}(\omega) / f(\omega)$ is approximately distributed
as $\chi^2_\nu$, or chi squared with $\nu$ degrees of freedom, where for the Tukey window the degrees of
freedom depends on the ratio of sample size to truncation point as follows
$$\nu = 2.67 N / M \qquad (27)$$
The relationship yields the following asymptotic $100(1 - \alpha)\%$ confidence interval for $f(\omega)$:
$$\frac{\nu \hat{f}(\omega)}{\chi^2_{\nu,\alpha/2}} \ \text{to} \ \frac{\nu \hat{f}(\omega)}{\chi^2_{\nu,1-\alpha/2}} \qquad (28)$$
Let's look at a hypothetical example to illustrate the effect of changing $M$ on the confidence
interval. Say the time series has length 500 years, and you try two values of truncation point:
$M_1 = N/5 = 100$ and $M_2 = N/10 = 50$. From equation (27), the corresponding values for
degrees of freedom are
$$\nu_1 = 5 \times 2.67 = 13.35 \qquad (29)$$
and
$$\nu_2 = 10 \times 2.67 = 26.7 \qquad (30)$$
For $\nu_1 \approx 13$, associated with a choice of truncation point $M = 100$, the chi square values for
the 95% confidence interval are
$$\chi^2_{13,0.025} = 5.01, \quad \chi^2_{13,0.975} = 24.73 \qquad (31)$$
and the confidence interval is
$$\frac{13 \hat{f}(\omega)}{5.01} \ \text{to} \ \frac{13 \hat{f}(\omega)}{24.73} \quad \text{or} \quad 2.59 \hat{f}(\omega) \ \text{to} \ 0.52 \hat{f}(\omega) \qquad (32)$$
For $\nu_2 \approx 27$, associated with a choice of truncation point $M = 50$, the chi square values for
the 95% confidence interval are
$$\chi^2_{27,0.025} = 14.57, \quad \chi^2_{27,0.975} = 43.19 \qquad (33)$$
and the confidence interval is
$$\frac{27 \hat{f}(\omega)}{14.57} \ \text{to} \ \frac{27 \hat{f}(\omega)}{43.19} \quad \text{or} \quad 1.85 \hat{f}(\omega) \ \text{to} \ 0.62 \hat{f}(\omega) \qquad (34)$$
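The arithmetic of this example can be reproduced with a short Python sketch. The chi-squared percentage points are copied from the worked example rather than computed, and the helper names `tukey_dof` and `ci_multipliers` are hypothetical.

```python
def tukey_dof(n, m):
    """Equation (27): approximate degrees of freedom for the Tukey window."""
    return 2.67 * n / m


def ci_multipliers(dof, chi2_low, chi2_high):
    """Factors multiplying the spectral estimate in equation (28).

    chi2_low and chi2_high are the lower and upper chi-squared percentage
    points for dof degrees of freedom, taken from a table.
    """
    return dof / chi2_high, dof / chi2_low   # (lower limit, upper limit)


# Chi-squared values for the 95% interval, copied from the worked example
table = {13: (5.01, 24.73), 27: (14.57, 43.19)}

n = 500
for m in (100, 50):
    dof = round(tukey_dof(n, m))
    lo, hi = ci_multipliers(dof, *table[dof])
    print(f"M={m}: dof~{dof}, interval {lo:.3f} to {hi:.3f} times the estimate, "
          f"ratio {hi / lo:.2f}")
```

The ratio of the upper to the lower limit is smaller for the shorter truncation point, which is the narrowing of the confidence band that comes with a smaller $M$.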
For a spectral estimate