Faculty of Engineering
Centre for Mathematical Sciences
Mathematical Statistics
Solutions to exercises in
Stationary stochastic processes
for scientists and engineers
Mathematical Statistics
Centre for Mathematical Sciences
Lund University
Box 118
SE-221 00 Lund, Sweden
http://www.maths.lu.se
© Georg Lindgren, Johan Sandberg,
Maria Sandsten, 2017
Contents
Preface
2 Stationary processes
3 The Poisson process and its relatives
4 Spectral representations
5 Gaussian processes
6 Linear filters – general theory
7 AR, MA, and ARMA-models
8 Linear filters – applications
9 Frequency analysis and spectral estimation
Preface
This booklet contains hints and solutions to exercises in Stationary stochastic processes for
scientists and engineers by Georg Lindgren, Holger Rootzén, and Maria Sandsten, Chapman
& Hall/CRC, 2013.
The solutions have been adapted from course material used at Lund University on first
courses in stationary processes for students in engineering programs as well as in mathematics,
statistics, and science programs.
The web page for the course during the fall semester 2013 gives an example of a schedule
for a seven week period:
http://www.maths.lu.se/matstat/kurser/fms045mas210/
Note that the chapter references in the material from the Lund University course do not
exactly agree with those in the printed volume.
Chapter 2
Stationary processes
2:3. The mean value function: m_X(t) := E[X(t)] = E[1.2e_t + 0.9e_{t−1}] = 1.2m + 0.9m = 2.1m.
The covariance function is computed in the same way and turns out to depend only on the lag.
A process X(t) is weakly stationary if the mean value function, m_X(t), does not depend
on t and the covariance function, r_X(t, s), only depends on |t − s|. Here the mean value
function does not depend on t and the covariance function only depends on |s − t| (check
this by proving that r_X(s + c, t + c) = r_X(s, t) for any c). Thus, the process is weakly
stationary.
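A quick simulation can illustrate the check above; this is only a sketch, assuming {e_t} is an i.i.d. sequence with mean m (here m = 2) and unit variance, which are hypothetical values not given in the exercise text.

import numpy as np

rng = np.random.default_rng(0)
m, n = 2.0, 200_000
e = rng.normal(loc=m, scale=1.0, size=n + 1)    # assumed i.i.d. e_t with mean m
x = 1.2 * e[1:] + 0.9 * e[:-1]                  # X(t) = 1.2 e_t + 0.9 e_{t-1}
print(x.mean())                                 # close to 2.1 m = 4.2
xc = x - x.mean()
for k in range(3):                              # sample covariances at lags 0, 1, 2
    print(k, np.mean(xc[:n - k] * xc[k:]))      # roughly 2.25, 1.08, 0 (times the e-variance)

The estimated mean is constant and the covariances depend only on the lag, in line with the stationarity argument.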
2:4. r_Y(1, 2) = 0.5σ² ≠ r_Y(2, 3) = 0.625σ² ⇒ non-stationary. With c = √(4/3) one gets a
stationary sequence.
2:5. Hint: Stockholm has a temperate climate with four distinct seasons.
The variance is V[X(t)] = σ_y² sin²(t) + σ_z². The expected value is constant but the variance
depends on t; thus the process is not weakly stationary. Alternatively,
C[X(t), X(s)] = C[Y, Y] sin(t) sin(s) + C[Z, Z] = σ_y² sin(t) sin(s) + σ_z²
= (σ_y²/2)(cos(t − s) − cos(t + s)) + σ_z²,
depending on both t − s and t + s.
2:8. Option 1 has a positive correlation and Option 2 has a negative correlation at τ = 1.
Both have zero correlation at τ = 2. The figure indicates positive correlation at τ = 1,
and therefore Option 1 seems to be the best alternative.
2:9. (a) This function has the necessary properties of a covariance function stated in Theo-
rem 2.2, but one should note that these conditions are not sufficient. That the function
actually is a covariance function is shown in Chapter 4, where we study the correspond-
ing spectral density.
(b) This is not a covariance function since it is asymmetric.
(c) This is not a covariance function since it is asymmetric.
(d) This is a covariance function, as shown in Theorem 2.3.
(e) This is not a covariance function since there are values of τ with r(τ) > r(0).
2:10. V[M̂(x, y)] = 2/5 + (8/25)·(1/2^√2) ≈ 0.52. Hint: A few different covariances are needed;
count how many covariances similar to these there are (in total 25, including the variances
V[N(x, y)] = σ²).
2:11. (a) No, E[X(t)] = 0 is constant but V[X(t)] = sin² t is not. Note the constant phase.
(b) Yes, the phase is randomly distributed, uniformly between 0 and 2π. Further, W(t) =
sin(t + Z) = cos(t + Z − π/2) is a process of the same form as the process U(t) = cos(t + Z),
which we know is weakly stationary (Theorem 2.3 with A = 1, f_0 = 1/(2π)). Both
W(t) and U(t) are cosine functions started at a random time point, uniformly distributed
over a full period of 2π.
2:12. (a) The standard deviation is √V[m̂_N], where
V[m̂_N] = (1/N²) Σ_{k=−N+1}^{N−1} (N − |k|) r_X(k)
= (1/15²)(15 r_X(0) + 2 · 14 r_X(1))
= (1/15²)(15 · 3 + 2 · 14 · (−1)) = 17/15² = 0.2749².
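The arithmetic can be checked in a couple of lines (a sketch using only the values quoted in the solution: N = 15, r_X(0) = 3, r_X(1) = −1 and r_X(k) = 0 for |k| ≥ 2).

import numpy as np

N = 15
r = {0: 3.0, 1: -1.0, -1: -1.0}                               # r_X(k), zero for |k| >= 2
var_mean = sum((N - abs(k)) * r.get(k, 0.0) for k in range(-N + 1, N)) / N**2
print(var_mean, np.sqrt(var_mean))                            # 17/225 ≈ 0.0756 and ≈ 0.2749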
2:13. (a) {Y_N(n)} and {Z_M(n)} are smoothed versions of the original signal. If the signal
of interest varies very slowly, we can expect it to be approximately constant in the
interval [n − N + 1, . . . , n], and in that case the averaging will not do any harm to the
deterministic signal, but the noise might be reduced.
(b) A large value of N (or M) means averaging over a large number of terms, which
will reduce the noise more. However, we will lose time resolution of the deterministic
signal.
(c) Let the covariance function of {X(n)} be denoted by r_X(τ) = r(τ), and solve the
resulting inequalities.
(d) While {Y (n)} is a sum of random variables with positive correlation, the construc-
tion of {Z(n)} takes advantage of the fact that rX (±2) = 0, which makes {Z(n)} a sum
of uncorrelated variables.
2:17. Assume the correlation function is ρ(τ) = θ^|τ|. Then, the variance of the mean value
estimator, based on N observations, is
V[m̂_N] ≈ (1/N) · (1 + θ)/(1 − θ)
according to the approximation (2.12). For uncorrelated variables the variance is
V[m̂_n] = 1/n. Setting the variances equal we get N/n = (1 + θ)/(1 − θ). To find θ,
let τ_K be the time lag at which the correlation has dropped off to 1/K, i.e. 1/K = θ^{τ_K},
leading to θ = K^{−1/τ_K}, and N/n = (1 + K^{−1/τ_K})/(1 − K^{−1/τ_K}).
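A small simulation illustrates the approximation; the Gaussian AR(1)-type sequence below is only an assumed example whose correlation function is θ^|τ|, with arbitrary θ and N.

import numpy as np

rng = np.random.default_rng(0)
theta, N, n_rep = 0.5, 200, 5_000
means = np.empty(n_rep)
for i in range(n_rep):
    x = np.empty(N)
    x[0] = rng.normal(scale=1.0)                              # stationary start, unit variance
    for t in range(1, N):
        x[t] = theta * x[t - 1] + rng.normal(scale=np.sqrt(1 - theta**2))
    means[i] = x.mean()
print(N * means.var(), (1 + theta) / (1 - theta))             # both close to 3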
2:18. V[m1] = 50, V[m2] = 1738/9 ≈ 193. The estimator m1 is preferred to m2.
2:20. By the law of large numbers, n⁻¹ Σ_{k=1}^{n} Z_k → E[X_1] + Y = Y ≠ 0; thus Z_n is not linearly
ergodic. Further, r_Z(t) = 2 for t = 0 and r_Z(t) = 1 for t ≠ 0; thus, the sufficient condition
that Σ_{t=0}^{∞} r_Z(t) is convergent is not satisfied.
Chapter 3
The Poisson process and its relatives
3:1. It is not weakly stationary. The variance of Y (t) = X(t) − λt is the same as that of X(t),
V[Y (t)] = V[X(t)] = E[X(t)] = λt,
which depends on t.
3:2. The calculations in Section 3.2.3 show the following interesting property of a Poisson
process: If one has observed that N events have occurred in an observation interval
(0, T], then the times at which these N events occurred are distributed over the interval as
N independent random variables, uniformly distributed over (0, T]. This means that the
number of events that occurred in an interval [a, b] with 0 < a < b < T has a binomial
distribution with parameters N and (b − a)/T . The arguments can be repeated for more
than one interval, leading to a multinomial distribution of events in non-overlapping
intervals, with probabilities equal to the relative lengths of the intervals.
We calculate the probability that of the four events in (0, 4] there were two in (0, 1], two
in (1, 2] and none in (2, 4]. The probability is
(4!/(2! 2! 0!)) · 0.25² · 0.25² · 0.5⁰ = (24/(2·2·1)) · 0.25⁴ ≈ 0.0234.
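The multinomial probability is easy to verify numerically (a sketch of the arithmetic only).

from math import factorial

p = factorial(4) / (factorial(2) * factorial(2) * factorial(0)) * 0.25**2 * 0.25**2 * 0.5**0
print(p)    # 0.0234375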
3:3. The number of accidents, n_T, over T = 544 × 10⁶ km is assumed to have a Poisson
distribution with expectation λT and variance λT. From the observed number, 67, an
estimate of λ is
λ̂ = 67/(544 × 10⁶) = 0.123 × 10⁻⁶.
According to Example 3.3 an estimate of the standard deviation of λ̂ is
D[λ̂] = √(λ̂/T) = 0.0150 × 10⁻⁶,
and an approximative 95% confidence interval is (with q_{α/2} = 1.96)
λ̂ ± 1.96 · D[λ̂] = (0.123 ± 0.029) × 10⁻⁶ = (0.094 × 10⁻⁶, 0.152 × 10⁻⁶).
A realistic answer would be between 10 and 15 fires per 100 million train kilometers.
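The estimate and the interval can be reproduced from the numbers in the text (a sketch).

import numpy as np

n_obs, T = 67, 544e6                       # observed fires and exposure in km
lam_hat = n_obs / T                        # ≈ 0.123e-6 per km
d_hat = np.sqrt(lam_hat / T)               # ≈ 0.015e-6 per km
ci = (lam_hat - 1.96 * d_hat, lam_hat + 1.96 * d_hat)
print(lam_hat, d_hat, ci)                  # interval ≈ (0.094e-6, 0.152e-6), i.e. 9.4-15.2 per 10^8 km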
3:4. The number of background counts Nb during a time span Tb is Poisson with mean and
variance equal to λb Tb . The number of background + sample counts Nb+s during a time
span Ts is Poisson with mean and variance (λb + λs )Ts .
The background and sample intensities can be estimated by
λ̂_b = N_b/T_b,   λ̂_s = N_{b+s}/T_s − N_b/T_b,
with variance
V[λ̂_s] = (λ_b + λ_s)/T_s + λ_b/T_b,
since the counts are independent. For a fixed T_b + T_s = T this is minimized for
T_b = T √λ_b / (√λ_b + √(λ_b + λ_s)),
T_s = T √(λ_b + λ_s) / (√λ_b + √(λ_b + λ_s)).
3:5. Obviously, the event that T_1 ≤ s is equivalent to X(s) ≥ 1. Thus,
P(T_1 ≤ s) = P(X(s) ≥ 1) = 1 − P(X(s) = 0) = 1 − e^{−γs²/2}.
Since the numbers of events in disjoint intervals are independent we obtain, as in the
previous exercise, an expression for the conditional probability.
3:7. (a) Accident risk is affected by many factors. Some of these factors are directly related
to time of the day, for example, the amount and type of illumination on the road. Other
factors are not directly related to time, but show a systematic variation with time, for
example traffic intensity and average speed.
(b) The average number of accidents between six in the morning and six in the evening
is
E[X(18) − X(6)] = ∫_6^18 λ(t) dt = 0.001 [2t + (24/(2π)) sin(2πt/24)]_6^18
= 0.001(24 − 24/π) = 0.01636.
3:8. If the number of raisins in a muffin is Poisson distributed with mean m, then the proba-
bility of no raisin is e^{−m}. Thus m has to be at least −log 0.05 ≈ 3. The Poisson assump-
tion requires that raisins do not influence each other, and that seems to be a questionable
property for raisins in muffins.
3:9. The germ-grain model is an example of a marked point process, where the random radius
of the attached disc is a "mark" on the point. If the disc sizes are statistically indepen-
dent of the locations and of each other, the model can be defined as an inhomogeneous
Poisson process in R³, generated as follows.
Assume the radius distribution is continuous with probability density function
p_R(r), r > 0. If there is a point in the center point process at location (x, y) ∈ R², then
we draw a random number r from the distribution p_R(r) and put a (marked) point at
(x, y, r) ∈ R³. The resulting point process in R³ is an inhomogeneous Poisson process with intensity
λ(x, y, r) = λ × p_R(r).
The number of marked points in a region A in R³ has a Poisson distribution with mean
m(A) = ∫∫∫_A λ(x, y, r) dx dy dr.    (3.1)
We now return to the original problem: find the probability that a given fixed point is not
covered by any disk; we calculate the probability that the point (0, 0), i.e., the origin, is
not covered.
A disk with center at (x, y) and radius r covers the origin if and only if x² + y² ≤ r². The
inequality r ≥ √(x² + y²) defines a region A in R³, and we let N(A) be the number of
marked points (x, y, r) that fulfil this criterion. We seek the probability P(N(A) ≥ 1) =
1 − P(N(A) = 0) that there is at least one marked point in A.
The marked point process is an inhomogeneous Poisson process and, by (3.1), the ex-
pected number of points in A is
m(A) = ∫_{r=0}^{∞} [ ∫∫_{x²+y²≤r²} λ(x, y, r) dx dy ] dr
= λ ∫_{r=0}^{∞} p_R(r) [ ∫∫_{x²+y²≤r²} 1 dx dy ] dr = λ ∫_{r=0}^{∞} p_R(r) πr² dr,
since the double integral within brackets is equal to the area (= πr²) of the disk with
radius r. With the uniform radius distribution, p_R(r) = 1/a for 0 ≤ r ≤ a, and p_R(r) = 0
otherwise, we get
m(A) = ∫_0^a λπr²/a dr = λπa²/3.
Hence, the probability that the origin is covered by at least one disk is P(N(A) ≥ 1) =
1 − P(N(A) = 0) = 1 − e^{−m(A)} = 1 − exp(−λπa²/3).
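A Monte Carlo check of the coverage probability is straightforward; the sketch below uses hypothetical values λ = 2 and a = 1, and restricts the simulation to centres in the square [−a, a]², since a disc centred farther away than a can never cover the origin.

import numpy as np

rng = np.random.default_rng(1)
lam, a, n_rep = 2.0, 1.0, 100_000
covered = 0
for _ in range(n_rep):
    n = rng.poisson(lam * (2 * a) ** 2)              # Poisson number of centres in the square
    xy = rng.uniform(-a, a, size=(n, 2))
    r = rng.uniform(0, a, size=n)                    # uniform radius distribution on (0, a)
    if np.any(np.hypot(xy[:, 0], xy[:, 1]) <= r):
        covered += 1
print(covered / n_rep, 1 - np.exp(-lam * np.pi * a**2 / 3))   # the two numbers should be close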
3:10. First prove that m(q) = cq for all rational q:
Chapter 4
Spectral representations
4:1. The spectral density is, by its definition, equal to the integral (or Fourier transform)
R_X(f) := ∫_{−∞}^{+∞} r_X(τ) e^{−i2πfτ} dτ.
Since this integral sometimes is cumbersome to compute we use the Fourier transfor-
mation table which you can find in the table of formulas. Using this table, we get
R_X(f) = α/(α² + (2πf_0 − 2πf)²) + α/(α² + (2πf_0 + 2πf)²).
A nice property of the Fourier transform, G(f) = ∫_{−∞}^{∞} g(t)e^{−i2πtf} dt, is that it equals the
inverse Fourier transform if the function being transformed is even symmetric, g(t) = g(−t):
G(f) = ∫_{−∞}^{∞} g(t)e^{−i2πtf} dt
= ∫_{−∞}^{∞} g(t)[cos(−2πtf) + i sin(−2πtf)] dt,
where cos(−2πtf) = cos(2πtf) is an even function and i sin(−2πtf) = −i sin(2πtf) is an odd function,
= ∫_{−∞}^{∞} g(t) cos(2πtf) dt − i ∫_{−∞}^{∞} g(t) sin(2πtf) dt
(convince yourself that the last integral is zero if g(t) = g(−t))
= ∫_{−∞}^{∞} g(t)e^{+i2πtf} dt.
This means that for even functions (such as the covariance function for a stationary
process), we can use the Fourier transformation table in either direction. It also implies
that the Fourier transform of an even function is real valued.
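As a numerical illustration of 4:1 (a sketch; the covariance r(τ) = e^{−α|τ|} cos(2πf_0 τ) and the values of α, f_0 and f below are assumptions chosen to match the table formula above), a Riemann-sum Fourier transform can be compared with the closed-form expression:

import numpy as np

alpha, f0 = 1.0, 0.5                                  # assumed parameter values
dt = 0.01
tau = np.arange(-200, 200, dt)                        # long enough for exp(-alpha|tau|) to vanish
r = np.exp(-alpha * np.abs(tau)) * np.cos(2 * np.pi * f0 * tau)

f = 0.3                                               # an arbitrary test frequency
R_num = np.sum(r * np.exp(-2j * np.pi * f * tau)) * dt        # Riemann sum of the Fourier integral
R_table = (alpha / (alpha**2 + (2 * np.pi * (f0 - f))**2)
           + alpha / (alpha**2 + (2 * np.pi * (f0 + f))**2))
print(R_num.real, R_num.imag, R_table)                # real parts agree, imaginary part ≈ 0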
4:2. (a) R_X(f) = √(π/α) exp(−(2πf)²/(4α)), (b) R_X(f) = πe^{−2π|f|}.
4:3. R(f) = α_0² δ(f) + (α_1²/4)(δ(f − f_0) + δ(f + f_0)).
4:4. (a) r(τ) = (A/2)e^{−|τ|}, (b) r(τ) = (A/4)e^{−2|τ|}, (c) r(τ) = (A/3)((1/2)e^{−|τ|} − (1/4)e^{−2|τ|}). Hint: Assume
the following partial fraction decomposition
R_3(f) = B/(1 + (2πf)²) + C/(4 + (2πf)²).
4:5. a – g, b – f, c – e, d – h .
4:6. (a)
4:7. Only A, E, I can be covariance functions; they have their largest value at τ = 0. G, H,
C are spectral densities as they are the only ones that are positive for all values. Conse-
quently D, B, F are realizations. B-A-G belong to the same process as a slowly varying
process has a covariance function that varies slowly and a spectral density concentrated
at low frequency. D-C-I belong to a faster varying process with higher frequency. F-E-H
belong to a 'noisy' process with many frequencies in the spectral density. The covari-
ance function also decays faster (compare with the spectral density for white noise,
which is constant and has a covariance function that is zero for all values except
τ = 0).
4:8. Hint: Investigate if there exist any minima for R(f) and investigate the endpoints f = 0
and f = ∞. As R(0) = 1 + B/4 ≥ 0, we need B ≥ −4. For f → ∞, the behavior of
R(f) = (1/f²)(1/(1/f² + 1) + B/(4/f² + 1)) can be studied. Then B ≥ −1. Conclusion: B ≥ −1.
4:9. (a) Covariance functions: A and F, as r(0) > r(τ) for τ > 0. Spectral densities: C and
E, as R(f) > 0. Consequently, realizations: B and D.
(b) B ↔ A ↔ E, as the frequencies of the covariance function and the process realization
are in concordance with the frequency values of the spectrum (continuous). D ↔ C ↔
F, where the process and the covariance function switch sign for every sample. This
is the highest possible frequency of a discrete time process, i.e., f = 0.5, where also the
spectrum has its maximum.
(c) h = 3 for B ↔ A ↔ E and h = 1 for D ↔ C ↔ F.
Figure 4.1: Spectral densities of the continuous process (I), the process sampled once every second
(II), and five times every second (III) (frequency axes in Hz).
4:10. For (1), sampling once every second, the spectrum of the sampled process is
cos²(f·π/2) + cos²((f − 1)π/2),  f ≥ 0,
cos²(f·π/2) + cos²((f + 1)π/2),  f ≤ 0,
which simplifies to
cos²(f·π/2) + sin²(f·π/2) = 1,  f ≥ 0,
sin²((f + 1)π/2) + cos²((f + 1)π/2) = 1,  f ≤ 0,
i.e., the spectrum equals 1 for all |f| ≤ 1/2;
see Figure 4.1.
For (2), sampling five times every second, the spectrum is R_X(f), |f| ≤ 2.5 Hz; see Figure 4.1.
4:11. (a) The sampling frequency should be chosen to be at least f_s = 1/d = 40 Hz to avoid alias-
ing.
(b) After sampling, the signal contains the frequency band 10 ≤ |f| ≤ 15 and the distur-
bance is located at the frequency 10 Hz.
Chapter 5
Gaussian processes
5:2. As usual, it is a good idea to start with the definition of the covariance function:
rY (s,t) = C[Y (s), Y (t)] = C[X(s) − 0.4X(s − 2), X(t) − 0.4X(t − 2)]
= rX (t − s) + 0.16rX (t − s) − 0.4rX (t − s − 2) − 0.4rX (t − s + 2).
Since this depends only on t − s (convince yourself that r_Y(s + c, t + c) = r_Y(s, t)
for any constant c) and since m_Y(t) is constant, {Y(t)} is a weakly stationary process.
It is a Gaussian process, since every process that is a linear combination of Gaussian
processes is a Gaussian process. A (real valued) Gaussian weakly stationary process is
also strictly stationary.
5:3. Φ(−1/√0.6) ≈ 0.0985. One can imagine the process as a sum of two independent pro-
cesses, r_X(τ) = r_A(τ) + r_B(τ), one process with the covariance function r_A(τ) = 1/(1 + τ²)
(this is a common process), and one with the covariance function r_B(τ) = 1 (this pro-
cess is a constant, but a random one!). To simulate a realization of this process, take the
outcome of a Gaussian random variable and then add it to a realization of the process
with covariance function r_A(τ).
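The simulation recipe in the last sentence can be sketched as follows (the time grid, seed, and jitter term are arbitrary implementation choices).

import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0, 100, 0.5)                                    # time grid
tau = t[:, None] - t[None, :]
cov_A = 1.0 / (1.0 + tau**2)                                  # covariance matrix from r_A
L = np.linalg.cholesky(cov_A + 1e-10 * np.eye(len(t)))        # small jitter for numerical stability
x = L @ rng.standard_normal(len(t))                           # realization with covariance r_A
x += rng.standard_normal()                                    # add one random constant level (r_B = 1)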
5:4. RC ≈ 1.92. Hint: First, you should understand that in order to compute P(Z(t)² < 1)
you need the distribution of the random variable Z(t)². Second, you recognize that this
is not a Gaussian random variable (show it!) and that the distribution would be tiresome
to find. Luckily though, we realize that the event that Z(t)² < 1 is the same event as
−1 < Z(t) < 1, and therefore P(Z(t)² < 1) = P(−1 < Z(t) < 1).
5:5. (a) It is found that E[Yn+1 ] = E[Yn ] + log a + E[en ] = E[Yn ] + log a. As Y0 = 0, we get
E[Yn ] = n log a.
Further,
C[Y_m, Y_n] = C[Y_m, Y_{n−1}] + C[Y_m, e_{n−1}] = C[Y_m, Y_{n−1}]
for m < n. Repeating,
C[Y_m, Y_n] = C[Y_m, Y_m] = V[Y_m]
if m < n. On the other hand, V[Y_m] = m(K log a)², since Y_m is a sum of m independent
increments each with variance (K log a)², giving
C[Y_m, Y_n] = min(m, n)(K log a)²,
which gives a non-stationary process.
(b) The probability is
P(X_{n+1}/X_n > 1 + 100p) = P(log X_{n+1} − log X_n > log(1 + 100p))
= P(Y_{n+1} − Y_n > log(1 + 100p)) = P(e_n > log(1 + 100p) − log a)
= 1 − Φ((log(1 + 100p) − log a − 0)/(K log a)) ≈ 1 − Φ(99/30) = 0.00048.
On average, 250 · 0.00048 = 0.12 occasions per year.
5:6. (a) The Wiener process is a Gaussian process, and hence, Y (t) is Gaussian distributed.
Z(t) is a linear combination of Gaussian random variables, and is thus Gaussian. It
remains to compute the expectation and variance. The expectation of Z(t) is
E[Z(t)] = E[(Y(t) − Y(t/2))/√t] = (E[Y(t)] − E[Y(t/2)])/√t = 0,
and the variance is
V[Z(t)] = C[(Y(t) − Y(t/2))/√t, (Y(t) − Y(t/2))/√t]
= (1/t)(t − t/2 − t/2 + t/2) = 1/2.
We conclude that Z(t) ∼ N(0, 1/2).
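A quick Monte Carlo check of this result (a sketch; t = 4 is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(0)
t, n = 4.0, 200_000
y_half = np.sqrt(t / 2) * rng.standard_normal(n)              # Y(t/2) ~ N(0, t/2)
y_full = y_half + np.sqrt(t / 2) * rng.standard_normal(n)     # Y(t) = Y(t/2) + independent increment
z = (y_full - y_half) / np.sqrt(t)
print(z.mean(), z.var())                                      # close to 0 and 0.5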
(b) The expectation (found in (a)) does not depend on the time, t. We need to compute
the covariance function:
r_Z(s, t) := C[Z(s), Z(t)] = C[(Y(s) − Y(s/2))/√s, (Y(t) − Y(t/2))/√t]
= (1/√(ts))(min(s, t) − min(s, t/2) − min(s/2, t) + min(s/2, t/2))
= (1/√(ts))(s − s − s/2 + s/2) = 0,                        s ≤ t/2,
= (1/√(ts))(s − t/2 − s/2 + s/2) = (1/√(ts))(s − t/2),      t/2 ≤ s ≤ t,
= (1/√(ts))(t − t/2 − s/2 + t/2) = (1/√(ts))(t − s/2),      t ≤ s ≤ 2t,
= 0,                                                        s > 2t.
Since this is not a function of the distance between s and t, we conclude that the process
is not weakly stationary.
At every time point t, X(t) is determined by many very small contributions (each
≤ 10⁻²) and they sum to approximately 100. Then X(t) is approximately Gaussian
distributed with
E[X(t + 1) − X(t)] = 0 and
P(X(t + 1) − X(t) > 2) = 1 − Φ(2/√(1 − e⁻¹)) = 0.006.
Chapter 6
Linear filters – general theory
6:1. With
Y(t) = ∫ h(u)X(t − u) du = ∫ (δ_0(u) − δ_1(u))X(t − u) du = X(t) − X(t − 1)
we get
V[Y(t) − Y(t − 1)] = r_X(0) + r_X(0) + 4r_X(0) + 2r_X(2) − 4r_X(1) − 4r_X(−1)
= 1 + 1 + 4 + 2 · (1/5) − 4 · (1/2) − 4 · (1/2) = 12/5.
6:2. The output is a Gaussian process as the filter is linear with Gaussian input. Further, the
output spectral density is, for |f| ≤ q,
R_Y(f) = |H(f)|² R_X(f) = (2/(1 + |f|))² (1 + |f|) = 4/(1 + |f|).
The variance is
r_Y(0) = 2 ∫_0^1 4/(1 + f) df = 8 [ln(1 + f)]_0^1 = 8 ln 2 ≈ 5.545,
giving
P(Y(t) ≤ 4) = Φ((4 − 2)/√(8 ln 2)) ≈ Φ(0.85) ≈ 0.802.
6:3. The covariance function of the output signal is r_Y(τ) = −(2/3)e^{−|τ|} + (4/3)e^{−2|τ|}.
6:5. (a) E[Y(t)²] = 2 ∫_{f_0−∆f/2}^{f_0+∆f/2} R_X(f) df.
(b) R_X(f_0) ≈ (1/(2∆f)) E[Y(t)²].
6:6. (a) H(f) = 1 for |f| ≤ 10 Hz,
and zero for all other values. This means that the frequencies f_1 = 4.5 Hz and f_2 = 7 Hz
go unchanged through the filter while f_3 = 11 Hz is completely stopped.
(b) By sampling with d = 0.1 the sampling frequency is fs = 1/d = 10 Hz and therefore
the highest possible frequency is fs /2 = 5 Hz. The frequency f1s = 4.5 · 0.1 = 0.45 after
sampling but f2 = 7 Hz is aliased to 10 − 7 = 3 Hz giving f2s = 3 · 0.1 = 0.3.
6:7. The spectral density of the derivative is (2π f )2 times the spectral density of the process:
The covariance function of the derivative is the negated second derivative of the co-
variance function. The covariance function is the inverse Fourier transformation of the
spectral density. The table of Fourier transforms gives us that F[e^{−α|τ|}] = 2α/(α² + (2πf)²).
Since the function is even symmetric, the Fourier transform of e^{−α|τ|} is equal to the
inverse Fourier transformation of e^{−α|τ|}. Thus,
r_X(τ) = F^{−1}[R_X(f)] = F^{−1}[πe^{−2π|f|}]
= F[πe^{−2π|f|}] = π F[e^{−2π|f|}]
= π · (2 · 2π)/((2π)² + (2πτ)²) = 1/(1 + τ²).
This gives the covariance function of the derivative,
r_{X'}(τ) = −d²r_X(τ)/dτ² = −(d²/dτ²)(1/(1 + τ²)) = (2 − 6τ²)/(1 + τ²)³.
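The differentiation can be double-checked symbolically (a sketch using sympy):

import sympy as sp

tau = sp.symbols('tau', real=True)
r_X = 1 / (1 + tau**2)
r_deriv = sp.simplify(-sp.diff(r_X, tau, 2))    # covariance function of the derivative process
print(r_deriv)                                  # equivalent to (2 - 6*tau**2)/(1 + tau**2)**3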
6:8. A function r(t) is a covariance function of a (real valued) stationary process if and only
if its Fourier transform, R(f), is (even symmetric), non-negative, and integrable with
∫_{−∞}^{∞} R(f) df < ∞.
We compute the Fourier transform of r_X(t) using the table of Fourier transforms:
R_X(f) = F[r_X(t)] = 4 · (2 · (1/4))/((1/4)² + (2πf)²) − 2 · (2 · (1/2))/((1/2)² + (2πf)²)
= 24/((1 + 16(2πf)²)(1 + 4(2πf)²)).
This is non-negative and integrable, so r_X(t) is a covariance function. One way to investigate
whether the process is differentiable would be to check that the covariance function is twice
differentiable. However, in this case it is more difficult to show that the covariance
function is twice differentiable, since the two terms 4e^{−|t|/4} and −2e^{−|t|/2} are not
differentiable at t = 0.
As mentioned, another possibility is to consider the integral
∫_{−∞}^{∞} (2πf)² R(f) df = ∫_{−∞}^{∞} 24(2πf)²/((1 + 16(2πf)²)(1 + 4(2πf)²)) df,
which is finite since the integrand decays like a constant times f⁻² for large |f|; hence the
process is differentiable.
6:9.
r_{Y,Z}(n, n + k) = C[Y(n), Z(n + k)] =
  σ²,   k = 0,
  −σ²,  k = 1,
  σ²,   k = 2,
  3σ²,  k = −1,
  0,    otherwise.
6:10. (a) The cross-covariance function is r_{X,Y}(τ) = ∫_0^∞ h(s) r_X(τ − s) ds = (h ∗ r_X)(τ).
Chapter 7
AR, MA, and ARMA-models
r(0) + a r(1) = σ²          10 + a · 5 = σ²
                      ⟺
r(1) + a r(0) = 0           5 + a · 10 = 0
with the solution r(0) = 9.6, r(1) = 6.4, and r(2) = 1.6. The spectral density is
R(f) = σ²/|Σ_k a_k e^{−i2πfk}|² = 4/|1 − e^{−i2πf} + 0.5e^{−i4πf}|²
= 4/(2.25 − 3 cos 2πf + cos 4πf).
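The quoted values r(0) = 9.6, r(1) = 6.4, r(2) = 1.6 can be reproduced numerically; the sketch below assumes the AR(2) model X_t − X_{t−1} + 0.5X_{t−2} = e_t with σ² = 4, which is the model the spectral density above corresponds to.

import numpy as np

a1, a2, sigma2 = -1.0, 0.5, 4.0               # assumed AR(2) coefficients and innovation variance
# Yule-Walker equations in the unknowns r(0), r(1), r(2):
#   r(0) + a1 r(1) + a2 r(2) = sigma2
#   a1 r(0) + (1 + a2) r(1)  = 0
#   a2 r(0) + a1 r(1) + r(2) = 0
A = np.array([[1.0, a1, a2],
              [a1, 1.0 + a2, 0.0],
              [a2, a1, 1.0]])
r = np.linalg.solve(A, np.array([sigma2, 0.0, 0.0]))
print(r)                                      # approximately [9.6, 6.4, 1.6]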
7:4. q = 2, b_1 = 0, b_2 = 1/2, R_Y(f) = 5/4 + cos 4πf, f ∈ [−1/2, 1/2].
7:5. (a) {Zt } is an MA(3)-process if we choose M such that {Zt } is a zero mean process.
This gives M = E[Xt ] = E[et + 2et−1 − et−3 ] = 2m. Since {Zt } is equivalent to the MA-
process
Zt = ct + 2ct−1 − ct−3 ,
where c_t is an innovation process with zero mean, we can compute the covariance func-
tion of {Z_t} by using the standard formula for the covariance of an MA-process:
r(τ) = σ² Σ_{j−k=τ} b_j b_k,  |τ| ≤ q,
r(τ) = 0,  |τ| > q,
and the spectral density
R_Z(f) = σ² |Σ_{k=0}^{q} b_k e^{−i2πfk}|².
(b) Since {Xt } and {Zt } only differ by an additive constant, they will share the same
covariance function and spectral density (if you don’t find it too obvious, prove it!). The
expectation function is E[Xt ] = 2m.
7:6. Every covariance function r(τ) has its maximum at zero: r(0) ≥ r(τ). Thus, Figures (a)
and (c) are realizations and (b) and (d) are estimated covariance functions. In each time
step, it is likely that a realization of {X(t)} changes sign, since it is a realization of an
AR(1)-process with positive parameter. This fits well with realization (a) and covariance
function (d). The MA-process is a smoothing average of a white noise process, so it
will change more slowly, which matches realization (c). Theoretically, the covariance
function of {Yt } will be zero for time lags greater or equal to three. This corresponds to
the estimated covariance function (b).
7:7. (a) A spectral density function is always positive, which means that only Figures A and
F are spectral densities. The covariance function always has the largest value at zero,
which holds for D and E. Then B and C have to be realizations.
(b) Figure α illustrates an AR-process of order four as there are four poles. Figure β
gives an MA-process of order three as there are three zeros. For the AR-process the
spectral density F has two peaks at frequencies 0.1 and 0.3, which correspond to the
angle frequencies of the poles ω = 2π f .
The spectral density A might come from the MA-process as it has low energy corre-
sponding to the zeros. The covariance function E has values that differ from zero for
large τ, so it has to be the AR-process. Covariance function D seems to be zero at τ = 4
and beyond, as expected for an MA-process of order three.
7:8. C belongs to I and 1 as a low frequency for the spectral density means a smaller angle
for the pole and a slower variation for the covariance function. B belongs to 2 and III
as a pole far from the unit circle gives a damped covariance function and a peak in the
spectral density which is not that sharp. A belongs to II and 3 where a higher frequency
in the spectral density means a larger angle for the pole and a faster variation of the
covariance function.
7:9. (1 − 1.559 + 0.81) · E[W_t] = 3.5 ⇒ E[W_t] = 3.5/0.251 = 13.94. W_t − 13.94 is an AR(2)-
process and V[W_t] = r(0) is given from the Yule-Walker equations:
r(0) − 1.559 r(1) + 0.81 r(2) = 4
r(1) − 1.559 r(0) + 0.81 r(1) = 0
r(2) − 1.559 r(1) + 0.81 r(0) = 0
Solution: r(0) = 45.062, r(1) = 38.813, r(2) = 24.009, and the standard deviation is
√V[W_t] = √r(0) ≈ 6.71 m.
where
X_{2m−1} = −a_1 X_{2m−2} + e_{2m−1} = −a_1 Y_{m−1} + e_{2m−1}
and u_m = e_{2m} − a_1 e_{2m−1}.
Here u_m has V[u_m] = (a_1² + 1)σ², and is uncorrelated with Y_{m−1}, Y_{m−2}, . . . Conclusion:
Y_m − a_1² Y_{m−1} = u_m is an AR(1)-process with parameter −a_1² and V[u_m] = (a_1² + 1)σ².
7:11.
C[X_{t+k}, e_t] = 0 for k < 0, and C[X_{t+k}, e_t] = 2 · 0.4^k for k ≥ 0.
7:12. (a) r̂_X(0) ≈ 0.64, r̂_X(1) ≈ −0.12.
(b) â ≈ 0.19, σ̂² ≈ 0.62.
The mean is estimated by
m̂ = (1/10) Σ_{t=1}^{10} X_t.
To compute the variance we need the covariance function of the process {X_t}, which is
the same as the covariance function of the process {X_t − m}. We compute the covariance
function using the Yule-Walker equations. This gives
r_X(0) = 16/15,  r_X(k) = (16/15)(−0.25)^k,  k > 0.
The variance of the estimator is
V[m̂] = V[(1/10) Σ_{t=1}^{10} X_t] = (1/100) C[Σ_{t=1}^{10} X_t, Σ_{s=1}^{10} X_s]
= (1/100)(10 r_X(0) + 2 · 9 r_X(1) + 2 · 8 r_X(2) + · · · + 2 r_X(9))
= (1/100)(1/15)(10 · 16 − 18 · 4 + · · · − 2 · (1/16384)) ≈ 0.06741.
The expectation of m̂ is m, and m̂ is Gaussian since it is a linear combination of
the Gaussian random variables e_t. Thus m̂ ∼ N(m, 0.06741). A point estimate of
m is (1/10) Σ_{t=1}^{10} x_t = 0.7347. A 95% confidence interval is given by 0.7347 ± 1.96 ·
√0.06741 = (0.2258, 1.2436).
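The variance and the interval can be reproduced directly from the covariance function above (a sketch).

import numpy as np

N = 10
r = (16 / 15) * (-0.25) ** np.arange(N)                       # r_X(0), ..., r_X(9)
var_mean = (N * r[0] + 2 * np.sum((N - np.arange(1, N)) * r[1:])) / N**2
ci = (0.7347 - 1.96 * np.sqrt(var_mean), 0.7347 + 1.96 * np.sqrt(var_mean))
print(var_mean, ci)                                           # ≈ 0.06741 and ≈ (0.226, 1.244)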
Since
e_t = (A(T⁻¹)/C(T⁻¹)) X_t,
we get
X_{t+1} = (C(T⁻¹)/A(T⁻¹)) e_{t+1} = e_{t+1} + ((C(T⁻¹) − A(T⁻¹))/(A(T⁻¹)T⁻¹)) T⁻¹ e_{t+1}
= e_{t+1} + (1/C(T⁻¹)) X_t = e_{t+1} + Σ_{k=0}^{∞} (−0.5)^k X_{t−k}.
Chapter 8
Linear filters – applications
H(f) = R_S(f)/(R_S(f) + R_N(f)),
where
R_S(f) = Σ_τ r_S(τ)e^{−i2πfτ} = 2 − 2 cos(2πf),
and
R_N(f) = Σ_τ r_N(τ)e^{−i2πfτ} = 2 + 2 cos(2πf).
We get
H(f) = 0.5 − 0.5 cos(2πf) = 0.5 − 0.25(e^{−i2πf} + e^{i2πf})
for −0.5 < f ≤ 0.5. The impulse response is h(0) = 0.5, h(±1) = −0.25 and zero for
all other u.
8:3. The spectral density R_X(f) of the temperature and the spectral density R_N(f) of the
noise are given by (use the Fourier transformation table!)
R_X(f) = 4/(4 + (2πf)²),   R_N(f) = 40/(20² + (2πf)²).
The optimal filter is
H(f) = R_X(f)/(R_X(f) + R_N(f)),
Further,
R_X²(f)/(R_X(f) + R_N(f)) = (4/(4 + (2πf)²))² / (4/(4 + (2πf)²) + 40/(400 + (2πf)²))
= 16(400 + (2πf)²)/((4 + (2πf)²)(1760 + 44(2πf)²))
= (4/11)(400 + (2πf)²)/((4 + (2πf)²)(40 + (2πf)²))
= (4/11)(11/(4 + (2πf)²) − 10/(40 + (2πf)²)),
8:4. (a)
H(f) = 1/(1 + 100/|f|),  100 ≤ |f| ≤ 1000,
H(f) = 0,  otherwise,
SNR_max ≈ 5.28.
(a)
h(u) = cAe^{−b(T−u)},  u ≤ T,
h(u) = 0,  otherwise,
SNR_max = A²/(2bN_0).
(b)
h_1(t) = cAe^{−b(T−t)},  0 ≤ t ≤ T,
h_1(t) = 0,  otherwise,
SNR_{h1} = A²(1 − e^{−2bT})/(2bN_0).
(c) T > ln(100)/(2b).
8:7.
h(t) = t/ε,  0 ≤ t ≤ ε,
h(t) = 1,  ε ≤ t ≤ T − ε,
h(t) = (T − t)/ε,  T − ε ≤ t ≤ T,
h(t) = 0,  otherwise.
8:8. (a)
(b)
h(k) = 1,  k = T − 3, T − 2, T − 1, T,
h(k) = 0,  otherwise.
This filter is causal if, for example, the decision time, T, is 3. The decision level is 2.
With symmetric error probabilities, we obtain
We get
f(a, b) = (7/2)a² + (7/2)b² + (3/2)ab − 3a + 2,
which is minimized by solving
∂f/∂a = 7a + (3/2)b − 3 = 0,
∂f/∂b = 7b + (3/2)a = 0,
giving b = −18/187 ≈ −0.0963 and a = 84/187 ≈ 0.449 and the impulse response
h(0) = 0.449, h(−1) = −0.0963 and zero for all other t.
8:10. First initialize the state variable using the unconditional stationary distribution of an
AR(1)-process, with variance
V_XX(0 | 0) = σ_e²/(1 − φ_1²),
and take
X̂_{0|0} = X_0.
The prediction step is
X̂_{t+1|t} = φ_1 X̂_{t|t},
V_XX(t + 1 | t) = φ_1² V_XX(t | t) + σ_e²,
Ŷ_{t+1|t} = X̂_{t+1|t}.
Update:
Chapter 9
Frequency analysis and spectral estimation
9:1.
X(f + 1) = Σ_{t=0}^{n−1} x(t)e^{−i2π(f+1)t} = Σ_{t=0}^{n−1} x(t)e^{−i2πft} e^{−i2πt} = Σ_{t=0}^{n−1} x(t)e^{−i2πft} = X(f),
since e^{−i2πt} = 1 for all integers t.
9:2.
9:3. From the periodogram, we can conclude that there seem to be two closely spaced
sinusoids around f = 0.3–0.35, and from the modified periodogram we see a weaker
sinusoid at f = 0.1. Our conclusion could then be that there are three sinusoids in the
signal, two closely spaced and one weaker. However, we cannot be sure whether there are
other signals that are even weaker, or several more closely spaced around f = 0.3. The
exercise illustrates the importance of using several estimation methods when dealing
with unknown data. The true signal was
for 0 ≤ n ≤ 63.
9:4. (a) For simplification we can restrict to f = 0, as the spectral density is constant, R_X(f) =
σ². For all other values of f, the result will then be the same. We get
E[R̂_w(0)] = ∫_{−1/2}^{1/2} R_X(u)|W(u)|² du = σ² ∫_{−1/2}^{1/2} |W(u)|² du = σ² Σ_{t=0}^{n−1} w²(t),
using Parseval's theorem. An unbiased spectral estimate is given if E[R̂_w(0)] = R_X(0) =
σ², i.e.,
∫_{−1/2}^{1/2} |W(u)|² du = Σ_{t=0}^{n−1} w²(t) = 1.
The usual periodogram, without any data window, is unbiased for white noise.
(b) All covariances except the actual variance are known to be zero for a white noise
process, and to estimate r̂_x(0) = σ̂² we rely on the fact that the integral of the spectral
density gives the variance,
r_X(0) = ∫_{−1/2}^{1/2} R_X(f) df.
A periodogram is the estimated spectral density for discrete frequency values, and if
the periodogram is applied without zero-padding, the spectrum estimates for f = k/n,
k = 0, . . . , n/2 − 1, are independent. We could note that the spectrum estimates for
k = n/2, . . . , n − 1 are a mirrored copy for real-valued signals. Averaging the n/2 in-
dependent spectrum estimates,
σ̂² = (2/n) Σ_{k=0}^{n/2−1} R̂_x(k/n),
is an unbiased estimate of the white noise variance. As the variance of the periodogram
is V[R̂_x(f)] ≈ R_X²(f) = σ⁴, the variance of the average will be
V[σ̂²] = V[(2/n) Σ_{k=0}^{n/2−1} R̂_x(k/n)] ≈ 2σ⁴/n.
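A small simulation illustrates the result (a sketch; n = 64 observations of Gaussian white noise and 20 000 repetitions are arbitrary choices).

import numpy as np

rng = np.random.default_rng(0)
n, sigma2, n_rep = 64, 1.0, 20_000
est = np.empty(n_rep)
for i in range(n_rep):
    x = rng.normal(scale=np.sqrt(sigma2), size=n)
    pgram = np.abs(np.fft.fft(x))**2 / n              # periodogram at f = k/n
    est[i] = 2 / n * pgram[:n // 2].sum()             # average of the first n/2 values
print(est.mean(), est.var(), 2 * sigma2**2 / n)       # mean ≈ 1, both variances ≈ 0.031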
9:5. The answer is not easy, since the Welch method gives considerable bias at the peak,
whereas the modified periodogram instead gives severe variance, which is seen as a large
variation between the different estimates.
2018
Mathematical Statistics
Centre for Mathematical Sciences
Lund University
Box 118, SE-221 00 Lund, Sweden
http://www.maths.lu.se/