
Optical Propagation, Detection, and Communication

Jeffrey H. Shapiro
Massachusetts Institute of Technology

© 1988, 2000
Chapter 4

Random Processes

In this chapter, we make the leap from N joint random variables—a random
vector—to an infinite collection of joint random variables—a random wave-
form. Random process1 theory is the branch of mathematics that deals with
such entities. This theory is useful for modeling real-world situations which
possess the following characteristics.
• The three attributes, listed in Chapter 3, for useful application of prob-
abilistic models are present.
• The experimental outcomes are waveforms.
The shot noise and thermal noise currents discussed in our photodetector phe-
nomenology are, of course, the principal candidates for random process model-
ing in this book. Random process theory is not an area with which the reader
is assumed to have significant prior familiarity. Yet, even though this field is
rich in new concepts, we shall hew to the straight and narrow, limiting our
development to the material that is fundamental to succeeding chapters—first
and second moment theory, and Gaussian random processes. We begin with
some basic definitions.

4.1 Basic Concepts


Consider a real-world experiment, suitable for probabilistic analysis, whose
outcomes are waveforms. Let P = {Ω, Pr(·)} be a probability-space model for
this experiment, and let { x(t, ω) : ω ∈ Ω } be an assignment of deterministic
waveforms—functions of t—to the sample points {ω}, as sketched in Fig. 4.1.
This probabilistic construct creates a random process, x(t, ·), on the probabil-
1
The term stochastic process is also used.

Figure 4.1: Assignment of waveforms to sample points in a probability space

ity space P, i.e., because of the uncertainty as to which ω will occur when
the experiment modeled by P is performed, there is uncertainty as to which
waveform will be produced.
We will soon abandon the full probability-space notation for random pro-
cesses, just as we quickly did in Chapter 3 for the corresponding case of random
variables. Before doing so, however, let us hammer home the preceding defi-
nition of a random process by examining some limiting cases of x(t, ω).

random process With t and ω both regarded as variables, i.e., −∞ < t < ∞
and ω ∈ Ω, then x(t, ω) refers to the random process.

sample function With t variable and ω = ω1 fixed, then x(t, ω1 ) is a deterministic
function of t—the sample function of the random process, x(t, ω), associated with
the sample point ω1 .

sample variable With t = t1 fixed and ω variable, then x(t1 , ω) is a deterministic
mapping from the sample space, Ω, to the real line, R1 . It is thus a random
variable—the sample variable of the random process, x(t, ω), associated with the
time2 instant t1 .

2 Strictly speaking, a random process is a collection of joint random variables indexed by
an index parameter. Throughout this chapter, we shall use t to denote the index parameter,
and call it time. Later, we will have occasion to deal with random processes with multidi-
mensional index parameters, e.g., a 2-D spatial vector in the entrance pupil of an optical
system.

sample value With t = t1 and ω = ω1 both fixed, then x(t1 , ω1 ) is a number.


This number has two interpretations: it is the time sample at t1 of the
sample function x(t, ω1 ); and it is also the sample value at ω1 of the
random variable x(t1 , ω).

For the most part, we shall no longer carry along the sample space notation.
We shall use x(t) to denote a generic random process, and x(t1 ) to refer to the
random variable obtained by sampling this process at t = t1 . However, when
we are sketching typical sample functions of our random-process examples, we
shall label such plots x(t, ω1 ) vs. t, etc., to emphasize that they represent
the deterministic waveforms associated with specific sample points in some
underlying Ω.
If one time sample of a random process, x(t1 ), is a random variable, then
two such time samples, x(t1 ) and x(t2 ), must be two joint random variables,
and N time samples, { x(tn ) : 1 ≤ n ≤ N }, must be N joint random variables,
i.e., a random vector

x ≡ [ x(t1 ), x(t2 ), . . . , x(tN ) ]^T .                                (4.1)
A complete statistical characterization of a random process x(t) is defined to
be the information sufficient to deduce the probability density for any random
vector, x, obtained via sampling, as in Eq. 4.1. This must be true for all
choices of the sampling times, { tn : 1 ≤ n ≤ N }, and for all dimensionalities,
1 ≤ N < ∞. It is not necessary that this characterization comprise an explicit
catalog of densities, {px (X)}, for all choices and dimensionalities of the sample-
time vector

t ≡ [ t1 , t2 , . . . , tN ]^T .                                          (4.2)
Instead, the characterization may be given implicitly, as the following two
examples demonstrate.

single-frequency wave Let θ be a random variable that is uniformly distributed
on the interval 0 ≤ θ ≤ 2π, and let P and f0 be positive constants. The
single-frequency wave, x(t), is then

x(t) ≡ √(2P) cos(2πf0 t + θ).                                             (4.3)

Gaussian random process A random process, x(t), is a Gaussian random
process if, for all t and N, the random vector, x, obtained by sampling
this process is Gaussian. The statistics of a Gaussian random process
are completely characterized3 by knowledge of its mean function

mx (t) ≡ E[x(t)], for −∞ < t < ∞, (4.4)

and its covariance function

Kxx (t, s) ≡ E[∆x(t)∆x(s)], for −∞ < t, s < ∞, (4.5)

where ∆x(t) ≡ x(t) − mx (t).

We have sketched a typical sample function of the single-frequency wave
in Fig. 4.2. It is a pure tone of amplitude √(2P), frequency f0 , and phase
θ(ω1 ). This certainly does not look like a random process—it is not noisy.
Yet, Eq. 4.3 does generate a random process, according to our definition. Let
P = {Ω, Pr(·)} be the probability space that underlies the random variable
θ. Then, Eq. 4.3 implies the deterministic sample-point-to-sample-function
mapping

x(t, ω) = √(2P) cos[2πf0 t + θ(ω)], for ω ∈ Ω,                            (4.6)
which, with the addition of the probability measure Pr(·), makes x(t) a random
process. Physically, there is only one random variable in this random process—
the phase of the wave.4 Thus, this random process is rather trivial, although
it may be used to model the output of an ideal oscillator whose amplitude and
frequency are known, but whose phase, with respect to an observer’s clock, is
completely random.
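
The mapping in Eq. 4.6 is easy to mimic numerically. The sketch below is an illustrative aside rather than part of the original text; the values of P, f0, and the time grid are arbitrary choices. It draws a few phase samples θ(ω) and evaluates the corresponding deterministic sample functions.

    import numpy as np

    rng = np.random.default_rng(0)

    P_ = 1.0          # power parameter P of Eq. 4.3 (arbitrary value)
    f0 = 1.0          # frequency in Hz (arbitrary value)
    t = np.linspace(-3.0, 3.0, 601)   # time grid

    # Each draw of theta is one sample point omega; the resulting waveform is
    # the deterministic sample function x(t, omega) of Eq. 4.6.
    thetas = rng.uniform(0.0, 2.0 * np.pi, size=3)
    sample_functions = [np.sqrt(2.0 * P_) * np.cos(2.0 * np.pi * f0 * t + th)
                        for th in thetas]

    for th, x in zip(thetas, sample_functions):
        print(f"theta = {th:5.2f} rad -> first few samples:", np.round(x[:4], 3))

Every run of the loop produces a noiseless sinusoid; the randomness lies entirely in which phase, and hence which sample function, is selected.
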
The Gaussian random process example is much more in keeping with our
intuition about noise. For example, in Fig. 4.3 we have sketched a typical
3
All time-sample vectors from a Gaussian random process are Gaussian. To find their
probability densities we need only supply their mean vectors and their covariance matrices.
These can be found from the mean function and covariance function—the continuous-time
analogs of the mean vector and covariance matrix—as will be seen below.
4
As a result, it is a straightforward—but tedious—task to go from the definition of the
single-frequency wave to an explicit collection of sample-vector densities. The calculations
for N = 1 and N = 2 will be performed in the home problems for this chapter.
Figure 4.2: Typical sample function of the single-frequency wave

sample function for the Gaussian random process, x(t), whose mean function
is
mx (t) = 0, for −∞ < t < ∞, (4.7)
and whose covariance function is

Kxx (t, s) = P exp(−λ|t − s|), for −∞ < t, s < ∞, (4.8)

where P and λ are positive constants.


Some justification for Fig. 4.3 can be provided from our Chapter 3 knowl-
edge of Gaussian random vectors. For the Gaussian random process whose
mean function and covariance function are given by Eqs. 4.7 and 4.8, the
probability density for a single time sample, x(t1 ), will be Gaussian, with
E[x(t1 )] = mx (t1 ) = 0, and var[x(t1 )] = Kxx (t1 , t1 ) = P . Thus, as seen in
Fig. 4.3, this time sample will typically fall within a few √P of 0, even though
there is some probability that values approaching ±∞ will occur.
To justify the dynamics of the Fig. 4.3 sample function, we need—at the
least—to consider the jointly Gaussian probability density for two time sam-
ples, viz. x(t1 ) and x(t2 ). Equivalently, we can suppose that x(t1 ) = X1
has occurred, and examine the conditional statistics for x(t2 ). We know that
this conditional density will be Gaussian, because x(t1 ) and x(t2 ) are jointly
Gaussian; the conditional mean and conditional variance are as follows [cf.
Figure 4.3: Typical sample function for a Gaussian random process with mean
function Eq. 4.7 and covariance function Eq. 4.8

Eqs. 3.87 and 3.88]:

E[ x(t2 ) | x(t1 ) = X1 ] = mx (t2 ) + [Kxx (t2 , t1 )/Kxx (t1 , t1 )][X1 − mx (t1 )]
                          = exp(−λ|t2 − t1 |)X1 ,                         (4.9)

and

var[ x(t2 ) | x(t1 ) = X1 ] = Kxx (t2 , t2 ) − Kxx (t2 , t1 )2 /Kxx (t1 , t1 )
                            = P [1 − exp(−2λ|t2 − t1 |)].                 (4.10)

Equations 4.9 and 4.10 support the waveform behavior shown in Fig. 4.3.
Recall that exponents must be dimensionless. Thus if t is time, in units of
seconds, then 1/λ must have these units too. For |t2 − t1 | ≪ 1/λ, we see that
the conditional mean of x(t2 ), given x(t1 ) = X1 has occurred, is very close to
X1 . Moreover, under this condition, the conditional variance of x(t2 ) is much
less than its a priori variance. Physically, this means that the process cannot
have changed much over the time interval from t1 to t2 . Conversely, when
|t2 − t1 | ≫ 1/λ prevails, we find that the conditional mean and the conditional
variance of x(t2 ), given x(t1 ) = X1 has occurred, are very nearly equal to
the unconditional values, i.e., x(t2 ) and x(t1 ) are approximately statistically
independent for such long time separations. In summary, 1/λ is a correlation
time for this process in that x(t) can’t change much on time scales appreciably
shorter than 1/λ.
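
For readers who want to experiment, here is a minimal numerical sketch of the two-sample statistics just described, assuming the zero-mean Gaussian process of Eqs. 4.7 and 4.8 with arbitrary values of P and λ. It draws many realizations of the pair (x(t1), x(t2)) and checks the conditional-mean coefficient and conditional variance of Eqs. 4.9 and 4.10 empirically.

    import numpy as np

    rng = np.random.default_rng(1)

    P_, lam = 1.0, 2.0            # P and lambda of Eq. 4.8 (arbitrary values)
    t1, t2 = 0.0, 0.1             # two sample times; |t2 - t1| << 1/lambda here
    dt = abs(t2 - t1)

    # Covariance matrix of the two samples, per Eq. 4.8.
    K = P_ * np.exp(-lam * np.abs(np.array([[0.0, dt], [dt, 0.0]])))

    # Draw many realizations of the pair (x(t1), x(t2)).
    x1, x2 = rng.multivariate_normal(mean=[0.0, 0.0], cov=K, size=200_000).T

    # Empirical regression of x(t2) on x(t1) recovers the conditional-mean
    # coefficient of Eq. 4.9; the residual variance recovers Eq. 4.10.
    slope = np.sum(x1 * x2) / np.sum(x1 * x1)
    resid_var = np.var(x2 - slope * x1)

    print("conditional-mean coefficient:", slope, "theory:", np.exp(-lam * dt))
    print("conditional variance        :", resid_var,
          "theory:", P_ * (1.0 - np.exp(-2.0 * lam * dt)))

Repeating the run with |t2 − t1| ≫ 1/λ shows the slope falling toward zero and the conditional variance approaching P, the behavior described above.
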
More will be said about Gaussian random processes in Section 4.4. We
will first turn our attention to an extended treatment of mean functions and
covariance functions for arbitrary random processes.

4.2 Second-Order Characterization


Complete statistical characterization of a random process is like knowing the
probability density of a single random variable—all meaningful probabilities
and expectations can be calculated, but this full information may not really
be needed. A substantial, but incomplete, description of a single random vari-
able is knowledge of its mean value and its variance. For a random vector,
the corresponding first and second moment partial characterization consists of
the mean vector and the covariance matrix. In this section, we shall develop
the second-order characterization of a random process, i.e., we shall consider
the information provided by the mean function and the covariance function of
a random process. As was the case for random variables, this information is
incomplete—there are a wide variety of random processes, with wildly different
sample functions, that share a common second-order characterization.5 Never-
theless, second-order characterizations are extraordinarily useful, because they
are relatively simple, and they suffice for linear-filtering/signal-to-noise-ratio
calculations.
Suppose x(t) is a random process—not necessarily Gaussian—with mean
function mx (t) and covariance function Kxx (t, s), as defined by Eqs. 4.4 and
4.5, respectively. By direct notational translation of material from Chapter 3
we have the following results.

mean function The mean function, mx (t), of a random process, x(t), is a


deterministic function of time whose value at an arbitrary specific time,
t = t1 , is the mean value of the random variable x(t1 ).

covariance function The covariance function, Kxx (t, s), of a random pro-
cess, x(t), is a deterministic function of two time variables; its value at
an arbitrary pair of times, t = t1 and s = s1 , is the covariance between
the random variables x(t1 ) and x(s1 ).
5
One trio of such processes will be developed in the home problems for this chapter.
Thus, mx (t) is the deterministic part of the random process, i.e., ∆x(t) ≡
x(t) − mx (t) is a zero-mean random process—the noise part of x(t)—which
satisfies x(t) = mx (t) + ∆x(t) by construction. We also know that var[x(t)] =
E[∆x(t)2 ] = Kxx (t, t) measures the mean-square noise strength in the random
process as a function of t. Finally, for t ≠ s, we have that

ρxx (t, s) ≡ Kxx (t, s) / √(Kxx (t, t) Kxx (s, s))                        (4.11)

is the correlation coefficient between samples of the random process taken at


times t and s. If ρxx (t1 , s1 ) = 0, then the time samples x(t1 ) and x(s1 ) are
uncorrelated, and perhaps statistically independent. If |ρxx (t1 , s1 )| = 1, then
these time samples are completely correlated, and x(t1 ) can be found, with
certainty, from x(s1 ) [cf. Eq. 3.69].
All of the above properties and interpretations ascribed to the second-
order—mx (t) and Kxx (t, s)—characterization of a random process rely on their
probabilistic origins. Random processes, however, combine both probabilistic
and waveform notions. Thus, our main thrust in this chapter will be to develop
some of the latter properties, in the particular context of linear filtering. Before
doing so, we must briefly address some mathematical constraints on covariance
functions.
Any real-valued deterministic function of a single parameter, f (·), can in
principle be the mean function of a random process. Indeed, if we start with the
Gaussian random process x(t) whose mean function and covariance function
are given by Eqs. 4.7 and 4.8, respectively, and define6

y(t) = f (t) + x(t), (4.12)

then it is a simple matter—using the linearity of expectation—to show that

my (t) = f (t), (4.13)

and
Kyy (t, s) = Kxx (t, s) = P exp(−λ|t − s|). (4.14)
We thus obtain a random process with the desired mean function.
An arbitrary real-valued deterministic function of two parameters, g(·, ·),
may not be a possible covariance function for a random process, because of
6
Equation 4.12 is a transformation of the original random process x(t) into a new random
process y(t); in sample-function terms it says that y(t, ω) = f (t)+x(t, ω), for ω ∈ Ω. Because
these random processes are defined on the same probability space, they are joint random
processes.
the implicit probabilistic constraints on covariance functions that are listed
below.

Kxx (t, s) = cov[x(t), x(s)] = Kxx (s, t), for all t, s,                  (4.15)
Kxx (t, t) = var[x(t)] ≥ 0, for all t,                                    (4.16)
|Kxx (t, s)| ≤ √(Kxx (t, t) Kxx (s, s)), for all t, s.                    (4.17)

Equations 4.15 and 4.16 are self-evident; Eq. 4.17 is a reprise of correlation
coefficients never exceeding one in magnitude.
The preceding covariance function constraints comprise necessary condi-
tions that a real-valued deterministic g(·, ·) must satisfy for it to be a possible
Kxx (t, s); they are not sufficient conditions. Let x(t) be a random process
with covariance function Kxx (t, s), let {t1 , t2 , . . . , tN } be an arbitrary collec-
tion of sampling times, and let {a1 , a2 , . . . , aN } be an arbitrary collection of
real constants, and define a random variable z according to
z ≡ Σ_{n=1}^{N} an x(tn ).                                                (4.18)

Because var(z) ≥ 0 must prevail, we have that7

var(z) = Σ_{n=1}^{N} Σ_{m=1}^{N} an am Kxx (tn , tm ) ≥ 0, for all {tn }, {an }, and N.   (4.19)

Equations 4.16 and 4.17 can be shown to be the N = 1 and N = 2 special


cases, respectively, of Eq. 4.19. Functions which obey Eq. 4.19 are said to be
non-negative definite. More importantly, any real-valued deterministic function
of t and s which is a symmetric function of its arguments and is non-negative
definite can be the covariance function of a random process—Eqs. 4.15 and
4.19 are the necessary and sufficient conditions for a valid covariance function.8
It can be difficult to check whether or not a given function of two variables
is non-negative definite. We shall see, in the next section, that there is an
important special case—that of wide-sense stationary processes—for which this
verification is relatively simple. Thus, we postpone presentation of some key
covariance function examples until we have studied the wide-sense stationary
case.
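
Although a full proof of non-negative definiteness can be difficult, a numerical spot-check is easy: sample a candidate g(·, ·) at an arbitrary set of times and look for negative eigenvalues in the resulting matrix, which is just Eq. 4.19 restricted to those times. The sketch below is illustrative only; the candidate functions and sample times are arbitrary choices. A negative eigenvalue rejects a candidate, but passing the test for one set of times does not prove validity.

    import numpy as np

    def min_eigenvalue(g, times):
        """Smallest eigenvalue of the matrix [g(t_n, t_m)] built from `times`."""
        T = np.asarray(times)
        M = g(T[:, None], T[None, :])        # symmetric matrix of samples
        return np.linalg.eigvalsh(M).min()

    rng = np.random.default_rng(2)
    times = np.sort(rng.uniform(0.0, 5.0, size=40))   # arbitrary sampling times

    # A valid covariance function (the exponential of Eq. 4.8 with P = lambda = 1) ...
    valid = lambda t, s: np.exp(-np.abs(t - s))
    # ... and a symmetric candidate that fails non-negative definiteness.
    invalid = lambda t, s: np.cos(10.0 * (t - s)) - 0.5

    print("exponential candidate, smallest eigenvalue:", min_eigenvalue(valid, times))
    print("cos - 0.5 candidate,   smallest eigenvalue:", min_eigenvalue(invalid, times))
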
7
The derivation of this formula parallels that of Eq. 3.107.
8
The preceding arguments can be cast backwards into the random vector arena—any
real-valued deterministic N vector can be the mean vector of a random N vector, any real-
valued deterministic N × N matrix that is symmetric and non-negative definite can be the
covariance matrix of a random N vector.
Figure 4.4: A simple communication system

4.3 Linear Filtering of Random Processes


To motivate our treatment of linear filtering of a random process, x(t), let
us pursue the signal and noise interpretations assigned to mx (t) and ∆x(t) in
the context of the highly simplified communication system shown in Fig. 4.4.
Here, the source output is a deterministic message waveform, m(t), which is
conveyed to the user through an additive-noise channel—the received waveform
is x(t) = m(t)+n(t), where n(t) is a zero-mean random process with covariance
function Knn (t, s). We have the following second-order characterization for
x(t) [cf. Eq. 4.12]
mx (t) = m(t), (4.20)
and
Kxx (t, s) = Knn (t, s). (4.21)
Given the structure of Fig. 4.4 and the discussion following Eq. 3.35, it is
reasonable to define an instantaneous signal-to-noise ratio, SNR(t), for this
problem by means of

SNR(t) ≡ mx (t)2 /Kxx (t, t) = m(t)2 /Knn (t, t).                         (4.22)

If SNR(t) ≫ 1 prevails for all t, then the Chebyshev inequality guarantees
that x(t) will, with high probability, be very close to m(t) at any time.
The above description deals only with the probabilistic structure of the
problem. Now, let us add some waveform characteristics. Suppose that the
message m(t) is a baseband speech waveform, whose frequency content is from
30 to 3000 Hz, and suppose that the noise n(t) is a broadband thermal noise,
with significant frequency content from 10 Hz to 100 MHz. If we find that
SNR(t) ≪ 1 prevails, the message will be buried in noise. Yet, it is intuitively
clear that the received waveform should be narrowband filtered, to pass the
signal components and reject the out-of-band noise components of x(t). After
Figure 4.5: Deterministic continuous-time system

such filtering, the signal-to-noise ratio may then obey SNR(t) ≫ 1. The
random process machinery for analyzing this problem will be developed below,
after a brief review of deterministic linear systems.

Deterministic Linear Systems


The reader is expected to have had a basic course in deterministic continuous-
time linear systems. Our review of this material will prove immediately useful
for linear filtering of random processes. Its translation into functions of 2-
D spatial vectors will comprise Chapter 5’s linear system—Fourier optics—
approach to free-space propagation of quasimonochromatic, paraxial, scalar
fields.
Figure 4.5 shows a deterministic continuous-time system, S, for which ap-
plication of a deterministic input waveform x(t) produces a deterministic out-
put waveform y(t). Two important properties which S may possess are as
follows.

linearity S is a linear system if it obeys the superposition principle, i.e., if


x1 (t) and x2 (t) are arbitrary input waveforms which, when applied to S,
yield corresponding output waveforms y1 (t) and y2 (t), respectively, and
if a1 and a2 are constants, then applying a1 x1 (t) + a2 x2 (t) to S results
in a1 y1 (t) + a2 y2 (t) as the output.

time invariance S is a time-invariant system if shifting an input waveform


along the time axis—delaying it or retarding it—yields a corresponding
output that shifts in an identical manner, i.e.,
x1 (t) −→ y1 (t) (under S)                                                (4.23)

implies that

x1 (t − T ) −→ y1 (t − T ) (under S),                                     (4.24)
for arbitrary input waveforms, x1 (t), and all values of the time shift, T .

Linearity and time invariance are not tightly coupled properties—a system
may be linear or nonlinear, time-invariant or time-varying, in any combination.
However, the confluence of these properties results in an especially important
class of systems—the linear time-invariant (LTI) systems. To understand just
how special LTI systems are, we need to recall two waveform building-block
procedures.
sifting integral Any reasonably-smooth9 deterministic waveform, x(t), can
be represented as a superposition of impulses via the sifting integral
x(t) = ∫_{−∞}^{∞} x(τ )δ(t − τ ) dτ.                                      (4.25)

inverse Fourier transform integral Any reasonably-smooth deterministic
waveform, x(t), can be represented as a superposition of complex sinusoids
via an inverse Fourier transform integral

x(t) = ∫_{−∞}^{∞} X(f ) e^{j2πf t} df,                                    (4.26)

where

X(f ) = ∫_{−∞}^{∞} x(t) e^{−j2πf t} dt.                                   (4.27)

Suppose S is an LTI system, with input/output relation y(t) = S[x(t)].


By use of the sifting-integral representation, Eq. 4.25, and the superposition
principle, the input/output relation for S can be reduced to a superposition
integral

y(t) = ∫_{−∞}^{∞} x(τ )h(t, τ ) dτ,                                       (4.28)

where h(t, τ ) is the response of S at time t to a unit-area impulse at time τ .


Because of time invariance, h(t, τ ) must equal h(t − τ ) ≡ h(t − τ, 0), reducing
Eq. 4.28 to a convolution integral
y(t) = ∫_{−∞}^{∞} x(τ )h(t − τ ) dτ.                                      (4.29)

Physically, h(t) is the response of the LTI system S to a unit-area impulsive


input at time t = 0. Naturally, h(t) is called the impulse response of the
system.
By Fourier transformation of Eq. 4.29 we can obtain the following alterna-
tive LTI-system input/output relation

Y (f ) = X(f )H(f ), (4.30)


9
Reasonably-smooth is a code phrase indicating that we will not be concerned with the
rigorous limits of validity of these procedures.

Figure 4.6: LTI system with a random process input

where Y (f ) and H(f ) are obtained from y(t) and h(t) via equations similar to
Eq. 4.27. The fact that Fourier transformation changes convolution into multi-
plication is an important calculational technique to be cognizant of. Physically,
it is more important to understand Eq. 4.30 from the inverse-Fourier-transform
approach to signal representation. Specifically, Eqs. 4.26 and 4.30 imply that
sinusoids are eigenfunctions of LTI systems, i.e., if A cos(2πf t+ φ) is the input
to an LTI system, the corresponding output will be |H(f )|A cos(2πf t + φ +
arg[H(f )]). In words, the response of an LTI system to a sinusoid of frequency
f is also a sinusoid of frequency f ; the system merely changes the amplitude
of the sinusoid by |H(f )| and shifts its phase by arg[H(f )]. H(f ) is called the
frequency response of the system.
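
A short discrete-time illustration of Eqs. 4.29 and 4.30 follows; it is a sketch under the assumption that a one-pole impulse response h(t) = a e^{−at}, t ≥ 0, sampled finely enough, is an adequate stand-in for a continuous-time LTI system. A sinusoidal input emerges scaled by |H(f)| and phase-shifted by arg[H(f)], as claimed.

    import numpy as np

    dt = 1e-3                           # sampling interval used to approximate integrals
    t = np.arange(0.0, 10.0, dt)

    a = 5.0                             # assumed one-pole filter: h(t) = a*exp(-a*t), t >= 0
    h = a * np.exp(-a * t)              # samples of the impulse response
    f = 2.0                             # input-sinusoid frequency in Hz
    x = np.cos(2 * np.pi * f * t)       # input with A = 1 and phi = 0

    # Convolution integral of Eq. 4.29, approximated by a Riemann sum.
    y = np.convolve(x, h)[: t.size] * dt

    # For this h(t), the frequency response is H(f) = a / (a + j 2 pi f).
    H = a / (a + 1j * 2 * np.pi * f)
    y_pred = np.abs(H) * np.cos(2 * np.pi * f * t + np.angle(H))

    # After the transient dies out (t >> 1/a), the output tracks the prediction.
    steady = t > 2.0
    print("max steady-state discrepancy:", np.max(np.abs(y - y_pred)[steady]))
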

Mean and Covariance Propagation


Now let us examine what happens when we apply a random process, x(t), at
the input to a deterministic LTI system whose impulse response is h(t) and
whose frequency response is H(f ), as shown in Fig. 4.6. Equation 4.29, which
presumes a deterministic input, can be employed on a sample-function basis
to show that the output of the system will be a random process, y(t), whose
sample functions are related to those of the input by convolution with the
impulse response, viz.10
y(t, ω) = ∫_{−∞}^{∞} x(τ, ω)h(t − τ ) dτ.                                 (4.31)

Thus, y(t) and x(t) are joint random processes defined on a common proba-
bility space. Moreover, given the second-order characterization of the process
x(t) and the linearity of the system, we will be able to find the second-order
characterization of process y(t) using techniques that we established in our
work with random vectors.
Suppose x(t) has mean function mx (t) and covariance function Kxx (t, s).
What are the resulting mean function and covariance function of the output
10
It follows from this result that we can use Eq. 4.29 for a random process input. We
shall eschew use of the random process version of Eq. 4.30, and postpone introduction of
frequency-domain descriptions until we specialize to wide-sense stationary processes.
process y(t)? The calculations are continuous-time versions of the Chapter 3
component-notation manipulations for N-D linear transformations. We have
that

my (t) = E[ ∫_{−∞}^{∞} x(τ )h(t − τ ) dτ ]
       = ∫_{−∞}^{∞} E[x(τ )h(t − τ )] dτ                                  (4.32)
       = ∫_{−∞}^{∞} E[x(τ )]h(t − τ ) dτ                                  (4.33)
       = ∫_{−∞}^{∞} mx (τ )h(t − τ ) dτ = mx (t) ∗ h(t),                  (4.34)

where ∗ denotes convolution. Equation 4.32 is obtained using “the average


of the sum is the sum of the averages”, as in Chapter 3, only here the sum-
mation is an integral. Equation 4.33 follows from “the average of a constant
times a random quantity equals the constant times the average of the random
quantity”, as in Chapter 3, only here the constant and random quantity are
indexed by a continuous—rather than a discrete—parameter. It is Eq. 4.34,
however, that is of greatest importance; it shows that the mean output of an
LTI system driven by a random process is the mean input passed through the
system.11
Because the input to and the output from the LTI filter in Fig. 4.6 equal
their mean functions plus their noise parts, Eq. 4.34 also tells us that the noise
in the output of a random-process driven LTI system is the noise part of the
input passed through the system. We then find that the covariance function
of the output obeys
Kyy (t, s) = E[ ∫_{−∞}^{∞} ∆x(τ )h(t − τ ) dτ ∫_{−∞}^{∞} ∆x(τ ′ )h(s − τ ′ ) dτ ′ ]
           = E[ ∫_{−∞}^{∞} dτ ∫_{−∞}^{∞} dτ ′ ∆x(τ )∆x(τ ′ )h(t − τ )h(s − τ ′ ) ]
           = ∫_{−∞}^{∞} dτ ∫_{−∞}^{∞} dτ ′ E[∆x(τ )∆x(τ ′ )h(t − τ )h(s − τ ′ )]
           = ∫_{−∞}^{∞} dτ ∫_{−∞}^{∞} dτ ′ Kxx (τ, τ ′ )h(t − τ )h(s − τ ′ ).      (4.35)

This derivation is the continuous-time analog of the componentwise linear-
transformation covariance calculations performed in Chapter 3. Of particular
note, for future similar manipulations, is the employment of different dummy
variables of integration so that the convolution integrals for ∆y(t) and ∆y(s)
could be combined into a double integral amenable to the interchange of ex-
pectation and integration.

11 This property was seen in Chapter 3. As stated in words, it depends only on the
linearity, not on the time invariance, of the system. Time invariance permits the use of a
convolution integral, instead of a superposition integral, for the input/output relation.
Equation 4.35 is essentially a double-convolution of the input covariance
with the system’s impulse response. Fortunately, we will seldom have to ex-
plicitly carry out such integrations. Before we turn to frequency-domain con-
siderations, let us augment our second-order characterization by introducing
the cross-covariance function of the processes x(t) and y(t), namely

Kxy (t, s) ≡ E[∆x(t)∆y(s)]. (4.36)

The cross-covariance function, Kxy (t, s), is a deterministic function of two time
values; at t = t1 and s = s1 , this function equals the covariance between the
random variables x(t1 ) and y(s1 ).12 When the process y(t) is obtained from
the process x(t) as shown in Fig. 4.6, we have that
Kxy (t, s) = ∫_{−∞}^{∞} Kxx (t, τ )h(s − τ ) dτ .                          (4.37)

The cross-covariance function provides a simple, but imperfect, measure of


the degree of statistical dependence between two random processes. We will
comment further on this issue in Section 4.5.
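
The following sketch checks the mean-propagation result, Eq. 4.34, and the output-variance consequence of Eq. 4.35 by brute-force simulation; it is illustrative only, with an assumed discrete impulse response, an assumed mean function, and white input noise standing in for the continuous-time quantities.

    import numpy as np

    rng = np.random.default_rng(3)

    n = np.arange(200)
    m_x = np.sin(2 * np.pi * n / 50.0)     # an assumed deterministic mean function
    sigma = 0.7                            # standard deviation of the input noise
    h = np.exp(-n / 10.0) / 10.0           # an assumed (discrete) impulse response

    # Many realizations of the input: mean function plus zero-mean white noise.
    trials = 5_000
    x = m_x + sigma * rng.standard_normal((trials, n.size))

    # Filter every realization, sample function by sample function (cf. Eq. 4.31);
    # a discrete convolution stands in for the convolution integral.
    y = np.array([np.convolve(xi, h)[: n.size] for xi in x])

    # Eq. 4.34: the mean output is the mean input passed through the filter.
    my_theory = np.convolve(m_x, h)[: n.size]
    print("max |empirical mean - (m_x * h)|:", np.max(np.abs(y.mean(axis=0) - my_theory)))

    # Eq. 4.35 with white input noise: var[y] = sigma^2 * sum_k h[k]^2 at late times.
    print("empirical var[y] at the last sample:", y[:, -1].var())
    print("predicted variance                 :", sigma**2 * np.sum(h**2))
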

Wide-Sense Stationarity
Let us find the mean function and the covariance function of the single-
frequency wave, x(t) from Eq. 4.3. These are easily shown to be

mx (t) = E[√(2P) cos(2πf0 t + θ)]
       = ∫_0^{2π} (1/2π) √(2P) cos(2πf0 t + θ) dθ = 0,                    (4.38)

and

Kxx (t, s) = E[2P cos(2πf0 t + θ) cos(2πf0 s + θ)]
           = E{P cos[2πf0 (t − s)]} + E{P cos[2πf0 (t + s) + 2θ]}
           = P cos[2πf0 (t − s)],                                          (4.39)
12
The term cross-covariance function is used because this function quantifies the covari-
ances between time samples from two random processes. The term auto-covariance is some-
times used for functions like Kxx and Kyy , because they each specify covariances between
time samples from a single random process.

respectively. Equations 4.38 and 4.39 have a variety of interesting properties.


That the mean function, Eq. 4.38, should be zero at all times is intuitively
clear from Eq. 4.3—the sample functions of x(t) comprise all possible phase
shifts of an amplitude √(2P), frequency f0 sinusoid, and these occur with equal
probability, because of the uniform distribution of θ. That the correlation
coefficient, ρxx (t, s), associated with the covariance function, Eq. 4.39, should
give ρxx (t, t + n/f0 ) = 1 for n an integer follows directly from Fig. 4.2—all the
sample functions of x(t) are sinusoids of period 1/f0 .
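
These moment formulas are easy to confirm by simulation. The sketch below (with arbitrary choices of P, f0, and the sample times) averages over a large number of random phases and compares the results with Eqs. 4.38 and 4.39.

    import numpy as np

    rng = np.random.default_rng(4)

    P_, f0 = 2.0, 1.5                       # arbitrary positive constants of Eq. 4.3
    t, s = 0.37, 0.81                       # two arbitrary sample times

    theta = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)
    xt = np.sqrt(2.0 * P_) * np.cos(2.0 * np.pi * f0 * t + theta)
    xs = np.sqrt(2.0 * P_) * np.cos(2.0 * np.pi * f0 * s + theta)

    # Eq. 4.38: the mean function is zero for every t.
    print("empirical m_x(t):", xt.mean())

    # Eq. 4.39: K_xx(t, s) = P cos[2*pi*f0*(t - s)], a function of t - s only.
    print("empirical K_xx(t, s):", np.mean(xt * xs))
    print("theory              :", P_ * np.cos(2.0 * np.pi * f0 * (t - s)))
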
The very specific characteristics of the single-frequency wave’s mean and
covariance function are not the point we are driving at. Rather it is the fact
that the mean function is a constant, viz.

mx (t) = mx (0), for all t, (4.40)

and that its covariance function depends only on time differences, namely

Kxx (t, s) = Kxx (t − s, 0), for all t, s, (4.41)

that matters. Equations 4.40 and 4.41 say that the second-order characteriza-
tion of this random process is time invariant—the mean and variance of any
single time sample of the process x(t) are independent of the time at which that
sample is taken, and the covariance between two different time samples of the
process x(t) depends only on their time separation. We call random processes
which obey Eqs. 4.40 and 4.41 wide-sense stationary random processes.13
The single-frequency wave is wide-sense stationary (WSS)—a sinusoid of
known amplitude and frequency but completely random phase certainly has
no preferred time origin. The Gaussian random process whose typical sam-
ple function was sketched in Fig. 4.3 is also WSS—here the WSS conditions
were given at the outset. Not all random processes are wide-sense stationary,
however. For example, consider the random-frequency wave, x(t), defined by

x(t) ≡ √(2P) sin(2πf t),                                                   (4.42)

where P is a positive constant, and f is a random variable which is uniformly


distributed on the interval f0 ≤ f ≤ 2f0 , for f0 a positive constant. Two typ-
ical sample functions for this process have been sketched in Fig. 4.7. Clearly,
this process is not wide-sense stationary—a preferred time origin is apparent
at t = 0.
13
A strict-sense stationary random process is one whose complete statistical characteri-
zation is time invariant [cf. Section 4.4].
Figure 4.7: Typical sample functions of a random-frequency wave

We make no claim for the physical importance of the preceding random-


frequency wave. Neither do we assert that all physically interesting random
processes must be wide-sense stationary. It seems reasonable, however, to
expect that the thermal-noise current of a resistor in thermal equilibrium at
temperature T K should be a WSS random process. Likewise, the shot-noise
current produced by constant-power illumination of a photodetector should
also be wide-sense stationary. Thus the class of WSS processes will be of some
interest in the optical communication analyses that follow. Our present task is
to examine the implications of Eqs. 4.40 and 4.41 with respect to LTI filtering
of a wide-sense stationary random process.
Suppose that the input, x(t), in Fig. 4.6 is a wide-sense stationary process,
whose mean function is
mx = E[x(t)], (4.43)
and whose covariance function is

Kxx (τ ) = E[∆x(t + τ )∆x(t)], (4.44)

where we have exploited Eq. 4.40 in suppressing the time argument of the
mean function, and Eq. 4.41 in writing a covariance function that depends
only on time difference, τ .14 We have, from Eqs. 4.34 and 4.35, that the mean
14 An unfortunate recurring problem of technical writing—particularly in multidisciplinary
endeavors like optical communication—is that there is not enough good notation to go
around. In deterministic LTI system theory, τ is the preeminent choice for the convolution-
integral’s dummy variable. In random process theory, τ is the preferred time-difference
argument for a covariance function. In what follows we shall replace τ and τ ′ in Eqs. 4.34,
4.35, and 4.37 with α and β, respectively.
and covariance function of the output process, y(t), are

my (t) = ∫_{−∞}^{∞} mx h(t − α) dα = mx H(0) = my (0), for all t,          (4.45)

and

Kyy (t, s) = ∫_{−∞}^{∞} dα ∫_{−∞}^{∞} dβ Kxx (α − β)h(t − α)h(s − β)
           = ∫_{−∞}^{∞} dα ∫_{−∞}^{∞} dβ Kxx (t − s − α + β)h(α)h(β)       (4.46)
           = Kyy (t − s, 0), for all t, s,

where Eq. 4.46 has been obtained via the change of variables α −→ t − α,
β −→ s − β.
We see that y(t) is a wide-sense stationary random process. This is to be
expected. The input process has no preferred time origin in its second-order
characterization because it is WSS; the second-order characterization of the
output process can be obtained from that of the input because the filter is
linear; and the filter imposes no preferred time origin into the propagation of
the second-order characterization because the filter is time invariant. In the
notation for WSS processes, the above results become

my = mx H(0), (4.47)
Kyy (τ ) = ∫_{−∞}^{∞} dα ∫_{−∞}^{∞} dβ Kxx (τ − α + β)h(α)h(β).            (4.48)

Furthermore, the qualitative argument given in support of y(t)’s being wide-


sense stationary extends to the cross-covariance function for which Eq. 4.37
can be reduced to
Kxy (τ ) ≡ E[∆x(t + τ )∆y(t)] = ∫_{−∞}^{∞} Kxx (τ + β)h(β) dβ.             (4.49)

Two wide-sense stationary random processes whose cross-covariance function


depends only on time differences are said to be jointly wide-sense stationary
(JWSS) processes—such is the case for the x(t) and y(t) processes here.

Deeper appreciation for the WSS case can be developed by examining


Eqs. 4.47, 4.48, and 4.49 from the frequency domain. The mean function
of a WSS random process is a constant—a sinusoid of zero frequency—so that
Eq. 4.47 is merely the sinusoidal eigenfunction version of “the mean output of
the LTI system is the mean input passed through the system”. The frequency-
domain versions of Eqs. 4.48 and 4.49 require that we introduce the Fourier
transforms—the spectral densities—of the input and output covariance func-
tions, namely
Sxx (f ) ≡ ∫_{−∞}^{∞} Kxx (τ ) e^{−j2πf τ } dτ,                            (4.50)

Syy (f ) ≡ ∫_{−∞}^{∞} Kyy (τ ) e^{−j2πf τ } dτ,                            (4.51)

as well as their cross-spectral density, i.e., the Fourier transform of the in-
put/output cross-covariance function
Sxy (f ) ≡ ∫_{−∞}^{∞} Kxy (τ ) e^{−j2πf τ } dτ.                            (4.52)

We then can reduce the convolution-like integrations involved in the time-


domain results Eqs. 4.48 and 4.49 to the following multiplicative relations

Syy (f ) = Sxx (f )|H(f )|2, (4.53)

and
Sxy (f ) = Sxx (f )H(f )∗. (4.54)
Aside from the calculational advantages of multiplication as opposed to in-
tegration, Eq. 4.53 has important mathematical properties and a vital physical
interpretation. The 1:1 nature of Fourier transformation tells us that covari-
ance functions can be recovered from their associated spectra by an inverse
Fourier integral, e.g.,
Kxx (τ ) = ∫_{−∞}^{∞} Sxx (f ) e^{j2πf τ } df.                             (4.55)

Combining this result with the WSS forms of Eqs. 4.15 and 4.16 yields
{ Kxx (−τ ) = Kxx (τ ), Kxx real-valued }  ←→  { Sxx (−f ) = Sxx (f ), Sxx real-valued },   (4.56)

and

0 ≤ var[x(t)] = Kxx (0) = ∫_{−∞}^{∞} Sxx (f ) df.                          (4.57)
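
The following sketch illustrates Eqs. 4.53 and 4.57 numerically; it is an approximation in which discrete white Gaussian noise stands in for a flat-spectrum WSS input, an assumed one-pole frequency response plays the role of H(f), and averaged periodograms estimate Syy(f).

    import numpy as np

    rng = np.random.default_rng(5)

    dt, N, trials = 1e-3, 4096, 400
    f = np.fft.rfftfreq(N, dt)                 # frequency grid, 0 ... 1/(2 dt)
    a = 20.0
    H = a / (a + 1j * 2 * np.pi * f)           # assumed one-pole frequency response

    sigma = 1.0
    Sxx = sigma**2 * dt                        # flat (white) input spectral-density level

    # Filter white Gaussian noise in the frequency domain and average periodograms.
    Syy_est = np.zeros(f.size)
    var_y = 0.0
    for _ in range(trials):
        x = sigma * rng.standard_normal(N)
        y = np.fft.irfft(np.fft.rfft(x) * H, n=N)
        Y = dt * np.fft.rfft(y)
        Syy_est += np.abs(Y) ** 2 / (N * dt)   # periodogram, normalized to a density
        var_y += y.var()
    Syy_est /= trials
    var_y /= trials

    # Eq. 4.53: Syy(f) = Sxx(f) |H(f)|^2; compare at a few frequency bins.
    idx = [5, 20, 80]
    print("estimated Syy :", np.round(Syy_est[idx], 6))
    print("Sxx |H|^2     :", np.round(Sxx * np.abs(H[idx]) ** 2, 6))

    # Eq. 4.57: var[y(t)] is the integral of Syy over all (bilateral) frequencies.
    df = f[1] - f[0]
    print("var[y] estimate        :", var_y)
    print("2 * sum(Sxx|H|^2) * df :", 2.0 * np.sum(Sxx * np.abs(H) ** 2) * df)
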
Figure 4.8: Ideal passband filter

As per our discussion following Eqs. 4.15–4.17, the constraints just exhibited
for a WSS covariance function and its associated spectrum are necessary but
not sufficient conditions for a function of a single variable to be a valid Kxx
or Sxx . Nevertheless, Eq. 4.57 suggests an interpretation of Sxx (f ) whose
validation will lead us to the necessary and sufficient conditions for the WSS
case.
We know that var[x(t)] is the instantaneous mean-square noise strength
in the random process x(t). For x(t) wide-sense stationary, this variance can
be found—according to Eq. 4.57—by integrating the spectral density Sxx (f )
over all frequencies. This frequency-domain calculation is consistent with the
following property.
spectral-density interpretation For x(t) a WSS random process with spec-
tral density Sxx (f ), and f0 ≥ 0 an arbitrary frequency,15 Sxx (f0 ) is the
mean-square noise strength per unit bilateral bandwidth in x(t)’s fre-
quency f0 component.
The above property, which we will prove immediately below, certainly jus-
tifies referring to Sxx (f ) as the spectral density of the x(t) process. Its proof
is a simple juxtaposition of the physical interpretation and the mathematical
analysis of var[y(t)] for the Fig. 4.6 arrangement when H(f ) is the ideal pass-
band filter shown in Fig. 4.8. This ideal filter passes, without distortion, the
frequency components of x(t) that lie within a 2∆f bilateral bandwidth vicin-
ity of frequency f0 , and completely suppresses all other frequencies.16 Thus,
15
Strictly speaking, f0 should be a point of continuity of Sxx (f ) for this property to hold.
16
Because we are dealing with real-valued time functions and exponential Fourier trans-
forms, H(−f ) = H(f )∗ must prevail. We shall only refer to positive frequencies in discussing
the spectral content of the filter’s output, but we must employ its bilateral —positive and
negative frequency—bandwidth in calculating var[y(t)].
for ∆f sufficiently small, we have that var[y(t)]/2∆f is the mean-square noise
strength per unit bilateral bandwidth in the frequency f0 component of the
process x(t). On the other hand, a direct mathematical calculation, based on
Eqs. 4.57, 4.56, and 4.53, shows that

var[y(t)]/2∆f = Kyy (0)/2∆f = (1/2∆f ) ∫_{−∞}^{∞} Syy (f ) df
              = (1/∆f ) ∫_0^{∞} Syy (f ) df
              = (1/∆f ) ∫_0^{∞} Sxx (f )|H(f )|2 df
              = (1/∆f ) ∫_{f0 −∆f /2}^{f0 +∆f /2} Sxx (f ) df
              ≈ Sxx (f0 ), for ∆f sufficiently small,                      (4.58)

which proves the desired result.


The spectral-density interpretation is the major accomplishment of our
second-order characterization work.17 This interpretation lends itself to treat-
ing the simple-minded communication example we used to introduce our de-
velopment of LTI filtering of random processes. It is also of great value in
understanding the temporal content of a random process, albeit within the
limitations set by the incompleteness of second-order characterization. In the
next subsection we shall present and discuss several Kxx ↔ Sxx examples.

Spectral-Density Examples
As a prelude to the examples, we note the following corollary to our spectral-
density interpretation: the spectral density of a WSS random process is non-
negative,
Sxx (f ) ≥ 0, for all f . (4.59)
Moreover, it can be shown that the inverse Fourier transform of a real-valued,
even, non-negative function of frequency is a real-valued, even, non-negative
definite function of time. Thus, Eqs. 4.56 and 4.59 are necessary and sufficient
conditions for an arbitrary deterministic function of frequency to be a valid
spectral density for a wide-sense stationary random process. This makes the
task of selecting valid Kxx ↔ Sxx examples fairly simple—retrieve from our
17
The term power-spectral density is often used, with some imprecision. If x(t) has
physical units “widgets”, then Sxx (f ) has units “widgets2 /Hz”. Only when widgets2 are
watts is Sxx (f ) really a power spectrum. Indeed, the most common spectrum we shall deal
with in our photodetection work is that of electrical current; its units are A2 /Hz.
storehouse of deterministic linear system theory all Fourier transform pairs
for which the frequency function is real-valued, even, and non-negative. The
following examples will suffice to illustrate some key points.

single-frequency spectrum The single-frequency wave’s covariance func-


tion, Eq. 4.39, written in WSS notation,

Kxx (τ ) = P cos(2πf0 τ ), (4.60)

is associated with the following spectrum,

Sxx (f ) = (P/2) δ(f − f0 ) + (P/2) δ(f + f0 ).                            (4.61)

Lorentzian spectrum The exponential covariance function, Eq. 4.8, written


in WSS notation,
Kxx (τ ) = P exp(−λ|τ |), (4.62)
is associated with the Lorentzian spectrum,

Sxx (f ) = 2P λ / [(2πf )2 + λ2 ].                                         (4.63)

bandlimited spectrum The ideal bandlimited spectrum,

Sxx (f ) = P/2W for |f | ≤ W, and 0 otherwise,                             (4.64)

is associated with the sin(x)/x covariance function,

Kxx (τ ) = P sin(2πW τ ) / (2πW τ ).                                       (4.65)

Gaussian spectrum The Gaussian covariance function,

Kxx (τ ) = P exp(−τ 2 /t2c ),                                              (4.66)

is associated with the Gaussian spectrum

Sxx (f ) = P √(πt2c ) exp[−(πf tc )2 ].                                    (4.67)

white noise The white-noise spectrum,18


Sxx (f ) = q, for all f , (4.68)
is associated with the impulsive covariance function,
Kxx (τ ) = qδ(τ ). (4.69)
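
As a quick numerical cross-check of one of these pairs, the sketch below approximates the Fourier integral of Eq. 4.50 for the exponential covariance of Eq. 4.62 and compares the result with the Lorentzian of Eq. 4.63; the grid spacing and truncation length are arbitrary choices.

    import numpy as np

    P_, lam = 1.0, 3.0                       # arbitrary constants in Eq. 4.62
    dt = 1e-3
    tau = np.arange(-20.0, 20.0, dt)         # truncate the infinite integral
    Kxx = P_ * np.exp(-lam * np.abs(tau))

    f = np.array([0.0, 0.5, 1.0, 2.0])       # a few test frequencies (Hz)
    # Discrete approximation of Sxx(f) = integral of Kxx(tau) exp(-j 2 pi f tau) dtau.
    Sxx_num = np.array([np.sum(Kxx * np.exp(-1j * 2 * np.pi * fk * tau)).real * dt
                        for fk in f])
    Sxx_theory = 2.0 * P_ * lam / ((2.0 * np.pi * f) ** 2 + lam ** 2)

    print("numerical Sxx:", Sxx_num)
    print("Eq. 4.63     :", Sxx_theory)
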

We have plotted these Kxx ↔ Sxx examples in Fig. 4.9. The single-
frequency wave’s spectrum is fully consistent with our understanding of its
sample functions—all the mean-square noise strength in this process is con-
centrated at f = f0 . The Lorentzian, bandlimited, and Gaussian examples all
can be assigned reasonably-defined correlation times and bandwidths, as shown
in the figure. These evidence the Fourier-transform uncertainty principle, i.e.,
to make a covariance decay more rapidly we must broaden its associated spec-
trum proportionally. In physical terms, for two time-samples of a WSS process
taken at time-separation τ s to be weakly correlated, the process must contain
significant spectral content at or beyond 1/2πτ Hz. This is consistent with our
earlier discussion of the Gaussian-process sample function shown in Fig. 4.3.
The white-noise spectrum deserves some additional discussion. Its name
derives from its having equal mean-square noise density at all frequencies. This
infinite bandwidth gives it both infinite variance and zero correlation time—
both characteristics at odds with physical reality. Yet, white-noise models
abound in communication theory generally, and will figure prominently in our
study of optical communications. There need be no conflict between realistic
modeling and the use of white-noise spectra. If a wide-sense stationary input
process in the Fig. 4.6 arrangement has a true spectral density that is very
nearly flat over the passband of the filter, no loss in output-spectrum accuracy
results from replacing the true input spectrum with a white-noise spectrum of
the appropriate level. We must remember, when using a white-noise model,
that meaningless answers—infinite variance, zero correlation-time—will ensue
if no bandlimiting filter is inserted between the source of the noise and our
observation point. Any measurement apparatus has some intrinsic bandwidth
limitation, so this caution is not unduly restrictive.
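
The modeling recipe just described is easy to illustrate numerically. In the sketch below, discrete white Gaussian noise of (bilateral) spectral density q is passed through an ideal lowpass filter of bandwidth W, both assumed for illustration; the unfiltered samples have a variance that blows up as the sampling interval shrinks, while the filtered output has the finite variance 2qW predicted by Eq. 4.57.

    import numpy as np

    rng = np.random.default_rng(6)

    dt, N = 1e-4, 200_000                    # sampling interval and record length
    q = 2.0                                  # white-noise spectral density (units^2/Hz)
    W = 500.0                                # lowpass bandwidth in Hz (W << 1/(2 dt))

    # Discrete-time stand-in for white noise of bilateral spectral density q:
    # independent Gaussian samples of variance q/dt.
    x = np.sqrt(q / dt) * rng.standard_normal(N)
    print("input sample variance (grows as 1/dt):", x.var())

    # Ideal bandlimiting filter applied in the frequency domain.
    f = np.fft.rfftfreq(N, dt)
    X = np.fft.rfft(x)
    y = np.fft.irfft(X * (f <= W), n=N)

    # Output variance is finite and matches q integrated over the passband: 2 q W.
    print("output variance:", y.var(), "  theory 2qW:", 2.0 * q * W)
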

4.4 Gaussian Random Processes


A random process, x(t), is a collection of random variables indexed by the
time parameter, t. A Gaussian random process is a collection of jointly Gaussian
random variables indexed by time.

18 There is not enough good notation to go around; q is not the electron charge in this
expression.
Figure 4.9: Covariance/spectral-density examples, top to bottom: (a) single-
frequency wave, (b) Lorentzian spectrum, (c) bandlimited spectrum, (d) Gaus-
sian spectrum, (e) white noise

We introduced the Gaussian random
process (GRP) early on in this chapter, to provide a quick random-process
example whose sample functions looked appropriately noisy. Now, having ex-
tensively developed the second-order characterization for an arbitrary random
process, we return to the Gaussian case. This return will be worthwhile for
several reasons: Gaussian random processes are good models for random wave-
forms whose microscopic composition consists of a large number of more-or-less
small, more-or-less independent contributions; and second-order characteriza-
tion provides complete statistics for a Gaussian random process.
Let x(t) be a GRP with mean function mx (t) and covariance function
Kxx (t, s); wide-sense stationarity will not be assumed yet. By definition, the
random vector x, from Eq. 4.1, obtained by sampling this process at the times
specified by t, from Eq. 4.2, is Gaussian distributed. To complete explicit
evaluation of the probability density, px (X), for this random vector we need
only determine its mean vector and covariance matrix; they are given by

mx = [ mx (t1 ), mx (t2 ), . . . , mx (tN ) ]^T ,                          (4.70)

and

      [ Kxx (t1 , t1 )   Kxx (t1 , t2 )   · · ·   Kxx (t1 , tN ) ]
      [ Kxx (t2 , t1 )   Kxx (t2 , t2 )   · · ·   Kxx (t2 , tN ) ]
Λx =  [       ⋮                ⋮            ⋱           ⋮        ] .        (4.71)
      [ Kxx (tN , t1 )   Kxx (tN , t2 )   · · ·   Kxx (tN , tN ) ]
Thus, whereas knowledge of mean and covariance functions provides only a
partial characterization of a general random process, it provides a complete
characterization of a Gaussian random process. Furthermore, if a GRP x(t) is
wide-sense stationary, then it is also strict-sense stationary, i.e., for arbitrary t,
the random vector x has the same probability density function as the random
vector x′ defined as follows

x′ ≡ [ x(0), x(t2 − t1 ), x(t3 − t1 ), . . . , x(tN − t1 ) ]^T .           (4.72)
Arbitrary random processes which are wide-sense stationary need not be strict-
sense stationary.
When we introduced jointly Gaussian random variables in Chapter 3, we
focused on their closure under linear transformations. The same is true for
Gaussian random processes: if the input process in Fig. 4.6 is Gaussian, then
the output process is also Gaussian. We shall omit the proof of this property—
it merely involves combining the convolution integral input/output relation
for the filter with the linear-closure definition of jointly Gaussian random vari-
ables to prove that y(t) is a collection of jointly Gaussian random variables
indexed by t. Of greater importance is the fact that the mean and covariance
propagation results from our second-order characterization now yield complete
statistics for the output of an LTI filter that is driven by Gaussian noise.
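
Because Eqs. 4.70 and 4.71 supply complete statistics, sample paths of a GRP can be generated directly from its mean and covariance functions. The sketch below does so for assumed, arbitrary choices of mx(t) and Kxx(t, s), using a Cholesky factorization of Λx; the small diagonal term guards against round-off in the factorization.

    import numpy as np

    rng = np.random.default_rng(7)

    P_, lam = 1.0, 4.0
    m_x = lambda t: np.sin(2.0 * np.pi * t)               # assumed mean function
    K_xx = lambda t, s: P_ * np.exp(-lam * np.abs(t - s)) # covariance function of Eq. 4.8

    t = np.linspace(0.0, 5.0, 400)

    # Mean vector and covariance matrix of the time samples, Eqs. 4.70 and 4.71.
    mean_vec = m_x(t)
    cov_mat = K_xx(t[:, None], t[None, :])

    # Draw jointly Gaussian samples: x = m_x + L u, with Lambda_x = L L^T and u ~ N(0, I).
    L = np.linalg.cholesky(cov_mat + 1e-10 * np.eye(t.size))
    paths = mean_vec + (L @ rng.standard_normal((t.size, 3))).T   # three sample paths

    print("generated", paths.shape[0], "sample paths of length", paths.shape[1])
    print("first path, first five samples:", np.round(paths[0, :5], 3))
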

4.5 Joint Random Processes

The principal focus of the material thus far in this chapter has been on a
single random process. Nevertheless, we have noted that the random-process
input and output in Fig. 4.6 comprise a pair of joint random processes on
some underlying probability space. We even went so far as to compute their
cross-covariance function. Clearly, there will be cases, in our optical communi-
cation analyses, when we will use measurements of one random process to infer
characteristics of another. Thus, it is germane to briefly examine the complete
characterization for joint random processes, and discuss what it means for two
random processes to be statistically independent. Likewise, with respect to
partial statistics, we ought to understand the joint second-order characteriza-
tion for two random processes, and what it means for them to be uncorrelated.
These tasks will be addressed in this final section. Although the extension to
N joint random processes is straightforward, we will restrict our remarks to
the 2-D case.

Let x(t) and y(t) be joint random processes. Their complete statistical
characterization is the information sufficient to deduce the probability density,
pz (Z), for the random vector

z ≡ [ x^T | y^T ]^T = [ x(t1 ), x(t2 ), . . . , x(tN ) | y(t′1 ), y(t′2 ), . . . , y(t′M ) ]^T ,   (4.73)

for arbitrary {tn , t′m }, N, and M. In words, complete joint-characterization
amounts to having the joint statistics for any set of time samples from the two
random processes. The joint second-order characterization of the processes
x(t) and y(t) consists of their mean functions, covariance functions, and their
cross-covariance function. These can always be found from a complete joint
characterization; the converse is not generally true.
We can now deal with the final properties of interest.
statistically-independent processes Two random processes are statisti-
cally independent if and only if all the time samples of one process are
statistically independent of all the time samples of the other process.

uncorrelated random processes Two random processes are uncorrelated


if and only if all the time samples of one process are uncorrelated with
all the time samples of the other process.
For x(t) and y(t) statistically independent random processes, the probability
density for a random vector z, obtained via Eq. 4.73 with arbitrary sample
times, factors according to

pz (Z) = px (X)py (Y), for all Z = [ X^T | Y^T ]^T ;                       (4.74)

for x(t) and y(t) uncorrelated random processes, we have that

Kxy (t, s) = 0, for all t, s. (4.75)

Statistically independent random processes are always uncorrelated, but
uncorrelated random processes may be statistically dependent. In the context
of our photodetector phenomenology, the physical independence of the various
noise currents—the light current, the dark current, and the thermal current—
will lead us to model them as statistically independent random processes.
Although there is a great deal more to be said about random processes, we
now have sufficient foundation for our immediate needs. However, before devel-
oping the random-process machinery that goes with the generic-photodetector
model of Chapter 2, we shall use Chapter 5 to establish a similar analytic
beach head in the area of Fourier optics. While not desperately necessary for
direct-detection statistical modeling, Fourier optics will be critical to under-
standing heterodyne detection, and it will serve as the starting point for our
coverage of unguided propagation channels.
MIT OpenCourseWare
https://ocw.mit.edu

6.453 Quantum Optical Communication


Fall 2016

For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.
