EECE 522 Notes - 01a Review of Probability - Rev - 1

The document provides an overview of probability and random variables. It defines key concepts such as: 1) random variables, which assign numerical values to random outcomes and can be either discrete (e.g., dice rolls) or continuous (e.g., temperature); 2) probability density functions (PDFs), which describe the probabilities of random-variable values, the most commonly used PDF being the Gaussian (normal) distribution; and 3) key characteristics of random variables, including the mean (expected value), the variance (spread of values), and the correlation between multiple random variables. Independent random variables have a joint PDF that is the product of their individual PDFs, while dependent variables have joint PDFs whose slices change with the value of the other variable.

Review of Probability

1
Random Variable
 Definition
Numerical characterization of outcome of a random
event

Examples
1) Number showing on a rolled die
2) Temperature at specified time of day
3) Stock market index value at close
4) Height of wheel going over a rocky road

2
Random Variable
Non-examples (but we can make these into RVs):
1) ‘Heads’ or ‘Tails’ on a coin
2) Red or black ball from an urn

Basic Idea – we don’t know how to completely determine what value will occur
– We can only specify the probabilities of the RV values occurring.

3
Two Types of Random Variables

Random Variable: two types

• Discrete RV – e.g., die, stocks
• Continuous RV – e.g., temperature, wheel height

4
PDF for Continuous RV
Given Continuous RV X…
What is the probability that X = x0?
Oddity: P(X = x0) = 0 – otherwise the probabilities would “sum” (integrate) to infinity.
So we need to think in terms of a Probability Density Function (PDF).

p_X(x): the probability density function of RV X

[Figure: PDF curve with the area between x_0 and x_0 + ∆ shaded]

P(x_0 < X < x_0 + ∆) = area shown = ∫_{x_0}^{x_0 + ∆} p_X(x) dx
5
Most Commonly Used PDF: Gaussian

A RV X with the following PDF is called a Gaussian RV

p_X(x) = (1 / (σ√(2π))) e^{−(x − m)² / (2σ²)}

m & σ are parameters of the Gaussian pdf


m = Mean of RV X
σ = Standard Deviation of RV X (Note: σ > 0)
σ2 = Variance of RV X

Notation: When X has a Gaussian PDF we say X ~ N(m, σ²)


6
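As a quick numerical illustration (an addition to these notes, not part of the original slides), the sketch below evaluates the Gaussian PDF defined above for arbitrary assumed values of m and σ and checks that it integrates to 1:

```python
import numpy as np

def gaussian_pdf(x, m, sigma):
    """Gaussian PDF p_X(x) = (1 / (sigma*sqrt(2*pi))) * exp(-(x - m)^2 / (2*sigma^2))."""
    return np.exp(-(x - m) ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))

# Arbitrary example parameters (assumed values, not from the notes)
m, sigma = 1.5, 2.0

# Check that the PDF integrates to ~1 over a wide range (simple Riemann sum)
x = np.linspace(m - 10 * sigma, m + 10 * sigma, 20001)
dx = x[1] - x[0]
area = np.sum(gaussian_pdf(x, m, sigma)) * dx
print("total area under PDF:", area)   # ~1.0
```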
Zero-Mean Gaussian PDF

 Generally: take the noise to be Zero Mean

p_X(x) = (1 / (σ√(2π))) e^{−x² / (2σ²)}

7
Effect of Variance on Gaussian PDF
[Figure: three Gaussian PDFs p_X(x), each centered at x = m]
• Area within ±1σ of the mean = 0.683 = 68.3%
• Small σ → small variability (small uncertainty)
• Large σ → large variability (large uncertainty)

8
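The 68.3% figure quoted above can be checked in closed form: for any m and σ, P(m − σ < X < m + σ) = erf(1/√2). A tiny added sketch (not from the original notes):

```python
import math

# P(m - sigma < X < m + sigma) for X ~ N(m, sigma^2) equals erf(1/sqrt(2)),
# regardless of the particular m and sigma.
area_within_1_sigma = math.erf(1.0 / math.sqrt(2.0))
print(area_within_1_sigma)   # ~0.6827, i.e. 68.3%
```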
Why Is Gaussian Used?
Central Limit Theorem (CLT):
The sum of N independent RVs has a PDF that tends to be Gaussian as N → ∞

So what? Here is what: electronic systems generate internal noise due to the random
motion of electrons in electronic components. The noise is the result of summing the
random effects of lots of electrons.

CLT applies ⇒ Gaussian noise

9
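Here is a small added Monte Carlo sketch of the CLT (the uniform summands, sample sizes, and the ±1σ check are all assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

N_terms = 30        # number of independent RVs summed (assumed value)
N_trials = 200_000  # number of Monte Carlo trials (assumed value)

# Sum of N independent Uniform(0,1) RVs; mean = N/2, variance = N/12
sums = rng.uniform(0.0, 1.0, size=(N_trials, N_terms)).sum(axis=1)

# Standardize the sums and check how often they fall within +/- 1 sigma
z = (sums - N_terms / 2.0) / np.sqrt(N_terms / 12.0)
print("fraction within 1 sigma:", np.mean(np.abs(z) < 1.0))  # ~0.683 if ~Gaussian
```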
Joint PDF of RVs X and Y: p_XY(x, y)
Describes probabilities of joint events concerning X and Y. For
example, the probability that X lies in the interval [a, b] and Y lies in the
interval [c, d] is given by:

Pr{(a < X < b) and (c < Y < d)} = ∫_{y=c}^{d} ∫_{x=a}^{b} p_XY(x, y) dx dy

[Figure: a joint PDF surface p_XY(x, y). Graph from B. P. Lathi’s book: Modern Digital & Analog Communication Systems]
10
Conditional PDF of Two RVs
When you have two RVs we often ask: what is the PDF of one RV if the other is
constrained to take on a specific value? In other words: what is the PDF of one RV
conditioned on the other RV taking a specific value?
Ex.: Husband’s salary X conditioned on wife’s salary Y = $100K.
First find all wives who make EXACTLY $100K… how are their
husbands’ salaries distributed?
This depends on the joint PDF because there are two RVs… but it
should only depend on the slice of the joint PDF at y = $100K.
Now… we have to adjust this slice to account for the fact that the joint
PDF (even its slice) reflects how likely it is that Y = $100K will
occur (e.g., if Y = 100 is unlikely then p_XY(x, 100) will be small); so…
if we divide by p_Y(100) we adjust for this.
11
Conditional PDF (cont.)
Thus, the conditional PDFs are defined as (“slice and normalize”):
p_{Y|X}(y | x) = p_XY(x, y) / p_X(x)  if p_X(x) ≠ 0,  and 0 otherwise   (x is held fixed)

p_{X|Y}(x | y) = p_XY(x, y) / p_Y(y)  if p_Y(y) ≠ 0,  and 0 otherwise   (y is held fixed)

“Slice and normalize”: the slice taken at the fixed value is divided by the marginal PDF at that value.

[Figure: a conditional PDF obtained by slicing the joint PDF. Graph from B. P. Lathi’s book: Modern Digital & Analog Communication Systems]
12
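To make “slice and normalize” concrete, here is an added sketch (not from the notes) that discretizes an assumed correlated bivariate Gaussian joint PDF on a grid, slices it at a fixed x, and normalizes the slice so it integrates to 1 over y:

```python
import numpy as np

# Grid for (x, y)
x = np.linspace(-4, 4, 401)
y = np.linspace(-4, 4, 401)
X, Y = np.meshgrid(x, y, indexing="ij")

# Assumed joint PDF: correlated, zero-mean, unit-variance bivariate Gaussian
rho = 0.7
pXY = np.exp(-(X**2 - 2*rho*X*Y + Y**2) / (2*(1 - rho**2))) \
      / (2*np.pi*np.sqrt(1 - rho**2))

# "Slice": fix x = 1.0 and look at pXY(1.0, y)
ix = np.argmin(np.abs(x - 1.0))
slice_unnorm = pXY[ix, :]

# "Normalize": divide by pX(1.0), the integral of the slice over y
dy = y[1] - y[0]
pX_at_x0 = slice_unnorm.sum() * dy
p_y_given_x = slice_unnorm / pX_at_x0
print("conditional PDF integrates to:", p_y_given_x.sum() * dy)  # ~1.0
```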
Independent RV’s
Independence should be thought of as saying that:

Neither RV impacts the other statistically – thus, the values that one will likely
take should be irrelevant to the value that the other has taken.

In other words: conditioning doesn’t change the PDF!!!


p_{Y|X}(y | x) = p_XY(x, y) / p_X(x) = p_Y(y)

p_{X|Y}(x | y) = p_XY(x, y) / p_Y(y) = p_X(x)
13
Independent and Dependent Gaussian PDFs
[Figure: contours of p_XY(x, y) for three cases]
• Independent (zero mean): if X & Y are independent, the contour ellipses are aligned with the x and y axes.
• Independent (non-zero mean): different slices give the same normalized curves.
• Dependent: different slices give different normalized curves.
14
An “Independent RV” Result

RV’s X & Y are independent if:

p_XY(x, y) = p_X(x) p_Y(y)

Here’s why:

p_{Y|X}(y | x) = p_XY(x, y) / p_X(x) = p_X(x) p_Y(y) / p_X(x) = p_Y(y)

15
Characterizing RVs
 PDF tells everything about an RV
– but sometimes it is more than we need (or know)
 So… we make do with a few Characteristics
– Mean of an RV (Describes the centroid of PDF)
– Variance of an RV (Describes the spread of PDF)
– Correlation of RVs (Describes “tilt” of joint PDF)

Mean = Average = Expected Value

Symbolically: E{X}

16
Motivating Idea of Mean of RV
Motivation First w/ “Data Analysis View”
Consider RV X = score on a test.   Data: x_1, x_2, …, x_N

Possible values of RV X: V_0 = 0, V_1 = 1, V_2 = 2, …, V_100 = 100

Test average = x̄ = (1/N) Σ_{i=1}^{N} x_i = (N_0 V_0 + N_1 V_1 + … + N_100 V_100) / N = Σ_{i=0}^{100} V_i (N_i / N)

where N_i = # of scores of value V_i,   N = Σ_i N_i (total # of scores),   and N_i / N ≈ P(X = V_i)

This is called the Data Analysis View (Statistics)…
but it motivates the Data Modeling View (Probability).
17
Theoretical View of Mean
The Data Analysis View leads to the Probability Theory (Data Modeling) view:
• For Discrete Random Variables:

E{X} = Σ_{i=1}^{n} x_i P_X(x_i)        (P_X is the probability function)
• This motivates the form for a Continuous RV:

E{X} = ∫_{−∞}^{∞} x p_X(x) dx        (p_X is the probability density function)

Notation: E{X} = X̄   (shorthand notation)
18


Aside: Probability vs. Statistics
Probability Theory:
» Given a PDF model
» Describes how the data will likely behave
» E{X} = ∫_{−∞}^{∞} x p_X(x) dx   (x is a dummy variable – there is no DATA here!!!)
» The PDF models how the data will likely behave

Statistics:
» Given a set of data
» Determines how the data did behave
» Avg = (1/N) Σ_{i=1}^{N} x_i   (there is no PDF here!!!)
» The statistic measures how the data did behave

The two views are linked by the “Law of Large Numbers”.

19
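An added sketch of the two views: draw data from an assumed PDF model (here N(m, σ²)) and compare the statistic (the sample average) against the probability-theory quantity E{X}; the Law of Large Numbers says they agree as N grows:

```python
import numpy as np

rng = np.random.default_rng(1)

# Probability-theory side: assume X ~ N(m, sigma^2), so E{X} = m
m, sigma = 3.0, 2.0

# Statistics side: finite data records drawn from that model
for N in (10, 1_000, 100_000):
    data = rng.normal(m, sigma, size=N)
    avg = data.mean()                  # Avg = (1/N) * sum of x_i
    print(f"N = {N:>6d}   sample average = {avg:.4f}   E{{X}} = {m}")
```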
Variance of RV
There are similar Data vs. Theory Views here…
But let’s go right to the theory!!

Variance: characterizes how much you expect the RV to deviate around the mean
Variance:   σ² = E{(X − m_X)²} = ∫ (x − m_X)² p_X(x) dx

Note: if zero mean…   σ² = E{X²} = ∫ x² p_X(x) dx
20
Motivating Idea of Correlation
Motivate First w/ Data Analysis View
Consider a random experiment that observes the
outcomes of two RVs:
Example: 2 RVs X and Y representing height and weight, respectively

[Figure: scatter plot of (x, y) data – the two RVs are positively correlated]
21
Illustrating 3 Main Types of Correlation
Data Analysis View:   C_xy = (1/N) Σ_{i=1}^{N} (x_i − x̄)(y_i − ȳ)

[Figure: three scatter plots of (x − x̄) vs. (y − ȳ)]
• Positive correlation – “best friends” (e.g., GPA & starting salary)
• Zero correlation, i.e. uncorrelated – “complete strangers” (e.g., height & $ in pocket)
• Negative correlation – “worst enemies” (e.g., student loans & parents’ salary)
22
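A short added example of the data-analysis view: generate assumed positively related data and compute C_xy = (1/N) Σ (x_i − x̄)(y_i − ȳ) directly:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 10_000

# Assumed positively related data: y depends on x plus independent noise
x = rng.normal(0.0, 1.0, size=N)
y = 0.8 * x + rng.normal(0.0, 0.5, size=N)

# Data-analysis view: C_xy = (1/N) * sum_i (x_i - xbar)(y_i - ybar)
C_xy = np.mean((x - x.mean()) * (y - y.mean()))
print("sample covariance C_xy:", C_xy)   # ~0.8, i.e. positive correlation
```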
Prob. Theory View of Correlation
To capture this, define Covariance :

σ_XY = E{(X − X̄)(Y − Ȳ)} = ∫∫ (x − X̄)(y − Ȳ) p_XY(x, y) dx dy

If the RVs are both zero-mean:   σ_XY = E{XY}

If X = Y:   σ_XY = σ_X² = σ_Y²

If X & Y are independent, then:   σ_XY = 0


23
If σ_XY = E{(X − X̄)(Y − Ȳ)} = 0…
then we say that X and Y are “uncorrelated”.

If σ_XY = 0, then E{XY} = X̄ Ȳ
(E{XY} is called the “correlation of X & Y”.)

So… RVs X and Y are said to be uncorrelated if σ_XY = 0,
or equivalently… if E{XY} = E{X}E{Y}.
24
Independence vs. Uncorrelated

X & Y are Implies X & Y are


Independent Uncorrelated
f XY ( x, y ) E{ XY }

= f X ( x ) fY ( y ) = E{ X }E{Y }

PDFs Separate Means Separate


Uncorrelated

Independence

INDEPENDENCE IS A STRONGER CONDITION !!!!


25
Confusing Covariance and
Correlation Terminology
Covariance:   σ_XY = E{(X − X̄)(Y − Ȳ)}

Correlation:   E{XY}   (same as the covariance if zero mean)

Correlation Coefficient:   ρ_XY = σ_XY / (σ_X σ_Y),   with −1 ≤ ρ_XY ≤ 1
26
Covariance and Correlation For
Random Vectors…
x = [X_1 X_2 … X_N]^T

Correlation Matrix:

R_x = E{x x^T} =
[ E{X_1 X_1}   E{X_1 X_2}   …   E{X_1 X_N}
  E{X_2 X_1}   E{X_2 X_2}   …   E{X_2 X_N}
      ⋮             ⋮         ⋱        ⋮
  E{X_N X_1}   E{X_N X_2}   …   E{X_N X_N} ]

Covariance Matrix :

C_x = E{(x − x̄)(x − x̄)^T}
27
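An added sketch for the vector case: estimate R_x = E{x x^T} and C_x = E{(x − x̄)(x − x̄)^T} from Monte Carlo samples of an assumed random vector (the mean and mixing matrix below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
N_samples, N_dim = 50_000, 3

# Assumed random vector: nonzero mean plus correlated Gaussian components
mean = np.array([1.0, -2.0, 0.5])
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
samples = mean + rng.normal(size=(N_samples, N_dim)) @ A.T   # each row is one x^T

# Correlation matrix R_x = E{x x^T}  (sample estimate)
R_x = samples.T @ samples / N_samples

# Covariance matrix C_x = E{(x - xbar)(x - xbar)^T}  (sample estimate)
centered = samples - samples.mean(axis=0)
C_x = centered.T @ centered / N_samples

print("R_x =\n", R_x)
print("C_x =\n", C_x)   # note R_x = C_x + mean * mean^T
```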
A Few Properties of Expected Value
E{X + Y} = E{X} + E{Y}        E{aX} = a E{X}        E{f(X)} = ∫ f(x) p_X(x) dx

var{X + Y} = σ_X² + σ_Y² + 2σ_XY
           = σ_X² + σ_Y²,  if X & Y are uncorrelated

var{aX} = a² σ_X²        var{a + X} = σ_X²

Derivation of the var{X + Y} result (with X_z = X − X̄ and Y_z = Y − Ȳ):

var{X + Y} = E{(X + Y − X̄ − Ȳ)²}
           = E{(X_z + Y_z)²}
           = E{X_z²} + E{Y_z²} + 2 E{X_z Y_z}
           = σ_X² + σ_Y² + 2σ_XY
28
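A quick added Monte Carlo check of the var{X + Y} identity above, using an assumed correlated pair of RVs:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200_000

# Assumed correlated pair: Y shares a component with X
X = rng.normal(0.0, 1.0, size=N)
Y = 0.6 * X + rng.normal(0.0, 1.0, size=N)

var_sum = np.var(X + Y)
var_X   = np.var(X)
var_Y   = np.var(Y)
cov_XY  = np.mean((X - X.mean()) * (Y - Y.mean()))

print("var{X+Y}                 :", var_sum)
print("var{X}+var{Y}+2*cov{X,Y} :", var_X + var_Y + 2 * cov_XY)   # should match
```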
Joint PDF for Gaussian
Let x = [X_1 X_2 … X_N]^T be a vector of random variables. These random variables
are said to be jointly Gaussian if they have the following PDF

1  1 
p(x ) = exp − ( x − μ x )T C −x1 ( x − μ x ) 
 2 
N
( 2π ) 2 det(C x )

where µx is the mean vector and Cx is the covariance matrix:


μ_x = E{x}        C_x = E{(x − μ_x)(x − μ_x)^T}

For the case of two jointly Gaussian RVs X_1 and X_2 with

E{X_i} = µ_i,   var{X_i} = σ_i²,   E{(X_1 − µ_1)(X_2 − µ_2)} = σ_12,   ρ = σ_12 / (σ_1 σ_2)
Then…
p(x_1, x_2) = (1 / (2π σ_1 σ_2 √(1 − ρ²))) exp{ −(1 / (2(1 − ρ²))) [ (x_1 − µ_1)²/σ_1² − 2ρ (x_1 − µ_1)(x_2 − µ_2)/(σ_1 σ_2) + (x_2 − µ_2)²/σ_2² ] }

It is easy to verify that X1 and X2 are uncorrelated (and independent!) if ρ = 0


29
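An added sketch (assumed parameter values) that evaluates the jointly Gaussian PDF two ways for N = 2: once with the general vector formula and once with the explicit two-RV formula; the two results should agree:

```python
import numpy as np

def gaussian_vec_pdf(x, mu, C):
    """General jointly Gaussian PDF p(x) for a length-N vector x."""
    N = len(mu)
    d = x - mu
    norm = (2.0 * np.pi) ** (N / 2.0) * np.sqrt(np.linalg.det(C))
    return np.exp(-0.5 * d @ np.linalg.solve(C, d)) / norm

# Assumed parameters for two jointly Gaussian RVs
mu1, mu2 = 1.0, -1.0
s1, s2, rho = 2.0, 1.0, 0.6
C = np.array([[s1**2,         rho * s1 * s2],
              [rho * s1 * s2, s2**2        ]])

x1, x2 = 0.5, 0.2

# Explicit two-variable formula from the slide
q = ((x1 - mu1)**2 / s1**2
     - 2 * rho * (x1 - mu1) * (x2 - mu2) / (s1 * s2)
     + (x2 - mu2)**2 / s2**2)
p_2var = np.exp(-q / (2 * (1 - rho**2))) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))

print(p_2var, gaussian_vec_pdf(np.array([x1, x2]), np.array([mu1, mu2]), C))
```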
Linear Transform of Jointly Gaussian RVs
Let x = [X1 X2 … XN]T be a vector of jointly Gaussian random variables with
mean vector µx and covariance matrix Cx…

Then the linear transform y = Ax + b is also jointly Gaussian with

μ_y = E{y} = A μ_x + b

C_y = E{(y − μ_y)(y − μ_y)^T} = A C_x A^T

A special case of this is the sum of jointly Gaussian RVs… which can be
handled using A = [1 1 1 … 1]

30
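An added Monte Carlo sketch of the linear-transform result, with assumed A, b, μ_x, and C_x: the sample mean and covariance of y = Ax + b should match A μ_x + b and A C_x A^T:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 100_000

# Assumed jointly Gaussian x with mean mu_x and covariance C_x
mu_x = np.array([1.0, 2.0])
C_x = np.array([[2.0, 0.5],
                [0.5, 1.0]])
x = rng.multivariate_normal(mu_x, C_x, size=N)        # each row is one sample of x^T

# Assumed linear transform y = A x + b
A = np.array([[1.0,  1.0],     # first row sums the components (the "sum" special case)
              [2.0, -1.0]])
b = np.array([0.0, 3.0])
y = x @ A.T + b

print("sample mean of y :", y.mean(axis=0), "  theory:", A @ mu_x + b)
print("sample cov of y  :\n", np.cov(y.T), "\n theory:\n", A @ C_x @ A.T)
```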
Moments of Gaussian RVs
Let X be zero-mean Gaussian with variance σ²

Then the moments E{X^k} are as follows:

E{X^k} = 1 · 3 · 5 ··· (k − 1) σ^k,   k even
E{X^k} = 0,   k odd

Let X_1, X_2, X_3, X_4 be any four jointly Gaussian random variables with zero mean

Then…

E{X_1 X_2 X_3 X_4} = E{X_1 X_2}E{X_3 X_4} + E{X_1 X_3}E{X_2 X_4} + E{X_1 X_4}E{X_2 X_3}

Note that this can be applied to find E{X²Y²} if X and Y are jointly Gaussian

31
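An added Monte Carlo sketch (with an assumed σ and an assumed construction of a correlated Y) checking the even-moment formula and the fourth-moment factoring rule applied to E{X²Y²}:

```python
import numpy as np

rng = np.random.default_rng(6)
N = 2_000_000
sigma = 1.5      # assumed standard deviation

X = rng.normal(0.0, sigma, size=N)

# Even moments of a zero-mean Gaussian: E{X^4} = 1*3*sigma^4
print("E{X^4}: sample", np.mean(X**4), "  theory", 3 * sigma**4)

# Fourth-moment factoring rule for zero-mean jointly Gaussian RVs,
# applied to E{X^2 Y^2} with an assumed correlated Y
Y = 0.7 * X + rng.normal(0.0, 1.0, size=N)
exy = np.mean(X * Y)
lhs = np.mean(X**2 * Y**2)
rhs = np.mean(X**2) * np.mean(Y**2) + 2 * exy**2   # E{XX}E{YY} + 2 E{XY}^2
print("E{X^2 Y^2}: sample", lhs, "  factored", rhs)
```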
Chi-Squared Distribution
Let X_1, X_2, …, X_N be a set of zero-mean, independent, jointly Gaussian random
variables each with unit variance.

Then the RV Y = X_1² + X_2² + … + X_N² is called a chi-squared (χ²) RV of N
degrees of freedom and has PDF given by

p(y) = (1 / (2^{N/2} Γ(N/2))) y^{(N/2) − 1} e^{−y/2},   y ≥ 0
p(y) = 0,   y < 0

For this RV we have that:

E{Y} = N and var{Y} = 2N

32
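A final added sketch: build Y as a sum of N squared zero-mean, unit-variance Gaussians and check E{Y} = N and var{Y} = 2N (N and the trial count are assumed values):

```python
import numpy as np

rng = np.random.default_rng(7)
N_dof = 5            # degrees of freedom (assumed value)
N_trials = 500_000

# Y = X1^2 + ... + XN^2 with the Xi zero-mean, unit-variance, independent Gaussians
X = rng.normal(0.0, 1.0, size=(N_trials, N_dof))
Y = np.sum(X**2, axis=1)

print("sample mean of Y:", Y.mean(), "  theory N  =", N_dof)
print("sample var  of Y:", Y.var(),  "  theory 2N =", 2 * N_dof)
```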
