EECE 522 Notes - 01a Review of Probability - Rev - 1
Random Variable
Definition
Numerical characterization of the outcome of a random
event
Examples
1) Number showing on a rolled die
2) Temperature at a specified time of day
3) Stock market index value at the close
4) Height of a wheel going over a rocky road
Random Variable
Non-examples (but we can make these into RVs):
1) "Heads" or "Tails" on a coin
2) Red or black ball from an urn
Two Types of Random Variables
Discrete RV:
• Die
• Stocks
Continuous RV:
• Temperature
• Wheel height
PDF for Continuous RV
Given Continuous RV X…
What is the probability that X = x0 ?
Oddity: P(X = x_0) = 0
(otherwise, summing nonzero probabilities over the uncountably many points would give infinity, not 1)
So we need to think in terms of the Probability Density Function (PDF):

P(x_0 < X < x_0 + \Delta) = \int_{x_0}^{x_0 + \Delta} p_X(x)\, dx   (the area under p_X(x) between x_0 and x_0 + \Delta)
Most Commonly Used PDF: Gaussian
General case (mean m, variance \sigma^2):

p_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - m)^2 / (2\sigma^2)}

Zero-mean case (m = 0):

p_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2 / (2\sigma^2)}
Effect of Variance on Gaussian PDF
(Figure: Gaussian PDF p_X(x) centered at x = m; the area within ±1σ of the mean is 0.683, i.e., 68.3%. Companion plots: a small σ gives a narrow PDF (small variability, small uncertainty), while a large σ gives a wide PDF (large variability, large uncertainty).)
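A quick numerical check of the 68.3% figure (a minimal sketch assuming NumPy; the values m = 0 and σ = 2 are arbitrary illustrative choices):

import numpy as np

# Gaussian PDF evaluated on a grid (m = mean, sigma = std dev).
m, sigma = 0.0, 2.0
x = np.linspace(m - 6 * sigma, m + 6 * sigma, 200001)
dx = x[1] - x[0]
pdf = np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

# Riemann-sum approximations of areas under the PDF.
total_area = pdf.sum() * dx                    # should be ~1
mask = np.abs(x - m) <= sigma                  # within +/- 1 sigma of the mean
one_sigma_area = pdf[mask].sum() * dx          # should be ~0.683

print(f"total area          = {total_area:.4f}")
print(f"area within 1 sigma = {one_sigma_area:.4f}")  # ~0.683 for any sigma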
Why Is Gaussian Used?
Central Limit Theorem (CLT):
The (suitably normalized) sum of N independent RVs has a PDF
that tends to a Gaussian as N → ∞
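A small simulation makes the CLT concrete (a sketch assuming NumPy; the uniform distribution and N = 30 are arbitrary choices, not anything prescribed by the theorem):

import numpy as np

# Sum N independent uniform RVs; the standardized sum behaves like a
# standard Gaussian even for moderate N.
rng = np.random.default_rng(0)
N, trials = 30, 100_000
sums = rng.uniform(0.0, 1.0, size=(trials, N)).sum(axis=1)

# Standardize: uniform(0,1) has mean 1/2 and variance 1/12.
z = (sums - N * 0.5) / np.sqrt(N / 12.0)
print(f"sample mean = {z.mean():.3f}, sample var = {z.var():.3f}")  # ~0, ~1

# Empirical tail probability vs. the Gaussian value P(Z > 1) ~ 0.159.
print(f"P(Z > 1) ~ {np.mean(z > 1.0):.3f}")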
Joint PDF of RVs X and Y: p_{XY}(x, y)
Describes probabilities of joint events concerning X and Y. For
example, the probability that X lies in the interval [a, b] and Y lies in
the interval [c, d] is given by:

\Pr\{(a < X < b) \text{ and } (c < Y < d)\} = \int_a^b \int_c^d p_{XY}(x, y)\, dy\, dx
Conditional PDF ("slice and normalize"): hold y fixed, take that slice of the joint PDF, and normalize by the marginal:

p_{X|Y=y}(x \mid y) = \frac{p_{XY}(x, y)}{p_Y(y)}

(When X and Y are independent this equals the marginal p_X(x); holding x fixed instead gives the analogous p_{Y|X=x}(y \mid x).)
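"Slice and normalize" can be seen numerically with a short sketch (assuming NumPy; the independent-Gaussian joint PDF and the slice at y0 = 1 are purely illustrative). Because the demo RVs are independent, the normalized slice should match the marginal p_X(x):

import numpy as np

# Fix y = y0 in a joint PDF, then divide the slice by its area to get the
# conditional PDF p_{X|Y=y0}(x | y0).
x = np.linspace(-5, 5, 2001)
dx = x[1] - x[0]
y0 = 1.0

def gauss(u, m, s):
    return np.exp(-(u - m) ** 2 / (2 * s ** 2)) / (s * np.sqrt(2 * np.pi))

p_joint_slice = gauss(x, 0.0, 1.0) * gauss(y0, 0.0, 2.0)  # p_XY(x, y0)
p_cond = p_joint_slice / (p_joint_slice.sum() * dx)       # normalize the slice

# For independent X and Y the conditional equals the marginal p_X(x).
print(np.max(np.abs(p_cond - gauss(x, 0.0, 1.0))))        # ~0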
Independent and Dependent Gaussian PDFs
(Figure: contours of p_XY(x, y) for two cases. Independent case (non-zero mean): different slices give the same normalized curves. Dependent case: different slices give different normalized curves.)
An “Independent RV” Result
If X and Y are independent, then

p_{XY}(x, y) = p_X(x)\, p_Y(y)

Here's why the conditional then reduces to the marginal:

p_{Y|X=x}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)} = \frac{p_X(x)\, p_Y(y)}{p_X(x)} = p_Y(y)
Characterizing RVs
The PDF tells everything about an RV
– but sometimes it is "more than we need/know"
So... we make do with a few characteristics:
– Mean of an RV (describes the centroid of the PDF)
– Variance of an RV (describes the spread of the PDF)
– Correlation of RVs (describes the "tilt" of the joint PDF)
Symbolically, expectation is written E{X}.
Motivating Idea of Mean of RV
Motivate first with the "Data Analysis View."
Consider the RV X = score on a test, with data x_1, x_2, ..., x_N.

\text{Test Average} = \bar{x} = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{N_0 V_0 + N_1 V_1 + \cdots + N_{100} V_{100}}{N} = \sum_{i=0}^{100} V_i \frac{N_i}{N}

where N_i = # of scores having value V_i, and N = \sum_i N_i (total # of scores).
The fraction N_i / N \approx P(X = V_i): the left side is Statistics, the right side is Probability.
This is called the Data Analysis View, but it motivates the Data Modeling View.
Theoretical View of Mean
The Data Analysis View leads to Probability Theory (Data Modeling).

For discrete random variables:

E\{X\} = \sum_{i=1}^{n} x_i P_X(x_i) \qquad (P_X \text{ is the probability function})

This motivates the form for a continuous RV:

E\{X\} = \int_{-\infty}^{\infty} x\, p_X(x)\, dx \qquad (p_X \text{ is the probability density function})
The theory and data views are tied together by the "Law of Large Numbers":

E\{X\} = \int_{-\infty}^{\infty} x\, p_X(x)\, dx \;\approx\; \frac{1}{N} \sum_{i=1}^{N} x_i = \text{Avg}

(PDF on the left, data on the right; x on the left is just a dummy variable of integration.)
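A sketch of that link (assuming NumPy; the exponential distribution with E{X} = 2 is an arbitrary choice): the running sample average approaches E{X} as N grows:

import numpy as np

# Law of Large Numbers: the sample average converges to E{X}.
rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=1_000_000)  # E{X} = 2 by construction

for n in (10, 1_000, 1_000_000):
    print(f"N = {n:>9}: average = {x[:n].mean():.4f}")  # -> 2.0 as N grows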
Variance of RV
There are similar Data vs. Theory views here...
But let's go right to the theory!!

\sigma_X^2 = \text{var}\{X\} = E\{(X - m_X)^2\} = \int_{-\infty}^{\infty} (x - m_X)^2\, p_X(x)\, dx

For a zero-mean RV (m_X = 0) this reduces to

\sigma_X^2 = \int_{-\infty}^{\infty} x^2\, p_X(x)\, dx = E\{X^2\}
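Numerically, the defining integral can be approximated on a grid (a sketch assuming NumPy; the Gaussian with m = 1 and σ = 1.5 is an illustrative choice, so the answer should be near σ² = 2.25):

import numpy as np

# Variance via its defining integral, evaluated on a grid.
m, sigma = 1.0, 1.5
x = np.linspace(m - 8 * sigma, m + 8 * sigma, 200001)
dx = x[1] - x[0]
pdf = np.exp(-(x - m) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

mean = (x * pdf).sum() * dx
var = ((x - mean) ** 2 * pdf).sum() * dx
print(f"mean = {mean:.4f}, variance = {var:.4f}")  # ~1.0 and ~2.25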
Motivating Idea of Correlation
Motivate first with the Data Analysis View.
Consider a random experiment that observes the
outcomes of two RVs.
Example: two RVs X and Y representing height and weight, respectively.

(Figure: scatter plot of (x, y) data showing a positively correlated cloud.)
Illustrating 3 Main Types of Correlation
(Figure: scatter plots illustrating the three main cases: positive correlation, negative correlation, and (approximately) zero correlation.)

Data Analysis View (sample covariance):

C_{xy} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})

Theory (covariance of X and Y):

\sigma_{XY} = E\{(X - \bar{X})(Y - \bar{Y})\} = \int\!\!\int (x - \bar{X})(y - \bar{Y})\, p_{XY}(x, y)\, dx\, dy

If X = Y: \sigma_{XY} = \sigma_X^2 = \sigma_Y^2

If \sigma_{XY} = E\{(X - \bar{X})(Y - \bar{Y})\} = 0, then E\{XY\} = \bar{X}\,\bar{Y}.
E\{XY\} is called the "correlation of X and Y."
Independence implies this: p_{XY}(x, y) = p_X(x)\, p_Y(y) gives E\{XY\} = E\{X\} E\{Y\}.

Correlation Coefficient:

\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}, \qquad -1 \le \rho_{XY} \le 1
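A data-analysis sketch of these quantities (assuming NumPy; the 0.8/0.6 mixing below is just an arbitrary way to build correlation into the data):

import numpy as np

# Sample covariance and correlation coefficient for positively correlated
# data (height/weight style example).
rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 10_000)
y = 0.8 * x + 0.6 * rng.normal(0.0, 1.0, 10_000)  # corr(x, y) ~ 0.8 by design

c_xy = np.mean((x - x.mean()) * (y - y.mean()))   # sample covariance C_xy
rho = c_xy / (x.std() * y.std())                  # sample correlation coeff.
print(f"C_xy = {c_xy:.3f}, rho = {rho:.3f}")      # rho must lie in [-1, 1]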
Covariance and Correlation for Random Vectors

\mathbf{x} = [X_1\; X_2\; \cdots\; X_N]^T

Correlation Matrix:

\mathbf{R}_x = E\{\mathbf{x}\mathbf{x}^T\} =
\begin{bmatrix}
E\{X_1 X_1\} & E\{X_1 X_2\} & \cdots & E\{X_1 X_N\} \\
E\{X_2 X_1\} & E\{X_2 X_2\} & \cdots & E\{X_2 X_N\} \\
\vdots & & \ddots & \vdots \\
E\{X_N X_1\} & E\{X_N X_2\} & \cdots & E\{X_N X_N\}
\end{bmatrix}

Covariance Matrix:

\mathbf{C}_x = E\{(\mathbf{x} - \bar{\mathbf{x}})(\mathbf{x} - \bar{\mathbf{x}})^T\}
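Both matrices can be estimated from samples (a sketch assuming NumPy; the 3-dimensional mixing matrix A and the mean vector are illustrative):

import numpy as np

# Estimate R_x = E{x x^T} and C_x = E{(x - mean)(x - mean)^T} from samples.
rng = np.random.default_rng(3)
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
samples = rng.normal(size=(100_000, 3)) @ A.T + np.array([1.0, 2.0, 3.0])

R = samples.T @ samples / samples.shape[0]                 # correlation matrix
mu = samples.mean(axis=0)
C = (samples - mu).T @ (samples - mu) / samples.shape[0]   # covariance matrix
print(np.round(R, 2))
print(np.round(C, 2))                                      # ~ A A^T here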
A Few Properties of Expected Value
E\{X + Y\} = E\{X\} + E\{Y\}, \qquad E\{aX\} = a\,E\{X\}, \qquad E\{f(X)\} = \int f(x)\, p_X(x)\, dx

\text{var}\{X + Y\} = \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY} \quad (= \sigma_X^2 + \sigma_Y^2 \text{ if } X \text{ and } Y \text{ are uncorrelated})

\text{var}\{aX\} = a^2 \sigma_X^2, \qquad \text{var}\{a + X\} = \sigma_X^2

Derivation of the sum rule, writing X_z = X - \bar{X} and Y_z = Y - \bar{Y} for the zero-mean versions:

\text{var}\{X + Y\} = E\{(X + Y - \bar{X} - \bar{Y})^2\}
= E\{(X_z + Y_z)^2\}
= E\{X_z^2\} + E\{Y_z^2\} + 2\,E\{X_z Y_z\}
= \sigma_X^2 + \sigma_Y^2 + 2\sigma_{XY}
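A Monte Carlo check of the sum rule (a sketch assuming NumPy; the 0.5 coupling between X and Y is an arbitrary choice):

import numpy as np

# Verify var{X+Y} = var{X} + var{Y} + 2*sigma_XY for correlated X and Y.
rng = np.random.default_rng(4)
x = rng.normal(0.0, 1.0, 1_000_000)
y = 0.5 * x + rng.normal(0.0, 1.0, 1_000_000)     # cov(x, y) ~ 0.5

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.mean((x - x.mean()) * (y - y.mean()))
print(f"var(x+y) = {lhs:.3f}, identity gives {rhs:.3f}")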
Joint PDF for Gaussian
Let \mathbf{x} = [X_1\; X_2\; \cdots\; X_N]^T be a vector of random variables. These random variables
are said to be jointly Gaussian if they have the following PDF:

p(\mathbf{x}) = \frac{1}{(2\pi)^{N/2} \sqrt{\det(\mathbf{C}_x)}} \exp\!\left[-\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_x)^T \mathbf{C}_x^{-1} (\mathbf{x} - \boldsymbol{\mu}_x)\right]

A linear transform \mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{b} of a jointly Gaussian vector is again jointly Gaussian, with

\boldsymbol{\mu}_y = E\{\mathbf{y}\} = \mathbf{A}\boldsymbol{\mu}_x + \mathbf{b}
\qquad
\mathbf{C}_y = E\{(\mathbf{y} - \boldsymbol{\mu}_y)(\mathbf{y} - \boldsymbol{\mu}_y)^T\} = \mathbf{A}\mathbf{C}_x\mathbf{A}^T

A special case of this is the sum of jointly Gaussian RVs, which can be
handled using \mathbf{A} = [1\; 1\; \cdots\; 1].
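A sketch verifying the transform rules (assuming NumPy; the particular μ_x and C_x are illustrative), using A = [1 1 1] so that y is the sum of the three RVs:

import numpy as np

# Check mu_y = A mu_x + b and C_y = A C_x A^T for y = A x + b.
rng = np.random.default_rng(5)
mu_x = np.array([1.0, -2.0, 0.5])
C_x = np.array([[2.0, 0.5, 0.0],
                [0.5, 1.0, 0.3],
                [0.0, 0.3, 1.5]])
x = rng.multivariate_normal(mu_x, C_x, size=500_000)

A = np.ones((1, 3))                               # sum of the three RVs
b = np.zeros(1)
y = x @ A.T + b

print(f"mean : sample = {y.mean():.3f}, theory = {(A @ mu_x + b)[0]:.3f}")
print(f"var  : sample = {y.var():.3f}, theory = {(A @ C_x @ A.T)[0, 0]:.3f}")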
Moments of Gaussian RVs
Let X be a zero-mean Gaussian RV with variance \sigma^2. Then its odd moments vanish and

E\{X^2\} = \sigma^2, \qquad E\{X^4\} = 3\sigma^4

Let X_1, X_2, X_3, X_4 be any four jointly Gaussian random variables with zero mean.
Then the fourth moment factors into second moments:

E\{X_1 X_2 X_3 X_4\} = E\{X_1 X_2\} E\{X_3 X_4\} + E\{X_1 X_3\} E\{X_2 X_4\} + E\{X_1 X_4\} E\{X_2 X_3\}

Note that this can be applied to find E\{X^2 Y^2\} if X and Y are jointly Gaussian.
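A Monte Carlo sanity check of these moment results (a sketch assuming NumPy; σ = 1.5 and the 0.6/0.8 coupling are arbitrary choices):

import numpy as np

# Zero-mean Gaussian: E{X^2} = sigma^2 and E{X^4} = 3 sigma^4.
rng = np.random.default_rng(6)
sigma = 1.5
x = rng.normal(0.0, sigma, 2_000_000)
print(f"E[X^2] = {np.mean(x**2):.3f}  (sigma^2   = {sigma**2:.3f})")
print(f"E[X^4] = {np.mean(x**4):.3f} (3 sigma^4 = {3 * sigma**4:.3f})")

# Jointly Gaussian zero-mean X, Y: factoring gives
# E{X^2 Y^2} = E{X^2} E{Y^2} + 2 E{XY}^2.
y = 0.6 * x + 0.8 * rng.normal(0.0, sigma, 2_000_000)
lhs = np.mean(x**2 * y**2)
rhs = np.mean(x**2) * np.mean(y**2) + 2 * np.mean(x * y) ** 2
print(f"E[X^2 Y^2] = {lhs:.3f}, factored form = {rhs:.3f}")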
Chi-Squared Distribution
Let X_1, X_2, \ldots, X_N be a set of zero-mean, independent, jointly Gaussian random
variables, each with unit variance. Then Y = \sum_{i=1}^{N} X_i^2 is chi-squared
distributed with N degrees of freedom:

p(y) = \begin{cases} \dfrac{1}{2^{N/2}\, \Gamma(N/2)}\, y^{(N/2)-1} e^{-y/2}, & y \ge 0 \\[4pt] 0, & y < 0 \end{cases}

For this RV we have that E\{Y\} = N and \text{var}\{Y\} = 2N.
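A sketch confirming the distribution and its moments (assuming NumPy and the standard-library math module; N = 5 and the evaluation point y = 4 are arbitrary choices):

import math
import numpy as np

# Y = sum of squares of N unit-variance Gaussians is chi-squared with N
# degrees of freedom; check E{Y} = N and var{Y} = 2N.
rng = np.random.default_rng(7)
N = 5
y = (rng.normal(0.0, 1.0, size=(1_000_000, N)) ** 2).sum(axis=1)
print(f"E[Y] = {y.mean():.3f} (theory {N}), var[Y] = {y.var():.3f} (theory {2 * N})")

def chi2_pdf(v, n):
    return v ** (n / 2 - 1) * math.exp(-v / 2) / (2 ** (n / 2) * math.gamma(n / 2))

# Compare the formula with an empirical histogram density near y = 4.
emp = np.mean(np.abs(y - 4.0) < 0.05) / 0.1
print(f"pdf at y = 4: formula = {chi2_pdf(4.0, N):.4f}, empirical ~ {emp:.4f}")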