CL202: Introduction To Data Analysis: MB+SCP
mbhushan,[email protected]
Spring 2015
Today’s Lecture:
Probability (Chapter 3) completed.
Chapter 4 of textbook.
Random variables.
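A random variable is a function from the sample space to the real numbers: $X : S \to \mathbb{R}$.
The sequence HHTH of coin tosses is not itself a random variable; it must first be mapped to a number (for example, the number of heads).
A random variable is denoted by an uppercase letter such as $X$.
When an experiment is performed, the value taken by the random variable
is denoted by a lowercase letter such as $x$ (e.g. $x = 6$ feet).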
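To make the mapping idea concrete, here is a minimal sketch in Python. The choice of "number of heads" as the mapping, and the names `number_of_heads` and `sample_space`, are illustrative assumptions rather than anything fixed by the lecture.

```python
from itertools import product


def number_of_heads(outcome: str) -> int:
    """A random variable X: maps an outcome such as 'HHTH' to a real number."""
    return outcome.count("H")


# Sample space S for four coin tosses: all 16 strings over {H, T}.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=4)]

# Lowercase x is the value X takes for one observed outcome.
x = number_of_heads("HHTH")
print(x)  # 3
```

Any other numerical mapping of the outcomes (for instance, the position of the first tail) would define a different random variable on the same sample space.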
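The PDF $f$ of $X$ is large where $X$ is likely to take values and small where it is unlikely.
For a continuous random variable, any single value has probability zero:
\[ P(X = a) = \int_a^a f(x)\,dx = 0. \]
For an interval of small width $\epsilon$ around $a$,
\[ P\{a - \epsilon/2 \le X \le a + \epsilon/2\} = \int_{a-\epsilon/2}^{a+\epsilon/2} f(x)\,dx \approx \epsilon f(a) \]
for small $\epsilon$,
i.e. $f(a)$ is a measure of how likely it is that the random variable will take
values in a small neighbourhood of $a$.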
\[ F(a) = P\{X \in (-\infty, a]\} = \int_{-\infty}^{a} f(x)\,dx \]
\[ f(x) = \begin{cases} C(4x - 2x^2), & 0 < x < 2 \\ 0, & \text{otherwise} \end{cases} \]
Then,
(i) $C = 3/8$, from $\int_0^2 f(x)\,dx = 1$.
(ii) $P\{X > 1\} = \int_1^\infty f(x)\,dx = 1/2$.
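The two results can be checked with exact rational arithmetic. The sketch below assumes the hand-derived antiderivative $2x^2 - \tfrac{2}{3}x^3$ of $4x - 2x^2$; the names `antiderivative`, `C` and `p_gt_1` are illustrative.

```python
from fractions import Fraction


def antiderivative(x):
    """Antiderivative of 4x - 2x^2, namely 2x^2 - (2/3)x^3, in exact arithmetic."""
    x = Fraction(x)
    return 2 * x**2 - Fraction(2, 3) * x**3


# (i) Normalisation: C * integral_0^2 (4x - 2x^2) dx = 1.
C = 1 / (antiderivative(2) - antiderivative(0))

# (ii) The density is zero for x >= 2, so the integral from 1 to infinity stops at 2.
p_gt_1 = C * (antiderivative(2) - antiderivative(1))

print(C, p_gt_1)  # 3/8 1/2
```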
Joint cumulative distribution function of $X$ and $Y$:
\[ F(x, y) = P\{X \le x, Y \le y\} \]
From it, the probability of any statement concerning the values of $X$ and $Y$ can be computed.
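Given two discrete random variables $X$ and $Y$ in the same experiment, the
joint PMF of $X$ and $Y$ is
\[ p(x_i, y_j) = P(X = x_i, Y = y_j). \]
The marginal PMFs follow from the joint PMF. Formally,
\[ \{X = x_i\} = \bigcup_j \{X = x_i, Y = y_j\}, \]
and since the events on the right are mutually exclusive,
\[ p_X(x_i) = P\{X = x_i\} = \sum_j p(x_i, y_j). \]
Similarly,
\[ p_Y(y_j) = P\{Y = y_j\} = \sum_i p(x_i, y_j). \]
With three random variables $X$, $Y$, $Z$:
\[ p_X(x_i) = P(X = x_i) = \sum_j \sum_k p_{X,Y,Z}(x_i, y_j, z_k). \]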
3 batteries are randomly chosen from a group of 3 new, 4 used but still working,
and 5 defective batteries. Let $X$ and $Y$ denote the numbers of new and of used but
still working batteries chosen, respectively. Find
$p(x_i, y_j) = P\{X = x_i, Y = y_j\}$.
Solution: Let $T = \binom{12}{3} = 220$.
$p(0, 0) = \binom{5}{3} / T$
$p(0, 1) = \binom{4}{1}\binom{5}{2} / T$
$p(0, 2) = \binom{4}{2}\binom{5}{1} / T$
$p(0, 3) = \binom{4}{3} / T$
$p(1, 0) = \binom{3}{1}\binom{5}{2} / T$
$p(1, 1) = \binom{3}{1}\binom{4}{1}\binom{5}{1} / T$
p(1, 2) = ...
p(2, 0) = ...
p(2, 1) = ...
p(3, 0) = ...
Joint PMF p(i, j) = P{X = i, Y = j}; rows give i (value of X), columns give j (value of Y).

i \ j                   0         1         2         3      Row sum (P{X = i})
0                    10/220    40/220    30/220     4/220     84/220
1                    30/220    60/220    18/220       0      108/220
2                    15/220    12/220       0         0       27/220
3                     1/220       0         0         0        1/220
Col sum (P{Y = j})   56/220   112/220    48/220     4/220
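The table and its marginals can be reproduced by direct counting. This is a minimal sketch: the constant and variable names are illustrative, and entries are kept as integer counts out of T = 220 so that they match the unreduced fractions in the table.

```python
from math import comb

NEW, USED, DEFECTIVE, DRAWN = 3, 4, 5, 3
T = comb(NEW + USED + DEFECTIVE, DRAWN)  # number of equally likely selections

# joint_counts[(i, j)] = number of selections with i new and j used batteries,
# so that p(i, j) = joint_counts[(i, j)] / T.
joint_counts = {}
for i in range(DRAWN + 1):
    for j in range(DRAWN + 1 - i):
        defective = DRAWN - i - j  # the remaining picks must be defective
        joint_counts[(i, j)] = comb(NEW, i) * comb(USED, j) * comb(DEFECTIVE, defective)

# Marginals: sum the joint PMF over the other variable.
row_sums = {i: sum(c for (a, _), c in joint_counts.items() if a == i)
            for i in range(DRAWN + 1)}
col_sums = {j: sum(c for (_, b), c in joint_counts.items() if b == j)
            for j in range(DRAWN + 1)}

for i, count in row_sums.items():
    print(f"P(X = {i}) = {count}/{T}")
for j, count in col_sums.items():
    print(f"P(Y = {j}) = {count}/{T}")
```

The row sums give the marginal PMF of X and the column sums give the marginal PMF of Y; both sets of counts add up to T.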
Random variables $X$, $Y$.
The joint probability density $f(x, y)$ is a function defined for all real $x$ and $y$
with the property that, for every set $C$ of pairs of real numbers (i.e. $C$ is a set in
the two-dimensional plane),
\[ P\{(X, Y) \in C\} = \iint_{(x, y) \in C} f(x, y)\,dx\,dy. \]
$X$ and $Y$ are then said to be jointly continuous, and $f(x, y)$ is the joint probability
density function of $X$ and $Y$.
\[ F(a, b) = P\{X \in (-\infty, a], Y \in (-\infty, b]\} = \int_{-\infty}^{b} \int_{-\infty}^{a} f(x, y)\,dx\,dy \]
Thus,
\[ f(a, b) = \frac{\partial^2}{\partial a\,\partial b} F(a, b) \]
wherever the partial derivatives exist.
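Interpretation of the joint density function: for small $\epsilon$, $\delta$,
\[ P\{a \le X \le a + \epsilon,\; b \le Y \le b + \delta\} = \int_b^{b+\delta} \int_a^{a+\epsilon} f(x, y)\,dx\,dy \approx f(a, b)\,\epsilon\,\delta, \]
so $f(a, b)$ is a measure of how likely it is that the random vector $(X, Y)$ will be near
$(a, b)$.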
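Marginal density of $X$:
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \]
since
\[ P\{X \in A\} = P\{X \in A, Y \in (-\infty, \infty)\} = \int_A \int_{-\infty}^{\infty} f(x, y)\,dy\,dx. \]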
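Example: let the joint density of $X$ and $Y$ be
\[ f(x, y) = \begin{cases} 2e^{-x}e^{-2y}, & 0 < x < \infty,\ 0 < y < \infty \\ 0, & \text{otherwise} \end{cases} \]
Compute: (a) $P\{X > 1, Y < 1\}$, (b) $P\{X < Y\}$, (c) $P\{X < a\}$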
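(a)
\[
\begin{aligned}
P\{X > 1, Y < 1\} &= \int_0^1 \int_1^\infty 2e^{-x}e^{-2y}\,dx\,dy \\
&= \int_0^1 2e^{-2y}\left(-e^{-x}\Big|_1^\infty\right)dy \\
&= e^{-1}\int_0^1 2e^{-2y}\,dy \\
&= e^{-1}(1 - e^{-2})
\end{aligned}
\]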
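As a sanity check on part (a), the double integral can be approximated numerically and compared with the closed form. The sketch below uses a plain midpoint rule; the helper `midpoint_double_integral`, the grid sizes, and the cutoff `X_CUTOFF` that replaces the infinite upper limit in x are all assumptions made for illustration.

```python
from math import exp


def f(x, y):
    """Joint density 2 e^{-x} e^{-2y} on the positive quadrant, 0 elsewhere."""
    return 2.0 * exp(-x) * exp(-2.0 * y) if x > 0 and y > 0 else 0.0


def midpoint_double_integral(g, x_lo, x_hi, y_lo, y_hi, nx, ny):
    """Approximate the integral of g over a rectangle with the midpoint rule."""
    hx = (x_hi - x_lo) / nx
    hy = (y_hi - y_lo) / ny
    total = 0.0
    for i in range(nx):
        x = x_lo + (i + 0.5) * hx
        for j in range(ny):
            y = y_lo + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy


# Assumption: truncate x at 31, where e^{-x} is negligible (about 3e-14).
X_CUTOFF = 31.0
numeric = midpoint_double_integral(f, 1.0, X_CUTOFF, 0.0, 1.0, nx=2000, ny=100)
closed_form = exp(-1) * (1 - exp(-2))

print(f"numeric     = {numeric:.4f}")
print(f"closed form = {closed_form:.4f}")  # about 0.3181
```

The two values should agree to several decimal places; a finer grid or a library quadrature routine would tighten the agreement further.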