Econ-2042- Unit 4-HO
Another important intersection is the one that occurs when sampling from a population. Suppose that
$X_1, X_2, \ldots, X_n$ denote the outcomes of $n$ successive trials in an experiment. The measurements could be the
incomes of $n$ households or the measurements of $n$ characteristics of a household. A specific set of outcomes,
or sample measurements, may be represented in terms of the $n$ events $(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$. To make
inferences about the population from which the sample was drawn, we will need to calculate the probability
of the intersection of these events, $(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$.
The function
$f_{X,Y}(x, y) = P(X = x,\; Y = y)$
is called the joint (bivariate) probability distribution for $X$ and $Y$. The function $f(x, y)$ will be referred
to as the joint probability mass function.
c). Let $X$ and $Y$ be two random variables. The joint (bivariate) distribution function $F_{X,Y}(x, y)$ is given
by
$F_{X,Y}(x, y) = P(X \le x,\; Y \le y)$
d). Two random variables are said to be jointly continuous if their joint distribution function $F(x, y)$ is
continuous in both arguments.
e). Let $X$ and $Y$ be two continuous random variables with joint distribution function $F(x, y)$. If there
exists a nonnegative function $f(x, y)$ such that
$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\, dv\, du$
for any real numbers $x$ and $y$, then $X$ and $Y$ are said to be jointly continuous random variables. The function $f(x, y)$ is
called the joint probability density function.
f). The marginal probability density function of $X$ is defined as $f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy$, and the marginal
probability density function of $Y$ is defined as $f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\, dx$.
1). $F_{X,Y}(x, \infty)$ and $F_{X,Y}(\infty, y)$ are the univariate (marginal) distribution functions, as functions of $x$ and $y$, respectively.
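As a quick numerical illustration of definition f), the following sketch (assuming Python with SciPy is available; the grid of test points is an illustrative choice) integrates out $y$ from the joint density $f(x, y) = e^{-(x+y)}$, $x, y > 0$, the same density used in Example (4.5) below, and checks that the result matches the marginal $e^{-x}$:

```python
# Numerical check of the marginal density f_X(x) = integral of f(x, y) over y
# for the illustrative joint density f(x, y) = exp(-(x + y)), x > 0, y > 0.
import math
from scipy.integrate import quad  # assumes SciPy is available

def f(x, y):
    return math.exp(-(x + y)) if x > 0 and y > 0 else 0.0

for x in (0.5, 1.0, 2.0):
    marginal, _ = quad(lambda y: f(x, y), 0, math.inf)  # integrate y out
    print(x, marginal, math.exp(-x))                    # the two values should agree
```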
Consider rolling two fair dice; the 36 equally likely outcomes, each with probability 1/36, are shown below (rows give the first die, columns the second die):

 X \ Y |      1            2            3            4            5            6
-------+------------------------------------------------------------------------------
   1   | (1,1); 1/36  (1,2); 1/36  (1,3); 1/36  (1,4); 1/36  (1,5); 1/36  (1,6); 1/36
   2   | (2,1); 1/36  (2,2); 1/36  (2,3); 1/36  (2,4); 1/36  (2,5); 1/36  (2,6); 1/36
   3   | (3,1); 1/36  (3,2); 1/36  (3,3); 1/36  (3,4); 1/36  (3,5); 1/36  (3,6); 1/36
   4   | (4,1); 1/36  (4,2); 1/36  (4,3); 1/36  (4,4); 1/36  (4,5); 1/36  (4,6); 1/36
   5   | (5,1); 1/36  (5,2); 1/36  (5,3); 1/36  (5,4); 1/36  (5,5); 1/36  (5,6); 1/36
   6   | (6,1); 1/36  (6,2); 1/36  (6,3); 1/36  (6,4); 1/36  (6,5); 1/36  (6,6); 1/36
Example (4.2): In the above exercise let
X = the number of 4's that show up
Y = the number of 1's that show up
a). We would like to derive the joint probability distribution of X and Y.
b). In each outcome, the number of 4's and the number of 1's can each be 0, 1 or 2. The event X = 0 (no 4 appears) consists of the 25 sample points
{(1,1), (1,2), (1,3), (1,5), (1,6), (2,1), (2,2), (2,3), (2,5), (2,6), (3,1), (3,2), (3,3), (3,5),
(3,6), (5,1), (5,2), (5,3), (5,5), (5,6), (6,1), (6,2), (6,3), (6,5), (6,6)},
the event X = 1 (exactly one 4) consists of the 10 points
{(1,4), (2,4), (3,4), (4,1), (4,2), (4,3), (4,5), (4,6), (5,4), (6,4)},
the event Y = 0 (no 1 appears) consists of the 25 points
{(2,2), (2,3), (2,4), (2,5), (2,6), (3,2), (3,3), (3,4), (3,5), (3,6), (4,2), (4,3), (4,4), (4,5),
(4,6), (5,2), (5,3), (5,4), (5,5), (5,6), (6,2), (6,3), (6,4), (6,5), (6,6)},
and the event Y = 1 (exactly one 1) consists of the 10 points
{(1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (3,1), (4,1), (5,1), (6,1)}.
c). The intersection of X = 0 and Y = 0 has the following 16 points
{(2,2), (2,3), (2,5), (2,6), (3,2), (3,3), (3,5), (3,6), (5,2), (5,3), (5,5), (5,6), (6,2), (6,3), (6,5), (6,6)},
the intersection of X = 1 and Y = 0 has the following 8 points
{(2,4), (3,4), (4,2), (4,3), (4,5), (4,6), (5,4), (6,4)},
and the intersection of X = 2 and Y = 0 has the following single point {(4,4)}.
d). The intersection of X = 0 and Y = 1 has the following 8 points
{(1,2), (1,3), (1,5), (1,6), (2,1), (3,1), (5,1), (6,1)}.
e). The intersection of X = 1 and Y = 1 has the following 2 points {(1,4), (4,1)},
f). the intersection of X = 0 and Y = 2 has the following single point {(1,1)}, and
g). note that when X = 2, Y cannot take a value greater than 0 (and vice versa); similarly, when X = 1, Y
cannot equal 2, and there is no sample point where X = 2 and Y = 2.
 X \ Y |   0       1       2    | f_X(x)
-------+------------------------+-------
   0   |  16/36    8/36   1/36  | 25/36
   1   |   8/36    2/36   0     | 10/36
   2   |   1/36    0      0     |  1/36
-------+------------------------+-------
f_Y(y) |  25/36   10/36   1/36  |   1
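The table above can be checked by brute-force enumeration; the following minimal Python sketch (names and layout are illustrative choices) tabulates the joint pmf of the number of 4's and the number of 1's over the 36 equally likely outcomes:

```python
# Enumerate the 36 equally likely outcomes of two fair dice and tabulate the
# joint pmf of X = number of 4's and Y = number of 1's shown.
from fractions import Fraction
from itertools import product

joint = {}
for d1, d2 in product(range(1, 7), repeat=2):
    x = (d1 == 4) + (d2 == 4)          # number of 4's
    y = (d1 == 1) + (d2 == 1)          # number of 1's
    joint[(x, y)] = joint.get((x, y), 0) + Fraction(1, 36)

print(joint[(0, 0)], joint[(1, 0)], joint[(2, 0)])   # 16/36, 8/36, 1/36 (in lowest terms)
print(sum(joint.values()))                           # 1
```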
Example (4.3): A bowl contains 3 black, 2 red and 4 white balls. Two balls are selected at random without
replacement.
Let the r.v. $X$ denote the number of black balls in the sample of 2 balls selected, and
let the r.v. $Y$ denote the number of red balls in the sample of 2 balls selected.
Since there are 9 balls in all and we are selecting two balls, there are $\binom{9}{2} = 36$ ways of doing so. There are
$\binom{3}{x}$ ways of selecting $x$ black balls out of the 3 black balls, $\binom{2}{y}$ ways of selecting $y$ red balls out of the 2
red balls, and, given that $2 - x - y$ is the number of white balls that must then be selected, we can select the white
balls in $\binom{4}{2-x-y}$ ways. Hence
$f(x, y) = P(X = x, Y = y) = \dfrac{\binom{3}{x}\binom{2}{y}\binom{4}{2-x-y}}{\binom{9}{2}}, \qquad x = 0, 1, 2;\; y = 0, 1, 2;\; x + y \le 2.$
Note that $f(1, 2) = f(2, 1) = f(2, 2) = 0$.
 X \ Y |   0      1      2    | f_X(x)
-------+----------------------+-------
   0   |  1/6    2/9    1/36  |  5/12
   1   |  1/3    1/6    0     |  1/2
   2   |  1/12   0      0     |  1/12
-------+----------------------+-------
f_Y(y) |  7/12   7/18   1/36  |   1
a). $f(x, y) \ge 0$
b). $\sum_x \sum_y f(x, y) = 1$
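A small Python sketch (illustrative, using exact fractions) that builds $f(x, y)$ from the counting formula above and verifies properties a) and b):

```python
# Build the joint pmf f(x, y) = C(3, x) C(2, y) C(4, 2 - x - y) / C(9, 2)
# and verify properties a) f(x, y) >= 0 and b) the probabilities sum to 1.
from fractions import Fraction
from math import comb

def f(x, y):
    if x + y > 2:                       # impossible: only two balls are drawn
        return Fraction(0)
    return Fraction(comb(3, x) * comb(2, y) * comb(4, 2 - x - y), comb(9, 2))

table = {(x, y): f(x, y) for x in range(3) for y in range(3)}
assert all(p >= 0 for p in table.values())          # property a)
assert sum(table.values()) == 1                     # property b)
print(table[(0, 0)], table[(1, 1)], table[(2, 0)])  # 1/6, 1/6, 1/12
```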
If we sum the three columns we get the marginal probability distribution of $X$ as follows:
$P(X = 0) = f_X(0) = 5/12$
$P(X = 1) = f_X(1) = 1/2$
$P(X = 2) = f_X(2) = 1/12$
and thus

 X  | f_X(x)
----+-------
 0  |  5/12
 1  |  1/2
 2  |  1/12
Similarly, if we sum the three rows, we get the marginal probability distribution of $Y$ as follows:
P (Y = 0) = fY (0) = 7=12
P (Y = 1) = fY (1) = 7=18
P (Y = 2) = fY (2) = 1=36
and thus
 Y  | f_Y(y)
----+-------
 0  |  7/12
 1  |  7/18
 2  |  1/36
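The marginal distributions can likewise be obtained mechanically by summing the joint probabilities over the other variable; a minimal sketch (the dictionary layout is an illustrative choice):

```python
# Marginal pmfs obtained by summing the joint pmf over the other variable,
# using the hypergeometric joint probabilities from the table above.
from fractions import Fraction as F

joint = {(0, 0): F(1, 6), (0, 1): F(2, 9), (0, 2): F(1, 36),
         (1, 0): F(1, 3), (1, 1): F(1, 6), (2, 0): F(1, 12)}

fX = {x: sum(p for (i, _), p in joint.items() if i == x) for x in range(3)}
fY = {y: sum(p for (_, j), p in joint.items() if j == y) for y in range(3)}
print(fX)   # marginal of X: 5/12, 1/2, 1/12
print(fY)   # marginal of Y: 7/12, 7/18, 1/36
```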
4.2 Conditional Distribution and Independence
Conditional distribution: the conditional probability distribution of $Y$ given $X = x$ is defined as
$f_{Y|X}(y \mid x) = \dfrac{f_{X,Y}(x, y)}{f_X(x)}$ if $f_X(x) \ne 0$,
and similarly
$f_{X|Y}(x \mid y) = \dfrac{f_{X,Y}(x, y)}{f_Y(y)}$ if $f_Y(y) \ne 0$.
Note that this de…nition holds for both discrete and continuous random variables.
Notice the similarity between the conditional probability and the conditional probability distribution:
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}, \qquad f_{Y|X}(y \mid x) = \dfrac{f_{X,Y}(x, y)}{f_X(x)}.$
The notion of independence: when the information that the random variable $X$ takes a particular value $x$
is irrelevant to the determination of the probability that another random variable $Y$ takes a value $y$, we say
that $Y$ is independent of $X$. Formally, two random variables, $X$ and $Y$, are said to be independent if and
only if any one of the following three (equivalent) conditions holds:
a). $f_{X,Y}(x, y) = f_X(x) f_Y(y)$ for all $x$ and $y$;
b). $f_{Y|X}(y \mid x) = f_Y(y)$ whenever $f_X(x) \ne 0$;
c). $f_{X|Y}(x \mid y) = f_X(x)$ whenever $f_Y(y) \ne 0$.
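As an illustration of the factorization condition, the following sketch (an illustrative check, not part of the original exercise) applies it to Example (4.2), where X = number of 4's and Y = number of 1's, and confirms that those two variables are not independent:

```python
# Check the factorization condition f(x, y) = f_X(x) f_Y(y) for the dice
# example (X = number of 4's, Y = number of 1's); it fails, so X and Y
# are not independent.
from fractions import Fraction as F

joint = {(0, 0): F(16, 36), (0, 1): F(8, 36), (0, 2): F(1, 36),
         (1, 0): F(8, 36),  (1, 1): F(2, 36), (2, 0): F(1, 36)}
fX = {x: sum(p for (i, _), p in joint.items() if i == x) for x in range(3)}
fY = {y: sum(p for (_, j), p in joint.items() if j == y) for y in range(3)}

independent = all(joint.get((x, y), F(0)) == fX[x] * fY[y]
                  for x in range(3) for y in range(3))
print(independent)   # False
```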
4.3 Expectation
a). If $X$ and $Y$ are two discrete random variables with joint probability function $f(x, y)$, then
$E(X) = \sum_x \sum_y x f(x, y) = \sum_x x \sum_y f(x, y) = \sum_x x f_X(x)$
Similarly,
$E(Y) = \sum_x \sum_y y f(x, y) = \sum_y y \sum_x f(x, y) = \sum_y y f_Y(y)$
b). If $X$ and $Y$ are two continuous random variables with joint probability density function $f(x, y)$,
then
$E(X) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x f(x, y)\, dx\, dy = \int_{-\infty}^{\infty} x \left[\int_{-\infty}^{\infty} f(x, y)\, dy\right] dx = \int_{-\infty}^{\infty} x f_X(x)\, dx$
Similarly,
$E(Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} y f(x, y)\, dx\, dy = \int_{-\infty}^{\infty} y \left[\int_{-\infty}^{\infty} f(x, y)\, dx\right] dy = \int_{-\infty}^{\infty} y f_Y(y)\, dy$
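A short sketch (illustrative) showing that the double sum and the marginal sum give the same expectation, using the joint pmf from Example (4.2):

```python
# E(X) computed two ways from a joint pmf: the double sum over (x, y) and the
# single sum against the marginal f_X(x); both give the same value.
from fractions import Fraction as F

joint = {(0, 0): F(16, 36), (0, 1): F(8, 36), (0, 2): F(1, 36),
         (1, 0): F(8, 36),  (1, 1): F(2, 36), (2, 0): F(1, 36)}

EX_double = sum(x * p for (x, _), p in joint.items())
fX = {x: sum(p for (i, _), p in joint.items() if i == x) for x in range(3)}
EX_marginal = sum(x * p for x, p in fX.items())
print(EX_double, EX_marginal)   # both equal 1/3
```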
An important measure when we have two random variables is that of the covariation of $X$ and $Y$. It is defined
as follows:
$\mathrm{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
In the case where the two random variables are continuous, the covariance between the two random variables
can be evaluated as follows:
$\mathrm{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f(x, y)\, dx\, dy$
We define the coefficient of correlation between $X$ and $Y$, denoted by $\rho$ or $\rho_{X,Y}$, as follows:
$\rho_{X,Y} = \dfrac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$
a). $-1 \le \rho \le 1$, and
b). $\mathrm{cov}(X, Y) = \mathrm{cov}(Y, X)$.
The conditional expectation of $Y$ given $X = x$ turns out to be a function of $x$. That is, $E(Y \mid X = x) = m(x)$,
which is called the regression function of $Y$ on $X$. It tells us how the mean of $Y$ varies with changes in $X$.
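A minimal sketch of the regression function for the hypergeometric example restated below (Example (4.3)); $m(x)$ is computed directly from the joint pmf and its marginal:

```python
# The regression function m(x) = E(Y | X = x): sum y * f(x, y) over y,
# divided by the marginal f_X(x), for the hypergeometric example.
from fractions import Fraction as F

joint = {(0, 0): F(1, 6), (0, 1): F(2, 9), (0, 2): F(1, 36),
         (1, 0): F(1, 3), (1, 1): F(1, 6), (2, 0): F(1, 12)}
fX = {x: sum(p for (i, _), p in joint.items() if i == x) for x in range(3)}

m = {x: sum(y * p for (i, y), p in joint.items() if i == x) / fX[x]
     for x in range(3)}
print(m)   # E(Y|X=0) = 2/3, E(Y|X=1) = 1/3, E(Y|X=2) = 0: linear with slope -1/3
```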
Example (4.3): In our earlier example of the hypergeometric distribution we had the following results:
$f(x, y) = P(X = x, Y = y) = \dfrac{\binom{3}{x}\binom{2}{y}\binom{4}{2-x-y}}{\binom{9}{2}}, \qquad x = 0, 1, 2;\; y = 0, 1, 2;\; x + y \le 2$
 X \ Y |   0      1      2    | f_X(x)
-------+----------------------+-------
   0   |  1/6    2/9    1/36  |  5/12
   1   |  1/3    1/6    0     |  1/2
   2   |  1/12   0      0     |  1/12
-------+----------------------+-------
f_Y(y) |  7/12   7/18   1/36  |   1
Note:
1). $E(X) = 0 \cdot \frac{5}{12} + 1 \cdot \frac{1}{2} + 2 \cdot \frac{1}{12} = \frac{2}{3}$
2). $E(Y) = 0 \cdot \frac{7}{12} + 1 \cdot \frac{7}{18} + 2 \cdot \frac{1}{36} = \frac{4}{9}$
3). $E(X^2) = 0^2 \cdot \frac{5}{12} + 1^2 \cdot \frac{1}{2} + 2^2 \cdot \frac{1}{12} = \frac{5}{6}$
4). $E(Y^2) = 0^2 \cdot \frac{7}{12} + 1^2 \cdot \frac{7}{18} + 2^2 \cdot \frac{1}{36} = \frac{1}{2}$
5). $\sigma_X^2 = E(X^2) - [E(X)]^2 = \frac{5}{6} - \left(\frac{2}{3}\right)^2 = \frac{7}{18}$
6). $\sigma_Y^2 = E(Y^2) - [E(Y)]^2 = \frac{1}{2} - \left(\frac{4}{9}\right)^2 = \frac{49}{162}$
7). $E(XY) = 0 \cdot 0 \cdot \frac{1}{6} + 0 \cdot 1 \cdot \frac{2}{9} + 0 \cdot 2 \cdot \frac{1}{36} + 1 \cdot 0 \cdot \frac{1}{3} + 1 \cdot 1 \cdot \frac{1}{6} + 2 \cdot 0 \cdot \frac{1}{12} = \frac{1}{6}$
8). $\mathrm{cov}(X, Y) = E(XY) - E(X)E(Y) = \frac{1}{6} - \frac{2}{3} \cdot \frac{4}{9} = -\frac{7}{54}$
Note: The covariance ends up negative because of the restriction $x + y \le 2$, so that when $x$
increases, $y$ must go down; hence the negative relationship between the two random variables.
9). $\rho_{X,Y} = \dfrac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y} = \dfrac{-7/54}{\sqrt{(7/18)(49/162)}} = -\dfrac{7}{7\sqrt{7}} = -\dfrac{1}{\sqrt{7}}$
As a result, the regression of $Y$ on $X$ is linear, with slope $\mathrm{cov}(X, Y)/\sigma_X^2 = \dfrac{-7/54}{7/18} = -\dfrac{1}{3}$.
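The computations in 1) through 9) can be verified numerically; a minimal Python sketch using exact fractions:

```python
# Numerical check of items 1) - 9): moments, variances, covariance and the
# correlation coefficient for the hypergeometric joint pmf above.
from fractions import Fraction as F
from math import sqrt

joint = {(0, 0): F(1, 6), (0, 1): F(2, 9), (0, 2): F(1, 36),
         (1, 0): F(1, 3), (1, 1): F(1, 6), (2, 0): F(1, 12)}

EX  = sum(x * p for (x, y), p in joint.items())          # 2/3
EY  = sum(y * p for (x, y), p in joint.items())          # 4/9
EX2 = sum(x * x * p for (x, y), p in joint.items())      # 5/6
EY2 = sum(y * y * p for (x, y), p in joint.items())      # 1/2
EXY = sum(x * y * p for (x, y), p in joint.items())      # 1/6
varX, varY = EX2 - EX**2, EY2 - EY**2                    # 7/18, 49/162
cov = EXY - EX * EY                                      # -7/54
rho = float(cov) / sqrt(float(varX) * float(varY))       # -1/sqrt(7), about -0.378
print(EX, EY, varX, varY, cov, rho)
```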
Let $X$ and $Y$ be two random variables with joint probability function $f(x, y)$. Then:
a). If $X$ and $Y$ are independent, then $E(XY) = E(X)E(Y)$.
Proof:
$E(XY) = \sum_x \sum_y x y f(x, y) = \sum_x \sum_y x y f_X(x) f_Y(y) = \sum_x x f_X(x) \sum_y y f_Y(y) = E(X)E(Y)$
b). If $X$ and $Y$ are two independent random variables, then $\mathrm{cov}(X, Y) = 0$ and $\rho_{X,Y} = 0$.
Proof:
$\mathrm{cov}(X, Y) = E(XY) - E(X)E(Y) = E(X)E(Y) - E(X)E(Y) = 0$
Though independence of two random variables implies that they have zero correlation, the converse is not
necessarily true. That is, two random variables may have zero correlation without being independent.
Example (4.4): Let the joint probability distribution of two random variables X and Y be given as follows:

 X \ Y |  -1     0     1   | f_X(x)
-------+-------------------+-------
  -1   |   0    1/4    0   |  1/4
   0   |  1/4    0    1/4  |  1/2
   1   |   0    1/4    0   |  1/4
-------+-------------------+-------
f_Y(y) |  1/4   1/2   1/4  |   1
Here $E(X) = 0$, $E(Y) = 0$ and $E(XY) = 0$, so $\mathrm{cov}(X, Y) = 0$ and $\rho_{X,Y} = 0$. The sum of each conditional distribution adds up to unity; similarly, each marginal distribution also adds up to one.
If $X$ and $Y$ were independent we would have $f_{Y|X}(y \mid x) = f_Y(y)$ for all $x$, but in the above example this does not hold
(for instance, $f_{Y|X}(0 \mid X = -1) = \frac{1/4}{1/4} = 1 \ne f_Y(0) = \frac{1}{2}$); therefore, the two variables are not independent.
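A small sketch confirming both features of Example (4.4): zero covariance together with failure of the factorization condition:

```python
# Example (4.4) check: the covariance is zero, yet f(x, y) = f_X(x) f_Y(y)
# fails, so X and Y are uncorrelated but not independent.
from fractions import Fraction as F

joint = {(-1, 0): F(1, 4), (0, -1): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 4)}
xs, ys = (-1, 0, 1), (-1, 0, 1)
fX = {x: sum(p for (i, _), p in joint.items() if i == x) for x in xs}
fY = {y: sum(p for (_, j), p in joint.items() if j == y) for y in ys}

EX  = sum(x * p for (x, _), p in joint.items())                    # 0
EY  = sum(y * p for (_, y), p in joint.items())                    # 0
EXY = sum(x * y * p for (x, y), p in joint.items())                # 0
print(EXY - EX * EY)                                               # covariance = 0
print(all(joint.get((x, y), F(0)) == fX[x] * fY[y]
          for x in xs for y in ys))                                # False
```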
i). Let $X$ and $Y$ be two random variables and define a new random variable $Z$ as $Z = X + Y$.
Proposition: $\mathrm{var}(Z) = \mathrm{var}(X) + \mathrm{var}(Y) + 2\,\mathrm{cov}(X, Y)$
Proof:
$\mathrm{var}(Z) = E\big[(Z - E(Z))^2\big] = E\big[(X + Y - E(X + Y))^2\big]$
$= E\big[\big((X - E(X)) + (Y - E(Y))\big)^2\big]$
$= E\big[(X - E(X))^2 + (Y - E(Y))^2 + 2(X - E(X))(Y - E(Y))\big]$
$= E\big[(X - E(X))^2\big] + E\big[(Y - E(Y))^2\big] + 2E\big[(X - E(X))(Y - E(Y))\big]$
$= \mathrm{var}(X) + \mathrm{var}(Y) + 2\,\mathrm{cov}(X, Y)$
ii). Let $X$ and $Y$ be two random variables and define a new random variable $Z$ as $Z = \alpha X + \beta Y$, where
$\alpha$ and $\beta$ are constants.
Proposition: $\mathrm{var}(Z) = \alpha^2\,\mathrm{var}(X) + \beta^2\,\mathrm{var}(Y) + 2\alpha\beta\,\mathrm{cov}(X, Y)$
Proof:
$\mathrm{var}(Z) = E\big[(Z - E(Z))^2\big] = E\big[(\alpha X + \beta Y - E(\alpha X + \beta Y))^2\big]$
$= E\big[\big(\alpha(X - E(X)) + \beta(Y - E(Y))\big)^2\big]$
$= E\big[\alpha^2 (X - E(X))^2 + \beta^2 (Y - E(Y))^2 + 2\alpha\beta (X - E(X))(Y - E(Y))\big]$
$= \alpha^2 E\big[(X - E(X))^2\big] + \beta^2 E\big[(Y - E(Y))^2\big] + 2\alpha\beta E\big[(X - E(X))(Y - E(Y))\big]$
$= \alpha^2\,\mathrm{var}(X) + \beta^2\,\mathrm{var}(Y) + 2\alpha\beta\,\mathrm{cov}(X, Y)$
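As a numerical check of the proposition, the following sketch verifies $\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y) + 2\,\mathrm{cov}(X, Y)$ for the hypergeometric joint pmf of Example (4.3); both sides equal 35/81:

```python
# Check of var(X + Y) = var(X) + var(Y) + 2 cov(X, Y) for the hypergeometric
# example; both sides come out to 35/81.
from fractions import Fraction as F

joint = {(0, 0): F(1, 6), (0, 1): F(2, 9), (0, 2): F(1, 36),
         (1, 0): F(1, 3), (1, 1): F(1, 6), (2, 0): F(1, 12)}

def var(g):  # variance of g(x, y) under the joint pmf
    mean = sum(g(x, y) * p for (x, y), p in joint.items())
    return sum((g(x, y) - mean) ** 2 * p for (x, y), p in joint.items())

cov = (sum(x * y * p for (x, y), p in joint.items())
       - sum(x * p for (x, y), p in joint.items())
       * sum(y * p for (x, y), p in joint.items()))
lhs = var(lambda x, y: x + y)
rhs = var(lambda x, y: x) + var(lambda x, y: y) + 2 * cov
print(lhs, rhs)   # both 35/81
```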
iii). Generally, let $X_1, X_2, \ldots, X_k$ be $k$ random variables and define a new random variable $Z$ as $Z =
\alpha_1 X_1 + \alpha_2 X_2 + \cdots + \alpha_k X_k$. Then let
$\alpha = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_k \end{pmatrix}, \qquad X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_k \end{pmatrix}$
so that $Z = \alpha' X$, and
$\mathrm{var}(Z) = E\big[\alpha'(X - E(X))(X - E(X))'\alpha\big] = \alpha'\, E\big[(X - E(X))(X - E(X))'\big]\,\alpha = \alpha'\,\mathrm{cov}(X)\,\alpha$
Note that $\alpha'$ and $\alpha$ are $1 \times k$ and $k \times 1$ vectors, respectively, and $\mathrm{cov}(X)$ is a $k \times k$ matrix. Namely,
$\mathrm{cov}(X) = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1k} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{k1} & \sigma_{k2} & \cdots & \sigma_{kk} \end{pmatrix}$, the variance-covariance matrix of the vector $X$.
Hence
$\mathrm{var}(Z) = \alpha_1^2 \sigma_{11} + \alpha_2^2 \sigma_{22} + \cdots + \alpha_k^2 \sigma_{kk} + 2\alpha_1\alpha_2 \sigma_{12} + 2\alpha_1\alpha_3 \sigma_{13} + \cdots + 2\alpha_1\alpha_k \sigma_{1k}$
$\qquad\quad + 2\alpha_2\alpha_3 \sigma_{23} + \cdots + 2\alpha_2\alpha_k \sigma_{2k} + \cdots + 2\alpha_{k-1}\alpha_k \sigma_{k-1,k}$
Note:
a). If $X_i$ and $X_j$ ($i \ne j$) are not correlated, then
$\mathrm{var}(Z) = \alpha_1^2 \sigma_{11} + \alpha_2^2 \sigma_{22} + \cdots + \alpha_k^2 \sigma_{kk} = \alpha'\,\mathrm{diag}(\sigma_{11}, \sigma_{22}, \ldots, \sigma_{kk})\,\alpha$
b). If $\mathrm{var}(X_i) = \sigma^2$ for all $i$, and if $X_i$ and $X_j$ ($i \ne j$) are not correlated, then
$\mathrm{var}(Z) = \sigma^2(\alpha_1^2 + \alpha_2^2 + \cdots + \alpha_k^2) = \sigma^2 \alpha'\alpha$
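A minimal NumPy sketch of the quadratic form $\mathrm{var}(Z) = \alpha'\,\mathrm{cov}(X)\,\alpha$; the coefficient vector and covariance matrix below are illustrative assumptions, and the exact value is compared with a Monte Carlo estimate:

```python
# The quadratic form var(Z) = a' cov(X) a for Z = a1 X1 + ... + ak Xk,
# checked against a simulation for an illustrative covariance matrix.
import numpy as np

rng = np.random.default_rng(0)
a = np.array([1.0, -2.0, 0.5])                      # illustrative coefficients
Sigma = np.array([[2.0, 0.3, 0.1],
                  [0.3, 1.0, -0.2],
                  [0.1, -0.2, 0.5]])                # illustrative cov(X)

exact = a @ Sigma @ a                               # a' Sigma a
X = rng.multivariate_normal(np.zeros(3), Sigma, size=200_000)
simulated = (X @ a).var()                           # sample variance of Z
print(exact, simulated)                             # the two should be close
```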
Exercise (4.1): Given an experiment of rolling a fair, four-sided die twice, let $X$ denote the outcome of
the first die, and $Y$ be the sum of the two rolls.
b). Find $E(X)$, $E(Y)$, $\sigma_X^2$, $\sigma_Y^2$, $\mathrm{cov}(X, Y)$, $\rho_{X,Y}$, and $E(Y \mid X = x)$ for $x = 1, 2, 3, 4$.
Theorem (4.1): Let $X_1, X_2, \ldots, X_n$ be $n$ random variables with joint probability density function $f(x_1, x_2, \ldots, x_n)$,
and let $Y_i = u_i(X_1, X_2, \ldots, X_n)$ for $i = 1, 2, \ldots, n$. Let $X_i = w_i(Y_1, Y_2, \ldots, Y_n)$ for $i = 1, 2, \ldots, n$ be the one-to-one
inverse transformation, and let $J$ be the Jacobian of the inverse transformation given by
$J = \begin{pmatrix} \dfrac{\partial w_1}{\partial y_1} & \dfrac{\partial w_1}{\partial y_2} & \cdots & \dfrac{\partial w_1}{\partial y_n} \\ \dfrac{\partial w_2}{\partial y_1} & \dfrac{\partial w_2}{\partial y_2} & \cdots & \dfrac{\partial w_2}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial w_n}{\partial y_1} & \dfrac{\partial w_n}{\partial y_2} & \cdots & \dfrac{\partial w_n}{\partial y_n} \end{pmatrix};$
then the density function for $Y_1, Y_2, \ldots, Y_n$ is given by
$h(y_1, y_2, \ldots, y_n) = f\big(w_1(y_1, \ldots, y_n),\, w_2(y_1, \ldots, y_n),\, \ldots,\, w_n(y_1, \ldots, y_n)\big)\, |J|$
where $|J|$ is the absolute value of the Jacobian determinant.
Example (4.5): Let $X_1$ and $X_2$ be two random variables with joint probability density function given by:
$f(x_1, x_2) = \begin{cases} e^{-(x_1 + x_2)}, & x_1 > 0,\; x_2 > 0 \\ 0, & \text{elsewhere} \end{cases}$
Let $Y_1 = X_1 + X_2$ and $Y_2 = \dfrac{X_1}{X_1 + X_2}$; i.e.,
$u_1(x_1, x_2) = x_1 + x_2 \quad \text{and} \quad u_2(x_1, x_2) = \dfrac{x_1}{x_1 + x_2}.$
Solution:
$Y_2 = \dfrac{X_1}{X_1 + X_2} = \dfrac{X_1}{Y_1} \;\Rightarrow\; X_1 = Y_1 Y_2$, and
$Y_1 = X_1 + X_2 \;\Rightarrow\; X_2 = Y_1 - X_1 = Y_1 - Y_1 Y_2$,
so that
$X_1 = w_1(y_1, y_2) = y_1 y_2$
$X_2 = w_2(y_1, y_2) = y_1 - y_1 y_2$
$\Rightarrow\; J = \begin{pmatrix} \dfrac{\partial w_1}{\partial y_1} & \dfrac{\partial w_1}{\partial y_2} \\ \dfrac{\partial w_2}{\partial y_1} & \dfrac{\partial w_2}{\partial y_2} \end{pmatrix} = \begin{pmatrix} y_2 & y_1 \\ 1 - y_2 & -y_1 \end{pmatrix}, \qquad \det J = -y_1, \qquad |J| = y_1$
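The Jacobian computation can be confirmed symbolically; a minimal sketch assuming SymPy is available:

```python
# Symbolic check of the Jacobian in Example (4.5): with x1 = y1*y2 and
# x2 = y1 - y1*y2, the determinant is -y1, so |J| = y1.
import sympy as sp

y1, y2 = sp.symbols('y1 y2', positive=True)
x1, x2 = y1 * y2, y1 - y1 * y2                      # the inverse transformation
J = sp.Matrix([[sp.diff(x1, y1), sp.diff(x1, y2)],
               [sp.diff(x2, y1), sp.diff(x2, y2)]])
print(sp.simplify(J.det()))                         # -y1
```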