
Correlation and Linear Regression

Paper: Probability and Statistics

Lesson: Correlation and Linear Regression

Course Developers: Dr. Manoj Kumar Varshney & Mr. V. Ravi

Department/College: Assistant Professor, Department of Statistics, Hindu College, University of Delhi & Assistant Professor, Department of Statistics, Lady Sri Ram College, University of Delhi

Institute of Lifelong Learning, University of Delhi



Table of Contents

Chapter: Correlation and Linear Regression


 1: Learning Outcomes
 2: Introduction
 3: Correlation
o 3.1: Types of Correlation
o 3.2: Measurement of Correlation
o 3.3: Line of Best Fit
 4: Correlation Coefficient
o 4.1: Assumptions for Correlation Coefficient
o 4.2: Properties of Correlation Coefficient
 Exercise 1
 5: Regression
o 5.1: Line of Regression
 6: Linear Regression
 Exercise 2
 7: Moment Generating Function
 8: Covariance
 Exercise 3
 Summary
 References, Web links and Suggested reading

1. Learning outcomes:

After studying this chapter, the reader should be able to understand:

 Correlation
 Types of Correlation
 Measurement of Correlation
 Line of Best Fit
 Correlation Coefficient
 Assumptions for Correlation Coefficient
 Properties of Correlation Coefficient
 Regression
 Line of Regression
 Linear Regression
 Moment Generating Function
 Covariance


2. Introduction:

If for every value of a variable X we know a corresponding value of a second variable Y, the resulting series of pairs of values of the two variables is known as a bivariate population, and its distribution is known as a bivariate distribution. For example: a set of pairs of the heights of husbands and their wives at the time of marriage, a series of the weights and heights of students in a college, or a series of the demand and supply of a commodity.

3. Correlation:
In a bivariate distribution, if a change in one variable appears to be accompanied by a change in the other variable and vice versa, the two variables are said to be correlated, and this relationship is called correlation.
In other words, the tendency of simultaneous variation of the two variables is called covariation.

3.1. Types of Correlation:


(i) Positive (ii) Negative (iii) No correlation

When an increase (or decrease) in one variable results in a corresponding increase (or decrease) in the other variable, the correlation is said to be positive. For example, the volume and temperature of a gas are "positively correlated".

When an increase (or decrease) in one variable results in a corresponding decrease (or increase) in the other variable, the correlation is said to be negative. For example, the volume and pressure of a gas are "negatively correlated".

When an increase (or decrease) in one variable has no effect on the other variable, the correlation is said to be zero, or no correlation. For example, the heights of students and the marks they obtain in a particular subject have "zero correlation" or "no correlation".

Correlation is generally denoted by ρ (rho).

3.2. Measurement of Correlation:


The following methods may be used for the study and measurement of (linear) correlation:

1) Scatter diagram (graphical method)

2) The correlation coefficient (a numerical measure)

3.2.1. Scatter Diagram (Graphical Method):

A scatter diagram is the simplest technique for the diagrammatic representation of bivariate data. For the bivariate distribution (xᵢ, yᵢ), i = 1, 2, …, n, if the values of the variables X and Y are plotted along the x-axis and y-axis respectively in the x-y plane, the diagram of dots so obtained is known as a scatter diagram.
Different scatter diagrams represent the different possible relationships between two variables.

Merits:
The following are the merits of a scatter diagram:
1) The scatter diagram gives an idea at a glance about the existence or absence of a relationship between two variables.
2) It also exhibits the type of the correlation.
3) It indicates the presence of perfect positive or perfect negative correlation.
Demerits:
1) It does not give a definite numerical measure of the degree of correlation.
2) In the case of few observations, the use of a scatter diagram is limited.
3) With only a nominal degree of variation, it may fail to distinguish perfect positive or negative correlation.

Example 1: The local ice cream shop keeps track of how much ice cream
they sell versus the noon temperature on that day. Here are their figures
for the last 12 days:

Ice Cream Sales vs Temperature

    Temperature (°C)   Ice Cream Sales
    14.2°              $215
    16.4°              $325
    11.9°              $185
    15.2°              $332
    18.5°              $406
    22.1°              $522
    19.4°              $412
    25.1°              $614
    23.4°              $544
    18.1°              $421
    22.6°              $445
    17.2°              $408

And here is the same data as a scatter plot:

It is now easy to see that warmer weather leads to more sales, but
the relationship is not perfect.

3.3. Line of Best Fit

We can also draw a "Line of Best Fit" (also called a "Trend Line") on our
scatter plot:


Try to have the line as close as possible to all points, and as many points
above the line as below.
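The least-squares criterion makes this "by eye" rule precise: it chooses the slope and intercept that minimize the total squared vertical distance from the points to the line. A minimal Python sketch, using the ice cream data from Example 1 (the function name is our own):

```python
def best_fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x  # the fitted line passes through (xbar, ybar)
    return a, b

temps = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [215, 325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]
a, b = best_fit_line(temps, sales)
print(a, b)  # the slope is positive: warmer days, higher sales
```

The slope comes out positive, in line with the upward trend visible in the scatter plot.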

4. Correlation Coefficient:
To determine the intensity or degree of the linear correlation between two variables, Karl Pearson, the great British statistician, defined a numerical measure called the Karl Pearson correlation coefficient, or simply the correlation coefficient. It is generally denoted by the symbol r and given by

    r = (degree to which X and Y vary together) / (degree to which X and Y vary separately)
      = (covariance of X and Y) / (product of the standard deviations of X and Y)

that is,

    r = Cov(X, Y) / (σ_X · σ_Y)    or    r = [E(XY) − E(X)E(Y)] / √(Var(X) · Var(Y))

More precisely, let (xᵢ, yᵢ), i = 1, 2, …, n be a set of observations of size n from a bivariate population of (X, Y). Let x̄ and ȳ be the means of X and Y respectively. Then

    Var(X) = (1/n) Σᵢ (xᵢ − x̄)²,    Var(Y) = (1/n) Σᵢ (yᵢ − ȳ)²,    and
    Cov(X, Y) = (1/n) Σᵢ (xᵢ − x̄)(yᵢ − ȳ)

so that

    r = [(1/n) Σᵢ (xᵢ − x̄)(yᵢ − ȳ)] / √[(1/n) Σᵢ (xᵢ − x̄)² · (1/n) Σᵢ (yᵢ − ȳ)²]

After simplification, we may write

    r = [n Σ xᵢyᵢ − (Σ xᵢ)(Σ yᵢ)] / √{[n Σ xᵢ² − (Σ xᵢ)²] · [n Σ yᵢ² − (Σ yᵢ)²]}

or, equivalently,

    r = Σᵢ (xᵢ − x̄)(yᵢ − ȳ) / √[Σᵢ (xᵢ − x̄)² · Σᵢ (yᵢ − ȳ)²]

4.1. Assumptions:
The correlation coefficient is based on the following assumptions:
1) In each of the correlated bivariate populations, a large number of independent causes are operating so as to produce a normal distribution.
2) The forces so operating are related in a causal way.
3) The relationship between the two variables is linear.
4.2. Properties:
1) It is rigidly defined.
2) The linear correlation coefficient r_xy is a pure number (or ratio) and thus has no unit of measurement. By symmetry it is clear that r_xy = r_yx.
3) It is based on all the observations.
4) The correlation coefficient is independent of change of origin and change of scale.
5) It ranges from −1 to +1:
   r = −1 when there is a perfect negative correlation,
   r = 0 when there is no linear correlation,
   r = +1 when there is a perfect positive correlation.

Example 2: Calculate and analyze the correlation coefficient between the number of study hours and the number of sleeping hours of different students.

    Number of study hours:     2   4   6   8  10
    Number of sleeping hours: 10   9   8   7   6

Solution: The necessary calculations are given below:

     x    y   (x−x̄)  (y−ȳ)  (x−x̄)(y−ȳ)  (x−x̄)²  (y−ȳ)²
     2   10    −4      2       −8          16       4
     4    9    −2      1       −2           4       1
     6    8     0      0        0           0       0
     8    7     2     −1       −2           4       1
    10    6     4     −2       −8          16       4
  Σ 30   40     0      0      −20          40      10

    x̄ = Σxᵢ/n = 30/5 = 6,    ȳ = Σyᵢ/n = 40/5 = 8

and

    r = Σ(xᵢ − x̄)(yᵢ − ȳ) / √[Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)²] = −20 / √(40 × 10) = −20/20 = −1

There is a perfect negative correlation between the number of study hours and the number of sleeping hours.
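The computation above can be checked mechanically. A short Python sketch reproducing the table's sums and the final value r = −1:

```python
from math import sqrt

hours_study = [2, 4, 6, 8, 10]
hours_sleep = [10, 9, 8, 7, 6]

n = len(hours_study)
mx = sum(hours_study) / n   # 6
my = sum(hours_sleep) / n   # 8
# Sums from the calculation table
sxy = sum((x - mx) * (y - my) for x, y in zip(hours_study, hours_sleep))  # -20
sxx = sum((x - mx) ** 2 for x in hours_study)                             # 40
syy = sum((y - my) ** 2 for y in hours_sleep)                             # 10
r = sxy / sqrt(sxx * syy)   # -20 / 20 = -1.0
print(r)
```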

Example 3: Let the random variables X and Y have the joint p.d.f.

    f(x, y) = x + y,  for 0 < x < 1 and 0 < y < 1
            = 0,      otherwise

Compute the correlation coefficient.

Solution: We know that the mean of the random variable X can be written as

    E(X) = ∫₀¹ ∫₀¹ x(x + y) dx dy = 7/12

and

    Var(X) = E(X²) − [E(X)]² = ∫₀¹ ∫₀¹ x²(x + y) dx dy − (7/12)² = 11/144

Similarly, the mean and variance of the r.v. Y can be obtained as

    E(Y) = ∫₀¹ ∫₀¹ y(x + y) dx dy = 7/12,
    Var(Y) = E(Y²) − [E(Y)]² = ∫₀¹ ∫₀¹ y²(x + y) dx dy − (7/12)² = 11/144

The covariance of X and Y can be obtained as

    Cov(X, Y) = E(XY) − E(X)·E(Y) = ∫₀¹ ∫₀¹ xy(x + y) dx dy − (7/12)² = −1/144

Using all the above calculated values, we can find the correlation between X and Y as

    r = (−1/144) / √[(11/144) · (11/144)] = −1/11
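These double integrals can be cross-checked numerically; the sketch below is our own cross-check, not part of the original text. It uses a midpoint rule on the unit square (the grid size and function names are our choices):

```python
def double_integral(h, n=200):
    """Midpoint-rule approximation of the integral of h(x, y)*f(x, y)
    over the unit square, with f(x, y) = x + y (the density of Example 3)."""
    f = lambda x, y: x + y
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        for j in range(n):
            y = (j + 0.5) * step
            total += h(x, y) * f(x, y)
    return total * step * step

ex  = double_integral(lambda x, y: x)      # E(X)   -> 7/12
ex2 = double_integral(lambda x, y: x * x)  # E(X^2)
exy = double_integral(lambda x, y: x * y)  # E(XY)
var_x = ex2 - ex ** 2                      # -> 11/144
cov   = exy - ex * ex                      # -> -1/144 (E(Y) = E(X) by symmetry)
r = cov / var_x                            # Var(Y) = Var(X), so r = cov/var -> -1/11
print(round(r, 4))
```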

Exercise 1:
1. What is the relationship between hours studying (X) and scores on a quiz (Y)? Plot a scatter diagram.

    STUDENT   HOURS   SCORE
    A         1       1
    B         1       3
    C         3       2
    D         4       5
    E         6       4
    F         7       5

2. An investigator is interested in predicting scores on a measure of cognitive function from the number of hours of sleep a person gets on average. She obtains scores for both the number of hours of sleep and cognitive function from 12 people. Higher scores for cognitive function reflect higher levels of performance.

    Hours of sleep:       5.4   6.1   7.4   8.5   6.4   7.6   8.1   9    9.4   8.7   9.3   9.2
    Cognitive function:   100   79    72    62    122   89    76    131  110   92    101   115

(i) Provide a scatter plot with the number of hours of sleep on the x-axis and scores for cognitive function on the y-axis.
(ii) What is the correlation coefficient between the number of hours of sleep and scores on the measure of cognitive function?


5. Regression:

The term regression was first used by the British biometrician Sir Francis Galton (1822-1911) in the latter part of the nineteenth century, in connection with the heights of parents and their offspring. Literally, the word regression means "stepping back". Nowadays, however, the term regression stands for some sort of functional relationship between two or more related variables.
If we have only two variables to study, then one of them is considered the independent variable and the other the dependent variable.

5.1. Line of Regression:

If we are given the joint distribution of two random variables X and Y, and X is known to take on the value x, the basic problem of bivariate regression is that of determining the conditional mean E(Y | x), or μ_{Y|x}, i.e. the "average" value of Y for the given value of X. In problems involving more than two random variables, i.e. multiple regression, we are concerned with quantities such as E(Z | x, y), the mean of Z for given values of X and Y, or E(X₄ | x₁, x₂, x₃), the mean of X₄ for given values of X₁, X₂ and X₃, and so on.

If f(x, y) is the value of the joint density of two random variables X and Y at (x, y), the problem of bivariate regression is simply that of determining the conditional density of Y given X = x and then evaluating the integral

    E(Y | x) = ∫ y · w(y | x) dy

The resulting equation is called the regression equation of Y on X. Alternatively, we can obtain the regression equation of X on Y as

    E(X | y) = ∫ x · f(x | y) dx

If the data is discrete in nature, then we use a probability distribution instead of a probability density, and the integrals in the two regression equations above are replaced by sums:

    E(Y | x) = Σ_y y · w(y | x),    E(X | y) = Σ_x x · f(x | y)

These are the regression equations of Y on X and of X on Y respectively for the discrete case.
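For the discrete case, the sum formula can be computed directly. A Python sketch with a small made-up joint pmf (the table is purely illustrative, not from the text):

```python
# Hypothetical joint pmf f(x, y), chosen only to illustrate E(Y|x) = sum_y y*w(y|x)
pmf = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20,
}

def cond_mean_Y_given_x(pmf, x):
    """E(Y | x): form the conditional pmf w(y|x) = f(x, y)/g(x), then sum y*w(y|x)."""
    g_x = sum(p for (xi, yi), p in pmf.items() if xi == x)  # marginal of X at x
    return sum(yi * p / g_x for (xi, yi), p in pmf.items() if xi == x)

print(cond_mean_Y_given_x(pmf, 0))  # (0*0.10 + 1*0.20 + 2*0.10) / 0.40 = 1.0
```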


Example 4: For the two random variables X and Y the joint density function is given by

    f(x, y) = x·e^{−x(1+y)},  for x > 0 and y > 0
            = 0,              otherwise                (1)

Find the regression equation of Y on X and draw the regression curve.

Solution: The joint density of (X, Y) is given by (1). We can obtain the marginal density of X by integrating eq. (1) w.r.t. y as follows:

    g(x) = e^{−x},  for x > 0
         = 0,       otherwise

The conditional density of Y given X = x is given by

    w(y | x) = f(x, y)/g(x) = x·e^{−x(1+y)} / e^{−x}
             = x·e^{−xy},  for y > 0
             = 0,          otherwise

This is an exponential density with mean θ = 1/x. Hence

    E(Y | x) = ∫₀^∞ y · x·e^{−xy} dy

Finally, using the property of the exponential distribution, we find the regression equation of Y on X as:

    E(Y | x) = 1/x

The regression curve can be shown graphically. [Figure: regression curve E(Y|x) = 1/x]
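Since Y | X = x was identified as an exponential distribution with mean 1/x, the result can be sanity-checked by simulation; this quick Monte Carlo check is our own addition (sample size and seed are arbitrary choices):

```python
import random

random.seed(0)

# For a fixed x, Y | X = x in Example 4 is exponential with rate x (mean 1/x).
# Simulate it and compare the sample mean with the regression value E(Y|x) = 1/x.
x = 2.0
samples = [random.expovariate(x) for _ in range(200_000)]
sample_mean = sum(samples) / len(samples)
print(round(sample_mean, 3))  # close to 1/x = 0.5
```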

Example 5: If X and Y have the multinomial distribution

    f(x, y) = [n! / (x! y! (n−x−y)!)] · θ₁^x · θ₂^y · (1 − θ₁ − θ₂)^{n−x−y}

for x = 0, 1, 2, …, n and y = 0, 1, 2, …, n with x + y ≤ n, find the regression equation of Y on X.

Solution: For the given joint distribution, the marginal distribution of X can be obtained as

    g(x) = Σ_{y=0}^{n−x} [n! / (x! y! (n−x−y)!)] · θ₁^x · θ₂^y · (1 − θ₁ − θ₂)^{n−x−y}
         = C(n, x) · θ₁^x · (1 − θ₁)^{n−x}

for x = 0, 1, 2, …, n, which is a binomial distribution with parameters n and θ₁. Therefore, the conditional distribution is

    w(y | x) = f(x, y)/g(x) = C(n−x, y) · θ₂^y · (1 − θ₁ − θ₂)^{n−x−y} / (1 − θ₁)^{n−x}

for y = 0, 1, 2, …, n−x, and rewriting the formula, we get

    w(y | x) = C(n−x, y) · [θ₂/(1 − θ₁)]^y · [(1 − θ₁ − θ₂)/(1 − θ₁)]^{n−x−y}

We can easily identify that the conditional distribution of Y given X = x is a binomial distribution with parameters n − x and θ₂/(1 − θ₁). Hence, the regression equation of Y on X can be written as:

    E(Y | x) = (n − x)·θ₂ / (1 − θ₁)

Illustration:
Let X be the number of times that an even number comes up in 30 rolls of an unbiased die and Y be the number of times that the result is a 5, so that n = 30, θ₁ = 1/2 and θ₂ = 1/6. Since there are 3 equally likely possibilities (1, 3 or 5) for each of the 30 − x outcomes that are not even, the regression equation can be obtained as

    E(Y | x) = (30 − x)·(1/6) / (1 − 1/2) = (30 − x)/3
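The illustration can be verified by computing the conditional mean directly from the Binomial(30 − x, 1/3) pmf; this check is our own addition:

```python
from math import comb

# Y | X = x is Binomial(30 - x, 1/3): each of the 30 - x odd outcomes is
# 1, 3 or 5 with equal probability. Check E(Y|x) = (30 - x)/3 from the pmf.
def cond_mean(x, n=30, p=1/3):
    m = n - x
    return sum(y * comb(m, y) * p**y * (1 - p)**(m - y) for y in range(m + 1))

print(cond_mean(12))  # (30 - 12)/3 = 6.0
```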
Example 6: If the joint density of X₁, X₂, X₃ is given by

    f(x₁, x₂, x₃) = (x₁ + x₂)·e^{−x₃},  for 0 < x₁ < 1, 0 < x₂ < 1, x₃ > 0
                  = 0,                  otherwise

find the regression equation of X₂ on X₁ and X₃.

Solution: The joint marginal density of X₁ and X₃ for the given distribution is given by

    m(x₁, x₃) = ∫₀¹ (x₁ + x₂)·e^{−x₃} dx₂ = (x₁ + 1/2)·e^{−x₃},  for 0 < x₁ < 1, x₃ > 0
              = 0,                                               otherwise

Therefore,

    E(X₂ | x₁, x₃) = ∫₀¹ x₂ · f(x₁, x₂, x₃)/m(x₁, x₃) dx₂
                   = ∫₀¹ x₂(x₁ + x₂)/(x₁ + 1/2) dx₂ = (x₁ + 2/3)/(2x₁ + 1)

We can see that the obtained conditional mean depends on x₁ only and not on x₃. This is possible because of the pairwise independence between X₂ and X₃.

6. Linear Regression: (Alternative Approach)

Let us take a bivariate distribution of continuous random variables (X, Y) given by the joint p.d.f. f(x, y). Then E(Y | x) as a function of x is defined as the regression of Y on X, and E(X | y) as a function of y is defined as the regression of X on Y.

Let us assume that both E(Y | x) and E(X | y) are linear, i.e.

    E(Y | x) = ∫ y·f(y | x) dy = α + βx,    (1)

and

    E(X | y) = ∫ x·f(x | y) dx = γ + δy.    (2)

Then, denoting f₁(x) as the marginal p.d.f. of X,

    ∫ f₁(x)·E(Y | x) dx = ∫ f₁(x)(α + βx) dx = α + β·E(X)

or

    ∫∫ y·f₁(x)·f(y | x) dy dx = α + β·E(X)

    ∫∫ y·f(x, y) dx dy = α + β·E(X)    (3)

    E(Y) = α + β·E(X)    (4)

which shows that the regression line of y on x passes through the means of X and Y, i.e. (E(X), E(Y)).

Similarly, by taking E(X | Y = y) = γ + δy, we have

    E(X) = γ + δ·E(Y)    (5)

Thus the regression line of x on y also passes through (E(X), E(Y)). Hence, the regression lines intersect at the point (E(X), E(Y)). Again, multiplying (1) by x·f₁(x) and integrating with respect to x, we have

    ∫ x·f₁(x)·E(Y | x) dx = ∫ x·f₁(x)(α + βx) dx

or

    ∫∫ xy·f₁(x)·f(y | x) dy dx = α ∫ x·f₁(x) dx + β ∫ x²·f₁(x) dx

or

    ∫∫ xy·f(x, y) dx dy = E(XY) = α·E(X) + β·E(X²).    (6)

Similarly, multiplying (2) by y·f₂(y) and integrating with respect to y, we have

    E(XY) = γ·E(Y) + δ·E(Y²)    (7)

Equations (4) and (6) are known as the normal equations for the regression of y on x. Solving them, we get

    β = [E(XY) − E(X)E(Y)] / [E(X²) − (E(X))²]    (8)

and α = E(Y) − β·E(X), where β is called the regression coefficient of y on x. Similarly, solving (5) and (7), we get

    δ = [E(XY) − E(X)E(Y)] / [E(Y²) − (E(Y))²]   and   γ = E(X) − δ·E(Y)    (9)

where δ is called the regression coefficient of x on y. Further,

    E(XY) − E(X)E(Y) = E[(X − E(X))(Y − E(Y))]    (10)

is called the covariance of X and Y and is denoted Cov(X, Y). Therefore,

    β = Cov(X, Y)/Var(X)   and   δ = Cov(X, Y)/Var(Y)    (11)

Let

    ρ = Cov(X, Y) / √(Var(X)·Var(Y)) = Cor(X, Y),   so that   β·δ = ρ².    (12)

Thus ρ is, up to sign, the geometric mean of the regression coefficients when both the regressions are linear, and is called the product moment correlation coefficient between X and Y. It is a measure of the strength of the linear relationship between two random variables X and Y, given that the regression of one of the variables on the other is linear.
From (12) and (8), (9) we can write β, δ, α and γ as

    β = ρ·σ_Y/σ_X   and   α = E(Y) − β·E(X),    (13)

    δ = ρ·σ_X/σ_Y   and   γ = E(X) − δ·E(Y)    (14)
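Equations (11) and (13) have exact sample analogues, and the property that the regression line passes through (E(X), E(Y)) carries over. A Python sketch with made-up data (the numbers are chosen only for illustration):

```python
# Sample analogue of equations (11) and (13): beta = Cov(X,Y)/Var(X),
# alpha = ybar - beta*xbar. The data below are made up purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
var_x = sum((x - mx) ** 2 for x in xs) / n

beta = cov / var_x
alpha = my - beta * mx
# The fitted line y = alpha + beta*x passes through the point of means (xbar, ybar).
print(alpha + beta * mx, my)
```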

Theorem 1: If the regression of Y on X is linear, then

    E(Y | x) = μ₂ + ρ·(σ₂/σ₁)·(x − μ₁)

and if the regression of X on Y is linear, then

    E(X | y) = μ₁ + ρ·(σ₁/σ₂)·(y − μ₂)

Proof: Since E(Y | x) = α + βx is linear,

    ∫ y·w(y | x) dy = α + βx

If we multiply both sides of this equation by g(x), the corresponding value of the marginal density of X, and integrate w.r.t. x, we get

    ∫∫ y·w(y | x)·g(x) dy dx = α ∫ g(x) dx + β ∫ x·g(x) dx

or

    μ₂ = α + β·μ₁

since w(y | x)·g(x) = f(x, y). If we multiply the equation for E(Y | x) on both sides by x·g(x) before integrating on x, we obtain

    ∫∫ xy·f(x, y) dy dx = α ∫ x·g(x) dx + β ∫ x²·g(x) dx

or

    E(XY) = α·μ₁ + β·E(X²)

Solving μ₂ = α + β·μ₁ and E(XY) = α·μ₁ + β·E(X²) for α and β, and making use of the facts that E(XY) = μ₁μ₂ + ρσ₁σ₂ and E(X²) = μ₁² + σ₁², we find that

    β = ρ·σ₂/σ₁   and   α = μ₂ − ρ·(σ₂/σ₁)·μ₁

Finally, we can write the linear regression equation of Y on X as

    E(Y | x) = μ₂ + ρ·(σ₂/σ₁)·(x − μ₁)

and, by similar steps, the linear regression equation of X on Y as

    E(X | y) = μ₁ + ρ·(σ₁/σ₂)·(y − μ₂)

Exercise 2:

1. Five children aged 2, 3, 5, 7 and 8 years old weigh 14, 20, 32, 42 and 44 kilograms respectively.

(i) Find the equation of the regression line of age on weight.
(ii) Based on this data, what is the approximate weight of a six year old child?

2. The heights (in centimeters) and weights (in kilograms) of 10 basketball players on a team are:

    Height (X): 186  189  190  192  193  193  198  201  203  205
    Weight (Y):  85   85   86   90   87   91   93  103  100  101

Calculate:
(i) The regression line of y on x.
(ii) The coefficient of correlation.
(iii) The estimated weight of a player who measures 208 cm.

3. A group of 50 individuals has been surveyed on the number of hours devoted each day to sleeping and watching TV. The responses are summarized in the following table:

    No. of sleeping hours (x):       6    7    8    9   10
    No. of hours of television (y):  4    3    3    2    1
    Absolute frequencies (fᵢ):       3   16   20   10    1

(i) Calculate the correlation coefficient.
(ii) Determine the equation of the regression line of y on x.
(iii) If a person sleeps eight hours, how many hours of TV are they expected to watch?

4. For the two random variables X and Y the joint density function is given by

    f(x, y) = x·e^{−x(1+y)},  for x > 0 and y > 0
            = 0,              otherwise

Show that the regression equation of X on Y is E(X | y) = 2/(1 + y). Also draw the regression curve.

5. For the following joint p.d.f.

    f(x, y) = (2/5)(2x + 3y),  for 0 < x < 1 and 0 < y < 1
            = 0,               otherwise

find E(Y | x) and E(X | y).

6. Given the joint p.d.f.

    f(x, y) = 6x,  for 0 < x < 1 and 0 < y < 1
            = 0,   otherwise

find E(Y | x) and E(X | y).

7. Given the joint p.d.f.

    f(x, y) = 2x/(1 + x + xy)³,  for x > 0 and y > 0
            = 0,                 otherwise

show that E(Y | x) = 1 + 1/x and that Var(Y | x) does not exist.

8. The values of the joint probability distribution of X and Y are as shown in the table:

             x = 0    x = 1    x = 2
    y = 0    1/12     1/6      1/24
    y = 1    1/4      1/4      1/40
    y = 2    1/8      1/20
    y = 3    1/120

(a) Find the conditional distribution of X given Y = 1 and then the regression value E(X | y = 1).
(b) Find the conditional distribution of Y given X = 0 and then the regression value E(Y | x = 0).

9. Given the joint density

    f(x, y) = 2,  for 0 < y < x < 1
            = 0,  otherwise

show that E(Y | x) = x/2 and E(X | y) = (1 + y)/2.

10. Given the joint density

    f(x, y) = 24xy,  for x > 0, y > 0 and x + y < 1
            = 0,     otherwise

show that E(Y | x) = (2/3)(1 − x) and E(X | y) = (2/3)(1 − y).

7. Moment Generating Function:

Probability or frequency distributions in statistics are tools which summarize the pattern followed by uncertain events or by the data in hand. The distributions encompass a wide range of values for different kinds of outcomes. The information stored in these distributions is further used to make estimations and predictions for events.
Any form of distribution is characterized by certain constants, which are usually the measures of location, dispersion, skewness and kurtosis. For a probability distribution, calculating these requires the expected values E(X), E(X²), E(X³), …, E(X^r), which are called moments. The i-th moment about the origin is defined as μᵢ' = E[Xⁱ], and a function that generates these moments is known as a moment generating function.

Definition: For a real-valued random variable X, the moment generating function (mgf) of X is a function M defined as M_X(t) = E(e^{tX}), where the dummy constant t ∈ ℝ. Therefore,

    M_X(t) = E[1 + tX + (tX)²/2! + (tX)³/3! + …]
Practically, in many situations we find variables occurring in pairs rather than singly. In such a scenario, we are looking at joint probability distributions, which in turn give rise to marginal and conditional distributions. For such distributions, we define the joint moment generating function as

    M_{X,Y}(t, s) = E(e^{tX+sY}) = E(e^{tX}·e^{sY})
                  = E{[1 + tX + (tX)²/2! + (tX)³/3! + …]·[1 + sY + (sY)²/2! + (sY)³/3! + …]}
                  = E[1 + sY + s²Y²/2! + … + tX + tsXY + ts²XY²/2! + …
                      + t²X²/2! + t²sX²Y/2! + t²s²X²Y²/(2!)² + …]

The following results hold true for the multivariate moment generating function:
1. M_{X,Y}(0, s) = E(e^{sY}) = M_Y(s)
2. M_{X,Y}(t, 0) = E(e^{tX}) = M_X(t)
3. X and Y are independent if and only if M_{X,Y}(t, s) = M_X(t)·M_Y(s)
4. ∂²M_{X,Y}(t, s)/∂t∂s evaluated at t = 0, s = 0 equals E[XY]
   Hint:
   ∂M_{X,Y}(t, s)/∂t = E[X + sXY + s²XY²/2! + … + 2tX²/2! + 2tsX²Y/2! + 2ts²X²Y²/(2!)² + …]
   ∂²M_{X,Y}(t, s)/∂t∂s = E[XY + 2sXY²/2! + … + 2tX²Y/2! + 4tsX²Y²/(2!)² + …]
   and setting t = s = 0 leaves only E[XY].
5. ∂M_{X,Y}(0, s)/∂s at s = 0 equals E[Y]
6. ∂M_{X,Y}(t, 0)/∂t at t = 0 equals E[X]
7. ∂²M_{X,Y}(0, s)/∂s² at s = 0 equals E[Y²]
8. ∂²M_{X,Y}(t, 0)/∂t² at t = 0 equals E[X²]
9. In general, ∂^{r+s} M_{X,Y}(t, s)/∂t^r ∂s^s at t = 0, s = 0 equals E[X^r Y^s]
10. M_{aX+bY}(t) = E(e^{(aX+bY)t}) = E(e^{(at)X+(bt)Y}) = M_{X,Y}(at, bt)
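Results 5-8 suggest a practical check: moments can be recovered from an MGF by numerical differentiation at zero. A Python sketch using M(t) = e^{t²/2}, the MGF of a standard normal (our illustrative choice, so E(X) = 0 and E(X²) = 1):

```python
from math import exp

# Recover E(X) and E(X^2) from an MGF by finite differences at t = 0.
M = lambda t: exp(t * t / 2)  # standard normal MGF
h = 1e-4

first_moment  = (M(h) - M(-h)) / (2 * h)          # central difference ~ M'(0)  = E(X)
second_moment = (M(h) - 2 * M(0) + M(-h)) / h**2  # second difference  ~ M''(0) = E(X^2)
print(round(first_moment, 6), round(second_moment, 6))
```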

8. Covariance:
Results 4, 5 and 6 can be used to calculate the covariance between a pair of variables for a bivariate distribution:

    Cov(X, Y) = E[XY] − E[X]·E[Y]
              = ∂²M_{X,Y}(t, s)/∂t∂s |_{t=0, s=0} − [∂M_{X,Y}(t, 0)/∂t |_{t=0}] · [∂M_{X,Y}(0, s)/∂s |_{s=0}]

Exercise 3:

1. Suppose that (X, Y) is uniformly distributed on the triangle T = {(x, y) ∈ ℝ²: 0 ≤ x ≤ y ≤ 1}. Find:

a. The joint moment generating function of (X, Y).

   Hint: M_{X,Y}(s, t) = (2/s)·[(e^{s+t} − 1)/(s + t) − (e^t − 1)/t]

b. The moment generating function of X.

   Hint: M_X(t) = 2(e^t − t − 1)/t²

c. The moment generating function of Y.

   Hint: M_Y(s) = 2(se^s − e^s + 1)/s²

Partial solution:

    f(x, y) = 2,  0 ≤ x ≤ y ≤ 1

    M_{X,Y}(s, t) = ∫₀¹ ∫₀^y 2e^{sx}e^{ty} dx dy = 2∫₀¹ e^{ty} [∫₀^y e^{sx} dx] dy = (2/s)∫₀¹ e^{ty}(e^{sy} − 1) dy
                  = (2/s)[∫₀¹ e^{(s+t)y} dy − ∫₀¹ e^{ty} dy] = (2/s)[(e^{s+t} − 1)/(s + t) − (e^t − 1)/t]

2. Suppose that (X, Y) has probability density function f(x, y) = x + y for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1. Find:

a. The joint moment generating function of (X, Y).

   Hint: M_{X,Y}(s, t) = [e^{s+t}(2st − s − t) + (e^s + e^t)(s + t − st) − (s + t)] / (s²t²)

b. The moment generating function of X.

   Hint: M_X(t) = (3te^t − 2e^t − t + 2)/(2t²)

c. The moment generating function of Y.

   Hint: M_Y(s) = (3se^s − 2e^s − s + 2)/(2s²)

3. (a) If (U, V) ~ BVN(0, 0, 1, 1, ρ), then using the MGF, show that the correlation between U and V is given by ρ.
(b) If (X, Y) ~ BVN(μ_x, μ_y, σ_x², σ_y², ρ), then using the MGF, show that the correlation between X and Y is given by ρ.
(c) If (U, V) ~ BVN(0, 0, 1, 1, ρ), find the distribution of the linear combination lU + mV using the MGF.

4. A production firm sells a product A, which comprises three components X, Y, Z. The costs incurred in manufacturing each of these components obey the following moment generating functions:

    M_X(t) = (1 − 2t)^{−3},   M_Y(t) = (1 − 2t)^{−2.5},   M_Z(t) = (1 − 2t)^{−4.5}

Find the moment generating function of the product A and hence obtain E(A), E(A²) and E(A³).

5. X and Y are independent random variables with common moment generating function M(t) = e^{t²/2}. Let U = X + Y and V = Y − X. Then find the joint moment generating function of U and V, and hence find the correlation between them.

6. X and Y are independent Gamma distributed variables with density

    f(t) = (1/Γ(α)) e^{−t} t^{α−1},  t > 0

Find the joint moment generating function of U = X/Y and V = X + Y.

7. Suppose X and Y are jointly distributed random variables with moment generating function

    M_{X,Y}(t, s) = [(2e^t + 3e^s + 3)/8]^{10}

Find the correlation between X and Y.

8. Suppose X and Y are jointly distributed random variables with p.d.f. given by

    f(x, y) = 2 − x − y,  0 ≤ x ≤ 1 and 0 ≤ y ≤ 1

Find the joint moment generating function of X and Y and hence find Cov(X, Y).
Summary:

In this chapter we have emphasized the following:

 Correlation
 Types of Correlation
 Measurement of Correlation
 Line of Best Fit
 Correlation Coefficient
 Assumptions for Correlation Coefficient
 Properties of Correlation Coefficient
 Regression
 Line of Regression
 Linear Regression
 Moment Generating Function
 Covariance

References:

 Robert V. Hogg, Joseph W. McKean and Allen T. Craig: Introduction to Mathematical Statistics, Pearson Education, Asia, 2007.
 Irwin Miller and Marylees Miller: John E. Freund's Mathematical Statistics with Applications (7th Edition), Pearson Education, Asia, 2006.
 Sheldon Ross: Introduction to Probability Models (9th Edition), Academic Press, Indian Reprint, 2007.
 Biswas, Suddhendu and Srivastava, G.L.: Mathematical Statistics: A Textbook, 2011.
 Sharma, H.S., Chaudhary, S.S. and Gupta, Madhu: Descriptive Statistics, Students Friends & Company, 2002.

Web Links:

 www.users.miamioh.edu/claypohm/.../Chapter%209%20correlation.doc
 http://www.vitutor.com/statistics/regression/problems_regression.html
 http://www.mathsisfun.com/data/scatter-xy-plots.html
 http://www.emathzone.com/tutorials/basic-statistics/examples-of-correlation.html

Suggested Reading:

 Alexander M. Mood, Franklin A. Graybill and Duane C. Boes: Introduction to the Theory of Statistics (3rd Edition), Tata McGraw-Hill, Reprint 2007.
 Gupta, S.C. and Kapoor, V.K.: Fundamentals of Mathematical Statistics, Eleventh Edition (Reprint), 2014.
 Gun, A.M., Gupta, M.K., Dasgupta, B.: Fundamentals of Statistics, Vol. 1, The World Press Private Ltd, 2002.
 Arora and Bansi Lal: New Mathematical Statistics, Satya Prakashan, 1997.
