Correlation and Linear Regression
Table of Contents
1. Learning outcomes:
Correlation
Types of Correlation
Measurement of Correlation
Line of Best Fit
Correlation Coefficient
Assumptions for Correlation Coefficient
Properties of Correlation Coefficient
Regression
Line of Regression
Linear Regression
Moment Generating Function
Covariance
2. Introduction:
3. Correlation:
In a bivariate distribution, if a change in one variable appears to be
accompanied by a change in the other variable and vice versa, the two
variables are said to be correlated, and this relationship is called
correlation.
In other words, the tendency of the two variables to vary simultaneously
is called covariation.
When an increase (or decrease) in one variable produces no effect on the
other variable, the correlation is said to be zero or no correlation. For
example, the heights of students and the marks they obtain in a particular
subject have "zero correlation" or "no correlation".
Correlation is generally denoted by ρ (rho).
Merits:
The following are the merits of the scatter diagram:
1) The scatter diagram gives an idea at a glance about the existence or
absence of a relationship between two variables.
2) It also exhibits the type of correlation.
3) It clearly indicates the presence of perfect positive or perfect
negative correlation.
Demerits:
1) It does not give a definite measure of the degree of correlation.
2) When there are only a few observations, the use of the scatter diagram
is limited.
3) When the variation is only nominal, it fails to ascertain whether the
correlation is perfectly positive or negative.
Example 1: The local ice cream shop keeps track of how much ice cream
they sell versus the noon temperature on that day. Here are their figures
for the last 11 days:

Temperature (°C)   Ice Cream Sales
16.4°              $325
11.9°              $185
15.2°              $332
18.5°              $406
22.1°              $522
19.4°              $412
25.1°              $614
23.4°              $544
18.1°              $421
22.6°              $445
17.2°              $408
It is now easy to see that warmer weather leads to more sales, but
the relationship is not perfect.
We can also draw a "Line of Best Fit" (also called a "Trend Line") on our
scatter plot:
Try to have the line as close as possible to all points, and as many points
above the line as below.
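Drawing the trend line by eye, as above, is informal; the usual formal criterion is least squares. A minimal stdlib-only sketch, fitting a least-squares line to the ice-cream data of Example 1:

```python
# Least-squares "line of best fit" for the ice-cream data of Example 1
# (noon temperature in deg C vs. sales in $).

temps = [16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]

n = len(temps)
mean_x = sum(temps) / n
mean_y = sum(sales) / n

# slope b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2); intercept a = y_bar - b*x_bar
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, sales))
sxx = sum((x - mean_x) ** 2 for x in temps)
b = sxy / sxx
a = mean_y - b * mean_x

print(f"Trend line: sales ~ {a:.1f} + {b:.1f} * temperature")
```

The positive slope confirms what the scatter plot shows: warmer weather goes with higher sales.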
4. Correlation Coefficient:
To determine the intensity or degree of linear correlation between two
variables, Karl Pearson, the great British statistician, defined a
numerical measure called the Karl Pearson correlation coefficient, or
simply the correlation coefficient. It is generally denoted by the
symbol "r" and given by
Var(X) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad Var(Y) = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2, \quad \text{and}

COV(X, Y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})

r = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}\;\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}

After simplification, we may write

r = \frac{n\sum_{i=1}^{n}x_i y_i - \sum_{i=1}^{n}x_i \sum_{i=1}^{n}y_i}{\sqrt{n\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2}\;\sqrt{n\sum_{i=1}^{n}y_i^2 - \left(\sum_{i=1}^{n}y_i\right)^2}}
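The definition of r and the simplified computational formula are algebraically identical, which can be verified numerically (a stdlib-only sketch; the data set is arbitrary, chosen only for illustration):

```python
import math

# Compare the definitional and computational formulas for Pearson's r
# on a small illustrative data set.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
n = len(x)

mx, my = sum(x) / n, sum(y) / n

# Definition: r = Cov(X, Y) / sqrt(Var(X) * Var(Y))
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
var_x = sum((a - mx) ** 2 for a in x) / n
var_y = sum((b - my) ** 2 for b in y) / n
r_def = cov / math.sqrt(var_x * var_y)

# Computational formula built from raw sums
sxy, sx, sy = sum(a * b for a, b in zip(x, y)), sum(x), sum(y)
sx2, sy2 = sum(a * a for a in x), sum(b * b for b in y)
r_comp = (n * sxy - sx * sy) / math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

print(r_def, r_comp)  # the two values agree
```

The computational form avoids computing the deviations from the means explicitly, which is convenient for hand calculation.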
4.1. Assumptions:
The correlation coefficient is based on the following assumptions:
1) In each of the correlated bivariate populations, a large number of
independent causes are operating so as to produce a normal
distribution.
2) The forces so operating are related in a causal way.
3) The relationship between the two variables is linear.
4.2. Properties:
1) It is rigidly defined.
2) The linear correlation coefficient r_{xy} is a pure number (or ratio)
and thus has no unit of measurement. By symmetry it is clear that
r_{xy} = r_{yx}.
3) It is based on all the observations.
4) The correlation coefficient is independent of change of origin and
change of scale.
5) It ranges from -1 to +1:
r = -1 when there is a perfect negative correlation.
r = 0 when there is no linear correlation.
r = +1 when there is a perfect positive correlation.
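Property 4 (invariance under change of origin and scale) is easy to check numerically. A small sketch, using arbitrary illustrative data and positive scale factors (a negative scale factor would flip the sign of r):

```python
import math

def pearson_r(x, y):
    # sample correlation: r = sum((x - x_bar)(y - y_bar)) / sqrt(sum sq * sum sq)
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

x = [2, 4, 6, 8, 10]
y = [3, 7, 4, 9, 12]   # arbitrary illustrative values

r = pearson_r(x, y)
# change of origin and scale: u = (x - 5)/2, v = (y - 1)/3
u = [(a - 5) / 2 for a in x]
v = [(b - 1) / 3 for b in y]
print(r, pearson_r(u, v))  # identical
```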
Example 2: The numbers of study hours (X) and sleeping hours (Y) of five
students are given below. Find the correlation coefficient between X and Y.

X    Y    x_i - x̄   y_i - ȳ   (x_i - x̄)(y_i - ȳ)   (x_i - x̄)²   (y_i - ȳ)²
2    10   -4         2          -8                     16            4
4    9    -2         1          -2                     4             1
6    8     0         0           0                     0             0
8    7     2        -1          -2                     4             1
10   6     4        -2          -8                     16            4
Σ=30 Σ=40  0         0          Σ=-20                  Σ=40          Σ=10

\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{30}{5} = 6, \qquad \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{40}{5} = 8

and

COV(X, Y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \frac{-20}{5} = -4

r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}} = \frac{-20}{\sqrt{40 \times 10}} = -1
There is perfect negative correlation between the number of study hours
and the number of sleeping hours.
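The hand computation above can be replayed in a few lines (a stdlib-only sketch using the data of Example 2):

```python
import math

# Data from Example 2: X = study hours, Y = sleeping hours
x = [2, 4, 6, 8, 10]
y = [10, 9, 8, 7, 6]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n          # = -4
var_x = sum((a - mx) ** 2 for a in x) / n                        # = 8
var_y = sum((b - my) ** 2 for b in y) / n                        # = 2
r = cov / math.sqrt(var_x * var_y)
print(r)  # -1.0, a perfect negative correlation
```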
Example 3: Let the random variables X and Y have the joint p.d.f.

f(x, y) = x + y, \quad \text{for } 0 < x < 1 \text{ and } 0 < y < 1; \qquad 0, \text{ otherwise.}

From the marginal densities, E(X) = E(Y) = \frac{7}{12} and Var(X) = Var(Y) = \frac{11}{144}. Also,

E(XY) = \int_0^1\!\!\int_0^1 xy(x + y)\,dx\,dy = \frac{1}{3}, \qquad COV(X, Y) = \frac{1}{3} - \left(\frac{7}{12}\right)^2 = -\frac{1}{144}

Using all the above calculated values, we can find the correlation between
X and Y as:

r = \frac{-\frac{1}{144}}{\sqrt{\frac{11}{144} \cdot \frac{11}{144}}} = -\frac{1}{11}
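The value r = -1/11 can be spot-checked by computing the moments of f(x, y) = x + y with a simple midpoint-rule double integral (a numerical sketch, stdlib only; by the symmetry of f, E(Y) = E(X) and Var(Y) = Var(X)):

```python
# Midpoint-rule check of Example 3: f(x, y) = x + y on the unit square.
n = 400
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]

e_x = e_x2 = e_xy = 0.0
for x in pts:
    for y in pts:
        f = (x + y) * h * h          # density times cell area
        e_x += x * f                 # accumulates E(X)   -> 7/12
        e_x2 += x * x * f            # accumulates E(X^2) -> 5/12
        e_xy += x * y * f            # accumulates E(XY)  -> 1/3

var_x = e_x2 - e_x ** 2              # -> 11/144; Var(Y) is the same by symmetry
cov = e_xy - e_x * e_x               # E(Y) = E(X) by symmetry
r = cov / var_x
print(r)  # close to -1/11
```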
Exercises 1:
1. What is the relationship between hours studying (X) and scores on a
quiz (Y)? Plot the scatter diagram.

Student   X (hours studying)   Y (quiz score)
A         1                    1
B         1                    3
C         3                    2
D         4                    5
E         6                    4
F         7                    5
2. Plot the scatter diagram for the hours of sleep and the
cognitive-function scores of 12 subjects:

Hours of Sleep:       5.4  6.1  7.4  8.5  6.4  7.6  8.1  9.0  9.4  8.7  9.3  9.2
Cognitive Function:   100  79   72   62   122  89   76   131  110  92   101  115
5. Regression:
The regression curve of Y on X is given by the conditional expectation of
Y given X = x, and similarly for X on Y:

E(Y/x) = \sum_{y} y \cdot w(y/x), \qquad E(X/y) = \sum_{x} x \cdot f(x/y)
Example 4: For the two random variables X and Y, the joint density
function is given by eq. (1).
Solution: The joint density of (X, Y) is given by (1). We can obtain the
marginal density of X by integrating eq. (1) w.r.t. y as follows:

g(x) = e^{-x}, \quad \text{for } x > 0; \qquad 0, \text{ otherwise.}

[Figure: regression curve E(Y/x) = 1/x, plotted for 0 < x < 6]
Example 5: A balanced die is rolled n = 30 times. Let X denote the number
of even outcomes and Y the number of 1s. Find the regression equation of
Y on X.
Solution: The joint distribution of X and Y is trinomial with
\theta_1 = \frac{1}{2} and \theta_2 = \frac{1}{6}:

f(x, y) = \binom{n}{x,\, y,\, n-x-y}\, \theta_1^{x}\, \theta_2^{y}\, (1 - \theta_1 - \theta_2)^{n-x-y}

Summing over y gives the marginal distribution of X:

g(x) = \sum_{y=0}^{n-x} \binom{n}{x,\, y,\, n-x-y}\, \theta_1^{x}\, \theta_2^{y}\, (1 - \theta_1 - \theta_2)^{n-x-y} = \binom{n}{x}\, \theta_1^{x}\, (1 - \theta_1)^{n-x}

Since there are 3 equally likely possibilities (1, 3 or 5) for each of the
30 - x outcomes that are not even, the regression equation can be obtained
as

E(Y/x) = (30 - x) \cdot \frac{1/6}{1 - \frac{1}{2}} = \frac{1}{3}(30 - x)
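The regression equation can be checked directly against the trinomial joint distribution, without using the shortcut argument (a sketch; the parameters n = 30, θ1 = 1/2, θ2 = 1/6 are those of the example above):

```python
from math import comb

# Check E(Y | x) = (30 - x)/3 for the trinomial of Example 5.
n, t1, t2 = 30, 1 / 2, 1 / 6

def joint(x, y):
    # trinomial pmf: n!/(x! y! z!) * t1^x * t2^y * t3^z with z = n - x - y
    z = n - x - y
    coef = comb(n, x) * comb(n - x, y)       # equals the multinomial coefficient
    return coef * t1 ** x * t2 ** y * (1 - t1 - t2) ** z

for x in (0, 10, 21):
    g = sum(joint(x, y) for y in range(n - x + 1))                   # marginal of X
    e_y = sum(y * joint(x, y) for y in range(n - x + 1)) / g         # E(Y | x)
    print(x, e_y, (n - x) / 3)  # the last two columns agree
```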
3
Example 6: If the joint density of X1, X2, X3 is given by

f(x_1, x_2, x_3) = (x_1 + x_2)\, e^{-x_3}, \quad \text{for } 0 < x_1 < 1,\ 0 < x_2 < 1,\ x_3 > 0; \qquad 0, \text{ otherwise,}

find the regression equation of X2 on X1 and X3.
Solution: The joint marginal density of X1 and X3 for the given
distribution is given by

m(x_1, x_3) = \int_0^1 (x_1 + x_2)\, e^{-x_3}\, dx_2 = \left(x_1 + \frac{1}{2}\right) e^{-x_3}, \quad \text{for } 0 < x_1 < 1,\ x_3 > 0; \qquad 0, \text{ otherwise.}

Therefore,

E(X_2 / x_1, x_3) = \int_0^1 x_2\, \frac{f(x_1, x_2, x_3)}{m(x_1, x_3)}\, dx_2 = \int_0^1 \frac{x_2 (x_1 + x_2)}{x_1 + \frac{1}{2}}\, dx_2 = \frac{3x_1 + 2}{3(2x_1 + 1)}

We can see that the obtained conditional expectation depends on x_1 only,
not on x_3. This happens because the joint density factors into
(x_1 + x_2) and e^{-x_3}, so X3 is independent of (X1, X2).
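A numerical spot-check of Example 6 (a midpoint-rule sketch, stdlib only): since e^{-x_3} cancels in the ratio, only the integrals over x_2 matter.

```python
# Check E(X2 | x1, x3) = (3*x1 + 2) / (3*(2*x1 + 1)) from Example 6.

def cond_mean(x1, n=2000):
    # midpoint rule for ∫ x2 (x1 + x2) dx2 / ∫ (x1 + x2) dx2 over [0, 1]
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        x2 = (i + 0.5) * h
        w = (x1 + x2) * h          # unnormalised conditional weight of x2
        num += x2 * w
        den += w
    return num / den

for x1 in (0.0, 0.5, 1.0):
    print(x1, cond_mean(x1), (3 * x1 + 2) / (3 * (2 * x1 + 1)))
```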
6. Linear Regression:
Let us assume that both E(Y | x) and E(X | y) are linear, i.e.

E(Y | x) = \int y\, f(y|x)\, dy = \alpha + \beta x, \quad (1)

and

E(X | y) = \int x\, f(x|y)\, dx = \alpha' + \beta' y. \quad (2)

Multiplying (1) by f_1(x), the marginal density of X, and integrating
over x gives

\int f_1(x)\, E(Y|x)\, dx = \int f_1(x)(\alpha + \beta x)\, dx = \alpha + \beta E(X)

or, since f_1(x)\, f(y|x) = f(x, y),

\iint y\, f(x, y)\, dx\, dy = \alpha + \beta E(X), \quad \text{i.e.} \quad \mu_2 = \alpha + \beta\mu_1, \quad (3)

which shows that the regression line of Y on x passes through the means
of X and Y, i.e. E(X) and E(Y).
Similarly, multiplying (1) by x\, f_1(x) and integrating gives

\int x\, f_1(x)\, E(Y|x)\, dx = \int x\, f_1(x)(\alpha + \beta x)\, dx = \alpha \int x\, f_1(x)\, dx + \beta \int x^2 f_1(x)\, dx

or

\iint xy\, f(x, y)\, dx\, dy = \alpha E(X) + \beta E(X^2), \quad \text{i.e.} \quad E(XY) = \alpha\mu_1 + \beta E(X^2). \quad (6)

Solving \mu_2 = \alpha + \beta\mu_1 and E(XY) = \alpha\mu_1 + \beta E(X^2)
for \alpha and \beta, and making use of the facts that
E(XY) = \sigma_{12} + \mu_1\mu_2 and E(X^2) = \sigma_1^2 + \mu_1^2, we
find that

\beta = \frac{\sigma_{12}}{\sigma_1^2} = \rho\,\frac{\sigma_2}{\sigma_1} \qquad \text{and} \qquad \alpha = \mu_2 - \rho\,\frac{\sigma_2}{\sigma_1}\,\mu_1

Finally, we can write the linear regression equation of Y on X as

E(Y/x) = \mu_2 + \rho\,\frac{\sigma_2}{\sigma_1}\,(x - \mu_1)

and the linear regression equation of X on Y, using similar steps, as

E(X/y) = \mu_1 + \rho\,\frac{\sigma_1}{\sigma_2}\,(y - \mu_2)
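Estimating the moments from a sample and plugging them into the regression equation of Y on X gives exactly the least-squares line, since the slope ρσ2/σ1 equals Cov(X, Y)/Var(X). A small sketch with arbitrary illustrative data:

```python
import math

# Regression line E(Y|x) = mu2 + rho*(s2/s1)*(x - mu1), estimated from sample moments.
x = [1, 2, 4, 5, 7]
y = [3, 4, 8, 9, 12]   # arbitrary illustrative values

n = len(x)
m1, m2 = sum(x) / n, sum(y) / n
s1 = math.sqrt(sum((xi - m1) ** 2 for xi in x) / n)
s2 = math.sqrt(sum((yi - m2) ** 2 for yi in y) / n)
rho = sum((xi - m1) * (yi - m2) for xi, yi in zip(x, y)) / (n * s1 * s2)

b = rho * s2 / s1          # slope of the Y-on-X regression line
a = m2 - b * m1            # intercept: the line passes through (m1, m2)
print(f"E(Y|x) ~ {a:.3f} + {b:.3f} x")
```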
Exercises 2:
1. Five children aged 2, 3, 5, 7 and 8 years old weigh 14, 20, 32, 42
and 44 kilograms respectively.
Calculate:
(i) The regression line of y on x.
4. For the two random variables X and Y the joint density function is
given by

f(x, y) = \frac{2x}{(1 + x + xy)^3}, \quad \text{for } x > 0 \text{ and } y > 0; \qquad 0, \text{ otherwise.}

Show that E(Y/x) = 1 + \frac{1}{x} and that Var(Y/x) does not exist.
Given the joint probability distribution of X and Y:

         x = 0    x = 1    x = 2
y = 0    1/12     1/6      1/24
y = 1    1/4      1/4      1/40
y = 2    1/8      1/20     0
y = 3    1/120    0        0

(a) Find the conditional distribution of X given Y = 1 and then the
regression line E(X/1).
(b) Find the conditional distribution of Y given X = 0 and then the
regression line E(Y/0).
Show that E(Y/x) = \frac{x}{2} and E(X/y) = \frac{1 + y}{2}.
10. Given the joint density

f(x, y) = 24xy, \quad \text{for } x > 0,\ y > 0 \text{ and } x + y < 1; \qquad 0, \text{ otherwise,}

show that E(Y/x) = \frac{2}{3}(1 - x) and E(X/y) = \frac{2}{3}(1 - y).
7. Moment Generating Function:
For a single random variable X, the moment generating function is

M_X(t) = E(e^{tX}) = E\left[1 + tX + \frac{(tX)^2}{2!} + \frac{(tX)^3}{3!} + \cdots\right]
Practically, in many situations we find variables occurring in pairs rather
than singly. In such a scenario, we are looking at joint probability
distributions which further give way to marginal and conditional
distributions. For such distributions, we define the moment generating
functions as
M_{X,Y}(t, s) = E(e^{tX + sY}) = E(e^{tX} e^{sY})

= E\left[\left(1 + tX + \frac{(tX)^2}{2!} + \frac{(tX)^3}{3!} + \cdots\right)\left(1 + sY + \frac{(sY)^2}{2!} + \frac{(sY)^3}{3!} + \cdots\right)\right]

= E\left[1 + tX + sY + tsXY + \frac{t^2X^2}{2!} + \frac{s^2Y^2}{2!} + \frac{ts^2XY^2}{2!} + \frac{t^2sX^2Y}{2!} + \frac{t^2s^2X^2Y^2}{2!\,2!} + \cdots\right]
The following results hold true for the multivariate moment generating
function:

1. M_{X,Y}(0, s) = E(e^{sY}) = M_Y(s)

2. M_{X,Y}(t, 0) = E(e^{tX}) = M_X(t)

3. X and Y are independent if and only if M_{X,Y}(t, s) = M_X(t)\, M_Y(s)

4. E(XY) = \left.\dfrac{\partial^2 M_{X,Y}(t, s)}{\partial t\, \partial s}\right|_{t=0,\, s=0}

Hint: differentiating the series expansion term by term,

\frac{\partial M_{X,Y}(t, s)}{\partial t} = E\left[X + sXY + \frac{s^2XY^2}{2!} + \frac{2tX^2}{2!} + \frac{2tsX^2Y}{2!} + \frac{2ts^2X^2Y^2}{2!\,2!} + \cdots\right]

\frac{\partial^2 M_{X,Y}(t, s)}{\partial t\, \partial s} = E\left[XY + \frac{2sXY^2}{2!} + \frac{2tX^2Y}{2!} + \frac{4tsX^2Y^2}{2!\,2!} + \cdots\right]

which reduces to E(XY) at t = 0, s = 0.

5. E(Y) = \left.\dfrac{\partial M_{X,Y}(0, s)}{\partial s}\right|_{s=0}

6. E(X) = \left.\dfrac{\partial M_{X,Y}(t, 0)}{\partial t}\right|_{t=0}

7. E(Y^2) = \left.\dfrac{\partial^2 M_{X,Y}(0, s)}{\partial s^2}\right|_{s=0}

8. E(X^2) = \left.\dfrac{\partial^2 M_{X,Y}(t, 0)}{\partial t^2}\right|_{t=0}

9. In general, \left.\dfrac{\partial^{r+s} M_{X,Y}(t, s)}{\partial t^r\, \partial s^s}\right|_{t=0,\, s=0} = E(X^r Y^s)
8. Covariance:
Results 4, 5 and 6 can be used to calculate the covariance between a
pair of variables of a bivariate distribution as:

COV(X, Y) = E(XY) - E(X)\, E(Y) = \left.\frac{\partial^2 M_{X,Y}(t, s)}{\partial t\, \partial s}\right|_{t=0,\, s=0} - \left.\frac{\partial M_{X,Y}(t, 0)}{\partial t}\right|_{t=0} \cdot \left.\frac{\partial M_{X,Y}(0, s)}{\partial s}\right|_{s=0}
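Results 4-6 and the covariance formula can be sanity-checked numerically for a small discrete pair (X, Y) whose probabilities below are arbitrary illustrative values, approximating the partial derivatives of the MGF by central finite differences:

```python
import math

# Toy discrete distribution: ((x, y), probability)
pts = [((0, 0), 0.1), ((1, 0), 0.2), ((0, 1), 0.3), ((1, 2), 0.4)]

def M(t, s):
    # joint MGF: E[exp(tX + sY)]
    return sum(p * math.exp(t * x + s * y) for (x, y), p in pts)

h = 1e-4
# mixed second partial at (0, 0) ~ E(XY); first partials ~ E(X), E(Y)
e_xy = (M(h, h) - M(h, -h) - M(-h, h) + M(-h, -h)) / (4 * h * h)
e_x = (M(h, 0) - M(-h, 0)) / (2 * h)
e_y = (M(0, h) - M(0, -h)) / (2 * h)
cov = e_xy - e_x * e_y

# direct computation for comparison
E_xy = sum(p * x * y for (x, y), p in pts)
E_x = sum(p * x for (x, y), p in pts)
E_y = sum(p * y for (x, y), p in pts)
print(cov, E_xy - E_x * E_y)  # the two values agree closely
```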
Exercise 3:

Hint: M_{x,y}(s, t) = \frac{2(e^{s+t} - 1)}{s(s + t)} - \frac{2(e^t - 1)}{st}

Hint: M_x(t) = \frac{2(e^t - t - 1)}{t^2}

Hint: M_y(s) = \frac{2(se^s - e^s + 1)}{s^2}
Partial Solution:
f(x, y) = 2, for 0 ≤ x ≤ y ≤ 1

M_{x,y}(s, t) = \int_0^1\!\!\int_0^y 2e^{sx} e^{ty}\, dx\, dy = 2\int_0^1 e^{ty}\left[\int_0^y e^{sx}\, dx\right] dy = \frac{2}{s}\int_0^1 e^{ty}\left(e^{sy} - 1\right) dy

= \frac{2}{s}\left[\int_0^1 e^{(s+t)y}\, dy - \int_0^1 e^{ty}\, dy\right] = \frac{2}{s}\left[\frac{e^{s+t} - 1}{s + t} - \frac{e^t - 1}{t}\right]
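The closed form derived in the partial solution can be spot-checked by integrating 2e^{sx+ty} numerically over the triangle 0 ≤ x ≤ y ≤ 1 (a sketch at one arbitrary point (s, t) = (0.5, 0.3); the midpoint grid only approximates the triangle's boundary, hence the loose tolerance):

```python
import math

s, t = 0.5, 0.3   # arbitrary evaluation point

closed = (2 / s) * ((math.exp(s + t) - 1) / (s + t) - (math.exp(t) - 1) / t)

# midpoint-rule double integral of 2*exp(s*x + t*y) over {0 <= x <= y <= 1}
n = 800
h = 1.0 / n
num = 0.0
for i in range(n):
    y = (i + 0.5) * h
    for j in range(n):
        x = (j + 0.5) * h
        if x <= y:
            num += 2 * math.exp(s * x + t * y) * h * h

print(closed, num)  # agree to about 3 decimal places
```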
Hint: M_{x,y}(s, t) = \frac{e^{s+t}(2st - s - t) - e^s(st - s - t) - e^t(st - s - t) - (s + t)}{s^2 t^2}

Hint: M_x(t) = \frac{3te^t - 2e^t - t + 2}{2t^2}

Hint: M_y(s) = \frac{3se^s - 2e^s - s + 2}{2s^2}
3: (a) If (U, V) ~ BVN(0, 0, 1, 1, ρ), then using the MGF, show that the
correlation between U and V is ρ.
(b) If (X, Y) ~ BVN(μx, μy, σx², σy², ρ), then using the MGF, show that
the correlation between X and Y is ρ.
(c) If (U, V) ~ BVN(0, 0, 1, 1, ρ), find the distribution of the linear
combination lU + mV using the MGF.
Given the joint moment generating function

M_{x,y}(t, s) = \frac{(2e^t + 3)(e^s + 1)}{10}