Prob Stats Module 3
S. Devi Yamini
Module - III 1 / 19
Correlation
Correlation deals with the measure of strength of the linear relationship
between variables.
Scatter Plot
Properties
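1. $-1 \le r_{XY} \le 1$
   If $r_{XY} = -1$: perfect negative correlation
   If $r_{XY} = 1$: perfect positive correlation
   If $r_{XY} = 0$: uncorrelated (no linear relationship between X and Y)
2. The correlation coefficient is unchanged by a change of origin and scale:
   $$r_{XY} = r_{UV} = \frac{N\sum UV - \sum U \sum V}{\sqrt{\left(N\sum U^2 - \left[\sum U\right]^2\right)\left(N\sum V^2 - \left[\sum V\right]^2\right)}}$$
   where $U = \frac{X - a}{h}$ and $V = \frac{Y - b}{k}$, with h and k of the same sign.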
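The invariance of r under a change of origin and scale can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the helper name `pearson_r` and the choices a = 68, h = 2, b = 69, k = 5 are illustrative assumptions, and the sample data are the father and son heights from Problem 1 of this module.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient from raw sums:
    (N*Sxy - Sx*Sy) / sqrt((N*Sxx - Sx^2) * (N*Syy - Sy^2))."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

x = [65, 66, 67, 67, 68, 69, 70, 72]
y = [67, 68, 65, 68, 72, 72, 69, 71]

# Illustrative (assumed) origin and scale; h and k must share a sign.
a, h = 68, 2
b, k = 69, 5
u = [(xi - a) / h for xi in x]
v = [(yi - b) / k for yi in y]

r_xy = pearson_r(x, y)
r_uv = pearson_r(u, v)
print(round(r_xy, 3), round(r_uv, 3))  # both approximately 0.603
```

Choosing a and b near the means keeps the sums small, which is the practical reason for the substitution when working by hand.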
Regression
Regression line
The line that gives the best estimate of the value of one variable for any
specific value of the other variable.
Regression Line
Lines of regression
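Regression line of y on x:
$$y - \bar{y} = r_{xy}\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$
Regression line of x on y:
$$x - \bar{x} = r_{xy}\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$
Regression coefficients:
$$b_{yx} = r_{xy}\,\frac{\sigma_y}{\sigma_x}, \qquad b_{xy} = r_{xy}\,\frac{\sigma_x}{\sigma_y}$$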
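A short worked derivation may help connect the regression coefficients to the covariance. It assumes the standard definition $r_{xy} = \operatorname{Cov}(x, y) / (\sigma_x \sigma_y)$, which is not stated in the recovered slide text.

```latex
\begin{aligned}
b_{yx} &= r_{xy}\,\frac{\sigma_y}{\sigma_x}
        = \frac{\operatorname{Cov}(x,y)}{\sigma_x\,\sigma_y}\cdot\frac{\sigma_y}{\sigma_x}
        = \frac{\operatorname{Cov}(x,y)}{\sigma_x^{2}}, \\
b_{xy} &= r_{xy}\,\frac{\sigma_x}{\sigma_y}
        = \frac{\operatorname{Cov}(x,y)}{\sigma_y^{2}}, \\
b_{xy}\,b_{yx} &= \frac{\operatorname{Cov}(x,y)^{2}}{\sigma_x^{2}\,\sigma_y^{2}} = r_{xy}^{2}.
\end{aligned}
```

Since both coefficients are the covariance divided by a positive variance, they necessarily share the sign of $r_{xy}$.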
Properties
1. $r_{xy}^2 = b_{xy}\, b_{yx}$
2. $r_{xy}$, $b_{xy}$ and $b_{yx}$ all have the same sign.
3. Both lines of regression pass through $(\bar{X}, \bar{Y})$.
4. If there is perfect correlation between the two variables, then there is
only one regression line.
5. If r = 0, then the two lines of regression are perpendicular. If r = 1 or
−1, then the two lines coincide.
6. $$b_{XY} = \frac{N\sum XY - \sum X \sum Y}{N\sum Y^2 - \left[\sum Y\right]^2}$$
7. $$b_{YX} = \frac{N\sum XY - \sum X \sum Y}{N\sum X^2 - \left[\sum X\right]^2}$$
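The raw-sum formulas in properties 6 and 7 can be sketched in Python and used to check properties 1 to 3 on a small example. The function name `regression_summary` and the five data points are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from math import sqrt

def regression_summary(xs, ys):
    """Return (b_yx, b_xy, r) computed from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy          # shared numerator
    den_x = n * sxx - sx ** 2        # N*Sxx - (Sx)^2
    den_y = n * syy - sy ** 2        # N*Syy - (Sy)^2
    return num / den_x, num / den_y, num / sqrt(den_x * den_y)

# Hypothetical data, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b_yx, b_xy, r = regression_summary(xs, ys)
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)

def y_on_x(x):
    """Regression line of y on x, written through the point of means."""
    return y_bar + b_yx * (x - x_bar)

def x_on_y(y):
    """Regression line of x on y, written through the point of means."""
    return x_bar + b_xy * (y - y_bar)

print(b_yx, b_xy)                    # 0.6 1.0
print(abs(r * r - b_xy * b_yx))      # close to 0 (property 1)
print(y_on_x(x_bar), x_on_y(y_bar))  # 4.0 3.0, the means (property 3)
```

Here the shared numerator is 30, so $b_{YX} = 30/50$ and $b_{XY} = 30/30$; both are positive, as is r.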
Problems
1. Calculate the correlation coefficient for the following heights (in inches)
of fathers (x) and their sons (y):
x 65 66 67 67 68 69 70 72
y 67 68 65 68 72 72 69 71
Obtain the lines of regression for the above data and find the estimate of
x for y = 70.
$r_{xy} = 0.603$, $\bar{x} = 68$, $\bar{y} = 69$, $\sigma_x^2 = 4.5$, $\sigma_y^2 = 5.5$
Regression equation of x on y: x = 0.5455y + 30.3636
Regression equation of y on x: y = 0.6667x + 23.6667
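The working behind these answers can be laid out using deviations from the means. This is a sketch of one standard route, not necessarily the one used in the lecture.

```latex
\begin{aligned}
\bar{x} &= \tfrac{544}{8} = 68, \qquad \bar{y} = \tfrac{552}{8} = 69, \\
\sum (x-\bar{x})^2 &= 36, \qquad \sum (y-\bar{y})^2 = 44, \qquad
\sum (x-\bar{x})(y-\bar{y}) = 24, \\
r_{xy} &= \frac{24}{\sqrt{36 \times 44}} \approx 0.603, \\
b_{xy} &= \frac{24}{44} \approx 0.5455, \qquad b_{yx} = \frac{24}{36} \approx 0.6667, \\
\hat{x}\,\big|_{y=70} &= 68 + \frac{24}{44}\,(70 - 69) \approx 68.55 \text{ inches}.
\end{aligned}
```

The estimate of x uses the regression line of x on y, since x is the variable being predicted.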
Try!!!!
X 25 30 28 29 32 24 36 28 27 21
Y 18 20 21 16 14 13 22 15 19 12
Problems
3. A computer, while calculating the correlation coefficient between x and
y from 25 pairs of observations, obtained the following:
$n = 25$, $\sum x = 125$, $\sum x^2 = 650$, $\sum y = 100$, $\sum y^2 = 460$, $\sum xy = 508$.
It was later discovered that two pairs had been copied as (6, 14) and (8, 6),
while the correct values were (8, 12) and (6, 8). Obtain the correct value
of the correlation coefficient.
$r_{xy} = 0.667$
4. Can y = 5 + 2.8x and x = 3 − 0.5y be the estimated regression
equations of y on x and x on y respectively?
No: $b_{yx} = 2.8$ and $b_{xy} = -0.5$ have opposite signs, whereas both regression coefficients must share the sign of $r_{xy}$.
5. Of the two lines of regression
X + 2Y − 5 = 0, 2X + 3Y − 8 = 0
which is the regression line of X on Y? Also obtain (i) the value of the correlation coefficient, (ii) the mean values of X
and Y, and (iii) $\sigma_Y$, given that the variance of X is 12.
The regression line of X on Y is 2X + 3Y − 8 = 0.
$r_{XY} = -0.866$, $b_{XY} = -1.5$, $b_{YX} = -0.5$
$\bar{X} = 1$, $\bar{Y} = 2$, $\sigma_Y = 2$
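Problems 3 to 5 all reduce to a few arithmetic checks, which can be sketched in Python. The helper names (`corrected_r`, `can_be_regression_pair`, `slopes`) are hypothetical, and the approach shown (adjust the sums, then test sign and size of the product of the coefficients) is one reasonable route rather than the lecture's own working.

```python
from math import sqrt, copysign

def corrected_r(n, sx, sxx, sy, syy, sxy, wrong, right):
    """Swap the wrongly copied pairs for the right ones in the sums,
    then recompute r from the raw-sum formula."""
    for x, y in wrong:
        sx, sy = sx - x, sy - y
        sxx, syy, sxy = sxx - x * x, syy - y * y, sxy - x * y
    for x, y in right:
        sx, sy = sx + x, sy + y
        sxx, syy, sxy = sxx + x * x, syy + y * y, sxy + x * y
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

def can_be_regression_pair(b_yx, b_xy):
    """Valid if both are zero, or they share a sign and b_yx * b_xy = r^2 <= 1."""
    return (b_yx == 0 and b_xy == 0) or 0 < b_yx * b_xy <= 1

def slopes(y_on_x, x_on_y):
    """For lines a*X + b*Y + c = 0: b_yx from the first, b_xy from the second."""
    a1, b1, _ = y_on_x
    a2, b2, _ = x_on_y
    return -a1 / b1, -b2 / a2

# Problem 3: corrected sums give r = 500 / 750.
r3 = corrected_r(25, 125, 650, 100, 460, 508,
                 wrong=[(6, 14), (8, 6)], right=[(8, 12), (6, 8)])

# Problem 4: coefficients of opposite sign cannot both be regression coefficients.
ok4 = can_be_regression_pair(b_yx=2.8, b_xy=-0.5)

# Problem 5: try both assignments and keep the admissible one.
line1, line2 = (1, 2, -5), (2, 3, -8)
for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
    b_yx, b_xy = slopes(y_on_x, x_on_y)
    if can_be_regression_pair(b_yx, b_xy):
        break
else:
    raise ValueError("neither assignment gives valid regression coefficients")

r5 = copysign(sqrt(b_yx * b_xy), b_yx)

# The lines intersect at the means; solve the 2x2 system by Cramer's rule.
(a1, b1, c1), (a2, b2, c2) = line1, line2
det = a1 * b2 - a2 * b1
x_bar = (-c1 * b2 + c2 * b1) / det
y_bar = (-a1 * c2 + a2 * c1) / det

# b_yx / b_xy = var(Y) / var(X), with var(X) = 12 given.
sigma_y = sqrt(b_yx / b_xy * 12)

print(round(r3, 3), ok4, x_on_y, round(r5, 3), x_bar, y_bar, round(sigma_y, 3))
```

In Problem 5 the other assignment gives $b_{XY} = -2$ and $b_{YX} = -2/3$, whose product exceeds 1, so it is rejected.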
Partial correlation
Problems
Multiple correlation
Properties
$0 \le R_{1.23} \le 1$
$R_{1.23} \ge |r_{12}|$ and $R_{1.23} \ge |r_{13}|$
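These bounds can be illustrated numerically. The sketch below assumes the standard textbook expression $R_{1.23}^2 = (r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}) / (1 - r_{23}^2)$, which does not survive in the recovered slide text; the function name `multiple_correlation` and the three input correlations are hypothetical.

```python
from math import sqrt

def multiple_correlation(r12, r13, r23):
    """R_{1.23} from the three pairwise correlations (assumed standard formula).
    The inputs must form a valid correlation matrix for the result to lie in [0, 1]."""
    if abs(r23) >= 1:
        raise ValueError("formula is undefined when |r23| = 1")
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return sqrt(max(0.0, r_squared))  # guard against tiny negative rounding error

# Hypothetical pairwise correlations, not taken from the slides.
r12, r13, r23 = 0.6, 0.7, 0.65
R = multiple_correlation(r12, r13, r23)

print(round(R, 4))
print(0 <= R <= 1, R >= abs(r12), R >= abs(r13))
```

With these inputs R comes out a little above 0.7, so using both predictors does at least as well as the better single predictor.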
Problems
Multiple Regression
Problem
Find the multiple linear regression of X1 on X2 and X3 from the data
relating to three variables
X1 4 6 7 9 13 15
X2 15 12 8 6 4 3
X3 30 24 20 14 10 4
Multiple Regression
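For multivariate data, the regression equation of X on Y and Z is
$$\frac{\omega_{11}}{\sigma_1}(X - \bar{X}) + \frac{\omega_{12}}{\sigma_2}(Y - \bar{Y}) + \frac{\omega_{13}}{\sigma_3}(Z - \bar{Z}) = 0$$
where
$$\omega = \det\begin{pmatrix} 1 & r_{12} & r_{13} \\ r_{12} & 1 & r_{23} \\ r_{13} & r_{23} & 1 \end{pmatrix}$$
$$\omega_{11} = \det\begin{pmatrix} 1 & r_{23} \\ r_{23} & 1 \end{pmatrix}, \qquad
\omega_{12} = -\det\begin{pmatrix} r_{12} & r_{23} \\ r_{13} & 1 \end{pmatrix}, \qquad
\omega_{13} = \det\begin{pmatrix} r_{12} & 1 \\ r_{13} & r_{23} \end{pmatrix}$$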
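The cofactor form can be applied directly to the three-variable data set given in the problem for this section. The sketch below is an illustration under stated assumptions: the slide's X, Y, Z are read as X1, X2, X3, standard deviations use divisor N, and the helper names (`mean`, `sd`, `corr`, `regression_x1_on_x2_x3`) are hypothetical.

```python
from math import sqrt

def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Standard deviation with divisor N."""
    m = mean(v)
    return sqrt(sum((t - m) ** 2 for t in v) / len(v))

def corr(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    return cov / (sd(u) * sd(v))

def regression_x1_on_x2_x3(x1, x2, x3):
    """Solve the cofactor equation for X1, giving X1 = a + b2*X2 + b3*X3."""
    r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
    s1, s2, s3 = sd(x1), sd(x2), sd(x3)
    w11 = 1 - r23 ** 2            # det [[1, r23], [r23, 1]]
    w12 = -(r12 - r23 * r13)      # -det [[r12, r23], [r13, 1]]
    w13 = r12 * r23 - r13         # det [[r12, 1], [r13, r23]]
    b2 = -(w12 / w11) * (s1 / s2)
    b3 = -(w13 / w11) * (s1 / s3)
    a = mean(x1) - b2 * mean(x2) - b3 * mean(x3)
    return a, b2, b3

x1 = [4, 6, 7, 9, 13, 15]
x2 = [15, 12, 8, 6, 4, 3]
x3 = [30, 24, 20, 14, 10, 4]

a, b2, b3 = regression_x1_on_x2_x3(x1, x2, x3)
print(round(a, 3), round(b2, 3), round(b3, 3))  # 16.478 0.39 -0.623
```

Rearranged, the fitted plane is approximately $X_1 = 16.478 + 0.390\,X_2 - 0.623\,X_3$, which agrees with solving the least-squares normal equations on the deviation sums.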
Problems