3 Unit (1) - Merged
1. Scatter diagram
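The correlation coefficient is
$$r = \frac{K}{\sigma_x\,\sigma_y},$$
where K is the covariance of x and y. Here
$$K^2 = \left[\frac{1}{n}\sum (x-\bar{x})(y-\bar{y})\right]^2 = \frac{\left(\sum XY\right)^2}{n^2}$$
and
$$\sigma_x^2\,\sigma_y^2 = \frac{\sum (x-\bar{x})^2}{n}\cdot\frac{\sum (y-\bar{y})^2}{n} = \frac{\sum X^2 \cdot \sum Y^2}{n^2},$$
where $X = x - \bar{x}$ and $Y = y - \bar{y}$.
By the Cauchy-Schwarz inequality, $\left(\sum XY\right)^2 \le \sum X^2 \cdot \sum Y^2$, so
$$K^2 \le \sigma_x^2\,\sigma_y^2 \quad\Rightarrow\quad |r| \le 1.$$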
Note:
Let $U = \dfrac{X-a}{h}$, $V = \dfrac{Y-b}{k}$, so that
$X = a + hU$ and $Y = b + kV$, where a, b, h, k are constants; h > 0, k > 0.
We shall prove that r(X, Y) = r(U, V).
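The invariance of r under a change of origin and scale can also be checked numerically. The sketch below is only an illustration, not a proof: it uses the heights from Solved Problem 5.1, and the constants a, b, h, k as well as the helper name `pearson_r` are assumptions chosen for the example.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Heights of fathers (X) and sons (Y) from Solved Problem 5.1.
X = [65, 66, 67, 67, 68, 69, 70, 72]
Y = [67, 68, 65, 68, 72, 72, 69, 71]

# Assumed constants for the change of origin and scale (h > 0, k > 0).
a, b, h, k = 67, 68, 2.0, 0.5
U = [(x - a) / h for x in X]
V = [(y - b) / k for y in Y]

r_xy = pearson_r(X, Y)
r_uv = pearson_r(U, V)
print(f"r(X, Y) = {r_xy:.4f}, r(U, V) = {r_uv:.4f}")
```

Both values agree up to floating-point rounding, which is what the result r(X, Y) = r(U, V) predicts for positive h and k.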
Note: Method for finding the correlation coefficient (discrete case)
$$r = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{n\,\sigma_X\,\sigma_Y} = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\;\sqrt{n\sum Y^2 - (\sum Y)^2}}$$
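The sums-based formula maps directly onto a few lines of code. The following is a minimal sketch; the function name `correlation_from_sums` and the five data points are hypothetical and serve only to show the arithmetic.

```python
import math

def correlation_from_sums(xs, ys):
    """r = [n*SXY - SX*SY] / [sqrt(n*SXX - SX^2) * sqrt(n*SYY - SY^2)]."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_xx - sum_x ** 2)
                   * math.sqrt(n * sum_yy - sum_y ** 2))
    return numerator / denominator

# Hypothetical illustration data (not from the text).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation_from_sums(xs, ys)
print(round(r, 4))
```

For these values the sums are ΣX = 15, ΣY = 20, ΣX² = 55, ΣY² = 86 and ΣXY = 66, so r = 30 / (√50 · √30) ≈ 0.7746.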
Regression analysis expresses the average relationship between two or more variables in terms of the original units of the data.
Example: If sales and advertising expenditure are correlated, we can find the expected amount of sales for a given advertising expenditure, or the advertising expenditure needed for attaining a given amount of sales.
5.1.6 Lines of regression
If the variables X and Y are correlated, i.e., there exists an association between them, we can observe that the scatter diagram will be more or less concentrated around a curve. This curve is called the curve of regression. If the curve is a straight line, it is called the line of regression and the regression is a linear regression.
We shall have two regression lines: the regression line of X on Y and the regression line of Y on X. The regression line of Y on X gives the most probable value of Y for given values of X, and the regression line of X on Y gives the most probable values of X for given values of Y.
Table 5.1 Relation between Correlation Analysis and Regression Analysis

S.No. | Correlation Analysis | Regression Analysis
1. | The correlation coefficient r between X and Y is a measure of the linear relationship between X and Y. | The regression coefficients are mathematical measures expressing the average relationship between the two variables.
2. | The correlation coefficient does not reflect upon the nature of the variable (independent or dependent variable). | Regression coefficients reflect on the nature of the variable, i.e., which is the dependent variable. In other words, regression estimates the value of the dependent variable for any given value of the independent variable.
5.6 Statistics for Management
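(i) Equation of the line of regression of y on x is
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}),$$
where $r\,\dfrac{\sigma_y}{\sigma_x}$ is the regression coefficient of y on x.
(ii) Equation of the line of regression of x on y is
$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}),$$
where $r\,\dfrac{\sigma_x}{\sigma_y}$ is the regression coefficient of x on y.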
Correlation, Regression, Time Series Analysis and Index Numbers 5.7
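Note:
1. The regression coefficients can be denoted by
$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} \quad\text{and}\quad b_{xy} = r\,\frac{\sigma_x}{\sigma_y}.$$
2. The regression coefficients are obtained by the following expressions for discrete values of X and Y:
$$b_{yx} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}, \qquad b_{xy} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2}.$$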
5. If one of the regression coefficients is greater than unity, the other must be
less than unity.
6. Regression coefficients are independent of the change of origin but not of
scale.
7. Both regression coefficients will have the same sign, i.e., they will be
either both positive or both negative. The coefficient of correlation will have
the same sign as the regression coefficients, i.e., if the regression coefficients
have a negative sign, r will also be negative, and if the regression
coefficients have a positive sign, r will also be positive.
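Properties 5 to 7 can be checked on a small data set. The sketch below reuses the marketing expenditure and sales figures that appear in a later solved problem of this unit; the function name `regression_coefficients` and the particular shift and scale factors are assumptions made for illustration.

```python
import math

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r) from the sums-based expressions."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    var_x = n * sxx - sx ** 2   # n^2 times the variance of x
    var_y = n * syy - sy ** 2   # n^2 times the variance of y
    return num / var_x, num / var_y, num / math.sqrt(var_x * var_y)

xs = [10, 12, 15, 20, 23]
ys = [14, 17, 23, 21, 25]
b_yx, b_xy, r = regression_coefficients(xs, ys)

# Change of origin only (assumed shifts 15 and 20): coefficients unchanged.
shifted = regression_coefficients([x - 15 for x in xs], [y - 20 for y in ys])

# Change of scale (x measured in units of 5): coefficients change.
scaled = regression_coefficients([x / 5 for x in xs], ys)

print(f"b_yx = {b_yx:.4f}, b_xy = {b_xy:.4f}, r = {r:.4f}")
print("product b_yx * b_xy:", round(b_yx * b_xy, 4), " r^2:", round(r * r, 4))
print("after shift:", tuple(round(v, 4) for v in shifted))
print("after scaling x:", tuple(round(v, 4) for v in scaled))
```

Here b_xy exceeds 1 while b_yx is below 1, both share the sign of r, the shift leaves all three values unchanged, and rescaling x multiplies b_yx by 5 and divides b_xy by 5 while r stays the same.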
where r, $\sigma_x$ and $\sigma_y$ have the usual meaning.
Proof:
The equations of the regression lines of y on x and x on y are
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
$$x - \bar{x} = b_{xy}\,(y - \bar{y})$$
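The slopes of the two lines are $r\,\dfrac{\sigma_y}{\sigma_x}$ and $\dfrac{\sigma_y}{r\,\sigma_x}$ respectively, so the angle $\theta$ between them satisfies
$$\tan\theta = \frac{\left|\,r\,\dfrac{\sigma_y}{\sigma_x} - \dfrac{\sigma_y}{r\,\sigma_x}\right|}{1 + \left(r\,\dfrac{\sigma_y}{\sigma_x}\right)\left(\dfrac{\sigma_y}{r\,\sigma_x}\right)} = \frac{|r^2 - 1|}{|r|}\cdot\frac{\sigma_x\,\sigma_y}{\sigma_x^2 + \sigma_y^2}.$$
Since $r^2 \le 1$ and $\sigma_x$ and $\sigma_y$ are positive, the (acute) angle between the lines is given by
$$\tan\theta = \frac{1 - r^2}{|r|}\cdot\frac{\sigma_x\,\sigma_y}{\sigma_x^2 + \sigma_y^2}.$$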
Note:
(i) Suppose r = 0. Then $\tan\theta = \infty \Rightarrow \theta = \dfrac{\pi}{2} = 90°$.
The two regression lines are perpendicular to each other and the equations
will be
$y = \bar{y}$ and $x = \bar{x}$.
(ii) If $r = \pm 1$, then $\tan\theta = 0 \Rightarrow \theta = 0$ or $\pi$.
Here the lines of regression coincide. They cannot be parallel since they
have a common point $(\bar{x}, \bar{y})$.
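The angle formula and its two limiting cases can be evaluated with a short helper. This is a sketch under the assumption that the acute angle is wanted, so |r| is used in the denominator; the function name and the sample values r = 0.6, σx = 3, σy = 4 are illustrative only.

```python
import math

def angle_between_regression_lines(r, sigma_x, sigma_y):
    """Acute angle, in degrees, between the two regression lines."""
    if r == 0:
        # tan(theta) is infinite: the lines y = y_bar and x = x_bar are perpendicular.
        return 90.0
    tan_theta = ((1 - r ** 2) / abs(r)) * (sigma_x * sigma_y) / (sigma_x ** 2 + sigma_y ** 2)
    return math.degrees(math.atan(tan_theta))

print(angle_between_regression_lines(0.0, 3, 4))   # perpendicular lines
print(angle_between_regression_lines(1.0, 3, 4))   # coincident lines
print(round(angle_between_regression_lines(0.6, 3, 4), 2))
```

For r = 0.6, σx = 3, σy = 4 the formula gives tan θ = (0.64 / 0.6)(12 / 25) = 0.512, an angle of roughly 27 degrees; the closer |r| is to 1, the smaller the angle.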
Solved Problem 5.1
Calculate the correlation coefficient for the following heights (in inches) of fathers
(x) and their sons (y).
x: 65 66 67 67 68 69 70 72
y: 67 68 65 68 72 72 69 71
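Solution
x       y       x^2      y^2      xy
65      67      4225     4489     4355
66      68      4356     4624     4488
67      65      4489     4225     4355
67      68      4489     4624     4556
68      72      4624     5184     4896
69      72      4761     5184     4968
70      69      4900     4761     4830
72      71      5184     5041     5112
Total:  544     552      37028    38132    37560
$\sum x = 544$, $\sum y = 552$, $\sum x^2 = 37028$, $\sum y^2 = 38132$, $\sum xy = 37560$, n = 8
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\;\sqrt{n\sum y^2 - (\sum y)^2}} = \frac{8(37560) - (544)(552)}{\sqrt{8(37028) - (544)^2}\;\sqrt{8(38132) - (552)^2}} = 0.603$$
Alternatively, taking u = x - 67 and v = y - 68, we have
x       y       u = x - 67   v = y - 68   u^2     v^2     uv
65      67      -2           -1           4       1       2
66      68      -1           0            1       0       0
67      65      0            -3           0       9       0
67      68      0            0            0       0       0
68      72      1            4            1       16      4
69      72      2            4            4       16      8
70      69      3            1            9       1       3
72      71      5            3            25      9       15
Total:          8            8            44      52      32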
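$$r = \frac{n\sum uv - (\sum u)(\sum v)}{\sqrt{n\sum u^2 - (\sum u)^2}\;\sqrt{n\sum v^2 - (\sum v)^2}} = \frac{8(32) - (8)(8)}{\sqrt{8(44) - (8)^2}\;\sqrt{8(52) - (8)^2}} = 0.603$$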
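Taking u = x - 60 and v = y - 120, we have
x       y       u = x - 60   v = y - 120   u^2     v^2     uv
57      113     -3           -7            9       49      21
59      117     -1           -3            1       9       3
62      126     2            6             4       36      12
63      126     3            6             9       36      18
64      130     4            10            16      100     40
65      129     5            9             25      81      45
55      111     -5           -9            25      81      45
58      116     -2           -4            4       16      8
57      112     -3           -8            9       64      24
Total:          0            0             102     472     216
$$r = \frac{n\sum uv - (\sum u)(\sum v)}{\sqrt{n\sum u^2 - (\sum u)^2}\;\sqrt{n\sum v^2 - (\sum v)^2}} = \frac{9(216) - 0}{\sqrt{9(102) - 0^2}\;\sqrt{9(472) - 0^2}} = 0.9844$$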
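The following table gives, according to age, the frequency of marks obtained.

Marks \ Age    18    19    20    21    Total
10-20          4     2     2     0     8
20-30          5     4     6     4     19
30-40          6     8     10    11    35
40-50          4     4     6     8     22
50-60          0     2     4     4     10
60-70          0     2     3     1     6
Total          19    22    31    28    100

Solution
Let x denote the age and y the mid-value of the marks class, and let
$$u = x - 19, \qquad v = \frac{y - 35}{10}.$$

Marks    Mid value y   v     u = -1   u = 0   u = 1   u = 2   f     fv     fv^2   fuv
                             (18)     (19)    (20)    (21)
10-20    15            -2    4        2       2       0       8     -16    32     4
20-30    25            -1    5        4       6       4       19    -19    19     -9
30-40    35            0     6        8       10      11      35    0      0      0
40-50    45            1     4        4       6       8       22    22     22     18
50-60    55            2     0        2       4       4       10    20     40     24
60-70    65            3     0        2       3       1       6     18     54     15
f                            19       22      31      28      N = 100   25   167   52
fu                           -19      0       31      56      68
fu^2                         19       0       31      112     162
fuv                          9        0       13      30      52

$$r = \frac{N\sum fxy - (\sum fx)(\sum fy)}{\sqrt{N\sum fx^2 - (\sum fx)^2}\;\sqrt{N\sum fy^2 - (\sum fy)^2}} = \frac{N\sum fuv - (\sum fu)(\sum fv)}{\sqrt{N\sum fu^2 - (\sum fu)^2}\;\sqrt{N\sum fv^2 - (\sum fv)^2}}$$
$$= \frac{100(52) - (68)(25)}{\sqrt{100(162) - (68)^2}\;\sqrt{100(167) - (25)^2}} = \frac{3500}{\sqrt{11576}\;\sqrt{16075}} \approx 0.2566$$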
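The bookkeeping for a bivariate frequency table is easy to get wrong by hand, so a small script can serve as a cross-check. The sketch below assumes the cell frequencies as reconstructed from the row and column totals of the table (ages 18 to 21 as columns, marks classes as rows) and the coding u = x - 19, v = (y - 35)/10; the function name `grouped_correlation` is hypothetical.

```python
import math

ages = [18, 19, 20, 21]                  # x values (columns)
mark_mids = [15, 25, 35, 45, 55, 65]     # class mid-values y (rows)
freq = [
    [4, 2, 2, 0],     # marks 10-20
    [5, 4, 6, 4],     # marks 20-30
    [6, 8, 10, 11],   # marks 30-40
    [4, 4, 6, 8],     # marks 40-50
    [0, 2, 4, 4],     # marks 50-60
    [0, 2, 3, 1],     # marks 60-70
]

def grouped_correlation(xs, ys, table, x0, y0, h, k):
    """Correlation for a two-way frequency table using coded values u, v."""
    us = [(x - x0) / h for x in xs]
    vs = [(y - y0) / k for y in ys]
    cells = [(table[i][j], us[j], vs[i])
             for i in range(len(ys)) for j in range(len(xs))]
    N = sum(f for f, _, _ in cells)
    s_fu = sum(f * u for f, u, _ in cells)
    s_fv = sum(f * v for f, _, v in cells)
    s_fu2 = sum(f * u * u for f, u, _ in cells)
    s_fv2 = sum(f * v * v for f, _, v in cells)
    s_fuv = sum(f * u * v for f, u, v in cells)
    r = (N * s_fuv - s_fu * s_fv) / (
        math.sqrt(N * s_fu2 - s_fu ** 2) * math.sqrt(N * s_fv2 - s_fv ** 2))
    return {"N": N, "fu": s_fu, "fv": s_fv, "fu2": s_fu2,
            "fv2": s_fv2, "fuv": s_fuv, "r": r}

result = grouped_correlation(ages, mark_mids, freq, x0=19, y0=35, h=1, k=10)
print(result)
```

The intermediate sums it reports (Σfu = 68, Σfv = 25, Σfu² = 162, Σfv² = 167, Σfuv = 52) can be compared one by one with the margins of the working table.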
Solved Problem 5.12
Calculate the coefficient of correlation and obtain the lines of regression for the
following.
X: 1 2 3 4 5 6 7 8 9
Y: 9 8 10 12 11 13 14 16 15
Obtain an estimate of Y which should correspond to the value X = 6.2.
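Solution
X       Y       X^2     Y^2     XY
1       9       1       81      9
2       8       4       64      16
3       10      9       100     30
4       12      16      144     48
5       11      25      121     55
6       13      36      169     78
7       14      49      196     98
8       16      64      256     128
9       15      81      225     135
Total:  45      108     285     1356    597
$\sum X = 45$, $\sum Y = 108$, $\sum X^2 = 285$, $\sum Y^2 = 1356$, $\sum XY = 597$, n = 9
$$\bar{X} = \frac{\sum X}{n} = \frac{45}{9} = 5, \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{108}{9} = 12$$
Correlation coefficient
$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\;\sqrt{n\sum Y^2 - (\sum Y)^2}} = \frac{(9 \times 597) - (45 \times 108)}{\sqrt{(9 \times 285) - (45)^2}\;\sqrt{(9 \times 1356) - (108)^2}} = \frac{513}{540} = 0.95$$
Regression coefficient of X on Y
$$b_{xy} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2} = \frac{(9 \times 597) - (45 \times 108)}{(9 \times 1356) - (108)^2} = 0.95$$
Regression coefficient of Y on X
$$b_{yx} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{(9 \times 597) - (45 \times 108)}{(9 \times 285) - (45)^2} = 0.95$$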
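Regression line of X on Y is
$$x - \bar{x} = b_{xy}\,(y - \bar{y})$$
$$x - 5 = 0.95\,(y - 12) = 0.95y - 11.4$$
$$x = 0.95y - 6.4$$
Regression line of Y on X is
$$y - \bar{y} = b_{yx}\,(x - \bar{x})$$
$$y - 12 = 0.95\,(x - 5) = 0.95x - 4.75$$
$$y = 0.95x + 7.25$$
Value of y corresponding to x = 6.2 is
$$y = 0.95(6.2) + 7.25 = 13.14$$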
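The steps of Solved Problem 5.12 can be reproduced in code as a check on the hand computation. This is a minimal sketch; the function names `fit_regression_lines`, `predict_y` and `predict_x` are hypothetical helpers, not part of any library.

```python
def fit_regression_lines(xs, ys):
    """Means and both regression coefficients from the sums-based formulas."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    num = n * sxy - sx * sy
    return {
        "x_bar": sx / n,
        "y_bar": sy / n,
        "b_yx": num / (n * sxx - sx ** 2),   # slope of the line of y on x
        "b_xy": num / (n * syy - sy ** 2),   # slope of the line of x on y
    }

def predict_y(fit, x):
    """Estimate y from the regression line of y on x."""
    return fit["y_bar"] + fit["b_yx"] * (x - fit["x_bar"])

def predict_x(fit, y):
    """Estimate x from the regression line of x on y."""
    return fit["x_bar"] + fit["b_xy"] * (y - fit["y_bar"])

X = [1, 2, 3, 4, 5, 6, 7, 8, 9]
Y = [9, 8, 10, 12, 11, 13, 14, 16, 15]
fit = fit_regression_lines(X, Y)
estimate = predict_y(fit, 6.2)
print(fit, round(estimate, 2))
```

The intercepts of the two lines follow from the returned values: 12 - 0.95 × 5 = 7.25 for y on x and 5 - 0.95 × 12 = -6.4 for x on y.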
The following data relate to marketing expenditure in lakhs of rupees and the
corresponding sales of a product in crores of rupees. Estimate the marketing
expenditure to attain a sales target of Rs. 40 crores.
Marketing expenditure: 10 12 15 20 23
Product sales:         14 17 23 21 25
Also find the coefficient of correlation between marketing expenditure and
sales.
Solution
Let x be marketing expenditure and y be product sales.
x       y       x^2     y^2     xy
10      14      100     196     140
12      17      144     289     204
15      23      225     529     345
20      21      400     441     420
23      25      529     625     575
Total:  80      100     1398    2080    1684
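$$b_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum y^2 - (\sum y)^2} = \frac{(5)(1684) - (80)(100)}{5(2080) - (100)^2} = 1.05$$
Now
$$\bar{x} = \frac{\sum x}{n} = \frac{80}{5} = 16, \qquad \bar{y} = \frac{\sum y}{n} = \frac{100}{5} = 20$$
Regression line of X on Y is
$$x - \bar{x} = b_{xy}\,(y - \bar{y})$$
$$x - 16 = 1.05\,(y - 20)$$
$$x = 1.05y - 5$$
For a sales target of y = 40, the estimated marketing expenditure is x = 1.05(40) - 5 = 37, i.e., Rs. 37 lakhs.
$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\;\sqrt{n\sum y^2 - (\sum y)^2}} = \frac{(5)(1684) - (80)(100)}{\sqrt{5(1398) - (80)^2}\;\sqrt{(5)(2080) - (100)^2}} = 0.8646$$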
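Because the quantity to be estimated here is x (expenditure) for a given y (sales), the solution uses the regression line of x on y rather than solving the line of y on x backwards. The sketch below, with hypothetical variable names, computes both so the difference is visible; it is an illustration of that choice, not an additional method from the text.

```python
x = [10, 12, 15, 20, 23]   # marketing expenditure, lakhs of rupees
y = [14, 17, 23, 21, 25]   # product sales, crores of rupees

n = len(x)
sx, sy = sum(x), sum(y)
sxx = sum(v * v for v in x)
syy = sum(v * v for v in y)
sxy = sum(a * b for a, b in zip(x, y))

x_bar, y_bar = sx / n, sy / n
num = n * sxy - sx * sy
b_xy = num / (n * syy - sy ** 2)   # regression coefficient of x on y
b_yx = num / (n * sxx - sx ** 2)   # regression coefficient of y on x

target_sales = 40

# Estimate from the line of x on y (the line used in the solution).
expenditure_x_on_y = x_bar + b_xy * (target_sales - y_bar)

# For comparison only: inverting the line of y on x gives a different value.
expenditure_inverted = x_bar + (target_sales - y_bar) / b_yx

print(round(expenditure_x_on_y, 2), round(expenditure_inverted, 2))
```

The two answers differ (about 37 versus about 44 lakhs) because the two regression lines coincide only when r = ±1; the line of x on y is the one that minimises errors in x.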
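$\bar{x} = 1$, $\bar{y} = 2$
Let us assume that (1) is the regression line of y on x and (2) is the regression line of x on y. Then
(1) $2y = -x + 5$, i.e., $y = -\dfrac{1}{2}x + \dfrac{5}{2}$, so $b_{yx} = -\dfrac{1}{2}$
(2) $2x = -3y + 8$, i.e., $x = -\dfrac{3}{2}y + 4$, so $b_{xy} = -\dfrac{3}{2}$
Using note (3), $r^2 = b_{yx}\,b_{xy} = \dfrac{3}{4}$. Since both regression coefficients are negative, $r = -\dfrac{\sqrt{3}}{2}$.
Now $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = -\dfrac{1}{2}$,
i.e., $\sigma_y = \dfrac{\sigma_x}{\sqrt{3}} = 2$
Variance of y = $\sigma_y^2 = 4$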
In partially destroyed laboratory records on the analysis of correlation data,
the following results are legible:
Variance of x = 9.
Regression equations: 8x - 10y + 66 = 0 and 40x - 18y = 214.
Find (a) the mean values of x and y, (b) the correlation coefficient and
(c) the S.D. of y.
Solution
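1. The equations of the regression lines are
8x - 10y + 66 = 0      (1)
40x - 18y = 214        (2)
Both lines pass through $(\bar{x}, \bar{y})$, so solving (1) and (2) simultaneously gives $\bar{x} = 13$ and $\bar{y} = 17$.
2. Let
(1) be the regression line of y on x and
(2) be the regression line of x on y. Then
$$y = \frac{8}{10}x + \frac{66}{10}, \qquad x = \frac{18}{40}y + \frac{214}{40}$$
$$b_{yx} = \frac{8}{10}, \qquad b_{xy} = \frac{18}{40}$$
$$r = \sqrt{\frac{8}{10} \times \frac{18}{40}} = 0.6$$
3. Now
$$b_{yx} = r\,\frac{\sigma_y}{\sigma_x} = \frac{8}{10}$$
With $\sigma_x = \sqrt{9} = 3$,
$$0.6 \times \frac{\sigma_y}{3} = 0.8 \quad\Rightarrow\quad \sigma_y = \frac{0.8 \times 3}{0.6} = 4$$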
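The three parts of this problem can be traced in code: the means come from the intersection of the two lines, r from the product of the regression coefficients, and the S.D. of y from b_yx = r·σy/σx. The sketch below is one way to lay this out; `solve_2x2` is a hypothetical helper using Cramer's rule, and the check on the alternative assignment of the lines is added here to show why (1) is taken as the line of y on x.

```python
import math

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# (1) 8x - 10y = -66   and   (2) 40x - 18y = 214
x_bar, y_bar = solve_2x2(8, -10, -66, 40, -18, 214)

# Assignment used in the solution: (1) is y on x, (2) is x on y.
b_yx = 8 / 10
b_xy = 18 / 40
r = math.sqrt(b_yx * b_xy)      # positive root, since both coefficients are positive

# The opposite assignment would give b_xy = 10/8 and b_yx = 40/18,
# whose product exceeds 1, which is impossible for r^2.
alternative_product = (10 / 8) * (40 / 18)

sigma_x = math.sqrt(9)          # variance of x is given as 9
sigma_y = b_yx * sigma_x / r    # from b_yx = r * sigma_y / sigma_x

print(x_bar, y_bar, round(r, 4), round(sigma_y, 4), round(alternative_product, 4))
```

The product test on the regression coefficients is a quick way to decide which equation is which when the problem does not say.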
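Solution
Let X be the price at Chennai and Y be the price at Mumbai.
Given $\bar{x} = 65$, $\bar{y} = 67$, $\sigma_x = 0.5$, $\sigma_y = 3.5$, $\sigma_{x-y} = 3.1$
Correlation coefficient
$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_{x-y}^2}{2\,\sigma_x\,\sigma_y} = \frac{(0.5)^2 + (3.5)^2 - (3.1)^2}{2(0.5)(3.5)} = 0.8257$$
The regression line of X on Y is
$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$
$$x - 65 = \frac{0.8257 \times 0.5}{3.5}\,(y - 67)$$
$$x = 0.1179y + 57.1007$$
The regression line of Y on X is
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$
$$y - 67 = \frac{0.8257 \times 3.5}{0.5}\,(x - 65)$$
$$y = 5.7799x - 308.6935$$
The price at Mumbai corresponding to the price 68 at Chennai
= (5.7799 × 68) - 308.6935
= Rs. 84.34.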
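The correlation coefficient in this solution comes from the identity σ²(x−y) = σx² + σy² − 2rσxσy, rearranged for r. The sketch below applies it to the given figures and then evaluates the line of Y on X at a Chennai price of 68; the function name `r_from_difference_sd` is hypothetical.

```python
def r_from_difference_sd(sd_x, sd_y, sd_diff):
    """Solve var(x - y) = var(x) + var(y) - 2*r*sd_x*sd_y for r."""
    return (sd_x ** 2 + sd_y ** 2 - sd_diff ** 2) / (2 * sd_x * sd_y)

x_bar, y_bar = 65, 67            # mean prices at Chennai and Mumbai
sd_x, sd_y, sd_diff = 0.5, 3.5, 3.1

r = r_from_difference_sd(sd_x, sd_y, sd_diff)

b_yx = r * sd_y / sd_x           # regression coefficient of Y on X
b_xy = r * sd_x / sd_y           # regression coefficient of X on Y

chennai_price = 68
mumbai_estimate = y_bar + b_yx * (chennai_price - x_bar)

print(round(r, 4), round(b_xy, 4), round(b_yx, 4), round(mumbai_estimate, 2))
```

Working in the deviation form y = ȳ + b_yx(x − x̄) avoids subtracting two large numbers, which is where rounding slips tend to enter the hand calculation; with these inputs the estimate comes to about 84.34.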
Solution
Let marks in Mathematics be x and marks in English be y.
Given $\bar{x} = 62.5$, $\bar{y} = 39$, $\sigma_x = 9.5$, $\sigma_y = 10$, r = 0.60
Regression line of y on x is
$$y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$
$$y - 39 = 0.60 \times \frac{10}{9.5}\,(x - 62.5) = 0.6316\,(x - 62.5) = 0.6316x - 39.475$$
$$y = 0.6316x - 0.475$$
(i) Marks in English when marks in Mathematics is 70
= (0.6316 × 70) - 0.475 = 43.74
Regression line of x on y is
$$x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$
$$x - 62.5 = 0.60 \times \frac{9.5}{10}\,(y - 39) = 0.57\,(y - 39) = 0.57y - 22.23$$
$$x = 0.57y + 40.27$$
(ii) Marks in Mathematics corresponding to 54 marks in English
= (0.57 × 54) + 40.27
= 30.78 + 40.27 = 71.05
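When only the means, standard deviations and r are given, both regression lines follow directly from b_yx = r·σy/σx and b_xy = r·σx/σy. The sketch below packages that step; `lines_from_summary` is a hypothetical helper that returns each line as a (slope, intercept) pair.

```python
def lines_from_summary(x_bar, y_bar, sd_x, sd_y, r):
    """Return ((slope, intercept) of y on x, (slope, intercept) of x on y)."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    y_on_x = (b_yx, y_bar - b_yx * x_bar)
    x_on_y = (b_xy, x_bar - b_xy * y_bar)
    return y_on_x, x_on_y

# Mathematics marks are x, English marks are y.
y_on_x, x_on_y = lines_from_summary(x_bar=62.5, y_bar=39, sd_x=9.5, sd_y=10, r=0.60)

english_at_70 = y_on_x[0] * 70 + y_on_x[1]   # estimate y from the line of y on x
maths_at_54 = x_on_y[0] * 54 + x_on_y[1]     # estimate x from the line of x on y

print([round(v, 4) for v in y_on_x], [round(v, 4) for v in x_on_y])
print(round(english_at_70, 2), round(maths_at_54, 2))
```

Small differences in the last decimal place relative to the hand computation arise because the script keeps the slope 0.6 × 10 / 9.5 unrounded instead of using 0.6316.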