Regression 9

Uploaded by

Manish Sandilya
Probability and Random Processes

5.5 REGRESSION

The term 'regression' literally means 'stepping back towards the average'. Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data.

5.5.1 Line of Regression

The line of regression is the line which gives the best estimate of the value of one variable for any specific value of the other variable. Thus the line of regression is the line of best fit and is obtained by the principle of least squares. The regression equation of Y on X is obtained by minimizing the sum of the squares of the errors parallel to the Y axis, while the regression equation of X on Y is obtained by minimizing the sum of the squares of the errors parallel to the X axis.

If the variables in a bivariate distribution are related, then the points in the scatter diagram (which is a diagrammatic representation of bivariate data) will cluster round some curve called the curve of regression. If the curve is a straight line, it is called the line of regression and the regression between the variables is said to be linear; otherwise the regression is said to be curvilinear.

5.5.2 Equations of Lines of Regression

The regression line of y on x is

$$y - \bar{y} = \rho\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$

which is used to predict or estimate the value of y for any given x. The coefficient of x in the regression line of y on x is the regression coefficient of y on x and is denoted by $b_{yx}$.

The regression line of x on y is

$$x - \bar{x} = \rho\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$

which is used to predict or estimate the value of x for any given y. The coefficient of y in the regression line of x on y is the regression coefficient of x on y and is denoted by $b_{xy}$.

The correlation coefficient $\rho$ in terms of the regression coefficients is $\rho = \pm\sqrt{b_{xy}\, b_{yx}}$.

5.5.3 Properties of Correlation and Regression Coefficients

The correlation coefficient is the geometric mean of the regression coefficients, i.e.
$\rho = \pm\sqrt{b_{yx}\, b_{xy}}$.

The correlation coefficient cannot numerically exceed the arithmetic mean of the regression coefficients.

If one of the regression coefficients is numerically greater than unity, the other must be numerically less than unity.

Regression coefficients are independent of the change of origin, but not of scale.

The correlation coefficient and the two regression coefficients have the same sign.

The regression coefficients are obtained from the following expressions in the case of discrete values of X and Y:

$$b_{yx} = \rho\,\frac{\sigma_y}{\sigma_x} = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (x-\bar{x})^2}$$

$$b_{xy} = \rho\,\frac{\sigma_x}{\sigma_y} = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum y^2 - (\sum y)^2} = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sum (y-\bar{y})^2}$$

5.5.4 Angle between the Regression Lines

The angle $\theta$ between the regression lines is given by

$$\tan\theta = \frac{1 - r^2}{|r|}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$

where r denotes the correlation coefficient $\rho$.

Note:
(i) When $r = \pm 1$, $\tan\theta = 0$, i.e. $\theta = 0$ or $\pi$. In this case, the two lines of regression coincide. When $r = 0$, $\theta = \pi/2$ and the lines of regression are perpendicular.
(ii) Whenever the two lines intersect, there are two angles between them, one acute and the other obtuse.
(iii) Both regression lines pass through the point $(\bar{x}, \bar{y})$, where $\bar{x}$ = mean of x and $\bar{y}$ = mean of y.
(iv) The regression curve of Y on X is $y = E(Y \mid X = x)$ and the regression curve of X on Y is $x = E(X \mid Y = y)$.

EXAMPLE  Find the correlation coefficient for the following heights (in inches) of fathers X and their sons Y. [AU June '06, November '07]

X:  65  66  67  67  68  69  70  72
Y:  67  68  65  68  72  72  69  71

Solution
Method 1:

  X     Y     XY     X²     Y²
 65    67   4355   4225   4489
 66    68   4488   4356   4624
 67    65   4355   4489   4225
 67    68   4556   4489   4624
 68    72   4896   4624   5184
 69    72   4968   4761   5184
 70    69   4830   4900   4761
 72    71   5112   5184   5041

ΣX = 544, ΣY = 552, ΣXY = 37560, ΣX² = 37028, ΣY² = 38132

Now, with n = 8,

$$\bar{X} = \frac{544}{8} = 68, \qquad \bar{Y} = \frac{552}{8} = 69, \qquad \bar{X}\,\bar{Y} = 68 \times 69 = 4692$$

$$\sigma_X = \sqrt{\frac{1}{n}\sum X^2 - \bar{X}^2} = \sqrt{\frac{37028}{8} - 68^2} = \sqrt{4628.5 - 4624} = 2.121$$

$$\sigma_Y = \sqrt{\frac{1}{n}\sum Y^2 - \bar{Y}^2} = \sqrt{\frac{38132}{8} - 69^2} = \sqrt{4766.5 - 4761} = 2.345$$

$$\rho = \frac{\frac{1}{n}\sum XY - \bar{X}\,\bar{Y}}{\sigma_X \sigma_Y} = \frac{4695 - 4692}{2.121 \times 2.345} = 0.6030$$

Note: The correlation coefficient is independent of change of origin and scale, i.e.
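$r(X, Y) = r(U, V)$, where $U = \frac{X - a}{h}$ and $V = \frac{Y - b}{k}$. Here a and b are arbitrary constants, usually the mid-values of the given data X and Y respectively, and h and k are positive scale constants.

EXAMPLE  Find the coefficient of correlation between industrial production and export using the following data: [AU May '06]

Production (x):  55  56  58  59  60  60  62
Export (y):      35  38  37  39  44  43  44

Solution
Let X = x - 58 and Y = y - 40.

  X     Y    XY    X²    Y²
 -3    -5    15     9    25
 -2    -2     4     4     4
  0    -3     0     0     9
  1    -1    -1     1     1
  2     4     8     4    16
  2     3     6     4     9
  4     4    16    16    16

ΣX = 4, ΣY = 0, ΣXY = 48, ΣX² = 38, ΣY² = 80

With n = 7,

$$\bar{X} = \frac{\sum X}{n} = \frac{4}{7} = 0.5714, \qquad \bar{Y} = \frac{\sum Y}{n} = 0$$

$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\,\bar{Y} = \frac{48}{7} - 0 = 6.857$$

$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{38}{7} - (0.5714)^2} = 2.258$$

$$\sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = \sqrt{\frac{80}{7} - 0} = 3.38$$

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{6.857}{2.258 \times 3.38} = 0.898$$

EXAMPLE 5.71  Find the correlation coefficient for the following data:

x:  10  14  18  22  26  30
y:  18  12  24   6  30  36

Solution
Let X = (x - 22)/4 and Y = (y - 24)/6.

  x     y     X     Y    XY    X²    Y²
 10    18    -3    -1     3     9     1
 14    12    -2    -2     4     4     4
 18    24    -1     0     0     1     0
 22     6     0    -3     0     0     9
 26    30     1     1     1     1     1
 30    36     2     2     4     4     4

ΣX = -3, ΣY = -3, ΣXY = 12, ΣX² = 19, ΣY² = 19

With n = 6, $\bar{X} = \bar{Y} = -0.5$ and

$$\mathrm{Cov}(X, Y) = \frac{\sum XY}{n} - \bar{X}\,\bar{Y} = \frac{12}{6} - (-0.5)(-0.5) = 1.75$$

$$\sigma_X = \sqrt{\frac{\sum X^2}{n} - \bar{X}^2} = \sqrt{\frac{19}{6} - (0.5)^2} = 1.708, \qquad \sigma_Y = \sqrt{\frac{\sum Y^2}{n} - \bar{Y}^2} = 1.708$$

$$\rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{1.75}{1.708 \times 1.708} = 0.6$$

EXAMPLE  Calculate the correlation coefficient and the lines of regression from the following data: [AU December '04, June '06]

[Data table and working columns illegible. With X = x - 70 and Y = y - 165, the totals are n = 8, ΣX = -13, ΣY = -80, ΣXY = 891, ΣX² = 147, ΣY² = 6440.]

Solution

$$\bar{X} = \frac{1}{n}\sum X = \frac{1}{8}(-13) = -\frac{13}{8}, \qquad \bar{Y} = \frac{1}{n}\sum Y = \frac{1}{8}(-80) = -10$$

$$\frac{1}{n}\sum XY = \frac{1}{8}(891) = 111.375$$

$$\sigma_X = \sqrt{\frac{1}{n}\sum X^2 - \bar{X}^2} = \sqrt{\frac{1}{8}(147) - \left(-\frac{13}{8}\right)^2} = 3.96$$

$$\sigma_Y = \sqrt{\frac{1}{n}\sum Y^2 - \bar{Y}^2} = \sqrt{\frac{1}{8}(6440) - (-10)^2} = \sqrt{705} = 26.55$$

$$\rho = \frac{\frac{1}{n}\sum XY - \bar{X}\,\bar{Y}}{\sigma_X \sigma_Y} = \frac{111.375 - \left(-\frac{13}{8}\right)(-10)}{3.96 \times 26.55} = 0.9$$

To obtain the regression lines: the regression line of x on y is

$$x - \bar{x} = \rho\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$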
$$\bar{x} = \bar{X} + 70 = -\frac{13}{8} + 70 = 68.375, \qquad \bar{y} = \bar{Y} + 165 = -10 + 165 = 155$$

$$x - 68.375 = 0.9 \times \frac{3.96}{26.55}\,(y - 155)$$

$$x - 68.375 = 0.134\,(y - 155)$$

$$x = 0.134y - 20.8 + 68.375 \;\Rightarrow\; x = 0.134y + 47.5$$

The regression line of y on x is

$$y - \bar{y} = \rho\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x})$$

$$y - 155 = 0.9 \times \frac{26.55}{3.96}\,(x - 68.375)$$

$$y = 6.034x - 412.58 + 155 \;\Rightarrow\; y = 6.034x - 257.58$$

EXAMPLE 5.73  Calculate the correlation coefficient between the variables x and y from the following data:

x:  1   2   3   4   5   6   7
y:  9   8  10  12  11  13  14

EXAMPLE  If $\bar{x} = 970$, $\bar{y} = 18$, $\sigma_x = 38$, $\sigma_y = 2$ and $\rho = 0.6$, find the line of regression of x on y. [AU December '09]

Solution
Given: $\bar{x} = 970$, $\bar{y} = 18$, $\sigma_x = 38$, $\sigma_y = 2$, $\rho = 0.6$.
The regression line of x on y is

$$x - \bar{x} = b_{xy}\,(y - \bar{y}) = \rho\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y})$$

$$x - 970 = 0.6 \left(\frac{38}{2}\right)(y - 18)$$

$$x = 970 + (0.6)(19)(y - 18) = 764.8 + 11.4y$$

∴ The regression line of x on y is x = 11.4y + 764.8.

EXAMPLE 5.80  The two regression lines are 4x - 5y + 33 = 0 and 20x - 9y = 107, and Var(X) = 25. Find (i) the means of x and y, (ii) the values of $\rho$ and $\sigma_y$, and (iii) the angle between the regression lines. [AU June '06, December '07]

Solution
(i) Given: 4x - 5y = -33 and 20x - 9y = 107.
Since both regression lines pass through the means,

$$4\bar{x} - 5\bar{y} = -33 \qquad \text{(i)}$$
$$20\bar{x} - 9\bar{y} = 107 \qquad \text{(ii)}$$

Multiplying Eq. (i) by 5,

$$20\bar{x} - 25\bar{y} = -165 \qquad \text{(iii)}$$

Eq. (iii) - Eq. (ii) gives $-16\bar{y} = -272$, so $\bar{y} = 17$.
Substituting in Eq. (i), we get

$$4\bar{x} = -33 + 5(17) \;\Rightarrow\; \bar{x} = \frac{85 - 33}{4} = 13$$

(ii) From 20x - 9y = 107,

$$x = \frac{9}{20}\,y + \frac{107}{20} = 0.45y + 5.35 \;\Rightarrow\; b_{xy} = 0.45 = \text{coefficient of } y$$

From 5y = 33 + 4x,

$$y = 0.8x + 6.6 \;\Rightarrow\; b_{yx} = 0.8 = \text{coefficient of } x$$

$$\rho^2 = b_{xy} \times b_{yx} = 0.36$$

The correlation coefficient is $\rho = 0.6$.

Given: $\sigma_x^2 = 25$, so $\sigma_x = 5$.

$$b_{xy} = \rho\,\frac{\sigma_x}{\sigma_y} = 0.45 \;\Rightarrow\; \frac{0.6 \times 5}{\sigma_y} = 0.45$$
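$$\sigma_y = \frac{0.6 \times 5}{0.45} = 6.67$$

(iii) The angle between the regression lines is

$$\theta = \tan^{-1}\left[\frac{1 - \rho^2}{\rho}\cdot\frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}\right] = \tan^{-1}(0.512) \approx 27.1^\circ$$

EXAMPLE  If X, Y denote the deviations of the variates from their arithmetic means and if $\rho = 0.5$, ΣXY = 120, $\sigma_y = 8$, ΣX² = 90, find n, the number of items.

Solution
Given: $X = x - \bar{x}$, $Y = y - \bar{y}$.

$$\rho = \frac{\sum (x - \bar{x})(y - \bar{y})}{n\,\sigma_x \sigma_y} = \frac{\sum XY}{n\,\sigma_x \sigma_y}$$

$$\sigma_x^2 = \frac{1}{n}\sum X^2 = \frac{90}{n} \;\Rightarrow\; \sigma_x = \frac{9.48}{\sqrt{n}}$$

$$0.5 = \frac{120}{n \times \frac{9.48}{\sqrt{n}} \times 8} = \frac{120}{\sqrt{n} \times 9.48 \times 8} \;\Rightarrow\; \sqrt{n} = \frac{120}{0.5 \times 9.48 \times 8} = 3.16$$

$$n = 10$$

EXAMPLE 5.82  Can Y = 5 + 2.8X and X = 3 - 0.5Y be the estimated regression equations of Y on X and X on Y respectively? Explain your answer. [AU June '06, November '07]

Solution
Given:
X = 3 - 0.5Y, so $b_{xy} = -0.5$
Y = 5 + 2.8X, so $b_{yx} = 2.8$

$$\rho^2 = b_{xy} \times b_{yx} = (-0.5) \times (2.8) = -1.4$$

$\rho = \sqrt{-1.4}$, which is an imaginary quantity. $\rho$ cannot be imaginary.
∴ The given lines cannot be the estimated regression equations.

EXAMPLE  If y = 2x - 3 and y = 5x + 7 are the two regression lines, find the mean values of x and y. Find the correlation coefficient between x and y. Find an estimate of x when y = 1.

Solution
The two regression lines always pass through their means $(\bar{x}, \bar{y})$. We have

$$\bar{y} = 2\bar{x} - 3 \qquad \text{(i)}$$
$$\bar{y} = 5\bar{x} + 7 \qquad \text{(ii)}$$

so that $2\bar{x} - 3 = 5\bar{x} + 7$. Solving, we get

$$\bar{x} = -\frac{10}{3}, \qquad \bar{y} = -\frac{29}{3}$$

Taking y = 2x - 3 as the regression line of y on x ($b_{yx} = 2$) and y = 5x + 7, i.e. $x = \frac{1}{5}y - \frac{7}{5}$, as the regression line of x on y ($b_{xy} = \frac{1}{5}$), the correlation coefficient is

$$\rho = \sqrt{b_{xy}\, b_{yx}} = \sqrt{\frac{2}{5}} = 0.632$$

Aliter
If we choose the regression equation for x as $x = \frac{1}{2}y + \frac{3}{2}$ and for y as y = 5x + 7, then $b_{xy} = \frac{1}{2}$ and $b_{yx} = 5$, so that $\rho^2 = \frac{5}{2} > 1$. It is not possible.

EXAMPLE 5.84  Given that x = 4y + 5 and y = kx + 4 are the lines of regression of x on y and y on x respectively, show that $0 \le 4k \le 1$. If k = 1/16, find the means of the two variables and the coefficient of correlation between them.

Solution
Given: x = 4y + 5 is the regression line of x on y, so $b_{xy} = 4$.
The regression line of y on x is y = kx + 4, so $b_{yx} = k$.
But $0 \le \rho^2 \le 1$ and $\rho^2 = b_{xy}\, b_{yx} = 4k$, so

$$0 \le 4k \le 1$$

If $k = \frac{1}{16}$, then

$$\rho^2 = 4 \times \frac{1}{16} = \frac{1}{4} \;\Rightarrow\; \rho = \pm\frac{1}{2}$$

Since $b_{xy}$ is positive, $\rho$ is also positive, so $\rho = \frac{1}{2}$.

The lines of regression are

$$x = 4y + 5, \qquad y = \frac{x}{16} + 4$$

Since the regression lines pass through their means,

$$\bar{x} = 4\bar{y} + 5 \qquad \text{(i)}$$
$$\bar{y} = \frac{\bar{x}}{16} + 4 \qquad \text{(ii)}$$

Substituting Eq. (ii) in Eq. (i),

$$\bar{x} = 4\left(\frac{\bar{x}}{16} + 4\right) + 5 = \frac{\bar{x}}{4} + 21 \;\Rightarrow\; 3\bar{x} = 84 \;\Rightarrow\; \bar{x} = 28$$

Substituting in Eq. (i),

$$28 = 4\bar{y} + 5 \;\Rightarrow\; \bar{y} = 5.75$$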
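The calculations in the worked examples above can be checked numerically. The following is a minimal Python sketch, not taken from the text: the function names are hypothetical, the moments use the divide-by-n formulas of this section, and the angle helper assumes the formula tan θ = ((1 - ρ²)/|ρ|) · σxσy/(σx² + σy²).

```python
import math


def correlation_and_coefficients(xs, ys):
    """Return (rho, b_yx, b_xy, x_bar, y_bar) for paired discrete data.

    Uses divide-by-n moments; assumes neither variable is constant.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, n >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    var_x = sum(x * x for x in xs) / n - x_bar ** 2
    var_y = sum(y * y for y in ys) / n - y_bar ** 2
    rho = cov / math.sqrt(var_x * var_y)
    return rho, cov / var_x, cov / var_y, x_bar, y_bar


def means_from_lines(a1, b1, c1, a2, b2, c2):
    """Intersection of a1*x + b1*y = c1 and a2*x + b2*y = c2.

    Both regression lines pass through (x_bar, y_bar), so this is the mean point.
    """
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


def rho_from_coefficients(b_yx, b_xy):
    """rho = +/- sqrt(b_yx * b_xy), with the sign of the coefficients.

    Raises ValueError when the product lies outside [0, 1], i.e. when the
    two lines cannot be a valid pair of regression lines.
    """
    product = b_yx * b_xy
    if product < 0 or product > 1:
        raise ValueError("b_yx * b_xy must lie in [0, 1]")
    return math.copysign(math.sqrt(product), b_yx)


def angle_between_lines_deg(rho, sigma_x, sigma_y):
    """Acute angle (degrees) between the regression lines; 90 when rho = 0."""
    if rho == 0:
        return 90.0
    tan_theta = (1 - rho ** 2) / abs(rho) * sigma_x * sigma_y / (
        sigma_x ** 2 + sigma_y ** 2
    )
    return math.degrees(math.atan(tan_theta))


# Fathers' and sons' heights from the first worked example.
fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]
rho, b_yx, b_xy, x_bar, y_bar = correlation_and_coefficients(fathers, sons)

# Lines 4x - 5y = -33 and 20x - 9y = 107 with Var(X) = 25.
mean_x, mean_y = means_from_lines(4, -5, -33, 20, -9, 107)
rho_lines = rho_from_coefficients(0.8, 0.45)
sigma_y = rho_lines * 5 / 0.45
theta = angle_between_lines_deg(rho_lines, 5, sigma_y)
```

For the heights data this gives a correlation of about 0.603, and for the pair of lines it reproduces the means (13, 17) and ρ = 0.6. Calling `rho_from_coefficients(2.8, -0.5)` raises `ValueError`, which mirrors the argument that Y = 5 + 2.8X and X = 3 - 0.5Y cannot both be regression lines.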
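It can be proved that

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{73}{960}$$

$$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = \frac{73}{960}$$

$$E(XY) = \int_0^1\!\!\int_0^1 xy\, f(x, y)\,dx\,dy = \frac{3}{2}\int_0^1\!\!\int_0^1 xy\,(x^2 + y^2)\,dx\,dy = \frac{3}{8}$$

$$\mathrm{Cov}(X, Y) = E(XY) - E(X)\,E(Y) = \frac{3}{8} - \frac{5}{8}\cdot\frac{5}{8} = -\frac{15}{960}$$

$$\rho_{XY} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{-15/960}{73/960} = -\frac{15}{73}$$

The line of regression of X on Y is given by

$$X - E(X) = \rho\,\frac{\sigma_X}{\sigma_Y}\,[Y - E(Y)]$$

$$X - \frac{5}{8} = -\frac{15}{73}\left(Y - \frac{5}{8}\right)$$

$$X = -\frac{15}{73}\,Y + \frac{55}{73}$$

The regression line of Y on X is

$$Y - E(Y) = \rho\,\frac{\sigma_Y}{\sigma_X}\,[X - E(X)]$$

EXAMPLE 5.92  Let the random variable X have the marginal density

$$f(x) = 1, \qquad -\tfrac{1}{2} < x < \tfrac{1}{2}$$

and the conditional density of Y be

$$f(y \mid x) = 1, \qquad x < y < x + 1, \quad -\tfrac{1}{2} < x < 0$$
$$\phantom{f(y \mid x)} = 1, \qquad -x \ldots$$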
