
Multiple Linear Regression (MLR) Handouts

Yibi Huang

• Data and Models


• Least Squares Estimate, Fitted Values, Residuals
• Sum of Squares
• Do Regression in R
• Interpretation of Regression Coefficients
• t-Tests on Individual Regression Coefficients
• F-Tests on Multiple Regression Coefficients / Goodness-of-Fit

MLR - 1
Data for Multiple Linear Regression
Multiple linear regression is a generalized form of simple linear
regression, in which the data contains multiple explanatory
variables.
            SLR              MLR
            x     y          x1    x2    ...  xp    y
case 1:     x1    y1         x11   x12   ...  x1p   y1
case 2:     x2    y2         x21   x22   ...  x2p   y2
  ...       ...   ...        ...   ...        ...   ...
case n:     xn    yn         xn1   xn2   ...  xnp   yn
• For SLR, we observe pairs of variables.
  For MLR, we observe rows of variables.
  Each row (or pair) is called a case, a record, or a data point.
• $y_i$ is the response (or dependent variable) of the $i$th observation.
• There are $p$ explanatory variables (or covariates, predictors, independent variables), and $x_{ik}$ is the value of the $k$th explanatory variable for the $i$th case.
MLR - 2
Multiple Linear Regression Models

$$y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i, \quad \text{where the } \varepsilon_i\text{'s are i.i.d. } N(0, \sigma^2)$$

In the model above,

• the $\varepsilon_i$'s (errors, or noise) are i.i.d. $N(0, \sigma^2)$;
• the parameters include:
  $\beta_0$ = intercept;
  $\beta_k$ = regression coefficient (slope) for the $k$th explanatory variable, $k = 1, \dots, p$;
  $\sigma^2 = \mathrm{Var}(\varepsilon_i)$, the variance of the errors;
• Observed (known): $y_i, x_{i1}, x_{i2}, \dots, x_{ip}$.
  Unknown: $\beta_0, \beta_1, \dots, \beta_p$, $\sigma^2$, and the $\varepsilon_i$'s.
• Random variables: the $\varepsilon_i$'s and $y_i$'s.
  Constants (nonrandom): the $\beta_k$'s, $\sigma^2$, and the $x_{ik}$'s.
MLR - 3
Fitting the Model — Least Squares Method
Recall that for SLR, the least squares estimate $(\hat\beta_0, \hat\beta_1)$ for $(\beta_0, \beta_1)$ gives the intercept and slope of the straight line with the minimum sum of squared vertical distances to the data points,

$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_i)^2.$$

[Scatterplot: Y = % in poverty vs. X = % HS grad, with the fitted least squares line.]

MLR is just like SLR. The least squares estimate $(\hat\beta_0, \dots, \hat\beta_p)$ for $(\beta_0, \dots, \beta_p)$ gives the intercept and slopes of the (hyper)plane with the minimum sum of squared vertical distances to the data points,

$$\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip})^2.$$

MLR - 4
Solving the Least Squares Problem (1)
From now on, we use the "hat" symbol to distinguish the estimated coefficient $\hat\beta_j$ from the actual unknown coefficient $\beta_j$.

To find the $(\hat\beta_0, \hat\beta_1, \dots, \hat\beta_p)$ that minimize

$$L(\hat\beta_0, \hat\beta_1, \dots, \hat\beta_p) = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip})^2,$$

one can take the derivatives of $L$ with respect to each $\hat\beta_j$,

$$\frac{\partial L}{\partial \hat\beta_0} = -2 \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip}),$$
$$\frac{\partial L}{\partial \hat\beta_k} = -2 \sum_{i=1}^{n} x_{ik}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip}), \quad k = 1, 2, \dots, p,$$

and set them equal to 0. This yields a system of $(p + 1)$ equations in $(p + 1)$ unknowns.
MLR - 5
Solving the Least Squares Problem (2)
The least squares estimate $(\hat\beta_0, \hat\beta_1, \dots, \hat\beta_p)$ is the solution to the following system of equations, called the normal equations:

$$n\hat\beta_0 + \hat\beta_1 \sum_{i=1}^{n} x_{i1} + \dots + \hat\beta_p \sum_{i=1}^{n} x_{ip} = \sum_{i=1}^{n} y_i$$
$$\hat\beta_0 \sum_{i=1}^{n} x_{i1} + \hat\beta_1 \sum_{i=1}^{n} x_{i1}^2 + \dots + \hat\beta_p \sum_{i=1}^{n} x_{i1} x_{ip} = \sum_{i=1}^{n} x_{i1} y_i$$
$$\vdots$$
$$\hat\beta_0 \sum_{i=1}^{n} x_{ik} + \hat\beta_1 \sum_{i=1}^{n} x_{ik} x_{i1} + \dots + \hat\beta_p \sum_{i=1}^{n} x_{ik} x_{ip} = \sum_{i=1}^{n} x_{ik} y_i$$
$$\vdots$$
$$\hat\beta_0 \sum_{i=1}^{n} x_{ip} + \hat\beta_1 \sum_{i=1}^{n} x_{ip} x_{i1} + \dots + \hat\beta_p \sum_{i=1}^{n} x_{ip}^2 = \sum_{i=1}^{n} x_{ip} y_i$$

• Don't worry about solving the equations by hand.
  R and many other software packages can do the computation for us.
• In general, $\hat\beta_j \neq \beta_j$, but they will be close under some conditions.
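In matrix form the normal equations are $(X^\top X)\hat\beta = X^\top y$, which R can solve directly. The short sketch below is an added illustration (not part of the original handout), using made-up data and variable names, that solves the normal equations and checks the answer against lm():

  # Hypothetical data: n = 20 cases, p = 2 covariates (names invented for illustration)
  set.seed(1)
  n  <- 20
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

  # Design matrix with a column of 1's for the intercept
  X <- cbind(1, x1, x2)

  # Solve the normal equations (X'X) beta-hat = X'y
  beta.hat <- solve(t(X) %*% X, t(X) %*% y)
  beta.hat

  # The same estimates from lm()
  coef(lm(y ~ x1 + x2))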

MLR - 6
Fitted Values

The fitted value or predicted value is

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \dots + \hat\beta_p x_{ip}.$$

• Again, the "hat" symbol is used to distinguish the fitted value $\hat y_i$ from the actual observed value $y_i$.

MLR - 7
Residuals
• One cannot directly compute the errors

  $$\varepsilon_i = y_i - \beta_0 - \beta_1 x_{i1} - \dots - \beta_p x_{ip},$$

  since the coefficients $\beta_0, \beta_1, \dots, \beta_p$ are unknown.
• The errors $\varepsilon_i$ can be estimated by the residuals $e_i$, defined as

  residual $e_i$ = observed $y_i$ − predicted $y_i$,

  $$e_i = y_i - \hat y_i = y_i - (\hat\beta_0 + \hat\beta_1 x_{i1} + \dots + \hat\beta_p x_{ip}) = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip}.$$

• $e_i \neq \varepsilon_i$ in general, since $\hat\beta_j \neq \beta_j$.
• Graphical explanation (figure omitted).

MLR - 8
Properties of Residuals

Recall that the least squares estimate $(\hat\beta_0, \hat\beta_1, \dots, \hat\beta_p)$ satisfies the equations

$$\sum_{i=1}^{n} \underbrace{(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip})}_{= \,y_i - \hat y_i \,=\, e_i \,=\, \text{residual}} = 0 \quad \text{and} \quad
\sum_{i=1}^{n} x_{ik}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \dots - \hat\beta_p x_{ip}) = 0, \quad k = 1, 2, \dots, p.$$

Thus the residuals $e_i$ have the properties

$$\underbrace{\sum_{i=1}^{n} e_i = 0}_{\text{Residuals add up to 0.}} \qquad \underbrace{\sum_{i=1}^{n} x_{ik} e_i = 0, \; k = 1, 2, \dots, p.}_{\text{Residuals are orthogonal to covariates.}}$$
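These two properties are easy to verify numerically for any lm() fit. The check below is an added illustration (not in the original slides) using made-up data:

  # Hypothetical data, used only to verify the residual properties
  set.seed(2)
  n  <- 30
  x1 <- rnorm(n); x2 <- runif(n)
  y  <- 5 + x1 + 3 * x2 + rnorm(n)

  fit <- lm(y ~ x1 + x2)
  e   <- resid(fit)

  sum(e)        # essentially 0 (up to rounding error)
  sum(x1 * e)   # essentially 0: residuals orthogonal to x1
  sum(x2 * e)   # essentially 0: residuals orthogonal to x2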

MLR - 9
Sum of Squares
Observe that
$$y_i - \bar y = (\hat y_i - \bar y) + (y_i - \hat y_i).$$
Squaring both sides, we get
$$(y_i - \bar y)^2 = (\hat y_i - \bar y)^2 + (y_i - \hat y_i)^2 + 2(\hat y_i - \bar y)(y_i - \hat y_i).$$
Summing over all the cases $i = 1, 2, \dots, n$, we get
$$\underbrace{\sum_{i=1}^{n} (y_i - \bar y)^2}_{SST} = \underbrace{\sum_{i=1}^{n} (\hat y_i - \bar y)^2}_{SSR} + \underbrace{\sum_{i=1}^{n} \overbrace{(y_i - \hat y_i)^2}^{=\,e_i^2}}_{SSE} + 2\underbrace{\sum_{i=1}^{n} (\hat y_i - \bar y)(y_i - \hat y_i)}_{=\,0,\ \text{see next page}}.$$

MLR - 10
Why is $\sum_{i=1}^{n} (\hat y_i - \bar y)(y_i - \hat y_i) = 0$?

$$\sum_{i=1}^{n} (\hat y_i - \bar y)\underbrace{(y_i - \hat y_i)}_{=\,e_i}
= \sum_{i=1}^{n} \hat y_i e_i - \bar y \sum_{i=1}^{n} e_i
= \sum_{i=1}^{n} (\hat\beta_0 + \hat\beta_1 x_{i1} + \dots + \hat\beta_p x_{ip}) e_i - \bar y \sum_{i=1}^{n} e_i$$
$$= \hat\beta_0 \underbrace{\sum_{i=1}^{n} e_i}_{=0} + \hat\beta_1 \underbrace{\sum_{i=1}^{n} x_{i1} e_i}_{=0} + \dots + \hat\beta_p \underbrace{\sum_{i=1}^{n} x_{ip} e_i}_{=0} - \bar y \underbrace{\sum_{i=1}^{n} e_i}_{=0} = 0,$$

in which we used the properties of the residuals that $\sum_{i=1}^{n} e_i = 0$ and $\sum_{i=1}^{n} x_{ik} e_i = 0$ for all $k = 1, \dots, p$.
MLR - 11
Interpretation of Sum of Squares
$$\underbrace{\sum_{i=1}^{n} (y_i - \bar y)^2}_{SST} = \underbrace{\sum_{i=1}^{n} (\hat y_i - \bar y)^2}_{SSR} + \underbrace{\sum_{i=1}^{n} \overbrace{(y_i - \hat y_i)^2}^{=\,e_i^2}}_{SSE}$$

• SST = total sum of squares
  – total variability of $y$
  – depends on the response $y$ only, not on the form of the model
• SSR = regression sum of squares
  – variability of $y$ explained by $x_1, \dots, x_p$
• SSE = error (residual) sum of squares
  – $SSE = \min_{\beta_0, \beta_1, \dots, \beta_p} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1} - \dots - \beta_p x_{ip})^2$
  – variability of $y$ not explained by the $x$'s
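The decomposition can be checked numerically for any lm() fit. The sketch below is an added illustration (not part of the original slides), with made-up data, computing SST, SSR, and SSE directly and confirming that SST = SSR + SSE:

  # Verifying SST = SSR + SSE for an arbitrary lm() fit (hypothetical data)
  set.seed(3)
  n  <- 40
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- 2 + x1 + 0.5 * x2 + rnorm(n)
  fit <- lm(y ~ x1 + x2)

  SST <- sum((y - mean(y))^2)
  SSR <- sum((fitted(fit) - mean(y))^2)
  SSE <- sum(resid(fit)^2)

  c(SST = SST, SSR.plus.SSE = SSR + SSE)   # the two numbers agree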

MLR - 12
Degrees of Freedom
If the MLR model $y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i$, with the $\varepsilon_i$'s i.i.d. $N(0, \sigma^2)$, is true, it can be shown that
$$\frac{SSE}{\sigma^2} \sim \chi^2_{n-p-1}.$$
If we further assume that $\beta_1 = \beta_2 = \dots = \beta_p = 0$, then
$$\frac{SST}{\sigma^2} \sim \chi^2_{n-1}, \qquad \frac{SSR}{\sigma^2} \sim \chi^2_{p},$$
and SSR is independent of SSE.

Note that the degrees of freedom of the 3 chi-square distributions,
$$df_T = n - 1, \quad df_R = p, \quad df_E = n - p - 1,$$
break down similarly,
$$df_T = df_R + df_E,$$
just like $SST = SSR + SSE$.
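As an optional check (an added illustration, not in the original handout), a short simulation can make the claim $SSE/\sigma^2 \sim \chi^2_{n-p-1}$ plausible; all settings below are made up:

  # Simulate SSE/sigma^2 under a true MLR model and compare with chi-square(n - p - 1)
  set.seed(4)
  n <- 25; p <- 3; sigma <- 2
  x <- matrix(rnorm(n * p), n, p)
  beta <- c(1, -1, 0.5, 2)              # intercept and 3 slopes (arbitrary)

  sse.scaled <- replicate(5000, {
    y   <- beta[1] + x %*% beta[-1] + rnorm(n, sd = sigma)
    fit <- lm(y ~ x)
    sum(resid(fit)^2) / sigma^2
  })

  mean(sse.scaled)   # close to n - p - 1 = 21, the mean of the chi-square distribution
  var(sse.scaled)    # close to 2 * (n - p - 1) = 42, its variance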


MLR - 13
Why Does SSE Have $n - p - 1$ Degrees of Freedom?

The $n$ residuals $e_1, \dots, e_n$ cannot all vary freely.

There are $p + 1$ constraints:

$$\sum_{i=1}^{n} e_i = 0 \quad \text{and} \quad \sum_{i=1}^{n} x_{ik} e_i = 0 \quad \text{for } k = 1, \dots, p.$$

So only $n - (p + 1)$ of them can vary freely.

The $p + 1$ constraints come from the $p + 1$ coefficients $\beta_0, \dots, \beta_p$ in the model; each contributes one constraint $\partial L / \partial \hat\beta_k = 0$.

MLR - 14
Mean Square Error (MSE) — Estimate of σ 2
A mean square is a sum of squares divided by its degrees of freedom:
$$MST = \frac{SST}{df_T} = \frac{SST}{n-1} = \text{sample variance of } Y,$$
$$MSR = \frac{SSR}{df_R} = \frac{SSR}{p},$$
$$MSE = \frac{SSE}{df_E} = \frac{SSE}{n-p-1} = \hat\sigma^2.$$

• From the fact that $SSE/\sigma^2 \sim \chi^2_{n-p-1}$ and that the mean of a $\chi^2_k$ distribution is $k$, we know that MSE is an unbiased estimator of $\sigma^2$.
• Though SSE always decreases as we add terms to the model, adding unimportant terms may increase MSE.
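To tie this to R's output, the sketch below (an added illustration with made-up data; any lm() fit would do) computes MSE by hand and compares it with the residual standard error reported by summary():

  # MSE computed by hand vs. summary()'s residual standard error (squared)
  set.seed(5)
  n  <- 30
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- 1 + x1 - x2 + rnorm(n, sd = 3)
  fit <- lm(y ~ x1 + x2)

  p   <- 2
  MSE <- sum(resid(fit)^2) / (n - p - 1)

  MSE                     # estimate of sigma^2
  summary(fit)$sigma^2    # the same number, from R's summary output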

MLR - 15
Example: Housing Price
Price  BDR  FLR   FP  RMS  ST  LOT  BTH  CON  GAR  LOC
 53    2     967  0    5   0   39   1.5  1    0.0  0
 55    2     815  1    5   0   33   1.0  1    2.0  0
 56    3     900  0    5   1   35   1.5  1    1.0  0
 58    3    1007  0    6   1   24   1.5  0    2.0  0
 64    3    1100  1    7   0   50   1.5  1    1.5  0
 44    4     897  0    7   0   25   2.0  0    1.0  0
 49    5    1400  0    8   0   30   1.0  0    1.0  0
 70    3    2261  0    6   0   29   1.0  0    2.0  0
 72    4    1290  0    8   1   33   1.5  1    1.5  0
 82    4    2104  0    9   0   40   2.5  1    1.0  0
 85    8    2240  1   12   1   50   3.0  0    2.0  0
 45    2     641  0    5   0   25   1.0  0    0.0  1
 47    3     862  0    6   0   25   1.0  1    0.0  1
 49    4    1043  0    7   0   30   1.5  0    0.0  1
 56    4    1325  0    8   0   50   1.5  0    0.0  1
 60    2     782  0    5   1   25   1.0  0    0.0  1
 62    3    1126  0    7   1   30   2.0  1    0.0  1
 64    4    1226  0    8   0   37   2.0  0    2.0  1
 ...
 50    2     691  0    6   0   30   1.0  0    2.0  0
 65    3    1023  0    7   1   30   2.0  1    1.0  0

Price = Selling price in $1000
BDR   = Number of bedrooms
FLR   = Floor space in sq. ft.
FP    = Number of fireplaces
RMS   = Number of rooms
ST    = Storm windows (1 if present, 0 if absent)
LOT   = Front footage of lot in feet
BTH   = Number of bathrooms
CON   = Construction (1 if frame, 0 if brick)
GAR   = Garage size (0 = no garage, 1 = one-car garage, etc.)
LOC   = Location (1 if property is in zone A, 0 otherwise)
MLR - 16
How to Do Regression Using R?

> housing = read.table("housing.txt",h=TRUE) # to load the data


> lm(Price ~ FLR+LOT+BDR+GAR+ST, data=housing)

Call:
lm(formula = Price ~ FLR + LOT + BDR + GAR + ST, data = housing)

Coefficients:
(Intercept) FLR LOT BDR GAR ST
24.63232 0.02009 0.44216 -3.44509 3.35274 11.64033

The lm() command above asks R to fit the model

$$\text{Price} = \beta_0 + \beta_1\,\text{FLR} + \beta_2\,\text{LOT} + \beta_3\,\text{BDR} + \beta_4\,\text{GAR} + \beta_5\,\text{ST} + \varepsilon$$

and R gives us the regression equation

$$\widehat{\text{Price}} = 24.63 + 0.02\,\text{FLR} + 0.44\,\text{LOT} - 3.45\,\text{BDR} + 3.35\,\text{GAR} + 11.64\,\text{ST}.$$

MLR - 17
$$\widehat{\text{Price}} = 24.63 + 0.02\,\text{FLR} + 0.44\,\text{LOT} - 3.45\,\text{BDR} + 3.35\,\text{GAR} + 11.64\,\text{ST}$$

The regression equation tells us that, with the other covariates held fixed:

• an extra square foot of floor area increases the price by $20,
• an extra foot of front footage increases the price by $440,
• an additional bedroom changes the price by −$3450,
• an additional garage space increases the price by $3350,
• using storm windows increases the price by $11,640.
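One way to see the "holding everything else fixed" reading of a coefficient (an illustration added here, not in the original slides, assuming the fit from the previous slide is stored as, say, fit) is to predict the price of the same hypothetical house twice, differing only by one square foot of floor area:

  # Assumes fit <- lm(Price ~ FLR + LOT + BDR + GAR + ST, data = housing)

  # A hypothetical house, and the same house with one extra square foot of floor area
  house1 <- data.frame(FLR = 1000, LOT = 30, BDR = 3, GAR = 1, ST = 1)
  house2 <- transform(house1, FLR = FLR + 1)

  predict(fit, house2) - predict(fit, house1)  # equals the FLR coefficient, about 0.02 ($20)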

Question:
Why does an additional bedroom make a house less valuable?

MLR - 18
Interpretation of Regression Coefficients
• $\beta_0$ = intercept = the mean value of $y$ when all the $x_j$'s are 0.
  – It may not have a practical meaning.
    E.g., $\beta_0$ is meaningless in the housing price model, as no housing unit has 0 floor space.
• $\beta_j$, the regression coefficient for $x_j$, is the mean change in the response $y$ when $x_j$ is increased by one unit, holding all the other $x_j$'s constant.
• The interpretation of $\beta_j$ depends on the presence of other covariates in the model.
  E.g., the meaning of $\beta_1$ in the following 2 models is different:

  Model 1: $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$
  Model 2: $y_i = \beta_0 + \beta_1 x_{i1} + \varepsilon_i$.

MLR - 19
What’s Wrong?
# Model 1
> lm(Price ~ BDR, data=housing)

(Intercept) BDR
43.487 3.921

The regression coefficient for BDR is 3.921 in Model 1 above,
but −3.445 in Model 2 below.
# Model 2
> lm(Price ~ FLR+LOT+BDR+GAR+ST, data=housing)

(Intercept) FLR LOT BDR GAR ST


24.63232 0.02009 0.44216 -3.44509 3.35274 11.64033

Considering BDR alone, house prices increase with BDR.

However, an extra bedroom makes a housing unit less valuable when the other covariates (FLR, LOT, etc.) are held fixed.

Does this make sense?
MLR - 20
More R Commands

> lm1 = lm(Price ~ FLR+RMS+BDR+GAR+LOT+ST+CON+LOC, data=housing)


> summary(lm1) # Regression output with more details
# including multiple R-squared,
# and the estimate of sigma

> lm1$coef # show the estimated beta’s


> lm1$fitted # show the fitted values
> lm1$res # show the residuals

MLR - 21
> lm1 = lm(Price ~ FLR+LOT+BDR+GAR+ST, data=housing)
> summary(lm1)

Call:
lm(formula = Price ~ FLR + LOT + BDR + GAR + ST, data = housing)

Residuals:
Min 1Q Median 3Q Max
-9.7530 -2.9535 0.1779 3.7183 12.9728

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24.632318 4.836743 5.093 5.56e-05 ***
FLR 0.020094 0.003668 5.478 2.31e-05 ***
LOT 0.442164 0.150023 2.947 0.007965 **
BDR -3.445086 1.279347 -2.693 0.013995 *
GAR 3.352739 1.560239 2.149 0.044071 *
ST 11.640334 2.688867 4.329 0.000326 ***
---
Residual standard error: 5.79 on 20 degrees of freedom
Multiple R-squared: 0.8306,Adjusted R-squared: 0.7882
F-statistic: 19.61 on 5 and 20 DF, p-value: 4.306e-07

MLR - 22
t-Tests on Individual Regression Coefficients
For an MLR model $Y_i = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip} + \varepsilon_i$, to test the hypotheses
$$H_0 : \beta_j = c \quad \text{vs.} \quad H_a : \beta_j \neq c,$$
the t-statistic is
$$t = \frac{\hat\beta_j - c}{SE(\hat\beta_j)},$$
in which $SE(\hat\beta_j)$ is the standard error of $\hat\beta_j$.

• The general formula for $SE(\hat\beta_j)$ is a bit complicated but unimportant in STAT222 and hence is omitted.
• R can compute $SE(\hat\beta_j)$ for us.
• Formulas for $SE(\hat\beta_j)$ in a few special cases will be given later.

Under $H_0$, this t-statistic has a t-distribution with $n - p - 1$ degrees of freedom.
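To connect the formula to R's output, the sketch below (added for illustration, assuming lm1 is the housing fit from the earlier slide) recomputes the t-value and two-sided P-value for LOT by hand:

  # Assumes lm1 <- lm(Price ~ FLR + LOT + BDR + GAR + ST, data = housing)
  est <- coef(summary(lm1))["LOT", "Estimate"]    # about 0.442
  se  <- coef(summary(lm1))["LOT", "Std. Error"]  # about 0.150

  t.val <- (est - 0) / se                               # test H0: beta_LOT = 0
  p.val <- 2 * pt(-abs(t.val), df = df.residual(lm1))   # two-sided, df = n - p - 1 = 20

  c(t = t.val, p = p.val)   # about 2.947 and 0.008, matching the summary(lm1) output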

MLR - 23
Estimate Std. Error t value Pr(>|t|)
(Intercept) 24.632318 4.836743 5.093 5.56e-05 ***
FLR 0.020094 0.003668 5.478 2.31e-05 ***
LOT 0.442164 0.150023 2.947 0.007965 **
......(some rows are omitted)
ST 11.640334 2.688867 4.329 0.000326 ***
• The first column gives the variable names.
• The column Estimate gives the LS estimates $\hat\beta_j$ of the $\beta_j$'s.
• The column Std. Error gives $SE(\hat\beta_j)$, the standard error of $\hat\beta_j$.
• The column t value gives the t-value $= \hat\beta_j / SE(\hat\beta_j)$.
• The column Pr(>|t|) gives the P-value for testing $H_0 : \beta_j = 0$ vs. $H_a : \beta_j \neq 0$.

E.g., for LOT, we see
$$\hat\beta_{LOT} \approx 0.442, \quad SE(\hat\beta_{LOT}) \approx 0.150, \quad t = \frac{\hat\beta_{LOT}}{SE(\hat\beta_{LOT})} \approx \frac{0.442}{0.150} \approx 2.947.$$
The P-value 0.007965 is the 2-sided P-value for testing $H_0 : \beta_{LOT} = 0$.
MLR - 24
Nested Models
We say Model 1 is nested in Model 2 if Model 1 is a special case
of Model 2 (and hence Model 2 is an extension of Model 1).
E.g., for the 4 models below,

Model A : Y = β0 + β1 X1 + β2 X2 + β3 X3 + ε
Model B : Y = β0 + β1 X1 + β2 X2 + ε
Model C : Y = β0 + β1 X1 + β3 X3 + ε
Model D : Y = β0 + β1 (X1 + X2 ) + ε

• B is nested in A, since A reduces to B when $\beta_3 = 0$.
• C is also nested in A, since A reduces to C when $\beta_2 = 0$.
• D is nested in B, since B reduces to D when $\beta_1 = \beta_2$.
• B and C are NOT nested either way.
• D is NOT nested in C.

MLR - 25
Nested Relationship is Transitive
If Model 1 is nested in Model 2, and Model 2 is nested in Model 3,
then Model 1 is also nested in Model 3.
For example, for models in the previous slide,

D is nested in B, and B is nested in A,

which implies that D is also nested in A. This is clearly true, because Model A reduces to Model D when

$$\beta_1 = \beta_2 \quad \text{and} \quad \beta_3 = 0.$$

When two models are nested (Model 1 is nested in Model 2),

• the smaller model (Model 1) is called the reduced model, and
• the more general model (Model 2) is called the full model.

MLR - 26
SST of Nested Models
Question: Compare the SST for Model A and the SST for Model
B. Which one is larger? Or are they equal?

What about the SST for Model C? For Model D?


MLR - 27
SSE of Nested Models
When a reduced model is nested in a full model, then
(i) $SSE_{reduced} \geq SSE_{full}$, and (ii) $SSR_{reduced} \leq SSR_{full}$.

Proof. We will prove (i) for
• the full model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$, and
• the reduced model $y_i = \beta_0 + \beta_1 x_{i1} + \beta_3 x_{i3} + \varepsilon_i$.
The proofs for other nested models are similar.

$$SSE_{full} = \min_{\beta_0, \beta_1, \beta_2, \beta_3} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \beta_3 x_{i3})^2
\leq \min_{\beta_0, \beta_1, \beta_3} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_{i1} - \beta_3 x_{i3})^2 = SSE_{reduced}$$

Part (ii) follows directly from (i), the identity $SST = SSR + SSE$, and the fact that all MLR models of the same data set have a common SST.
MLR - 28
General Framework for Testing Nested Models
$$H_0 : \text{the reduced model is true} \quad \text{vs.} \quad H_a : \text{the full model is true}$$

• As the reduced model is nested in the full model, $SSE_{reduced} \geq SSE_{full}$.
• Simplicity or accuracy?
  – The full model fits the data better (smaller SSE) but is more complicated.
  – The reduced model doesn't fit as well but is simpler.
  – If $SSE_{reduced} \approx SSE_{full}$, one can sacrifice a bit of accuracy in exchange for simplicity.
  – If $SSE_{reduced} \gg SSE_{full}$, the simplicity would cost too much in accuracy. The full model is preferred.
MLR - 29
The F -Statistic

$$F = \frac{(SSE_{reduced} - SSE_{full})/(df_{reduced} - df_{full})}{MSE_{full}}$$

• $SSE_{reduced} - SSE_{full}$ is the reduction in SSE from replacing the reduced model with the full model.
• $df_{reduced}$ is the df for error of the reduced model.
• $df_{full}$ is the df for error of the full model.
• $F \geq 0$ since $SSE_{reduced} \geq SSE_{full} \geq 0$.
• The smaller the F-statistic, the more we favor the reduced model.
• Under $H_0$, the F-statistic has an F-distribution with $df_{reduced} - df_{full}$ and $df_{full}$ degrees of freedom.
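For a concrete check (an added illustration, assuming the housing data from the earlier slides is loaded), the sketch below computes this F-statistic by hand for the reduced model without BDR and GAR, mirroring the anova() comparison shown on a later slide:

  # Assumes housing has been loaded as on the earlier slide
  lmfull    <- lm(Price ~ FLR + LOT + BDR + GAR + ST, data = housing)
  lmreduced <- lm(Price ~ FLR + LOT + ST,             data = housing)

  sse.full <- sum(resid(lmfull)^2);    df.full <- df.residual(lmfull)
  sse.red  <- sum(resid(lmreduced)^2); df.red  <- df.residual(lmreduced)

  F <- ((sse.red - sse.full) / (df.red - df.full)) / (sse.full / df.full)
  F                                                    # matches the F from anova(lmreduced, lmfull)
  pf(F, df.red - df.full, df.full, lower.tail = FALSE) # its p-value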

MLR - 30
Testing All Coefficients Equal Zero
Testing the hypotheses
$$H_0 : \beta_1 = \dots = \beta_p = 0 \quad \text{vs.} \quad H_a : \text{not all of } \beta_1, \dots, \beta_p \text{ are } 0$$
is a test of the overall significance of a model.

$$\text{Full}: \quad y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i$$
$$\text{Reduced}: \quad y_i = \beta_0 + \varepsilon_i \quad \text{(all covariates are unnecessary)}$$

• The LS estimate for $\beta_0$ in the reduced model is $\hat\beta_0 = \bar y$, so
  $$SSE_{reduced} = \sum_{i=1}^{n} (y_i - \hat\beta_0)^2 = \sum_{i=1}^{n} (y_i - \bar y)^2 = SST_{full}.$$
• $df_{reduced} = n - 1$, because the reduced model has only one coefficient, $\beta_0$.
• $df_{full} = n - p - 1$.
MLR - 31
Testing All Coefficients Equal Zero
Hence
$$F = \frac{(SSE_{reduced} - SSE_{full})/(df_{reduced} - df_{full})}{MSE_{full}}
= \frac{(SST_{full} - SSE_{full})/[n - 1 - (n - p - 1)]}{SSE_{full}/(n - p - 1)}
= \frac{SSR_{full}/p}{SSE_{full}/(n - p - 1)}.$$
Moreover, $F \sim F_{p,\,n-p-1}$ under $H_0 : \beta_1 = \beta_2 = \dots = \beta_p = 0$.

In R, the F-statistic and its p-value are displayed in the last line of the output of the summary() command.
> lm1 = lm(Price ~ FLR+LOT+BDR+GAR+ST, data=housing)
> summary(lm1)
... (output omitted)

Residual standard error: 5.79 on 20 degrees of freedom


Multiple R-squared: 0.8306,Adjusted R-squared: 0.7882
F-statistic: 19.61 on 5 and 20 DF, p-value: 4.306e-07
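As an added check (not part of the original slides, assuming lm1 and housing from above), the overall F-statistic and its p-value can be reproduced from the sums of squares:

  # Reproducing the overall F-test by hand
  sse <- sum(resid(lm1)^2)
  sst <- sum((housing$Price - mean(housing$Price))^2)
  ssr <- sst - sse

  n <- nrow(housing); p <- 5
  F <- (ssr / p) / (sse / (n - p - 1))
  F                                           # about 19.61
  pf(F, p, n - p - 1, lower.tail = FALSE)     # about 4.3e-07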

MLR - 32
ANOVA and the F -Test

The test of all coefficients equal to zero is often summarized in an ANOVA table.

Source       df                 Sum of Squares   Mean Squares       F
Regression   dfR = p            SSR              MSR = SSR/dfR      F = MSR/MSE
Error        dfE = n - p - 1    SSE              MSE = SSE/dfE
Total        dfT = n - 1        SST

MLR - 33
Testing Some Coefficients Equal to Zero

E.g., for the housing price data, we may want to test if we can
eliminate BDR and GAR from the model,
i.e., H0 : βBDR = βGAR = 0.

> lmfull = lm(Price ~ FLR+LOT+BDR+GAR+ST, data=housing)


> lmreduced = lm(Price ~ FLR+LOT+ST, data=housing)
> anova(lmreduced, lmfull)
Analysis of Variance Table

Model 1: Price ~ FLR + LOT + ST


Model 2: Price ~ FLR + LOT + BDR + GAR + ST
Res.Df RSS Df Sum of Sq F Pr(>F)
1 22 1105.01
2 20 670.55 2 434.46 6.4792 0.006771 **

Note that SSE is called RSS (residual sum of squares) in R.

MLR - 34
Testing Equality of Coefficients
Example. To test H0 : β1 = β2 = β3 , the reduced model is

Y = β0 + β1 X1 + β1 X2 + β1 X3 + β4 X4 + ε
= β0 + β1 (X1 + X2 + X3 ) + β4 X4 + ε

1. Make a new variable W = X1 + X2 + X3


2. Fit the reduced model by regressing Y on W and X4
3. Find SSEreduced and dfreduced − dffull = 2
4. In R
> lmfull = lm(Y ~ X1 + X2 + X3 + X4)
> lmreduced = lm(Y ~ I(X1 + X2 + X3) + X4)
> anova(lmreduced, lmfull)
The line lmreduced = lm(Y ~ I(X1 + X2 + X3) + X4) is
equivalent to
> W = X1 + X2 + X3
> lmreduced = lm(Y ~ W + X4)

MLR - 35
