Chapter 1 Simple Linear Regression

(part 6: matrix version)

1 Overview

• Simple linear regression model: response variable Y, a single independent variable X

Y = β0 + β1 X + ε

• Multiple linear regression model: response Y, more than one independent variable
X1, X2, ..., Xp
Y = β0 + β1 X1 + β2 X2 + ... + βp Xp + ε

• To investigate the multiple regression model, we need matrices

2 Review of Matrices

• A matrix: a rectangular array of elements arranged in rows and columns

• An example with 3 rows and 2 columns (rows labeled Row 1–Row 3, columns labeled Column 1 and Column 2):

\begin{bmatrix} 100 & 22 \\ 300 & 46 \\ 600 & 81 \end{bmatrix}

2.1 A matrix with r rows and c columns

•
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1c} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2c} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{ic} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
a_{r1} & a_{r2} & \cdots & a_{rj} & \cdots & a_{rc}
\end{bmatrix}

• Sometimes we denote it as A = [a_{ij}], i = 1, ..., r; j = 1, ..., c

• r and c are called the dimensions of the matrix

2.2 Square matrix and Vector

• Square matrix: equal number of rows and columns


\begin{bmatrix} 4 & 7 \\ 3 & 9 \end{bmatrix}, \qquad
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}

• Vector: a matrix with only one row or one column

A = \begin{bmatrix} 4 & 7 & 10 \end{bmatrix}, \qquad
B = \begin{bmatrix} 15 \\ 25 \\ 20 \end{bmatrix}

2.3 Transpose of a matrix and equality of matrices



• The transpose of a matrix A is another matrix, denoted by A', obtained by interchanging the rows and columns of A:

A = \begin{bmatrix} 2 & 5 \\ 7 & 10 \\ 3 & 4 \end{bmatrix}, \qquad
A' = \begin{bmatrix} 2 & 7 & 3 \\ 5 & 10 & 4 \end{bmatrix}

• two matrices are equal if they have the same dimension and all the corresponding
elements are equal

Suppose
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}, \qquad
B = \begin{bmatrix} 17 & 2 \\ 14 & 5 \\ 13 & 9 \end{bmatrix}.
If A = B, then a_{11} = 17, a_{12} = 2, ...

2.4 Matrix addition and subtraction

• Adding or subtracting two matrices requires that they have the same dimension.

A = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & 2 \\ 2 & 3 \\ 3 & 4 \end{bmatrix},

A + B = \begin{bmatrix} 1+1 & 4+2 \\ 2+2 & 5+3 \\ 3+3 & 6+4 \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ 4 & 8 \\ 6 & 10 \end{bmatrix},

A - B = \begin{bmatrix} 1-1 & 4-2 \\ 2-2 & 5-3 \\ 3-3 & 6-4 \end{bmatrix} = \begin{bmatrix} 0 & 2 \\ 0 & 2 \\ 0 & 2 \end{bmatrix}

2.5 Matrix multiplication

• Multiplication of a Matrix by a Scalar


A = \begin{bmatrix} 2 & 7 \\ 9 & 3 \end{bmatrix}, \qquad
4A = A4 = 4 \begin{bmatrix} 2 & 7 \\ 9 & 3 \end{bmatrix} = \begin{bmatrix} 8 & 28 \\ 36 & 12 \end{bmatrix}

• Multiplication of a Matrix by a Matrix. If A has dimension r × c and B has dimension c × s, the product AB is a matrix of dimension r × s whose element in the ith row and jth column is

\sum_{k=1}^{c} a_{ik} b_{kj}

•
A \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= \begin{bmatrix} 4 & 2 \\ 5 & 8 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= \begin{bmatrix} 4a_1 + 2a_2 \\ 5a_1 + 8a_2 \end{bmatrix}
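As a quick numerical check of these operations, here is a minimal numpy sketch (not part of the original notes; the matrices are the ones from the examples above, and a1 = 1, a2 = 2 are illustrative values):

    import numpy as np

    A = np.array([[2, 7],
                  [9, 3]])
    print(4 * A)              # scalar multiplication: [[ 8 28] [36 12]]

    B = np.array([[4, 2],
                  [5, 8]])
    a = np.array([1, 2])      # illustrative values a1 = 1, a2 = 2
    print(B @ a)              # [4*1 + 2*2, 5*1 + 8*2] = [ 8 21]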

2.6 Regression examples

• It is easy to check that

\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
= \begin{bmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{bmatrix}

• Let

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad
X = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad
E = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}

• The regression model

Y1 = β0 + β1 X1 + ε1,
Y2 = β0 + β1 X2 + ε2,
...
Yn = β0 + β1 Xn + εn

can be written as

Y = Xβ + E

• Other calculations
X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
= \begin{bmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{bmatrix}

X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{bmatrix}

Y'Y = \begin{bmatrix} Y_1 & Y_2 & \cdots & Y_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
= \sum_{i=1}^{n} Y_i^2
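For concreteness, a minimal numpy sketch of these three products (not part of the original notes; the x and y values are arbitrary illustrative data):

    import numpy as np

    x = np.array([1., 2., 3., 4.])                 # illustrative X values
    y = np.array([2., 4., 5., 8.])                 # illustrative Y values
    X = np.column_stack([np.ones_like(x), x])      # design matrix with a column of 1s

    print(X.T @ X)    # [[n, sum Xi], [sum Xi, sum Xi^2]]
    print(X.T @ y)    # [sum Yi, sum Xi*Yi]
    print(y @ y)      # sum Yi^2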

2.7 Special types of matrices



• Symmetric matrix: A' = A, for example

A = \begin{bmatrix} 1 & 4 & 6 \\ 4 & 2 & 5 \\ 6 & 5 & 3 \end{bmatrix}

• Diagonal Matrix: a square matrix whose off-diagonal elements are all zeros
\begin{bmatrix} a_1 & 0 & 0 \\ 0 & a_2 & 0 \\ 0 & 0 & a_3 \end{bmatrix}

• Identity Matrix

I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

Fact: for any matrices A and B of appropriate dimensions, AI = A and IB = B.

• zero vector and unit vector

0 = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad
1 = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}

2.8 Inverse of a square matrix

• The inverse of a square matrix A is another square matrix, denoted by A^{-1}, such that AA^{-1} = A^{-1}A = I. For example, let

A = \begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix}

Since

\begin{bmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{bmatrix} \begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I

and

\begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I,

we have

A^{-1} = \begin{bmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{bmatrix}

2.9 Finding the Inverse of a matrix

• If

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

then

A^{-1} = \begin{bmatrix} d/D & -b/D \\ -c/D & a/D \end{bmatrix}

where D = ad − bc

• For a high-dimensional matrix, the inverse is not easy to calculate by hand
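A minimal numpy sketch (not part of the original notes) checking the 2 × 2 formula above against numpy's built-in inverse, using the matrix A from Section 2.8:

    import numpy as np

    A = np.array([[2., 4.],
                  [3., 1.]])
    a, b, c, d = A.ravel()
    D = a * d - b * c                            # determinant: 2*1 - 4*3 = -10
    A_inv = np.array([[ d, -b],
                      [-c,  a]]) / D             # [[-0.1, 0.4], [0.3, -0.2]]
    print(np.allclose(A_inv, np.linalg.inv(A)))  # True
    print(np.allclose(A @ A_inv, np.eye(2)))     # True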

2.10 Regression example (continued)

• The inverse of the matrix

X'X = \begin{bmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{bmatrix}

By the formula in Section 2.9, the determinant is

D = n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2
  = n \left[ \sum_{i=1}^{n} X_i^2 - \frac{(\sum_{i=1}^{n} X_i)^2}{n} \right]
  = n \sum_{i=1}^{n} (X_i - \bar{X})^2

So

(X'X)^{-1} = \begin{bmatrix}
\frac{\sum_{i=1}^{n} X_i^2}{n \sum_{i=1}^{n} (X_i - \bar{X})^2} & \frac{-\sum_{i=1}^{n} X_i}{n \sum_{i=1}^{n} (X_i - \bar{X})^2} \\
\frac{-\sum_{i=1}^{n} X_i}{n \sum_{i=1}^{n} (X_i - \bar{X})^2} & \frac{n}{n \sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{bmatrix}
= \begin{bmatrix}
\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} & \frac{-\bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \\
\frac{-\bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} & \frac{1}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{bmatrix}

2.11 Use of Inverse Matrix

• Suppose we want to solve two equations:

2y1 + 4y2 = 20

3y1 + y2 = 10

Rewrite the equations in matrix notation:


  
\begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 20 \\ 10 \end{bmatrix}

So the solution to the equations is

\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
= \begin{bmatrix} 2 & 4 \\ 3 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 20 \\ 10 \end{bmatrix}
= \begin{bmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{bmatrix} \begin{bmatrix} 20 \\ 10 \end{bmatrix}
= \begin{bmatrix} 2 \\ 4 \end{bmatrix}

• Estimating a regression model requires solving linear equations, and the inverse matrix is very useful for this.
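A minimal numpy sketch of this system (not part of the original notes); np.linalg.solve solves the equations directly, which is usually preferred to forming the inverse explicitly:

    import numpy as np

    A = np.array([[2., 4.],
                  [3., 1.]])
    rhs = np.array([20., 10.])
    y = np.linalg.solve(A, rhs)     # same answer as np.linalg.inv(A) @ rhs
    print(y)                        # [2. 4.], i.e. y1 = 2, y2 = 4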

2.12 Other basic facts for matrices

• A + B = B + A

• C(A + B) = CA + CB

• (A')' = A

• (AB)' = B'A'

• (A^{-1})^{-1} = A

• (AB)^{-1} = B^{-1}A^{-1}

• (A')^{-1} = (A^{-1})'
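These identities are easy to spot-check numerically; here is a minimal numpy sketch (not part of the original notes) using randomly generated matrices, which are invertible with probability one:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3))

    print(np.allclose((A @ B).T, B.T @ A.T))                    # (AB)' = B'A'
    print(np.allclose(np.linalg.inv(A @ B),
                      np.linalg.inv(B) @ np.linalg.inv(A)))     # (AB)^{-1} = B^{-1} A^{-1}
    print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))  # (A')^{-1} = (A^{-1})'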

3 Random vectors and matrices

• Random vector

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \end{bmatrix}

• Expectation of random vector
E{Y} = \begin{bmatrix} E{Y_1} \\ E{Y_2} \\ E{Y_3} \end{bmatrix}

• For two random vectors

Y = \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \end{bmatrix}, \qquad
Z = \begin{bmatrix} Z_1 \\ Z_2 \\ Z_3 \end{bmatrix}
Then
E(Y + Z) = EY + EZ

• Variance-covariance matrix of a random vector

Var{Y} = E{[Y - E{Y}][Y - E{Y}]'}
= \begin{bmatrix}
Var{Y_1} & Cov{Y_1, Y_2} & Cov{Y_1, Y_3} \\
Cov{Y_2, Y_1} & Var{Y_2} & Cov{Y_2, Y_3} \\
Cov{Y_3, Y_1} & Cov{Y_3, Y_2} & Var{Y_3}
\end{bmatrix}
• In the simple linear regression model the errors are uncorrelated, so Var{E} = σ²I.

[Proof: for example, consider n = 3.

Var{E} = \begin{bmatrix} σ² & 0 & 0 \\ 0 & σ² & 0 \\ 0 & 0 & σ² \end{bmatrix} ]

3.1 Some basic facts

• If a random vector W is equal to a random vector Y multiplied by a constant matrix A,

W = AY

then we have

E{W} = A E{Y}

Var{W} = Var{AY} = A Var{Y} A'

• If c is a constant vector, then

E(c + AY) = c + AEY

and

Var(c + AY) = Var(AY) = A Var{Y} A'

• In the simple linear regression model, it follows from the above that Var{Y} = σ²I
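As an informal check of these rules, a minimal Monte Carlo sketch in numpy (not part of the original notes; the mean vector, covariance matrix, A, and c are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(1)
    mu = np.array([1., 2.])                     # E{Y}, illustrative
    Sigma = np.array([[2.0, 0.5],
                      [0.5, 1.0]])              # Var{Y}, illustrative
    A = np.array([[1., -1.],
                  [1.,  1.]])
    c = np.array([3., 0.])

    Y = rng.multivariate_normal(mu, Sigma, size=200_000)   # each row is one draw of Y
    W = c + Y @ A.T                                        # W = c + AY for every draw

    print(W.mean(axis=0))           # close to c + A mu
    print(np.cov(W, rowvar=False))  # close to A Sigma A'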

3.2 An illustration

•
\begin{bmatrix} W_1 \\ W_2 \end{bmatrix}
= \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
= \begin{bmatrix} Y_1 - Y_2 \\ Y_1 + Y_2 \end{bmatrix}

E\begin{bmatrix} W_1 \\ W_2 \end{bmatrix}
= \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} E{Y_1} \\ E{Y_2} \end{bmatrix}
= \begin{bmatrix} E{Y_1} - E{Y_2} \\ E{Y_1} + E{Y_2} \end{bmatrix}

•
Var{ \begin{bmatrix} W_1 \\ W_2 \end{bmatrix} }
= \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} Var{Y_1} & Cov{Y_1, Y_2} \\ Cov{Y_2, Y_1} & Var{Y_2} \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}

4 Simple linear regression model (matrix version)

The model
Y1 = β0 + β1 X1 + ε1
Y2 = β0 + β1 X2 + ε2
...
Yn = β0 + β1 Xn + εn
with assumptions

1. E(εi) = 0,

2. Var(εi) = σ², Cov(εi, εj) = 0 for all 1 ≤ i ≠ j ≤ n,

3. εi ∼ N(0, σ²), i = 1, ..., n, are independent.

Recall, the model can be written as

Y = Xβ + E

Note that

E{E} = 0, \qquad Var{E} = \begin{bmatrix}
σ² & 0 & 0 & \cdots & 0 \\
0 & σ² & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & σ²
\end{bmatrix} = σ²I
The assumptions can be rewritten as

1. E(E) = 0,

2. Var(E) = σ²I,

3. E ∼ N(0, σ²I).

Thus E(Y) = Xβ and Var(Y) = σ²I. The model (with assumptions 1, 2, and 3) can also
be written as

Y ∼ N(Xβ, σ²I)

or
Y = Xβ + E, E ∼ N(0, σ²I)

4.1 Least squares estimators b0, b1

• The normal equations can be written as


\begin{bmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \end{bmatrix}
= \begin{bmatrix} n b_0 + b_1 \sum_{i=1}^{n} X_i \\ b_0 \sum_{i=1}^{n} X_i + b_1 \sum_{i=1}^{n} X_i^2 \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{bmatrix}

•
X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix}
= \begin{bmatrix} n & \sum_{i=1}^{n} X_i \\ \sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2 \end{bmatrix}

X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
= \begin{bmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{bmatrix}

• Let

b = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}

Then the normal equations are

X'Xb = X'Y

• We can find b by

b = (X'X)^{-1} X'Y

4.2 An example

•
Y = \begin{bmatrix} 16 \\ 5 \\ 10 \\ 15 \\ 13 \\ 22 \end{bmatrix}; \qquad
X = \begin{bmatrix} 1 & 4 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \\ 1 & 3 \\ 1 & 4 \end{bmatrix}

•
X'X = \begin{bmatrix} 6 & 17 \\ 17 & 55 \end{bmatrix}, \qquad
X'Y = \begin{bmatrix} 81 \\ 261 \end{bmatrix}

•
b = \begin{bmatrix} 6 & 17 \\ 17 & 55 \end{bmatrix}^{-1} \begin{bmatrix} 81 \\ 261 \end{bmatrix}
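Carrying out this calculation, for instance with numpy (a minimal sketch, not part of the original notes), gives b0 = 18/41 ≈ 0.44 and b1 = 189/41 ≈ 4.61:

    import numpy as np

    Y = np.array([16., 5., 10., 15., 13., 22.])
    X = np.column_stack([np.ones(6), [4., 1., 2., 3., 3., 4.]])

    XtX = X.T @ X                   # [[ 6, 17], [17, 55]]
    XtY = X.T @ Y                   # [ 81, 261]
    b = np.linalg.solve(XtX, XtY)   # solves X'X b = X'Y without forming the inverse
    print(b)                        # [0.439..., 4.609...], i.e. b0 = 18/41, b1 = 189/41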

4.3 Fitted values and residuals in matrix form

•
\hat{Y} = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix}
= \begin{bmatrix} b_0 + b_1 X_1 \\ b_0 + b_1 X_2 \\ \vdots \\ b_0 + b_1 X_n \end{bmatrix}
= Xb

•
e = Y - \hat{Y} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}

 
\hat{Y} = X(X'X)^{-1}X'Y

• Denote X(X'X)^{-1}X' by H; we have

\hat{Y} = HY, \qquad e = (I - H)Y
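A minimal numpy sketch of the hat matrix and residuals (not part of the original notes), reusing the data from the example in Section 4.2:

    import numpy as np

    Y = np.array([16., 5., 10., 15., 13., 22.])
    X = np.column_stack([np.ones(6), [4., 1., 2., 3., 3., 4.]])

    H = X @ np.linalg.solve(X.T @ X, X.T)   # H = X (X'X)^{-1} X'
    Y_hat = H @ Y                           # fitted values, identical to X @ b
    e = Y - Y_hat                           # residuals, identical to (I - H) @ Y
    print(Y_hat)
    print(e)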

4.4 Variance-covariance matrix for residuals e



• Var{e} = Var{(I − H)Y} = (I − H) Var{Y} (I − H)'

• Var{Y} = Var{E} = σ²I

• (I − H)' = I' − H' = I − H

• HH = X(X'X)^{-1}X'X(X'X)^{-1}X' = X(X'X)^{-1}X' = H

• (I − H)(I − H) = I − 2H + HH = I − H

• Var{e} = σ²(I − H), which is estimated by σ̂²(I − H)

4.5 Analysis of variance in matrix form
• Let

1 = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}

and

J = \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix} = 11'.

Then

SST = Y'Y - \frac{1}{n} Y'JY = Y'\left(I - \frac{1}{n}J\right)Y
n n
[Proof:

SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - \frac{(\sum_{i=1}^{n} Y_i)^2}{n}

Y'Y = \sum_{i=1}^{n} Y_i^2, \qquad 1'Y = Y'1 = \sum_{i=1}^{n} Y_i

\left(\sum_{i=1}^{n} Y_i\right)^2 = Y'11'Y = Y'JY ]

SSE = \sum_{i=1}^{n} e_i^2 = e'e = (Y - Xb)'(Y - Xb) = Y'(I - H)Y

[Proof:

SSE = (Y - Xb)'(Y - Xb) = Y'Y - 2b'X'Y + b'X'Xb
    = Y'Y - 2b'X'Y + b'X'X(X'X)^{-1}X'Y
    = Y'Y - 2b'X'Y + b'X'Y = Y'Y - b'X'Y
    = Y'(I - H)Y ]


SSR = SST - SSE = Y'\left(H - \frac{1}{n}J\right)Y
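A minimal numpy sketch of these quadratic forms (not part of the original notes), again reusing the Section 4.2 data; the quadratic-form and scalar definitions give the same numbers:

    import numpy as np

    Y = np.array([16., 5., 10., 15., 13., 22.])
    X = np.column_stack([np.ones(6), [4., 1., 2., 3., 3., 4.]])
    n = len(Y)

    J = np.ones((n, n))                       # J = 1 1'
    H = X @ np.linalg.solve(X.T @ X, X.T)     # hat matrix
    SST = Y @ (np.eye(n) - J / n) @ Y
    SSE = Y @ (np.eye(n) - H) @ Y
    SSR = SST - SSE                           # equals Y' (H - J/n) Y
    print(SST, SSE, SSR)
    print(np.isclose(SST, ((Y - Y.mean()) ** 2).sum()))   # True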

4.6 Variance-covariance matrix for b


 
b = (X'X)^{-1}X'Y

Var{b} = (X'X)^{-1}X' Var{Y} X(X'X)^{-1} = σ²(X'X)^{-1}
= σ² \begin{bmatrix}
\frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} & \frac{-\bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \\
\frac{-\bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} & \frac{1}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{bmatrix}

where σ² can be estimated by σ̂² = MSE
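A minimal numpy sketch of the estimated Var{b} (not part of the original notes), reusing the Section 4.2 data and estimating σ² by MSE = SSE/(n − 2):

    import numpy as np

    Y = np.array([16., 5., 10., 15., 13., 22.])
    X = np.column_stack([np.ones(6), [4., 1., 2., 3., 3., 4.]])
    n, p = X.shape                               # p = 2 coefficients

    H = X @ np.linalg.solve(X.T @ X, X.T)
    MSE = (Y @ (np.eye(n) - H) @ Y) / (n - p)    # estimate of sigma^2
    var_b = MSE * np.linalg.inv(X.T @ X)         # estimated Var{b}
    print(var_b)                                 # diagonal entries: Var{b0}, Var{b1}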

4.7 Variance for the predicted value



• \hat{Y} = b_0 + b_1 X = \begin{bmatrix} 1 & X \end{bmatrix} b

• Var{\hat{Y}} = \begin{bmatrix} 1 & X \end{bmatrix} Var{b} \begin{bmatrix} 1 \\ X \end{bmatrix}
= σ² \begin{bmatrix} 1 & X \end{bmatrix} (X'X)^{-1} \begin{bmatrix} 1 \\ X \end{bmatrix}