EE 615: Pattern Recognition & Machine Learning Fall 2016
Lecture 3 — August 11
Lecturer: Dinesh Garg Scribe: Harsha Vardhan Tetali
3.1 Properties of Least Square Regression Model (Cont.)
3.1.1 The sum of the residual errors over training set is zero
For the least square regression model, the following always holds true:

$$\sum_{i=1}^{n} e_i = 0 \qquad (3.1)$$

$$\Rightarrow \quad \sum_{i=1}^{n} (\hat{y}_i - y_i) = 0 \qquad (3.2)$$
where $\hat{y}_i$ is the predicted value and $y_i$ is the true value of the target variable. To prove this claim, consider the Residual Sum of Squares (RSS), also called the Sum of Squared Errors (SSE), parameterized by the weight vector $w$:

$$RSS(w) = SSE(w) = \sum_{i=1}^{n} (w^T x_i - y_i)^2 \qquad (3.3)$$
To minimize the function $RSS(w)$, we set the gradient vector equal to zero. That is,

$$\nabla RSS(w) = \begin{bmatrix} \dfrac{\partial RSS(w)}{\partial w_0} \\ \dfrac{\partial RSS(w)}{\partial w_1} \\ \vdots \\ \dfrac{\partial RSS(w)}{\partial w_d} \end{bmatrix} = 0 \qquad (3.4)$$
Let us consider the first component of the above gradient vector, evaluated at the optimum $w^*$:

$$\left. \frac{\partial RSS(w)}{\partial w_0} \right|_{w^*} = 0 \qquad (3.5)$$

$$\Rightarrow \quad \frac{\partial}{\partial w_0} \sum_{i=1}^{n} (w^T x_i - y_i)^2 = 0 \qquad (3.6)$$

$$\Rightarrow \quad \sum_{i=1}^{n} (w^T x_i - y_i) = 0 \qquad (3.7)$$

The last step follows because $x_{i0} = 1$ for every $i$, so the derivative of each squared term with respect to $w_0$ is simply $2(w^T x_i - y_i)$.
From the assumption made on the fitting curve, we have

$$\hat{y}_i = w_0 + w_1 x_{i1} + w_2 x_{i2} + \cdots + w_d x_{id} = w^T x_i \qquad (3.8)$$

Substituting (3.8) in (3.7), we get

$$\sum_{i=1}^{n} (\hat{y}_i - y_i) = 0 \qquad (3.9)$$

which proves the required claim. Q.E.D.
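The claim is easy to verify numerically. The sketch below (a hypothetical example with synthetic data; the variable names are our own) fits a least squares model whose design matrix includes the intercept column of ones, and checks that the residuals sum to zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3

# Design matrix with a leading column of ones (the intercept term w0)
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, d))])
y = rng.normal(size=n)

# Least squares solution w* minimizing ||Xw - y||^2
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residuals e_i = yhat_i - y_i sum to zero (up to floating point error)
residuals = X @ w_star - y
print(np.allclose(residuals.sum(), 0.0))  # True
```

Note that the property depends on the intercept column: it is the $\partial/\partial w_0$ equation that forces the residual sum to zero.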
3.1.2 Total amount of overestimation is equal to the total amount of underestimation
This fact follows from rewriting Equation (3.9) in the following equivalent form:

$$\sum_{i \,:\, \hat{y}_i \ge y_i} (\hat{y}_i - y_i) \;=\; \sum_{i \,:\, \hat{y}_i \le y_i} (y_i - \hat{y}_i) \qquad (3.10)$$
3.1.3 Vector ˆy is a projection of the vector y on to the column
space of X
While fitting the linear regression model, we ideally want a parameter vector $w$ for which

$$Xw = y \qquad (3.11)$$

For the above linear system to have an exact solution, $y$ must lie in the column space of the coefficient matrix $X$. When this system of equations is unsolvable (i.e., $y$ does not lie in the column space of $X$), we solve it approximately: we try to find a vector $\hat{y}$ lying in the column space of $X$ that is as close to $y$ as possible. The coefficient vector $w$ giving this vector $\hat{y}$ is then our desired solution. To find the optimal vector $\hat{y}$, we need to solve the following optimization problem:

$$\arg\min_{\hat{y} \in \mathrm{colspace}(X)} \; \| y - \hat{y} \|_2^2 \qquad (3.12)$$
Note that any vector $\hat{y}$ lying in the column space of $X$ must be of the following form:

$$\hat{y} = \sum_{j=0}^{d} \alpha_j X_{*j} \qquad (3.13)$$

for some $\alpha_j \in \mathbb{R}$, $j = 0, 1, \ldots, d$, where $X_{*j}$ denotes the $j$-th column of $X$. Substituting this form of the vector $\hat{y}$ into the previous optimization problem, we get the following equivalent problem:

$$\arg\min_{\alpha \in \mathbb{R}^{d+1}} \; \| y - X\alpha \|_2^2 \qquad (3.14)$$
One can easily verify that the optimal value of $\alpha$ is

$$\alpha^* = (X^T X)^{-1} X^T y \qquad (3.15)$$

which is the same as $w^*$. The matrix $P = X (X^T X)^{-1} X^T$ is known as the projection matrix: pre-multiplying any vector by $P$ yields the projection of that vector onto the column space of $X$. In particular, $\hat{y} = X\alpha^* = P y$.
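The projection-matrix interpretation can be sketched numerically as follows (a hypothetical example on synthetic data; forming $P$ explicitly is fine at this scale but would not be done in practice for large $n$):

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.hstack([np.ones((30, 1)), rng.normal(size=(30, 2))])
y = rng.normal(size=30)

# Projection matrix P = X (X^T X)^{-1} X^T
P = X @ np.linalg.inv(X.T @ X) @ X.T

# Fitted values from the least squares solution
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ w_star

print(np.allclose(P @ y, y_hat))  # True: P y equals the fitted vector yhat
print(np.allclose(P @ P, P))      # True: P is idempotent
print(np.allclose(P, P.T))        # True: P is symmetric
```

Idempotence and symmetry are exactly the algebraic properties of an orthogonal projection, and they reappear in the proof of Lemma 3.1 below.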
We can visualize this interpretation of the regression parameters $w^*$ using the figure below. Consider the matrix equation $Ax = b$, to be solved for the best possible solution, where $A$ is a matrix, $b$ is a given column vector, and $x$ is the column vector to be estimated. The mustard colored hyperplane passes through the origin (only three axes are drawn, since more than three cannot be depicted) and represents the column space of $A$. The orange vectors, which form the columns of $A$, span this entire hyperplane (only two are shown, for clarity). The problem $Ax = b$, with $b$ drawn as the green vector pointing out of the plane, can then be restated as finding the vector on the mustard colored plane nearest to the green vector. The natural geometric solution is to drop a perpendicular from the tip of the green vector onto the plane. The foot of this perpendicular is the optimal, i.e., the best possible, vector that can be estimated; in the figure above, this optimal vector is shown in black.
3.2 Mean and Variance of the Target and the Predicted Variables

In this section, we define the sample mean and the sample variance of the target variables as well as the predicted variables, and relate these variances to the earlier defined RSS function (evaluated at the optimal $w^*$).
3.2.1 Mean
Recall the following quantities defined earlier:

$$y = [y_1, y_2, \ldots, y_n]^T, \qquad \hat{y} = [\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n]^T$$

$$X = \begin{bmatrix} 1 & x_{11} & x_{12} & \ldots & x_{1d} \\ 1 & x_{21} & x_{22} & \ldots & x_{2d} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \ldots & x_{nd} \end{bmatrix} = \begin{bmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_n^T \end{bmatrix}$$
Using these quantities, we define the following:

$$\text{Sample mean of } y := \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \qquad (3.16)$$

$$\text{Sample mean of } \hat{y} := \bar{\hat{y}} = \frac{1}{n} \sum_{i=1}^{n} \hat{y}_i \qquad (3.17)$$

From the property of least square regression, we can say that

$$\bar{y} = \bar{\hat{y}} \qquad (3.19)$$

This implies that the mean target value is the same as the mean predicted value for the least square regression.
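Equation (3.19) follows directly from $\sum_i (\hat{y}_i - y_i) = 0$ after dividing by $n$, and can be confirmed on synthetic data (a hypothetical example; names and data are our own):

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.hstack([np.ones((25, 1)), rng.normal(size=(25, 2))])
y = rng.normal(size=25)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ w

# Mean of the targets equals the mean of the predictions
print(np.isclose(y.mean(), y_hat.mean()))  # True
```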
3.2.2 Variance

The variances of the target variable, the predicted variable, and the residual error are given as follows:

$$\mathrm{Var}(y) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2 \qquad (3.20)$$

$$\mathrm{Var}(\hat{y}) = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \qquad (3.21)$$

$$\mathrm{Var}(e) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \frac{1}{n} RSS(w^*) \qquad (3.22)$$

The second equality in (3.21) uses $\bar{\hat{y}} = \bar{y}$, and the last expression follows because the mean of the residual error vector $e$ is zero, as per the previous claim.
Let us write the above expressions for the variances in vector notation. For this, we define the vector $\bar{\mathbf{y}}$ as follows:

$$\bar{\mathbf{y}} = [\bar{y}, \bar{y}, \ldots, \bar{y}]^T \qquad (3.23)$$

In view of this vector, we can rewrite the expressions for the variances in the following manner:

$$\mathrm{Var}(y) = \frac{1}{n} (y - \bar{\mathbf{y}})^T (y - \bar{\mathbf{y}}) \qquad (3.24)$$

$$\mathrm{Var}(\hat{y}) = \frac{1}{n} (\hat{y} - \bar{\mathbf{y}})^T (\hat{y} - \bar{\mathbf{y}}) \qquad (3.25)$$

$$\mathrm{Var}(e) = \frac{1}{n} (y - \hat{y})^T (y - \hat{y}) \qquad (3.26)$$
Below is an important result expressing the relationship between these three variances.

Lemma 3.1.

$$\mathrm{Var}(y) = \mathrm{Var}(\hat{y}) + \mathrm{Var}(e) \qquad (3.27)$$

$$\text{total variance} = \text{explained variance} + \text{unexplained variance} \qquad (3.28)$$

Proof: Let us start with the following expression:

$$n\,\mathrm{Var}(y) = (y - \bar{\mathbf{y}})^T (y - \bar{\mathbf{y}}) \qquad (3.29)$$

$$= (y - \hat{y} + \hat{y} - \bar{\mathbf{y}})^T (y - \hat{y} + \hat{y} - \bar{\mathbf{y}}) \qquad (3.30)$$

$$= (y - \hat{y})^T (y - \hat{y}) + (\hat{y} - \bar{\mathbf{y}})^T (\hat{y} - \bar{\mathbf{y}}) + 2 (y - \hat{y})^T (\hat{y} - \bar{\mathbf{y}}) \qquad (3.31)$$

$$= n\,\mathrm{Var}(e) + n\,\mathrm{Var}(\hat{y}) + 2 (y - \hat{y})^T (\hat{y} - \bar{\mathbf{y}}) \qquad (3.32)$$

Now let us examine the cross term $(y - \hat{y})^T (\hat{y} - \bar{\mathbf{y}})$.
This expression can be written as

$$(y - \hat{y})^T (\hat{y} - \bar{\mathbf{y}}) = (y - \hat{y})^T \hat{y} - (y - \hat{y})^T \bar{\mathbf{y}} \qquad (3.33)$$

Let us analyze each of the terms on the RHS one by one.

Second term:

$$(y - \hat{y})^T \bar{\mathbf{y}} = (y - \hat{y})^T \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \bar{y} = [\, y_1 - \hat{y}_1, \; y_2 - \hat{y}_2, \; \cdots, \; y_n - \hat{y}_n \,] \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \bar{y} = \bar{y} \sum_{i=1}^{n} (y_i - \hat{y}_i) = 0$$

where the last equality follows from the property of least square regression discussed earlier.

Now, let us analyze the first term, $(y - \hat{y})^T \hat{y}$. Substituting $\hat{y} = X w^* = X (X^T X)^{-1} X^T y$, and using the fact that the projection matrix $P = X (X^T X)^{-1} X^T$ is symmetric and idempotent (so that $\hat{y}^T \hat{y} = y^T P^T P y = y^T P y$), both $y^T \hat{y}$ and $\hat{y}^T \hat{y}$ reduce to $y^T P y$. Hence

$$(y - \hat{y})^T \hat{y} = y^T X (X^T X)^{-1} X^T y - y^T X (X^T X)^{-1} X^T y = 0$$

This completes the proof of the desired lemma.

We call the residual sum of squares the unexplained variance because it comes from the measured error, and the error is assumed to arise from some unexplained process; hence the name.
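Lemma 3.1 can be confirmed numerically on synthetic data (a hypothetical example; names and data are our own), computing the three variances exactly as in (3.20)-(3.22):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 3))])
y = rng.normal(size=n)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ w

var_y = np.mean((y - y.mean()) ** 2)          # total variance
var_yhat = np.mean((y_hat - y.mean()) ** 2)   # explained variance
var_e = np.mean((y - y_hat) ** 2)             # unexplained variance, RSS/n

# Total variance = explained variance + unexplained variance
print(np.isclose(var_y, var_yhat + var_e))  # True
```

The decomposition holds here because the model includes an intercept; without the column of ones, the residuals need not sum to zero and the cross term in the proof no longer vanishes.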