05 Gradient Descent
Prof. Kailash Singh
Department of Chemical Engineering
MNIT Jaipur
Gradient Descent
• Gradient descent is an optimization algorithm used to minimize a cost function.
• What is a gradient?
– A gradient is simply a derivative: it measures how much the output of a function changes for a small variation in the input.
• Gradient descent is widely used in machine learning. It works by iteratively adjusting the weights (or parameters) of the model until the minimum of the cost function is reached.
Analogy: think of a hiker walking down into a valley. He goes down the slope, taking large steps where the slope is steep and small steps where it is gentle. He decides his next position based on his current position, and stops when he reaches the bottom of the valley, which is his goal.
Gradient Descent Algorithm
Minimize f(θ) with respect to the parameters θ.
The necessary condition for a minimum is ∂f/∂θ = 0.
Steps in the gradient descent method:
1. Choose an initial guess θ(0) and a learning rate α.
2. Update: θ(k+1) = θ(k) − α ∇f(θ(k))
3. Repeat until ||∇f(θ(k))|| falls below a tolerance.
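The steps above can be sketched in Python. This is an illustrative sketch: the test function f(θ) = (θ − 3)², its gradient 2(θ − 3), and all parameter values are assumptions, not from the slides.

```python
# Illustrative gradient descent on f(theta) = (theta - 3)**2,
# whose gradient is 2*(theta - 3). (Assumed example, not from the slides.)
def gradient_descent(grad, theta0, alpha=0.1, tol=1e-8, max_iter=1000):
    theta = theta0
    for _ in range(max_iter):
        g = grad(theta)
        if abs(g) < tol:           # stop when the gradient is ~0
            break
        theta = theta - alpha * g  # step in the direction opposite the gradient
    return theta

theta_min = gradient_descent(lambda t: 2*(t - 3), theta0=0.0)
```

The update moves θ against the gradient, so steps are automatically large where the slope is steep and small near the minimum, exactly as in the hiker analogy.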
Gradient Descent in Linear Regression
Let the linear model be
yp = b0 + b1 x
Let the cost function be
J(b0, b1) = (1/2n) Σ (yi − b0 − b1 xi)²
The gradient components are
∂J/∂b0 = b0 + b1 x̄ − ȳ
∂J/∂b1 = b0 x̄ + b1 (Σ xi²)/n − (Σ xi yi)/n
and the update rules are
b0 ← b0 − α ∂J/∂b0,  b1 ← b1 − α ∂J/∂b1
where α is the learning rate.
• If α is too high, the solution may diverge.
• If α is too low, the solution may converge slowly.
• Gradient descent can remain fast even when the data set is very large.
• In linear regression, computing the matrix inverse required by the closed-form solution may be computationally too expensive when n is very large.
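A small numerical illustration (not from the slides) of the learning-rate bullets above, minimizing f(θ) = θ², whose gradient is 2θ:

```python
# Effect of the learning rate alpha when minimizing f(theta) = theta**2.
# (Illustrative example; the function and values are assumptions.)
def run(alpha, theta=1.0, steps=50):
    for _ in range(steps):
        theta = theta - alpha * 2*theta  # theta <- (1 - 2*alpha) * theta
    return theta

small = run(alpha=0.01)  # low alpha: converges, but slowly
good  = run(alpha=0.3)   # moderate alpha: converges quickly
big   = run(alpha=1.5)   # |1 - 2*alpha| > 1: the iterates diverge
```

Each step multiplies θ by (1 − 2α), so the iteration converges only when |1 − 2α| < 1, i.e. 0 < α < 1 for this function.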
Alternatively, the parameters can be estimated optimally in closed form (normal equations):
θ = (Xᵀ X)⁻¹ Xᵀ y
where X is the design matrix (a column of ones and the input values) and y is the output vector.
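A minimal NumPy sketch of this closed-form estimate; the toy data here are illustrative assumptions, not from the slides.

```python
import numpy as np

# Normal-equation estimate theta = (X^T X)^(-1) X^T y for a straight line.
# Toy data (assumed): points on the line y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])
X = np.column_stack([np.ones_like(x), x])    # design matrix [1, x]
theta = np.linalg.inv(X.T @ X) @ (X.T @ y)   # theta = [b0, b1]
```

In practice `np.linalg.lstsq` is preferred over forming the inverse explicitly, which is exactly the expense the bullet above warns about for large n.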
Example 5.1
Fit the following data by using Gradient descent
method.
 x    y
 1    7
 3   16
 5   18
 7   30
 9   32
11   44
13   45
import numpy as np
import matplotlib.pyplot as plt

def rsq(yp, y):
    # Coefficient of determination: R2 = [cov(yp,y)]^2 / (var(yp)*var(y))
    ypavg = np.mean(yp)
    yavg = np.mean(y)
    s1 = np.power(sum((yp - ypavg)*(y - yavg)), 2)
    s2 = sum(np.power(yp - ypavg, 2))
    s3 = sum(np.power(y - yavg, 2))
    R2 = s1/s2/s3
    return R2

def GradientDescent(X, y, alpha, niter):
    b0, b1 = 1, 1          # assumed guess values
    Xbar = X.mean()
    ybar = y.mean()
    n = len(X)
    print("By Gradient Descent...")
    fList = []
    for i in range(niter):
        # Gradient of J = sum((y - b0 - b1*X)**2)/(2n)
        df0 = b0 + b1*Xbar - ybar
        df1 = b0*Xbar + b1*sum(X*X)/n - sum(X*y)/n
        df = np.array([[df0], [df1]])
        b0 = b0 - alpha*df0    # update parameters
        b1 = b1 - alpha*df1
        f = sum((y - b0 - b1*X)**2)/(2*n)
        fList.append(f)
        dfnorm = np.linalg.norm(df)
        print(f"f={f:.3f}, b0={b0:.3f}, b1={b1:.3f}, ||df||={dfnorm:.5f}")
    return b0, b1, f, fList
#Main Program
X = np.array([1, 3, 5, 7, 9, 11, 13])
y = np.array([7, 16, 18, 30, 32, 44, 45])
alpha = 0.02     # learning rate (assumed; not specified on the slide)
niter = 2000     # number of iterations (assumed; not specified on the slide)
b0, b1, f, fList = GradientDescent(X, y, alpha, niter)
yp = b0 + b1*X   # model predictions
print(f"R2 = {rsq(yp, y):.4f}")
plt.subplot(1, 2, 1)
plt.plot(range(niter), fList)
plt.xlabel('iter no.'); plt.ylabel('f')
plt.title('MSE/2')
plt.subplot(1, 2, 2)
plt.scatter(X, y)
plt.plot(X, yp)
plt.xlabel('X'); plt.ylabel('y,yp')
plt.title('Gradient Descent')
plt.show()
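As a cross-check (not part of the slides), the closed-form least-squares line for the Example 5.1 data can be computed with NumPy's `np.polyfit`; the gradient descent iterates should approach these values.

```python
import numpy as np

# Closed-form least-squares fit of the Example 5.1 data.
# (Cross-check added for illustration; not from the original slide.)
x = np.array([1, 3, 5, 7, 9, 11, 13], dtype=float)
y = np.array([7, 16, 18, 30, 32, 44, 45], dtype=float)
b1, b0 = np.polyfit(x, y, 1)   # returns [slope, intercept] for degree 1
print(f"b0={b0:.4f}, b1={b1:.4f}")
```

Comparing the printed gradient-descent parameters against this closed-form fit is a quick way to confirm that α and the iteration count were chosen well.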