Gradient Descent

Gradient descent is an algorithm used to minimize functions by iteratively moving in the direction of steepest descent, defined by the negative gradient. It works by starting with random coefficient values and updating them in the direction that reduces the error until a minimum is reached. The learning rate determines the size of the update at each step, with a smaller rate giving more precise convergence at the cost of more iterations.


Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent, as defined by the negative of the gradient. In machine learning, we use gradient descent to update the parameters of our model.
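As a minimal sketch of this idea (using a toy quadratic f(w) = w**2 chosen purely for illustration, not a function from this document), a gradient descent loop in Python looks like:

    def grad(w):
        # derivative of f(w) = w**2
        return 2 * w

    w = 5.0               # arbitrary starting value
    learning_rate = 0.1   # size of each step
    for _ in range(100):
        w = w - learning_rate * grad(w)   # step against the gradient
    print(w)  # approaches 0, the minimizer of f(w) = w**2

Each iteration moves w a small step in the direction that decreases f, which is exactly the behaviour described above.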

When there is more than one input, you can optimize the coefficient values by iteratively minimizing the error of the model on your training data. This is called Gradient Descent and works by starting with random values for each coefficient. The sum of squared errors is calculated across the pairs of input and output variables.

A learning rate is then used to scale the update made for each pair of input and output values: it is a scalar factor, and the coefficients are updated in the direction that minimizes the error. The process is repeated until a minimum sum of squared errors is achieved or no further improvement is possible.

When using this method, the learning rate alpha determines the size of the improvement step to take on each iteration of the procedure. In practice, Gradient Descent is useful when there is a large dataset, either in the number of rows or the number of columns.
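A sketch of this per-pair update for several input columns might look as follows (the names X, y and coefs are illustrative, not taken from the original; this is the stochastic, one-pair-at-a-time variant described above):

    def predict(row, coefs):
        # coefs[0] is the intercept; the rest pair with the inputs
        yhat = coefs[0]
        for i, x in enumerate(row):
            yhat += coefs[i + 1] * x
        return yhat

    def gradient_descent(X, y, learning_rate=0.01, epochs=100):
        coefs = [0.0] * (len(X[0]) + 1)   # could also start from random values
        for _ in range(epochs):
            for row, target in zip(X, y):          # one (input, output) pair at a time
                error = predict(row, coefs) - target
                coefs[0] -= learning_rate * error  # intercept update
                for i, x in enumerate(row):
                    coefs[i + 1] -= learning_rate * error * x
        return coefs

For squared error (error ** 2), the gradient with respect to each coefficient is proportional to error times the corresponding input, which is what each update above uses.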

In short, it is a minimization algorithm: it is meant for minimizing a given loss (cost) function. For a line y = mx + c, the Gradient Descent algorithm determines the values of m and c such that the line corresponding to those values is the best-fitting line, i.e. gives the minimum error.

First we need to calculate the loss function. The loss is defined as the error in our predicted value of y for the current m and c. Our aim is to minimize this error to obtain the most accurate values of m and c. To calculate the loss, the Mean Squared Error (MSE) function is used, as sketched after this list:

· The difference between the actual and predicted y value is found.

· The difference is squared.

· Then, the mean of these squared differences over every value of x is calculated.
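A minimal MSE helper in Python (the function name is illustrative):

    def mean_squared_error(y_true, y_pred):
        # mean of the squared differences between actual and predicted values
        n = len(y_true)
        return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n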

Now, we have to find the values of "m" and "c" that minimize this loss. To do so, the Gradient Descent Algorithm is used.

1. Initially, let m = 0 and c = 0, and choose a learning rate L, which controls how much the values of "m" and "c" change with each step. The smaller the L, the more precise (but slower) the updates; L = 0.001 is a reasonable choice for good accuracy.

2. Calculate the partial derivative of the loss function with respect to "m", and plug in the current values of x, y, m and c to get the derivative D_m:

    D_m = (-2/n) * Σ x_i * (y_i - ŷ_i),  where ŷ_i = m*x_i + c

3. Similarly, the partial derivative with respect to "c" is:

    D_c = (-2/n) * Σ (y_i - ŷ_i)

4. Now, update the current values of m and c:

    m = m - L * D_m
    c = c - L * D_c

5. We repeat this process until the loss function is very small, i.e. ideally 0% error (100% accuracy); the full procedure is sketched after this list.
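Putting steps 1 to 5 together, here is a minimal sketch in Python (the function name fit_line and the toy data are illustrative, not from the original):

    def fit_line(x, y, L=0.001, epochs=20000):
        # Step 1: initialise m and c
        m, c = 0.0, 0.0
        n = len(x)
        for _ in range(epochs):
            y_pred = [m * xi + c for xi in x]
            # Steps 2-3: partial derivatives of the MSE loss wrt m and c
            D_m = (-2 / n) * sum(xi * (yi - ypi) for xi, yi, ypi in zip(x, y, y_pred))
            D_c = (-2 / n) * sum(yi - ypi for yi, ypi in zip(y, y_pred))
            # Step 4: update m and c against the gradient
            m -= L * D_m
            c -= L * D_c
        return m, c

    # Step 5 in practice: run enough iterations that the loss stops improving
    m, c = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(m, c)  # approaches m = 2, c = 1 for this toy data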

When the learning rate is too high, each update can overshoot the minimum, so the process becomes haphazard: the loss may oscillate or even diverge. A smaller learning rate gives a more stable descent and a better result, at the cost of more iterations.
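For instance, re-running the fit_line sketch above with an overly large rate illustrates this (same toy data as before):

    fit_line([1, 2, 3, 4], [3, 5, 7, 9], L=0.2)    # overshoots: m and c blow up toward inf/nan
    fit_line([1, 2, 3, 4], [3, 5, 7, 9], L=0.001)  # descends smoothly to m ≈ 2, c ≈ 1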

After repeating this process many times, like a hiker taking small steps downhill, we arrive at the bottom of the valley, i.e. the minimum of the loss. With the optimum values of m and c, the model is ready.
