Gradient Descent - A Quick, Simple Introduction - Built in
Gradient Descent: An Introduction to 1 of Machine Learning’s Most
Popular Algorithms
Take a high-level look at gradient descent, one of ML's most
popular algorithms.
Niklas Donges
July 23, 2021
Updated: August 1, 2021
Gradient descent is by far the most popular optimization
strategy used in machine learning and deep learning at the
moment. It is used when training data models.
Table of Contents
Introduction
What is a gradient?
How gradient descent works
Learning rate
How to make sure it works properly
Types of gradient descent: batch, stochastic, mini-batch
What is a Gradient?
Imagine the image below illustrates our hill from a top-down view
and the red arrows are the steps of our climber. Think of a gradient in
this context as a vector that contains the direction of the steepest
step the blindfolded man can take and also how long that step should
be.
Note that the gradient ranging from X0 to X1 is much longer than the
one reaching from X3 to X4. This is because the steepness/slope of
the hill, which determines the length of the vector, is less at X3 than
at X0. This matches the hill example because the hill gets less steep
the higher it's climbed. A reduced gradient therefore goes along with
a reduced slope and a reduced step size for the hill climber.
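To make the shrinking gradient concrete, here is a minimal sketch. The hill shape h(x, y) = -(x**2 + y**2), the starting point, and the names `hill_gradient`, `climb` and `step_scale` are assumptions chosen for illustration; they are not taken from the image.

```python
import math

def hill_gradient(x, y):
    """Gradient of the hypothetical hill h(x, y) = -(x**2 + y**2).

    The vector points in the direction of steepest ascent, and its
    length grows with the steepness of the hill at (x, y).
    """
    return (-2.0 * x, -2.0 * y)

def climb(start, step_scale=0.25, n_steps=4):
    """Walk uphill and record the gradient length at every step."""
    x, y = start
    lengths = []
    for _ in range(n_steps):
        gx, gy = hill_gradient(x, y)
        lengths.append(math.hypot(gx, gy))  # length of the gradient vector
        x += step_scale * gx                # step in the gradient direction
        y += step_scale * gy
    return lengths

gradient_lengths = climb(start=(3.0, 4.0))
print(gradient_lengths)
```

On this particular hill every recorded gradient is shorter than the one before it, which mirrors the arrows getting shorter as the climber approaches the top.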
Imagine you have a machine learning problem and want to train your
algorithm with gradient descent to minimize your cost function J(w,
b) and reach its local minimum by tweaking its parameters (w and b).
In the image below, the horizontal axes represent the parameters
(w and b), while the cost function J(w, b) is represented on the vertical
axis. The cost function shown here is convex.
How big the steps are that gradient descent takes in the direction of
the local minimum is determined by the learning rate, which controls
how fast or slow we move towards the optimal weights.
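The idea of repeatedly stepping against the gradient, scaled by the learning rate, can be sketched as follows. This is only an illustration under assumptions: the cost J(w, b) is taken to be the mean squared error of a straight line w * x + b, and the data, the function names, and the default learning rate are all made up for the example.

```python
def cost(w, b, xs, ys):
    """Assumed cost J(w, b): mean squared error of the line w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, b, xs, ys):
    """Partial derivatives (dJ/dw, dJ/db) of the cost above."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def gradient_descent(xs, ys, learning_rate=0.05, n_steps=2000):
    """Start at w = b = 0 and repeatedly step against the gradient."""
    w, b = 0.0, 0.0
    for _ in range(n_steps):
        dw, db = gradient(w, b, xs, ys)
        w -= learning_rate * dw  # step size = learning rate * slope
        b -= learning_rate * db
    return w, b

# Toy data generated from y = 2x + 1, so the minimum is at w = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = gradient_descent(xs, ys)
print(w, b, cost(w, b, xs, ys))
```

With these settings the parameters end up very close to w = 2 and b = 1. Other cost functions only change the `cost` and `gradient` functions; the update loop stays the same.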
For gradient descent to reach the local minimum we must set the
learning rate to an appropriate value, which is neither too low nor too
high. This is important because if the steps it takes are too big, it may
not reach the local minimum because it bounces back and forth
across the convex cost function (see left image below). If we set the
learning rate to a very small value, gradient descent will eventually
reach the local minimum but that may take a while (see the right
image).
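Both failure modes are easy to reproduce on a toy problem. The sketch below assumes the one-parameter cost J(w) = w**2, whose minimum is at w = 0 and whose gradient is 2 * w. The three learning rates and the name `descend` are illustrative choices, not recommendations.

```python
def descend(learning_rate, n_steps=20, w=1.0):
    """Run gradient descent on J(w) = w**2 and return every visited w."""
    path = [w]
    for _ in range(n_steps):
        w -= learning_rate * 2.0 * w  # gradient of w**2 is 2 * w
        path.append(w)
    return path

too_high = descend(learning_rate=1.1)     # overshoots the minimum each step
too_low = descend(learning_rate=0.001)    # barely moves in 20 steps
reasonable = descend(learning_rate=0.1)   # approaches the minimum steadily

print(too_high[:4])      # sign flips every step while |w| grows
print(too_low[-1])       # still close to the starting value 1.0
print(reasonable[-1])    # close to the minimum at 0
```

With the large learning rate, w jumps from one side of the minimum to the other and moves further away each time. With the tiny one it is still near its starting point after 20 steps.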
So, the learning rate should be neither too high nor too low. You can
check whether your learning rate is doing well by plotting the cost
function against the number of iterations.
There are some algorithms that can automatically tell you if gradient
descent has converged, but you must define a threshold for the
convergence beforehand, which is also pretty hard to estimate. For
this reason, simple plots are the preferred convergence test.
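A threshold-based convergence test could look like the sketch below. It assumes the toy cost J(w) = (w - 3)**2 and stops once the cost decreases by less than `tolerance` in one step. The tolerance value and all names are hypothetical, and picking a good tolerance is exactly the hard part mentioned above.

```python
def run_until_converged(learning_rate=0.1, tolerance=1e-6, max_steps=10_000):
    """Gradient descent on J(w) = (w - 3)**2 with a simple stopping rule.

    Returns the final w and the cost after every step. The cost
    history is the curve one would plot to inspect convergence by eye.
    """
    w = 0.0
    history = [(w - 3.0) ** 2]
    for _ in range(max_steps):
        w -= learning_rate * 2.0 * (w - 3.0)  # gradient is 2 * (w - 3)
        history.append((w - 3.0) ** 2)
        # Declare convergence when one step improves the cost
        # by less than the chosen threshold.
        if history[-2] - history[-1] < tolerance:
            break
    return w, history

w_final, cost_history = run_until_converged()
print(len(cost_history) - 1, w_final)
```

The returned `cost_history` can be handed to any plotting tool. Plotting is left out here to keep the example dependency-free.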
If the plot shows the learning curve just going up and down, without
really reaching a lower point, try decreasing the learning rate. Also,
when starting out with gradient descent on a given problem, simply
try 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, etc., as the learning rates and
look at which one performs the best.
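Trying the suggested learning rates one after another can be sketched like this. The cost J(w) = (w - 3)**2, the fixed budget of 50 steps, and the name `final_cost` are assumptions for the example. On a real problem the comparison would use the actual training cost.

```python
def final_cost(learning_rate, n_steps=50):
    """Cost J(w) = (w - 3)**2 after n_steps of gradient descent from w = 0."""
    w = 0.0
    for _ in range(n_steps):
        w -= learning_rate * 2.0 * (w - 3.0)
    return (w - 3.0) ** 2

# The candidate values from the text.
candidates = [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1]
results = {lr: final_cost(lr) for lr in candidates}
best = min(results, key=results.get)

for lr, value in results.items():
    print(f"learning rate {lr}: final cost {value:.3g}")
print("best:", best)
```

On this toy cost the rate 0.3 ends with the lowest cost, while 1 keeps jumping between w = 0 and w = 6 and never improves. The ranking holds only for this function and will differ on other problems.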
There are three popular types of gradient descent that mainly differ
in the amount of data they use:
Common mini-batch sizes range between 50 and 256, but as with any
other machine learning technique, there is no clear rule because it
varies for different applications. Mini-batch gradient descent is the
go-to algorithm when training a neural network and it is the most
common type of gradient descent within deep learning.
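A minimal mini-batch loop might look like the sketch below. It assumes a straight-line model with a mean squared error cost and noise-free toy data. The function name, the seed, and the default settings are illustrative only, with the batch size of 50 taken from the lower end of the range above.

```python
import random

def minibatch_gradient_descent(xs, ys, batch_size=50, learning_rate=0.1,
                               n_epochs=300, seed=0):
    """Fit y = w * x + b, updating the parameters once per mini-batch."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(n_epochs):
        rng.shuffle(indices)  # visit the data in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradient of the mean squared error on this batch only.
            dw = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            db = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= learning_rate * dw
            b -= learning_rate * db
    return w, b

# Toy data from y = 2x + 1 with 200 examples, so each epoch has 4 batches.
xs = [i / 100 for i in range(200)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = minibatch_gradient_descent(xs, ys)
print(w, b)
```

In this sketch, setting `batch_size` to `len(xs)` uses all the data for every update (batch gradient descent), and setting it to 1 updates after every single example (stochastic gradient descent).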