Ann a Algorithms notes
Gradient Descent Method:
Gradient descent is an optimization algorithm used to find the values of the parameters (coefficients) of
a function (f) that minimize a cost function (cost).
Gradient descent is best used when the parameters cannot be calculated analytically (e.g. using linear
algebra) and must be searched for by an optimization algorithm.
Think of a large bowl like the one you would eat cereal out of or store fruit in. This bowl is a plot of
the cost function (f).
A random position on the surface of the bowl is the cost of the current values of the coefficients
(cost).
The bottom of the bowl is the cost of the best set of coefficients, the minimum of the function.
The goal is to continue to try different values for the coefficients, evaluate their cost, and select new
coefficients that have a slightly better (lower) cost.
Repeating this process enough times will lead to the bottom of the bowl, and you will know the
values of the coefficients that result in the minimum cost.
Gradient Descent Procedure:
The procedure starts off with initial values for the coefficient or coefficients of the function. These
could be 0.0 or a small random value:

coefficient = 0.0
The cost of the coefficients is evaluated by plugging them into the function and calculating the cost:

cost = f(coefficient)
The derivative of the cost is calculated. The derivative is a concept from calculus and refers to the
slope of the function at a given point. We need to know the slope so that we know the direction
(sign) in which to move the coefficient values in order to get a lower cost on the next iteration:

delta = derivative(cost)
Now that we know from the derivative which direction is downhill, we can update the
coefficient values. A learning rate parameter (alpha) must be specified that controls how much the
coefficients can change on each update:

coefficient = coefficient - (alpha * delta)
This process is repeated until the cost of the coefficients (cost) is 0.0 or close enough to zero to be
good enough.
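
Putting the whole procedure together, here is a minimal Python sketch of the loop. The cost function f(w) = (w - 3)^2 and all the numbers in it are illustrative assumptions, not part of the original notes:

# Minimal gradient descent sketch for a single coefficient.
# Assumed example cost: f(w) = (w - 3)^2, with derivative 2 * (w - 3).
def cost(w):
    return (w - 3.0) ** 2

def derivative(w):
    return 2.0 * (w - 3.0)

coefficient = 0.0  # initial value; a small random value also works
alpha = 0.1        # learning rate

for _ in range(100):
    delta = derivative(coefficient)            # slope at the current point
    coefficient = coefficient - alpha * delta  # step downhill
    if cost(coefficient) < 1e-9:               # close enough to zero
        break

print(coefficient)  # converges toward 3.0, the minimizer

Running this prints a value very close to 3.0 after a few dozen iterations.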
You can see how simple gradient descent is. It does require you to know the gradient of your cost
function or the function you are optimizing, but besides that, it’s very straightforward. Next we will
see how we can use this in machine learning algorithms.
In theory, this means that after applying enough iterations of the process to a data set, we arrive at
coefficient values near a minimum of the cost function, which can serve as a basis for further work. – my understanding
Back Propagation Method:
It is a common method of training artificial neural networks and is used in conjunction with an
optimization method such as gradient descent.
The algorithm repeats a two-phase cycle: propagation and weight update. When an input vector is
presented to the network, it is propagated forward through the network, layer by layer, until it
reaches the output layer.
The output of the network is then compared to the desired output, using a loss function, and an
error value is calculated for each of the neurons in the output layer. The error values are then
propagated backwards, starting from the output, until each neuron has an associated error value
which roughly represents its contribution to the original output.
Back propagation uses these error values to calculate the gradient of the loss function with respect
to the weights in the network. In the second phase, this gradient is fed to the optimization method,
which in turn uses it to update the weights, in an attempt to minimize the loss function.
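
To make the two phases concrete, here is a minimal NumPy sketch of one propagation/weight-update cycle for a tiny one-hidden-layer network with sigmoid activations and the quadratic loss. The network shape, input, and target are illustrative assumptions, not from the original notes:

import numpy as np

# Tiny illustrative network: 2 inputs -> 3 hidden units -> 1 output, all sigmoid.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros((3, 1))
W2, b2 = rng.normal(size=(1, 3)), np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[0.5], [0.1]])  # input vector
y = np.array([[1.0]])         # desired output
alpha = 0.5                   # learning rate

# Phase 1a: forward propagation, layer by layer, to the output layer.
a1 = sigmoid(W1 @ x + b1)
a2 = sigmoid(W2 @ a1 + b2)

# Phase 1b: compare the output to the desired output under the quadratic
# loss C = 0.5 * ||y - a2||^2, then propagate error values backwards.
delta2 = (a2 - y) * a2 * (1 - a2)         # error at the output layer
delta1 = (W2.T @ delta2) * a1 * (1 - a1)  # error at the hidden layer

# Phase 2: gradients of the loss w.r.t. the weights drive the update.
W2 -= alpha * (delta2 @ a1.T); b2 -= alpha * delta2
W1 -= alpha * (delta1 @ x.T);  b1 -= alpha * delta1

Repeating this cycle over many input/target pairs drives the loss down.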
The importance of this process is that, as the network is trained, the neurons in the intermediate
layers organize themselves in such a way that the different neurons learn to recognize different
characteristics of the total input space.
After training, when an arbitrary input pattern that contains noise or is incomplete is presented,
neurons in the hidden layer of the network will respond with an active output if the new input
contains a pattern that resembles a feature that the individual neurons have learned to recognize
during their training.
For back propagation to work we need to make two main assumptions about the form of the cost
function. Before stating those assumptions, though, it's useful to have an example cost function in
mind.
The quadratic cost has the form

C = \frac{1}{2n} \sum_x \| y(x) - a^L(x) \|^2

where: n is the total number of training examples; the sum is over individual training examples, x;
y = y(x) is the corresponding desired output; L denotes the number of layers in the network; and
a^L = a^L(x) is the vector of activations output from the network when x is input.
Okay, so what assumptions do we need to make about our cost function, C, in order that back
propagation can be applied? The first assumption we need is that the cost function can be written as
an average C = \frac{1}{n} \sum_x C_x over cost functions C_x for individual training examples, x. This is the case
for the quadratic cost function, where the cost for a single training example is C_x = \frac{1}{2} \| y - a^L \|^2.
The second assumption we make about the cost is that it can be written as a function of the outputs
from the neural network.
For example, the quadratic cost function satisfies this requirement, since the quadratic cost for a
single training example x may be written as

C = \frac{1}{2} \| y - a^L \|^2 = \frac{1}{2} \sum_j (y_j - a^L_j)^2

and thus is a function of the output activations.
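
As a quick numerical check of the first assumption, the sketch below (with made-up outputs a^L(x) and targets y(x) for n = 4 examples; the numbers are purely illustrative) shows that averaging the per-example costs C_x reproduces the total quadratic cost C:

import numpy as np

# Made-up network outputs and desired outputs for n = 4 training examples
# (one example per column).
aL = np.array([[0.2, 0.9, 0.4, 0.7]])
y  = np.array([[0.0, 1.0, 0.0, 1.0]])

Cx = 0.5 * np.sum((y - aL) ** 2, axis=0)  # per-example cost C_x = 0.5 * ||y - aL||^2
C = Cx.mean()                             # C = (1/n) * sum_x C_x = (1/(2n)) * sum_x ||y - aL||^2
print(Cx, C)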
Steepest Descent Method:
An algorithm for finding the nearest local minimum of a function which presupposes that the
gradient of the function can be computed. The method of steepest descent, also called the gradient
descent method, starts at a point P_0 and, as many times as needed, moves from P_i to P_{i+1} by
minimizing along the line extending from P_i in the direction of -\nabla f(P_i), the local downhill gradient.
When applied to a 1-dimensional function f(x), the method takes the form of iterating

x_{n+1} = x_n - \epsilon f'(x_n)

from a starting point x_0, for some small \epsilon > 0, until a fixed point is reached.
This method has the severe drawback of requiring a great many iterations for functions which have
long, narrow valley structures. In such cases, a conjugate gradient method is preferable.
To find a local minimum of a function using gradient descent, one takes steps proportional to the
negative of the gradient (or of the approximate gradient) of the function at the current point.
If instead one takes steps proportional to the positive of the gradient, one approaches a local
maximum of that function; the procedure is then known as gradient ascent.
There is a chronic problem with gradient descent. For functions that have valleys (in the case of
descent) or saddle points (in the case of ascent), the gradient descent/ascent algorithm zig-zags,
because the gradient is nearly orthogonal to the direction of the local minimum in these regions.
It is like being inside a round tube and trying to stay in the lower part of the tube. If we are not,
the gradient tells us to go almost perpendicular to the longitudinal direction of the tube. If
the local minimum is at the end of the tube, it will take a long time to reach it because we keep
jumping between the sides of the tube (zig-zagging). The Rosenbrock function is used to test this
difficult problem:
f(y, x) = (1 - y)^2 + 100 (x - y^2)^2
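
A short sketch of plain gradient descent on this function makes the zig-zag easy to observe; the step size, starting point, and iteration count below are illustrative choices:

# Gradient descent on f(y, x) = (1 - y)^2 + 100 * (x - y^2)^2.
# The minimum is at (y, x) = (1, 1); progress along the curved valley is slow.
def grad(y, x):
    dfdy = -2.0 * (1.0 - y) - 400.0 * y * (x - y ** 2)
    dfdx = 200.0 * (x - y ** 2)
    return dfdy, dfdx

y, x = -1.0, 1.0  # starting point (illustrative)
alpha = 1e-3      # small step; larger steps overshoot the valley walls

for _ in range(20000):
    gy, gx = grad(y, x)
    y, x = y - alpha * gy, x - alpha * gx

print(y, x)  # creeps toward (1, 1) only after many thousands of steps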