Artificial Intelligence
✓ Two types of learning:
➢ Supervised learning
➢ Unsupervised learning
Dendrites: Input
Cell body: Processor
Synapse: Link
Axon: Output
[Figure: model of a simple artificial neuron with inputs x1, x2, weights w1, w2 and output y]

y_in = x1*w1 + x2*w2
y = f(y_in)
3.7.2. ANN Working Methodology
✓ Now, let us have a look at the model of an artificial neuron.
Y = (x1*w1 + x2*w2 + x3*w3 + ... + xn*wn) + bn

Here bn represents the bias.
Activation Function: generally the Sigmoid

Sigmoid(y) = 1 / (1 + e^(-y))
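To make the weighted-sum-plus-bias model concrete, here is a minimal sketch of a single artificial neuron in Python (NumPy); the function and variable names are illustrative, not from any particular library.

```python
import numpy as np

def sigmoid(y):
    # Sigmoid activation: 1 / (1 + e^(-y))
    return 1.0 / (1.0 + np.exp(-y))

def neuron(x, w, b):
    # Weighted sum of inputs plus bias: Y = x1*w1 + ... + xn*wn + b
    y = np.dot(x, w) + b
    # Pass the weighted sum through the activation function
    return sigmoid(y)

# Example: three inputs, three weights, one bias
x = np.array([0.5, 1.0, -2.0])
w = np.array([0.4, 0.3, 0.1])
print(neuron(x, w, b=0.2))
```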
3.8. ANN Activation Functions
3.9. Most Popular Activation Functions
➢ Sigmoid
➢ ReLU: R(z) = max(0, z)
4.0. Generalized Activation Functions
[Figure: plots of three generalized activation functions]

➢ Linear: y = x
➢ Logistic: y = 1 / (1 + exp(-x))
➢ Hyperbolic tangent: y = (exp(x) - exp(-x)) / (exp(x) + exp(-x))
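A minimal sketch of these activation functions in Python (NumPy), including the ReLU listed above; the function names are illustrative.

```python
import numpy as np

def linear(x):
    # Linear (identity) activation: y = x
    return x

def logistic(x):
    # Logistic (sigmoid): y = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Hyperbolic tangent: (exp(x) - exp(-x)) / (exp(x) + exp(-x))
    return np.tanh(x)

def relu(x):
    # Rectified linear unit: R(z) = max(0, z)
    return np.maximum(0.0, x)

x = np.linspace(-5, 5, 5)
print(linear(x), logistic(x), tanh(x), relu(x), sep="\n")
```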
4.1. Single Layer Perceptron
References: https://machinelearningknowledge.ai/animated-explanation-of-feed-forward-neural-network-architecture/
4.2. ANN Multilayer Perceptron
4.3. Feed Forward Neural Network
References: https://machinelearningknowledge.ai/animated-explanation-of-feed-forward-neural-network-architecture/
4.4. Backpropagation in ANN
References: https://machinelearningknowledge.ai/animated-explanation-of-feed-forward-neural-network-architecture/
4.5. Loss Function
✓ The loss function is one of the most important components of a Neural Network. The loss is simply the prediction error of the Neural Net, and the method used to calculate this error is called the loss function.
✓ In simple words, the loss is used to calculate the gradients, and the gradients are used to update the weights of the Neural Net. This is how a Neural Net is trained.
4.6. Understanding Backpropagation
Suppose this is a neural network. Let y be the actual output, ŷ the predicted output, and n the total number of neurons (nodes); then the LOSS FUNCTION is defined as:

LOSS(y, ŷ) = (1/(2n)) · Σ_{i=1..n} (y_i − ŷ_i)²
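As a quick illustration of this loss, here is a minimal NumPy sketch of the mean squared error with the 1/(2n) factor used above; the function name is illustrative.

```python
import numpy as np

def mse_loss(y, y_hat):
    # LOSS(y, y_hat) = (1/(2n)) * sum_i (y_i - y_hat_i)^2
    n = len(y)
    return np.sum((y - y_hat) ** 2) / (2 * n)

# Example: actual vs. predicted outputs of a small network
y     = np.array([1.0, 0.0, 1.0])
y_hat = np.array([0.8, 0.2, 0.6])
print(mse_loss(y, y_hat))  # 0.04
```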
w5_new = w5_old − η ∂L/∂w5_old

w1_new = w1_old − η ∂L/∂w1_old

Here we expand ∂L/∂w1_old as follows:

∂L/∂w1_old = (∂L/∂w5) × (∂w5/∂out1) × (∂out1/∂w1_old)
✓ Updation of w6_new:

w6_new = w6_old − η ∂L/∂w6_old
4.7. Mathematics behind Backpropagation
✓ Updation of w2_new:

w2_new = w2_old − η ∂L/∂w2_old

Chain rule for ∂L/∂w2_old is as follows:

∂L/∂w2_old = (∂L/∂w5) × (∂w5/∂out1) × (∂out1/∂w2_old)
✓ Updation of w3_new:

w3_new = w3_old − η ∂L/∂w3_old

Chain rule for ∂L/∂w3_old is as follows:

∂L/∂w3_old = (∂L/∂w6) × (∂w6/∂out2) × (∂out2/∂w3_old)
✓ Updation of w4_new:

w4_new = w4_old − η ∂L/∂w4_old

Chain rule for ∂L/∂w4_old is as follows:

∂L/∂w4_old = (∂L/∂w6) × (∂w6/∂out2) × (∂out2/∂w4_old)
4.8. Various Optimizers in ANN
✓Gradient Descent.
✓Stochastic Gradient Descent (SGD)
✓Mini Batch Stochastic Gradient Descent (MB-SGD)
✓SGD with momentum.
✓Nesterov Accelerated Gradient (NAG)
✓Adaptive Gradient (AdaGrad)
✓AdaDelta.
✓RMSprop.
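To show how these optimizers refine plain gradient descent, here is a minimal sketch of the SGD-with-momentum update rule (v ← βv + ∇L, w ← w − ηv); the function and parameter names are illustrative, not from a specific library.

```python
import numpy as np

def sgd_momentum_step(w, grad, v, eta=0.01, beta=0.9):
    # The velocity accumulates an exponentially decaying sum of past gradients,
    # which smooths updates and speeds progress along consistent directions.
    v = beta * v + grad
    w = w - eta * v
    return w, v

# Example: a few steps on f(w) = w1^2 + w2^2, whose gradient is 2w
w = np.array([3.0, -2.0])
v = np.zeros_like(w)
for _ in range(50):
    w, v = sgd_momentum_step(w, grad=2 * w, v=v)
print(w)  # moves towards the minimum at [0, 0]
```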
4.9. Most Popular Optimizer: Gradient Descent
Suppose we have a scalar function f(w): ℝ^m → ℝ. Its gradient is

∇f(w) = [ ∂f(w)/∂w1 , ∂f(w)/∂w2 , … , ∂f(w)/∂wm ]

∇f(w) points in the direction of steepest ascent, and |∇f(w)| is the magnitude of the gradient (the slope) in that direction.
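Gradient descent minimizes f by repeatedly stepping against this gradient: w ← w − η∇f(w). A minimal sketch, assuming the simple quadratic f(w) = Σ wi² whose gradient is 2w:

```python
import numpy as np

def gradient_descent(grad_f, w0, eta=0.1, steps=100):
    # Repeatedly step against the gradient: w <- w - eta * grad_f(w)
    w = w0
    for _ in range(steps):
        w = w - eta * grad_f(w)
    return w

# f(w) = w1^2 + ... + wm^2 has gradient 2w and its minimum at the origin
w_min = gradient_descent(grad_f=lambda w: 2 * w, w0=np.array([4.0, -3.0]))
print(w_min)  # close to [0, 0]
```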
✓We should accept that there is a tendency to approach all important innovations as a
Rorschach test upon which we impose anxieties and hopes about what constitutes a good or
happy world. But the potential of AI and machine intelligence for good does not lie
exclusively, or even primarily, within its technologies. It lies mainly in its users. If we trust
(in the main) how our societies are currently being run, then we have no reason not to trust
ourselves to do good with these technologies. And if we can suspend presentism and accept
that ancient stories warning us not to play God with powerful technologies are instructive,
then we will likely free ourselves from unnecessary anxiety about their use.
END