
1. Artificial neural networks are machine learning models inspired by biological neural networks. They consist of interconnected nodes called artificial neurons that process input data through a series of weighted connections.
2. An artificial neural network contains an input layer, hidden layers, and an output layer. Data is fed through the input layer, processed through the hidden layers using weighted connections between nodes, and produces an output in the output layer.
3. Neural networks are trained using an optimization algorithm called gradient descent, which minimizes a loss function by adjusting the weights of the connections between nodes. The loss gradient is computed through backpropagation to update these weights during training.

Uploaded by Gilmer Abad Vera

Artificial Neural Networks

Lecturer: Javier Machacuay


Mechanical-Electrical Engineer (UDEP)
Graduate of the MicroMasters Program in Statistics and Data Science (MIT)
A process or system f maps inputs x to outputs y:

x → Process / System (f) → y,  i.e.  y = f(x)
Examples (Tabular data: Classification)

Examples (Tabular data: Regression)

Examples (Computer vision: Image Classification) — e.g., input image → output label “Cat”

Examples (Computer vision: Object detection)

Examples (Natural Language Processing) — input text → output text
When the process/system f that maps inputs x to outputs y is unknown, a machine learning model takes its place and learns the mapping from data. An artificial neural network is one such model: it produces an estimate ŷ = f̂(x).

x → Artificial Neural Network (f̂) → ŷ
What is an artificial neural network?
Artificial Neuron

z1 = W11 x1 + W12 x2 + ⋯ + W1d xd + b1
a1 = f(z1) = f(W11 x1 + W12 x2 + ⋯ + W1d xd + b1)

Each connection has a weight Wij, and each neuron a bias bi.
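The single-neuron computation above can be sketched in NumPy. This is a minimal illustration; the input values, weight values, and the choice of tanh as the activation f are hypothetical:

```python
import numpy as np

def neuron(x, w, b, f=np.tanh):
    """One artificial neuron: a1 = f(z1) = f(w . x + b)."""
    z = np.dot(w, x) + b          # weighted sum of inputs plus bias
    return f(z)                   # activation function f

# Hypothetical example values
x = np.array([0.5, -1.0, 2.0])    # inputs x1..xd
w = np.array([0.1, 0.4, -0.2])    # weights W11..W1d
a1 = neuron(x, w, b=0.3)          # a single scalar activation
```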
Artificial Neurons

A layer of h neurons applies the same computation in parallel, each with its own row of weights and its own bias:

a1 = f(z1) = f(W11 x1 + W12 x2 + ⋯ + W1d xd + b1)
a2 = f(z2) = f(W21 x1 + W22 x2 + ⋯ + W2d xd + b2)
a3 = f(z3) = f(W31 x1 + W32 x2 + ⋯ + W3d xd + b3)
⋮
ah = f(zh) = f(Wh1 x1 + Wh2 x2 + ⋯ + Whd xd + bh)
Artificial Neural Network (ANN)

Layers of neurons are stacked: the inputs x1, …, xd form layer l = 1, and each subsequent layer l = 2, 3, …, L computes its pre-activations z and activations a from the activations of the previous layer.
Artificial Neural Network (ANN): Forward Equations

a_i^(l) = f(z_i^(l)) = f( Σ_{j=1}^{n_{l−1}} W_ij^(l) a_j^(l−1) + b_i^(l) )

Where:  a_j^(1) = x_j
Artificial Neural Network (ANN): Forward Equations (vectorized)

Collecting the weights of layer l into a matrix W^(l) and the biases into a vector b^(l), with f̃ applying f element-wise:

a^(l) = f̃(z^(l)) = f̃(W^(l) a^(l−1) + b^(l))
ANN: Forward equations (Computational Graph)

a^(l) = f̃(z^(l)) = f̃(W^(l) a^(l−1) + b^(l))

As a computational graph: W^(l) and a^(l−1) enter a matrix product, the bias b^(l) is added to give z^(l), and the element-wise nonlinear function f̃ produces a^(l).
ANN: Forward Propagation

a^(1) → a^(2) → ⋯ → a^(L)

Where:  a^(1) = x
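Forward propagation can be sketched as a loop over layers. A minimal NumPy sketch; the 3-4-2 layer sizes, the random weights, and the tanh activation are assumptions chosen only for illustration:

```python
import numpy as np

def forward(x, params, f=np.tanh):
    """a(1) = x; a(l) = f(W(l) a(l-1) + b(l)) for l = 2..L."""
    a = x
    for W, b in params:           # one (W, b) pair per layer l = 2..L
        a = f(W @ a + b)          # vectorized forward equation
    return a

# Hypothetical 3-4-2 network with random weights and zero biases
rng = np.random.default_rng(0)
params = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(2, 4)), np.zeros(2))]
y_hat = forward(np.array([1.0, 0.5, -0.5]), params)
```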
ANN: Learnable parameters

a^(l) = f̃(z^(l)) = f̃(W^(l) a^(l−1) + b^(l))

The learnable parameters are the weight matrices W^(l) and the bias vectors b^(l). How to define them?
Machine Learning (Supervised Learning) Framework

Set up the optimization problem:

min_{W,b} L(data, W, b)
Standard optimization algorithm in Machine Learning

θ^(k+1) = θ^(k) − α ∂L/∂θ
Optimization algorithm: Gradient descent

θ^(k+1) = θ^(k) − α ∂L/∂θ

Remarks:
• Local optimizer.
• Only requires derivative computation.
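As a concrete illustration of the update rule, gradient descent applied to the hypothetical one-dimensional loss L(θ) = (θ − 3)², whose minimum lies at θ = 3:

```python
def gradient_descent(grad, theta0, alpha=0.1, steps=100):
    """theta(k+1) = theta(k) - alpha * dL/dtheta, repeated for `steps` iterations."""
    theta = theta0
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta

# L(theta) = (theta - 3)^2, so dL/dtheta = 2 * (theta - 3)
theta_star = gradient_descent(lambda t: 2 * (t - 3), theta0=0.0)
```

Each step shrinks the distance to the minimum by a constant factor (1 − 2α), so the iterate converges to θ = 3.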
The Loss Function L

min_{W,b} L(data, W, b)

Regression (y^(i) ∈ ℝ^d, ∀i):
L = (1/2n) Σ_{i=1}^{n} ‖a^(L)(i) − y^(i)‖²

Classification (y^(i) ∈ Y, ∀i):
L = −(1/n) Σ_{i=1}^{n} y^(i) · ln a^(L)(i)
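Both loss functions translate directly into NumPy. A minimal sketch; the assumed conventions are arrays of shape (n, d) with one example per row, and one-hot rows y for classification:

```python
import numpy as np

def mse_loss(a, y):
    """Regression: L = (1/2n) * sum_i ||a(i) - y(i)||^2."""
    n = a.shape[0]                        # number of examples
    return np.sum((a - y) ** 2) / (2 * n)

def cross_entropy_loss(a, y):
    """Classification: L = -(1/n) * sum_i y(i) . ln a(i), y one-hot."""
    n = a.shape[0]
    return -np.sum(y * np.log(a)) / n

# Hypothetical outputs and targets for a single example (n = 1)
reg = mse_loss(np.array([[1.0, 2.0]]), np.array([[0.0, 0.0]]))
clf = cross_entropy_loss(np.array([[0.5, 0.5]]), np.array([[1.0, 0.0]]))
```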
ANN: Derivatives (Backward) computation

The computational graph is traversed in reverse: starting from ∂L/∂a^(l), compute ∂L/∂z^(l), and from it ∂L/∂W^(l), ∂L/∂b^(l), and ∂L/∂a^(l−1).
ANN: Derivatives (Backward) computation
(Mathematical proofs skipped)

∂L/∂z^(l) = ∂L/∂a^(l) ∗ ∂a^(l)/∂z^(l)   (element-wise product)

∂L/∂b^(l) = ∂L/∂z^(l)

∂L/∂W^(l) = ∂L/∂z^(l) × (a^(l−1))ᵀ

∂L/∂a^(l−1) = (W^(l))ᵀ × ∂L/∂z^(l)
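The four backward equations map one-to-one onto code. A minimal sketch for a single layer, using column vectors and assuming tanh as the activation, so that ∂a/∂z = 1 − tanh²(z):

```python
import numpy as np

def layer_backward(dL_da, z, a_prev, W):
    """Backward equations for one layer (tanh activation assumed)."""
    da_dz = 1.0 - np.tanh(z) ** 2      # derivative of tanh at z
    dL_dz = dL_da * da_dz              # dL/dz(l) = dL/da(l) * da(l)/dz(l)
    dL_db = dL_dz                      # dL/db(l) = dL/dz(l)
    dL_dW = dL_dz @ a_prev.T           # dL/dW(l) = dL/dz(l) x a(l-1)^T
    dL_da_prev = W.T @ dL_dz           # dL/da(l-1) = W(l)^T x dL/dz(l)
    return dL_dW, dL_db, dL_da_prev

# Hypothetical layer: 4 neurons fed by 3 activations (column vectors)
rng = np.random.default_rng(1)
W = rng.normal(size=(4, 3))
a_prev = rng.normal(size=(3, 1))
z = W @ a_prev
dW, db, da_prev = layer_backward(rng.normal(size=(4, 1)), z, a_prev, W)
```

Note how the gradient shapes mirror the parameters: dW matches W, db matches b, and da_prev matches a^(l−1), which is what lets the next layer down reuse the same function.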
ANN: Backward propagation (backpropagation)

The gradients ∂L/∂a^(L), …, ∂L/∂a^(2), ∂L/∂a^(1) are computed layer by layer, from the output layer back toward the input layer.
ANN: Forward & Backward propagation

Forward pass:  a^(1) → a^(2) → ⋯ → a^(L)
Backward pass: ∂L/∂a^(L) → ⋯ → ∂L/∂a^(2) → ∂L/∂a^(1)
Solving the optimization problem for ANNs

min_{W,b} L(data, W, b)

W^(l) ← W^(l) − α ∂L/∂W^(l)
b^(l) ← b^(l) − α ∂L/∂b^(l)
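The layer-wise updates above can be sketched as a single step over all parameters. A minimal sketch; `params` and `grads` are assumed to be parallel lists of (W, b) and (∂L/∂W, ∂L/∂b) pairs, one per layer:

```python
def sgd_update(params, grads, alpha=0.01):
    """W(l) <- W(l) - alpha * dL/dW(l);  b(l) <- b(l) - alpha * dL/db(l)."""
    return [(W - alpha * dW, b - alpha * db)
            for (W, b), (dW, db) in zip(params, grads)]

# Hypothetical scalar "layer" just to show the arithmetic
params = [(1.0, 0.5)]
grads = [(10.0, 2.0)]
new_params = sgd_update(params, grads, alpha=0.1)   # [(0.0, 0.3)]
```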
Worked example (Handwritten digit recognition)

An image of a handwritten digit → Artificial Neural Network → “It’s a 5.”
