ANN-Unit 7 - Parameter Tuning & Normalization
Applied Neural Networks
Unit 7
Lecture Outline
▪ Deep Neural Networks
▪ Hyperparameter Tuning
▪ Batch Normalization
▪ Mini-Batches
▪ Regularization
▪ Softmax
▪ Orthogonalization
Hyper-parameter Tuning
▪ α (learning rate)
▪ β (momentum)
▪ β1, β2, ε (Adam optimizer)
▪ # hidden layers
▪ # hidden units
▪ Learning-rate decay
▪ Mini-batch size
▪ Activation functions
▪ …
Hyper-parameter Tuning
▪ Don’t use a grid of values
▪ Explore values at random
▪ Search coarse to fine: zoom in on the region that performs best
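The random, coarse-to-fine search above can be sketched in Python. The sampling ranges and the log-scale sampling for α and β are illustrative assumptions, not values fixed by the lecture:

```python
import math
import random

def sample_learning_rate(lo_exp=-4, hi_exp=0):
    """Sample alpha log-uniformly: 10**r with r drawn from [lo_exp, hi_exp].

    A log scale spends equal search effort on each decade
    (0.0001-0.001, 0.001-0.01, ...) instead of concentrating most
    draws near the top of the range.
    """
    r = random.uniform(lo_exp, hi_exp)
    return 10 ** r

def sample_momentum(lo=0.9, hi=0.999):
    """Sample beta so that 1 - beta is log-uniform, since beta's
    effect changes fastest as it approaches 1."""
    r = random.uniform(math.log10(1 - hi), math.log10(1 - lo))
    return 1 - 10 ** r

# Coarse pass: draw random configurations over the full ranges.
coarse = [(sample_learning_rate(), sample_momentum()) for _ in range(20)]

# Fine pass (sketch): after evaluating the coarse configurations,
# re-sample in a narrower range around the best one, e.g.
# sample_learning_rate(-3, -2) if alpha near 1e-2.5 worked best.
```

Because the draws are random rather than on a grid, every trial tests a distinct value of each hyperparameter, which matters when one of them dominates.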
Batch Normalization
▪ As discussed before, normalizing the input data is very
important for a machine learning model.
▪ The batch normalization layer normalizes the data before the
activation layer; this is useful because it makes training
faster and more stable.
▪ Using batch normalization speeds up model training, decreases
the sensitivity to the initial weights, and slightly
regularizes the model.
z_norm^(i) = (z^(i) − μ_B) / √(σ_B² + ε)

z̃^(i) = γ · z_norm^(i) + β

where γ and β are learnable parameters.
Dr. Muhammad Usman Arif; Applied Neural Networks 1/1/2024 8
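The two formulas above can be sketched for a single unit's pre-activations across a mini-batch; the function name and the default ε are illustrative assumptions:

```python
import math

def batch_norm_forward(z, gamma, beta, eps=1e-8):
    """Batch-normalize a list of pre-activations z for one unit.

    z_norm^(i) = (z^(i) - mu_B) / sqrt(var_B + eps), then the
    learnable gamma and beta rescale and shift the result.
    """
    m = len(z)
    mu = sum(z) / m                              # mini-batch mean
    var = sum((zi - mu) ** 2 for zi in z) / m    # mini-batch variance
    z_norm = [(zi - mu) / math.sqrt(var + eps) for zi in z]
    return [gamma * zn + beta for zn in z_norm]

# With gamma=1 and beta=0 the output has (approximately) zero mean
# and unit variance across the batch; other gamma, beta let the
# network learn whatever mean and variance suit the next layer.
out = batch_norm_forward([1.0, 2.0, 3.0], gamma=1.0, beta=0.0)
```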
Covariate Shift
▪ A model learns a mapping X → Y. If the distribution of X
changes, the learned mapping may no longer hold and the model
may need to be retrained, even when the true X → Y relationship
is unchanged.
[Figure: hidden-unit activations a₂^[2], a₃^[2], a₄^[2] of a hidden layer, whose distribution shifts as the earlier layers' weights change during training]
Softmax Regression
Softmax Layer
z^[l] = W^[l] a^[l−1] + b^[l]

Activation function:

t = e^(z^[l])   (element-wise)

a^[l] = t / Σ_{j=1}^{4} t_j ,  i.e. each component  a_i^[l] = t_i / Σ_{j=1}^{4} t_j

so a^[l] has the same shape (4, 1) as z^[l].
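A minimal softmax in Python, with the standard max-subtraction trick for numerical stability (the trick is an implementation detail, not shown on the slide):

```python
import math

def softmax(z):
    """a_i = exp(z_i) / sum_j exp(z_j), computed stably.

    Subtracting max(z) before exponentiating leaves the result
    unchanged (the factor exp(-max) cancels in the ratio) but
    prevents overflow for large z_i.
    """
    zmax = max(z)
    t = [math.exp(zi - zmax) for zi in z]
    total = sum(t)
    return [ti / total for ti in t]

# Illustrative input (not necessarily the slide's example):
a = softmax([5.0, 2.0, -1.0, 3.0])
# a sums to 1, and the largest z gets the largest probability.
```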
Softmax Examples
Softmax Classifier
▪ Softmax outputs a vector of probabilities; a "hardmax" would
instead map the largest entry of z to 1 and all others to 0,
e.g. [1, 0, 0, 0]ᵀ. Softmax is the "soft" version of this
mapping.
Loss Function
y = [0, 1, 0, 0]ᵀ    ŷ = [0.3, 0.2, 0.1, 0.4]ᵀ

ℒ(ŷ, y) = − Σ_{j=1}^{4} y_j log ŷ_j

J(w, b, …) = (1/m) Σ_{i=1}^{m} ℒ(ŷ^(i), y^(i))
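For the one-hot example above, only the true-class term of the sum survives, so the loss reduces to −log ŷ₂. A sketch (function name assumed):

```python
import math

def cross_entropy(y, y_hat):
    """L(y_hat, y) = -sum_j y_j * log(y_hat_j).

    With a one-hot y, only the term for the true class survives.
    """
    return -sum(yj * math.log(pj) for yj, pj in zip(y, y_hat) if yj > 0)

# The slide's example: the true class is the second one,
# predicted with probability 0.2.
loss = cross_entropy([0, 1, 0, 0], [0.3, 0.2, 0.1, 0.4])
# loss = -log(0.2) ≈ 1.609
```

Minimizing this loss pushes ŷ for the true class toward 1, which is why it pairs naturally with the softmax output layer.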
Orthogonalization
▪ Orthogonalization (or orthogonality) is a system-design property
ensuring that modifying one instruction or component of an
algorithm does not create or propagate side effects to other
components of the system. This makes it easier to verify the
components independently of one another, and it reduces testing
and development time.