Deep Learning
Ones Sidhom
2024-2025
Email: [email protected]
What is Deep Learning?
Deep Learning is a sub-field of Artificial Intelligence (AI), and more specifically of Machine Learning, that relies on deep neural networks to model and learn complex representations from large quantities of data.
Machine Vs Deep learning
In traditional machine learning, features are usually extracted manually by experts (hand-crafted feature extraction), which can be a laborious process requiring in-depth domain knowledge.
Machine learning algorithms are then used to classify the data according to these features.
Deep learning uses deep neural networks that automatically learn to extract relevant features directly from raw data.
This avoids the manual feature-engineering process and can lead to better performance, especially on complex tasks.
Deep learning
Deep learning generally requires large data sets for several reasons:
• Automatic feature learning: Deep neural networks automatically learn to extract relevant
features from data. To do this, they need a large number of examples to be able to identify
complex patterns in the data.
• Model complexity: Deep neural networks are complex models with a large number of parameters. To train them effectively and avoid overfitting, they require a large volume of data, favoring generalization to new examples rather than memorization of the training data.
• Data variability: Real data can be highly varied, containing noise or anomalies. A large data set
helps to better represent data diversity and improve model robustness.
• Accuracy: To achieve high levels of accuracy, deep neural networks need to be trained on
large, high-quality data sets.
Layers in artificial neural networks (ANNs)
An artificial neural network (ANN) is composed of three main types of layers:
• Input layer
• Hidden layers
• Output layer
1. Input layer:
• Role: Receives raw data.
• Function: Transmits data to hidden layers.
• Example: For an image, the input layer contains neurons for each pixel value.
2. Hidden layers:
• Hidden layers are the intermediate layers between input and output layers.
• They perform most of the calculations required by the network. The number and size of hidden layers
can vary according to the complexity of the task.
• Each hidden layer applies a set of weights and biases to the input data, followed by an activation
function
3. Output layer:
• The output layer is the last layer of an ANN.
• It produces output predictions.
• The number of neurons in this layer corresponds to the number of classes in a
classification problem or the number of outputs in a regression problem.
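The three layer types above can be sketched in a few lines of NumPy. This is a minimal illustration with made-up sizes (4 input features, one hidden layer of 8 neurons, 3 output classes), not a full training setup:

```python
import numpy as np

# Illustrative network: input layer (4 features), one hidden layer
# (8 neurons), output layer (3 neurons, e.g. a 3-class problem).
rng = np.random.default_rng(0)

n_inputs, n_hidden, n_outputs = 4, 8, 3
W1 = rng.standard_normal((n_inputs, n_hidden))   # input -> hidden weights
b1 = np.zeros(n_hidden)                          # hidden-layer biases
W2 = rng.standard_normal((n_hidden, n_outputs))  # hidden -> output weights
b2 = np.zeros(n_outputs)                         # output-layer biases

x = rng.standard_normal(n_inputs)   # raw data enters through the input layer
h = np.maximum(0, x @ W1 + b1)      # hidden layer: weights, biases, activation
y = h @ W2 + b2                     # output layer: one value per class

print(y.shape)
```

The number of output neurons (3 here) matches the number of classes, as described above.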
What is a single-layer perceptron?
This is one of the oldest neural networks, proposed by Frank Rosenblatt in 1958. The perceptron is the simplest form of an artificial neural network.
[Figure: a single neuron with inputs x1 (feature 1) and x2 (feature 2), weights w1 and w2, bias b1, and output z1 of the neuron]
Calculating the output of a single neuron
• Forward propagation is the first stage of computation in a neural network. It is the passage of information from the inputs to the output, applying weights, biases and activations layer by layer.
• For the neuron above: z1 = f(w1·x1 + w2·x2 + b1), where f is an activation function.
⮚ These operations are repeated for each layer of the network until the final output is obtained.
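A single neuron's output can be computed directly. In this sketch the feature, weight and bias values are made up, and the sigmoid is used as the activation f:

```python
import math

# Sigmoid activation: squashes any real value into (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1, x2 = 0.5, -1.0   # features
w1, w2 = 0.8, 0.2    # weights
b1 = 0.1             # bias

z1 = w1 * x1 + w2 * x2 + b1   # weighted sum plus bias
a1 = sigmoid(z1)              # neuron output after the activation f
print(z1, a1)
```

The same two steps (weighted sum, then activation) are applied to every neuron in every layer.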
Forward Propagation
[Figure: worked example of forward propagation, computed step by step through a network with inputs x1 and x2, weights w1 to w6, biases b1 to b4, ReLU activations in the hidden layers and a Softmax output layer giving 0.5]
• In practice, these two terms are often used interchangeably, but technically:
▪ Loss function: the error for a single example (input-output pair).
▪ Cost function: the average error over the entire dataset (or a mini-batch).
• MSE (Mean Squared Error): used for classical regression; its derivative is more stable than MAE's.
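The loss/cost distinction is easy to see with MSE. The data below is invented for illustration:

```python
import numpy as np

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # true values
y_pred = np.array([2.5,  0.0, 2.0, 8.0])   # model predictions

losses = (y_true - y_pred) ** 2   # loss: squared error of each example
cost = losses.mean()              # cost: average loss over the whole set
print(losses, cost)
```

Each entry of `losses` is the loss of one example; `cost` is the single number that gradient descent minimizes.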
Backpropagation
• True value: y
• Cost: C = (y - a(2))²
• a(2) = α(z(2))
• z(2) = w2·a(1) + b2
• a(1) = α(z(1))
• z(1) = w1·a(0) + b1
[Figure: computation graph: w1, a(0) and b1 feed z(1); α(z(1)) gives a(1); w2, a(1) and b2 feed z(2); α(z(2)) gives a(2); a(2) and y give the cost C]
Backpropagation
• We want to know how modifying an element of this graph will affect the cost function: this is a partial derivative.
• The derivative of the cost function with respect to the weights cannot be computed directly, as the relationship is indirect (it passes through several intermediate functions).
• This is why we use the chain rule, which splits the derivative into several smaller partial derivatives, layer by layer. This allows us to calculate the impact of each weight on the error, and thus to adjust it correctly.
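The chain rule can be checked numerically. Using the output layer from the slides (z2 = w2·a1 + b2, a2 = sigmoid(z2), C = (y - a2)²) with made-up values, the product of partial derivatives dC/da2 · da2/dz2 · dz2/dw2 matches a finite-difference estimate of dC/dw2:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

y, a1, w2, b2 = 1.0, 0.6, 0.8, 0.1   # illustrative values

z2 = w2 * a1 + b2
a2 = sigmoid(z2)

dC_da2 = -2 * (y - a2)   # derivative of (y - a2)**2 w.r.t. a2
da2_dz2 = a2 * (1 - a2)  # sigmoid derivative
dz2_dw2 = a1             # derivative of w2*a1 + b2 w.r.t. w2
grad = dC_da2 * da2_dz2 * dz2_dw2   # chain rule: product of the three

# Check against a central finite-difference approximation of dC/dw2.
eps = 1e-6
C = lambda w: (y - sigmoid(w * a1 + b2)) ** 2
numeric = (C(w2 + eps) - C(w2 - eps)) / (2 * eps)
print(grad, numeric)
```

The two values agree to high precision, which is exactly what the chain rule guarantees.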
Backpropagation
• The chain rule allows you to "move up" gradually in the network, by splitting the global derivative into products of partial derivatives at each stage.
[Figure: the same computation graph, traversed backwards from the cost C through a(2), z(2), a(1) and z(1) to the weights w2 and w1]
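The whole graph from the slides can be run end to end with scalars. The parameter values are made up; the structure (z(1) = w1·a(0) + b1, a(1) = α(z(1)), z(2) = w2·a(1) + b2, a(2) = α(z(2)), C = (y - a(2))²) follows the figure, with the sigmoid as α:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

a0, y = 0.5, 1.0                        # input activation and true value
w1, b1, w2, b2 = 0.4, 0.0, -0.7, 0.2   # illustrative parameters

# Forward pass through the graph.
z1 = w1 * a0 + b1
a1 = sigmoid(z1)
z2 = w2 * a1 + b2
a2 = sigmoid(z2)
C = (y - a2) ** 2

# Backward pass: products of partial derivatives, layer by layer.
dC_da2 = -2 * (y - a2)
dC_dz2 = dC_da2 * a2 * (1 - a2)
dC_dw2 = dC_dz2 * a1              # dz2/dw2 = a1
dC_da1 = dC_dz2 * w2              # da(1) path back to the hidden layer
dC_dz1 = dC_da1 * a1 * (1 - a1)
dC_dw1 = dC_dz1 * a0              # dz1/dw1 = a0

# One gradient-descent step on both weights lowers the cost.
lr = 0.5
w1 -= lr * dC_dw1
w2 -= lr * dC_dw2
a1 = sigmoid(w1 * a0 + b1)
a2 = sigmoid(w2 * a1 + b2)
print((y - a2) ** 2, "was", C)
```

Updating the weights against their gradients reduces the cost, which is the whole point of backpropagation plus gradient descent.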
Learning rate
• The learning rate is a fundamental parameter in training a neural network. It directly influences both the speed and the quality of learning.
• The learning rate controls how much the network weights are modified at each learning step, depending on the error (the loss).
• It is a speed factor used in the gradient descent algorithm.
[Figure: three cost-versus-weight curves, for a learning rate that is too small, suitable, and too large]
Learning rate
1. High learning rate = instability
• The model hops around the minimum.
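The effect of the learning rate is easy to demonstrate on the simple cost C(w) = w², whose gradient is 2w. The rates below are chosen for illustration:

```python
# Run a few gradient-descent steps on C(w) = w**2 (minimum at w = 0).
def descend(lr, w=1.0, steps=20):
    for _ in range(steps):
        w -= lr * 2 * w   # gradient-descent update: w <- w - lr * dC/dw
    return w

print(descend(0.01))  # too small: progress toward 0 is very slow
print(descend(0.3))   # suitable: converges close to the minimum
print(descend(1.1))   # too large: w overshoots and |w| grows at each step
```

With a too-large rate the update multiplies w by (1 - 2·lr) = -1.2 each step, so the model hops over the minimum and diverges.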