Unit 1 DL
f(x, y) = σ(x + y)
Derivatives of Functions with Multiple Inputs
Creating New Features
• Perhaps the single most common operation in neural networks is to
form a “weighted sum” of the input features. The weighted sum can
emphasize certain features and de-emphasize others, and can thus
be thought of as a new feature that is itself just a combination of the
old features.
• A concise way to express this mathematically is as a dot product of
the observation vector with some set of “weights” of the same length
as the features.
• N = ν(X, W) = X · W = x₁w₁ + x₂w₂ + … + xₙwₙ
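The weighted sum above can be sketched in a few lines of NumPy; the feature and weight values here are made up for illustration:

```python
import numpy as np

# One observation with three features, and a matching weight vector
# (values are illustrative, not from any real dataset).
x = np.array([1.0, 2.0, 3.0])   # features x1, x2, x3
w = np.array([0.5, -1.0, 2.0])  # weights w1, w2, w3

# The weighted sum N = x1*w1 + x2*w2 + x3*w3 is just a dot product.
n = np.dot(x, w)
print(n)  # 0.5 - 2.0 + 6.0 = 4.5
```

Positive weights emphasize a feature, negative weights invert it, and near-zero weights de-emphasize it, which is the sense in which N acts as a new, combined feature.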
Derivatives of Functions with Multiple Vector Inputs
• For vector functions, it isn’t immediately obvious what the
derivative is: if we write the dot product as ν(X, W) = N, what is the
derivative of N with respect to X?
• Since ν is linear in each input, the derivative of N with respect to
each element of X is simply the corresponding element of W (and
vice versa): ∂N/∂xᵢ = wᵢ and ∂N/∂wᵢ = xᵢ.
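This claim is easy to verify numerically with a finite-difference check (the vectors below are the same illustrative values as before):

```python
import numpy as np

def nu(x, w):
    """Dot product nu(X, W) = sum_i x_i * w_i."""
    return np.dot(x, w)

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 2.0])

# Estimate dN/dx_i with central differences and compare to w_i.
eps = 1e-6
grad = np.zeros_like(x)
for i in range(len(x)):
    xp, xm = x.copy(), x.copy()
    xp[i] += eps
    xm[i] -= eps
    grad[i] = (nu(xp, w) - nu(xm, w)) / (2 * eps)

print(np.allclose(grad, w))  # the derivative of N w.r.t. X is W itself
```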
Computational Graph
• Arranging the computations as a graph lets us compute the gradient
of the loss L step by step via the chain rule
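A minimal forward pass through such a graph might look like the sketch below, assuming the setup used in these notes: a weighted sum N = X · W, a prediction P = N + b, and a squared-error loss L = (P − Y)². The variable values are invented for illustration:

```python
import numpy as np

def forward(x, w, b, y):
    """Forward pass: each line is one node of the computational graph."""
    n = np.dot(x, w)      # N = X . W  (weighted sum of features)
    p = n + b             # P = N + b  (prediction)
    loss = (p - y) ** 2   # L = (P - Y)^2  (squared-error loss)
    return n, p, loss

x = np.array([1.0, 2.0])
w = np.array([0.5, -0.5])
n, p, loss = forward(x, w, b=1.0, y=0.0)
print(n, p, loss)  # n = -0.5, p = 0.5, loss = 0.25
```

Recording each intermediate value (n, p) is what makes the backward pass possible: the chain rule reuses them when flowing gradients back through the graph.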
The Backward Pass
• We can express the output of a linear regression model as the dot
product of each observation vector with the weights, plus an
intercept: P = X · W + b
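The backward pass for this model just applies the chain rule to each node in reverse; the sketch below uses the same illustrative values as the forward pass and the code-style names dLdP, dLdW for the partial derivatives:

```python
import numpy as np

# One observation, with invented values for illustration.
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.5])
b, y = 1.0, 0.0

p = np.dot(x, w) + b   # forward: prediction P = X . W + b

# Backward pass for L = (P - Y)^2, node by node:
dLdP = 2 * (p - y)     # dL/dP = 2(P - Y)
dPdW = x               # dP/dw_i = x_i (derivative of a dot product)
dLdW = dLdP * dPdW     # chain rule: dL/dW = dL/dP * dP/dW
dLdB = dLdP * 1.0      # dP/db = 1, so dL/db = dL/dP
print(dLdW, dLdB)      # [1. 2.] 1.0
```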
Training the Model
• Backpropagation
• For the squared-error loss L = (P − Y)², the backward pass starts
from ∂L/∂P = 2(P − Y), written dLdP in code
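Putting the pieces together, training means repeating the forward pass, the backward pass, and a gradient-descent weight update. The sketch below does this on synthetic data; the learning rate, iteration count, and true parameters are all assumptions chosen so the loop converges:

```python
import numpy as np

# Synthetic data: Y = X . true_w + 3.0 (values chosen for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
Y = X @ true_w + 3.0

w = np.zeros(3)  # start from zero weights
b = 0.0
lr = 0.1         # learning rate (assumed)

for _ in range(500):
    P = X @ w + b                 # forward pass: predictions
    dLdP = 2 * (P - Y) / len(Y)   # gradient of the mean squared error
    w -= lr * (X.T @ dLdP)        # chain rule: dL/dW = X^T dL/dP
    b -= lr * dLdP.sum()          # dP/db = 1 for every observation

print(np.round(w, 2), round(b, 2))
```

With enough iterations the learned w and b recover the true parameters, which is backpropagation and gradient descent doing their job on this one-layer model.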