Activation Functions
As you can see, the function is a line, i.e. it is linear. Therefore, the output of the
function is not confined to any range.
Equation: f(x) = x
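As a quick illustration (a minimal Python sketch; the sample inputs are made up), the identity activation passes every value through unchanged, so the output grows without bound along with the input:

```python
def linear(x):
    # Identity (linear) activation: output equals input, so it is unbounded.
    return x

print([linear(v) for v in [-100.0, 0.0, 3.5, 1e6]])  # values pass through unchanged
```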
The Nonlinear Activation Functions are the most widely used activation functions.
Nonlinearity gives the graph a curved shape rather than a straight line.
It makes it easy for the model to generalize or adapt to a variety of data
and to differentiate between the outputs.
The logistic sigmoid function can cause a neural network to get stuck during
training.
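One way to see why training can stall (a minimal sketch using the standard logistic formula 1 / (1 + e^-x); the sample inputs are made up): for large positive or negative inputs the sigmoid saturates near 1 or 0, so its gradient is nearly zero and weight updates become tiny.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # derivative of the logistic sigmoid

for x in [-10.0, -2.0, 0.0, 2.0, 10.0]:
    print(x, round(sigmoid(x), 4), round(sigmoid_grad(x), 6))
# At x = -10 or x = 10 the gradient is about 0.000045, so learning slows to a crawl.
```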
tanh is also like the logistic sigmoid, but better. The range of the tanh function
is (-1, 1). tanh is also sigmoidal (s-shaped).
Fig: tanh v/s Logistic Sigmoid
The advantage is that negative inputs will be mapped strongly negative
and zero inputs will be mapped near zero on the tanh graph.
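A small comparison (a sketch with arbitrary sample inputs) makes this concrete: tanh sends negative inputs to strongly negative outputs and zero to zero, while the logistic sigmoid maps everything into (0, 1).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

for x in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    print(x, round(math.tanh(x), 4), round(sigmoid(x), 4))
# tanh(-3) is about -0.995 (strongly negative) and tanh(0) = 0;
# sigmoid(-3) is about 0.047 and sigmoid(0) = 0.5, both positive.
```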
The ReLU is currently the most used activation function, since it is used in
almost all convolutional neural networks and deep learning models.
Fig: ReLU v/s Logistic Sigmoid
As you can see, the ReLU is half rectified (from the bottom). f(z) is zero
when z is less than zero, and f(z) is equal to z when z is greater than or equal to
zero.
Equation: f(z) = max(0, z)
Range: [0, infinity)
But the issue is that all the negative values become zero immediately,
which decreases the ability of the model to fit or train on the data
properly. That means any negative input given to the ReLU activation
function turns into zero immediately in the graph, which in turn
affects the resulting graph by not mapping the negative values
appropriately.
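The definition translates directly into code (a minimal sketch; the inputs are made up). Every negative input collapses to zero, which is exactly the behaviour described above:

```python
def relu(z):
    # f(z) = 0 for z < 0, f(z) = z for z >= 0
    return max(0.0, z)

print([relu(z) for z in [-5.0, -0.1, 0.0, 0.1, 5.0]])
# [0.0, 0.0, 0.0, 0.1, 5.0] -- all negative inputs are mapped to zero.
```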
4. Leaky ReLU
The leak helps to increase the range of the ReLU function. Usually, the
value of a (the slope applied to negative inputs) is 0.01 or so.
Equation: f(z) = z when z >= 0, and f(z) = a*z when z < 0
Range: (-infinity, infinity)
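A sketch of the leaky variant (assuming the common slope a = 0.01 mentioned above; the inputs are made up): negative inputs are scaled by a instead of being zeroed out, so the range extends to negative infinity.

```python
def leaky_relu(z, a=0.01):
    # f(z) = z for z >= 0, f(z) = a * z for z < 0
    return z if z >= 0 else a * z

print([leaky_relu(z) for z in [-5.0, -0.1, 0.0, 0.1, 5.0]])
# [-0.05, -0.001, 0.0, 0.1, 5.0] -- negatives are kept, just scaled down.
```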
Perceptrons
A Perceptron is an Artificial Neuron
It is the simplest possible Neural Network
Neural Networks are the building blocks of Machine Learning.
Threshold = 1.5
Return true if the weighted sum > 1.5 ("Yes, I will go to the Concert")
Perceptron Terminology
Perceptron Inputs
Node values
Node Weights
Activation Function
Perceptron Inputs
Node Values
Node Weights
In the example above, the node weights are: 0.7, 0.6, 0.5, 0.3, 0.4
The activation function maps the result (the weighted sum) into a
required value like 0 or 1.
In the example above, the activation function is simple: (sum > 1.5)
The binary output (1 or 0) can be interpreted as (yes or no) or (true or
false).
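Putting the pieces together (a minimal sketch using the node weights 0.7, 0.6, 0.5, 0.3, 0.4 and the threshold 1.5 from the example above; the binary input values are made up for illustration):

```python
def perceptron(values, weights, threshold=1.5):
    # Weighted sum of the inputs, followed by a simple threshold activation.
    weighted_sum = sum(v * w for v, w in zip(values, weights))
    return 1 if weighted_sum > threshold else 0  # 1 = "yes", 0 = "no"

weights = [0.7, 0.6, 0.5, 0.3, 0.4]
values = [1, 0, 1, 0, 1]            # hypothetical yes/no answers to the criteria
print(perceptron(values, weights))  # weighted sum = 1.6 > 1.5, so the output is 1 ("yes")
```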
Neural Networks
In the Neural Network Model, input data (yellow) is processed by a
hidden layer (blue) and modified by further hidden layers (green) to
produce the final output (red).
Multilayer Perceptron
Each layer feeds the next one with the result of its computation,
its internal representation of the data. This goes all the way through the
hidden layers to the output layer. But there is more to it.
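A compact sketch of that forward flow (the layer sizes, random weights, and the choice of sigmoid activation are assumptions for illustration): each layer's output becomes the next layer's input, all the way to the output layer.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights):
    # One fully connected layer: weighted sum per node, then the activation.
    return [sigmoid(sum(i * w for i, w in zip(inputs, row))) for row in weights]

random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 3 inputs -> 4 hidden nodes
w_output = [[random.uniform(-1, 1) for _ in range(4)]]                    # 4 hidden -> 1 output node

x = [0.5, 0.1, 0.9]                # made-up input data
hidden = layer(x, w_hidden)        # hidden layer's internal representation
output = layer(hidden, w_output)   # final output
print(output)
```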