Activation Functions


What is an Activation Function?

An activation function is the function used to compute the output of a node. It is also known as a Transfer Function.

Why do we use activation functions in neural networks?

An activation function determines the output of a neural network node, for example a yes or no decision. It maps the resulting values into a range such as 0 to 1 or -1 to 1, depending on the function.

Activation functions can be broadly divided into two types:

1. Linear Activation Function

2. Non-linear Activation Functions

Linear or Identity Activation Function

The function is a straight line, so its output is not confined to any range.

Equation: f(x) = x

Range: (-infinity, infinity)

A linear activation does not help the network capture the complexity of typical input data: stacking layers that use only linear activations still produces a single linear mapping.
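To make this point concrete, here is a small illustrative sketch of my own in Python with NumPy (the weight matrices are arbitrary example values, not from the notes): it defines the identity activation and shows that two stacked linear layers are equivalent to a single linear mapping.

import numpy as np

def linear(x):
    # Identity activation: f(x) = x, range (-infinity, infinity)
    return x

# Two layers that use only the linear activation collapse into one linear map.
W1 = np.array([[2.0, 0.0], [0.0, 3.0]])   # arbitrary example weights
W2 = np.array([[1.0, 1.0], [0.0, 1.0]])
x = np.array([1.0, -2.0])

out_stacked = linear(W2 @ linear(W1 @ x))
out_single = (W2 @ W1) @ x                 # same result, a single linear layer
print(out_stacked, out_single)             # [-4. -6.] [-4. -6.]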

Non-linear Activation Function

Non-linear activation functions are the most widely used activation functions. Non-linearity gives the function a curved graph, as shown below.

Fig: Non-linear Activation Function

Non-linearity makes it easier for the model to generalize and adapt to a variety of data and to distinguish between outputs.

The main terms needed to understand non-linear functions are:

Derivative or differential: the change along the y-axis with respect to the change along the x-axis, also known as the slope.

Monotonic function: a function that is either entirely non-increasing or entirely non-decreasing.

Non-linear activation functions are mainly divided on the basis of their range or the shape of their curves:

1. Sigmoid or Logistic Activation Function

The sigmoid function curve has an S-shape.

Fig: Sigmoid Function

The main reason we use the sigmoid function is that its output lies between 0 and 1. It is therefore especially useful for models that must predict a probability as the output: since any probability lies between 0 and 1, sigmoid is a natural choice.

The function is differentiable, which means we can find the slope of the sigmoid curve at any point.

The function is monotonic, but its derivative is not.

The logistic sigmoid function can cause a neural network to get stuck during training, because its gradient becomes very small for large positive or negative inputs.

The softmax function is a generalization of the logistic activation function that is used for multiclass classification.
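As an illustration (my own addition, not part of the original notes), a short NumPy sketch of the sigmoid, its derivative, and the softmax generalization described above:

import numpy as np

def sigmoid(x):
    # Squashes any real input into (0, 1), which suits probability outputs.
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # Slope of the sigmoid curve; it peaks at x = 0, so it is not monotonic.
    s = sigmoid(x)
    return s * (1.0 - s)

def softmax(z):
    # Generalized logistic function for multiclass classification.
    e = np.exp(z - np.max(z))      # subtract the max for numerical stability
    return e / e.sum()

print(sigmoid(np.array([-2.0, 0.0, 2.0])))   # approx [0.12, 0.5, 0.88]
print(softmax(np.array([1.0, 2.0, 3.0])))    # three probabilities that sum to 1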

2. Tanh or hyperbolic tangent Activation Function

tanh is similar to the logistic sigmoid but often works better. The range of the tanh function is (-1, 1), and it is also sigmoidal (S-shaped).

Fig: tanh v/s Logistic Sigmoid

The advantage is that negative inputs are mapped to strongly negative outputs, while inputs near zero are mapped near zero.

The function is differentiable.

The function is monotonic, while its derivative is not.

The tanh function is mainly used for classification between two classes.

Both tanh and logistic sigmoid activation functions are used in feed-forward nets.
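A small illustrative comparison (my own sketch, not from the notes) showing how tanh maps negative inputs strongly negative while the logistic sigmoid keeps everything in (0, 1):

import numpy as np

def tanh(x):
    # Zero-centred S-shaped curve with range (-1, 1).
    return np.tanh(x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(tanh(x))      # negative inputs map close to -1, zero maps to 0
print(sigmoid(x))   # the same inputs stay between 0 and 1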

3. ReLU (Rectified Linear Unit) Activation Function

ReLU is currently the most widely used activation function; it appears in almost all convolutional neural networks and deep learning models.

Fig: ReLU v/s Logistic Sigmoid

As the figure shows, ReLU is half rectified (from the bottom): f(z) is zero when z is less than zero, and f(z) equals z when z is greater than or equal to zero.

Range: [0, infinity)

Both the function and its derivative are monotonic.

The issue is that all negative values become zero immediately, which reduces the model's ability to fit the data properly: any negative input is mapped to zero, so negative values are not represented appropriately and the corresponding neurons receive no gradient. This is known as the dying ReLU problem.
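The behaviour described above can be sketched in a few lines (an illustration of my own, not from the notes); the derivative shows why neurons that only receive negative inputs stop updating:

import numpy as np

def relu(z):
    # f(z) = 0 for z < 0 and f(z) = z for z >= 0; range [0, infinity).
    return np.maximum(0.0, z)

def relu_derivative(z):
    # The gradient is 0 for negative inputs, so those units receive no update.
    return (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))              # [0.  0.  0.  0.5 2. ]
print(relu_derivative(z))   # [0. 0. 0. 1. 1.]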

4. Leaky ReLU

It is an attempt to solve the dying ReLU problem


Fig : ReLU v/s Leaky ReLU

Can you see the Leak?

The leak helps to increase the range of the ReLU function. Usually, the
value of a is 0.01 or so.

When a is not 0.01 then it is called Randomized ReLU.

Therefore the range of the Leaky ReLU is (-infinity to infinity).

Both Leaky and Randomized ReLU functions are monotonic in nature.


Also, their derivatives also monotonic in nature.
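A minimal sketch of Leaky ReLU (my own illustration), using the common default slope a = 0.01 mentioned above for negative inputs:

import numpy as np

def leaky_relu(z, a=0.01):
    # Negative inputs are scaled by a small slope a instead of being zeroed,
    # so the range becomes (-infinity, infinity) and the gradient never dies.
    return np.where(z > 0, z, a * z)

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(leaky_relu(z))   # [-0.1  -0.01   0.     1.    10.  ]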

Perceptrons

 - A Perceptron is an Artificial Neuron.
 - It is the simplest possible Neural Network.
 - Neural Networks are the building blocks of Machine Learning.

The original Perceptron was designed to take a number of binary inputs and produce one binary output (0 or 1).

The idea was to use different weights to represent the importance of each input, and to require that the weighted sum exceed a threshold value before making a decision such as true or false (0 or 1).
Perceptron Example

Imagine a perceptron (in your brain).

The perceptron tries to decide if you should go to a concert.

Is the artist good? Is the weather good?

What weights should these facts have?

The Perceptron Algorithm

Frank Rosenblatt suggested this algorithm:

1. Set a threshold value
2. Multiply all inputs with their weights
3. Sum all the results
4. Activate the output

Criteria             Input          Weight
Artist is Good       x1 = 0 or 1    w1 = 0.7
Weather is Good      x2 = 0 or 1    w2 = 0.6
Friend will Come     x3 = 0 or 1    w3 = 0.5
Food is Served       x4 = 0 or 1    w4 = 0.3
Alcohol is Served    x5 = 0 or 1    w5 = 0.4

Applying the algorithm to the criteria above:

1. Set a threshold value:

 - Threshold = 1.5

2. Multiply all inputs with their weights:

 - x1 * w1 = 1 * 0.7 = 0.7
 - x2 * w2 = 0 * 0.6 = 0
 - x3 * w3 = 1 * 0.5 = 0.5
 - x4 * w4 = 0 * 0.3 = 0
 - x5 * w5 = 1 * 0.4 = 0.4

3. Sum all the results:

 - 0.7 + 0 + 0.5 + 0 + 0.4 = 1.6 (the weighted sum)

4. Activate the output:

 - Return true if the sum > 1.5 ("Yes, I will go to the concert")
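The worked example can be written directly in code; this is a small sketch of my own using the weights, inputs, and threshold from the example above:

# Perceptron decision for the concert example.
weights = [0.7, 0.6, 0.5, 0.3, 0.4]   # artist, weather, friend, food, alcohol
inputs = [1, 0, 1, 0, 1]              # binary yes/no answers
threshold = 1.5

weighted_sum = sum(x * w for x, w in zip(inputs, weights))   # 0.7 + 0.5 + 0.4 = 1.6
output = 1 if weighted_sum > threshold else 0                # 1 means "Yes, I will go"
print(weighted_sum, output)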

Perceptron Terminology

 - Perceptron Inputs
 - Node Values
 - Node Weights
 - Activation Function

Perceptron Inputs

Perceptron inputs are called nodes.

The nodes have both a value and a weight.

Node Values

In the example above, the node values are: 1, 0, 1, 0, 1

The binary input values (0 or 1) can be interpreted as (no or yes) or (false or true).

Node Weights

Weights show the strength of each node.

In the example above, the node weights are: 0.7, 0.6, 0.5, 0.3, 0.4

The Activation Function

The activation function maps the result (the weighted sum) to a required value such as 0 or 1.

In the example above, the activation function is simple: (sum > 1.5).

The binary output (1 or 0) can be interpreted as (yes or no) or (true or false).

Neural Networks

The Perceptron defines the first step into Neural Networks.

Multi-Layer Perceptrons can be used for very sophisticated decision making.

In the Neural Network Model, input data (yellow) are processed against
a hidden layer (blue) and modified against more hidden layers (green) to
produce the final output (red).

The First Layer:


The 3 yellow perceptrons are making 3 simple decisions based on the
input evidence. Each single decision is sent to the 4 perceptrons in the
next layer.

The Second Layer:

The blue perceptrons make decisions by weighing the results from the first layer. This layer can make more complex decisions at a more abstract level than the first layer.

The Third Layer:

Even more complex decisions are made by the green perceptrons.
Perceptron

Although today the Perceptron is widely recognized as an algorithm, it was initially intended as an image recognition machine. It gets its name from performing the human-like function of perception, seeing and recognizing images.

In particular, interest has been centered on the idea of a machine which would be capable of conceptualizing inputs impinging directly from the physical environment of light, sound, temperature, etc. — the "phenomenal world" with which we are all familiar — rather than requiring the intervention of a human agent to digest and code the necessary information.[4]

Rosenblatt's perceptron machine relied on a basic unit of computation, the neuron. Just like in previous models, each neuron has a cell that receives a series of pairs of inputs and weights.

The major difference in Rosenblatt's model is that inputs are combined in a weighted sum and, if the weighted sum exceeds a predefined threshold, the neuron fires and produces an output.

Perceptron's neuron model (left) and threshold logic (right). (Image by author)

Threshold T represents the activation function: if the weighted sum of the inputs exceeds the threshold, the neuron outputs the value 1; otherwise the output value is zero.

Multilayer Perceptron

The Multilayer Perceptron was developed to tackle this limitation. It is a


neural network where the mapping between inputs and output is non-
linear.
A Multilayer Perceptron has input and output layers, and one or
more hidden layers with many neurons stacked together. And while in
the Perceptron the neuron must have an activation function that imposes a
threshold, like ReLU or sigmoid, neurons in a Multilayer Perceptron can
use any arbitrary activation function.

Multilayer Perceptron. (Image by author)

The Multilayer Perceptron falls under the category of feedforward algorithms, because inputs are combined with the initial weights in a weighted sum and subjected to the activation function, just like in the Perceptron. The difference is that each linear combination is propagated to the next layer.

Each layer feeds the next one with the result of its computation, its internal representation of the data. This goes all the way through the hidden layers to the output layer. But there is more to it.
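To illustrate the forward pass described above, here is a compact sketch of my own (the layer sizes, random weights, and the choice of sigmoid are illustrative assumptions, not values from the notes):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    # Each layer computes a weighted sum of the previous layer's output,
    # applies the activation, and feeds the result to the next layer.
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(0)
sizes = [3, 4, 3, 1]   # input layer, two hidden layers, output layer (illustrative)
layers = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(sizes[:-1], sizes[1:])]

x = np.array([1.0, 0.0, 1.0])
print(forward(x, layers))   # final output of the network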

If the algorithm only computed the weighted sums in each neuron, propagated results to the output layer, and stopped there, it wouldn't be able to learn the weights that minimize the cost function. If the algorithm only computed one iteration, there would be no actual learning.
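To make the role of repeated iterations concrete, here is a minimal sketch of the classic perceptron learning rule (not the MLP's backpropagation); the AND dataset, learning rate, and epoch count are illustrative assumptions of mine:

import numpy as np

# Toy dataset: learn the logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)
b = 0.0
lr = 0.1

for epoch in range(10):                  # repeated passes over the data drive the learning
    for xi, target in zip(X, y):
        pred = 1.0 if w @ xi + b > 0 else 0.0
        error = target - pred
        w += lr * error * xi             # adjust weights in proportion to the error
        b += lr * error

print(w, b)   # weights and bias that separate AND after a few epochs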
