Activation Functions For Neural Networks: Application and Performance-Based Comparison
Abstract:- The past decade has seen explosive growth of Deep Learning (DL) algorithms based on Artificial Neural Networks (ANNs) and their applications in vast emerging domains to solve real-world complex problems. DL architectures use Activation Functions (AFs) to perform the task of finding the relationship between the input features and the output. AFs are essential building blocks of any ANN, as they bring the required non-linearity to the output of the network. Layers of ANNs are combinations of linear and non-linear AFs. The most extensively used AFs are Sigmoid, Hyperbolic Tangent (Tanh) and Rectified Linear Unit (ReLU), to name a few. Choosing an AF for a particular application depends on various factors such as the nature of the application, the design of the ANN, the optimizers used in the network, and the complexity of the data. This paper presents a survey of the most widely used AFs along with the important considerations while selecting an AF for a specific problem domain. A broad guideline on selecting an AF, based on the literature survey, is presented to help researchers employ a suitable AF in their problem domain.

Keywords:- Artificial Neural Network, Activation Functions, RNN.
I. INTRODUCTION

AFs are processing units based on mathematical equations that determine the output of an ANN model. Each neuron in an ANN receives input data, applies a linear transformation (a weighted summation) and then passes the result through an AF, typically a Sigmoid or ReLU, in order to bring non-linearity to the input data. This process allows the ANN to capture complex, non-linear relationships within the data which otherwise cannot be captured by conventional Machine Learning algorithms. The performance of an ANN depends on the efficiency of its convergence, that is, on how quickly the weights assigned to the various inputs of a neuron, or a set of neurons in each layer, stabilise. An ANN is said to have converged when no further significant change to the assigned weights occurs in subsequent iterations. Depending on the selection of AFs, a network may converge faster or may not converge at all. An AF is chosen so as to limit the output to a range such as -1 to 1, 0 to 1, or -∞ to +∞, depending upon the AF used in the neurons of the different layers of the ANN.
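To illustrate this flow, a minimal NumPy sketch of a single neuron is given below; the weights, bias and inputs are arbitrary values chosen for illustration, not values taken from the paper.

import numpy as np

def sigmoid(z):
    # squashes any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    # keeps positive values, maps negative values to 0 (range 0 to +inf)
    return np.maximum(0.0, z)

x = np.array([0.5, -1.2, 3.0])   # input features (arbitrary example values)
w = np.array([0.4, 0.1, -0.6])   # weights assigned to the inputs (arbitrary)
b = 0.2                          # bias (arbitrary)

z = np.dot(w, x) + b             # linear transformation: weighted summation plus bias
print(sigmoid(z))                # non-linear output bounded in (0, 1)
print(relu(z))                   # non-linear output in [0, +inf)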
II. ACTIVATION FUNCTIONS

Primarily, there are four types of AFs prevalent in ANNs, namely Sigmoid, ReLU, Exponential Unit and Adaptive Unit based AFs. In addition to these primary functions, there are a number of use-case based variations of the primary AFs which are well suited to a specific application area; for example, Leaky-ReLU is suitable for Convolutional Neural Networks (CNNs). Chigozie Enyinna Nwankpa et al. have presented a detailed summary of the various AFs used in ANNs. The authors identified four flavours of the Rectified Linear Unit (ReLU) that are prevalently used in neural networks, namely Leaky-ReLU, Parametric-ReLU, Randomized-ReLU and S-shaped-ReLU. Furthermore, the Sigmoid has two variants, namely the Hyperbolic Tangent (TanH) and the Exponential Linear Squashing Activation Function (ELiSH). The Softmax, Maxout, Softplus, Softsign and Swish functions have no variants [1]. Sigmoid is generally used in Logistic Regression, whereas TanH is predominantly used in the LSTM cell for NLP tasks using Recurrent Neural Networks (RNNs).
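For reference, the primary AFs and the ReLU variants named above have the following standard definitions. The NumPy sketch below is illustrative only; the default slope values for alpha are common choices, not values prescribed by [1].

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # small fixed slope for negative inputs
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # same form as Leaky-ReLU, but alpha is a learnable parameter
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # smooth exponential saturation for negative inputs
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def softplus(x):
    return np.log1p(np.exp(x))

def softsign(x):
    return x / (1.0 + np.abs(x))

def swish(x):
    # the input scaled by its own sigmoid
    return x * sigmoid(x)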
The Exponential Linear Unit (ELU) can produce negative values, which has an effect similar to Batch Normalisation, speeding up convergence but at a lower computational cost.

ANNs enable an algorithm to make faster decisions without human intervention because they are able to infer complex non-linear relationships among the features of the dataset. The primary role of the AF is to transform the weighted summation of the node's inputs and the bias into an output value, which is fed to the next hidden layer or emitted as the output of the output layer [Figure-01]. The output produced by the AF of the output layer is compared to the desired value by means of a Loss function, and a gradient is then calculated by an optimization algorithm (usually Gradient Descent) in order to reach a local or global minimum for the ANN through backpropagation. The resultant weight vector contains the hidden characteristics of the data. These AFs are often referred to as Transfer Functions. A typical ANN is a biologically inspired computer program, modelled on the working of the human brain. The collections of neurons are called networks because they are stacked together in the form of layers, with each set of neurons performing a specific function and inferring knowledge by detecting relationships and patterns in the data using past experiences known as training examples.

In the absence of AFs, every neuron would behave as an identity function and would only perform the weighted summation of the input features using the weights and biases; irrespective of the number of layers in the ANN, all the neurons would simply output the summation of the input features without transforming them. The resultant linear function would not be able to capture non-linear patterns in the input data, even though the ANN would become simpler. Such an ANN would, however, still work well for linear regression problems, where the predicted output is the weighted sum of the input features and the bias.
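A small numerical check (purely illustrative, with arbitrary random weights) makes the last point concrete: two stacked layers with no AF in between collapse to a single linear layer.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)                                   # input features
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)     # layer 1 weights and bias
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)     # layer 2 weights and bias

# Two linear layers stacked without an activation function ...
two_layers = W2 @ (W1 @ x + b1) + b2

# ... are equivalent to one linear layer with merged weights and bias.
W, b = W2 @ W1, W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))   # True: the stack collapsed to a single linear map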
Fig 1 Building Blocks of Neural Network

IV. LITERATURE SURVEY

A. Wibowo, W. Wiryawan and I. Nuqoyati experimented on cancer classification using microRNA features. In their experiment, the Gradient Descent, Momentum, RMSProp, AdaGrad, AdaDelta and Adam optimizers were used. The results showed that the ReLU AF produced 98.536% and 98.54762% accuracy with the Adam and RMSProp optimizers respectively [2].

Bekir Karlik and A. Vehbi Olgac analyzed the performance of Multi-Layered Perceptron architectures using various AFs such as Sigmoid, Uni-polar Sigmoid, Tanh, Conic Section and Radial Basis Function (RBF), with varying numbers of neurons in the hidden and output layers. The results show that when the Tanh AF was used in both the hidden and output layers, the best accuracies of around 95% and 99% were observed for 100 and 500 iterations respectively [3].

Hari Krishna Vydana and Anil Kumar Vuppala studied the influence of various AFs on speech recognition systems on the TIMIT and WSJ datasets. The results show that on the smaller dataset (TIMIT) ReLU worked better, producing the minimum phone error rate of 18.9, whereas for the larger dataset (WSJ) the Exponential Linear Unit (ELU) produced a better, reduced phone error rate of 19.5. This suggests that whenever a larger dataset is available for speech recognition, ELU should be employed first [4].

Giovanni Alcantara conducted an empirical analysis of the effectiveness of AFs on the MNIST classification of handwritten digits (LeCun). The experiments suggest that the ReLU, Leaky ReLU, ELU and SELU AFs all yield good results in terms of validation error; however, ELU performed better than all the other models [5].

Farzad et al. employed a Long Short-Term Memory based Recurrent ANN (RNN) with one hidden layer for sentiment analysis and classification of records in the IMDB, Movie Review and MNIST data sets. The Elliott AF and its modifications demonstrated the least average error and better results than the Sigmoid AF, which is more popular in LSTM networks [6].

Dubey and Jain used the ReLU and Leaky-ReLU AFs for the deep layers and the Softmax AF for the output layer on MNIST dataset classification. The results obtained show that the CNN with ReLU produced better results than Leaky ReLU in terms of model accuracy and model loss. The same findings were confirmed by Banerjee et al. in their experiment on the MNIST dataset [7].

Castaneda et al. experimented on object detection, face detection, text and sound datasets using ReLU, SELU and Maxout. The results show that ReLU is best suited for object, face and text detection, whereas SELU and Maxout are better for sound/speech detection [8].

Shiv Ram Dubey et al. used the CIFAR10 and CIFAR100 datasets for image classification experiments over different CNN models. It was observed that Softplus, ELU and CELU are better suited to MobileNet. ReLU, Mish and PDELU exhibit good performance with VGG16, GoogleNet and DenseNet. The ReLU, LReLU, CELU, ELU, GELU, ABReLU and PDELU AFs are better for networks having residual connections, such as ResNet50, SENet18 and DenseNet121 [9].

Tomasz Szandała et al. experimented on the CIFAR-10 dataset with a CNN of just two convolution layers to compare the efficiency of ReLU, Sigmoid, TanH, Leaky-ReLU, SWISH, Softsign and Softplus. The training and test data were split in a 5:1 ratio. [Fig-02] presents the accuracy of the various AFs [10]. The performance of ReLU and Leaky ReLU was the most satisfying, and both produced above 70% classification accuracy in spite of the dying ReLU condition. All the other AFs resulted in lower accuracies of less than 70% [Table-01].

Table 1 Relative Accuracy Ratio of AFs

Fig 2 Relative Accuracy Graph of AFs
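As an illustration of the kind of comparison reported in [10], the sketch below is an assumption-laden illustration rather than the authors' code: it presumes PyTorch and torchvision, and the layer widths, learning rate and epoch count are arbitrary choices. It trains the same two-convolution-layer CNN on CIFAR-10 once per AF and reports test accuracy.

import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

def make_cnn(act):
    # Two convolution layers followed by a small classifier; `act` is the AF class under test.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), act(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), act(), nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(32 * 8 * 8, 10),
    )

def test_accuracy(model, loader):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total

train = torchvision.datasets.CIFAR10(".", train=True, download=True, transform=T.ToTensor())
test = torchvision.datasets.CIFAR10(".", train=False, download=True, transform=T.ToTensor())
train_loader = torch.utils.data.DataLoader(train, batch_size=128, shuffle=True)
test_loader = torch.utils.data.DataLoader(test, batch_size=256)

# CIFAR-10 ships with 50,000 training and 10,000 test images, i.e. the 5:1 split noted above.
for act in [nn.ReLU, nn.Sigmoid, nn.Tanh, nn.LeakyReLU, nn.SiLU, nn.Softsign, nn.Softplus]:
    model = make_cnn(act)                                    # nn.SiLU is the Swish function
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(5):                                       # short training run for illustration
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    print(act.__name__, test_accuracy(model, test_loader))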
V. IMPORTANT CONSIDERATIONS FOR SELECTION OF AF

The selection of an AF is a critical decision when designing and training ANNs. Different AFs can have a significant impact on the performance and convergence of an ANN. The following are some critical factors that affect the choice of AF in an ANN:

Differentiability:
Two important properties of AFs are that they should be differentiable and that their gradient should be expressible in terms of the function itself. The first property makes an AF suitable for backpropagation, which is the essence of ANNs: many optimization techniques, such as gradient descent, rely on the derivative of the AF, so it is essential that the AF is differentiable or has well-defined gradients. This allows for efficient training of the network. The second desirable property reduces the computation time of the ANN, which matters because the network is at times trained on millions of complex data points. The Sigmoid and TanH AFs and their derivatives are shown below along with their graph plots [Fig-03]:
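In their standard form, both derivatives can indeed be written in terms of the function itself:

\[
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr)
\]
\[
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad \tanh'(x) = 1 - \tanh^{2}(x)
\]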
Fig 4 Sigmoid and its Derivative Curve

Leaky ReLU addresses the issue of dead activations by replacing the '0' output for negative inputs with alpha times x (alpha = 0.01), so that the derivative of LReLU stays slightly greater than '0' [Fig-05]. Similarly, Parametric ReLU (PReLU) and ELU are often preferred because they are less prone to vanishing gradients [11].
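In equation form, the Leaky ReLU described above and its derivative, with alpha = 0.01 as stated, are:

\[
\mathrm{LReLU}(x) =
\begin{cases}
x, & x > 0\\
\alpha x, & x \le 0
\end{cases}
\qquad
\mathrm{LReLU}'(x) =
\begin{cases}
1, & x > 0\\
\alpha, & x \le 0
\end{cases}
\qquad \alpha = 0.01
\]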
REFERENCES