Neural Network I
Overview
● What is a Neural Network; Artificial Neural Networks: biological neurons and their working
● Simulation of biological neurons for problem solving
● Learning rules and various activation functions (sigmoid, tanh, ReLU and softmax)
● McCulloch Pitts Neuron, Concept of Linear Separability
● Single layer Perceptron
● Feedforward Neural Networks
● Back Propagation networks
● Character Recognition Application
● Stochastic Gradient Descent
● Immunological computing
Introduction
● What is Neural Network??
● A method of computing, based on the interaction of multiple connected processing
elements.
● A powerful technique to solve many real world problems.
● The ability to learn from experience in order to improve their performance.
● At the core of a neural network is a mathematical model that is used to make predictions
or decisions based on input data.
● The neurons in a neural network are connected by weighted links that allow them to
communicate with one another.
● There are several types of neural networks, including feedforward neural networks,
convolutional neural networks, and recurrent neural networks.
Basics of Neural Network
● Neurons are cells that carry electrical impulses and are the basic units of the nervous system.
● Every neuron is made of a cell body (also called a soma), dendrites and an
axon. Dendrites and axons are nerve fibers. There are about 86 billion neurons
in the human brain, which comprises roughly 10% of all brain cells.
● Neurons are connected to one another and to other tissues. They do not touch; instead they form tiny gaps called synapses. These gaps can be chemical synapses or electrical synapses and pass the signal from one neuron to the next.
● Dendrite — It receives signals from other neurons.
● Soma (cell body) — It sums all the incoming signals to generate input.
● Axon — When the sum reaches a threshold value, neuron fires and the signal
travels down the axon to the other neurons.
● Synapses — The point of interconnection of one neuron with other neurons.
The amount of signal transmitted depends upon the strength (synaptic weights) of the connections.
Neurons
Comparing ANN and BNN
● As ANNs borrow this concept from biological neural networks (BNNs), there are a lot of similarities, though there are differences too.
● The main similarities are summarized in the analogy below.
Analogy of ANN with BNN
● Dendrites ↔ Inputs
● Soma (cell body) ↔ Node (summation unit)
● Synapses ↔ Weights (connection strengths)
● Axon ↔ Output
Learning
• Learning = learning by adaptation
• The objective of learning in biological organisms is to improve their
survival and reproductive success by adapting to changing environmental
conditions and developing new strategies for survival.
• Learning in biological organisms allows them to:
1. Respond to environmental changes
2. Improve their performance
3. Develop new behaviors
4. Enhance communication
Types of Learning in Neural Network
● Supervised Learning — Supervised learning is a type of machine
learning where the algorithm is trained on labeled data, which means that
the data is already categorized into specific classes or categories.
● Unsupervised Learning — Unsupervised learning is a type of machine
learning where the algorithm is trained on unlabeled data, which means
that the data is not categorized into specific classes or categories. The goal
of unsupervised learning is to find patterns and relationships in the data
without any prior knowledge of what the data represent.
● Reinforcement Learning — Reinforcement learning is a type of machine
learning where an agent learns to make decisions in an environment by
receiving feedback in the form of rewards or penalties.
Model of Artificial
Neural Network
● Receives n-inputs
● Multiplies each input by its
weight
● Applies activation function
to the sum of results
● Outputs result
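As a quick illustration of these four steps, here is a minimal NumPy sketch of a single artificial neuron. The input values, weights, bias, and the choice of a sigmoid activation are arbitrary assumptions for the example, not values from these slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def artificial_neuron(x, w, b):
    # multiply each input by its weight, sum them, add the bias ...
    z = np.dot(w, x) + b
    # ... then apply the activation function to the result
    return sigmoid(z)

# example with n = 3 inputs and arbitrary weights/bias
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.4, 0.7, -0.2])
b = 0.1
print(artificial_neuron(x, w, b))
```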
Activation Functions
● The activation function is a mathematical “gate” in between
the input feeding the current neuron and its output going to
the next layer. They basically decide whether the neuron should
be activated or not.
● Activation functions in a neural network (NN) are mathematical
functions that are applied to the output of a neuron in the
network.
● The activation function introduces non-linearity into the
network and helps to produce a non-linear decision boundary
that can be used to model complex relationships in the input
data.
Activation Functions
Why do we use an activation function ?
If we do not have the activation function the weights and bias would simply
do a linear transformation.
A linear equation is simple to solve but is limited in its capacity to solve complex problems and has less power to learn complex functional mappings from data.
A neural network without an activation function is just a linear regression
model.
Generally, neural networks use non-linear activation functions, which can
help the network learn complex data, compute and learn almost any function
representing a question, and provide accurate predictions.
Why use a non-linear activation function?
If we were to use a linear activation function or identity activation function, then the neural network would just output a linear function of its input.
And so, no matter how many layers our neural network has, it will still behave just like a single-layer network, because composing these layers gives us another linear function, which is not expressive enough to model complex data.
Linear or Identity Activation Function
Equation: f(x) = x
Derivative: f’(x) = 1
Range: (-∞, +∞)
Two major problems:
1. Back-propagation becomes ineffective — the derivative of the function is a constant and has no relation to the input X, so it is not possible to go back and understand which weights in the input neurons can provide a better prediction.
2. All layers of the neural network collapse into one — with linear activation functions, no matter how many layers the neural network has, the last layer will still be a linear function of the first layer.
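A small NumPy check of the "collapse into one" point above: composing two purely linear layers gives exactly the same result as a single linear layer with combined weights. The matrices here are arbitrary random examples, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)   # layer 1 (linear)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)   # layer 2 (linear)

# two stacked linear layers
y_stacked = W2 @ (W1 @ x + b1) + b2

# one equivalent linear layer
W_eq = W2 @ W1
b_eq = W2 @ b1 + b2
y_single = W_eq @ x + b_eq

print(np.allclose(y_stacked, y_single))  # True: the stack is still linear
```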
Non-linear Activation Function
Modern neural network models use non-linear activation functions.
They allow the model to create complex mappings between the
network’s inputs and outputs, which are essential for learning
and modeling complex data, such as images, video, audio, and
data sets which are non-linear or have high dimensionality.
Almost any process imaginable can be represented as a
functional computation in a neural network, provided that the
activation function is non-linear.
Non-linear Activation Function
Non-linear functions address the problems of a linear activation
function:
They allow back-propagation because they have a derivative
function which is related to the inputs.
They allow “stacking” of multiple layers of neurons to create
a deep neural network. Multiple hidden layers of neurons are
needed to learn complex data sets with high levels of accuracy.
Activation Functions
● Some commonly used activation functions in NNs include:
● Sigmoid function: The sigmoid function is an S-shaped curve that maps any input value
to a value between 0 and 1. It is commonly used as the activation function in the output
layer of binary classification problems.
● ReLU (Rectified Linear Unit) function: The ReLU function maps any input value to 0 if
it is negative, and to the input value if it is positive. It is commonly used as the activation
function in the hidden layers of deep neural networks.
● Tanh (Hyperbolic tangent) function: The Tanh function is similar to the sigmoid
function, but it maps any input value to a value between -1 and 1. It is also commonly
used as an activation function in the hidden layers of neural networks.
● Softmax function: The softmax function is used in the output layer of multi-class
classification problems. It maps the output values of each neuron to a probability
distribution over the classes.
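A minimal NumPy sketch of the four functions listed above. Subtracting the maximum inside softmax is a common numerical-stability trick assumed here, not something stated in the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def tanh(x):
    return np.tanh(x)

def softmax(x):
    e = np.exp(x - np.max(x))      # subtract the max for numerical stability
    return e / e.sum()

z = np.array([-2.0, 0.0, 1.0, 3.0])
print(sigmoid(z), relu(z), tanh(z), softmax(z), sep="\n")
```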
Sigmoid Function
● It is a function which is plotted as ‘S’ shaped
graph.
● Equation : A = 1/(1 + e^(-x))
● Derivative: f’(x) = s*(1 − s), where s = f(x)
● Nature : Non-linear. For X values between -2 and 2, the curve is very steep, so small changes in x bring about large changes in the value of Y.
● Value Range : 0 to 1
● Uses : Usually used in the output layer of binary classification, where the result is either 0 or 1. Since the sigmoid value lies between 0 and 1, the result can be predicted as 1 if the value is greater than 0.5 and as 0 otherwise.
Sigmoid Function
Advantages:
1. The function is differentiable. That means we can find the slope of the sigmoid curve at any point.
2. Output values are bound between 0 and 1, normalizing the output of each neuron.
Disadvantages:
1. Vanishing gradient — for very large or very small inputs, the sigmoid curve flattens, which means the gradient (slope) becomes almost zero. With gradients so small, the neural network struggles to update its weights, slowing or even stopping learning.
2. Slow convergence — due to the vanishing gradient, updates to the model are minimal and the training process becomes very slow.
3. Outputs are not zero-centered — the sigmoid output ranges from 0 to 1, so it is always positive. This causes issues during weight updates, as the gradients can push all weights in the same direction, making optimization harder.
4. Computationally expensive — it requires evaluating an exponential.
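A quick numeric illustration of the vanishing-gradient point: the sigmoid derivative s*(1 − s) peaks at 0.25 for x = 0 and shrinks toward zero for large |x|. The sample points below are arbitrary.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {x:5.1f}  gradient = {sigmoid_grad(x):.6f}")
# the gradient is 0.25 at x = 0 and about 0.000045 at x = 10,
# so weight updates driven by it become negligible for large inputs
```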
Tanh Function
• The activation that almost always works better than the sigmoid function is the Tanh function, also known as the hyperbolic tangent function. It is actually a mathematically shifted and scaled version of the sigmoid function; both are similar and can be derived from each other.
• Equation : A = tanh(x) = (e^x − e^(-x)) / (e^x + e^(-x))
• Value Range :- -1 to +1
• Derivative: f’(x) = 1 − a², where a = tanh(x)
• Nature :- non-linear
• Uses :- Usually used in the hidden layers of a neural network, as its values lie between -1 and 1, so the mean of the hidden-layer activations comes out to be 0 or very close to it. This helps center the data by bringing the mean close to 0, which makes learning for the next layer much easier.
Tanh Function
Advantages:
1. Zero centered — Unlike the sigmoid function,
the tanh function outputs values between −1
and 1.
This helps the neural network model
inputs with strong negative, neutral, and
strong positive values more effectively,
leading to faster convergence.
2. The function is monotonic (consistently increasing), which simplifies learning.
3. It generally works better than the sigmoid function.
Disadvantage:
1. It also suffers from the vanishing gradient problem and hence slow convergence.
RELU Function
•It stands for Rectified Linear Unit. It is the most widely used activation function, chiefly implemented in the hidden layers of neural networks.
•Equation :- A(x) = max(0, x). It gives an output of x if x is positive and 0 otherwise.
•Value Range :- [0, inf)
•Nature :- non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons being activated by the ReLU function.
•Uses :- ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. At any time only a few neurons are activated, making the network sparse and therefore efficient and easy to compute.
In simple words, ReLU learns much faster than the sigmoid and Tanh functions.
Softmax Function
● The softmax activation function is commonly
used in the output layer of a neural network
when performing multiclass classification.
● Nature :- non-linear
● softmax(x_i) = exp(x_i) / sum(exp(x_j)) for all j
● Uses :- Usually used when handling multiple classes; the softmax function is commonly found in the output layer of image classification problems.
● The softmax function is particularly useful in
multiclass classification tasks, where the goal is
to predict the probability of each possible class
for a given input.
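As a small illustration of this behaviour, a sketch with made-up class scores for a three-class problem: softmax turns the raw scores into probabilities that sum to 1, and the largest probability gives the predicted class.

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - np.max(scores))   # stable softmax
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])        # raw scores for classes A, B, C
probs = softmax(logits)
print(probs, probs.sum())                 # a probability distribution summing to 1
print("predicted class:", ["A", "B", "C"][int(np.argmax(probs))])
```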
Activation function
● Sigmoid functions and their combinations generally work better in
the case of classification problems.
● Sigmoid and tanh functions are sometimes avoided due to the
vanishing gradient problem.
● The ReLU activation function is widely used and is the default choice, as it generally yields better results.
● The ReLU function should only be used in the hidden layers.
● The output layer can use a linear activation function in the case of regression problems.
Activation function
● The basic rule of thumb is: if you really don’t know what activation function to use, simply use ReLU, as it is a general-purpose activation function for hidden layers and is used in most cases these days.
● If your output is for binary classification, the sigmoid function is a very natural choice for the output layer.
● If your output is for multi-class classification, Softmax is very useful to predict the probabilities of each class.
What is the Perceptron model in Machine Learning?
The Perceptron is a Machine Learning algorithm for supervised learning of binary classification tasks. A Perceptron is also understood as an artificial neuron or neural-network unit that helps detect certain input data computations in business intelligence.
The Perceptron model is also treated as one of the simplest types of Artificial Neural Networks. It is a supervised learning algorithm for binary classifiers. Hence, we can consider it a single-layer neural network with four main parameters, i.e., input values, weights and bias, net sum, and an activation function.
What is Binary classifier in Machine Learning?
A binary classifier is a model used to categorize data into two distinct
classes (e.g., Yes/No, 1/-1, True/False).
A binary classifier predicts which of two classes a given input belongs
to:
● Positive class: Often labeled as 1.
● Negative class: Often labeled as −1 (or 0, depending on
convention).
Basic Components of Perceptron
○ Input Nodes or Input Layer:
This is the primary component of Perceptron which accepts the initial data
into the system for further processing. Each input node contains a real
numerical value.
○ Weight and Bias:
The weight parameter represents the strength of the connection between units. This is another important parameter of the Perceptron's components. Weight is directly proportional to the strength of the associated input neuron in deciding the output. Further, bias can be considered as the intercept in a linear equation.
Basic Components of Perceptron
Activation Function:
These are the final and important components that help to determine whether
the neuron will fire or not. Activation Function can be considered primarily
as a step function.
Types of activation functions: step, sign, and sigmoid functions (see the Activation Functions of Perceptron slide).
How does Perceptron work?
In Machine Learning, Perceptron is considered as a single-layer neural network that
consists of four main parameters named input values (Input nodes), weights and Bias,
net sum, and an activation function.
The perceptron model begins with the multiplication of all input values and their
weights, then adds these values together to create the weighted sum.
Then this weighted sum is applied to the activation function 'f' to obtain the desired output. This activation function is also known as the step function and is represented by 'f'.
This step function or activation function plays a vital role in ensuring that the output is mapped between the required values (0, 1) or (-1, 1).
It is important to note that the weight of an input is indicative of the strength of a node. Similarly, an input's bias value gives the ability to shift the activation function curve up or down.
How does Perceptron work?
Step-1
In the first step, multiply all input values with their corresponding weight values and then add them to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:
∑wi*xi = x1*w1 + x2*w2 + … + xn*wn
Add a special term called bias 'b' to this weighted sum to improve the model's performance:
∑wi*xi + b
Step-2
In the second step, an activation function is applied with the above-mentioned weighted sum,
which gives us output either in binary form or a continuous value as follows:
Y = f(∑wi*xi + b)
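A minimal sketch of Step 1 and Step 2 in NumPy. The inputs, weights, and bias are invented for illustration, and the activation is a simple step function.

```python
import numpy as np

def step(z):
    # step / threshold activation: 1 if z > 0, else 0
    return 1 if z > 0 else 0

def perceptron_output(x, w, b):
    weighted_sum = np.dot(w, x) + b     # Step 1: sum(wi*xi) + b
    return step(weighted_sum)           # Step 2: Y = f(sum(wi*xi) + b)

x = np.array([1.0, 0.0, 1.0])
w = np.array([0.6, -0.4, 0.3])
b = -0.5
print(perceptron_output(x, w, b))
```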
Single Layer Perceptron
A single-layer perceptron is a type of artificial neural network that
consists of only one layer of artificial neurons.
It is the simplest type of neural network and was proposed by Frank
Rosenblatt in 1958.
Single-layer perceptrons have been used in various applications, including: pattern recognition, binary classification, control systems, medical diagnosis, and financial forecasting.
The perceptron consists of 4 parts:
Input value or One input layer: The input layer of the perceptron is made of
artificial input neurons and takes the initial data into the system for further
processing.
Weights and Bias:
Weight: It represents the dimension or strength of the connection between
units.
Bias: It is the same as the intercept added in a linear equation. Bias is a tunable parameter in neural networks that can help improve the accuracy and flexibility of the model by allowing it to learn more complex decision boundaries.
Net sum: It calculates the total sum.
Activation Function: Whether a neuron is activated or not is determined by an activation function, which is applied to the weighted sum plus bias to give the result.
The Perceptron Learning Rule
1. Initialize the weights: Start with random weights for each input.
2. Input the training data: Input the features into the perceptron and calculate
the output.
3. Calculate the error: Compare the predicted output with the desired output to
calculate the error.
4. Update the weights: Adjust the weights of the inputs based on the error. If the
predicted output is less than the desired output, increase the weights of the
inputs. If the predicted output is greater than the desired output, decrease the
weights of the inputs. The magnitude of the weight adjustment is proportional
to the error and the input value.
5. Repeat: Repeat steps 2 to 4 until the error is minimized or a maximum number of iterations is reached (a minimal sketch of this rule follows below).
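A hedged sketch of this learning rule on the logical AND function, which is linearly separable. The learning rate, number of epochs, and initial weights are assumptions chosen for the example.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
y = np.array([0, 0, 0, 1])                       # AND targets

rng = np.random.default_rng(42)
w = rng.normal(scale=0.1, size=2)                # 1. initialize the weights
b = 0.0
lr = 0.1                                         # learning rate

for epoch in range(20):                          # repeat over the data
    for xi, target in zip(X, y):
        pred = 1 if np.dot(w, xi) + b > 0 else 0 # 2. compute the output
        error = target - pred                    # 3. compare with the target
        w = w + lr * error * xi                  # 4. adjust weights by the error
        b = b + lr * error

print("weights:", w, "bias:", b)
print("predictions:", [1 if np.dot(w, xi) + b > 0 else 0 for xi in X])
# on this linearly separable toy problem the rule typically converges quickly
```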
Perceptron Function
The Perceptron is a function that maps its input “x”, multiplied with the learned weight coefficients, to an output value “f(x)”:
f(x) = 1 if w · x + b > 0, and 0 otherwise
In the equation given above:
“w” = vector of real-valued weights
“b” = bias (an element that adjusts the boundary away from the origin without any dependence on the input value)
“x” = vector of input x values
“m” = number of inputs to the Perceptron
The output can be represented as “1” or “0.” It can also be represented as “1” or “-1” depending on which activation function is used.
Activation Functions of Perceptron
The activation function applies a step rule (convert the numerical output
into +1 or -1) to check if the output of the weighting function is greater than
zero or not.
For example:
If ∑ wi*xi > 0, then the final output “o” = 1 (issue bank loan)
Else, the final output “o” = -1 (deny bank loan)
Step function gets triggered above a certain value of the neuron output; else
it outputs zero. Sign Function outputs +1 or -1 depending on whether
neuron output is greater than zero or not. Sigmoid is the S-curve and outputs
a value between 0 and 1.
Feedforward Neural Networks (FFNN)
A feedforward neural network (FFNN) is a type of artificial neural network where the information flows in one direction only, from the input layer through one or more hidden layers to the output layer.
The output of each layer is connected to the input of the next layer, and the weights
and biases of the connections are learned during the training process.
FFNNs are commonly used for tasks such as classification, control systems (e.g., robotics), and pattern recognition.
They can be trained using supervised learning, where the training data consists of input-
output pairs, and the network learns to map inputs to outputs.
The weights and biases of the network are updated during the training process using
backpropagation, which is an algorithm that computes the gradients of the loss function
with respect to the weights and biases.
How FFNN works
The input layer of an FFNN takes in the input data, which is usually in the form of a vector, and passes it through a series of hidden layers, each consisting of a set of neurons.
Each neuron in a hidden layer takes in the weighted sum of the outputs from the previous layer, adds a bias term, and applies an activation function to produce an output that is passed to the next layer.
The output layer produces the final output of the network, which is usually a
prediction or a classification.
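A minimal forward pass for a small FFNN with one hidden layer. The layer sizes, random weights, and the sigmoid/softmax choices are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(x, W1, b1, W2, b2):
    h = sigmoid(W1 @ x + b1)      # hidden layer: weighted sum + bias + activation
    return softmax(W2 @ h + b2)   # output layer: class probabilities

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)   # 4 inputs -> 5 hidden units
W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)   # 5 hidden -> 3 output classes

x = rng.normal(size=4)            # one input vector
print(forward(x, W1, b1, W2, b2))
```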
A Multi-Layer Perceptron (MLP)
A Multi-Layer Perceptron (MLP) is a type of neural
network that consists of multiple layers of artificial
neurons.
MLPs are also known as feedforward neural networks.
The architecture of an MLP consists of an input layer,
one or more hidden layers, and an output layer.
Each layer is composed of multiple artificial neurons that
compute a weighted sum of the input signals and apply
an activation function to produce an output signal.
A Multi-Layer Perceptron (MLP)
The hidden layers in an MLP are responsible for extracting
features from the input data and transforming them into a
format that is suitable for the output layer.
The output layer produces the final output of the network,
which can be binary or continuous.
The learning process of an MLP involves adjusting the
weights of the input signals using backpropagation.
Backpropagation allows the MLP to learn from the training
data and improve its performance over time.
Compare single layer and multilayer perceptron model
Architecture:
Single-layer perceptrons have only one layer of neurons that directly connects to
the input data, whereas multilayer perceptrons consist of multiple layers of
neurons, including one or more hidden layers that lie between the input and output
layers.
Capabilities:
Single-layer perceptrons are limited to linearly separable problems, meaning they
can only learn and classify data that can be separated by a single straight line.
In contrast, multilayer perceptrons can learn and classify non-linearly
separable problems by using hidden layers to transform the input data into a
more complex feature space that can be separated by the output layer.
Compare single layer and multilayer perceptron model
Training:
Single-layer perceptrons use a simple learning rule called the Perceptron Learning
Algorithm, which adjusts the weights of the input signals to minimize the error
between the predicted and actual output. In contrast, multilayer perceptrons use a
more complex learning algorithm called backpropagation, which iteratively adjusts
the weights of all the neurons in the network to minimize the error between the
predicted and actual output.
Applications:
Single-layer perceptrons are typically used for simple binary classification
problems, such as predicting whether an email is spam or not. Multilayer perceptrons
are more powerful and can be used for a wide range of applications, including
image and speech recognition, natural language processing, and financial forecasting.
Back Propagation networks
Backpropagation is a supervised learning algorithm used for training neural networks.
The basic structure of a backpropagation network consists of an input
layer, one or more hidden layers, and an output layer.
Each layer is composed of one or more neurons, which receive inputs,
process them, and pass the outputs to the next layer.
The connections between the neurons are weighted, and these weights
are adjusted during training to improve the accuracy of the network's
predictions.
Back Propagation networks
During the training process, the network is fed a set of input-output
pairs, and the output of the network is compared to the desired output.
The error between the actual output and the desired output is
then back propagated through the network, and the weights are
adjusted to reduce the error.
This process is repeated many times, with the hope that the
network will eventually converge to a set of weights that produces
accurate predictions for new input data.
How Backpropagation Algorithm Works:
1. Inputs X arrive through the preconnected path.
2. The input is modeled using real weights W. The weights are usually randomly selected.
3. Calculate the output for every neuron, from the input layer, through the hidden layers, to the output layer.
4. Calculate the error in the outputs:
Error = Actual Output – Desired Output
5. Travel back from the output layer to the hidden layers and adjust the weights such that the error is decreased.
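A hedged sketch of these five steps on a tiny network with one hidden layer, trained on XOR with a squared-error loss and sigmoid activations. The layer sizes, learning rate, and epoch count are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy dataset: XOR (not linearly separable, so a hidden layer is needed)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # input -> hidden (random init)
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # hidden -> output
lr = 0.5

for epoch in range(5000):
    # forward pass: compute outputs layer by layer
    H = sigmoid(X @ W1 + b1)           # hidden activations
    P = sigmoid(H @ W2 + b2)           # predicted outputs

    # error at the output layer (squared-error derivative times sigmoid')
    dP = (P - Y) * P * (1 - P)
    # backpropagate the error to the hidden layer
    dH = (dP @ W2.T) * H * (1 - H)

    # adjust weights in the direction that decreases the error
    W2 -= lr * H.T @ dP
    b2 -= lr * dP.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ dH
    b1 -= lr * dH.sum(axis=0, keepdims=True)

P = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print(P.round(3))   # after training, outputs typically approach [0, 1, 1, 0]
```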
Why We Need Backpropagation?
● Backpropagation is fast, simple and easy to program
● It has no parameters to tune apart from the number of inputs
● It is a flexible method as it does not require prior
knowledge about the network
● It is a standard method that generally works well
● It does not need any special mention of the features of the
function to be learned.
Concept of Linear Separability
● Linear separability is a concept in mathematics and particularly
in machine learning.
● Imagine you have some points scattered around on a piece of
paper, and you want to draw a straight line to separate them into
two groups.
● If you can draw such a line where all the points of one group are
on one side of the line, and all the points of the other group are
on the other side, then those points are said to be linearly
separable.
Concept of Linear Separability
● For example, let's say you have red and blue dots on a sheet of
paper, and you want to separate them with a straight line. If you
can draw a line in such a way that all the red dots are on one side
and all the blue dots are on the other side, then those dots are
linearly separable.
● Linear separability is important in machine learning because
it means that the data is easy to classify using a simple
algorithm like a linear classifier. If data is not linearly separable,
more complex methods may be needed to classify it accurately.
Concept of Linear Separability
● Linear separability is an important concept in neural networks. If the points in n-dimensional space can be divided by a hyperplane w1*x1 + w2*x2 + … + wn*xn + b = 0 such that all points of one class fall on one side and all points of the other class fall on the other side, the data is said to be linearly separable.
● For two-dimensional inputs, if there exists a line (with equation w1*x1 + w2*x2 + b = 0) that separates all samples of one class from the other class, then an appropriate perceptron can be derived from the equation of the separating line. Such classification problems are called “linearly separable”, i.e., separable by a linear combination of the inputs.
Character Recognition Application
Character recognition is a common application of neural networks, and
can be achieved using various types of neural networks, including
Feedforward neural networks, convolutional neural networks, and
recurrent neural networks
The network must be trained on a dataset of labeled character images
in order to learn to recognize characters.
During training, the network adjusts its weights based on the
difference between its predicted output and the true label of the input
image.
Once the network is trained, it can be used to make predictions on new,
unlabeled character images.
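A hedged sketch of this workflow using scikit-learn's built-in 8x8 digit images and a small multi-layer perceptron classifier. The library choice, hidden-layer size, and iteration count are my assumptions, not part of the slides; the example assumes scikit-learn is installed.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# labeled character (digit) images: 8x8 pixels flattened to 64 features
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=0)

# training: the network adjusts its weights to reduce prediction error
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0)
clf.fit(X_train, y_train)

# prediction on new, unseen character images
print("test accuracy:", clf.score(X_test, y_test))
print("predicted labels:", clf.predict(X_test[:10]))
```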
OCR (Optical Character Recognition)
OCR is a technology that analyzes the text of a page and turns the
letters into code that may be used to process information.
OCR is a technique for detecting printed or handwritten text characters inside digital images of paper files, such as scanned paper records.
OCR systems are hardware and software systems that turn physical
documents into machine-readable text.
These digital versions can be highly beneficial to children and young
adults who struggle to read.
The essential application of OCR is to convert hard copy legal or
historical documents into PDFs.
How OCR works?
1. Image Pre-Processing
● Size normalization: This step ensures that all images are of the same
size for consistency. We use a method called bicubic interpolation to
resize images to a standard size.
● Binarization: Here, we convert grayscale images to binary images by
setting a threshold. Pixels above the threshold become white, while
those below become black. This helps in simplifying the image for
further processing.
● Smoothing: To make the edges of objects in the image smoother, we
use erosion and dilation techniques. This helps in reducing noise and
making the objects clearer.
How OCR works?
Text recognition: Once the image is pre-processed, we can start recognizing
text. There are two main methods for this:
● Pattern matching: This works well for typed documents with known
fonts. It compares parts of the image with patterns of characters it
knows.
● Feature extraction: This method looks at specific features of
characters, like lines and curves, to identify them.
How OCR works?
Postprocessing
After recognizing the text, the system converts it into a digital format. Some
systems also create annotated PDF files, which show both the original
scanned document and the recognized text.
Immunological computing
Immunological computing is like using the principles of our immune system to teach
computers how to recognize patterns, make decisions, and solve problems
effectively. It's a fascinating area of research that draws inspiration from nature to
develop smarter algorithms and systems.
Immune System Basics:
Our immune system is like a defense force in our body that helps to keep us healthy.
It can recognize harmful invaders, like bacteria or viruses, and fight them off to keep
us safe.
It does this by identifying foreign substances called antigens and producing antibodies
to neutralize them.
Immunological computing
How Immunological Computing Works:
In immunological computing, we mimic the behavior of the immune system to
solve computational problems.
Just like our immune system learns to recognize and respond to threats, in
immunological computing, algorithms learn to recognize patterns in data and
make decisions based on them.
Instead of antigens and antibodies, we use concepts like "patterns" and "rules".
The algorithms adapt and improve over time, similar to how our immune system
builds immunity to diseases.
Immunological computing
Applications:
Immunological computing can be used in various fields such as data mining,
pattern recognition, and optimization.
For example, in anomaly detection, it can help identify unusual patterns in
data that may indicate fraud or errors.
In optimization problems, it can be used to find the best solution among
many possibilities, similar to how our immune system finds the best
response to different threats.
Stochastic Gradient Descent
Gradient Descent (GD):
Imagine you are blindfolded on a hill and want to find the lowest point without any help.
You feel the slope under your feet and take small steps downhill. This is like Gradient
Descent where you iteratively move towards the minimum of a function by following the
direction of steepest descent.
Stochastic Gradient Descent (SGD):
Now, let's add a twist. Instead of relying on the slope at your current location alone, you
randomly pick a spot on the hill, feel the slope there, and take a step. Sometimes this spot
might be flat or even uphill, but over many such random steps, you tend to move towards
the bottom of the hill. This randomness helps in escaping local minima and can be faster
than regular Gradient Descent, especially for large datasets.
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent
algorithm that is used for optimizing machine learning models.
In SGD, instead of using the entire dataset for each iteration, only a single
random training example (or a small batch) is selected to calculate the
gradient and update the model parameters. This random selection introduces randomness into the optimization process, hence the term “stochastic” in Stochastic Gradient Descent.
The advantage of using SGD is its computational efficiency, especially when
dealing with large datasets. By using a single example or a small batch, the
computational cost per iteration is significantly reduced compared to traditional
Gradient Descent methods that require processing the entire dataset.
Stochastic Gradient Descent
How it works:
● Start with an initial guess for the minimum point.
● Randomly shuffle your dataset.
● For each data point in the shuffled dataset:
○ Compute the gradient of the loss function at that point (i.e.,
the direction of steepest descent).
○ Update your guess for the minimum point by taking a small
step.
● Repeat this process for a fixed number of iterations or until the
improvement becomes very small.
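A minimal sketch of this loop for fitting a simple linear model y ≈ w*x + b with per-example (stochastic) updates. The synthetic data, learning rate, and epoch count are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=200)   # noisy line

w, b = 0.0, 0.0            # initial guess for the minimum point
lr = 0.1                   # step size

for epoch in range(20):
    idx = rng.permutation(len(x))           # randomly shuffle the dataset
    for i in idx:                            # one example per update
        pred = w * x[i] + b
        grad_w = 2 * (pred - y[i]) * x[i]    # gradient of the squared error
        grad_b = 2 * (pred - y[i])
        w -= lr * grad_w                     # take a small step downhill
        b -= lr * grad_b

print(f"learned w = {w:.2f}, b = {b:.2f}   (true values: 3.0, 0.5)")
```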
Ad

More Related Content

Similar to Neural Networks and its related Concepts (20)

Sppu engineering artificial intelligence and data science semester 6th Artif...
Sppu engineering  artificial intelligence and data science semester 6th Artif...Sppu engineering  artificial intelligence and data science semester 6th Artif...
Sppu engineering artificial intelligence and data science semester 6th Artif...
pawaletrupti434
 
Deep learning Ann(Artificial neural network)
Deep learning Ann(Artificial neural network)Deep learning Ann(Artificial neural network)
Deep learning Ann(Artificial neural network)
aawezix
 
Neural Networks Basic Concepts and Deep Learning
Neural Networks Basic Concepts and Deep LearningNeural Networks Basic Concepts and Deep Learning
Neural Networks Basic Concepts and Deep Learning
rahuljain582793
 
2011 0480.neural-networks
2011 0480.neural-networks2011 0480.neural-networks
2011 0480.neural-networks
Parneet Kaur
 
Neural networks introduction
Neural networks introductionNeural networks introduction
Neural networks introduction
آيةالله عبدالحكيم
 
Unit 6: Application of AI
Unit 6: Application of AIUnit 6: Application of AI
Unit 6: Application of AI
Tekendra Nath Yogi
 
Soft Computing-173101
Soft Computing-173101Soft Computing-173101
Soft Computing-173101
AMIT KUMAR
 
Neuralnetwork 101222074552-phpapp02
Neuralnetwork 101222074552-phpapp02Neuralnetwork 101222074552-phpapp02
Neuralnetwork 101222074552-phpapp02
Deepu Gupta
 
V2.0 open power ai virtual university deep learning and ai introduction
V2.0 open power ai virtual university   deep learning and ai introductionV2.0 open power ai virtual university   deep learning and ai introduction
V2.0 open power ai virtual university deep learning and ai introduction
Ganesan Narayanasamy
 
Data Science - Part VIII - Artifical Neural Network
Data Science - Part VIII -  Artifical Neural NetworkData Science - Part VIII -  Artifical Neural Network
Data Science - Part VIII - Artifical Neural Network
Derek Kane
 
Activation_function.pptx
Activation_function.pptxActivation_function.pptx
Activation_function.pptx
Mohamed Essam
 
Artificial neural network paper
Artificial neural network paperArtificial neural network paper
Artificial neural network paper
AkashRanjandas1
 
Unit 2 ml.pptx
Unit 2 ml.pptxUnit 2 ml.pptx
Unit 2 ml.pptx
PradeeshSAI
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Randa Elanwar
 
SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300
SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300
SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300
dhruvkeshav123
 
ANN.pptx
ANN.pptxANN.pptx
ANN.pptx
AROCKIAJAYAIECW
 
Machine learning PPT which shows the some deep learning concepts and code of ...
Machine learning PPT which shows the some deep learning concepts and code of ...Machine learning PPT which shows the some deep learning concepts and code of ...
Machine learning PPT which shows the some deep learning concepts and code of ...
workingmann08
 
ACTIVATION FUNCTIONS IN SOFT COMPUTING AW
ACTIVATION FUNCTIONS IN SOFT COMPUTING AWACTIVATION FUNCTIONS IN SOFT COMPUTING AW
ACTIVATION FUNCTIONS IN SOFT COMPUTING AW
sssmrockz
 
NEURALNETWORKS_DM_SOWMYAJYOTHI.pdf
NEURALNETWORKS_DM_SOWMYAJYOTHI.pdfNEURALNETWORKS_DM_SOWMYAJYOTHI.pdf
NEURALNETWORKS_DM_SOWMYAJYOTHI.pdf
SowmyaJyothi3
 
Deep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdf
Deep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdfDeep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdf
Deep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdf
naveenraghavendran10
 
Sppu engineering artificial intelligence and data science semester 6th Artif...
Sppu engineering  artificial intelligence and data science semester 6th Artif...Sppu engineering  artificial intelligence and data science semester 6th Artif...
Sppu engineering artificial intelligence and data science semester 6th Artif...
pawaletrupti434
 
Deep learning Ann(Artificial neural network)
Deep learning Ann(Artificial neural network)Deep learning Ann(Artificial neural network)
Deep learning Ann(Artificial neural network)
aawezix
 
Neural Networks Basic Concepts and Deep Learning
Neural Networks Basic Concepts and Deep LearningNeural Networks Basic Concepts and Deep Learning
Neural Networks Basic Concepts and Deep Learning
rahuljain582793
 
2011 0480.neural-networks
2011 0480.neural-networks2011 0480.neural-networks
2011 0480.neural-networks
Parneet Kaur
 
Soft Computing-173101
Soft Computing-173101Soft Computing-173101
Soft Computing-173101
AMIT KUMAR
 
Neuralnetwork 101222074552-phpapp02
Neuralnetwork 101222074552-phpapp02Neuralnetwork 101222074552-phpapp02
Neuralnetwork 101222074552-phpapp02
Deepu Gupta
 
V2.0 open power ai virtual university deep learning and ai introduction
V2.0 open power ai virtual university   deep learning and ai introductionV2.0 open power ai virtual university   deep learning and ai introduction
V2.0 open power ai virtual university deep learning and ai introduction
Ganesan Narayanasamy
 
Data Science - Part VIII - Artifical Neural Network
Data Science - Part VIII -  Artifical Neural NetworkData Science - Part VIII -  Artifical Neural Network
Data Science - Part VIII - Artifical Neural Network
Derek Kane
 
Activation_function.pptx
Activation_function.pptxActivation_function.pptx
Activation_function.pptx
Mohamed Essam
 
Artificial neural network paper
Artificial neural network paperArtificial neural network paper
Artificial neural network paper
AkashRanjandas1
 
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9Introduction to Neural networks (under graduate course) Lecture 9 of 9
Introduction to Neural networks (under graduate course) Lecture 9 of 9
Randa Elanwar
 
SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300
SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300
SET-02_SOCS_ESE-DEC23__B.Tech%20(CSE-H+NH)-AIML_5_CSAI300
dhruvkeshav123
 
Machine learning PPT which shows the some deep learning concepts and code of ...
Machine learning PPT which shows the some deep learning concepts and code of ...Machine learning PPT which shows the some deep learning concepts and code of ...
Machine learning PPT which shows the some deep learning concepts and code of ...
workingmann08
 
ACTIVATION FUNCTIONS IN SOFT COMPUTING AW
ACTIVATION FUNCTIONS IN SOFT COMPUTING AWACTIVATION FUNCTIONS IN SOFT COMPUTING AW
ACTIVATION FUNCTIONS IN SOFT COMPUTING AW
sssmrockz
 
NEURALNETWORKS_DM_SOWMYAJYOTHI.pdf
NEURALNETWORKS_DM_SOWMYAJYOTHI.pdfNEURALNETWORKS_DM_SOWMYAJYOTHI.pdf
NEURALNETWORKS_DM_SOWMYAJYOTHI.pdf
SowmyaJyothi3
 
Deep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdf
Deep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdfDeep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdf
Deep Learning Study _ FInalwithCNN_RNN_LSTM_GRU.pdf
naveenraghavendran10
 

Recently uploaded (20)

π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
Data Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptxData Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptx
RushaliDeshmukh2
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)
samueljackson3773
 
ELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdfELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdf
Shiju Jacob
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
Metal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistryMetal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistry
mee23nu
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
The Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLabThe Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLab
Journal of Soft Computing in Civil Engineering
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Process Parameter Optimization for Minimizing Springback in Cold Drawing Proc...
Journal of Soft Computing in Civil Engineering
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Value Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous SecurityValue Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous Security
Marc Hornbeek
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
π0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalizationπ0.5: a Vision-Language-Action Model with Open-World Generalization
π0.5: a Vision-Language-Action Model with Open-World Generalization
NABLAS株式会社
 
Data Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptxData Structures_Searching and Sorting.pptx
Data Structures_Searching and Sorting.pptx
RushaliDeshmukh2
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)
samueljackson3773
 
ELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdfELectronics Boards & Product Testing_Shiju.pdf
ELectronics Boards & Product Testing_Shiju.pdf
Shiju Jacob
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 
Metal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistryMetal alkyne complexes.pptx in chemistry
Metal alkyne complexes.pptx in chemistry
mee23nu
 
Level 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical SafetyLevel 1-Safety.pptx Presentation of Electrical Safety
Level 1-Safety.pptx Presentation of Electrical Safety
JoseAlbertoCariasDel
 
Data Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptxData Structures_Introduction to algorithms.pptx
Data Structures_Introduction to algorithms.pptx
RushaliDeshmukh2
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
some basics electrical and electronics knowledge
some basics electrical and electronics knowledgesome basics electrical and electronics knowledge
some basics electrical and electronics knowledge
nguyentrungdo88
 
Introduction to FLUID MECHANICS & KINEMATICS
Introduction to FLUID MECHANICS &  KINEMATICSIntroduction to FLUID MECHANICS &  KINEMATICS
Introduction to FLUID MECHANICS & KINEMATICS
narayanaswamygdas
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
Value Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous SecurityValue Stream Mapping Worskshops for Intelligent Continuous Security
Value Stream Mapping Worskshops for Intelligent Continuous Security
Marc Hornbeek
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
Ad

Neural Networks and its related Concepts

  • 2. Overview ● What is Neural Network, Artificial Neural Networks: Biological neurons and its working ● Simulation of biological neurons to problem solving ● Learning rules and various activation functions (sigmoid, tanh, relu and softmax ) ● McCulloch Pitts Neuron, Concept of Linear Separability ● Single layer Perceptron ● Feedforward Neural Networks ● Back Propagation networks ● Character Recognition Application ● Stochastic Gradient Descent ● Immunological computing
  • 3. Introduction ● What is Neural Network?? ● A method of computing, based on the interaction of multiple connected processing elements. ● A powerful technique to solve many real world problems. ● The ability to learn from experience in order to improve their performance. ● At the core of a neural network is a mathematical model that is used to make predictions or decisions based on input data. ● The neurons in a neural network are connected by weighted links that allow them to communicate with one another. ● There are several types of neural networks, including feedforward neural networks, convolutional neural networks, and recurrent neural networks.
  • 4. Basics of Neural Network ● A neuron is a cell that carries electrical impulses and are the basic units of the nervous system. ● Every neuron is made of a cell body (also called a soma), dendrites and an axon. Dendrites and axons are nerve fibers. There are about 86 billion neurons in the human brain, which comprises roughly 10% of all brain cells. ● Neurons are connected to one another and tissues. They do not touch and instead form tiny gaps called synapses. These gaps can be chemical synapses or electrical synapses and pass the signal from one neuron to the next. ● Dendrite — It receives signals from other neurons. ● Soma (cell body) — It sums all the incoming signals to generate input. ● Axon — When the sum reaches a threshold value, neuron fires and the signal travels down the axon to the other neurons. ● Synapses — The point of interconnection of one neuron with other neurons. The amount of signal transmitted depend upon the strength (synaptic weights) of the connections.
  • 6. Comparing ANN and BNN ● As this concept borrowed from ANN there are lot of similarities though there are differences too. ● Similarities are in the following table
  • 8. Analogy of ANN with BNN
  • 9. Learning • Learning = learning by adaptation • The objective of learning in biological organisms is to improve their survival and reproductive success by adapting to changing environmental conditions and developing new strategies for survival. • Learning in biological organisms allows them to: 1. Respond to environmental changes 2. Improve their performance 3. Develop new behaviors 4. Enhance communication
  • 10. Types of Learning in Neural Network ● Supervised Learning — Supervised learning is a type of machine learning where the algorithm is trained on labeled data, which means that the data is already categorized into specific classes or categories. ● Unsupervised Learning — Unsupervised learning is a type of machine learning where the algorithm is trained on unlabeled data, which means that the data is not categorized into specific classes or categories. The goal of unsupervised learning is to find patterns and relationships in the data without any prior knowledge of what the data represent. ● Reinforcement Learning — Reinforcement learning is a type of machine learning where an agent learns to make decisions in an environment by receiving feedback in the form of rewards or penalties.
  • 11. Model of Artificial Neural Network ● Receives n-inputs ● Multiplies each input by its weight ● Applies activation function to the sum of results ● Outputs result
  • 12. Activation Functions ● The activation function is a mathematical “gate” in between the input feeding the current neuron and its output going to the next layer. They basically decide whether the neuron should be activated or not. ● Activation functions in a neural network (NN) are mathematical functions that are applied to the output of a neuron in the network. ● The activation function introduces non-linearity into the network and helps to produce a non-linear decision boundary that can be used to model complex relationships in the input data.
  • 14. Why do we use an activation function ? If we do not have the activation function the weights and bias would simply do a linear transformation. A linear equation is simple to solve but is limited in its capacity to solve complex problems and have less power to learn complex functional mappings from data. A neural network without an activation function is just a linear regression model. Generally, neural networks use non-linear activation functions, which can help the network learn complex data, compute and learn almost any function representing a question, and provide accurate predictions.
  • 15. Why use a non-linear activation function? If we were to use a linear activation function or identity activation functions then the neural network will just output a linear function of input. And so, no matter how many layers our neural network has, it will still behave just like a single layer network because summing these layers will give us another linear function which is not strong enough to model data.
  • 16. Linear or Identity Activation Function
  • 17. Linear or Identity Activation Function Equation: f(x) = x Derivative: f’(x) = 1 Range: (-∞, +∞) Two major problems: 1. Back-propagation is not possible — The derivative of the function is a constant, and has no relation to the input, X. So it’s not possible to go back and understand which weights in the input neurons can provide a better prediction. 2. All layers of the neural network collapse into one — with linear activation functions, no matter how many layers in the neural network, the last layer will be a linear function of the first layer
  • 18. Non-linear Activation Function Modern neural network models use non-linear activation functions. They allow the model to create complex mappings between the network’s inputs and outputs, which are essential for learning and modeling complex data, such as images, video, audio, and data sets which are non-linear or have high dimensionality. Almost any process imaginable can be represented as a functional computation in a neural network, provided that the activation function is non-linear.
  • 19. Non-linear Activation Function Non-linear functions address the problems of a linear activation function: They allow back-propagation because they have a derivative function which is related to the inputs. They allow “stacking” of multiple layers of neurons to create a deep neural network. Multiple hidden layers of neurons are needed to learn complex data sets with high levels of accuracy.
  • 20. Activation Functions ● Some commonly used activation functions in NNs include: ● Sigmoid function: The sigmoid function is an S-shaped curve that maps any input value to a value between 0 and 1. It is commonly used as the activation function in the output layer of binary classification problems. ● ReLU (Rectified Linear Unit) function: The ReLU function maps any input value to 0 if it is negative, and to the input value if it is positive. It is commonly used as the activation function in the hidden layers of deep neural networks. ● Tanh (Hyperbolic tangent) function: The Tanh function is similar to the sigmoid function, but it maps any input value to a value between -1 and 1. It is also commonly used as an activation function in the hidden layers of neural networks. ● Softmax function: The softmax function is used in the output layer of multi-class classification problems. It maps the output values of each neuron to a probability distribution over the classes.
  • 21. Sigmoid Function ● It is a function which is plotted as ‘S’ shaped graph. ● Equation : A = 1/(1 + e-x ) ● Derivative: f’(x) = s*(1-s) ● Nature : Non-linear. Notice that X values lies between -2 to 2, Y values are very steep. This means, small changes in x would also bring about large changes in the value of Y. ● Value Range : 0 to 1 ● Uses : Usually used in output layer of a binary classification, where result is either 0 or 1, as value for sigmoid function lies between 0 and 1 only so, result can be predicted easily to be 1 if value is greater than 0.5 and 0 otherwise.
  • 22. Sigmoid Function Advantages: 1. The function is differentiable.That means, we can find the slope of the sigmoid curve at any two points. 2. Output values bound between 0 and 1, normalizing the output of each neuron. Disadvantages: 3. Vanishing gradient — For very large or very small inputs, the sigmoid curve flattens. This means the gradient (slope) becomes almost zero. With gradients so small, the neural network struggles to update its weights, slowing or even stopping learning. 4. Due to the vanishing gradient, the training process becomes very slow, as updates to the model are minimal. sigmoids have slow convergence. ● Outputs not zero centered.: The sigmoid output ranges from 0 to 1, so it is always positive. ● This causes issues during weight updates, as the gradients can push all weights in the same direction, making optimization harder. 1. Computationally expensive.
  • 23. Tanh Function • The activation that works almost always better than the sigmoid function is the Tanh function, also known as the hyperbolic tangent function. It is actually a mathematically shifted and scaled version of the sigmoid function; both are similar and can be derived from each other. • Equation : tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x)) = 2*sigmoid(2x) − 1 • Value Range :- -1 to +1 • Derivative: f’(x) = 1 − a², where a = tanh(x) • Nature :- non-linear • Uses :- Usually used in hidden layers of a neural network, as its values lie between -1 and 1, so the mean of the hidden-layer activations comes out to be 0 or very close to it. This helps center the data by bringing the mean close to 0, which makes learning for the next layer much easier.
  • 24. Tanh Function Advantages: 1. Zero centered — unlike the sigmoid function, the tanh function outputs values between −1 and 1. This helps the neural network model inputs with strong negative, neutral, and strong positive values more effectively, leading to faster convergence. 2. The function is monotonic (it consistently increases), which simplifies learning; note that its derivative is not monotonic. 3. It generally works better than the sigmoid function. Disadvantage: 1. It also suffers from the vanishing gradient problem and hence slow convergence.
  • 25. ReLU Function •It stands for Rectified Linear Unit. It is the most widely used activation function, chiefly implemented in the hidden layers of a neural network. •Equation :- A(x) = max(0, x). It gives an output of x if x is positive and 0 otherwise. •Value Range :- [0, inf) •Nature :- non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons being activated by the ReLU function. •Uses :- ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. At any time only some of the neurons are activated (those with positive inputs), which makes the network sparse and therefore efficient and easy to compute. In simple words, ReLU learns much faster than the sigmoid and tanh functions.
  • 26. Softmax Function ● The softmax activation function is commonly used in the output layer of a neural network when performing multiclass classification. ● Nature :- non-linear ● Equation :- softmax(x_i) = exp(x_i) / Σ_j exp(x_j) ● Uses :- Usually used when handling multiple classes; the softmax function is commonly found in the output layer of image classification problems. ● The softmax function is particularly useful in multiclass classification tasks, where the goal is to predict the probability of each possible class for a given input.
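An illustrative usage sketch (the logit values are made up): softmax turns the raw output-layer scores for three classes into a probability distribution, and the predicted class is the one with the highest probability.

import numpy as np

logits = np.array([2.0, 1.0, 0.1])          # raw scores from the output neurons
probs = np.exp(logits - logits.max())       # subtract the max for numerical stability
probs /= probs.sum()
print(probs, probs.sum())                   # approx. [0.659 0.242 0.099], sums to 1
print("predicted class:", int(np.argmax(probs)))   # class 0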
  • 27. Activation function ● Sigmoid functions and their combinations generally work better in the case of classification problems. ● Sigmoid and tanh functions are sometimes avoided due to the vanishing gradient problem. ● The ReLU activation function is widely used and is the default choice, as it generally yields better results. ● The ReLU function should only be used in the hidden layers. ● The output layer can use a linear activation function in the case of regression problems.
  • 28. Activation function ● The basic rule of thumb is that if you really don’t know which activation function to use, simply use ReLU, as it is a general-purpose activation function for hidden layers and is used in most cases these days. ● If your output is for binary classification, then the sigmoid function is a very natural choice for the output layer. ● If your output is for multi-class classification, then softmax is very useful for predicting the probability of each class.
  • 29. What is the Perceptron model in Machine Learning? The Perceptron is a Machine Learning algorithm for supervised learning of binary classification tasks. A Perceptron can also be understood as an artificial neuron, or neural network unit, that performs computations on the input data to detect features. The Perceptron model is treated as one of the best and simplest types of artificial neural networks; it is a supervised learning algorithm for binary classifiers. Hence, we can consider it a single-layer neural network with four main parameters: input values, weights and bias, net sum, and an activation function.
  • 30. What is Binary classifier in Machine Learning? A binary classifier is a model used to categorize data into two distinct classes (e.g., Yes/No, 1/-1, True/False). A binary classifier predicts which of two classes a given input belongs to: ● Positive class: Often labeled as 1. ● Negative class: Often labeled as −1 (or 0, depending on convention).
  • 31. Basic Components of Perceptron
  • 32. Basic Components of Perceptron ○ Input Nodes or Input Layer: This is the primary component of the Perceptron, which accepts the initial data into the system for further processing. Each input node contains a real numerical value. ○ Weight and Bias: The weight parameter represents the strength of the connection between units and is another of the most important parameters of the Perceptron. A weight is directly proportional to how strongly the associated input neuron influences the output. The bias can be thought of as the intercept in a linear equation.
  • 33. Basic Components of Perceptron Activation Function: This is the final and most important component; it determines whether the neuron will fire or not. In the Perceptron, the activation function can be considered primarily as a step function; the common choices (step, sign, and sigmoid) are described in the Activation Functions of Perceptron slide below.
  • 35. How does Perceptron work? In Machine Learning, the Perceptron is considered a single-layer neural network that consists of four main parameters: input values (input nodes), weights and bias, net sum, and an activation function. The Perceptron model begins by multiplying all input values by their weights and adding these products together to create the weighted sum. The activation function 'f' is then applied to this weighted sum to obtain the desired output. This activation function is also known as the step function and is represented by 'f'. This step function or activation function plays a vital role in ensuring that the output is mapped between the required values, (0,1) or (-1,1). It is important to note that the weight of an input is indicative of the strength of a node. Similarly, the bias value gives the ability to shift the activation function curve up or down.
  • 36. How does Perceptron work? Step-1 In the first step, multiply all input values by their corresponding weight values and then add them to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows: ∑wi*xi = x1*w1 + x2*w2 + … + xn*wn Add a special term called the bias 'b' to this weighted sum to improve the model's performance: ∑wi*xi + b Step-2 In the second step, an activation function is applied to the above weighted sum, which gives us an output either in binary form or as a continuous value, as follows: Y = f(∑wi*xi + b)
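A minimal sketch of these two steps (the input, weight, and bias values are illustrative assumptions):

import numpy as np

x = np.array([1.0, 0.0, 1.0])        # input values x1..xn
w = np.array([0.4, -0.2, 0.7])       # corresponding weights w1..wn
b = -0.5                             # bias term

weighted_sum = np.dot(w, x) + b      # Step 1: sum(wi*xi) + b
y = 1 if weighted_sum > 0 else 0     # Step 2: step activation f
print(weighted_sum, y)               # 0.6 -> the neuron fires, output 1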
  • 37. Single Layer Perceptron A single-layer perceptron is a type of artificial neural network that consists of only one layer of artificial neurons. It is the simplest type of neural network and was proposed by Frank Rosenblatt in 1958. The single-layer perceptron has been used in various applications, including pattern recognition, binary classification, control systems, medical diagnosis, and financial forecasting.
  • 38. The perceptron consists of 4 parts: Input value or One input layer: The input layer of the perceptron is made of artificial input neurons and takes the initial data into the system for further processing. Weights and Bias: Weight: It represents the strength of the connection between units. Bias: It is the same as the intercept added in a linear equation; the bias is a tunable parameter in neural networks that can help improve the accuracy and flexibility of the model by allowing it to learn more complex decision boundaries. Net sum: It calculates the total weighted sum. Activation Function: Whether a neuron is activated or not is determined by the activation function, which is applied to the weighted sum plus the bias to give the result.
  • 39. The Perceptron Learning Rule 1. Initialize the weights: Start with random weights for each input. 2. Input the training data: Feed the features into the perceptron and calculate the output. 3. Calculate the error: Compare the predicted output with the desired output to calculate the error. 4. Update the weights: Adjust the weights of the inputs based on the error. If the predicted output is less than the desired output, increase the weights of the inputs; if the predicted output is greater than the desired output, decrease them. The magnitude of the weight adjustment is proportional to the error and the input value. 5. Repeat: Repeat steps 2 to 4 until the error is minimized or a maximum number of iterations is reached.
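A minimal sketch of this learning rule applied to the AND function (the learning rate, epoch count, and random seed are arbitrary choices, not from the slides):

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # training inputs
t = np.array([0, 0, 0, 1])                       # desired outputs (logical AND)

rng = np.random.default_rng(1)
w = rng.normal(size=2)            # 1. start with random weights
b = 0.0
lr = 0.1                          # learning rate

for epoch in range(20):           # 5. repeat until the error is minimized
    for xi, ti in zip(X, t):
        y = 1 if np.dot(w, xi) + b > 0 else 0    # 2. compute the output
        error = ti - y                           # 3. compare with the desired output
        w = w + lr * error * xi                  # 4. adjust weights by error * input
        b = b + lr * error

print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])   # learned AND: [0, 0, 0, 1]

Because AND is linearly separable, the perceptron learning rule is guaranteed to find a separating line in a finite number of updates.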
  • 41. Perceptron Function The Perceptron is a function that maps its input vector “x”, multiplied by the learned weight coefficients, to an output value “f(x)”: f(x) = 1 if w · x + b > 0, and 0 otherwise, where w · x = ∑ wi*xi for i = 1..m. In this equation: “w” = vector of real-valued weights “b” = bias (an element that adjusts the decision boundary away from the origin without any dependence on the input value) “x” = vector of input values “m” = number of inputs to the Perceptron The output can be represented as “1” or “0.” It can also be represented as “1” or “-1” depending on which activation function is used.
  • 42. Activation Functions of Perceptron The activation function applies a step rule (converting the numerical output into +1 or -1) to check whether the output of the weighting function is greater than zero or not. For example: If ∑ wi*xi > 0 => final output “o” = 1 (issue bank loan) Else, final output “o” = -1 (deny bank loan) The step function is triggered above a certain value of the neuron output; otherwise it outputs zero. The sign function outputs +1 or -1 depending on whether the neuron output is greater than zero or not. The sigmoid is the S-curve and outputs a value between 0 and 1.
  • 43. Feedforward Neural Networks (FFNN) A feedforward neural network (FFNN) is a type of artificial neural network where the information flows in one direction only, from the input layer through one or more hidden layers to the output layer. The output of each layer is connected to the input of the next layer, and the weights and biases of the connections are learned during the training process. FFNNs are commonly used for tasks such as classification, control systems (for example, robotics), and pattern recognition. They can be trained using supervised learning, where the training data consists of input-output pairs, and the network learns to map inputs to outputs. The weights and biases of the network are updated during the training process using backpropagation, which is an algorithm that computes the gradients of the loss function with respect to the weights and biases.
  • 44. How FFNN works The input layer of an FFNN takes in the input data, which is usually in the form of a vector, and passes it through a series of hidden layers, each consisting of a set of neurons. Each neuron in a hidden layer takes in the weighted sum of the outputs from the previous layer, adds a bias term, and applies an activation function to produce an output that is passed to the next layer. The output layer produces the final output of the network, which is usually a prediction or a classification.
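A minimal sketch of one forward pass with a single hidden layer (the layer sizes and the ReLU/softmax choices are illustrative assumptions):

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=4)                            # input vector

W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)     # input -> hidden weights and biases
W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)     # hidden -> output weights and biases

h = relu(W1 @ x + b1)         # hidden layer: weighted sum + bias, then activation
y = softmax(W2 @ h + b2)      # output layer: probabilities over 3 classes
print(y, y.sum())             # the probabilities sum to 1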
  • 46. A Multi-Layer Perceptron (MLP) A Multi-Layer Perceptron (MLP) is a type of neural network that consists of multiple layers of artificial neurons. MLPs are also known as feedforward neural networks. The architecture of an MLP consists of an input layer, one or more hidden layers, and an output layer. Each layer is composed of multiple artificial neurons that compute a weighted sum of the input signals and apply an activation function to produce an output signal.
  • 47. A Multi-Layer Perceptron (MLP) The hidden layers in an MLP are responsible for extracting features from the input data and transforming them into a format that is suitable for the output layer. The output layer produces the final output of the network, which can be binary or continuous. The learning process of an MLP involves adjusting the weights of the input signals using backpropagation. Backpropagation allows the MLP to learn from the training data and improve its performance over time.
  • 49. Compare single layer and multilayer perceptron model Architecture: Single-layer perceptrons have only one layer of neurons that directly connects to the input data, whereas multilayer perceptrons consist of multiple layers of neurons, including one or more hidden layers that lie between the input and output layers. Capabilities: Single-layer perceptrons are limited to linearly separable problems, meaning they can only learn and classify data that can be separated by a single straight line. In contrast, multilayer perceptrons can learn and classify non-linearly separable problems by using hidden layers to transform the input data into a more complex feature space that can be separated by the output layer.
  • 50. Compare single layer and multilayer perceptron model Training: Single-layer perceptrons use a simple learning rule called the Perceptron Learning Algorithm, which adjusts the weights of the input signals to minimize the error between the predicted and actual output. In contrast, multilayer perceptrons use a more complex learning algorithm called backpropagation, which iteratively adjusts the weights of all the neurons in the network to minimize the error between the predicted and actual output. Applications: Single-layer perceptrons are typically used for simple binary classification problems, such as predicting whether an email is spam or not. Multilayer perceptrons are more powerful and can be used for a wide range of applications, including image and speech recognition, natural language processing, and financial forecasting.
  • 51. Back Propagation networks Backpropagation is a supervised learning algorithm used for training neural networks. The basic structure of a backpropagation network consists of an input layer, one or more hidden layers, and an output layer. Each layer is composed of one or more neurons, which receive inputs, process them, and pass the outputs to the next layer. The connections between the neurons are weighted, and these weights are adjusted during training to improve the accuracy of the network's predictions.
  • 52. Back Propagation networks During the training process, the network is fed a set of input-output pairs, and the output of the network is compared to the desired output. The error between the actual output and the desired output is then backpropagated through the network, and the weights are adjusted to reduce the error. This process is repeated many times, with the aim that the network will eventually converge to a set of weights that produces accurate predictions for new input data.
  • 54. How Backpropagation Algorithm Works: 1. Inputs X arrive through the preconnected path. 2. The input is modeled using real weights W; the weights are usually selected randomly. 3. Calculate the output of every neuron, from the input layer, through the hidden layers, to the output layer. 4. Calculate the error in the outputs: Error = Actual Output – Desired Output 5. Travel back from the output layer to the hidden layer to adjust the weights so that the error is decreased.
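A rough sketch of these steps for a single training example and one hidden layer (the sigmoid activations, squared-error loss, learning rate, and layer sizes are assumptions, not taken from the slides):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])                        # 1. inputs arrive
t = np.array([1.0])                              #    desired output
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)    # 2. weights selected randomly
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)
lr = 0.5

for step in range(100):
    h = sigmoid(W1 @ x + b1)                     # 3. outputs of the hidden neurons
    y = sigmoid(W2 @ h + b2)                     #    output of the output neuron

    error = y - t                                # 4. error in the output

    # 5. travel back: gradients of the squared error with respect to each weight
    delta_out = error * y * (1 - y)                  # output-layer delta
    delta_hid = (W2.T @ delta_out) * h * (1 - h)     # hidden-layer delta
    W2 -= lr * np.outer(delta_out, h); b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x); b1 -= lr * delta_hid

print(y)   # the output approaches the desired value 1.0 as the weights are adjusted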
  • 55. Why We Need Backpropagation? ● Backpropagation is fast, simple and easy to program ● It has no parameters to tune apart from the number of inputs ● It is a flexible method, as it does not require prior knowledge about the network ● It is a standard method that generally works well ● It does not need any special mention of the features of the function to be learned.
  • 56. Concept of Linear Separability ● Linear separability is a concept in mathematics and particularly in machine learning. ● Imagine you have some points scattered around on a piece of paper, and you want to draw a straight line to separate them into two groups. ● If you can draw such a line where all the points of one group are on one side of the line, and all the points of the other group are on the other side, then those points are said to be linearly separable.
  • 57. Concept of Linear Separability ● For example, let's say you have red and blue dots on a sheet of paper, and you want to separate them with a straight line. If you can draw a line in such a way that all the red dots are on one side and all the blue dots are on the other side, then those dots are linearly separable. ● Linear separability is important in machine learning because it means that the data is easy to classify using a simple algorithm like a linear classifier. If data is not linearly separable, more complex methods may be needed to classify it accurately.
  • 58. Concept of Linear Separability ● Linear separability is an important concept in neural networks. If the points belonging to the two classes in n-dimensional space can be separated by a hyperplane (a straight line in two dimensions), they are said to be linearly separable. ● For two-dimensional inputs, if there exists a line (whose equation is w1*x1 + w2*x2 + b = 0) that separates all samples of one class from the other class, then an appropriate perceptron can be derived from the equation of the separating line. Such classification problems are called “linearly separable”, i.e., the classes can be separated by a linear combination of the inputs.
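An illustrative check in code: the AND function is linearly separable (a single line w1*x1 + w2*x2 + b = 0 splits the two classes), while XOR is not, which is why a single-layer perceptron cannot learn XOR. The chosen line below is one example, not a unique answer.

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
and_labels = np.array([0, 0, 0, 1])
xor_labels = np.array([0, 1, 1, 0])

# One line that separates the AND classes: x1 + x2 - 1.5 = 0
w, b = np.array([1.0, 1.0]), -1.5
print((X @ w + b > 0).astype(int))   # [0 0 0 1] matches and_labels, so AND is separable

# For XOR, the points (0,1) and (1,0) would have to fall on one side of a line
# and (0,0), (1,1) on the other; no choice of w and b achieves this, so XOR is
# not linearly separable and needs a hidden layer.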
  • 59. Character Recognition Application Character recognition is a common application of neural networks, and can be achieved using various types of neural networks, including feedforward neural networks, convolutional neural networks, and recurrent neural networks. The network must be trained on a dataset of labeled character images in order to learn to recognize characters. During training, the network adjusts its weights based on the difference between its predicted output and the true label of the input image. Once the network is trained, it can be used to make predictions on new, unlabeled character images.
  • 60. OCR (Optical Character Recognition) OCR is a technology that analyzes the text on a page and turns the letters into character codes that a computer can process. OCR is a technique for detecting printed or handwritten text characters inside digital images of paper files, such as scanned paper records. OCR systems are hardware and software systems that turn physical documents into machine-readable text. These digital versions can be highly beneficial to children and young adults who struggle to read. An essential application of OCR is to convert hard-copy legal or historical documents into PDFs.
  • 62. How OCR works? 1. Image Pre-Processing ● Size normalization: This step ensures that all images are of the same size for consistency. We use a method called bicubic interpolation to resize images to a standard size. ● Binarization: Here, we convert grayscale images to binary images by setting a threshold. Pixels above the threshold become white, while those below become black. This helps in simplifying the image for further processing. ● Smoothing: To make the edges of objects in the image smoother, we use erosion and dilation techniques. This helps in reducing noise and making the objects clearer.
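A rough sketch of these pre-processing steps using Pillow, NumPy, and SciPy (the file name 'digit.png', the 32x32 target size, and the threshold of 128 are illustrative assumptions, not values from the slides):

import numpy as np
from PIL import Image
from scipy import ndimage

img = Image.open("digit.png").convert("L")         # load the character image as grayscale

# Size normalization: resize every image to the same size using bicubic interpolation
img = img.resize((32, 32), Image.BICUBIC)

# Binarization: pixels above the threshold become white (1), the rest black (0)
arr = np.asarray(img)
binary = arr > 128

# Smoothing: erosion followed by dilation removes small specks of noise
smoothed = ndimage.binary_dilation(ndimage.binary_erosion(binary))
print(smoothed.shape)                              # (32, 32) binary image, ready for recognition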
  • 63. How OCR works? Text recognition: Once the image is pre-processed, we can start recognizing text. There are two main methods for this: ● Pattern matching: This works well for typed documents with known fonts. It compares parts of the image with patterns of characters it knows. ● Feature extraction: This method looks at specific features of characters, like lines and curves, to identify them.
  • 64. How OCR works? Postprocessing: After recognizing the text, the system converts it into a digital format. Some systems also create annotated PDF files, which show both the original scanned document and the recognized text.
  • 65. Immunological computing Immunological computing is like using the principles of our immune system to teach computers how to recognize patterns, make decisions, and solve problems effectively. It's a fascinating area of research that draws inspiration from nature to develop smarter algorithms and systems. Immune System Basics: Our immune system is like a defense force in our body that helps to keep us healthy. It can recognize harmful invaders, like bacteria or viruses, and fight them off to keep us safe. It does this by identifying foreign substances called antigens and producing antibodies to neutralize them.
  • 66. Immunological computing How Immunological Computing Works: In immunological computing, we mimic the behavior of the immune system to solve computational problems. Just like our immune system learns to recognize and respond to threats, in immunological computing, algorithms learn to recognize patterns in data and make decisions based on them. Instead of antigens and antibodies, we use concepts like "patterns" and "rules". The algorithms adapt and improve over time, similar to how our immune system builds immunity to diseases.
  • 67. Immunological computing Applications: Immunological computing can be used in various fields such as data mining, pattern recognition, and optimization. For example, in anomaly detection, it can help identify unusual patterns in data that may indicate fraud or errors. In optimization problems, it can be used to find the best solution among many possibilities, similar to how our immune system finds the best response to different threats.
  • 68. Stochastic Gradient Descent Gradient Descent (GD): Imagine you are blindfolded on a hill and want to find the lowest point without any help. You feel the slope under your feet and take small steps downhill. This is like Gradient Descent where you iteratively move towards the minimum of a function by following the direction of steepest descent. Stochastic Gradient Descent (SGD): Now, let's add a twist. Instead of relying on the slope at your current location alone, you randomly pick a spot on the hill, feel the slope there, and take a step. Sometimes this spot might be flat or even uphill, but over many such random steps, you tend to move towards the bottom of the hill. This randomness helps in escaping local minima and can be faster than regular Gradient Descent, especially for large datasets.
  • 69. Stochastic Gradient Descent Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent algorithm that is used for optimizing machine learning models. In SGD, instead of using the entire dataset for each iteration, only a single random training example (or a small batch) is selected to calculate the gradient and update the model parameters. This random selection introduces randomness into the optimization process, hence the term “stochastic” in Stochastic Gradient Descent. The advantage of using SGD is its computational efficiency, especially when dealing with large datasets. By using a single example or a small batch, the computational cost per iteration is significantly reduced compared to traditional Gradient Descent, which requires processing the entire dataset.
  • 70. Stochastic Gradient Descent How it works (a minimal sketch follows): ● Start with an initial guess for the minimum point. ● Randomly shuffle your dataset. ● For each data point in the shuffled dataset: ○ Compute the gradient of the loss function at that point. ○ Update your guess for the minimum point by taking a small step in the direction of steepest descent (i.e., opposite to the gradient). ● Repeat this process for a fixed number of iterations or until the improvement becomes very small.
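A minimal sketch of these steps applied to fitting a straight line y = w*x + b with squared error (the synthetic data, learning rate, and epoch count are made up for illustration):

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 3.0 * x + 1.0 + 0.1 * rng.normal(size=200)    # noisy targets around the line y = 3x + 1

w, b = 0.0, 0.0            # initial guess for the minimum point
lr = 0.1                   # step size

for epoch in range(20):
    order = rng.permutation(len(x))               # randomly shuffle the dataset
    for i in order:                               # one training example at a time
        pred = w * x[i] + b
        grad_w = 2 * (pred - y[i]) * x[i]         # gradient of the squared error
        grad_b = 2 * (pred - y[i])
        w -= lr * grad_w                          # small step against the gradient
        b -= lr * grad_b

print(w, b)   # close to the true values 3.0 and 1.0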

Editor's Notes

  • #9: Respond to environmental changes: By learning from past experiences, organisms can adjust their behavior to changing environmental conditions, such as changes in temperature, availability of resources, or presence of predators. Improve their performance: Organisms can improve their ability to perform various tasks, such as finding food or avoiding predators, through trial-and-error learning or observational learning. Develop new behaviors: Through learning, organisms can develop new behaviors that allow them to exploit new resources or adapt to new environmental challenges. Enhance communication: Learning can also improve communication between individuals within a species, allowing for the transmission of knowledge and cultural traditions.