
Neural Network Basics for the Average Joe

A Neural Network (NNW) or Artificial Neural Network (ANN) is a mathematical system inspired by the way a biological
nervous system works. NNWs are used to solve simple and complex real-world problems such as pattern recognition
(handwriting and facial recognition) and data classification.

In this post, we will be looking at Feedforward Neural Networks. I have tried to break up the components and theory of
FFNNs into simple, easy-to-read categories to help the average Joe … like me, understand it better.

Table of Contents:

1. Basic Layout of a FFNN
2. What a NNW Does (In Layman's Terms)
3. What a NNW Does (Mathematical Explanation)

1. Basic Layout of a FFNN
The below image shows the basic layout of a Feedforward Neural Network (FFNN) with Input, Hidden and Output Layers.
Each hollow circle represents a neuron and the lines connecting each hollow circle are called weights.

Generally, you only have one input layer and one output layer. The number of hidden layers can vary anywhere from
zero to however many are needed for a specific application.

A NNW can have any number of input, hidden and output layer neurons. It is important to note that neurons in
adjacent layers are connected to each other by some weight value w.

The image below is a generic layout of a FFNN representing $i$ input neurons, $k$ hidden layers each having $m$
neurons, and an output layer with $j$ output neurons.

Note: I generally assign the same number of neurons to each hidden layer; you do not have to do
this. I do it because it makes the coding a lot easier. (See other posts for coding in Excel.)

The image below is a clearer illustration highlighting the different layers and the interconnected weights between
adjacent layer neurons.

Each weight connecting the neurons together represents a scalar value (some real number). Before a NNW is trained,
these weight values are randomly assigned (usually between -1 and 1).

Each neuron will also take on a specific real-number value when some input is applied to the input layer; the difference,
though, is that all hidden layer and output layer neuron values are calculated by means of an activation function. The
input to this activation function is called the net signal.

Input layer neurons have no activation function, as the input values (which must be real numbers) are supplied
to the network by the user.

If this does not make much sense … don’t worry, these concepts are explained in detail with illustrations in the sections
to come.

2. What a NNW Does (In Layman’s Terms)
General Explanation

A neural network is a function that receives inputs in the form of numbers and returns output value(s) between zero and
one, based on the interaction of the weighted connections between neurons and the resultant values of the neurons
themselves.

For the neural network to give any meaningful result, it first has to be "trained". To train the neural network, you
provide it with a bunch of inputs with known output values, see what the network outputs, and then, based on how far the
network is from giving the correct answer, alter the weights between each layer until the difference or error between
the network's output and the desired output is acceptable.

To put this explanation into a context that will help make sense of all this, consider the following example:

Let us say you have an image consisting of four blocks (or pixels); of the four blocks, three are black and one is green, as
shown below.

Of the four blocks, only one can be green, and the green block can be moved to the top left, top right, bottom left or
bottom right, as shown below.

From the perspective of a human, you can look at any one of the images above and tell me exactly where the green
block is, whether it is on the top or bottom, and you can do this quite easily.

In this example, you will be able to see how a neural network can be used to determine if the green block is on the top
of the image, or the bottom of the image.

So, in summary we would like the NNW to perform something like this:

In the image above, we see that an input image is supplied to the NNW, which then causes the NNW to output a result
indicating either ‘Top’ or ‘Bottom’.

The input layer in this case would have four neurons, one for each pixel. The output layer would only have two neurons,
one to indicate top and another to indicate bottom as shown below.

The Input - The image above shows what the input would look like for an image where the green square is at the top left-hand
side. Input 2 would be the top RHS pixel, Input 3 the bottom LHS pixel and Input 4 the bottom RHS pixel.

Hidden Layer - To keep the illustration simple, I have shown a single hidden layer having two neurons.

Output – If the NNW were trained, Output 1 would be high, having a value close to one, and Output 2 would be low,
having a value close to zero.

Training the NNW requires feedforward propagation, error calculation and backpropagation methodologies. These terms are discussed in detail
later in the post; here is a brief explanation of each:

Feedforward Propagation

The process whereby the values of the output neurons are calculated. This is done by first determining the values of
each hidden layer neuron, starting from the first hidden layer and working through to the last. The first hidden layer neuron
values are determined by adding up the products of each input neuron value and its weight, then adding some bias value
θ and inserting the result into an activation function.

For example, the value of the first neuron in hidden layer 1 is calculated as follows:

$h_{11} = f(z_{11}) = f\big((i_1 \times w_{11}) + (i_2 \times w_{21}) + (i_3 \times w_{31}) + (i_4 \times w_{41}) + \theta\big)$

Where $f(z)$ is the activation function applied to the net signal $z_{11}$, $w_{11}$ is the weight value between input neuron $i_1$ and
neuron $h_{11}$, $w_{21}$ the weight value between input neuron $i_2$ and neuron $h_{11}$, and so on.

Similarly, $o_1$ would be calculated as follows:

$o_1 = f(z) = f\big((h_{11} \times w_{11}) + (h_{12} \times w_{21}) + \theta\big)$

Where $f(z)$ is the activation function applied to net signal $z$, $w_{11}$ is the weight value between neuron $h_{11}$ and neuron $o_1$,
and $w_{21}$ the weight value between neuron $h_{12}$ and neuron $o_1$.
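As a minimal sketch of these two formulas in Python, assuming a sigmoid activation function (introduced in section 3) and made-up input, weight and bias values:

```python
import math

def sigmoid(z):
    # Example activation function; squashes z into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Made-up values purely for illustration
i = [1.0, 0.0, 0.0, 0.0]      # input neuron values i1..i4
w_in = [0.5, -0.3, 0.8, 0.1]  # weights w11, w21, w31, w41 into h11
theta = 0.2                   # bias value

# h11 = f(z11) = f((i1*w11) + (i2*w21) + (i3*w31) + (i4*w41) + theta)
z11 = sum(ix * wx for ix, wx in zip(i, w_in)) + theta
h11 = sigmoid(z11)

# o1 = f((h11*w11) + (h12*w21) + theta), with a pretend value for h12
h12 = 0.6
o1 = sigmoid(h11 * 0.7 + h12 * (-0.4) + theta)
print(h11, o1)
```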

Error Calculation

The output error of a NNW is extremely important when it comes to training the NNW. This error is determined by
calculating the difference between the actual NNW output values and the correct output values for a given training data
set.

Error of Neuron o1 = Correct Output of Neuron o1 – Actual Output of Neuron o1

So if neuron o1 was supposed to be 1.0 and the actual output was 0.52, the error would be 0.48, or 48%.

Back Propagation

Backpropagation is the process of using the output error of each output neuron to determine how much each weight
value and bias value needs to be adjusted to reduce the overall error of each output neuron. In my opinion, this is the
most difficult part of any NNW.

Training Overview

When training the NNW, each time the network goes through one cycle of feedforward and backpropagation
computations, the output error between the actual result and the desired result should get smaller. Taking the NNW
through a number of cycles with training data will eventually result in an acceptable level of error, after which the NNW
will be sufficiently 'trained' and able to output reliable results.
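As a deliberately tiny, runnable Python toy of this train-until-the-error-is-acceptable cycle, using a single weight and a single training pair (all the numbers are made up; a real NNW repeats the same loop over every weight and bias):

```python
weight = 0.1            # randomly chosen starting weight
learning_rate = 0.5
x, desired = 1.0, 0.8   # one training input with its known correct output

for cycle in range(1000):
    actual = weight * x                  # 'feedforward' for this tiny model
    error = desired - actual             # error calculation
    if abs(error) < 0.001:               # acceptable error level reached
        break
    weight += learning_rate * error * x  # weight adjustment ('backpropagation')

print(cycle, weight)                     # the weight converges towards 0.8
```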

3. What a NNW Does (Mathematical Explanation)
To understand how the mathematical formulas work, I find it helps to have a detailed understanding of the graphical
layout of the NNW and the notation assigned to each layer, neuron and weight. If you have a good understanding of
these elements, the math will seem relatively simple.

Consider the FFNN layout shown below, with $i$ input neurons, $k$ hidden layers (each with $m$ neurons) and $j$ output
neurons. Note that only the weight values used in the formulas are shown in the image below; the rest were removed to
make it easier to follow.

To calculate the net input signals to the various neurons I have divided the network up into three sections: The first
hidden layer neurons, the remaining hidden layer neurons and the output layer neurons.

Why do I do this? … Because it simplifies the coding necessary to implement a FFNN in Excel VBA.

Feedforward Propagation Equations

Finding the Net Input Signals

The net signal for any neuron in the first hidden layer:

$H_x^1 \;\text{or}\; h_{1x} = \sum_{X=1}^{i} \left( i_X \cdot w_{Xx} \right) + \theta_x^1$

Where $H_x^1$ is the net signal of neuron $x$ in the first hidden layer, and $w_{Xx}$ is weight number $X$ connected to hidden layer neuron number $x$.

The net signal for any hidden layer neuron after the first layer:

$H_x^y \;\text{or}\; h_{yx} = \sum_{X=1}^{m} \left( f(H_X^{y-1}) \cdot w_{Xx} \right) + \theta_x^y$

Where $H_x^y$ is neuron $x$ in the $y$th hidden layer, $w_{Xx}$ is weight number $X$ connected to hidden layer neuron number $x$,
and $f(H_X^{y-1})$ is the value of some activation function applied to the net input signal of neuron number $X$ in the hidden
layer behind hidden layer $y$.

The net signal for any output neuron:

$Out_x = \sum_{X=1}^{m} \left( f(H_X^k) \cdot w_{Xx} \right) + \theta_x$

Where $Out_x$ is output neuron number $x$, $f(H_X^k)$ is the value of some activation function applied to the net
input signal of neuron number $X$ in the last hidden layer $k$, and $w_{Xx}$ is weight number $X$ connected to output neuron
number $x$.

Activation Equation

An activation function is applied to each layer of neurons, excluding the input layer. Below is an example of the sigmoid
activation function and how it is applied to the neurons:

$f(z) = \dfrac{1}{1 + e^{-z}}$

Where $z$ = the net input signal $H_x^1$ for all neurons in the first hidden layer, $z = H_x^y$ for all neurons in all other hidden
layers, and $z = Out_x$ for all neurons in the output layer.
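Putting the net signal and activation equations together, here is a minimal Python sketch of a full feedforward pass. The layer sizes match the pixel example from section 2, and the random weight range of -1 to 1 follows section 1; everything else is an illustrative assumption:

```python
import math
import random

def sigmoid(z):
    # f(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def random_layer(n_in, n_out):
    # One list of incoming weights per neuron, plus one bias per neuron,
    # all randomly assigned between -1 and 1 before training
    weights = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [random.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def feedforward(inputs, layers):
    values = inputs
    for weights, biases in layers:
        # Net signal = sum of (previous value * weight) + bias, then activate
        values = [sigmoid(sum(v * w for v, w in zip(values, ws)) + b)
                  for ws, b in zip(weights, biases)]
    return values

# 4 input neurons -> 2 hidden neurons -> 2 output neurons
layers = [random_layer(4, 2), random_layer(2, 2)]
print(feedforward([1.0, 0.0, 0.0, 0.0], layers))  # two values in (0, 1)
```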

Output Error

The output error per neuron of a FFNN is determined by calculating the difference between the actual NNW output
values and the correct output values for a given training data set.

In a stochastic learning method, the output error of each output neuron, per training cycle, is used to update the weight
values through the NNW.

$\delta_x = y_x - o_x$

Where $\delta_x$ is the output error of output neuron $x$, $y_x$ is the correct/desired output value for output neuron $x$ and $o_x$ is
the actual NNW output for output neuron $x$.

The overall error of the NNW is found using the sum squared error (SSE) as shown below:

$\text{Overall Error per Training Input} = \dfrac{1}{2} \sum_{X=1}^{j} \left( y_X - o_X \right)^2$

Where j is the number of output neurons.

This error value is useful, as it is a good indicator of how well the NNW is trained and could thus be used as a stopping
condition when training the NNW, once an acceptable error level has been reached.
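A minimal Python sketch of both error calculations, using made-up output values for the pixel example:

```python
desired = [1.0, 0.0]   # y values: 'Top' neuron should fire, 'Bottom' should not
actual = [0.52, 0.31]  # made-up o values from a partly trained network

# Per-neuron error: delta_x = y_x - o_x
deltas = [y - o for y, o in zip(desired, actual)]

# Overall error per training input: 0.5 * sum of squared per-neuron errors
overall = 0.5 * sum(d ** 2 for d in deltas)

print(deltas)   # [0.48, -0.31]
print(overall)  # 0.5 * (0.48^2 + (-0.31)^2) = 0.16325
```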

Backpropagation

Once you know the output error for every output neuron, these values can then be used to adjust the individual
weight values and biases throughout the NNW. This is known as backpropagation.

To demonstrate how backpropagation works, consider the NNW below, which has 4 input neurons, 2 hidden layers each
having 3 neurons and an output layer with 2 neurons:

I have used different notation for the weights between each layer to make the explanation and equations easier to
follow. All weights between the input and first hidden layer are noted with the u symbol, weights between hidden layer
1 and 2 are noted with the v symbol and weights between hidden layer 2 and the output layer are noted with w.

Notation

The backpropagation process starts with the calculated output error of each output neuron. Let us say that the output
error of output neuron 1 is dOut1 and the output error of output neuron 2 is dOut2.

Using these two error values, error values for every hidden layer neuron will be calculated. So let us assign notation to
the hidden layer neurons. Hidden layer 1 neuron error values will be noted as dh11, dh12 and dh13. Hidden layer 2
neuron error values will be noted as dh21, dh22 and dh23.

Calculating the Error Values

Consider the image below:

Starting with the output errors, we then calculate the error of each neuron in the last hidden layer:

The error of the first neuron is calculated as follows: (see blue highlights in image)

$dh_{21} = dOut_1 \cdot w_{11} + dOut_2 \cdot w_{12}$


Similarly, the error of the other two neurons is found as follows:

$dh_{22} = dOut_1 \cdot w_{21} + dOut_2 \cdot w_{22}$

$dh_{23} = dOut_1 \cdot w_{31} + dOut_2 \cdot w_{32}$

Now that we have the error of each neuron in the last hidden layer, we can calculate the error in the first hidden
layer (or the second-to-last hidden layer, if you have more than two hidden layers).

The error of the first neuron is calculated as follows: (see blue highlights in image)

$dh_{11} = dh_{21} \cdot v_{11} + dh_{22} \cdot v_{12} + dh_{23} \cdot v_{13}$

Similarly, the error of the other two neurons is found as follows:

$dh_{12} = dh_{21} \cdot v_{21} + dh_{22} \cdot v_{22} + dh_{23} \cdot v_{23}$

$dh_{13} = dh_{21} \cdot v_{31} + dh_{22} \cdot v_{32} + dh_{23} \cdot v_{33}$

Note that no error values need to be calculated for the input neurons.
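Here is a minimal Python sketch of this error propagation for the 4-3-3-2 example network. The dOut values and weights are made up; only the structure of the calculation matters:

```python
d_out = [0.48, -0.31]   # dOut1, dOut2 (made-up output errors)

# w[x][o]: weight from hidden layer 2 neuron x+1 to output neuron o+1
w = [[0.5, -0.2],
     [0.1, 0.4],
     [-0.3, 0.6]]

# v[x][k]: weight from hidden layer 1 neuron x+1 to hidden layer 2 neuron k+1
v = [[0.2, -0.1, 0.5],
     [0.7, 0.3, -0.4],
     [0.1, 0.9, 0.2]]

# dh2x = dOut1*wx1 + dOut2*wx2 for each neuron in the last hidden layer
dh2 = [sum(d_out[o] * w[x][o] for o in range(2)) for x in range(3)]

# dh1x = dh21*vx1 + dh22*vx2 + dh23*vx3 for each neuron in hidden layer 1
dh1 = [sum(dh2[k] * v[x][k] for k in range(3)) for x in range(3)]

print(dh2)  # [dh21, dh22, dh23]
print(dh1)  # [dh11, dh12, dh13]
```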

Calculating Changes in Weights and Biases

Now that we have the error values of each neuron in the NNW, we can use these values to determine how much each
weight value and bias value needs to change to reduce the overall error of the network.

The formula used to update any weight in the NNW is shown below:

$w_{xy}^{new} = w_{xy} + \eta \cdot \delta_{yz} \cdot \dfrac{df(z_{yz})}{dz_{yz}} \cdot \mathrm{Neuron}_x$

Where $w_{xy}$ is the weight value connecting neuron number $x$ to neuron number $y$, $\eta$ is the learning rate of the NNW, $\delta_{yz}$
is the error of the neuron to the right of the weight, $z_{yz}$ is the net signal input to that neuron, and $\mathrm{Neuron}_x$ is the value of the neuron
to the left of the weight.

To show this graphically, consider the image below:

Also represented as:

$w_{(y-1)y}^{new} = w_{(y-1)y} + \eta \cdot dh_{yx} \cdot \dfrac{df(H_x^y)}{dH_x^y} \cdot f(H_x^{y-1})$
The change in the bias value is calculated as follows:

$\Theta_{xy}^{new} = \Theta_{xy} + \eta \cdot \delta_{yz} \cdot \dfrac{df(z_{yz})}{dz_{yz}}$
Also represented as:

$\Theta_{xy}^{new} = \Theta_{xy} + \eta \cdot dh_{yx} \cdot \dfrac{df(H_x^y)}{dH_x^y}$

Where $\Theta_{xy}$ is the bias value associated with neuron $H_x^y$.
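A minimal Python sketch of one weight and one bias update, assuming the sigmoid activation from earlier (whose derivative is $f(z)(1 - f(z))$); all numeric values are made up for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_derivative(z):
    # For the sigmoid, df/dz = f(z) * (1 - f(z))
    s = sigmoid(z)
    return s * (1.0 - s)

eta = 0.5        # learning rate
w_xy = 0.4       # weight connecting neuron x to neuron y
theta = 0.1      # bias of the neuron to the right of the weight
delta = 0.48     # error of the neuron to the right of the weight
z = 0.9          # net signal into that neuron
neuron_x = 0.7   # value of the neuron to the left of the weight

# Updated weight: w + eta * delta * f'(z) * Neuron_x
w_xy = w_xy + eta * delta * sigmoid_derivative(z) * neuron_x
# Updated bias: theta + eta * delta * f'(z)
theta = theta + eta * delta * sigmoid_derivative(z)

print(w_xy, theta)
```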

Once all the weights in the NNW have been updated, the next training data set is applied to the input, producing some
output error, after which the backpropagation process is repeated.
