
NEURAL NETWORKS RESEARCH

CHAPTER 1
The first example is going to be a neural network that can learn to recognise
handwritten digits.

The multilayer perceptron is the SIMPLEST plain version to start with

-Neuron: a thing that holds a number between 0 and 1; it is better to think of it as a function

Each neuron can correspond to one pixel of the image, with the number
representing that pixel's greyscale value -this number is known as the neuron's activation

All the pixels, in this case 28*28 = 784 of them, make up the first LAYER of the neural
network -imagine the pixels lined up vertically to form that first layer
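A minimal sketch of that flattening, using a made-up all-black image:

```python
# a 28x28 greyscale image flattened into the 784 activations of the first layer
# (made-up image: all zeros, i.e. a completely black image)
image = [[0.0 for _ in range(28)] for _ in range(28)]
first_layer = [pixel for row in image for pixel in row]
assert len(first_layer) == 784
```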

The last layer has 10 neurons, one for each digit 0-9; the activations of these represent
how much the system thinks the image corresponds to each digit
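As a minimal sketch (the output activations below are made-up numbers), the network's guess would be the digit whose output neuron is most activated:

```python
# made-up output activations for the 10 digit neurons (0 to 9)
output_activations = [0.1, 0.0, 0.2, 0.6, 0.0, 0.1, 0.0, 0.0, 0.3, 0.0]

# the network's guess is the digit whose neuron has the highest activation
predicted_digit = max(range(10), key=lambda d: output_activations[d])
print(predicted_digit)  # 3
```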

The hidden layers are in between; ideally they work by recognising
subcomponents and combining them into the final digit. Each subcomponent,
e.g. the loop on a 9, can be broken down further into smaller subcomponents

For this example let's imagine two hidden layers, each with 16 neurons;
the activations of the previous layer determine the activations of
the current layer, so the layers are very interlinked

We also attach a weight to each link between neurons of adjacent layers -
these again are just number values

We can take all the activations and calculate a weighted sum; if we
make the weights associated with a certain pattern of pixels larger, the
weighted sum effectively measures how strongly that pattern is present

Another thing to consider is that the weighted sum can be very large (or very negative), so a
common thing to do is use a function to squish it into the range 0-1, e.g.
the sigmoid function 1/(1+e^-x). This maps very negative inputs close to zero
and large positive ones close to one. You can also add a bias, e.g. subtract
10 from the weighted sum before squishing, to determine WHEN the pattern is meaningful
enough.
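A minimal sketch of one such neuron; the four activations, weights and the bias of -10 below are made-up numbers:

```python
import math

def sigmoid(x):
    # squishes any real number into the range 0-1
    return 1 / (1 + math.exp(-x))

# made-up numbers: four input activations, their weights, and a bias
activations = [0.0, 0.8, 0.9, 0.1]
weights     = [1.0, 6.0, 6.0, -1.0]
bias        = -10.0  # the pattern only "counts" once the weighted sum clears 10

weighted_sum = sum(w * a for w, a in zip(weights, activations)) + bias
print(sigmoid(weighted_sum))  # the neuron's activation, somewhere between 0 and 1
```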

So the activation of a neuron in the second layer is the sigmoid function of
the weighted sum of the activations of the FIRST LAYER (plus its bias)

For this overall example there are 13002 weights and biases we can tweak
(784*16 + 16*16 + 16*10 weights plus 16 + 16 + 10 biases), so learning refers to finding the right weights and biases

The activation calculation can be written with vectors and matrices, which is the common
notation for it
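In that notation, with W the matrix of weights into a layer, a^(0) the column vector of the previous layer's activations, b the column vector of biases and the sigmoid applied entry by entry, the whole next layer is:

```latex
a^{(1)} = \sigma\left( W a^{(0)} + b \right)
```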

ReLU is a function often used instead of sigmoid since it makes networks easier to train

ReLU(a) = max(0, a) -based on the idea of whether the activation passes a certain
threshold or not
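As a one-line sketch of that function:

```python
def relu(a):
    # rectified linear unit: negative inputs give 0, positive inputs pass through unchanged
    return max(0.0, a)
```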

CHAPTER 2
-how neural networks learn

It uses training data with labels and adjusts the weights and biases to
improve its performance

After training you test it with new data it has not seen before

Learning works by finding the minimum of a cost function

Each neuron is connected to all neurons in the previous layer; the weights are a
measure of the strength of each connection and the bias is an indication of
how easily the neuron becomes active

You can start by initialising all weights and biases randomly; this will
obviously be very inaccurate, so we calculate a cost -similar in spirit to chi
squared- where you take the differences between the current output activation
states and the ones they should be, square them and add them up. The bigger the cost, the further the network is from
being accurate

The cost function uses MANY training examples and takes the average of these
squared differences
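A minimal sketch of that cost, assuming the network's outputs and the desired outputs are given as lists of 10 activations; the numbers are made up:

```python
def example_cost(output, desired):
    # sum of squared differences between what the network gives and what it should give
    return sum((o - d) ** 2 for o, d in zip(output, desired))

def average_cost(outputs, desireds):
    # the network's overall cost: the average over many training examples
    return sum(example_cost(o, d) for o, d in zip(outputs, desireds)) / len(outputs)

# made-up example: the image is a 3, so only the "3" neuron should be fully active
output  = [0.1, 0.0, 0.2, 0.6, 0.0, 0.1, 0.0, 0.0, 0.3, 0.0]
desired = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(example_cost(output, desired))
```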

Once the cost indicates that the network is inaccurate, we have to improve it


The cost function, as you can imagine, is a very irregular, bumpy graph -start
at any input and step downhill until you reach a minimum, since there
can be many local minimums

We can do the same for functions of more inputs, e.g. a 3D graph with two inputs

If we organise the weights and biases into a column vector, then we can
repeatedly add the negative gradient of the cost function to that vector of
weights and biases to move towards the minimum of the cost function

This method is known as gradient descent: using the negative gradient
of the cost function to converge towards a local minimum of the cost
function
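A minimal sketch of one gradient-descent step; compute_gradient and the learning rate of 0.1 are assumptions for illustration:

```python
def gradient_descent_step(params, grad, learning_rate=0.1):
    # nudge every weight and bias a small step along the negative gradient
    return [p - learning_rate * g for p, g in zip(params, grad)]

# hypothetical usage, assuming compute_gradient(params) returns dCost/dParam for each entry:
# params = gradient_descent_step(params, compute_gradient(params))
```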

The negative gradient of the cost function encodes whether each weight and bias should
increase or decrease, and HOW MUCH that change matters relative to the others

CHAPTER 3
This is about backpropagation, the method which neural networks use to
learn

It is how we compute the gradient of the cost function

Firstly, when training:

-you can adjust a desired output neuron's activation by increasing its bias, increasing its
weights, or changing the activations of the previous layer

-we can increase each weight in proportion to the activation of the neuron it connects from

-we can change each previous activation in proportion to the weight connecting it
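A sketch of why, via the chain rule for a single connection, where z = w*a_prev + b and a = sigmoid(z): the cost's sensitivity to a weight is proportional to the previous activation, and its sensitivity to the previous activation is proportional to the weight.

```latex
\frac{\partial C}{\partial w} = a_{\text{prev}}\,\sigma'(z)\,\frac{\partial C}{\partial a}
\qquad
\frac{\partial C}{\partial a_{\text{prev}}} = w\,\sigma'(z)\,\frac{\partial C}{\partial a}
```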
