John Bullinaria's Step by Step Guide To Implement Neuronal Network in C

A step-by-step guide to implementing AI in C.

Uploaded by

Paul Tota


John Bullinaria's Step by Step Guide to Implementing a Neural Network in C

By John A. Bullinaria from the School of Computer Science of The University of Birmingham, UK.

This document contains a step by step guide to implementing a simple neural network in C. It is aimed mainly at students who wish to (or have been told to) incorporate a neural network learning component into a larger system they are building. Obviously there are many types of neural network one could consider using - here I shall concentrate on one particularly common and useful type, namely a simple fully-connected feed-forward back-propagation network (multi layer perceptron), consisting of an input layer, one hidden layer and an output layer.

This type of network will be useful when we have a set of input vectors and a corresponding set of output vectors, and the aim is for the network to produce an appropriate output for each input it is given. Of course, if we already have a complete noise-free set of input and output vectors, then a simple look-up table would suffice. However, if we want the system to generalize, i.e. produce appropriate outputs for inputs that have never been seen before, then a neural network that has learned how to map between the known inputs and outputs (i.e. the training data set) will often do a pretty good job for new inputs as well, particularly if an appropriate regularization technique has been used.

I shall assume that the reader is already familiar with C, and for more details about neural networks in general there are plenty of good text-books and web-sites available (e.g., see my Neural Computation web-site). So, let us begin...

A single neuron (i.e. processing unit) takes its total input In and computes an associated output activation Out. A popular activation function is the sigmoid function

    Out = 1.0/(1.0 + exp(-In)) ;    /* Out = Sigmoid(In) */

though other functions are often used (e.g., linear or hyperbolic tangent).
The sigmoid has the effect of squashing the infinite range of In into the range 0 to 1. It also has the convenient property that its derivative takes the particularly simple form

    Sigmoid_Derivative = Sigmoid * (1.0 - Sigmoid) ;

which proves useful when implementing the learning algorithm.

Usually the input In into a given neuron will be the weighted sum of activations feeding in from the outputs of a number of other neurons. It is convenient to think of the activations flowing through layers of neurons. So, if there are NumInput neurons in the input layer, the total activation flowing into a hidden layer neuron is just the sum SumH of Input[i]*Weight[i] over all input units i, where Weight[i] is the strength/weight of the connection between unit i in the input layer and our unit in the hidden layer. Each neuron will also have a bias, or resting state, that is added to the sum of inputs, and it is convenient to call this Weight[0]. This acts as the neuron threshold. We can then compute the hidden unit activation with

    SumH = Weight[0] ;                          /* start with the hidden unit bias */
    for( i = 1 ; i <= NumInput ; i++ ) {        /* i loop over input units */
        SumH += Input[i] * Weight[i] ;          /* add in weighted contribution from each input unit */
    }
    Hidden = 1.0/(1.0 + exp(-SumH)) ;           /* compute sigmoid to give activation */

Normally the hidden layer will have many units as well, so it is appropriate to write the weights between input unit i and hidden layer unit j as an array WeightIH[i][j], in which we have added the label IH to avoid confusion with any other weights in the network. Thus to get the activation of unit j in the hidden layer we have

    SumH[j] = WeightIH[0][j] ;
    for( i = 1 ; i <= NumInput ; i++ ) {
        SumH[j] += Input[i] * WeightIH[i][j] ;
    }
    Hidden[j] = 1.0/(1.0 + exp(-SumH[j])) ;

Remember that in C the array indices start from zero, not one, so we would declare our variables as

    double Input[NumInput+1] ;
    double SumH[NumHidden+1] ;
    double Hidden[NumHidden+1] ;
    double WeightIH[NumInput+1][NumHidden+1] ;

etc.
(or, more likely, declare pointers and use calloc or malloc to allocate the memory). Naturally, we need another loop to get all the hidden unit activations

    for( j = 1 ; j <= NumHidden ; j++ ) {
        SumH[j] = WeightIH[0][j] ;
        for( i = 1 ; i <= NumInput ; i++ ) {
            SumH[j] += Input[i] * WeightIH[i][j] ;
        }
        Hidden[j] = 1.0/(1.0 + exp(-SumH[j])) ;
    }

One hidden layer is necessary and sufficient for most purposes, so our hidden layer activations will feed into the output layer in the same way as above. The code can start to become confusing at this point - keeping a separate index i, j, k for each layer helps, as does an intuitive notation for distinguishing between the different layers of weights WeightIH and WeightHO, the sums of activations feeding into each layer SumH and SumO, and the resultant activations at each layer Hidden and Output. The code thus becomes

    for( j = 1 ; j <= NumHidden ; j++ ) {       /* j loop computes hidden unit activations */
        SumH[j] = WeightIH[0][j] ;
        for( i = 1 ; i <= NumInput ; i++ ) {
            SumH[j] += Input[i] * WeightIH[i][j] ;
        }
        Hidden[j] = 1.0/(1.0 + exp(-SumH[j])) ;
    }
    for( k = 1 ; k <= NumOutput ; k++ ) {       /* k loop computes output unit activations */
        SumO[k] = WeightHO[0][k] ;
        for( j = 1 ; j <= NumHidden ; j++ ) {
            SumO[k] += Hidden[j] * WeightHO[j][k] ;
        }
        Output[k] = 1.0/(1.0 + exp(-SumO[k])) ;
    }

and the network takes on the familiar form that we shall use for the remainder of this document

    [Figure: network diagram, annotated with]
    Output_k = Sigmoid( Bias_k + \sum_j Hidden_j WeightHO_{jk} )
    Hidden_j = Sigmoid( Bias_j + \sum_i Input_i WeightIH_{ij} )

Generally we will have a whole set of NumPattern training patterns, i.e. pairs of input and target output vectors, { Input[p][i] , Target[p][k] } labelled by the index p.
The network learns by minimizing some measure of the error of the network's actual outputs compared with the target outputs. For example, the sum squared error over all output units k and all training patterns p will be given by

    Error = 0.0 ;
    for( p = 1 ; p <= NumPattern ; p++ ) {
        for( k = 1 ; k <= NumOutput ; k++ ) {
            Error += 0.5 * (Target[p][k] - Output[p][k]) * (Target[p][k] - Output[p][k]) ;
        }
    }

(The factor of 0.5 is conventionally included to simplify the algebra in deriving the learning algorithm.) If we insert the above code for computing the network outputs into the p loop of this, we end up with

    Error = 0.0 ;
    for( p = 1 ; p <= NumPattern ; p++ ) {          /* p loop over training patterns */
        for( j = 1 ; j <= NumHidden ; j++ ) {       /* j loop over hidden units */
            SumH[p][j] = WeightIH[0][j] ;
            for( i = 1 ; i <= NumInput ; i++ ) {
                SumH[p][j] += Input[p][i] * WeightIH[i][j] ;
            }
            Hidden[p][j] = 1.0/(1.0 + exp(-SumH[p][j])) ;
        }
        for( k = 1 ; k <= NumOutput ; k++ ) {       /* k loop over output units */
            SumO[p][k] = WeightHO[0][k] ;
            for( j = 1 ; j <= NumHidden ; j++ ) {
                SumO[p][k] += Hidden[p][j] * WeightHO[j][k] ;
            }
            Output[p][k] = 1.0/(1.0 + exp(-SumO[p][k])) ;
            Error += 0.5 * (Target[p][k] - Output[p][k]) * (Target[p][k] - Output[p][k]) ;   /* Sum Squared Error */
        }
    }

I'll leave the reader to dispense with any indices that they don't need for the purposes of their own system (e.g., the indices on SumH and SumO).

The next stage is to iteratively adjust the weights to minimize the network's error. A standard way to do this is by performing 'gradient descent' on the error function. We compute analytically how much the error is changed by a small change in each weight (i.e. compute the partial derivatives dError/dWeight) and shift the weights by a small amount DeltaWeight in the direction that most reduces the error. The literature is full of variations on this general approach - here we shall implement the 'standard on-line back-propagation with momentum' algorithm.
This is not the place to go through all the mathematics, but for the above sum squared error we can compute and apply one iteration (or 'epoch') of the required weight changes DeltaWeightIH and DeltaWeightHO using

    Error = 0.0 ;
    for( p = 1 ; p <= NumPattern ; p++ ) {          /* repeat for all the training patterns */
        for( j = 1 ; j <= NumHidden ; j++ ) {       /* compute hidden unit activations */
            SumH[p][j] = WeightIH[0][j] ;
            for( i = 1 ; i <= NumInput ; i++ ) {
                SumH[p][j] += Input[p][i] * WeightIH[i][j] ;
            }
            Hidden[p][j] = 1.0/(1.0 + exp(-SumH[p][j])) ;
        }
        for( k = 1 ; k <= NumOutput ; k++ ) {       /* compute output unit activations and errors */
            SumO[p][k] = WeightHO[0][k] ;
            for( j = 1 ; j <= NumHidden ; j++ ) {
                SumO[p][k] += Hidden[p][j] * WeightHO[j][k] ;
            }
            Output[p][k] = 1.0/(1.0 + exp(-SumO[p][k])) ;
            Error += 0.5 * (Target[p][k] - Output[p][k]) * (Target[p][k] - Output[p][k]) ;
            DeltaO[k] = (Target[p][k] - Output[p][k]) * Output[p][k] * (1.0 - Output[p][k]) ;
        }
        for( j = 1 ; j <= NumHidden ; j++ ) {       /* 'back-propagate' errors to hidden layer */
            SumDOW[j] = 0.0 ;
            for( k = 1 ; k <= NumOutput ; k++ ) {
                SumDOW[j] += WeightHO[j][k] * DeltaO[k] ;
            }
            DeltaH[j] = SumDOW[j] * Hidden[p][j] * (1.0 - Hidden[p][j]) ;
        }
        for( j = 1 ; j <= NumHidden ; j++ ) {       /* update weights WeightIH */
            DeltaWeightIH[0][j] = eta * DeltaH[j] + alpha * DeltaWeightIH[0][j] ;
            WeightIH[0][j] += DeltaWeightIH[0][j] ;
            for( i = 1 ; i <= NumInput ; i++ ) {
                DeltaWeightIH[i][j] = eta * Input[p][i] * DeltaH[j] + alpha * DeltaWeightIH[i][j] ;
                WeightIH[i][j] += DeltaWeightIH[i][j] ;
            }
        }
        for( k = 1 ; k <= NumOutput ; k++ ) {       /* update weights WeightHO */
            DeltaWeightHO[0][k] = eta * DeltaO[k] + alpha * DeltaWeightHO[0][k] ;
            WeightHO[0][k] += DeltaWeightHO[0][k] ;
            for( j = 1 ; j <= NumHidden ; j++ ) {
                DeltaWeightHO[j][k] = eta * Hidden[p][j] * DeltaO[k] + alpha * DeltaWeightHO[j][k] ;
                WeightHO[j][k] += DeltaWeightHO[j][k] ;
            }
        }
    }

(There is clearly plenty of scope for re-ordering, combining and simplifying the loops here - I will leave that for the reader to do once they have understood what the separate code sections are doing.)
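For reference, the quantities computed by the code above for a single pattern p can be written compactly as follows. This is only a transcription of the code into mathematical notation, not a derivation; the convention that Input and Hidden take the value 1 at index 0 (so the bias rows follow the same formula) is a notational assumption introduced here.

```latex
\begin{align*}
\mathrm{DeltaO}_k &= (\mathrm{Target}_{pk} - \mathrm{Output}_{pk})\,
                     \mathrm{Output}_{pk}\,(1 - \mathrm{Output}_{pk})
                   = -\frac{\partial\,\mathrm{Error}}{\partial\,\mathrm{SumO}_{pk}} \\
\mathrm{DeltaH}_j &= \mathrm{Hidden}_{pj}\,(1 - \mathrm{Hidden}_{pj})
                     \sum_{k=1}^{\mathrm{NumOutput}} \mathrm{WeightHO}_{jk}\,\mathrm{DeltaO}_k \\
\mathrm{DeltaWeightIH}_{ij} &\leftarrow \eta\,\mathrm{Input}_{pi}\,\mathrm{DeltaH}_j
                     + \alpha\,\mathrm{DeltaWeightIH}_{ij},
                     \qquad \mathrm{Input}_{p0} \equiv 1 \\
\mathrm{DeltaWeightHO}_{jk} &\leftarrow \eta\,\mathrm{Hidden}_{pj}\,\mathrm{DeltaO}_k
                     + \alpha\,\mathrm{DeltaWeightHO}_{jk},
                     \qquad \mathrm{Hidden}_{p0} \equiv 1
\end{align*}
```

Here \eta and \alpha stand for the code's eta and alpha, and each weight is then incremented by its DeltaWeight, exactly as in the update loops.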
The weight changes DeltaWeightIH and DeltaWeightHO are each made up of two components. First, the eta component that is the gradient descent contribution, consisting of the 'learning rate' or 'step size' eta multiplied by the gradient. Second, the alpha component that is a 'momentum' term which effectively keeps a moving average of the gradient descent weight change contributions, and thus smooths out the overall weight changes.

Fixing good values of the learning parameters eta and alpha is usually a matter of trial and error. Certainly alpha must be in the range 0 to 1, and a non-zero value does usually speed up learning. However, setting alpha to zero and having no momentum allows much simpler code. Finding a good value for eta will depend on the problem, and also on the value chosen for alpha. If it is set too low, the training will be unnecessarily slow. Having it too large will cause the weight changes to oscillate wildly, and can slow down or even prevent learning altogether. (I generally start by trying eta = 0.1 and explore the effects of repeatedly doubling or halving it.)

The complete training process will consist of repeating the above weight updates for a number of epochs (using another for loop) until some error criterion is met, for example the Error falls below some chosen small number. (Note that, with sigmoids on the outputs, the Error can only reach exactly zero if the weights reach infinity! Note also that sometimes the training can get stuck in a 'local minimum' of the error function and never get anywhere near the actual minimum.) So, we need to wrap the last block of code in something like

    for( epoch = 1 ; epoch < LARGENUMBER ; epoch++ ) {
        /* ABOVE CODE FOR ONE ITERATION */
        if( Error < SMALLNUMBER ) break ;
    }

Naturally, one must set some initial network weights to start the learning process. Starting all the weights at zero is generally not a good idea, as that is often a local minimum of the error function.
It is normal to initialize all the weights with small random values. If rando() is your favourite random number generator function that returns a flat distribution of random numbers in the range 0 to 1, and smallwt is the maximum absolute size of your initial weights, then an appropriate section of weight initialization code would be

    for( j = 1 ; j <= NumHidden ; j++ ) {           /* initialize WeightIH and DeltaWeightIH */
        for( i = 0 ; i <= NumInput ; i++ ) {
            DeltaWeightIH[i][j] = 0.0 ;
            WeightIH[i][j] = 2.0 * ( rando() - 0.5 ) * smallwt ;
        }
    }
    for( k = 1 ; k <= NumOutput ; k++ ) {           /* initialize WeightHO and DeltaWeightHO */
        for( j = 0 ; j <= NumHidden ; j++ ) {
            DeltaWeightHO[j][k] = 0.0 ;
            WeightHO[j][k] = 2.0 * ( rando() - 0.5 ) * smallwt ;
        }
    }

Note that it is a good idea to set all the initial DeltaWeights to zero at the same time. The best value for smallwt will depend on the problem - but the initial weights should never be so large that the sigmoids saturate before the training begins (i.e. start too close to zero or one).

If the training patterns are presented in the same systematic order during each epoch, it is possible for weight oscillations to occur. It is therefore generally a good idea to use a new random order for the training patterns for each epoch. If we put the NumPattern training pattern indices p in random order into an array ranpat[], then it is simply a matter of replacing our training pattern loop

    for( p = 1 ; p <= NumPattern ; p++ ) {

with

    for( np = 1 ; np <= NumPattern ; np++ ) {
        p = ranpat[np] ;

Generating the random array ranpat[] is not quite so simple, but the following code will do the job

    for( p = 1 ; p <= NumPattern ; p++ ) {          /* set up ordered array */
        ranpat[p] = p ;
    }
    for( p = 1 ; p <= NumPattern ; p++ ) {          /* swap random elements into each position */
        np = p + rando() * ( NumPattern + 1 - p ) ;
        op = ranpat[p] ; ranpat[p] = ranpat[np] ; ranpat[np] = op ;
    }

Here np is an integer, so the assignment truncates the random value, and np stays within the range p to NumPattern provided rando() never returns exactly 1. We now have enough code to put together a working neural network program.
I have cut and pasted the above code into the file nn.c (which your browser should allow you to save into your own file space). I have added the standard #includes, declared all the variables, hard coded the standard XOR training data and values for eta, alpha and smallwt, #defined an overly-simple rando(), added some print statements to show what the network is doing, and wrapped the whole lot in a main(){}. The file should compile and run in the normal way (e.g., using the UNIX commands 'cc nn.c -O -lm -o nn' and 'nn').

I've left plenty for the reader to do to convert this into a useful program, for example:

- Reading in training and testing data from file
- Allowing the parameters (eta, alpha, smallwt, NumHidden, etc.) to be varied during runtime
- Having appropriate array sizes determined from the training data and allocating them memory during runtime
- Saving the network weights to file, and reading them back in again
- Plotting of errors, output activations, etc. during training

There are also numerous network variations that could usefully be implemented, for example:

- Batch learning, rather than on-line learning
- Separate training, validation and testing data sets
- More sophisticated techniques for stopping the training
- Weight decay or other regularization approaches
- Different architectures, e.g., more hidden layers, direct input-to-output connections, partial connectivity, etc.
- Regression problems require linear output functions, rather than sigmoids

      Output[p][k] = SumO[p][k] ;
      DeltaO[k] = Target[p][k] - Output[p][k] ;

- Classification problems should use the Cross-Entropy error function, rather than Sum Squared Error

      Error -= ( Target[p][k] * log( Output[p][k] ) + ( 1.0 - Target[p][k] ) * log( 1.0 - Output[p][k] ) ) ;
      DeltaO[k] = Target[p][k] - Output[p][k] ;

- Multiple-class classification problems should really use the Softmax activation function

But from here on, you're on your own. I hope you found this page useful...

This page is maintained by John Bullinaria.
Last updated on 14 October 2009.
