Neural Networks
Ahmad Aljebaly
Agenda
History of Artificial Neural Networks
What is an Artificial Neural Network?
How does it work?
Learning
Learning paradigms: supervised learning, unsupervised learning, reinforcement learning
Application areas
Advantages and disadvantages
The first important step took place in 1957, when Rosenblatt introduced the first concrete neural model, the perceptron. Rosenblatt also took part in constructing the first successful neurocomputer, the Mark I Perceptron. After this, the development of ANNs proceeded as described in the figure.
A multi-layered model was derived in 1960. At first, the use of the multilayer perceptron (MLP) was complicated by the lack of an appropriate learning algorithm. In 1974, Werbos introduced the so-called backpropagation algorithm for the three-layered perceptron network.
Progress then stalled until the breakthrough, when a general backpropagation algorithm for the multi-layered perceptron was introduced by Rumelhart and McClelland. In 1982, Hopfield brought out his idea of a neural network. Unlike the neurons in the MLP, the Hopfield network consists of only one layer whose neurons are fully connected with each other.
The Boltzmann machine has been influenced by both the Hopfield network and the MLP.
The radial basis function (RBF) network was introduced by Broomhead & Lowe. Although the basic idea of RBF had been developed 30 years earlier under the name "method of potential functions", the work by Broomhead & Lowe opened a new frontier in the neural network community.
The self-organizing map (SOM) was introduced by Kohonen. The SOM is a certain kind of topological map which organizes itself based on the input patterns it is trained with. The SOM originated from the LVQ (Learning Vector Quantization) network, the underlying idea of which was also Kohonen's, in 1972.
An artificial neural network consists of a pool of simple processing units which communicate by sending signals to each other over a large number of weighted connections.
A set of processing units.
Connections between the units; generally, each connection is defined by a weight.
A propagation rule, which determines the effective input of a unit from its external inputs.
An activation function, which determines the new level of activation based on the effective input and the current activation.
An external input for each unit.
A method for information gathering (the learning rule).
An environment within which the system must operate, providing input signals and, if necessary, error signals.
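As a rough illustration, these components can be sketched in code. This is a minimal sketch, not the slides' own implementation; the class and method names are hypothetical, and the sigmoid activation is just one possible choice.

import math

class Unit:
    def __init__(self, n_inputs):
        self.weights = [0.0] * n_inputs  # one weight per incoming connection
        self.bias = 0.0                  # external input for the unit
        self.activation = 0.0            # current activation level

    def propagate(self, inputs):
        # Propagation rule: the effective input is the weighted sum
        # of the incoming signals plus the unit's own external input.
        return sum(w * x for w, x in zip(self.weights, inputs)) + self.bias

    def activate(self, effective_input):
        # Activation function: a sigmoid maps the effective input
        # to the new activation level.
        self.activation = 1.0 / (1.0 + math.exp(-effective_input))
        return self.activation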
Neural Networks
Highly parallel processing
Slow processing units
Unreliable units
Dynamic infrastructure
Applications such as character recognition or the prediction of future states of a system require massively parallel and adaptive processing.
Biological viewpoint: ANNs can be used to
replicate and simulate components of the human (or animal) brain, thereby giving us insight into natural information processing.
The building blocks of neural networks are the neurons.
In technical systems, we also refer to them as units or nodes.
Basically, each neuron receives input from many other neurons, changes its internal state (activation) based on the current input, and sends one output signal to many other neurons, possibly including its input neurons (recurrent network).
This exchange of signals is how information is transmitted through the network. In biological systems, one neuron can be connected to as many as 10,000 other neurons.
Once the input exceeds a critical level, the neuron discharges a spike: an electrical pulse that travels from the cell body, down the axon, to the next neuron(s).
The axon endings almost touch the dendrites or cell body of the next neuron.
Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters.
Neurotransmitters are chemicals which are released from the first neuron and which bind to the second.
This link is called a synapse. The strength of the signal that reaches the next neuron depends on factors such as the amount of neurotransmitter available.
[Figure: inputs x1, x2, ..., xm enter a processing unit; the output is y = x1 + x2 + ... + xm.]
[Figure: the same unit with weighted connections w1, w2, ..., wm; the output is y = x1·w1 + x2·w2 + ... + xm·wm.]
Transfer Function (Activation Function)
[Figure: the weighted inputs x1·w1, ..., xm·wm are summed to vk, which passes through the transfer function f(vk) to produce the output.]
The output is a function of the inputs, shaped by the weights and by the transfer function.
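As a concrete sketch (the threshold value 0.5 and the sample numbers are illustrative assumptions, not from the slides):

def neuron_output(inputs, weights):
    # Effective input: the weighted sum v = x1*w1 + x2*w2 + ... + xm*wm
    v = sum(x * w for x, w in zip(inputs, weights))
    # Transfer function: a simple threshold; a linear or sigmoid
    # function could be substituted here.
    return 1 if v >= 0.5 else 0

print(neuron_output([1, 0], [0.8, 0.3]))  # weighted sum is 0.8, so output is 1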
Learning by trial and error
A continuous process of:
Trial: processing an input to produce an output (in ANN terms: compute the output function of the given input).
Adjust: adjust the weights.
Example: XOR
How does it work?
Set initial values of the weights randomly.
Input: the truth table of XOR.
Do: process the inputs and adjust the weights until the network reproduces the XOR truth table (see the sketch below).
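A minimal sketch of this loop, taking "trial and error" literally as pure random search over a small threshold network; the 2-2-1 network shape, the weight range [-1, 1], and all names are assumptions made for illustration:

import random

def step(v):
    # Threshold transfer function
    return 1 if v > 0 else 0

def forward(w, x1, x2):
    # A 2-2-1 network: two hidden units and one output unit,
    # each with two weights and a bias.
    h1 = step(w[0] * x1 + w[1] * x2 + w[2])
    h2 = step(w[3] * x1 + w[4] * x2 + w[5])
    return step(w[6] * h1 + w[7] * h2 + w[8])

# Input: the truth table of XOR
truth_table = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def solved(w):
    return all(forward(w, x1, x2) == t for (x1, x2), t in truth_table)

# Trial and adjust: keep drawing random weights until the network
# reproduces the truth table (blind search, for illustration only).
weights = [random.uniform(-1, 1) for _ in range(9)]
while not solved(weights):
    weights = [random.uniform(-1, 1) for _ in range(9)]
print("XOR solved with weights:", [round(w, 2) for w in weights])

Backpropagation, covered later in these slides, replaces the blind re-draw with a directed weight adjustment.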
Design Issues
Initial weights (small random values in [-1, 1])
Transfer function (how are the inputs and the weights combined to produce the output?)
Error estimation
Weights adjusting
Number of neurons
Data representation
Size of the training set
Transfer Functions
Linear: the output is proportional to the total weighted input.
Threshold: the output is set at one of two values, depending on whether the total weighted input is greater than or less than some threshold value.
Nonlinear: the output varies continuously but not linearly as the input changes.
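The three kinds of transfer function can be sketched as follows (function names, parameters, and default values are illustrative assumptions):

import math

def linear(v, c=1.0):
    # Linear: output proportional to the total weighted input v.
    return c * v

def threshold(v, t=0.0):
    # Threshold: one of two values, depending on whether v is
    # greater than or less than the threshold t.
    return 1.0 if v >= t else 0.0

def sigmoid(v):
    # Nonlinear: output varies continuously but not linearly with v.
    return 1.0 / (1.0 + math.exp(-v))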
Error Estimation
The root mean square error (RMSE) is a frequently used measure of the difference between the values predicted by a model or an estimator and the values actually observed from the system being modeled or estimated.
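A direct sketch of the definition (function name and sample numbers are illustrative):

import math

def rmse(predicted, observed):
    # Root of the mean of the squared prediction errors.
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(predicted))

print(rmse([0.9, 0.1, 0.8], [1.0, 0.0, 1.0]))  # about 0.141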
Weights Adjusting
After each iteration, the weights should be adjusted to minimize the error.
Back Propagation
Back-propagation is an example of supervised learning; it is used at each layer to minimize the error between the layer's response and the actual data.
The error at each hidden layer is an average of the evaluated error.
Networks with hidden layers are trained this way.
Back Propagation
N is a neuron.
Nw is one of N's input weights.
Nout is N's output.
Nw = Nw + ΔNw
ΔNw = Nout × (1 − Nout) × NErrorFactor
NErrorFactor = NExpectedOutput − NActualOutput
This works only for the last layer, as only there can we know the expected and the actual output.
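Transcribed into Python (the variable names are assumptions; this mirrors the update rule above for a single output-layer weight rather than a full backpropagation implementation):

def update_weight(n_w, n_out, expected_output, actual_output):
    # The error factor is known directly only at the output layer.
    error_factor = expected_output - actual_output
    # Delta rule as stated above: n_out * (1 - n_out) is the
    # derivative of the sigmoid, scaling the error factor.
    delta = n_out * (1 - n_out) * error_factor
    return n_w + delta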
Number of neurons
Many neurons:
Higher accuracy
Slower
Risk of overfitting
Data representation
Usually input/output data needs preprocessing
Pictures: pixel intensities
Text: an encoded pattern
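For example, pixel intensities are commonly rescaled before being fed to the network; a minimal sketch, assuming 8-bit grayscale pixels:

def normalize_pixels(pixels, max_value=255):
    # Map raw intensities 0..255 into network inputs in [0, 1].
    return [p / max_value for p in pixels]

print(normalize_pixels([0, 128, 255]))  # [0.0, 0.50196..., 1.0]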
Learning Paradigms
Supervised learning
Unsupervised learning
Reinforcement learning
Supervised learning
This is what we have seen so far!
A network is fed a set of training samples (inputs and their corresponding outputs), and it uses these samples to learn the general relationship between the inputs and the outputs.
This relationship is represented by the values of the weights of the trained network.
Unsupervised learning
No desired output is associated with the training data!
Faster than supervised learning.
Used to find structure within the data:
Clustering
Compression
Reinforcement learning
Like supervised learning, but:
Weight adjustment is not directly related to the error value.
The error value is used to randomly shuffle the weights!
Learning is relatively slow, due to the randomness (a toy sketch follows).
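A toy sketch of the idea described above (not a standard reinforcement learning algorithm; the scale factor and names are assumptions):

import random

def shuffle_weights(weights, error, scale=0.1):
    # Perturb each weight randomly; a larger error permits larger
    # random changes, so a well-performing network is disturbed less.
    return [w + random.uniform(-1.0, 1.0) * scale * error for w in weights]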
Application Areas
Function approximation, including time-series prediction and modeling.
Classification, including pattern and sequence recognition and novelty detection.
Data processing, including filtering, clustering, blind source separation, and compression (e.g., data mining, e-mail spam filtering).
Advantages / Disadvantages
Advantages
Adapt to unknown situations.
Powerful: they can model complex functions.
Conclusion
Artificial Neural Networks are an imitation of biological neural networks, but much simpler ones. The computing world has a lot to gain from neural networks: their ability to learn by example makes them very flexible and powerful, and there is no need to devise an algorithm to perform each specific task.
Conclusion
Neural networks also contribute to areas of research such as neurology and psychology. They are regularly used to model parts of living organisms and to investigate the internal mechanisms of the brain. Many factors affect the performance of ANNs, such as the transfer functions, the size of the training sample, the network topology, and the weight-adjustment algorithm.
Thank You