Learning Processes
Neural Networks
CSE-6701
Submitted By
Shafikul Islam
ID: 20 CSE 010
&
Joy Sarkar
ID: 20 CSE 012
Department of Computer Science & Engineering
University of Barishal
Course Teacher
Sohely Jahan
Assistant Professor
Department of Computer Science & Engineering
University of Barishal
BARISHAL UNIVERSITY
BARISHAL, BANGLADESH
3 March, 2023
Learning Processes
What is a Learning Process?
An Artificial Neural Network (ANN) is inspired by the way the biological
nervous system, such as the human brain, works. The most powerful
attribute of the human brain is its ability to adapt, and an ANN acquires
similar characteristics. How exactly does our brain adapt? Our understanding
of this is still very primitive, although we do have a fundamental grasp of the
procedure. It is accepted that during the learning procedure, the brain's
neural structure is altered, increasing or decreasing the strength of its
synaptic connections depending on their activity. This is why frequently used
information is easier to recall than information that has not been recalled for
a long time: more significant information will have powerful synaptic
connections, while less relevant information will gradually have its synaptic
connections weakened, making it harder to recall.
Why is it important?
A network's input/output behavior is largely determined by its synaptic
weights, whose initial values are set when the network is constructed. To
change the input/output behavior, we need to adjust the weights; learning is
the process by which this adjustment is made.
Supervised learning
Supervision: The training data (observations, measurements, etc.) are
accompanied by labels indicating the class of the observations. New data is
classified based on the training set.
Unsupervised learning
The class labels of the training data are unknown. Given a set of
measurements, observations, etc., the aim is to establish the existence of
classes or clusters in the data.
Reinforcement Learning
Learning proceeds through trial-and-error interaction with an environment
that returns reward or penalty (reinforcement) signals rather than explicit
labels; this paradigm is discussed in more detail below.
Error-Correction Learning
In error-correction learning, each weight is adjusted in proportion to the
error between the desired response and the actual response of the network.
The basic update is the delta rule:
Δw = α * (d - y) * x
where:
α = the learning rate, which determines how much the weight is adjusted in
response to each training example
d = the desired (target) output for that training example
y = the actual output produced by the network for that training example
x = the input associated with the connection being updated
The delta rule can be applied to each connection in the network to update its
weight, with the overall goal of minimizing the error across all training
examples. By iteratively adjusting the weights in this way, the network
gradually learns to produce more accurate outputs for a given set of inputs.
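To make the update concrete, here is a minimal sketch of delta-rule training
for a single linear neuron; the function names, data, and parameter values
are illustrative assumptions, not taken from the text above:

import random

def train_delta(examples, alpha=0.1, epochs=100):
    """Train one linear neuron with the delta rule.
    examples: list of (inputs, desired) pairs."""
    n = len(examples[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]  # initial weights
    b = 0.0                                            # bias term
    for _ in range(epochs):
        for x, d in examples:
            y = sum(wi * xi for wi, xi in zip(w, x)) + b  # actual output
            err = d - y                                    # error (d - y)
            # Delta rule: w <- w + alpha * (d - y) * x
            w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
            b += alpha * err
    return w, b

# Learn the mapping y = 2*x1 - x2 from four labeled examples.
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
        ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]
print(train_delta(data))  # weights approach [2, -1], bias approaches 0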
Memory-based learning
In memory-based learning, all (or most) of the past experiences are
explicitly stored, and a new input is classified by retrieving and combining
the stored examples most similar to it. It is often appropriate to use
memory-based learning algorithms when the data set is relatively small and
there is no prior knowledge or information about the underlying patterns in
the data.
Two classic examples of memory-based learning are K-nearest neighbours
classification and K-nearest neighbours regression.
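As a concrete (and purely illustrative) sketch, k-nearest-neighbours
classification can be written in a few lines; note that the stored examples
themselves constitute the "model":

from collections import Counter

def knn_classify(memory, query, k=3):
    """memory: list of (vector, label) pairs held verbatim; query: vector."""
    def dist(a, b):  # Euclidean distance
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    # Retrieve the k stored examples nearest to the query...
    nearest = sorted(memory, key=lambda ex: dist(ex[0], query))[:k]
    # ...and take a majority vote over their labels.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

memory = [([0.0, 0.0], "A"), ([0.1, 0.2], "A"),
          ([1.0, 1.0], "B"), ([0.9, 1.1], "B")]
print(knn_classify(memory, [0.2, 0.1], k=3))  # -> "A"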
Hebbian Learning
Hebbian learning increases the strength of connections between neurons that
are frequently activated together and decreases the strength of connections
between neurons that are rarely activated together.
Several variants of Hebbian learning have been proposed over the years. A
basic Hebbian training procedure runs as follows:
1. Initialize the weights: Set the weights between the input and output
neurons to small random values.
2. Present an input: Present an input pattern from the training set to the
network.
3. Calculate the output: Calculate the output of the network by applying the
current weights to the input.
4. Update the weights: Update the weights between the input and output
neurons using the Hebbian learning rule, which states that the weight
between two neurons should be increased if they are both active, and
decreased if one is active and the other is not. The Hebbian learning rule
can be expressed mathematically as:
Δw_ij = η * x_i * y_j
where Δw_ij is the change in the weight between neuron i and neuron j, η is
the learning rate, x_i is the activation of neuron i, and y_j is the
activation of neuron j.
5. Repeat: Repeat steps 2-4 for each input pattern in the training set.
6. Test the network: Test the network on new input patterns to evaluate its
performance.
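The steps above can be sketched in a few lines of illustrative Python (the
input patterns and learning rate are our own assumptions):

import random

eta = 0.1                                       # learning rate
patterns = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]    # hypothetical input patterns

# Step 1: initialize weights to small random values.
w = [random.uniform(-0.1, 0.1) for _ in range(3)]

for x in patterns:                              # Steps 2 and 5: each pattern
    # Step 3: calculate the output with the current weights.
    y = sum(wi * xi for wi, xi in zip(w, x))
    # Step 4: Hebbian update, delta_w_i = eta * x_i * y.
    w = [wi + eta * xi * y for wi, xi in zip(w, x)]

print(w)  # Step 6: a trained network would now be tested on new patterns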
Covariance Hypothesis
The covariance hypothesis in Hebbian learning proposes that the strength of the
connection between two neurons should be modified based on the covariance
between their activities. The basic idea is that when two neurons fire together,
the strength of their connection should be increased, and when they fire
independently, the strength of their connection should be decreased. One
advantage of the covariance hypothesis is that it provides a more precise and
nuanced way of adjusting synaptic strengths than the original Hebbian rule,
which simply strengthened connections between neurons that fired together.
The covariance hypothesis takes into account not only whether two neurons fire
together, but also the statistical relationship between their activities.
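The rule itself is not reproduced above; in its commonly cited form (due to
Sejnowski), the covariance rule can be written as

Δw_ij = η * (x_i − x̄)(y_j − ȳ)

where x̄ and ȳ denote the time-averaged pre- and post-synaptic activities, so
a weight grows only when the two activities fluctuate above (or below) their
means together.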
Covariance-based learning has been used in various neural network models and
has been shown to improve their learning performance on certain tasks.
However, it also has some limitations, such as being sensitive to the mean
activity levels of the neurons and being prone to instability if the covariance
matrix becomes singular or close to singular.
One common form of the covariance-based learning rule is Oja's rule, which
is given by:
Δw_ij = η * (x_i * y_j − α * y_j^2 * w_ij)
where Δw_ij is the change in the synaptic weight between neurons i and j, η
is the learning rate, x_i and y_j are the pre- and post-synaptic activities,
respectively, w_ij is the current synaptic weight, and α is a regularization
parameter that controls the magnitude of the weight update.
In this rule, the first term (x_i * y_j) represents the covariance between
the pre- and post-synaptic activities, while the second term
(α * y_j^2 * w_ij) acts as a decay term that prevents the weight from growing
too large.
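A minimal sketch of this update, assuming the form given above with
illustrative data and parameter values:

import random

eta, alpha = 0.05, 1.0
w = [random.uniform(-0.1, 0.1) for _ in range(2)]      # initial weights

# Strongly correlated two-dimensional inputs (hypothetical data).
samples = [[1.0, 0.9], [0.8, 1.1], [-1.0, -0.95], [-0.9, -1.05]] * 50

for x in samples:
    y = sum(wi * xi for wi, xi in zip(w, x))           # post-synaptic activity
    # delta_w_i = eta * (x_i * y - alpha * y^2 * w_i): Hebbian term minus decay.
    w = [wi + eta * (xi * y - alpha * y * y * wi) for wi, xi in zip(w, x)]

print(w)  # drifts toward the principal direction of the correlated inputs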
Oja's rule can be used in a variety of neural network models, such as the
self-organizing map and adaptive resonance theory. When used in
conjunction with other learning rules, such as the backpropagation algorithm,
the covariance-based learning rule can help improve the learning performance
of neural networks, particularly in tasks that require detecting correlations or
patterns in the input data.
It's worth noting that the covariance-based learning rule has some limitations
and requires careful tuning of its parameters to avoid instability and overfitting.
Additionally, other methods for implementing the covariance hypothesis in
Hebbian learning have been proposed, such as the BCM (Bienenstock-Cooper-
Munro) rule and the trace learning rule, which have their own strengths and
weaknesses.
Competitive learning
In competitive learning, the output neurons of the network compete among
themselves to become active. The basic elements of a competitive learning
rule are:
● A set of neurons that are all the same, except for some randomly
distributed synaptic weights, which therefore respond differently to a given
set of input patterns.
● A mechanism that permits the neurons to compete for the right to respond
to a given subset of inputs, such that only one output neuron (or only one
neuron per group) is active (i.e. "on") at a time. The neuron that wins the
competition is called a "winner-take-all" neuron.
By recoding sets of correlated inputs onto one of a few output neurons,
competitive networks essentially remove redundancy in the representation.
For every input vector, the competitive neurons “compete” with each other to
see which one of them is the most similar to that particular input vector.
The winning neuron m sets its output to 1, while all other neurons set their
outputs to 0.
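The following illustrative sketch (our own code and data) shows the
winner-take-all competition, with the standard update that moves the
winner's weight vector toward the input:

import random

eta = 0.2
weights = [[random.random(), random.random()] for _ in range(2)]  # 2 neurons

# Inputs drawn from two clusters (hypothetical data).
inputs = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.95], [0.85, 1.0]] * 25

for x in inputs:
    # Competition: the neuron whose weight vector is closest to x wins.
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    m = dists.index(min(dists))        # winner outputs 1, all others 0
    # Only the winner learns: delta_w = eta * (x - w).
    weights[m] = [wi + eta * (xi - wi) for wi, xi in zip(weights[m], x)]

print(weights)  # each weight vector settles near one cluster centre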
Boltzmann Learning
Boltzmann machines have the following characteristics:
● They consist of stochastic neurons, each of which has one of two possible
states, either 1 or 0.
● Some of the neurons are adaptive (free state) and some are clamped
(frozen state).
● If we apply simulated annealing to a discrete Hopfield network, it becomes
a Boltzmann machine.
Architecture
Training Algorithm
As Boltzmann machines have fixed weights, there is no training algorithm; we
do not need to update the weights in the network. However, to test the
network we have to set the weights and compute the consensus function CF.
Testing Algorithm
Step 2 − While the stopping condition is not true, perform steps 3-8.
Step 4 − Assume that one of the states has changed its value, and choose
integers I and J as random values between 1 and n.
Step 7 − Accept or reject this change as follows −
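The body of this step did not survive extraction; a common form of the
acceptance rule, stated here as an assumption rather than as the original
text, accepts the change with probability 1 / (1 + exp(−ΔCF / T)):

import math, random

def accept_change(delta_cf, T):
    """delta_cf: resulting change in the consensus function CF;
    T: control parameter (temperature)."""
    prob = 1.0 / (1.0 + math.exp(-delta_cf / T))  # acceptance probability
    return random.random() < prob                  # True means accept

print(accept_change(0.5, T=2.0))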
CREDIT-ASSIGNMENT PROBLEM:
The credit-assignment problem is the problem of assigning credit or blame
for overall outcomes to each of the internal decisions made by a learning
machine. It may be decomposed into two subproblems:
1. The assignment of credit for outcomes to actions. This is called the
temporal credit-assignment problem in that it involves the instants of time
when the actions that deserve credit were actually taken.
2. The assignment of credit for actions to internal decisions. This is called
the structural credit-assignment problem in that it involves assigning credit
to the internal structures of actions generated by the system.
Suppose, for example, that many actions are taken by a learning machine and
result in certain outcomes; we must determine which of these actions were
responsible for the outcomes. The combined temporal and structural
credit-assignment problem faces any distributed learning machine that
attempts to improve its performance in situations involving temporally
extended behavior (Williams, 1988).
LEARNING WITH A TEACHER:
In learning with a teacher (supervised learning), we may think of the teacher
as having knowledge of the environment, represented by a set of input-output
examples. The environment is, however, unknown to the neural network of
interest. Suppose now that the teacher and the neural network are both
exposed to a training vector (i.e., an example) drawn from the environment.
By virtue of built-in knowledge, the
teacher is able to provide the neural network with a desired response for that
training vector. Indeed, the desired response represents the optimum action to
be performed by the neural network. The network parameters are adjusted under
the combined influence of the training vector and the error signal.
The error signal is defined as the difference between the desired response and
the actual response of the network. This adjustment is carried out iteratively in a
step-by-step fashion with the aim of eventually making the neural network
emulate the teacher; the emulation is presumed to be optimum in some
statistical sense.
The true error surface is averaged over all possible input-output examples. Any
given operation of the system under the teacher's supervision is represented as a
point on the error surface. For the system to improve performance over time and
therefore learn from the teacher, the operating point has to move down
successively toward a minimum point of the error surface; the minimum point
may be a local minimum or a global minimum.
LEARNING WITHOUT A TEACHER:
In supervised learning, the learning process takes place under the tutelage
of a teacher. In learning without a teacher, by contrast, there is no teacher
to oversee the learning process. That is to say, there are no labeled
examples of the function to be learned by the network; learning must instead
be driven by a task-independent index of performance.
Figure 2.7 shows the block diagram of one form of a reinforcement learning
system built around a critic that converts a primary reinforcement signal
received from the environment into a higher-quality reinforcement signal
called the heuristic reinforcement signal, both of which are scalar inputs.
The system is
designed to learn under delayed reinforcement, which means that the system
observes a temporal sequence of stimuli (i.e., state vectors) also received from
the environment, which eventually result in the generation of the heuristic
reinforcement signal. The goal of learning is to minimize a cost-to-go
function,
defined as the expectation of the cumulative cost of actions taken over a
sequence of steps instead of simply the immediate cost. It may turn out that
certain actions taken earlier in that sequence of time steps are in fact the best
determinants of overall system behavior. The function of the learning machine,
which constitutes the second component of the system, is to discover these
actions and to feed them back to the environment.
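As one simple, modern instance of this idea (not the exact system of Figure
2.7), a cost-to-go estimate can be learned by temporal-difference updates on
a toy chain of states; everything in this sketch is an illustrative
assumption:

import random

n_states = 5                     # states 0..4; state 4 is terminal
cost_to_go = [0.0] * n_states    # estimated cumulative cost from each state
alpha = 0.1                      # learning rate

for _ in range(2000):
    s = 0
    while s != 4:
        step_cost = random.choice([1.0, 2.0])   # immediate (primary) cost
        s_next = s + 1
        # Temporal-difference update toward cost plus estimated cost-to-go.
        target = step_cost + (0.0 if s_next == 4 else cost_to_go[s_next])
        cost_to_go[s] += alpha * (target - cost_to_go[s])
        s = s_next

print(cost_to_go)  # approaches [6.0, 4.5, 3.0, 1.5, 0.0] (mean cost 1.5/step)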
The learning machine must solve a temporal credit-assignment problem. By
this we mean that the learning machine must be able to assign credit and
blame individually to each action in the sequence of time steps that led to
the final outcome, while the primary reinforcement may only evaluate the
outcome.
LEARNING TASKS:
Pattern recognition: Humans are remarkably good at pattern recognition; we
can recognize a familiar face or identify a person by his or her voice
immediately and with practically no effort. Humans perform pattern
recognition through a learning process; so it is with neural networks.
Function approximation:
Overall, function approximation is a fundamental task in neural networks
that enables learning, complex modeling, generalization, and transfer
learning.
1. System identification: learning an unknown input-output mapping of a
system from a set of labeled input-output examples (see the sketch below).
2. Inverse system: constructing a model that, given the output of a known
system, produces the corresponding input, i.e. the inverse mapping.
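A minimal sketch of system identification, assuming a toy unknown system
y = a*x + b observed through noisy input-output pairs (all names and values
here are illustrative):

import random

# The "unknown" system generating the observations.
a_true, b_true = 3.0, -1.0
data = [(x, a_true * x + b_true + random.gauss(0, 0.01))
        for x in [i * 0.1 for i in range(50)]]

# Identify the system by least squares (normal equations, one input).
n = len(data)
sx = sum(x for x, _ in data)
sy = sum(y for _, y in data)
sxx = sum(x * x for x, _ in data)
sxy = sum(x * y for x, y in data)
a_hat = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b_hat = (sy - a_hat * sx) / n
print(a_hat, b_hat)  # close to the true parameters (3.0, -1.0)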
Control:
Control is an important aspect of neural networks as it allows for the efficient
and effective operation of the network in achieving its goals. Here are some
specific reasons why control is important in neural networks:
Overall, control is an important aspect of neural networks that ensures stability,
improves performance, enables adaptability, and ensures safety in various
applications.
Filtering: The term filter often refers to a device or algorithm used to extract
information about a prescribed quantity of interest from a set of noisy data. The
noise may arise from a variety of sources. For example, the data may have been
measured by means of noisy sensors or may represent an information-bearing
signal that has been corrupted by transmission through a communication
channel. Another example is that of a useful signal component corrupted by an
interfering signal picked up from the surrounding environment. We may use a
filter to perform three basic information processing tasks:
1. Filtering: the extraction of information about a quantity of interest at
time n by using data measured up to and including time n.
2. Smoothing: this differs from filtering in that data measured after time n
may also be used to obtain the information of interest.
3. Prediction: the aim here is to derive information about what the quantity
of interest will be like at some time n + n_0 in the future, for some
n_0 > 0, by using data measured up to and including time n.
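The prediction task can be sketched with a simple adaptive linear predictor
trained by the LMS rule; the signal, tap count, and step size below are
illustrative assumptions:

import math

signal = [math.sin(0.3 * n) for n in range(500)]   # observed time series
w = [0.0, 0.0]    # weights of a 2-tap linear predictor
mu = 0.05         # LMS adaptation step size

for n in range(2, len(signal)):
    past = [signal[n - 1], signal[n - 2]]              # data up to time n-1
    pred = sum(wi * pi for wi, pi in zip(w, past))     # one-step prediction
    err = signal[n] - pred                             # prediction error
    w = [wi + mu * err * pi for wi, pi in zip(w, past)]  # LMS update

print(w)  # for a pure sinusoid, w approaches [2*cos(0.3), -1] and err -> 0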
Beamforming is commonly used in radar and sonar systems where the primary
task is to detect and track a target of interest in the presence of receiver
noise and interfering signals (e.g., jammers). This task is complicated by
two factors.
MEMORY:
In a neurobiological context, memory refers to the relatively enduring neural
alterations induced by the interaction of an organism with its environment.
Memory may be divided into “short-term” and “long-term” memory, depending on
the retention time.
In this section we study an associative memory that offers the following
characteristics:
(Otherwise the memory would have to be exceptionally large for it to
accommodate the storage of a large number of patterns, and stored patterns
may interfere with each other.) There is therefore the distinct possibility
for the memory to make errors during the recall process.