
SOFT COMPUTING IT-701

Unit II- Supervised Learning: Perceptron learning, Single layer/multilayer, Adaline,


Madaline, Back propagation network, RBFN, Application of Neural network in
forecasting, data compression and image compression.

Supervised Machine Learning


 Supervised learning is the type of machine learning in which machines are trained using well-"labelled" training data, and on the basis of that data, machines predict the output. Labelled data means that some input data is already tagged with the correct output.

 In supervised learning, the training data provided to the machines works as the supervisor that teaches the machines to predict the output correctly. It applies the same concept as a student learning under the supervision of a teacher.

 Supervised learning is a process of providing input data as well as correct output data to the machine
learning model. The aim of a supervised learning algorithm is to find a mapping function to map
the input variable(x) with the output variable(y).

 In the real world, supervised learning can be used for Risk Assessment, Image Classification, Fraud Detection, Spam Filtering, etc.

How Does Supervised Learning Work?


In supervised learning, models are trained using a labelled dataset, where the model learns about each type of data. Once the training process is completed, the model is tested on held-out test data (data not used during training), and then it predicts the output.

The working of supervised learning can be easily understood by the following example:
The machine is first trained on all types of shapes, and when it encounters a new shape, it classifies the shape on the basis of its number of sides and predicts the output.

Steps Involved in Supervised Learning:

o First, determine the type of training dataset.

o Collect/gather the labelled training data.
o Split the dataset into a training set, a test set, and a validation set.
o Determine the input features of the training dataset, which should carry enough information for the model to accurately predict the output.
o Determine a suitable algorithm for the model, such as a support vector machine or a decision tree.
o Execute the algorithm on the training set. Sometimes a validation set (a subset of the training data) is needed for tuning the control parameters.
o Evaluate the accuracy of the model on the test set. If the model predicts the correct outputs, it is accurate.

Types of supervised Machine learning Algorithms:


Supervised learning can be further divided into two types of problems:

1. Regression

Regression algorithms are used if there is a relationship between the input variable and the output
variable. It is used for the prediction of continuous variables, such as Weather forecasting, Market
Trends, etc. Below are some popular Regression algorithms which come under supervised learning:

o Linear Regression
o Regression Trees
o Non-Linear Regression
o Bayesian Linear Regression
o Polynomial Regression
2. Classification

Classification algorithms are used when the output variable is categorical, meaning the data falls into discrete classes such as Yes-No, Male-Female, True-False, etc. Spam filtering is a typical example. Below are some popular classification algorithms which come under supervised learning:
o Random Forest
o Decision Trees
o Logistic Regression
o Support vector Machines

Advantages of Supervised learning:

o With the help of supervised learning, the model can predict the output on the basis of prior experiences.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning model helps us to solve various real-world problems such as fraud detection, spam
filtering, etc.

Disadvantages of supervised learning:

o Supervised learning models are not suitable for handling complex tasks.
o Supervised learning cannot predict the correct output if the test data differs significantly from the training dataset.
o Training requires a lot of computation time.
o In supervised learning, we need sufficient knowledge about the classes of objects.

Perceptron Learning-
 In Machine Learning and Artificial Intelligence, the Perceptron is one of the most commonly encountered terms. It is a primary step in learning Machine Learning and Deep Learning technologies, and it consists of a set of weights, input values or scores, and a threshold. The Perceptron is a building block of an Artificial Neural Network. In the mid-20th century (1957), Mr. Frank Rosenblatt invented the Perceptron for performing certain calculations to detect patterns in input data. The Perceptron is a linear Machine Learning algorithm used for the supervised learning of various binary classifiers. This algorithm enables neurons to learn elements and process them one by one during training.

 The Perceptron is a Machine Learning algorithm for the supervised learning of various binary


classification tasks. Further, the Perceptron can also be understood as an Artificial Neuron or neural network unit that helps to detect certain input data computations in business intelligence.

 Perceptron model is also treated as one of the best and simplest types of Artificial Neural
networks. However, it is a supervised learning algorithm of binary classifiers. Hence, we can
consider it as a single-layer neural network with four main parameters, i.e., input values,
weights and Bias, net sum, and an activation function.

Basic Components of Perceptron


Mr. Frank Rosenblatt invented the perceptron model as a binary classifier which contains three main
components. These are as follows:
 Input Nodes or Input Layer:
 This is the primary component of Perceptron which accepts the initial data into the system for
further processing. Each input node contains a real numerical value.

 Weight and Bias:


 The weight parameter represents the strength of the connection between units. This is another most important parameter of the Perceptron components. Weight is directly proportional to the strength of the associated input neuron in deciding the output. Further, bias can be considered as the intercept in a linear equation.

 Activation Function:
 These are the final and important components that help to determine whether the neuron will fire or
not. Activation Function can be considered primarily as a step function.

Types of Activation functions:

o Sign function
o Step function, and
o Sigmoid function

The data scientist uses the activation function to take a subjective decision based on the problem statement and to form the desired outputs. The choice of activation function (e.g., Sign, Step, or Sigmoid) may differ across perceptron models depending on whether the learning process is slow or suffers from vanishing or exploding gradients. A small sketch of these functions is given below.
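The following Python snippet simply writes out the three activation functions listed above for reference; the convention of returning +1 (or 1) at exactly zero is an illustrative choice, not something fixed by the text.

```python
import numpy as np

def sign_fn(x):
    """Sign function: +1 for non-negative input, -1 otherwise (convention)."""
    return np.where(x >= 0, 1, -1)

def step_fn(x):
    """Step function: 1 for non-negative input, 0 otherwise."""
    return np.where(x >= 0, 1, 0)

def sigmoid_fn(x):
    """Sigmoid function: smooth squashing into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(sign_fn(x), step_fn(x), sigmoid_fn(x).round(3))
```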

How does Perceptron work?


In Machine Learning, Perceptron is considered as a single-layer neural network that consists of four main
parameters named input values (Input nodes), weights and Bias, net sum, and an activation function. The
perceptron model begins with the multiplication of all input values and their weights, then adds these values
together to create the weighted sum. Then this weighted sum is applied to the activation function 'f' to obtain
the desired output. This activation function is also known as the step function and is represented by 'f'.

This step function or Activation function plays a vital role in ensuring that output is mapped between
required values (0,1) or (-1,1). It is important to note that the weight of input is indicative of the strength of
a node. Similarly, an input's bias value gives the ability to shift the activation function curve up or down.

Perceptron model works in two important steps as follows:

Step-1

In the first step, multiply all input values by their corresponding weight values and then add them to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:

∑wi*xi = x1*w1 + x2*w2 + … + xn*wn

Add a special term called bias 'b' to this weighted sum to improve the model's performance.

∑wi*xi + b

Step-2

In the second step, an activation function is applied with the above-mentioned weighted sum, which gives
us output either in binary form or a continuous value as follows:

Y = f(∑wi*xi + b)
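A minimal NumPy sketch of these two steps; the input, weight, and bias values are illustrative and not taken from the text.

```python
import numpy as np

def perceptron_output(x, w, b):
    """Step 1: weighted sum plus bias; Step 2: step activation f."""
    weighted_sum = np.dot(w, x) + b       # ∑ wi*xi + b
    return 1 if weighted_sum > 0 else 0   # step function

# Illustrative values
x = np.array([1.0, 0.5])
w = np.array([0.4, -0.2])
b = 0.1
print(perceptron_output(x, w, b))  # -> 1
```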
Single Layer/Multilayer
Types of Perceptron Models
Based on the layers, Perceptron models are divided into two types. These are as follows:

1. Single-layer Perceptron Model


2. Multi-layer Perceptron model

Single Layer Perceptron Model:


 This is one of the simplest types of Artificial Neural Networks (ANNs). A single-layer perceptron model consists of a feed-forward network and also includes a threshold transfer function inside the model. The main objective of the single-layer perceptron model is to analyze linearly separable objects with binary outcomes.

 In a single-layer perceptron model, the algorithm has no recorded data to start from, so it begins with randomly allocated weight parameters. Further, it sums up all the weighted inputs. After adding all the inputs, if the total sum is more than a pre-determined value, the model is activated and shows the output value as +1.

 If the outcome matches the pre-determined (threshold) value, the performance of the model is considered satisfactory and the weights are not changed. However, discrepancies can arise when multiple weighted input values are fed into the model. Hence, to obtain the desired output and minimize the error, the weights must be adjusted (a small training sketch of this rule is given after this list).

 "Single-layer perceptron can learn only linearly separable patterns."

Multi-Layered Perceptron Model:


Like a single-layer perceptron model, a multi-layer perceptron model also has the same model structure
but has a greater number of hidden layers.

The multi-layer perceptron model is typically trained with the backpropagation algorithm, which executes in two stages as follows:

 Forward Stage: In the forward stage, the activations are computed starting from the input layer and terminating at the output layer.
 Backward Stage: In the backward stage, the weight and bias values are modified as per the model's requirement. In this stage, the error between the actual and desired output is propagated backward, starting from the output layer and ending at the input layer.
Hence, a multi-layer perceptron model can be considered as an artificial neural network with multiple layers in which the activation function does not have to remain linear, unlike a single-layer perceptron model. Instead of a linear function, activation functions such as sigmoid, TanH, ReLU, etc., can be used for deployment.
A multi-layer perceptron model has greater processing power and can process linear and non-linear
patterns. Further, it can also implement logic gates such as AND, OR, XOR, NAND, NOT, XNOR,
NOR.

Advantages of Multi-Layer Perceptron:

o A multi-layered perceptron model can be used to solve complex non-linear problems.


o It works well with both small and large input data.
o It helps us to obtain quick predictions after the training.
o It helps to obtain the same accuracy ratio with large as well as small data.
Disadvantages of Multi-Layer Perceptron:

o In Multi-layer perceptron, computations are difficult and time-consuming.


o In a multi-layer perceptron, it is difficult to determine how much each independent variable affects the dependent variable.
o The model functioning depends on the quality of the training.

Comparison: Single Layer Perceptron vs. Multilayer Perceptron


Feature | Single Layer Perceptron (SLP) | Multilayer Perceptron (MLP)
Layers | Single layer (input + output) | Multiple layers (input, hidden, output)
Problem Solving | Can only solve linearly separable problems (e.g., AND, OR) | Can solve both linearly and non-linearly separable problems (e.g., XOR)
Activation Function | Step or linear activation | Non-linear activation functions (Sigmoid, ReLU, etc.)
Training Algorithm | Perceptron learning rule (simple gradient descent) | Backpropagation with gradient descent
Complexity | Simple, less computationally expensive | More complex, computationally expensive
Learning Capacity | Limited to simple tasks | Capable of handling complex and large-scale tasks
Use Cases | Simple binary classification tasks | Complex classification, regression, and time-series prediction
Risk of Overfitting | Low (due to simplicity) | Higher risk (especially with deep architectures)
Training Time | Fast | Slower, especially for deep networks

Perceptron Function-
The perceptron function f(x) is obtained by multiplying the input vector 'x' by the learned weight vector 'w', adding the bias, and applying a threshold.

Mathematically, we can express it as follows:

f(x) = 1 if w·x + b > 0
f(x) = 0 otherwise

o 'w' represents real-valued weights vector


o 'b' represents the bias
o 'x' represents a vector of input x values.

Characteristics of Perceptron-
The perceptron model has the following characteristics.

1. Perceptron is a machine learning algorithm for supervised learning of binary classifiers.


2. In Perceptron, the weight coefficient is automatically learned.
3. Initially, weights are multiplied with input features, and the decision is made whether the
neuron is fired or not.
4. The activation function applies a step rule to check whether the weighted sum is greater than zero.
5. The linear decision boundary is drawn, enabling the distinction between the two linearly
separable classes +1 and -1.
6. If the added sum of all input values is more than the threshold value, it must have an output
signal; otherwise, no output will be shown.

Limitations of Perceptron Model


A perceptron model has limitations as follows:

o The output of a perceptron can only be a binary number (0 or 1) due to the hard limit transfer
function.
o Perceptron can only be used to classify the linearly separable sets of input vectors. If input
vectors are non-linear, it is not easy to classify them properly.
ADALINE, MADALINE
Architecture of ADALINE (Adaptive Linear Neuron)

ADALINE is one of the earliest neural networks developed by Bernard Widrow and Ted Hoff in 1960.
It stands for Adaptive Linear Neuron or Adaptive Linear Element. The ADALINE architecture is
similar to a simple perceptron but with a key difference in the learning rule and activation function. It is
used for classification and prediction tasks.
Architecture: (figure omitted) the input units connect directly, through adjustable weights and a bias, to a single output unit that computes a weighted sum (linear activation).

Key Characteristics:

 Linear Decision Boundary: Since it uses a linear activation function, it can only solve linearly
separable problems.
 Real-Valued Output: The output is not binary; instead, it's a real value determined by the
weighted sum of inputs.

Algorithm:
Step 1: Initialize the weights (not zero, but small random values). Set the learning rate α.
Step 2: While the stopping condition is false, do steps 3 to 7.
Step 3: For each training pair, perform steps 4 to 6.
Step 4: Set the activation of the input units: xi = si for i = 1 to n.
Step 5: Compute the net input to the output unit:

yin = b + ∑ (i = 1 to n) xi*wi

Here, b is the bias and n is the total number of input units.

Step 6: Update the weights and bias for i = 1 to n:
wi(new) = wi(old) + α(t − yin)xi
b(new) = b(old) + α(t − yin)
and calculate the error. When the predicted output and the true value are the same, the weights do not change.
Step 7: Test the stopping condition. The stopping condition may be that the weight changes become very small or that no change occurs.
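A compact sketch of the ADALINE (LMS / delta rule) steps above, using bipolar inputs and targets for the AND function; the learning rate, epoch limit, and stopping threshold are illustrative choices.

```python
import numpy as np

# ADALINE trained with the LMS (delta) rule on bipolar AND.
X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([1, -1, -1, -1], dtype=float)

rng = np.random.default_rng(0)
w = rng.uniform(-0.1, 0.1, size=2)      # Step 1: small random weights
b = rng.uniform(-0.1, 0.1)
alpha = 0.1

for epoch in range(50):                 # Step 2: loop until stopping condition
    max_dw = 0.0
    for x_i, t_i in zip(X, t):          # Steps 3-4
        y_in = b + np.dot(w, x_i)       # Step 5: net input (linear activation)
        err = t_i - y_in
        w += alpha * err * x_i          # Step 6: LMS weight update
        b += alpha * err
        max_dw = max(max_dw, float(np.max(np.abs(alpha * err * x_i))))
    if max_dw < 1e-4:                   # Step 7: stop when changes are small
        break

print(np.sign(b + X @ w))               # -> [ 1. -1. -1. -1.] for AND
```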

Architecture of MADALINE (Multiple ADALINEs)

MADALINE (Multiple ADAptive LINear Elements) is a more complex neural network architecture than ADALINE, designed to handle non-linear decision boundaries. It was also developed by Bernard Widrow and his students, beginning around 1959. MADALINE is essentially a network of multiple ADALINE units.

 The MADALINE (supervised learning) model consists of many Adalines in parallel with a single output unit. The Adaline layer is present between the input layer and the Madaline (output) layer; hence the Adaline layer is a hidden layer. The weights between the input layer and the hidden layer are adjusted, while the weights between the hidden layer and the output layer are fixed.
 It may use the majority-vote rule, in which case the output is either true or false. The Adaline and Madaline layer neurons have a bias of '1' connected to them. The use of multiple Adalines helps counter the problem of non-linear separability.
Architecture Components:

1. Input Layer:
o Similar to ADALINE, the input layer consists of multiple input neurons that receive the
input features.

2. Hidden Layer (Multiple ADALINE units):


o The key distinction between MADALINE and ADALINE is the presence of a hidden
layer.
o Each neuron in the hidden layer is an ADALINE unit, meaning it performs a weighted
sum of inputs and applies a linear activation function.

3. Output Layer:
o The output layer consists of one or more neurons, and each neuron aggregates the
outputs from the hidden layer neurons.

4. Non-linear Decision Boundary:


o MADALINE can solve non-linearly separable problems because the hidden layer allows
for non-linear transformation of the input data, enabling it to handle more complex tasks
than ADALINE.

5. Learning Rule (Madaline Rule I or Rule II):


o MADALINE uses a unique learning rule, known as Madaline Rule I or Madaline Rule
II, which is a heuristic training algorithm.
o Madaline Rule I modifies weights based on misclassified patterns in an attempt to
minimize the global error.

Madaline Rule II is more sophisticated and works by:

o Identifying which hidden layer unit’s weight update will produce the largest reduction in
error.
o Updating the weights based on which neuron’s update leads to the largest reduction in
error.

Key Characteristics:

 Non-linear Classification: MADALINE can solve non-linear problems due to its multiple
ADALINE units.
 Threshold Logic: Uses a step function (sign function) for final output classification.
 Multilayered: Unlike ADALINE, MADALINE consists of multiple layers, making it more
versatile for complex pattern recognition tasks.

Algorithm:
Step 1: Initialize the weights and set the learning rate α. The weights from the Adaline units to the output unit and the output bias are fixed:
v1 = v2 = 0.5, b3 = 0.5
The other weights may be small random values.
Step 2: While the stopping condition is false, do steps 3 to 9.
Step 3: For each training pair, perform steps 4 to 8.
Step 4: Set the activation of the input units: xi = si for i = 1 to n.
Step 5: Compute the net input of each Adaline unit:
zin1 = b1 + x1*w11 + x2*w21
zin2 = b2 + x1*w12 + x2*w22
Step 6: Find the output of each Adaline unit using the activation function given below:
f(z) = 1 if z ≥ 0; −1 if z < 0
z1 = f(zin1)
z2 = f(zin2)
Step 7: Calculate the net input to the output unit:
yin = b3 + z1*v1 + z2*v2
Apply the activation function to get the output of the net:
y = f(yin)
Step 8: Find the error and update the weights:
If t = y, no weights are updated.
If t ≠ y and t = 1, update the weights on the zj unit whose net input is closest to 0:
wij(new) = wij(old) + α(t − zinj)xi
bj(new) = bj(old) + α(t − zinj)
If t ≠ y and t = −1, update the weights (using the same rule) on all units zk that have a positive net input.
Step 9: Test the stopping condition: no (or negligible) weight change, or a set number of epochs completed.
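The following sketch follows the Madaline Rule I (MRI) steps above for a network of two Adaline units, trained here on the bipolar XOR problem. The fixed output weights match Step 1; the learning rate, random seed, and epoch count are illustrative, and convergence of this heuristic rule depends on the initial weights.

```python
import numpy as np

def f(z):                               # bipolar step activation (Step 6)
    return 1.0 if z >= 0 else -1.0

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float)
t = np.array([-1, 1, 1, -1], dtype=float)      # bipolar XOR targets

rng = np.random.default_rng(1)
W = rng.uniform(-0.1, 0.1, size=(2, 2))  # W[i, j]: input i -> Adaline j
bz = rng.uniform(-0.1, 0.1, size=2)      # Adaline biases b1, b2
v, b3 = np.array([0.5, 0.5]), 0.5        # fixed output weights/bias (Step 1)
alpha = 0.5

for epoch in range(100):
    for x, ti in zip(X, t):
        z_in = bz + x @ W                # Step 5: net inputs of Adaline units
        z = np.array([f(zi) for zi in z_in])
        y = f(b3 + z @ v)                # Step 7: network output
        if ti == y:                      # Step 8: no update if already correct
            continue
        if ti == 1:                      # update the unit with net input closest to 0
            j = int(np.argmin(np.abs(z_in)))
            W[:, j] += alpha * (ti - z_in[j]) * x
            bz[j] += alpha * (ti - z_in[j])
        else:                            # ti == -1: update all units with positive net input
            for j in np.where(z_in > 0)[0]:
                W[:, j] += alpha * (ti - z_in[j]) * x
                bz[j] += alpha * (ti - z_in[j])

z = np.where(bz + X @ W >= 0, 1.0, -1.0)
y = np.where(b3 + z @ v >= 0, 1.0, -1.0)
print(y)   # ideally [-1.  1.  1. -1.] for XOR (depends on the run)
```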

Comparison of ADALINE and MADALINE:


Feature | ADALINE | MADALINE
Layers | Single-layer (no hidden layers) | Multilayer (input, hidden, output)
Decision Boundary | Linear | Non-linear
Activation Function | Linear activation | Linear in hidden layer, sign in output
Learning Rule | LMS (Least Mean Square) | Madaline Rule I or II
Complexity | Simple | More complex
Problem Type | Linearly separable problems | Non-linearly separable problems
Use Case | Simple classification, regression tasks | Complex pattern recognition tasks

Applications of ADALINE and MADALINE:

 ADALINE: Used for linear classification problems, signal processing, and adaptive filtering.
 MADALINE: Suitable for more complex tasks such as speech recognition, pattern
classification, and multi-class problems where non-linear decision boundaries are required.

Both architectures are historically significant in the evolution of neural networks and contributed
foundational concepts to more advanced neural network architectures used today.

Back propagation network


Back propagation network- A Backpropagation Network (BPN) is a type of multilayer feedforward
neural network that uses the backpropagation algorithm for training. Backpropagation is a supervised
learning algorithm that minimizes the error by adjusting the weights of the network in response to the
difference between the predicted output and the actual output.

Key Components of Backpropagation Network:

1. Input Layer:
o The input layer consists of neurons that receive the input data and pass it forward to the
next layer. Each neuron in this layer represents one feature from the input dataset.

2. Hidden Layer(s):
o The hidden layer(s) consist of neurons that transform the input features into a more
abstract representation. A backpropagation network can have one or more hidden layers
depending on the complexity of the problem.
o Each neuron in the hidden layer applies an activation function (such as the sigmoid,
tanh, or ReLU) to its weighted sum of inputs.

3. Output Layer:
o The output layer generates the final prediction of the network. The number of neurons in
this layer corresponds to the number of output variables (for example, one for
regression, or multiple for classification).

4. Weights and Biases:


o Each connection between neurons has an associated weight, which determines the
strength of the connection. Biases are added to each neuron to help shift the activation
function.
o During training, the backpropagation algorithm adjusts these weights and biases to
minimize the network's error.

Working of Backpropagation:

1. Forward Propagation:
o The input data is passed through the input layer, hidden layer(s), and output layer.
o Each neuron computes a weighted sum of its inputs, applies the activation function, and
passes the result to the next layer.
o The output of the network is compared to the actual output (the target), and an error is
calculated (commonly using a loss function like mean squared error or cross-entropy).

2. Backward Propagation (Backpropagation):


o The backpropagation algorithm computes the gradient of the loss function with respect
to each weight in the network. This is done using the chain rule of calculus.
o Starting from the output layer, the error is propagated backward through the network,
layer by layer, updating the weights in each layer to minimize the error.

3. Weight Updates:
o The weights are updated in the direction that minimizes the error. The weight update is usually performed using an optimization algorithm like gradient descent, where each weight is adjusted based on the learning rate η and the gradient of the error:

w(new) = w(old) − η · ∂E/∂w

4. Repeat:
o The forward and backward passes are repeated for multiple iterations (epochs) until the network's error converges to a minimum, or it reaches a stopping criterion (such as a fixed number of epochs or a desired accuracy). A compact numerical sketch of these steps is given below.
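The sketch below implements the forward pass, backward pass (chain rule), and gradient-descent weight updates for a small 2-3-1 network with sigmoid units (matching the example architecture given later in this section), trained on XOR. The learning rate, epoch count, loss, and random seed are illustrative choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)      # XOR targets

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 1, (2, 3)), np.zeros(3)       # input -> hidden
W2, b2 = rng.normal(0, 1, (3, 1)), np.zeros(1)       # hidden -> output
eta = 0.5                                             # learning rate

for epoch in range(10000):
    # Forward propagation
    H = sigmoid(X @ W1 + b1)          # hidden layer activations
    Y = sigmoid(H @ W2 + b2)          # network output

    # Backward propagation (mean squared error, chain rule)
    dY = (Y - T) * Y * (1 - Y)        # delta at the output layer
    dH = (dY @ W2.T) * H * (1 - H)    # delta propagated to the hidden layer

    # Gradient-descent weight updates
    W2 -= eta * H.T @ dY;  b2 -= eta * dY.sum(axis=0)
    W1 -= eta * X.T @ dH;  b1 -= eta * dH.sum(axis=0)

print(Y.round(2))   # outputs should approach [[0], [1], [1], [0]]
```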

Key Features of Backpropagation Network:

 Supervised Learning: Backpropagation networks are trained with labeled data, where the
correct output (target) is provided for each input.
 Error Minimization: The goal of backpropagation is to reduce the difference (error) between
the network’s predicted output and the actual output by adjusting the network's weights.
 Multiple Layers: A backpropagation network typically contains at least one hidden layer,
which allows it to capture more complex relationships in the data (multilayer perceptron, MLP).
 Non-linear Activation Functions: Neurons use activation functions such as sigmoid, tanh, or
ReLU to introduce non-linearity into the model, allowing the network to learn complex, non-
linear mappings.
Applications of Backpropagation Networks:

1. Classification:
o Backpropagation is widely used for classification tasks such as image classification,
text classification, and speech recognition. The network learns to assign input data to
different categories based on its features.

2. Regression:
o BPNs are also applied in regression problems where the goal is to predict a continuous
output variable, such as predicting house prices, stock market trends, or demand
forecasting.

3. Pattern Recognition:
o Backpropagation networks are used in applications like handwriting recognition,
fingerprint recognition, and facial recognition due to their ability to learn patterns from
the input data.

4. Time Series Prediction:


o Although backpropagation networks are generally feedforward, they can be adapted for
time series prediction problems by including past time steps as input, helping in
predicting future values.

5. Medical Diagnosis:
o In medical applications, backpropagation networks can help in disease diagnosis by
analyzing patient data and learning patterns that indicate the presence of certain
conditions.
Advantages:

 Ability to Learn Complex Patterns: BPNs can learn highly non-linear relationships between
input and output data, making them suitable for a wide range of tasks.
 Generalization: After training, BPNs can generalize and make accurate predictions on new,
unseen data.
 Versatility: The architecture can be applied to classification, regression, and other predictive
tasks.

Disadvantages:

 Slow Convergence: Backpropagation can be slow to converge, especially with deep networks,
as it may require many iterations to reach an optimal solution.
 Local Minima: The network can get stuck in local minima of the loss function, leading to
suboptimal solutions.
 Sensitive to Hyperparameters: The performance of a BPN is sensitive to hyperparameters
such as the learning rate, number of hidden layers, and number of neurons. These need to be
carefully tuned.
 Overfitting: If the network is too large (too many neurons or layers), it may overfit the training
data and fail to generalize well to unseen data.

Example Architecture:

For a BPN used in a simple binary classification task:

1. Input Layer: 2 neurons (one for each feature).


2. Hidden Layer: 3 neurons with a non-linear activation function (e.g., sigmoid).
3. Output Layer: 1 neuron with a sigmoid activation function to output a probability for the
binary classification.

Radial Basis Function Network

Radial Basis Function Network(RBFN)- A Radial Basis Function Network (RBFN) is a type of
artificial neural network that uses radial basis functions as activation functions. It is commonly used for
function approximation, time series prediction, classification, and system control due to its simple
structure and fast learning capabilities. The radial basis function (RBF) is a classification and functional
approximation neural network developed by M.J.D. Powell.

The network uses common nonlinearities such as sigmoidal and Gaussian kernel functions. The Gaussian functions are also used in regularization networks. The response of such a function is positive for all values of y; the response decreases towards 0 as |y| → ∞.

The Gaussian function is generally defined as

f(y) = e^(−y²)

The derivative of this function is given by:

f'(y) = −2y·e^(−y²) = −2y·f(y)
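A quick, purely illustrative numerical check of this function and its derivative:

```python
import numpy as np

def gaussian(y):
    """Gaussian RBF: f(y) = exp(-y^2)."""
    return np.exp(-y**2)

def gaussian_derivative(y):
    """f'(y) = -2y * f(y)."""
    return -2.0 * y * gaussian(y)

y = np.linspace(-3, 3, 7)
print(gaussian(y))              # peaks at 1 when y = 0, decays towards 0 as |y| grows
print(gaussian_derivative(y))
```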

ARCHITECTURE OF RBFN

Key Components of RBFN:

1. Input Layer:
o The input layer consists of input neurons that pass the input features directly to the
hidden layer without any transformation.

2. Hidden Layer:
o The hidden layer uses radial basis functions (usually Gaussian functions) as the
activation functions.
o Each neuron in the hidden layer has a center and a spread (radius). The radial basis
function computes the distance between the input vector and the center of the neuron.

3. Output Layer:
o The output layer is typically a linear combination of the hidden layer outputs. It can have
one or more neurons depending on the task (regression or classification).
o The weights between the hidden layer and output layer are typically learned through
linear regression or another optimization technique.

Working of RBFN:

1. Training Phase:
o In the training process, the centers c of the radial basis functions are determined. These can be chosen randomly, through k-means clustering, or by other methods.
o The spread parameter σ for each neuron is set based on the distance between the centers, or manually.
o Once the centers and spreads are fixed, the weights connecting the hidden layer to the output layer are learned using methods like least squares or gradient descent.

2. Prediction Phase:
o For a given input, the network calculates the distance between the input and the centers
of each neuron in the hidden layer.
o The radial basis function (such as Gaussian) is applied to the distances to compute the
hidden layer outputs.
o Finally, the weighted sum of these hidden layer outputs is used to generate the output.

Key Characteristics:

 Localized learning: The output of a hidden neuron only depends on the inputs near its center,
making the RBFN focus on local regions of the input space.
 Fast training: Since the weights between the hidden and output layers are often learned
linearly, training can be much faster compared to traditional feedforward networks.
 Universal approximator: RBFNs are proven to be capable of approximating any continuous
function given enough hidden neurons.

Applications:

 Function approximation: RBFNs can approximate complex, non-linear functions and are often
used in curve fitting and interpolation problems.
 Time-series prediction: RBFNs can predict future values in time-series data by learning
patterns from historical data.
 Classification: RBFNs can be used in pattern recognition and classification problems by
assigning classes based on the outputs of the network.
 Control systems: RBFNs are applied in robotics and control systems where function
approximation and fast learning are required.

Advantages:

 Simplicity: RBFNs have a simpler structure compared to other networks like multilayer
perceptrons (MLPs), making them easier to design and train.
 Fast learning: Since the weights between the hidden and output layers are often determined
using linear regression, training is relatively quick.
 Localized learning: RBFNs focus on local regions of the input space, making them effective in
tasks where local data patterns matter.

Disadvantages:

 Scalability: RBFNs can require a large number of neurons for high-dimensional input data,
which may lead to higher computational costs.
 Center and spread selection: The performance of the RBFN is highly dependent on how the
centers and spreads of the radial basis functions are chosen, which can sometimes be difficult to
tune.
 Sensitive to outliers: The performance may degrade if there are outliers in the training data, as
these affect the selection of centers and the spread.

Example Architecture:

For an RBFN used for classification with two input features, three radial basis neurons in the hidden
layer, and one output neuron:

1. Input Layer: 2 neurons (one for each feature).


2. Hidden Layer: 3 neurons with Gaussian radial basis functions.
3. Output Layer: 1 neuron that sums the weighted outputs of the hidden layer.

The network takes the input features, applies the radial basis functions in the hidden layer, and
generates a final output, which is the predicted class.

Training Algorithm:

Step 0: Set the weight to small random values.

Step 1: Perform steps 2-8 when the stopping condition is false.

Step 2: Perform steps 3-7 for each input.


Step 3: Each input unit (xi, for i = 1 to n) receives the input signal and transmits it to the next (hidden) layer unit.

Step 4: Calculate the radial basis function.

Step 5: Select the centers for the radial basis function. The centers are selected from the set of input
vectors. It should be noted that a sufficient number of centers have to be selected to ensure adequate
sampling of the input vector space.

Step 6: Calculate the output from each hidden layer (RBF) unit:

vi(x) = exp[ −∑j (xji − x̂ji)² / (2σi²) ]

where x̂ji is the center of the ith RBF unit for input variable j, σi the width of the ith RBF unit, and xji the jth variable of the input pattern.

Step 7: Calculate the output of the neural network:

ynet = ∑ (i = 1 to k) wim·vi(xi) + w0

where k is the number of hidden layer nodes (RBF units), ynet the output value of the mth node in the output layer for the nth incoming pattern, wim the weight between the ith RBF unit and the mth output node, and w0 the bias term at the nth output node.

Step 8: Calculate the error and test for the stopping condition. The stopping condition may be number
of epochs or to a certain extent weight change.
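Below is a sketch of these training steps on a simple 1-D curve-fitting task: the centers are selected from the input vectors (Step 5), the hidden outputs use a Gaussian RBF (Step 6), and the output weights are fitted in one shot by least squares rather than by iterative updates. The number of centers and the width σ are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(0, 2 * np.pi, 50).reshape(-1, 1)     # 1-D inputs
t = np.sin(X).ravel()                                 # target function

k = 10                                                # number of RBF (hidden) units
centers = X[rng.choice(len(X), k, replace=False)]    # Step 5: centers from input vectors
sigma = 1.0                                           # width of each RBF unit

def hidden_outputs(X):
    # Step 6: v_i(x) = exp(-||x - c_i||^2 / (2 * sigma^2))
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

V = hidden_outputs(X)
V1 = np.hstack([V, np.ones((len(X), 1))])             # extra column for the bias w0
w, *_ = np.linalg.lstsq(V1, t, rcond=None)            # Step 7: linear output weights

y_net = V1 @ w
print(np.max(np.abs(y_net - t)))                      # small fitting error expected
```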

Application of Neural network in forecasting, data compression and image compression


Neural networks provide powerful, flexible tools for forecasting, data compression, and image compression. Their
ability to model non-linear relationships, reduce data dimensionality, and generate efficient representations makes them
ideal for these tasks. By leveraging advanced architectures like RNNs, autoencoders, and CNNs, neural networks
continue to push the boundaries of what is possible in predictive modeling and data efficiency.

1. Neural Networks in Forecasting

Neural networks are widely used for predictive tasks, particularly in time series forecasting. The ability
of neural networks to learn complex, non-linear patterns makes them suitable for various types of
forecasting.
Applications:

 Financial Forecasting: Neural networks, especially Recurrent Neural Networks (RNNs) and
Long Short-Term Memory (LSTM) networks, are used to predict stock prices, market trends,
and other financial metrics based on historical data.
 Weather Forecasting: Neural networks can model weather patterns by analyzing historical
weather data, predicting temperature, rainfall, and other meteorological variables.
 Demand Forecasting: Businesses use neural networks for inventory and sales forecasting to
predict future demand and optimize supply chain operations.
 Energy Load Forecasting: In the energy sector, neural networks help predict future energy
consumption based on historical load data, helping manage grid demand efficiently.

Benefits:

 Non-linear pattern recognition: Neural networks excel at capturing intricate, non-linear


relationships between input variables.
 Adaptability: They can continuously learn from new data, making them flexible to dynamic
environments.

2. Neural Networks in Data Compression

Neural networks can be applied to compress data by learning efficient data representations that require
fewer bits while preserving the original information. This is crucial in scenarios where data storage and
transmission costs need to be minimized.
Applications:

 Autoencoders for Data Compression: An autoencoder is a type of neural network used for unsupervised learning that can compress data into a lower-dimensional representation (encoding) and then reconstruct it (decoding). It consists of an encoder that compresses the input and a decoder that reconstructs it (a minimal sketch is given after this list).
o Compact Representations: Autoencoders are used to compress high-dimensional data into a latent-space representation while preserving the important information, which is useful for compressing text, sensor data, etc.
o Dimensionality Reduction: Neural networks reduce data size by learning a smaller
representation of the original dataset, making data more manageable for tasks like
visualization, storage, or faster processing.
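A minimal autoencoder sketch, assuming TensorFlow/Keras is available; the layer sizes, activations, placeholder random data, and training settings are all illustrative rather than a recommended configuration.

```python
import numpy as np
import tensorflow as tf   # assumes TensorFlow/Keras is installed

# Encoder compresses 64-dimensional inputs into an 8-dimensional code;
# decoder reconstructs the original 64 dimensions.
inputs = tf.keras.Input(shape=(64,))
code = tf.keras.layers.Dense(8, activation="relu")(inputs)      # encoder
outputs = tf.keras.layers.Dense(64, activation="linear")(code)  # decoder

autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")

X = np.random.rand(1000, 64).astype("float32")    # placeholder training data
autoencoder.fit(X, X, epochs=10, batch_size=32, verbose=0)

encoder = tf.keras.Model(inputs, code)            # reuse the trained encoder
compressed = encoder.predict(X[:5])               # 8 numbers instead of 64 per sample
print(compressed.shape)                           # (5, 8)
```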

Benefits:

 Efficient Compression: Neural networks can learn the most important features, leading to
highly efficient data compression.
 Improved Performance: Compression models built on neural networks often outperform
traditional algorithms in terms of compression ratios and reconstruction quality.

3. Neural Networks in Image Compression

Neural networks, particularly deep learning methods, have revolutionized image compression
techniques by offering more efficient and intelligent ways to compress images with minimal loss in
quality.
Applications:

 Deep Autoencoders for Image Compression: Similar to data compression, autoencoders are
used to compress image data. They learn to represent the most essential features of an image in
a compact, low-dimensional format and reconstruct the original image during decompression.
 Convolutional Neural Networks (CNNs) for Image Compression: CNNs can be used to
detect patterns and features in images, enabling them to generate efficient image
representations. Deep learning-based compression methods often outperform traditional
algorithms like JPEG in preserving image quality at higher compression ratios.
 Generative Adversarial Networks (GANs) for Super-resolution: GANs can be used in
conjunction with compression techniques to generate high-quality reconstructions from
compressed data, which is especially useful in tasks like image super-resolution and restoration.
 Variational Autoencoders (VAEs): VAEs are a type of autoencoder that can generate new
data from compressed representations, making them suitable for image compression tasks where
the balance between reconstruction quality and compression is crucial.

Benefits:

 High compression efficiency: Neural networks often outperform conventional image


compression algorithms (like JPEG) by achieving higher compression ratios while maintaining
quality.
 Customizability: Neural networks can be trained for specific tasks or datasets, making them
more versatile for different compression applications.
