Neural networks are inspired by biological neurons and are used to learn relationships in data. The document defines an artificial neural network as a large number of interconnected processing elements called neurons that learn from examples. It outlines the key components of artificial neurons including weights, inputs, summation, and activation functions. Examples of neural network architectures include single-layer perceptrons, multi-layer perceptrons, convolutional neural networks, and recurrent neural networks. Common applications of neural networks include pattern recognition, data classification, and processing sequences.
The document discusses neural network architecture and components. It explains that a neural network consists of nodes that represent neurons, similar to the human brain. Data is fed through an input layer, processed through hidden layers, and output at the output layer. Key components include the neuron/node, weights, biases, and activation functions. Common activation functions are sigmoid, tanh, ReLU, and softmax, each suited for different types of problems. The document provides details on each of these components and how they enable neural networks to learn from data.
This PPT presents the entire content in brief; my book on ANN, titled "SOFT COMPUTING" (Watson Publication), can be referred to alongside it.
2. Overview
● What is a Neural Network; Artificial Neural Networks: biological neurons and their working
● Simulation of biological neurons to problem solving
● Learning rules and various activation functions (sigmoid, tanh, ReLU and softmax)
● McCulloch Pitts Neuron, Concept of Linear Separability
● Single layer Perceptron
● Feedforward Neural Networks
● Back Propagation networks
● Character Recognition Application
● Stochastic Gradient Descent
● Immunological computing
3. Introduction
● What is a Neural Network?
● A method of computing based on the interaction of multiple connected processing
elements.
● A powerful technique for solving many real-world problems.
● The ability to learn from experience in order to improve performance.
● At the core of a neural network is a mathematical model that is used to make predictions
or decisions based on input data.
● The neurons in a neural network are connected by weighted links that allow them to
communicate with one another.
● There are several types of neural networks, including feedforward neural networks,
convolutional neural networks, and recurrent neural networks.
4. Basics of Neural Network
● A neuron is a cell that carries electrical impulses; neurons are the basic units of
the nervous system.
● Every neuron is made of a cell body (also called a soma), dendrites and an
axon. Dendrites and axons are nerve fibers. There are about 86 billion neurons
in the human brain, which comprises roughly 10% of all brain cells.
● Neurons connect to one another and to tissues. They do not touch; instead they
form tiny gaps called synapses. These gaps can be chemical synapses
or electrical synapses and pass the signal from one neuron to the next.
● Dendrite — It receives signals from other neurons.
● Soma (cell body) — It sums all the incoming signals to generate the input.
● Axon — When the sum reaches a threshold value, the neuron fires and the signal
travels down the axon to the other neurons.
● Synapses — The points of interconnection of one neuron with other neurons.
The amount of signal transmitted depends upon the strength (synaptic weights)
of the connections.
6. Comparing ANN and BNN
● As ANNs borrow this concept from biological neural networks (BNNs), there are a lot of similarities, though there are differences too.
● The similarities are summarized in the following table.
9. Learning
• Learning = learning by adaptation
• The objective of learning in biological organisms is to improve their
survival and reproductive success by adapting to changing environmental
conditions and developing new strategies for survival.
• Learning in biological organisms allows them to:
1. Respond to environmental changes
2. Improve their performance
3. Develop new behaviors
4. Enhance communication
10. Types of Learning in Neural Network
● Supervised Learning — Supervised learning is a type of machine
learning where the algorithm is trained on labeled data, which means that
the data is already categorized into specific classes or categories.
● Unsupervised Learning — Unsupervised learning is a type of machine
learning where the algorithm is trained on unlabeled data, which means
that the data is not categorized into specific classes or categories. The goal
of unsupervised learning is to find patterns and relationships in the data
without any prior knowledge of what the data represent.
● Reinforcement Learning — Reinforcement learning is a type of machine
learning where an agent learns to make decisions in an environment by
receiving feedback in the form of rewards or penalties.
11. Model of Artificial
Neural Network
● Receives n-inputs
● Multiplies each input by its
weight
● Applies activation function
to the sum of results
● Outputs result
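A minimal sketch of this model in Python (the three inputs, the weights, and the choice of a sigmoid activation are illustrative assumptions, not taken from the slides):

```python
import math

def neuron(inputs, weights, bias=0.0):
    """A single artificial neuron: weighted sum of the inputs, then an activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias   # multiply each input by its weight and sum
    return 1.0 / (1.0 + math.exp(-total))                        # sigmoid activation applied to the sum

# Example: a neuron with 3 inputs
print(neuron([0.5, 0.1, 0.9], [0.4, -0.6, 0.2]))   # ≈ 0.58
```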
12. Activation Functions
● The activation function is a mathematical “gate” in between
the input feeding the current neuron and its output going to
the next layer. They basically decide whether the neuron should
be activated or not.
● Activation functions in a neural network (NN) are mathematical
functions that are applied to the output of a neuron in the
network.
● The activation function introduces non-linearity into the
network and helps to produce a non-linear decision boundary
that can be used to model complex relationships in the input
data.
14. Why do we use an activation function ?
If we did not have the activation function, the weights and bias would simply
perform a linear transformation.
A linear equation is simple to solve but is limited in its capacity to solve
complex problems and has less power to learn complex functional
mappings from data.
A neural network without an activation function is just a linear regression
model.
Generally, neural networks use non-linear activation functions, which can
help the network learn complex data, compute and learn almost any function
representing a question, and provide accurate predictions.
15. Why use a non-linear activation function?
If we were to use a linear activation function or identity activation
function, then the neural network would just output a linear
function of the input.
And so, no matter how many layers our neural network has, it will
still behave just like a single-layer network, because composing
these layers gives us just another linear function, which is not expressive
enough to model complex data.
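This collapse can be checked directly. The sketch below (a hypothetical two-layer network with no activation, written with NumPy) shows that two stacked linear layers are exactly equivalent to a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))      # "hidden layer" weights, no activation applied
W2 = rng.normal(size=(2, 4))      # "output layer" weights, no activation applied
x = rng.normal(size=3)

two_layers = W2 @ (W1 @ x)        # output of the stacked layers
one_layer = (W2 @ W1) @ x         # a single layer whose weights are the product W2*W1
print(np.allclose(two_layers, one_layer))   # True: the stack collapses into one linear map
```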
17. Linear or Identity Activation Function
Equation: f(x) = x
Derivative: f’(x) = 1
Range: (-∞, +∞)
Two major problems:
1. Back-propagation is not possible — The derivative of the function
is a constant, and has no relation to the input, X. So it’s not possible to
go back and understand which weights in the input neurons can
provide a better prediction.
2. All layers of the neural network collapse into one — with linear
activation functions, no matter how many layers in the neural network,
the last layer will be a linear function of the first layer
18. Non-linear Activation Function
Modern neural network models use non-linear activation functions.
They allow the model to create complex mappings between the
network’s inputs and outputs, which are essential for learning
and modeling complex data, such as images, video, audio, and
data sets which are non-linear or have high dimensionality.
Almost any process imaginable can be represented as a
functional computation in a neural network, provided that the
activation function is non-linear.
19. Non-linear Activation Function
Non-linear functions address the problems of a linear activation
function:
They allow back-propagation because they have a derivative
function which is related to the inputs.
They allow “stacking” of multiple layers of neurons to create
a deep neural network. Multiple hidden layers of neurons are
needed to learn complex data sets with high levels of accuracy.
20. Activation Functions
● Some commonly used activation functions in NNs include:
● Sigmoid function: The sigmoid function is an S-shaped curve that maps any input value
to a value between 0 and 1. It is commonly used as the activation function in the output
layer of binary classification problems.
● ReLU (Rectified Linear Unit) function: The ReLU function maps any input value to 0 if
it is negative, and to the input value if it is positive. It is commonly used as the activation
function in the hidden layers of deep neural networks.
● Tanh (Hyperbolic tangent) function: The Tanh function is similar to the sigmoid
function, but it maps any input value to a value between -1 and 1. It is also commonly
used as an activation function in the hidden layers of neural networks.
● Softmax function: The softmax function is used in the output layer of multi-class
classification problems. It maps the output values of each neuron to a probability
distribution over the classes.
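For reference, here is one straightforward NumPy rendering of these four functions (a sketch only; real frameworks ship numerically hardened versions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))       # squashes any input into (0, 1)

def tanh(x):
    return np.tanh(x)                     # squashes any input into (-1, 1)

def relu(x):
    return np.maximum(0.0, x)             # 0 for negative inputs, identity for positive inputs

def softmax(x):
    e = np.exp(x - np.max(x))             # subtract the max for numerical stability
    return e / e.sum()                    # a probability distribution over the classes

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z), tanh(z), relu(z), softmax(z), sep="\n")
```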
21. Sigmoid Function
● It is a function which is plotted as ‘S’ shaped
graph.
● Equation: A = 1 / (1 + e^(-x))
● Derivative: f’(x) = s*(1-s), where s is the sigmoid output
● Nature: Non-linear. Notice that for x values
between -2 and 2, the curve is very steep.
This means small changes in x bring about
large changes in the value of Y.
● Value Range: 0 to 1
● Uses: Usually used in the output layer of a
binary classification, where the result is either 0
or 1. Since the sigmoid's value lies between 0
and 1 only, the result can easily be predicted
to be 1 if the value is greater than 0.5
and 0 otherwise.
22. Sigmoid Function
Advantages:
1. The function is differentiable. That means we can find the slope
of the sigmoid curve at any point.
2. Output values are bounded between 0 and 1, normalizing the output of
each neuron.
Disadvantages:
1. Vanishing gradient — For very large or very small inputs, the
sigmoid curve flattens.
This means the gradient (slope) becomes almost zero.
With gradients so small, the neural network struggles to update
its weights, slowing or even stopping learning.
2. Slow convergence — Due to the vanishing gradient, the training
process becomes very slow, as updates to the model are minimal.
3. Outputs are not zero-centered — The sigmoid output ranges from 0 to 1,
so it is always positive.
This causes issues during weight updates, as the gradients can push
all weights in the same direction, making optimization harder.
4. Computationally expensive.
23. Tanh Function
• The activation that almost always works better than the sigmoid
function is the tanh function, also known as the hyperbolic tangent
function. It is actually a scaled and shifted version of the
sigmoid function. Both are similar and can be derived from each
other.
• Equation: A = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
• Value Range :- -1 to +1
• Derivative: (1 - a²), where a = tanh(x)
• Nature :- non-linear
• Uses :- Usually used in the hidden layers of a neural network, as its
values lie between -1 and 1, so the mean of the hidden-layer outputs
comes out to be 0 or very close to it. This helps in centering the
data by bringing the mean close to 0, which makes learning for the
next layer much easier.
24. Tanh Function
Advantages:
1. Zero centered — Unlike the sigmoid function,
the tanh function outputs values between −1
and 1.
This helps the neural network model
inputs with strong negative, neutral, and
strong positive values more effectively,
leading to faster convergence.
2. The function is monotonic (it consistently
increases), which simplifies learning, although
its derivative is not monotonic.
3. Works better than the sigmoid function.
Disadvantage:
1. It also suffers from the vanishing gradient problem and
hence slow convergence.
25. RELU Function
•It stands for Rectified Linear Unit. It is the most widely used
activation function, chiefly implemented in the hidden layers of a
neural network.
•Equation :- A(x) = max(0,x). It gives an output x if x is
positive and 0 otherwise.
•Value Range :- [0, inf)
•Nature :- non-linear, which means we can easily
backpropagate the errors and have multiple layers of neurons
being activated by the ReLU function.
•Uses :- ReLU is less computationally expensive than tanh and
sigmoid because it involves simpler mathematical operations.
At any time only a few neurons are activated, making the network
sparse and therefore efficient and easy to compute.
In simple words, ReLU learns much faster than the sigmoid and
tanh functions.
26. Softmax Function
● The softmax activation function is commonly
used in the output layer of a neural network
when performing multiclass classification.
● Nature :- non-linear
● softmax(x_i) = exp(x_i) / sum(exp(x_j)) for all j
● Uses :- Usually used when trying to handle
multiple classes. The softmax function is
commonly found in the output layer of image
classification problems.
● The softmax function is particularly useful in
multiclass classification tasks, where the goal is
to predict the probability of each possible class
for a given input.
27. Activation function
● Sigmoid functions and their combinations generally work better in
the case of classification problems.
● Sigmoid and tanh functions are sometimes avoided due to the
vanishing gradient problem.
● ReLU activation function is widely used and is default choice as it
yields better results.
● ReLU function should only be used in the hidden layers.
● The output layer can use a linear activation function in the case of
regression problems.
28. Activation function
● The basic rule of thumb is if you really don’t know what
activation function to use, then simply use RELU as it is a
general activation function in hidden layers and is used in most
cases these days.
● If your output is for binary classification, then the sigmoid
function is a very natural choice for the output layer.
● If your output is for multi-class classification, then softmax is
very useful for predicting the probability of each class.
29. What is the Perceptron model in Machine Learning?
The Perceptron is a Machine Learning algorithm for supervised learning of
binary classification tasks. A Perceptron can also be
understood as an artificial neuron, or neural network unit, that
performs certain computations on input data.
The Perceptron model is also treated as one of the best and simplest
types of Artificial Neural Networks. It is a supervised
learning algorithm for binary classifiers. Hence, we can consider it
a single-layer neural network with four main parameters, i.e., input
values, weights and bias, net sum, and an activation function.
30. What is Binary classifier in Machine Learning?
A binary classifier is a model used to categorize data into two distinct
classes (e.g., Yes/No, 1/-1, True/False).
A binary classifier predicts which of two classes a given input belongs
to:
● Positive class: Often labeled as 1.
● Negative class: Often labeled as −1 (or 0, depending on
convention).
32. Basic Components of Perceptron
○ Input Nodes or Input Layer:
This is the primary component of Perceptron which accepts the initial data
into the system for further processing. Each input node contains a real
numerical value.
○ Weight and Bias:
The weight parameter represents the strength of the connection between units
and is another of the most important Perceptron components. A weight is
directly proportional to the influence of the associated input neuron in deciding
the output. Further, the bias can be considered as the intercept in a linear
equation.
33. Basic Components of Perceptron
Activation Function:
These are the final and important components that help to determine whether
the neuron will fire or not. Activation Function can be considered primarily
as a step function.
Types of Activation functions: step, sign, and sigmoid functions (described
on the "Activation Functions of Perceptron" slide below).
35. How does Perceptron work?
In Machine Learning, Perceptron is considered as a single-layer neural network that
consists of four main parameters named input values (Input nodes), weights and Bias,
net sum, and an activation function.
The perceptron model begins with the multiplication of all input values and their
weights, then adds these values together to create the weighted sum.
Then this weighted sum is applied to the activation function 'f' to obtain the desired
output.
This activation function is also known as the step function and is represented by 'f'.
This step function or activation function plays a vital role in ensuring that the output is mapped
between the required values (0,1) or (-1,1).
It is important to note that the weight of input is indicative of the strength of a node. Similarly,
an input's bias value gives the ability to shift the activation function curve up or down.
36. How does Perceptron work?
Step-1
In the first step, multiply all input values with their corresponding weight values and then add them
to determine the weighted sum. Mathematically, we can calculate the weighted sum as follows:
∑wi*xi = x1*w1 + x2*w2 + … + xn*wn
Add a special term called bias 'b' to this weighted sum to improve the model's performance.
∑wi*xi + b
Step-2
In the second step, an activation function is applied with the above-mentioned weighted sum,
which gives us output either in binary form or a continuous value as follows:
Y = f(∑wi*xi + b)
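These two steps translate directly into code. The sketch below (the weights, bias, and a 0/1 step function are illustrative assumptions) computes the weighted sum plus bias and applies the activation f:

```python
def perceptron_output(x, w, b):
    """Step 1: weighted sum plus bias.  Step 2: apply the step activation f."""
    weighted_sum = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if weighted_sum > 0 else 0          # f maps the sum to a binary output

print(perceptron_output([1, 1], [0.6, 0.6], -1.0))   # 1
print(perceptron_output([1, 0], [0.6, 0.6], -1.0))   # 0
```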
37. Single Layer Perceptron
A single-layer perceptron is a type of artificial neural network that
consists of only one layer of artificial neurons.
It is the simplest type of neural network and was proposed by Frank
Rosenblatt in 1958.
The single-layer perceptron has been used in various applications,
including: pattern recognition, binary classification, control
systems, medical diagnosis, and financial forecasting.
38. The perceptron consists of 4 parts:
Input value or One input layer: The input layer of the perceptron is made of
artificial input neurons and takes the initial data into the system for further
processing.
Weights and Bias:
Weight: It represents the dimension or strength of the connection between
units.
Bias: It is the same as the intercept added in a linear equation. The bias is a
tunable parameter in neural networks that can help improve the accuracy and
flexibility of the model by allowing it to learn more complex decision
boundaries.
Net sum: It calculates the total sum.
Activation Function: Whether a neuron is activated or not is determined by the
activation function, which is applied to the weighted sum plus bias to give the
result.
39. The Perceptron Learning Rule
1. Initialize the weights: Start with random weights for each input.
2. Input the training data: Input the features into the perceptron and calculate
the output.
3. Calculate the error: Compare the predicted output with the desired output to
calculate the error.
4. Update the weights: Adjust the weights of the inputs based on the error. If the
predicted output is less than the desired output, increase the weights of the
inputs. If the predicted output is greater than the desired output, decrease the
weights of the inputs. The magnitude of the weight adjustment is proportional
to the error and the input value.
Repeat: Repeat steps 2 to 4 until the error is minimized or a maximum number of
iterations is reached
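A compact sketch of this rule (the AND-gate training data, learning rate, and epoch count are illustrative choices, not from the slides):

```python
import random

def train_perceptron(data, epochs=20, lr=0.1):
    """Perceptron learning rule: adjust each weight in proportion to error * input."""
    random.seed(0)
    n = len(data[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]          # step 1: random initial weights
    b = 0.0
    for _ in range(epochs):                                    # repeat until done
        for x, target in data:                                 # step 2: feed the training data
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            error = target - pred                              # step 3: compare with the desired output
            w = [wi + lr * error * xi for wi, xi in zip(w, x)] # step 4: adjust weights by error * input
            b += lr * error
    return w, b

# AND gate: linearly separable, so the rule converges
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(data)
print([1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0 for x, _ in data])   # [0, 0, 0, 1]
```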
41. Perceptron Function
The Perceptron is a function that maps its input vector “x”, multiplied by the learned weight
coefficients, to an output value “f(x)”. With a step activation, f(x) = 1 if w · x + b > 0, and 0 otherwise.
In the equation given above:
“w” = vector of real-valued weights
“b” = bias (an element that adjusts the boundary away from origin without any dependence
on the input value)
“x” = vector of input x values
“m” = number of inputs to the Perceptron
The output can be represented as “1” or “0.” It can also be represented as “1” or “-1”
depending on which activation function is used
42. Activation Functions of Perceptron
The activation function applies a step rule (convert the numerical output
into +1 or -1) to check if the output of the weighting function is greater than
zero or not.
For example:
If ∑ wi*xi > 0, then final output “o” = 1 (issue bank loan);
else, final output “o” = -1 (deny bank loan).
Step function gets triggered above a certain value of the neuron output; else
it outputs zero. Sign Function outputs +1 or -1 depending on whether
neuron output is greater than zero or not. Sigmoid is the S-curve and outputs
a value between 0 and 1.
43. Feedforward Neural Networks (FFNN)
A feedforward neural network (FFNN) is a type of artificial neural network where information flows in one
direction only, from the input layer through one or more hidden layers to the output layer.
The output of each layer is connected to the input of the next layer, and the weights
and biases of the connections are learned during the training process.
FFNNs are commonly used for tasks such as classification, control systems such as
robotics, and pattern recognition.
They can be trained using supervised learning, where the training data consists of input-
output pairs, and the network learns to map inputs to outputs.
The weights and biases of the network are updated during the training process using
backpropagation, which is an algorithm that computes the gradients of the loss function
with respect to the weights and biases.
44. How FFNN works
The input layer of an FFNN takes in the input data, which is usually in the form
of a vector, and passes it through a series of hidden layers, each consisting of a
set of neurons.
Each neuron in a hidden layer takes in the weighted sum of the outputs from the
previous layer, adds a bias term, and applies an activation function to produce
an output that is passed to the next layer.
The output layer produces the final output of the network, which is usually a
prediction or a classification.
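A minimal sketch of this forward pass (the layer sizes, ReLU in the hidden layer, and sigmoid at the output are illustrative assumptions):

```python
import numpy as np

def forward(x, layers):
    """Pass an input vector through a list of (weights, bias, activation) layers in order."""
    for W, b, act in layers:
        x = act(W @ x + b)        # weighted sum of the previous layer's outputs, plus bias, then activation
    return x

relu = lambda z: np.maximum(0.0, z)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
layers = [
    (rng.normal(size=(4, 3)), np.zeros(4), relu),      # hidden layer: 3 inputs -> 4 neurons
    (rng.normal(size=(1, 4)), np.zeros(1), sigmoid),   # output layer: 4 -> 1 prediction
]
print(forward(np.array([0.2, -0.7, 1.0]), layers))
```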
46. A Multi-Layer Perceptron (MLP)
A Multi-Layer Perceptron (MLP) is a type of neural
network that consists of multiple layers of artificial
neurons.
MLPs are also known as feedforward neural networks.
The architecture of an MLP consists of an input layer,
one or more hidden layers, and an output layer.
Each layer is composed of multiple artificial neurons that
compute a weighted sum of the input signals and apply
an activation function to produce an output signal.
47. A Multi-Layer Perceptron (MLP)
The hidden layers in an MLP are responsible for extracting
features from the input data and transforming them into a
format that is suitable for the output layer.
The output layer produces the final output of the network,
which can be binary or continuous.
The learning process of an MLP involves adjusting the
weights of the input signals using backpropagation.
Backpropagation allows the MLP to learn from the training
data and improve its performance over time.
49. Compare single layer and multilayer perceptron model
Architecture:
Single-layer perceptrons have only one layer of neurons that directly connects to
the input data, whereas multilayer perceptrons consist of multiple layers of
neurons, including one or more hidden layers that lie between the input and output
layers.
Capabilities:
Single-layer perceptrons are limited to linearly separable problems, meaning they
can only learn and classify data that can be separated by a single straight line.
In contrast, multilayer perceptrons can learn and classify non-linearly
separable problems by using hidden layers to transform the input data into a
more complex feature space that can be separated by the output layer.
50. Compare single layer and multilayer perceptron model
Training:
Single-layer perceptrons use a simple learning rule called the Perceptron Learning
Algorithm, which adjusts the weights of the input signals to minimize the error
between the predicted and actual output. In contrast, multilayer perceptrons use a
more complex learning algorithm called backpropagation, which iteratively adjusts
the weights of all the neurons in the network to minimize the error between the
predicted and actual output.
Applications:
Single-layer perceptrons are typically used for simple binary classification
problems, such as predicting whether an email is spam or not. Multilayer perceptrons
are more powerful and can be used for a wide range of applications, including
image and speech recognition, natural language processing, and financial forecasting.
51. Back Propagation networks
Backpropagation is a supervised learning algorithm used for training
neural networks.
The basic structure of a backpropagation network consists of an input
layer, one or more hidden layers, and an output layer.
Each layer is composed of one or more neurons, which receive inputs,
process them, and pass the outputs to the next layer.
The connections between the neurons are weighted, and these weights
are adjusted during training to improve the accuracy of the network's
predictions.
52. Back Propagation networks
During the training process, the network is fed a set of input-output
pairs, and the output of the network is compared to the desired output.
The error between the actual output and the desired output is
then back propagated through the network, and the weights are
adjusted to reduce the error.
This process is repeated many times, with the hope that the
network will eventually converge to a set of weights that produces
accurate predictions for new input data.
54. How Backpropagation Algorithm Works:
1. Inputs X, arrive through the preconnected path
2. Input is modeled using real weights W. The weights are usually randomly
selected.
3. Calculate the output for every neuron from the input layer, to the hidden
layers, to the output layer.
4. Calculate the error in the outputs:
Error = Actual Output – Desired Output
5. Travel back from the output layer to the hidden layer to adjust the weights
such that the error is decreased.
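These five steps can be sketched for a tiny one-hidden-layer network trained on a single example (the layer sizes, sigmoid activations, squared-error loss, and learning rate are illustrative assumptions):

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))                          # step 2: weights chosen randomly
W2 = rng.normal(size=(1, 3))
x, target = np.array([0.5, -1.0]), np.array([1.0])    # step 1: the input arrives
lr = 0.5

for _ in range(500):
    h = sigmoid(W1 @ x)                          # step 3: output of the hidden layer
    y = sigmoid(W2 @ h)                          #          and of the output layer
    error = y - target                           # step 4: error in the output
    # step 5: travel back from the output layer, adjusting weights to decrease the error
    delta_out = error * y * (1 - y)              # error term at the output layer
    delta_hid = (W2.T @ delta_out) * h * (1 - h) # error term propagated back to the hidden layer
    W2 -= lr * np.outer(delta_out, h)
    W1 -= lr * np.outer(delta_hid, x)

print(y[0])   # approaches the target 1.0 after training
```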
55. Why We Need Backpropagation?
● Backpropagation is fast, simple and easy to program
● It has no parameters to tune apart from the number of
inputs
● It is a flexible method as it does not require prior
knowledge about the network
● It is a standard method that generally works well
● It does not need any special mention of the features of the
function to be learned.
56. Concept of Linear Separability
● Linear separability is a concept in mathematics and particularly
in machine learning.
● Imagine you have some points scattered around on a piece of
paper, and you want to draw a straight line to separate them into
two groups.
● If you can draw such a line where all the points of one group are
on one side of the line, and all the points of the other group are
on the other side, then those points are said to be linearly
separable.
57. Concept of Linear Separability
● For example, let's say you have red and blue dots on a sheet of
paper, and you want to separate them with a straight line. If you
can draw a line in such a way that all the red dots are on one side
and all the blue dots are on the other side, then those dots are
linearly separable.
● Linear separability is important in machine learning because
it means that the data is easy to classify using a simple
algorithm like a linear classifier. If data is not linearly separable,
more complex methods may be needed to classify it accurately.
58. Concept of Linear Separability
● Linear separability is an important concept in neural networks. If the points in n-dimensional space can be divided
by a hyperplane w1*x1 + w2*x2 + … + wn*xn + b = 0, with each class lying entirely on one side,
then they are said to be linearly separable.
● For two-dimensional inputs, if there exists a line (whose equation is w1*x1 + w2*x2 + b = 0) that separates all
samples of one class from the other class, then an appropriate perceptron can be derived from the equation of the
separating line. Such classification problems are called “linearly separable”, i.e., separable by a linear combination
of the inputs.
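One way to see this concretely is a brute-force search over a coarse grid of candidate lines w1*x1 + w2*x2 + b = 0 (a purely illustrative sketch): the AND gate admits a separating line, while XOR does not.

```python
import itertools

def separable(points):
    """True if some line w1*x1 + w2*x2 + b = 0 puts the two classes on opposite sides."""
    grid = [v / 2 for v in range(-4, 5)]                 # coarse grid of candidate coefficients
    for w1, w2, b in itertools.product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == (label == 1) for (x1, x2), label in points):
            return True
    return False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(separable(AND), separable(XOR))   # True False
```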
59. Character Recognition Application
Character recognition is a common application of neural networks, and
can be achieved using various types of neural networks, including
Feedforward neural networks, convolutional neural networks, and
recurrent neural networks
The network must be trained on a dataset of labeled character images
in order to learn to recognize characters.
During training, the network adjusts its weights based on the
difference between its predicted output and the true label of the input
image.
Once the network is trained, it can be used to make predictions on new,
unlabeled character images.
60. OCR (Optical Character Recognition)
OCR is a technology that analyzes the text of a page and turns the
letters into code that may be used to process information.
OCR is a technique for detecting printed or handwritten text
characters inside digital images of paper files, such as scanning paper
records
OCR systems are hardware and software systems that turn physical
documents into machine-readable text.
These digital versions can be highly beneficial to children and young
adults who struggle to read.
The essential application of OCR is to convert hard copy legal or
historical documents into PDFs.
62. How OCR works?
1. Image Pre-Processing
● Size normalization: This step ensures that all images are of the same
size for consistency. We use a method called bicubic interpolation to
resize images to a standard size.
● Binarization: Here, we convert grayscale images to binary images by
setting a threshold. Pixels above the threshold become white, while
those below become black. This helps in simplifying the image for
further processing.
● Smoothing: To make the edges of objects in the image smoother, we
use erosion and dilation techniques. This helps in reducing noise and
making the objects clearer.
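A small illustration of the binarization step (the 3x3 "image" and the threshold of 128 are made-up values):

```python
import numpy as np

gray = np.array([[ 12, 200,  90],
                 [180,  40, 250],
                 [ 60, 130,  15]])               # a tiny grayscale image, pixel values 0-255

threshold = 128
binary = np.where(gray > threshold, 255, 0)      # above the threshold -> white, below -> black
print(binary)
```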
63. How OCR works?
Text recognition: Once the image is pre-processed, we can start recognizing
text. There are two main methods for this:
● Pattern matching: This works well for typed documents with known
fonts. It compares parts of the image with patterns of characters it
knows.
● Feature extraction: This method looks at specific features of
characters, like lines and curves, to identify them.
64. How OCR works?
Postprocessing
After recognizing the text, the system converts it into a digital format. Some
systems also create annotated PDF files, which show both the original
scanned document and the recognized text.
65. Immunological computing
Immunological computing is like using the principles of our immune system to teach
computers how to recognize patterns, make decisions, and solve problems
effectively. It's a fascinating area of research that draws inspiration from nature to
develop smarter algorithms and systems.
Immune System Basics:
Our immune system is like a defense force in our body that helps to keep us healthy.
It can recognize harmful invaders, like bacteria or viruses, and fight them off to keep
us safe.
It does this by identifying foreign substances called antigens and producing antibodies
to neutralize them.
66. Immunological computing
How Immunological Computing Works:
In immunological computing, we mimic the behavior of the immune system to
solve computational problems.
Just like our immune system learns to recognize and respond to threats, in
immunological computing, algorithms learn to recognize patterns in data and
make decisions based on them.
Instead of antigens and antibodies, we use concepts like "patterns" and "rules".
The algorithms adapt and improve over time, similar to how our immune system
builds immunity to diseases.
67. Immunological computing
Applications:
Immunological computing can be used in various fields such as data mining,
pattern recognition, and optimization.
For example, in anomaly detection, it can help identify unusual patterns in
data that may indicate fraud or errors.
In optimization problems, it can be used to find the best solution among
many possibilities, similar to how our immune system finds the best
response to different threats.
68. Stochastic Gradient Descent
Gradient Descent (GD):
Imagine you are blindfolded on a hill and want to find the lowest point without any help.
You feel the slope under your feet and take small steps downhill. This is like Gradient
Descent where you iteratively move towards the minimum of a function by following the
direction of steepest descent.
Stochastic Gradient Descent (SGD):
Now, let's add a twist. Instead of relying on the slope at your current location alone, you
randomly pick a spot on the hill, feel the slope there, and take a step. Sometimes this spot
might be flat or even uphill, but over many such random steps, you tend to move towards
the bottom of the hill. This randomness helps in escaping local minima and can be faster
than regular Gradient Descent, especially for large datasets.
69. Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent
algorithm that is used for optimizing machine learning models.
In SGD, instead of using the entire dataset for each iteration, only a single
random training example (or a small batch) is selected to calculate the
gradient and update the model parameters. This random selection introduces
randomness into the optimization process, hence the term “stochastic” in
stochastic Gradient Descent
The advantage of using SGD is its computational efficiency, especially when
dealing with large datasets. By using a single example or a small batch, the
computational cost per iteration is significantly reduced compared to traditional
Gradient Descent methods that require processing the entire dataset.
70. Stochastic Gradient Descent
How it works:
● Start with an initial guess for the minimum point.
● Randomly shuffle your dataset.
● For each data point in the shuffled dataset:
○ Compute the gradient of the loss function at that point (its
negative gives the direction of steepest descent).
○ Update your guess for the minimum point by taking a small
step.
● Repeat this process for a fixed number of iterations or until the
improvement becomes very small.
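A minimal sketch of these steps, fitting a single slope parameter by least squares (the toy data, learning rate, and epoch count are illustrative assumptions):

```python
import random

# Toy data generated from y = 3x plus noise; SGD should recover a slope near 3.
random.seed(0)
data = [(x, 3 * x + random.uniform(-0.1, 0.1)) for x in [i / 10 for i in range(1, 21)]]

w = 0.0                              # initial guess for the minimum point
lr = 0.05
for epoch in range(50):              # repeat for a fixed number of iterations
    random.shuffle(data)             # randomly shuffle the dataset
    for x, y in data:                # one data point at a time
        grad = 2 * (w * x - y) * x   # gradient of the squared error at this single point
        w -= lr * grad               # take a small step against the gradient
print(w)                             # ≈ 3, up to the noise
```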
Editor's Notes
#9: Respond to environmental changes: By learning from past experiences, organisms can adjust their behavior to changing environmental conditions, such as changes in temperature, availability of resources, or presence of predators.
Improve their performance: Organisms can improve their ability to perform various tasks, such as finding food or avoiding predators, through trial-and-error learning or observational learning.
Develop new behaviors: Through learning, organisms can develop new behaviors that allow them to exploit new resources or adapt to new environmental challenges.
Enhance communication: Learning can also improve communication between individuals within a species, allowing for the transmission of knowledge and cultural traditions.