
Unit-5 Machine Learning

Learning
Learning is the process of acquiring new or modifying existing knowledge, behaviors, skills, values, or preferences. In other words, learning is the process of converting experience into expertise or knowledge.

Introduction of Machine Learning

• Machine learning (ML) is a critical subset of artificial intelligence (AI) that focuses on the development of algorithms and statistical models enabling computers to perform tasks without explicit instructions, relying instead on patterns and inference.

• Machine learning enables a computer system to make predictions or take decisions using historical data without being explicitly programmed.

• Machine learning uses massive amounts of structured and semi-structured data so that a machine learning model can generate accurate results or predictions based on that data.

• Arthur Samuel, a pioneer in the field of artificial intelligence and computer gaming, coined the term "Machine Learning".

• The goal of ML is to allow machines to learn from data so that they can give accurate output. Deep learning is an important subset of machine learning.

• Machine learning aims to create machines that can perform the specific tasks for which they are trained.

• Machine learning is used in many different applications, from image and speech recognition to natural language processing, recommendation systems, fraud detection, portfolio optimization, task automation, and so on.

• Machine learning models are also used to power autonomous vehicles, drones, and robots, making them more intelligent and adaptable to changing environments.

Fig: Machine Learning - data (input) is fed into the machine learning process, which produces a program (model) that generates the output.

Types of Machine Learning

Based on the methods and the way of learning, machine learning is divided into three main types:

1) Supervised Machine Learning

2) Unsupervised Machine Learning

3) Reinforcement Learning

1] Supervised learning

• Supervised learning is a machine learning paradigm in which input objects and their desired output values are used to train a model. The training data is processed to build a function that maps new inputs to expected output values.



• In supervised learning, the training data provided to the machine works as a supervisor that teaches the machine to predict the output correctly. It applies the same concept as a student learning under the supervision of a teacher.

• Supervised learning is the process of providing input data as well as correct output data to the machine learning model. The aim of a supervised learning algorithm is to find a mapping function that maps the input variable (x) to the output variable (y).

• In supervised learning, models are trained on a labelled dataset, from which the model learns about each type of data. Once training is completed, the model is tested on test data and predicts the output.

• The algorithm is provided with input features and corresponding output labels, and it learns to generalize from this data to make predictions on new, unseen data.

Advantages:

o Since supervised learning works with labelled datasets, we have an exact idea of the classes of objects.
o These algorithms are helpful for predicting outputs on the basis of prior experience.

Disadvantages:

o These algorithms are not able to solve complex tasks.
o They may predict the wrong output if the test data differs from the training data.
o Training the algorithm requires a lot of computational time.

Applications of Supervised Learning

Some common applications of Supervised Learning are given below:

• Image Segmentation: Supervised learning algorithms are used in image segmentation. In this process, image classification is performed on different image data with pre-defined labels.
• Medical Diagnosis: Supervised algorithms are also used in the medical field for diagnosis, using medical images and past data labelled with disease conditions. With such a process, the machine can identify a disease in new patients.
• Fraud Detection: Supervised classification algorithms are used for identifying fraudulent transactions, fraudulent customers, etc. This is done by using historical data to identify the patterns that can indicate possible fraud.
• Spam Detection: In spam detection and filtering, classification algorithms are used. These algorithms classify an email as spam or not spam, and spam emails are sent to the spam folder.



• Speech Recognition: Supervised learning algorithms are also used in speech recognition. The algorithm is trained with voice data and can then be used for identification tasks such as voice-activated passwords and voice commands.

Fig: Supervised Machine Learning

• Suppose we have a dataset of different types of shapes, including squares, rectangles, triangles, and polygons. The first step is to train the model on each shape:

• If the given shape has four sides, and all the sides are equal, it is labelled as a square.

• If the given shape has three sides, it is labelled as a triangle.

• If the given shape has six equal sides, it is labelled as a hexagon.

• After training, we test our model using the test set; the task of the model is to identify the shape.

• The machine has already been trained on all types of shapes, so when it finds a new shape, it classifies the shape on the basis of its number of sides and predicts the output.

Types of supervised Machine Learning Algorithms:

Supervised learning can be further divided into two types of problems:

Regression: Regression algorithms are used when the output variable is a continuous value and there is a relationship between the input and output variables. They are used for predicting continuous quantities, such as weather forecasts or market trends.

Classification: Classification algorithms are used when the output variable is categorical, i.e. it takes one of a set of discrete classes such as Yes/No, Male/Female, or True/False.
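A minimal sketch contrasting the two problem types (assuming scikit-learn is installed; the toy data and values are made up for illustration):

from sklearn.linear_model import LinearRegression, LogisticRegression

X = [[1], [2], [3], [4]]  # a single input feature

# Regression: the target is a continuous value.
y_continuous = [2.1, 4.2, 6.1, 8.3]
reg = LinearRegression().fit(X, y_continuous)
print(reg.predict([[5]]))  # roughly 10.4: a continuous prediction

# Classification: the target is a category (two classes here).
y_categorical = [0, 0, 1, 1]
clf = LogisticRegression().fit(X, y_categorical)
print(clf.predict([[5]]))  # class 1: a categorical prediction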



2] Unsupervised Machine Learning
• Unsupervised learning is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance.

• Unlike supervised learning, no teacher is provided, which means no labelled examples are given to the machine.

• The goal of unsupervised learning is to find the underlying structure of a dataset, group the data according to similarities, and represent the dataset in a compressed format.

• Unsupervised learning works on unlabeled and uncategorized data, which makes it especially important in practice.

• In unsupervised learning, the models are trained with data that is neither classified nor labelled, and the model acts on that data without any supervision.

• It uses two techniques, Clustering (grouping the objects) and Association (finding the relationships between variables), for problem solving.

• The main aim of an unsupervised learning algorithm is to group or categorize the unsorted dataset according to similarities, patterns, and differences. Machines are instructed to find the hidden patterns in the input dataset.

• It applies suitable algorithms such as k-means clustering, hierarchical clustering, and association rule mining; a minimal clustering sketch follows below.
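A minimal clustering sketch (assuming scikit-learn is installed; the toy points are made up): k-means groups unlabeled points purely by similarity, with no output labels provided.

from sklearn.cluster import KMeans

# Six unlabeled 2-D points forming two visible groups.
points = [[1, 1], [1.5, 2], [0.5, 1.5], [8, 8], [8.5, 9], [9, 8.5]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # e.g. [0 0 0 1 1 1]: two discovered groups
print(kmeans.cluster_centers_)  # the learned group centres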

Advantages:

o These algorithms can be used for more complicated tasks than supervised ones, because they work on unlabeled datasets.
o Unsupervised algorithms are preferable for many tasks, as obtaining an unlabeled dataset is easier than obtaining a labelled one.

Disadvantages:

o The output of an unsupervised algorithm can be less accurate, as the dataset is not labelled and the algorithms are not trained with the exact output in advance.
o Working with unsupervised learning is more difficult, as it works with unlabelled datasets that do not map inputs to known outputs.

Applications of Unsupervised Learning

o Network Analysis: Unsupervised learning is used in document network analysis of text data, for example to identify plagiarism and copyright issues in scholarly articles.
o Recommendation Systems: Recommendation systems widely use unsupervised learning techniques to build recommendation features for web applications and e-commerce websites.



o Anomaly Detection: Anomaly detection is a popular application of unsupervised learning that identifies unusual data points within a dataset. It is used to discover fraudulent transactions.
o Singular Value Decomposition: Singular Value Decomposition (SVD) is used to extract particular information from a database, for example, extracting information about the users located in a particular area.

Example of Unsupervised Learning

The working of unsupervised learning can be understood from the diagram below:

Here, we take unlabeled input data, meaning it is not categorized and no corresponding outputs are given. This unlabeled input data is fed to the machine learning model in order to train it. The model first interprets the raw data to find the hidden patterns in it.

3] Reinforcement Learning
• Reinforcement Learning is a feedback-based machine learning technique in which an agent learns to behave in an environment by performing actions and seeing the results of those actions.

• In reinforcement learning, the agent learns automatically from feedback, without any labeled data, unlike supervised learning.

• The primary goal of an agent in reinforcement learning is to improve its performance by accumulating the maximum positive reward.

• Reinforcement learning works as a feedback-based process in which an AI agent (a software component) automatically explores its surroundings by trial and error: taking actions, learning from experience, and improving its performance.

• It is based on the trial-and-error process.



• In reinforcement learning, there is no labelled data as in supervised learning; agents learn from their experience only.

• The agent takes the next action and changes state according to the feedback from the previous action. A minimal sketch of this feedback loop follows below.
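A minimal tabular Q-learning sketch of this loop (the corridor environment, rewards, and hyperparameters are invented for illustration):

import random

n_states, n_actions = 5, 2  # a 5-state corridor; actions: 0 = left, 1 = right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def step(state, action):
    # Move left/right; positive reward only for reaching the last state.
    nxt = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == n_states - 1 else 0.0)

for episode in range(200):
    s = 0
    while s != n_states - 1:
        # Explore sometimes; otherwise exploit the best known action.
        if random.random() < epsilon:
            a = random.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda act: Q[s][act])
        s2, reward = step(s, a)
        # Q-learning update: move Q(s, a) toward reward + discounted future value.
        Q[s][a] += alpha * (reward + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print(Q)  # "right" should end up with the higher value in every state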

Advantages

o It helps in solving complex real-world problems that are difficult to solve with general techniques.
o The learning model of RL is similar to human learning, so it can produce very accurate results.
o It helps in achieving long-term results.

Disadvantage

o RL algorithms are not preferred for simple problems.
o RL algorithms require huge amounts of data and computation.
o Too much reinforcement can lead to an overload of states, which can weaken the results.

Real-world Use cases of Reinforcement Learning

o Video Games:
RL algorithms are very popular in gaming applications and are used to achieve super-human performance. Well-known systems built with RL include AlphaGo and AlphaGo Zero.
o Resource Management:
The paper "Resource Management with Deep Reinforcement Learning" showed how to use RL to automatically learn to schedule computing resources among waiting jobs in order to minimize average job slowdown.
o Robotics:
RL is widely used in robotics applications. Robots are used in industrial and manufacturing areas, and these robots are made more capable with reinforcement learning. Different industries have a vision of building intelligent robots using AI and machine learning technology.
o Text Mining:
Text mining, one of the great applications of NLP, is being implemented with the help of reinforcement learning by Salesforce.

Example of Reinforcement Learning


Example: learning to ride a bicycle, game playing, etc.



Fig: Reinforcement Learning

Statistical-based Learning in AI

• Statistical learning is a field that focuses on developing and analyzing models that can make predictions or inferences based on data.
• Statistical learning is a subfield of machine learning that focuses on understanding and modeling patterns in data using probability theory, statistics, and mathematical models. It provides the foundation for many modern AI algorithms, especially in supervised and unsupervised learning.
• Statistical learning is often used in scientific research and statistical analysis.
• It refers to a set of methods and techniques used in machine learning and statistics to analyze data and make predictions or decisions based on it.
• It involves developing models that can uncover patterns, relationships, and trends within datasets, allowing valuable insights to be extracted and predictive models to be built.
• The purpose of statistical modeling is to find the relationships between variables and to test hypotheses.
• Statistical modeling makes many assumptions in order to identify the underlying distributions and relationships.
• Models are developed on training data and tested on testing data; statistical learning is mostly used for research purposes.
• It is not well suited to very large amounts of data.
• Statistical learning techniques are widely applied in various domains, including finance, healthcare, marketing, and natural language processing.
• These methods enable computers to learn from data, make predictions, and uncover patterns that may not be apparent through traditional programming approaches.

Applications of Statistical Learning in AI


✅ Natural Language Processing (NLP) – Sentiment analysis, chatbots
✅ Computer Vision – Face recognition, object detection
✅ Medical Diagnosis – Predicting diseases from patient data
✅ Financial Modeling – Fraud detection, stock market prediction
✅ Recommendation Systems – Netflix, Amazon, YouTube suggestions



Naive Bayes Model

Naive Bayes is a probabilistic machine learning model that's widely used for classification tasks. It is based
on Bayes' Theorem, which describes the probability of an event based on prior knowledge of conditions
related to the event.
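In symbols, for a class C and observed features x1, ..., xn, Bayes' Theorem with the naive independence assumption gives:

P(C | x1, ..., xn) is proportional to P(C) · P(x1 | C) · P(x2 | C) · ... · P(xn | C)

The classifier simply picks the class C for which this product is largest.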

The model is termed "naive" because it makes a strong assumption: all features are independent of each
other given the class label. Despite this simplification, Naive Bayes classifiers perform surprisingly well in
many practical applications.

How Naive Bayes Works:

1. Data Preparation:
o The data is prepared by calculating the probabilities of each feature given the class and the prior
probability of each class.
2. Classification:
o For a new, unseen data point, the algorithm calculates the probability of the data point belonging to
each class.
o It then assigns the data point to the class with the highest probability.
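A minimal sketch of these two steps (assuming scikit-learn is installed; the toy features and labels are invented for illustration):

from sklearn.naive_bayes import GaussianNB

# Toy training data: [height in cm, weight in kg] with made-up class labels.
X = [[180, 80], [175, 75], [160, 55], [155, 50]]
y = ["male", "male", "female", "female"]

# 1. Data preparation: fit() estimates per-class priors and feature likelihoods.
model = GaussianNB().fit(X, y)

# 2. Classification: predict() returns the class with the highest probability.
print(model.predict([[170, 70]]))
print(model.predict_proba([[170, 70]]))  # the underlying class probabilities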

Applications

Naive Bayes classifiers are commonly used in:

Text Classification: Spam detection, sentiment analysis, and document categorization.

Medical Diagnosis: Predicting the likelihood of diseases based on symptoms.

Recommendation Systems: Recommending products or content based on user behavior.

Weather Prediction: Predicting weather conditions based on historical data.

Advantages

Simplicity: Easy to implement and understand.

Efficiency: Fast training and prediction, especially useful for large datasets.

Performance: Performs well with less training data and in many real-world scenarios, despite the
independence assumption.

Disadvantages

Struggles with Complex Relationships: Naive Bayes is not well suited to datasets where complex relationships exist between features.

Poor Performance on Small Data: Naive Bayes relies on probability estimation, which may be unreliable if the dataset is small or unbalanced.



Genetic Algorithm
• A Genetic Algorithm (GA) is a search-based optimization technique based on the principles of genetics and natural selection.

• Genetic algorithms simulate the process of natural selection: species that can adapt to changes in their environment survive, reproduce, and go on to the next generation.

• They are frequently used to find optimal or near-optimal solutions to difficult problems that would otherwise take a very long time to solve.

• A genetic algorithm is a heuristic search algorithm used to solve search and optimization problems.

• Genetic algorithms have been successfully applied to various optimization problems, including parameter tuning, scheduling, routing, and machine learning.

• Genetic algorithms are based on the ideas of natural selection and genetics: an intelligent exploitation of random search, using historical data to direct the search toward regions of better performance in the solution space.

• They are commonly used to generate high-quality solutions for optimization and search problems.

How Genetic Algorithm Work?


The genetic algorithm works on an evolutionary, generational cycle to generate high-quality solutions. It uses different operations that either enhance or replace the population in order to produce better-fitting solutions.

It involves five phases for solving complex optimization problems, which are given below:

o Initialization
o Fitness Assignment
o Selection
o Reproduction
o Termination



General Workflow of a Simple Genetic Algorithm

1. Initialization

The process of a genetic algorithm starts by generating a set of individuals, called the population. Each individual is a candidate solution for the given problem. An individual is characterized by a set of parameters called genes; genes are joined into a string to form a chromosome, which encodes the solution to the problem. One of the most popular initialization techniques is the use of random binary strings.

2. Fitness Assignment

A fitness function is used to determine how fit an individual is, i.e. its ability to compete with other individuals. In every iteration, individuals are evaluated with the fitness function, which assigns each individual a fitness score. This score determines the probability of being selected for reproduction: the higher the fitness score, the greater the chance of being selected.

3. Selection

The selection phase involves selecting individuals for the reproduction of offspring. The selected individuals are arranged in pairs, and these individuals transfer their genes to the next generation.

4. Reproduction

This phase involves the creation of a child population. In this step, the genetic algorithm uses two variation
operators (Crossover & Mutation) that are applied to the parent population.

Crossover: Two or more parent solutions are combined to create new offspring. Crossover plays the most significant role in the reproduction phase of the genetic algorithm. A crossover point is selected at random within the genes, and the crossover operator swaps the genetic information of two parents from the current generation to produce a new individual representing the offspring.

Mutation: Small random changes are made to some solutions to introduce diversity. The mutation operator inserts random genes into the offspring (new child) to maintain diversity in the population; this can be done by flipping some bits in the chromosome. Mutation helps solve the problem of premature convergence and enhances diversification.

5. Termination

After the reproduction phase, a stopping criterion is applied as the basis for termination. The algorithm terminates once a threshold fitness is reached (or a maximum number of generations has passed) and returns the best individual in the final population as the final solution. The sketch below walks through all five phases.
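A minimal sketch of the five phases on the classic toy problem of maximizing the number of 1s in a binary string (the population size, string length, and mutation rate are arbitrary illustrative choices):

import random

POP, LENGTH, GENERATIONS, MUT_RATE = 20, 12, 50, 0.05

def fitness(chromosome):                 # 2. fitness assignment
    return sum(chromosome)

# 1. initialization: a population of random binary strings
population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]

for _ in range(GENERATIONS):
    # 3. selection: pick parents with probability proportional to fitness
    parents = random.choices(population,
                             weights=[fitness(c) + 1 for c in population],
                             k=POP)
    children = []
    for i in range(0, POP, 2):
        # 4a. crossover: swap genes at a random point
        point = random.randrange(1, LENGTH)
        a, b = parents[i], parents[i + 1]
        children += [a[:point] + b[point:], b[:point] + a[point:]]
    # 4b. mutation: flip a few bits to keep diversity
    for child in children:
        for j in range(LENGTH):
            if random.random() < MUT_RATE:
                child[j] ^= 1
    population = children
    # 5. termination: stop early once a perfect string appears
    if any(fitness(c) == LENGTH for c in population):
        break

print(max(population, key=fitness))  # best solution found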

Learning with Neural Networks


• A neural network is a form of AI-based learning designed to help computers analyze data similarly to humans. Each neural network is made up of layers of nodes that pass data between each other.

• Through training data and experience, neural networks give machines the ability to learn from mistakes and improve their performance over time. As a result, neural networks are well suited to complex data-related tasks.

• Deep learning is the type of machine learning process that uses such interconnected nodes (neurons) in a layered structure resembling the human brain.

• A key advantage of neural networks is that they can extract features from data automatically, without input from the programmer.

• Neural networks mimic how the human brain operates, enabling computer programs in AI, machine learning, and deep learning to identify patterns and address common problems.

• The structure of artificial neural networks is based on the concept of nodes organized into input, hidden, and output layers. There are connections between the nodes (artificial neurons), and each connection has a weight and each neuron a threshold.

Artificial Neural Network primarily consists of three layers:

• Input Layer: It accepts inputs in several different formats provided by the programmer.

• Hidden Layer: The hidden layer lies between the input and output layers. It performs all the calculations needed to find hidden features and patterns.

• Output Layer: The input goes through a series of transformations in the hidden layer, finally producing the output that is conveyed by this layer. A minimal forward-pass sketch follows below.
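A minimal sketch of how data flows through these three layers (assuming numpy; the layer sizes and sigmoid activation are arbitrary illustrative choices):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, 0.1, 0.9])                # input layer: 3 features
W1, b1 = np.random.randn(4, 3), np.zeros(4)  # hidden layer: 4 neurons
W2, b2 = np.random.randn(1, 4), np.zeros(1)  # output layer: 1 neuron

hidden = sigmoid(W1 @ x + b1)       # hidden layer: weighted sums + activation
output = sigmoid(W2 @ hidden + b2)  # output layer: final transformed result
print(output)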

Application of Artificial Neural Networks (ANN)


Artificial Neural Networks (ANNs) are currently among the most widely used machine learning techniques. They are used in several areas:

Social Media: Artificial neural networks are used heavily in social media. For example, the 'People you may know' feature on Facebook suggests people you might know in real life so that you can send them friend requests.

Marketing and Sales: When you log onto e-commerce sites like Amazon and Flipkart, they recommend products to buy based on your previous browsing history.

Medical: Neural networks can be used to detect cancer cells and analyze MRI images to give detailed results.

Personal Assistants: Siri, Alexa, Cortana, etc. are personal assistants and examples of speech recognition; they use Natural Language Processing to interact with users and formulate responses.

Image Processing: Satellite imagery processing can be used for agricultural and defense purposes.

Signature Classification: Artificial neural networks are employed to recognize signatures and categorize them according to the person's class when developing authentication systems.



Biological Neural Networks Vs. Artificial Neural Networks (ANN)

Features | Artificial Neural Network | Biological Neural Network
Definition | A mathematical model mainly inspired by the biological neuron system in the human brain. | Composed of many processing units (neurons) that are linked together via synapses.
Processing | Sequential and centralized. | Parallel and distributed.
Size | Small. | Large.
Control Mechanism | A control unit keeps track of all computer-related operations. | There is no central control; processing is distributed across the network.
Rate | Processes information at a fast speed. | Processes information at a slow speed.
Complexity | Cannot perform complex pattern recognition. | The large quantity and complexity of the connections allow the brain to perform complicated tasks.
Feedback | Does not provide feedback. | Provides feedback.
Fault Tolerance | No fault tolerance. | Fault tolerant.
Operating Environment | Well-defined and well-constrained. | Poorly defined and unconstrained.
Memory | Separate from the processor, localized, and not content-addressable. | Integrated into the processor, distributed, and content-addressable.
Reliability | Very vulnerable. | Robust.
Learning | Requires accurate structures and formatted data. | Tolerant to ambiguity.
Response Time | Measured in nanoseconds. | Measured in milliseconds.



Activation Functions
• An activation function decides whether a neuron should be activated or not: using simple mathematical operations, it decides whether the neuron's input to the network is important in the process of prediction.

• It is simply the function used to get the output of a node; it is also known as a Transfer Function.

• The role of the activation function is to derive an output from the set of input values fed to a node (or a layer).

• The activation function defines the output of a node based on a specific set of inputs in machine learning, deep neural networks, and artificial neural networks.

• It is used to determine the output of a neural network, e.g. yes or no. It maps the resulting values into a range such as 0 to 1 or -1 to 1 (depending on the function).

• Activation functions are necessary for neural networks because, without them, the output of the model would simply be a linear function of the input, and the network would not be able to handle large volumes of complex data.

• Non-linearity allows the network to learn complex patterns and perform more sophisticated tasks.

Types of Activation Functions in Neural Network

Activation functions can basically be divided into two types:

1) Linear Activation Function

2) Non-linear Activation Functions

1] Linear Activation Function

As you can see the function is a line or linear. Therefore, the output of the functions will not be confined
between any range.



• Equation: f(x) = x

• Range: (-infinity, infinity)

• It does not help with the complexity of the typical data that is fed to neural networks.

2] Non-Linear Activation Function

Non-linear activation functions are the most widely used activation functions. Non-linearity lets the network model curved, rather than straight-line, relationships.

• It makes it easy for the model to generalize or adapt to a variety of data and to differentiate between outputs.

• Modern neural network models use non-linear activation functions. They allow the model to create complex mappings between the network's inputs and outputs, which is essential for data such as images, video, and audio that is non-linear or high-dimensional. A sketch of three common choices follows below.
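A minimal sketch (assuming numpy) of three widely used non-linear activation functions and the ranges they map values into:

import numpy as np

def sigmoid(z):  # squashes values into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):     # squashes values into the range (-1, 1)
    return np.tanh(z)

def relu(z):     # passes positives through, zeroes out negatives
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))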



Types of ANN

Artificial Neural Networks (ANNs) come in various forms, each with its strengths and applications. Here's
a breakdown of some common types of ANNs:
1. Feedforward Neural Networks:

• The most basic and widely used type of ANN. Information flows in one direction, from the input layer through the hidden layers (if any) to the output layer.

• This network might or might not have hidden layers, which keeps its functioning relatively interpretable, and it copes well with noisy data.

• There are no loops or connections going back to previous layers.

• The primary advantage of this network is that it learns to evaluate and recognize input patterns.

• Used for tasks like image recognition, speech recognition, and function approximation.

2. Recurrent Neural Networks (RNNs):

• Designed to handle sequential data like text, speech, or time series data.

• RNNs have internal loops that allow information to persist across time steps. This enables them to learn dependencies between elements in a sequence.

• The output of a layer is saved and fed back to the input to help in predicting the outcome of the layer.

• An RNN stores information required for its future use. If a prediction is wrong, the learning rate is used to make small corrections.

• Subtypes of RNNs, such as LSTMs and GRUs, address limitations of the basic RNN architecture (see the Deep Learning section below).



3. Single layered Neural Networks:

A single-layered neural network, also known as a perceptron, is the simplest form of an Artificial Neural
Network (ANN). While not as powerful as their multi-layered counterparts, they provide a fundamental
building block for understanding how ANNs work and can be useful for specific tasks.

Here's a deeper dive into single-layered ANNs:

Structure and Function:


• A single-layered ANN consists of:

o Input Layer: Receives the input data (features).

o Output Layer: Produces a single output value.

• There are no hidden layers, unlike in multi-layered ANNs.

• Each input is connected to the single output neuron by a weighted connection.

• The output neuron computes a weighted sum of the inputs plus a bias term, then applies an activation function to generate the final output.



4. Multi-layered neural networks
• Multi-layered neural networks (MLNNs), also commonly called deep neural networks when they have many hidden layers, are the workhorses of the artificial neural network (ANN) world.
• They overcome the limitations of single-layered networks by introducing hidden layers, allowing them to learn complex patterns and solve a wider range of problems.
• A multi-layered neural network consists of multiple layers of artificial neurons or nodes. Unlike single-layer neural networks, most networks used today are multi-layered.

Here's a breakdown of multi-layered neural networks:


Structure:
• Unlike single-layered perceptrons, MLNNs have:
o Input layer: Receives the raw data.
o Hidden layers: Perform the main information processing. There can be one or more hidden layers, and the number of neurons in each layer can vary.
o Output layer: Produces the final output of the network.



Learning by Training ANN

Artificial Neural Networks (ANNs) learn through a process called training. This involves feeding the
network a large dataset of labeled examples and iteratively adjusting the connections between neurons to
minimize the difference between the network's output and the desired output for those examples. Here's a
breakdown of the key steps involved:
1. Data Preparation:
• The first step is to gather a dataset relevant to the task you want the ANN to perform. This data needs to be labeled, meaning each data point should have a corresponding desired output (e.g., an image with a label indicating the object in the image).
• The data is then preprocessed to ensure it is in a format suitable for the ANN. This might involve scaling the data to a specific range or encoding categorical data into numerical values.
2. Choosing an ANN Architecture:
• The architecture of an ANN refers to its structure, including the number of layers, the number of neurons in each layer, and the connections between them. The choice of architecture depends on the complexity of the task and the size of the dataset.
3. Setting Up the Training Process:
• This involves defining parameters like:
o Learning Rate: Controls how much the weights are adjusted during each iteration.
o Loss Function: A mathematical function that measures the difference between the network's output and the desired output. Common choices include mean squared error for regression tasks and cross-entropy for classification tasks.
o Optimizer: An algorithm that determines how to update the weights based on the learning rate and the loss function. Popular optimizers include gradient descent and its variants (e.g., Adam, RMSprop); a minimal gradient-descent sketch follows below.
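A minimal sketch of the update rule these optimizers share, plain gradient descent (the quadratic example loss L(w) = w^2 is made up for illustration):

def gradient_descent_step(weights, gradients, learning_rate=0.1):
    # Move each weight a small step against its gradient.
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

# Example: minimize L(w) = w^2, whose gradient is 2w.
w = [4.0]
for _ in range(100):
    w = gradient_descent_step(w, [2 * w[0]])
print(w)  # approaches [0.0], the minimum of the loss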
4. Forward Pass and Backpropagation:
• This is the core of the training process:
o Forward Pass: The input data is fed into the network and propagates through the layers, with activation functions applied at each neuron, until the final output is generated.
o Backpropagation: The loss function is calculated from the difference between the network's output and the desired output. This error is then propagated backward through the network, calculating the contribution of each neuron's weights to the overall error.
o Weight Update: Based on the calculated gradients (rates of change of the error with respect to the weights), the weights and biases of the connections are adjusted in a way that reduces the overall loss.

5. Iteration and Improvement:
• The forward pass and backpropagation with weight update are repeated for multiple iterations (epochs) over the entire training dataset.
• With each iteration, the network's performance (loss) should ideally improve as it learns to map the input data to the desired output.
6. Evaluation and Refinement:
• After training, the network's performance is evaluated on a separate validation set (data not used during training) to assess its generalization ability (how well it performs on unseen data).
• Based on the evaluation results, the ANN architecture, hyperparameters (learning rate, etc.), or training process might be adjusted for further improvement.

Hebbian learning
Hebbian learning is a fundamental principle in neuroscience and artificial intelligence, inspired by the way
neurons in the brain strengthen their connections through repeated stimulation. The principle, often
summarized as "cells that fire together wire together," was proposed by Donald Hebb in 1949. In the
context of AI, Hebbian learning offers a way to understand and implement how learning occurs through the
adjustment of synaptic weights based on activity patterns.

The Hebb network built on this rule is used for pattern classification; it is a single-layer neural network, with one input layer and one output layer.
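In its simplest form, the rule updates each weight in proportion to the product of the activities it connects: delta_w = eta * x * y ("fire together, wire together"). Below is a minimal sketch (assuming numpy) of a supervised Hebb-rule variant training a toy single-layer classifier; the bipolar patterns and targets (the logical AND function) are invented for illustration:

import numpy as np

eta = 1.0  # learning rate
# Bipolar (+1/-1) training pairs for the logical AND function.
patterns = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]

w = np.zeros(2)  # weights
b = 0.0          # bias

# Hebb rule: strengthen each weight by the product of input and target activity.
for x, target in patterns:
    w += eta * np.array(x, dtype=float) * target
    b += eta * target

# The trained single-layer network classifies all four patterns correctly.
for x, target in patterns:
    y = 1 if np.dot(w, x) + b > 0 else -1
    print(x, "->", y, "(target:", target, ")")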

Applications in AI

1. Neural Networks: Hebbian learning can be used to train artificial neural networks, particularly in unsupervised learning scenarios. It helps in forming feature detectors and self-organizing maps.

2. Pattern Recognition: The rule is useful for learning patterns and associations in data, enabling systems to recognize and classify patterns based on the learned representations.

3. Reinforcement Learning: Hebbian principles can be integrated with reinforcement learning to adjust synaptic strengths based on the success of actions taken by an agent, enhancing the agent's ability to learn from interactions with the environment.

4. Cognitive Modeling: Hebbian learning is employed in cognitive models that simulate human learning and memory processes, providing insights into how humans learn and recall information.

Perceptron Learning
Perceptron learning is a foundational concept in artificial neural networks (ANNs) and machine learning.

It represents one of the earliest and simplest forms of neural networks, introduced by Frank Rosenblatt in
1958.

The perceptron algorithm provides a way for a machine to learn a binary classifier—a function that can
decide whether an input, represented by a vector of numbers, belongs to one class or another.

Perceptron learning revolves around adjusting the weights of the connections between the input layer and
the single output neuron in a perceptron.

This adjustment is based on the difference between the desired output (target) and the actual output
generated by the perceptron for a given input.

Perceptrons and perceptron learning are primarily used for educational purposes due to their simplicity.
They offer a good introduction to the concepts of neural network training and understanding how weights
are adjusted based on errors.

In some specific cases, perceptrons can be useful for solving simple linear classification problems, i.e. separating two classes that can be divided by a straight line.

Types of Perceptron

• Single-Layer Perceptron: This type of perceptron is limited to learning linearly separable patterns. It is effective for tasks where the data can be divided into distinct categories by a straight line.

• Multilayer Perceptron: Multilayer perceptrons possess enhanced processing capabilities, as they consist of two or more layers and are adept at handling more complex patterns and relationships within the data.

Basic Components of Perceptron

Frank Rosenblatt designed the perceptron model as a binary classifier containing three main components, as follows:



Input Nodes or Input Layer: This is the primary component of the perceptron, which accepts the initial data into the system for further processing. Each input node carries a real numerical value.

Weight and Bias: The weight parameter represents the strength of the connection between units and is among the most important parameters of the perceptron. A weight is directly proportional to how strongly the associated input neuron influences the output. The bias can be thought of as the intercept in a linear equation.

Activation Function: This final component determines whether the neuron will fire or not. In a perceptron, the activation function is primarily a step function. A minimal training sketch follows below.
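Putting the three components together, here is a minimal sketch (assuming numpy) of perceptron learning on the logical OR function; the learning rate and epoch count are arbitrary illustrative choices. Weights move by the perceptron rule w <- w + lr * (target - prediction) * x:

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
targets = np.array([0, 1, 1, 1])  # the OR function: linearly separable

w = np.zeros(2)  # weights
b = 0.0          # bias
lr = 0.1         # learning rate

for epoch in range(10):
    for x, t in zip(X, targets):
        prediction = 1 if np.dot(w, x) + b > 0 else 0  # step activation
        error = t - prediction
        w += lr * error * x  # adjust weights in proportion to the error
        b += lr * error      # adjust the bias the same way

print([1 if np.dot(w, x) + b > 0 else 0 for x in X])  # [0, 1, 1, 1]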

Back-propagation Learning
• In machine learning, backpropagation is an effective algorithm used to train artificial neural networks, especially feed-forward neural networks.
• Backpropagation is an iterative algorithm that helps minimize the cost function by determining which weights and biases should be adjusted. During every epoch, the model learns by adapting the weights and biases to minimize the loss, moving down along the gradient of the error. It is therefore usually paired with one of the two most popular optimization algorithms, gradient descent or stochastic gradient descent.
• Computing the gradient in the backpropagation algorithm helps minimize the cost function; it is implemented using the chain rule from calculus to navigate through the layers of the neural network.
Advantages of Using the Backpropagation Algorithm in Neural Networks

Backpropagation, a fundamental algorithm in training neural networks, offers several advantages that
make it a preferred choice for many machine learning tasks. Here, we discuss some key advantages of
using the backpropagation algorithm:

1. Ease of Implementation: Backpropagation does not require prior knowledge of neural networks,
making it accessible to beginners. Its straightforward nature simplifies the programming process, as it
primarily involves adjusting weights based on error derivatives.



2. Simplicity and Flexibility: The algorithm’s simplicity allows it to be applied to a wide range of
problems and network architectures. Its flexibility makes it suitable for various scenarios, from
simple feedforward networks to complex recurrent or convolutional neural networks.

3. Efficiency: Backpropagation accelerates the learning process by directly updating weights based on
the calculated error derivatives. This efficiency is particularly advantageous in training deep neural
networks, where learning features of a function can be time-consuming.

4. Generalization: Backpropagation enables neural networks to generalize well to unseen data by


iteratively adjusting weights during training. This generalization ability is crucial for developing
models that can make accurate predictions on new, unseen examples.

5. Scalability: Backpropagation scales well with the size of the dataset and the complexity of the
network. This scalability makes it suitable for large-scale machine learning tasks, where training data
and network size are significant factors.

The backpropagation algorithm works in two passes:

• Forward pass: The input is fed into the input layer and propagated through the network to produce an output, which is compared against the target to measure the error.

• Backward pass: The error is transmitted back through the network, which helps the network improve its performance by learning and adjusting its internal weights. A minimal sketch of both passes follows below.
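A minimal numpy sketch of both passes for a small network learning XOR (the layer sizes, learning rate, and epoch count are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))  # 2 inputs -> 4 hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))  # 4 hidden -> 1 output
lr = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(10000):
    # Forward pass: propagate the inputs layer by layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the error back using the chain rule.
    d_out = (out - y) * out * (1 - out)  # gradient at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)   # gradient at the hidden layer

    # Weight update: step each parameter down its gradient.
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2))  # typically approaches [[0], [1], [1], [0]]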

Deep Learning

• Deep learning is a subset of machine learning in artificial intelligence (AI) that focuses on algorithms inspired by the structure and function of the brain's neural networks.
• It uses artificial neural networks (ANNs) with multiple hidden layers to learn complex patterns from data.
• These deep neural networks are inspired by the structure and function of the human brain.
• In a fully connected deep neural network, there is an input layer followed by one or more hidden layers connected one after the other. Each neuron receives input from the previous layer's neurons or from the input layer; the output of one neuron becomes the input of neurons in the next layer, and this process continues until the final layer produces the output of the network.
• Deep learning can be used for supervised, unsupervised, as well as reinforcement machine learning.
• Deep learning can also learn from unlabeled data, while more basic machine learning models may require more context about the data they are fed in order to "learn" correctly.
Deep Learning Architectures

Deep learning architectures are the specific structures of artificial neural networks used in deep learning.
These architectures define how neurons are interconnected within a deep neural network (DNN) and how
information flows through the network. Here's a breakdown of some common deep learning architectures:
1. Convolutional Neural Networks (CNNs):
• Convolutional neural networks tend to be used in computer vision solutions with images as input. They capture the spatial structure of the data; rather than every pixel being seen as a standalone feature, the fact that pixels are next to each other or in close proximity can be taken into consideration.
• Primarily used for image and video analysis, they use convolutional layers to detect local patterns like edges, textures, and shapes.
• Particularly well suited for image recognition and analysis tasks.
• CNNs leverage the concept of spatial locality, where pixels in an image have a strong correlation with their neighboring pixels.
• Key components of CNNs (a minimal sketch follows below):
o Convolutional Layers: Apply filters to extract features from the input image. These filters progressively detect lower-level features (edges, corners) and then higher-level features (shapes, objects) deeper in the network.
o Pooling Layers: Downsample the data representation, reducing its dimensionality and computational cost.
o Fully-connected Layers: Similar to traditional ANNs; used for classification or regression tasks at the end of the network.
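A minimal sketch wiring these three components together (assuming PyTorch is installed; the filter counts, sizes, and class count are arbitrary illustrative choices):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # convolutional layer: 8 filters
    nn.ReLU(),
    nn.MaxPool2d(2),                            # pooling layer: downsample by 2x
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),                 # fully-connected layer: 10 classes
)

image = torch.randn(1, 1, 28, 28)  # one 28x28 grayscale image
print(model(image).shape)          # torch.Size([1, 10]): one score per class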
2. Recurrent Neural Networks (RNNs):
• Designed to handle sequential data like text, speech, or time series data.
• RNNs have loops or connections that allow information to persist across time steps. This enables them to learn dependencies between elements in a sequence.
• Two subtypes address limitations of the basic RNN architecture, in particular the vanishing gradient problem:
o Long Short-Term Memory (LSTM): Can learn long-term dependencies in sequences by using memory cells that control information flow.
o Gated Recurrent Unit (GRU): Another variant known for efficiency in handling long sequences.
3. Generative Adversarial Networks (GANs):
• Comprise two competing neural networks whose contest results in the generation of realistic data, such as images or videos:
o Generator: Aims to create new data instances that resemble the training data.
o Discriminator: Tries to distinguish between real data and data produced by the generator.
• Through this adversarial training process, the generator learns to create increasingly realistic outputs, such as images, music, or even text.

These are just a few of the most common deep learning architectures. There are many other variations, and research into new architectures for various purposes is ongoing.

