Unit-5 Machine Learning
Learning
Learning is the process of acquiring new knowledge, behaviors, skills, values, or preferences, or of
modifying existing ones. In other words, learning converts experience into expertise or knowledge.
Machine learning (ML) is a critical subset of artificial intelligence (AI) that focuses on the development
of algorithms and statistical models enabling computers to perform tasks without explicit instructions,
relying instead on patterns and inference.
Machine learning enables a computer system to make predictions or take some decisions using
historical data without being explicitly programmed.
Machine learning uses a massive amount of structured and semi-structured data so that a machine
learning model can generate accurate results or give predictions based on that data.
Arthur Samuel, a pioneer in the field of artificial intelligence and computer gaming, coined the
term “Machine Learning”.
The goal of ML is to allow machines to learn from data so that they can give accurate output. Deep
learning is a major subset of machine learning.
Machine learning aims to create machines that perform only the specific tasks for which
they are trained.
Machine learning is used in many different applications, from image and speech recognition to natural
language processing, recommendation systems, fraud detection, portfolio optimization, task automation,
and so on.
Machine learning models are also used to power autonomous vehicles, drones, and robots, making
them more intelligent and adaptable to changing environments.
[Figure: machine learning block diagram with the labels Data (Input), Output, Machine Learning, and Program]
Based on the methods and way of learning, machine learning is divided into three main types:
1) Supervised Learning
2) Unsupervised Learning
3) Reinforcement Learning
1] Supervised learning
Supervised learning is a paradigm in machine learning in which a model is trained on input objects
paired with their desired output values. The training data is processed to build a function that maps
new data to expected output values.
Supervised learning is a process of providing input data as well as correct output data to the machine
learning model. The aim of a supervised learning algorithm is to find a mapping function to map the
input variable(x) with the output variable(y).
In supervised learning, models are trained using a labelled dataset, where the model learns about each
type of data. Once the training process is completed, the model is tested on the basis of test data and
then it predicts the output.
In supervised learning, the algorithm is provided with input features and corresponding output labels,
and it learns to generalize from this data to make predictions on new, unseen data.
Advantages:
o Since supervised learning works with a labelled dataset, we can have an exact idea about the
classes of objects.
o These algorithms are helpful in predicting the output on the basis of prior experience.
Disadvantages:
Image Segmentation:
Supervised Learning algorithms are used in image segmentation. In this process, image classification is
performed on different image data with pre-defined labels.
Medical Diagnosis:
Supervised algorithms are also used in the medical field for diagnosis purposes. It is done by using
medical images and past data labelled with disease conditions. With such a process, the
machine can identify a disease for new patients.
Fraud Detection - Supervised Learning classification algorithms are used for identifying fraudulent
transactions, fraudulent customers, etc. It is done by using historic data to identify the patterns that can lead
to possible fraud.
Spam detection - In spam detection & filtering, classification algorithms are used. These algorithms
classify an email as spam or not spam. The spam emails are sent to the spam folder.
Suppose we have a dataset of different types of shapes, which includes squares, rectangles, triangles, and
other polygons. The first step is to train the model on each shape:
If the given shape has four sides, and all the sides are equal, then it will be labelled as a square.
If the given shape has three sides, then it will be labelled as a triangle.
If the given shape has six equal sides, then it will be labelled as a hexagon.
After training, we test the model using the test set, and the task of the model is to identify the
shape.
The machine has already been trained on all types of shapes, so when it finds a new shape, it classifies the
shape on the basis of its number of sides and predicts the output.
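The shape example above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the feature encoding (number of sides, whether all sides are equal), the tiny training set, and the nearest-neighbour rule are hypothetical choices made for the sketch, not something prescribed by these notes.

```python
# Labelled training data: (number_of_sides, all_sides_equal) -> shape label.
# The encoding and the examples are hypothetical, chosen to mirror the text.
training_data = [
    ((4, 1), "square"),
    ((4, 0), "rectangle"),
    ((3, 0), "triangle"),
    ((3, 1), "triangle"),
    ((6, 1), "hexagon"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, label = min(training_data, key=lambda item: squared_distance(item[0], features))
    return label


# Prediction phase: classify shapes from their features.
print(predict((4, 1)))  # a four-sided shape with equal sides
print(predict((3, 0)))  # a three-sided shape
```

The labels in `training_data` play the role of the "correct output" supplied during supervised training; the model never sees a rule such as "four equal sides means square", only labelled examples.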
Regression: - Regression algorithms are used when the output variable is continuous and depends on the
input variable. They are used for the prediction of continuous variables, such as weather forecasting, market
trends, etc.
Classification: - Classification algorithms are used when the output variable is categorical, which means
the output falls into discrete classes such as Yes-No, Male-Female, True-False, etc.
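To make the difference concrete, the sketch below fits a straight line by least squares (regression, continuous output) and then thresholds the prediction into a class (classification, categorical output). The data, the function name `fit_line`, and the 15-degree threshold are assumptions chosen purely for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours of sunshine (input) vs. temperature (continuous output).
hours = [1.0, 2.0, 3.0, 4.0]
temperature = [12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(hours, temperature)
predicted = slope * 5.0 + intercept  # regression: predict a continuous value

# Classification turns the result into a discrete class, e.g. "warm" vs. "cold".
label = "warm" if predicted >= 15.0 else "cold"
print(slope, intercept, predicted, label)
```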
2] Unsupervised learning
Unlike supervised learning, no teacher is provided, which means the machine is given no labelled training examples.
The goal of unsupervised learning is to find the underlying structure of dataset, group that data
according to similarities, and represent that dataset in a compressed format.
Unsupervised learning works on unlabeled and uncategorized data, which makes it especially
important when labelled data is not available.
In unsupervised learning, the models are trained with the data that is neither classified nor labelled, and
the model acts on that data without any supervision.
It uses two techniques i.e. Clustering (grouping the objects) and Association (finding the relationships
between variables) for problem solving.
The main aim of the unsupervised learning algorithm is to group or categorize the unsorted
dataset according to the similarities, patterns, and differences. Machines are instructed to find the
hidden patterns from the input dataset.
Advantages:
o These algorithms can be used for more complicated tasks than the supervised ones because they
work on unlabeled datasets.
o Unsupervised algorithms are preferable for various tasks because an unlabeled dataset is easier to
obtain than a labelled one.
Disadvantages:
o The output of an unsupervised algorithm can be less accurate as the dataset is not labelled, and
algorithms are not trained with the exact output in advance.
o Working with Unsupervised learning is more difficult as it works with the unlabelled dataset that does
not map with the output.
o Network Analysis: Unsupervised learning is used for identifying plagiarism and copyright in
document network analysis of text data for scholarly articles.
o Recommendation Systems: Recommendation systems widely use unsupervised learning techniques
for building recommendation applications for different web applications and e-commerce websites.
Consider unlabeled input data, which means it is not categorized and the corresponding outputs
are not given. This unlabeled input data is fed to the machine learning model in order to train it.
First, the model interprets the raw data to find the hidden patterns in it.
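As a rough sketch of how a model can find groups in unlabeled data, the example below uses k-means clustering on one-dimensional values. K-means is one common clustering technique and is used here as an assumption (the notes only name "clustering" in general); the data and starting centroids are hypothetical.

```python
def k_means_1d(points, centroids, iterations=10):
    """Group 1-D points around the given starting centroids (k-means)."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = sum(members) / len(members)
    return centroids, clusters


# Unlabelled, hypothetical data: no outputs are given, only raw values.
data = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = k_means_1d(data, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the two groups emerge only from the similarity (distance) between the values.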
3] Reinforcement Learning
Reinforcement Learning is a feedback-based machine learning technique in which an agent learns to
behave in an environment by performing actions and seeing the results of those actions.
In Reinforcement Learning, the agent learns automatically from feedback without any labeled data,
unlike supervised learning.
The primary goal of an agent in reinforcement learning is to improve the performance by getting the
maximum positive rewards.
The agent takes the next action and changes states according to the feedback of the previous action.
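The agent-environment loop described above can be sketched with tabular Q-learning, one well-known RL algorithm. The notes do not name a specific algorithm, so Q-learning, the five-state corridor environment, the reward of 1 at the goal, and the values of `ALPHA` and `GAMMA` are all assumptions made for this illustration.

```python
import random

N_STATES = 5          # states 0..4 in a corridor; state 4 is the goal
ACTIONS = [0, 1]      # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9


def step(state, action):
    """Environment: return (next_state, reward) for an action."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=300, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # cap on episode length
            action = rng.choice(ACTIONS)  # explore with random actions
            next_state, reward = step(state, action)
            # Feedback from the environment updates the value of (state, action).
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q_table = train()
# Greedy policy: in each non-goal state pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that "right" is correct; it only receives a reward on reaching the goal, and the learned values propagate that reward back to earlier states.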
Advantages
o It helps in solving complex real-world problems which are difficult to solve with conventional techniques.
o The learning model of RL is similar to the learning of human beings; hence it can produce highly
accurate results.
o Helps in achieving long term results.
Disadvantage
o Video Games:
RL algorithms are very popular in gaming applications, where they are used to reach super-human performance.
Well-known game-playing systems built with RL include AlphaGo and AlphaGo Zero.
o Resource Management:
The "Resource Management with Deep Reinforcement Learning" paper showed how to use RL in
computer systems to automatically learn to allocate and schedule resources for waiting jobs in order to minimize
average job slowdown.
o Robotics:
RL is widely being used in Robotics applications. Robots are used in the industrial and manufacturing
area, and these robots are made more powerful with reinforcement learning. There are different
industries that have their vision of building intelligent robots using AI and Machine learning
technology.
o Text Mining
Text mining, one of the major applications of NLP, is now being implemented with the help of
reinforcement learning by Salesforce.
Statistical-based Learning in AI
Statistical learning is a field that focuses on developing and analyzing models that can make predictions
or inferences based on data.
Statistical Learning is a subfield of machine learning that focuses on understanding and modeling
patterns in data using probability theory, statistics, and mathematical models. It provides the
foundation for many modern AI algorithms, especially in supervised and unsupervised learning.
Statistical learning is often used in scientific research and statistical analysis.
Statistical learning refers to a set of methods and techniques used in machine learning and statistics to
analyze and make predictions or decisions based on data.
It involves developing models that can uncover patterns, relationships, and trends within datasets,
allowing for the extraction of valuable insights and the creation of predictive models.
The purpose of statistical modeling is to find the relationship between variables and to test the
hypothesis.
Statistical modeling makes many assumptions in order to identify the underlying distributions and
relationships.
The model is developed on training data and tested on testing data. It is mostly used for research
purposes.
It is not best suited to large amounts of data.
Statistical learning techniques are widely applied in various domains, including finance, healthcare,
marketing, and natural language processing.
These methods enable computers to learn from data, make predictions, and uncover patterns that may
not be apparent through traditional programming approaches.
Naive Bayes is a probabilistic machine learning model that's widely used for classification tasks. It is based
on Bayes' Theorem, which describes the probability of an event based on prior knowledge of conditions
related to the event.
The model is termed "naive" because it makes a strong assumption: all features are independent of each
other given the class label. Despite this simplification, Naive Bayes classifiers perform surprisingly well in
many practical applications.
1. Data Preparation:
o The data is prepared by calculating the probabilities of each feature given the class and the prior
probability of each class.
2. Classification:
o For a new, unseen data point, the algorithm calculates the probability of the data point belonging to
each class.
o It then assigns the data point to the class with the highest probability.
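The two steps above can be sketched for a toy spam filter. This is a minimal illustration, not a reference implementation: the word-count features, the add-one (Laplace) smoothing, the function names, and the four example emails are assumptions introduced for the sketch.

```python
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """samples: list of (list_of_words, label). Returns the counts needed for prediction."""
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in samples:
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def predict(words, class_counts, word_counts, vocabulary):
    """Pick the class with the highest (unnormalised) posterior probability."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        n_words = sum(word_counts[label].values())
        for w in words:
            # Likelihood P(word | class) with add-one (Laplace) smoothing.
            score *= (word_counts[label][w] + 1) / (n_words + len(vocabulary))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data set for spam filtering.
emails = [
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
]
model = train_naive_bayes(emails)
print(predict(["free", "money", "now"], *model))
print(predict(["meeting", "at", "lunch"], *model))
```

Multiplying the per-word probabilities is exactly where the "naive" independence assumption enters: each word is treated as independent of the others given the class.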
Applications
Advantages
Efficiency: Fast training and prediction, especially useful for large datasets.
Performance: Performs well with less training data and in many real-world scenarios, despite the
independence assumption.
Disadvantages
Struggles with Complex Relationships: Naïve Bayes is not well-suited for datasets where complex
relationships exist between features.
Poor Performance on Small Data: Naïve Bayes relies on probability estimation, which may be unreliable
if the dataset is small or unbalanced.
Genetic algorithms simulate the process of natural selection, in which the species that can
adapt to changes in their environment survive, reproduce, and pass on to the next generation.
They are frequently used to find optimal or near-optimal solutions to difficult problems which otherwise
would take a lifetime to solve.
A genetic algorithm (GA) is a heuristic search algorithm used to solve search and optimization
problems.
Genetic algorithms have been successfully applied to various optimization problems, including
parameter tuning, scheduling, routing, and machine learning.
Genetic algorithms are based on the ideas of natural selection and genetics. They make intelligent
use of random search, guided by historical data, to direct the search into regions of
better performance in the solution space.
They are commonly used to generate high-quality solutions for optimization problems and search
problems.
It basically involves five phases to solve the complex optimization problems, which are given as below:
o Initialization
o Fitness Assignment
o Selection
o Reproduction
o Termination
1. Initialization
The process of a genetic algorithm starts by generating a set of individuals, which is called the population.
Here each individual is a candidate solution for the given problem. An individual is characterized by a
set of parameters called genes. Genes are combined into a string to form a chromosome, which encodes the
solution to the problem. One of the most popular techniques for initialization is the use of random binary
strings.
2. Fitness Assignment
The fitness function is used to determine how fit an individual is, that is, the ability of an individual to
compete with other individuals. In every iteration, individuals are evaluated based on their fitness function.
The fitness function provides a fitness score to each individual. This score further determines the
probability of being selected for reproduction. The higher the fitness score, the greater the chance of being
selected for reproduction.
3. Selection
The selection phase involves the selection of individuals for the reproduction of offspring. The selected
individuals are then arranged in pairs for reproduction. These individuals then transfer their
genes to the next generation.
4. Reproduction
This phase involves the creation of a child population. In this step, the genetic algorithm uses two variation
operators (Crossover & Mutation) that are applied to the parent population.
Crossover: Two or more parent solutions are combined to create new offspring. Crossover plays the
most significant role in the reproduction phase of the genetic algorithm. In this process, a crossover point is
selected at random within the genes. Then the crossover operator swaps genetic information of two parents
from the current generation to produce a new individual representing the offspring.
Mutation: Small random changes are made to some solutions to introduce diversity. The mutation operator
inserts random genes in the offspring (new child) to maintain the diversity in the population. It can be done
by flipping some bits in the chromosomes. Mutation helps in solving the issue of premature convergence
and enhances diversification.
5. Termination
After the reproduction phase, a stopping criterion is applied as a base for termination. The algorithm
terminates after the threshold fitness solution is reached. It will identify the final solution as the best
solution in the population.
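The five phases above can be sketched on a toy problem. The example below is an illustration under assumptions, not the only way to build a genetic algorithm: the OneMax fitness function (count the 1-bits), tournament selection, elitism, and all constants are hypothetical choices made to keep the sketch short.

```python
import random

GENES = 20             # chromosome length (bits)
POP_SIZE = 30
MUTATION_RATE = 0.02
MAX_GENERATIONS = 200


def fitness(chromosome):
    """Toy fitness function (OneMax): the number of 1-bits."""
    return sum(chromosome)


def select(population, rng):
    """Tournament selection: the fitter of two random individuals wins."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b


def crossover(parent1, parent2, rng):
    """Single-point crossover at a random position."""
    point = rng.randint(1, GENES - 1)
    return parent1[:point] + parent2[point:]


def mutate(chromosome, rng):
    """Flip each bit with a small probability."""
    return [1 - g if rng.random() < MUTATION_RATE else g for g in chromosome]


def run(seed=0):
    rng = random.Random(seed)
    # 1. Initialization: random binary strings.
    population = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
    history = []
    for _ in range(MAX_GENERATIONS):
        # 2. Fitness assignment.
        best = max(population, key=fitness)
        history.append(fitness(best))
        # 5. Termination: stop once the threshold fitness is reached.
        if fitness(best) == GENES:
            break
        # 3. Selection and 4. Reproduction (crossover + mutation).
        children = [
            mutate(crossover(select(population, rng), select(population, rng), rng), rng)
            for _ in range(POP_SIZE - 1)
        ]
        population = [best] + children  # elitism: keep the best individual unchanged
    return max(population, key=fitness), history


best, history = run()
print(fitness(best), len(history))
```

Because the best individual is copied unchanged into each new generation (elitism), the best fitness recorded in `history` can never decrease from one generation to the next.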
Through training data and experience, neural networks give machines the ability to learn from mistakes
and improve their performance over time. As a result, neural networks are ideal for handling more
complex data-related tasks.
A neural network is a type of machine learning model, central to deep learning, that uses interconnected
nodes or neurons in a layered structure that resembles the human brain.
The key advantage of neural networks is that they can extract features from data automatically,
without needing input from the programmer.
Neural networks mimic how the human brain operates, enabling computer programs in AI, machine
learning, and deep learning to identify patterns and address common issues.
The structure of artificial neural networks is based on nodes organized into input,
hidden, and output layers. There are connections between each node or artificial neuron, and each one
Input Layer: It accepts inputs in several different formats provided by the programmer.
Hidden Layer: The hidden layer sits between the input and output layers. It performs all the
calculations to find hidden features and patterns.
Output Layer: The input goes through a series of transformations using the hidden layer, which finally
results in output that is conveyed using this layer.
Artificial neural networks are used in several areas:
Social Media: Artificial neural networks are used heavily in social media. For example, the
‘People you may know’ feature on Facebook suggests people that you might know in real life so
that you can send them friend requests.
Marketing and Sales: When you log onto e-commerce sites like Amazon and Flipkart, they
recommend products to buy based on your previous browsing history.
Medical: Neural networks can be used to detect cancer cells and to analyze MRI images to give detailed results.
Personal Assistants: Siri, Alexa, and Cortana are personal assistants and examples of speech recognition that
use Natural Language Processing to interact with users and formulate a response accordingly.
Image processing: Satellite imagery can be processed for agricultural and defense use.
Signature Classification: Artificial neural networks are employed to recognize signatures and categorize
them according to the person’s class when developing authentication systems.
Parameter | Computer | Brain
Processing | Its processing is sequential and centralized. | It processes the information in a parallel and distributive manner.
Control Mechanism | Its control unit keeps track of all computer-related operations. | All processing is managed centrally.
Complexity | It cannot perform complex pattern recognition. | The large quantity and complexity of the connections allow the brain to perform complicated tasks.
Memory | Its memory is separate from a processor, localized, and non-content addressable. | Its memory is integrated into the processor, distributed, and content-addressable.
Learning | It has very accurate structures and formatted data. | They are tolerant to ambiguity.
An activation function is simply a function that you use to get the output of a node. It is also known as a Transfer Function.
The role of the Activation Function is to derive output from a set of input values fed to a node (or a
layer).
The activation function defines the output of a node based on a set of specific inputs in machine
learning, deep neural networks, and artificial neural networks.
It is used to determine the output of a neural network, such as yes or no. It maps the resulting values into
a range such as 0 to 1 or -1 to 1 (depending upon the function).
Activation functions are necessary for neural networks because, without them, the output of the model
would simply be a linear function of the input. In other words, it wouldn’t be able to model complex,
non-linear relationships in the data.
This non-linearity allows the network to learn complex patterns and perform more sophisticated tasks.
A linear activation function is a straight line. Therefore, the output of the function is not confined
to any range.
It doesn’t help with the complexity or various parameters of the usual data that is fed to neural
networks.
Nonlinear activation functions are the most used activation functions. Nonlinearity gives the
graph of the function a curved shape.
This makes it easy for the model to generalize or adapt to a variety of data and to differentiate
between the outputs.
Modern neural network models use non-linear activation functions. They allow the model to create
complex mappings between the network’s inputs and outputs, such as images, video, audio, and
data sets that are non-linear or have high dimensionality.
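A few common activation functions can be written directly in Python. The selection below (identity, sigmoid, tanh, ReLU) is an assumption made for illustration; the notes mention only the output ranges 0 to 1 and -1 to 1, which correspond to sigmoid and tanh respectively.

```python
import math


def linear(x):
    """Linear (identity) activation: output is unbounded."""
    return x


def sigmoid(x):
    """Squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes any real input into the range (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: 0 for negative inputs, x otherwise."""
    return max(0.0, x)


for value in (-2.0, 0.0, 2.0):
    print(value, linear(value), sigmoid(value), tanh(value), relu(value))
```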
Artificial Neural Networks (ANNs) come in various forms, each with its strengths and applications. Here's
a breakdown of some common types of ANNs:
1. Feedforward Neural Networks:
The most basic and widely used type of ANN. Information flows in one direction, from the input
layer through hidden layers (if any) to the output layer.
This network might or might not have hidden layers, which can make its functioning easier to
interpret. It can cope with large amounts of noise.
The primary advantage of this network is that it figures out how to evaluate and recognize input
patterns.
Used for various tasks like image recognition, speech recognition, and function approximation.
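The one-directional flow of a feedforward network can be sketched as a chain of fully connected layers. The layer sizes, weights, biases, and the choice of a sigmoid activation below are hypothetical values picked only to make the example runnable.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through each layer in order (no loops back)."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
hidden = ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.5, 0.2]], [0.05])

result = forward([1.0, 2.0], [hidden, output])
print(result)
```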
2. Recurrent Neural Networks (RNNs):
Designed to handle sequential data like text, speech, or time series data.
RNNs have internal loops that allow information to persist across time steps. This enables them to
learn dependencies between elements in a sequence.
A recurrent neural network saves the output of a layer and feeds it back to the input to help in
predicting the outcome of the layer.
It stores information required for its future use. If the prediction is wrong, small corrections are
made, scaled by the learning rate.
A single-layered neural network, also known as a perceptron, is the simplest form of an Artificial Neural
Network (ANN). While not as powerful as their multi-layered counterparts, they provide a fundamental
building block for understanding how ANNs work and can be useful for specific tasks.
The output neuron computes a weighted sum of the inputs plus a bias term, then applies an activation
function to generate the final output.
Artificial Neural Networks (ANNs) learn through a process called training. This involves feeding the
network a large dataset of labeled examples and iteratively adjusting the connections between neurons to
minimize the difference between the network's output and the desired output for those examples. Here's a
breakdown of the key steps involved:
1. Data Preparation:
The first step is to gather a dataset relevant to the task you want the ANN to perform. This data needs to
be labeled, meaning each data point should have a corresponding desired output (e.g., image with a
label indicating the object in the image).
The data is then preprocessed to ensure it's in a format suitable for the ANN. This might involve scaling
the data to a specific range or encoding categorical data into numerical values.
2. Choosing an ANN Architecture:
The architecture of an ANN refers to its structure, including the number of layers, the number of
neurons in each layer, and the connections between them. The choice of architecture depends on the
complexity of the task and the size of the dataset.
3. Setting Up the Training Process:
This involves defining parameters like:
o Learning Rate: Controls how much the weights are adjusted during each iteration.
o Loss Function: A mathematical function that measures the difference between the network's output
and the desired output. Common choices include mean squared error for regression tasks and cross-
entropy for classification tasks.
o Optimizer: An algorithm that determines how to update the weights based on the learning rate and the
loss function. Popular optimizers include gradient descent and its variants (e.g., Adam, RMSprop).
4. Forward Pass and Backpropagation:
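Putting the steps above together, one possible training loop for the simplest case (a single linear neuron) looks like the sketch below. The data set, the learning rate of 0.05, the number of epochs, and the use of plain gradient descent with a mean squared error loss are all assumptions chosen for illustration.

```python
# Hypothetical labelled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

LEARNING_RATE = 0.05
EPOCHS = 2000


def mse(w, b):
    """Loss function: mean squared error over the data set."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


w, b = 0.0, 0.0
initial_loss = mse(w, b)
for _ in range(EPOCHS):
    # Forward pass: compute the prediction errors.
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # Backward pass: gradients of the MSE with respect to w and b.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    # Optimizer (plain gradient descent): step against the gradient.
    w -= LEARNING_RATE * grad_w
    b -= LEARNING_RATE * grad_b

final_loss = mse(w, b)
print(w, b, initial_loss, final_loss)
```

A larger learning rate would take bigger steps and could overshoot the minimum, while a much smaller one would need more epochs to reach the same loss.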
Hebbian learning
Hebbian learning is a fundamental principle in neuroscience and artificial intelligence, inspired by the way
neurons in the brain strengthen their connections through repeated stimulation. The principle, often
summarized as "cells that fire together wire together," was proposed by Donald Hebb in 1949. In the
context of AI, Hebbian learning offers a way to understand and implement how learning occurs through the
adjustment of synaptic weights based on activity patterns.
It is used for pattern classification. It is a single layer neural network, i.e. it has one input layer and one
output layer.
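A minimal sketch of the Hebb rule for pattern classification is shown below, using the logical AND function with bipolar (+1 / -1) inputs and targets. The update rule used, weight change = input times output, is the standard textbook form; the example data and variable names are assumptions for illustration.

```python
# Bipolar AND gate: inputs and targets are +1 / -1 (hypothetical example data).
samples = [
    ((1, 1), 1),
    ((1, -1), -1),
    ((-1, 1), -1),
    ((-1, -1), -1),
]

weights = [0, 0]
bias = 0

# Hebb rule: strengthen a weight when input and output are active together.
for inputs, target in samples:
    for i, x in enumerate(inputs):
        weights[i] += x * target   # delta_w_i = x_i * y
    bias += target                 # delta_b = y


def predict(inputs):
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else -1


print(weights, bias)
print([predict(inputs) for inputs, _ in samples])
```

Each training pattern is presented once; there is no error term, which is what distinguishes the Hebb rule from error-driven rules such as perceptron learning.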
Applications in AI
1. Neural Networks: Hebbian learning can be used to train artificial neural networks, particularly in
unsupervised learning scenarios. It helps in forming feature detectors and self-organizing maps.
2. Pattern Recognition: The rule is useful for learning patterns and associations in data, enabling systems
to recognize and classify patterns based on the learned representations.
3. Reinforcement Learning: Hebbian principles can be integrated with reinforcement learning to adjust
4. Cognitive Modeling: Hebbian learning is employed in cognitive models that simulate human learning
and memory processes, providing insights into how humans learn and recall information.
Perceptron Learning
Perceptron learning is a foundational concept in artificial neural networks (ANNs) and machine learning.
It represents one of the earliest and simplest forms of neural networks, introduced by Frank Rosenblatt in
1958.
The perceptron algorithm provides a way for a machine to learn a binary classifier—a function that can
decide whether an input, represented by a vector of numbers, belongs to one class or another.
Perceptron learning revolves around adjusting the weights of the connections between the input layer and
the single output neuron in a perceptron.
This adjustment is based on the difference between the desired output (target) and the actual output
generated by the perceptron for a given input.
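The weight adjustment described above can be sketched as follows. The AND-gate training set, the learning rate of 1.0, the zero initial weights, and the fixed number of epochs are assumptions made to keep the example small.

```python
def step(z):
    """Step activation: the neuron fires (1) only if z is positive."""
    return 1 if z > 0 else 0


def train_perceptron(samples, learning_rate=1.0, epochs=20):
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = step(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = target - output  # desired output minus actual output
            # Adjust each weight in proportion to the error and its input.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Logical AND: a linearly separable problem (hypothetical training set).
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [
    step(sum(w * x for w, x in zip(weights, inputs)) + bias)
    for inputs, _ in and_gate
]
print(weights, bias, predictions)
```

When the output already matches the target, the error is zero and the weights are left unchanged, so training settles once every example is classified correctly.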
Perceptrons and perceptron learning are primarily used for educational purposes due to their simplicity.
They offer a good introduction to the concepts of neural network training and understanding how weights
are adjusted based on errors.
In some specific cases, perceptrons can be useful for solving simple linear classification problems, such as
learning the logical AND or OR function.
Types of Perceptron
Single-Layer Perceptron: This type of perceptron is limited to learning linearly separable patterns. It is
effective for tasks where the data can be divided into distinct categories by a straight line.
Frank Rosenblatt invented the perceptron model as a binary classifier which contains three main
components. These are as follows:
Weight and Bias: The weight parameter represents the strength of the connection between units. It is another
very important parameter of the perceptron. Weight is directly proportional to the strength of the
associated input neuron in deciding the output. Further, bias can be considered as the intercept in a
linear equation.
Activation Function: This is the final component, and it determines whether the
neuron will fire or not. In the perceptron, the activation function is primarily a step function.
Back-propagation Learning
In machine learning, backpropagation is an effective algorithm used to train artificial neural
networks, especially feed-forward neural networks.
Backpropagation is an iterative algorithm that helps to minimize the cost function by determining
how the weights and biases should be adjusted. During every epoch, the model learns by adapting the
weights and biases to reduce the loss, moving down the gradient of the error. It is therefore
used together with an optimization algorithm such as gradient descent or stochastic
gradient descent.
The gradient in the backpropagation algorithm is computed with the chain rule from calculus,
which propagates the error derivatives backward through the layers of the neural network so that
the cost function can be minimized.
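The chain-rule computation can be sketched for a very small network: one input, one sigmoid hidden neuron, and one linear output neuron. The network shape, the squared-error loss, and the starting parameters are assumptions made for illustration; the finite-difference check at the end is included only to confirm that the hand-derived gradients are consistent.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def loss(params, x, target):
    """Forward pass of a tiny network: x -> sigmoid neuron -> linear output."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Gradients of the loss with respect to each parameter via the chain rule."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)          # forward pass
    y = w2 * h + b2
    d_y = y - target                  # dL/dy
    d_w2 = d_y * h                    # dL/dw2 = dL/dy * dy/dw2
    d_b2 = d_y
    d_h = d_y * w2                    # error sent back to the hidden neuron
    d_z = d_h * h * (1.0 - h)         # through the sigmoid derivative
    d_w1 = d_z * x
    d_b1 = d_z
    return [d_w1, d_b1, d_w2, d_b2]


def numerical_gradient(params, x, target, eps=1e-6):
    """Finite-difference check of the analytic gradients."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))
    return grads


params = [0.5, -0.3, 0.8, 0.1]   # hypothetical starting weights and biases
analytic = backprop(params, x=1.5, target=1.0)
numeric = numerical_gradient(params, x=1.5, target=1.0)
print(analytic)
print(numeric)
```

Each gradient is the product of the local derivatives along the path from the loss back to that parameter, which is why the error for the hidden neuron (`d_h`) is computed before the gradients of `w1` and `b1`.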
Advantages of Using the Backpropagation Algorithm in Neural Networks
Backpropagation, a fundamental algorithm in training neural networks, offers several advantages that
make it a preferred choice for many machine learning tasks. Here, we discuss some key advantages of
using the backpropagation algorithm:
1. Ease of Implementation: Backpropagation does not require prior knowledge of neural networks,
making it accessible to beginners. Its straightforward nature simplifies the programming process, as it
primarily involves adjusting weights based on error derivatives.
3. Efficiency: Backpropagation accelerates the learning process by directly updating weights based on
the calculated error derivatives. This efficiency is particularly advantageous in training deep neural
networks, where learning features of a function can be time-consuming.
5. Scalability: Backpropagation scales well with the size of the dataset and the complexity of the
network. This scalability makes it suitable for large-scale machine learning tasks, where training data
and network size are significant factors.
Backward pass: In the backward pass, the error is transmitted back through the network,
which helps the network improve its performance by learning and adjusting the internal weights.
Deep Learning
Deep learning is a subset of machine learning in artificial intelligence (AI) that focuses on algorithms
inspired by the structure and function of the brain's neural networks.
It uses artificial neural networks (ANNs) with multiple hidden layers to learn complex patterns from
data.
These deep neural networks are inspired by the structure and function of the human brain.
In a fully connected Deep neural network, there is an input layer and one or more hidden layers
connected one after the other. Each neuron receives input from the previous layer neurons or the input
layer. The output of one neuron becomes the input to other neurons in the next layer of the network,
and this process continues until the final layer produces the output of the network.
Deep learning AI can be used for supervised, unsupervised as well as reinforcement machine learning.
Deep learning can also learn from unlabeled data, while more basic machine learning models may
require more context about the data they are fed in order to "learn" correctly.
Deep Learning Architectures
Deep learning architectures are the specific structures of artificial neural networks used in deep learning.
These architectures define how neurons are interconnected within a deep neural network (DNN) and how
information flows through the network. Here's a breakdown of some common deep learning architectures:
1. Convolutional Neural Networks (CNNs):
Convolutional neural networks tend to be used in computer vision solutions using images as input. They
capture the spatial aspects of the data; rather than every pixel being seen as a standalone feature, the fact
These are just a few of the most common deep learning architectures. There are many other variations and