Lecture 4

Paradigms and Hypothesis

Prof. Subir Kumar Das, Dept. of CSE


Learning Paradigms
• Machine Learning (ML) is an application where algorithms can learn from
experience without being explicitly programmed.
• Our algorithm receives input data that we want it to analyze, and it then
outputs what it has discovered from our input.
• What distinguishes ML is the element of learning.
• The learning paradigms in ML are categorized based on the degree of human intervention involved, each serving specific purposes and applications.
• This dynamic field encompasses various learning paradigms, each with its
unique approach to handling data.
• These different approaches are called machine learning paradigms, and they help us understand how a computer learns from data, specifically from the input.
• Machine learning is commonly separated into three main learning
paradigms:
• Supervised Learning, Unsupervised Learning and Reinforcement Learning.
• These paradigms differ in the tasks they can solve and in how the data is presented to the computer.
Supervised Learning
• Supervised learning, one of the main machine learning paradigms, works much like concept mapping: you put something in, and you receive something back.
• When you input data and evaluate it, the algorithm produces a function that abstracts the system into rules — rules that often would not be obvious to a human.
• It involves labeled datasets, where each data observation is paired with a
corresponding class label.
• In supervised learning, the computer learns from a set of input-output pairs, which are called labeled examples.
• Algorithms aim to build a mathematical function that maps input features to desired output values based on these labeled examples.
• The goal is usually to train a predictive model from these pairs.
• A predictive model is a program that is able to guess the output value
(a.k.a. label) for a new unseen input.
Supervised Learning

• Our goal is to predict the weight of an animal from its other characteristics, so we rewrite this dataset as a set of input-output pairs:
• The input variables (here, age and gender) are
generally called features, and the set of features
representing an example is called a feature vector.
• Now we can use this predictor to guess the weight
of a new object:
• In[•]:= p[{5 yr, "Female"}]
• Out[•]= 3.65234 kg
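The slide's `p[{5 yr, "Female"}]` call is Wolfram Language; a comparable predictor can be sketched in Python. The toy dataset, the distance rule, and the gender penalty below are illustrative assumptions, not the slide's actual model:

```python
# A minimal sketch of a supervised predictor, assuming a small hypothetical
# dataset of ((age_years, gender), weight_kg) labeled examples.
def train_nearest_neighbor(examples):
    """Return a predictor that outputs the weight of the most similar example."""
    def predict(age, gender):
        # Distance: age difference, plus a penalty when the gender differs.
        def distance(ex):
            (ex_age, ex_gender), _ = ex
            return abs(ex_age - age) + (0 if ex_gender == gender else 10)
        (_, weight) = min(examples, key=distance)
        return weight
    return predict

# Hypothetical labeled examples: ((age, gender), weight)
data = [((2, "Male"), 4.1), ((5, "Female"), 3.7), ((9, "Male"), 5.2)]
p = train_nearest_neighbor(data)
print(p(5, "Female"))  # closest labeled example gives 3.7
```

The trained object `p` plays the same role as the slide's predictor: it guesses the label for a new, unseen input.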
Supervised Learning
• Benefits
• A high degree of control. The model trainer provides the data and rules
for what the model will learn.
• Easy and intuitive to understand. Correct predictions of training data are
easy for most people to understand.
• Suitable for understanding relationships between input and output data if
both can be provided.
• Disadvantages
• By definition, supervised learning models must provide both input and
output data.
• Supervised machine learning algorithms require a data set with inputs
and outputs to train. This data can be challenging to obtain or create.
• The developer may want the model to outperform the human player, in which case supervised machine learning is not the correct paradigm.
• There is always a risk of overfitting models and forecasts.
• Supervised learning requires manual algorithm selection. Given the problem, it is necessary to choose the correct algorithm manually.
Unsupervised Learning
• In unsupervised learning, algorithms work with unlabeled data to identify
patterns and relationships.
• It is not used as much as supervised learning, but it unlocks different
types of applications.
• These methods uncover commonalities within the data without
predefined categories.
• This is a self-organized type of ML paradigm.
• It lets the system work so that the clusters make sense based on the requirements.
• For example, if someone handed over a bunch of vegetables and asked for them to be sorted, without saying what they were or what the sorting was for, one could choose to sort by color.
• Potatoes, carrots, etc., are of different colors, so a sorting method could use that to accomplish the task.
• In this learning paradigm, the data is provided and the system is left to make sense of it based on some vague initial assumptions implied in the data.
• Unsupervised learning can be used for a diverse range of tasks.
• One of them is called clustering, and its goal is to separate data examples into groups called clusters.
Unsupervised Learning

• An application of clustering could be to automatically separate the customers of a company to create better marketing campaigns.
• Clustering is also simply used as an exploration tool to obtain insights
about the data and make informed decisions.
• It is helpful in various medical applications.
• It does not require data labeling and is a suitable methodology for clustering and for finding patterns and structures in raw, untagged data.
• It can be used for dimensionality reduction, which is advantageous when
the number of supervised learning features is high.
• But its trial-and-error computations are often more time-consuming than supervised learning.
• Compared to a model trained on labeled data, the results (e.g., the identified patterns) may be less accurate.
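As a concrete sketch of clustering, here is a minimal 1-D k-means; the choice of algorithm and the toy data are illustrative assumptions (the slides do not name a specific method):

```python
# A minimal 1-D k-means sketch: groups unlabeled points without any
# predefined categories, alternating assignment and update steps.
def kmeans_1d(points, centers, iterations=10):
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Unlabeled data with two natural groups, around 1.0 and 10.0.
points = [0.9, 1.0, 1.1, 9.9, 10.0, 10.1]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)   # roughly [1.0, 10.0]
```

The algorithm discovers the two groups purely from the structure of the inputs — no labels are involved.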
Reinforcement Learning
• Reinforcement learning focuses on enabling intelligent agents to learn
tasks through trial-and-error interactions with dynamic environments.
• Without the need for labeled datasets, agents make decisions to
maximize a reward function.
• This autonomous exploration and learning approach is crucial for tasks
where explicit programming is challenging.
• The data is not provided as a fixed set of examples.
• Rather, the data to learn from is obtained by interacting with an external
system called the environment.
• The name “reinforcement learning” originates from behavioral
psychology, but it could just as well be called “interactive learning.”
• Reinforcement learning is often used to teach agents, such as robots, to
learn a given task.
• The agent learns by taking actions in the environment and
receiving observations from this environment
• The agent starts its learning process by acting randomly in the
environment, and then the agent gradually learns from its experience to
perform the task better using a sort of trial-and-error strategy.
• The learning is usually guided by a reward that is given to the agent depending on its performance.
Reinforcement Learning

• Reinforcement learning operates on an action-reward feedback loop, where agents take actions, receive rewards, and interpret the environment’s state.
• This iterative process allows the agent to autonomously learn optimal
actions to maximize positive feedback.
• Benefits
• Does not require training data to work and can be used in uncertain environments with little information.
• Models improve with experience and can perform better than the humans who wrote them.
• Disadvantages
• More straightforward problems could be better solved by a supervised or unsupervised machine learning approach.
• May require a lot of processing power, and the cost of learning can be high.
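The action-reward loop described above can be sketched with tabular Q-learning on a tiny corridor environment; the environment, reward, and hyperparameters are all illustrative assumptions, not taken from the slides:

```python
# A minimal Q-learning sketch: three states (0, 1, 2), reward only when the
# agent reaches state 2. The agent starts by acting randomly and gradually
# learns from trial and error, guided by the reward.
import random

def step(state, action):                 # action: +1 (right) or -1 (left)
    nxt = max(0, min(2, state + action))
    return nxt, (1.0 if nxt == 2 else 0.0)   # reward only at the goal

random.seed(0)
q = {(s, a): 0.0 for s in range(3) for a in (-1, 1)}
alpha, gamma, epsilon = 0.5, 0.9, 0.3    # learning rate, discount, exploration

for _ in range(300):                     # episodes of trial and error
    state = 0
    for _ in range(10):
        # Epsilon-greedy: mostly exploit current knowledge, sometimes explore.
        if random.random() < epsilon:
            action = random.choice((-1, 1))
        else:
            action = max((-1, 1), key=lambda a: q[(state, a)])
        nxt, reward = step(state, action)
        # Update toward the reward plus the discounted best future value.
        best_next = max(q[(nxt, -1)], q[(nxt, 1)])
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt
        if reward > 0:                   # goal reached, episode ends
            break

policy = {s: max((-1, 1), key=lambda a: q[(s, a)]) for s in range(3)}
print(policy[0], policy[1])   # the learned policy moves right toward the goal
```

Note there is no labeled dataset anywhere: the data to learn from is generated by interacting with the environment.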
Semi-Supervised Learning
• Semi-supervised learning strikes a balance by combining a small amount
of labeled data with a larger pool of unlabeled data.
• This approach leverages the benefits of both supervised and
unsupervised learning paradigms, making it a cost-effective and efficient
method for training models when the labeled data is limited.
• In semi-supervised learning, a part of the data is in the form of input-output pairs, like in supervised learning.

• Another part of the data only contains inputs:


• The goal is generally to learn a predictive model from both of these
datasets.
• Semi-supervised learning is thus a supervised learning problem for which some training labels are missing.
• The unlabeled dataset is much bigger than the labeled dataset.
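One common semi-supervised strategy is self-training; the slides do not prescribe a specific method, so the threshold model and toy data below are illustrative assumptions:

```python
# A minimal self-training sketch: fit a 1-D threshold classifier on a few
# labeled points, pseudo-label the larger unlabeled pool, then refit.
def fit_threshold(points):
    """Pick the midpoint threshold separating class 0 (small x) from class 1."""
    zeros = [x for x, y in points if y == 0]
    ones = [x for x, y in points if y == 1]
    return (max(zeros) + min(ones)) / 2

labeled = [(1.0, 0), (9.0, 1)]          # small labeled set
unlabeled = [1.5, 2.0, 8.0, 8.5]        # larger unlabeled pool

threshold = fit_threshold(labeled)      # 5.0 from the two labeled points
# Pseudo-label the unlabeled data with the current model ...
pseudo = [(x, 0 if x < threshold else 1) for x in unlabeled]
# ... and refit on the combined dataset, as in supervised learning.
threshold = fit_threshold(labeled + pseudo)
print(threshold)  # midpoint between 2.0 and 8.0 -> 5.0
```

The unlabeled pool sharpens the boundary estimate even though only two points were ever labeled by hand.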
Ensemble Learning
• Ensemble learning is a process where multiple base models (most often referred to as “weak learners”) are combined and trained to solve the same problem.
• This method is based on the idea that a weak learner alone performs a task poorly, but when combined with other weak learners it forms a strong learner, and these ensemble models produce more accurate results.

• Ensemble learning is a technique that combines multiple machine learning algorithms to produce one optimal predictive model with reduced variance (using bagging), reduced bias (using boosting), and improved predictions (using stacking).
• Ensemble methods fall into two broad categories, i.e., sequential
ensemble techniques and parallel ensemble techniques.



Other Learning Paradigms
• Online learning is a way to learn iteratively from a stream of data.
• In its pure form, the model updates itself after each example given.

• This kind of learning could be used by a bank needing to continuously update its fraud detection system by learning from the numerous transactions made every day.
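A minimal sketch of the iterative, per-example update that defines online learning; the fraud-detection setting is the slide's, while the running-mean model of transaction amounts is an illustrative assumption:

```python
# An online model that updates itself after each example in the stream,
# rather than retraining on a fixed dataset.
class OnlineMean:
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        # Incremental mean: no need to store any past examples.
        self.n += 1
        self.mean += (x - self.mean) / self.n

model = OnlineMean()
for amount in [10.0, 20.0, 30.0]:   # stream of transaction amounts
    model.update(amount)
print(model.mean)  # 20.0
```

The key property is that the model state is revised one example at a time, so it can keep up with an unbounded stream.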
• Active learning is a way to teach a predictive model by interacting with an
on-demand source of information.
• At the beginning of an active learning procedure, the data only consists of
inputs
• During the learning procedure, the student model can request some of these unknown outputs from a teacher (also known as an oracle).
• A teacher is a system able to predict (sometimes not perfectly) the output from a given input.
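The student/teacher interaction can be sketched with uncertainty sampling — querying the input the student is least sure about; the query strategy, the oracle, and the data are illustrative assumptions:

```python
# A minimal active-learning sketch: the student holds a threshold model and
# asks the oracle to label the pooled input closest to its decision boundary.
def most_uncertain(pool, threshold):
    return min(pool, key=lambda x: abs(x - threshold))

def oracle(x):                 # hypothetical teacher: true boundary at 4.0
    return x >= 4.0

pool = [0.5, 3.9, 9.0, 7.0]    # data with only inputs, no outputs
threshold = 5.0                # student's current guess at the boundary
query = most_uncertain(pool, threshold)   # 3.9 sits closest to the boundary
label = oracle(query)          # request this one label from the teacher
print(query, label)
```

Labels far from the boundary would teach the student little, so querying near the boundary spends the labeling budget where it matters most.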



Other Learning Paradigms
• Transfer learning deals with transferring knowledge from one learning task
to another learning task.
• It is typically used to learn more efficiently from small datasets when we
have access to a much larger dataset that is similar (but different).
• The strategy is generally to train a model on the large dataset and then use this pre-trained model to help train another model on the task that we really care about.

• Self-Supervised Learning (SSL) generally refers to a supervised learning problem for which the inputs and outputs can be obtained from the data itself, without needing any human annotation.
• SSL transforms unsupervised ML problems into supervised ones,
enhancing learning efficiency. This paradigm is particularly relevant with
the rise of large language models.
• To learn how to do this, we can use a dataset of sentences.
• We can then transform this dataset into a supervised learning problem.
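A sketch of that transformation: input-output pairs are obtained from the raw sentences themselves by hiding one word and asking the model to predict it. The masking scheme below is an illustrative assumption, in the spirit of language-model pretraining:

```python
# Turn unlabeled sentences into supervised (input, target) pairs by masking
# one word at a time -- no human annotation needed.
sentences = ["the cat sat on the mat", "dogs chase cats"]

def make_ssl_pairs(sentence):
    words = sentence.split()
    pairs = []
    for i, word in enumerate(words):
        masked = words[:i] + ["<mask>"] + words[i + 1:]
        pairs.append((" ".join(masked), word))   # (input with gap, target word)
    return pairs

pairs = make_ssl_pairs(sentences[0])
print(pairs[1])  # ('the <mask> sat on the mat', 'cat')
```

An unsupervised pile of text becomes a fully labeled supervised dataset, which is exactly how SSL "transforms unsupervised ML problems into supervised ones."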
Hypothesis
• In a machine learning problem the input is denoted by x and the output
is denoted by y
• In order to do machine learning, there should exist a relationship
(pattern) between the input and output values.
• Let’s say this is the function y = f(x); this is known as the target function.
• However, f(·) is an unknown function to the user.
• So machine learning algorithms try to guess a “hypothesis” function h(x) that approximates the unknown f(·).
• The set of all possible hypotheses is known as the hypothesis set H.
• The goal of the learning process is to find the final hypothesis that best approximates the unknown target function.
• Therefore a hypothesis refers to a tentative explanation or model that the
algorithm proposes based on the given data.
• It is the model’s presumption regarding the connection between the input
features and the result.
• A Hypothesis space is a complete range of models and their possible
parameters that can be used to model the data. It is signified by “H”.
• In other words, the hypothesis is a subset of the hypothesis space.
Generalization of Hypothesis
• Hypotheses in machine learning are formulated based on various
algorithms and techniques, each with its representation. For example:
• Linear Regression: h(X) = θ0 + θ1X1 + θ2X2 + … + θnXn
• Decision Trees: h(X)=Tree(X)
• Neural Networks: h(X)=NN(X)
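The linear regression form above can be made concrete: choosing h(x) = θ0 + θ1·x from the linear hypothesis space by least squares. The one-feature closed form and the toy data are illustrative assumptions:

```python
# Pick the best linear hypothesis h(x) = theta0 + theta1 * x for a dataset
# generated by an unknown target function f.
def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Closed-form least squares for a single feature.
    theta1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
              / sum((x - mx) ** 2 for x in xs))
    theta0 = my - theta1 * mx
    return theta0, theta1

# Data secretly generated by the target function f(x) = 2x + 1.
xs, ys = [0, 1, 2, 3], [1, 3, 5, 7]
theta0, theta1 = fit_linear(xs, ys)
h = lambda x: theta0 + theta1 * x   # the chosen final hypothesis
print(theta0, theta1)  # recovers 1.0 and 2.0
```

Here the hypothesis space is all (θ0, θ1) pairs, and learning is the search for the member that best approximates the unknown f.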
• The process of machine learning involves not only formulating hypotheses
but also evaluating their performance.
• Once a hypothesis is formulated and evaluated, the next step is to test its
generalization capabilities.
• Generalization is a term that usually refers to a machine learning model’s ability to perform well on new, unseen data.
• After being trained on a training set, a model can digest new data and is able to make accurate predictions.
• In the context of machine learning, the theory of generalization refers to
the ability of a machine learning model to perform well on new, unseen
data after being trained on a limited dataset.
• Generalization is a key goal in machine learning because the ultimate objective is to build models that can make accurate predictions or classifications on data that they have not seen before.
Generalization of Hypothesis
• Generalization is crucial because the ultimate goal of machine learning is
to make accurate predictions or decisions based on new inputs.
• This theory is closely related to the concepts of overfitting and
underfitting.
• Overfitting occurs when a model learns the training data too well,
including its noise and outliers, resulting in poor performance on new
data.
• Underfitting, on the other hand, happens when a model is too simple to
capture the underlying patterns in the training data, also leading to poor
performance on new data.
• The balance between these two extremes is where good generalization
lies.
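The overfitting half of that trade-off can be sketched numerically; the memorizing model (1-nearest-neighbor) and the toy noisy data are illustrative assumptions:

```python
# A model that memorizes the training data gets zero training error but does
# worse on new data than a simple mean model, because it replays the noise.
train = [(1, 2.1), (2, 1.9), (3, 2.2), (4, 1.8)]   # true signal y = 2 + noise
test = [(1.5, 2.0), (3.5, 2.0)]                    # new, unseen data

def memorize(x):   # overfits: returns the nearest training label, noise and all
    return min(train, key=lambda p: abs(p[0] - x))[1]

mean_y = sum(y for _, y in train) / len(train)     # the underlying trend, 2.0

def sq_err(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

print(sq_err(memorize, train))                     # 0.0: perfect on training data
print(sq_err(memorize, test) > sq_err(lambda x: mean_y, test))  # True
```

The memorizer "learns the training data too well, including its noise," so the simpler model generalizes better — exactly the imbalance described above.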
• The theory of generalization is concerned with understanding how
different factors,
• such as the complexity of the model,
• the size of the training dataset,
• the noise in the data, and
• the optimization algorithm used during training,
• affect the model's ability to generalize to new data.
Shattering

• "Dichotomy" is something split into two parts; usually two parts that
appear to contradict one another.
• In machine learning, it refers to the division or separation of data points into two distinct classes or categories based on certain criteria or features.
• Dichotomies are commonly used in supervised learning, where the goal is
to classify data points into one of two classes or categories.
• A collection of points in a space can be labeled as positive or negative.
• If a classifier can accurately divide the points into positive and negative
groups regardless of how we choose to label them, then this set of points
is said to be shattered.
• Shattering is the ability of a model to classify a set of points perfectly.
• More generally, the model can create a function that can divide the points
into two distinct classes without overlapping.
• A set S of examples is shattered by a set of functions H if, for every partition of the examples in S into positive and negative examples, there is a function in H that gives exactly these labels to the examples.
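That definition can be checked mechanically for a simple hypothesis class; the choice of 1-D threshold functions h_t(x) = (x > t) is an illustrative assumption:

```python
# Check shattering: can threshold functions realize EVERY labeling of a set
# of 1-D points? Enumerate the labelings each threshold produces and compare
# against all 2^n possible labelings.
from itertools import product

def threshold_labels(points, t):
    return tuple(x > t for x in points)

def shattered_by_thresholds(points):
    # Representative thresholds: below, between, and above the sorted points.
    cuts = sorted(points)
    thresholds = ([cuts[0] - 1]
                  + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
                  + [cuts[-1] + 1])
    achievable = {threshold_labels(points, t) for t in thresholds}
    return all(lab in achievable
               for lab in product([False, True], repeat=len(points)))

print(shattered_by_thresholds([1.0]))        # True: one point is shattered
print(shattered_by_thresholds([1.0, 2.0]))   # False: (True, False) is impossible
```

A single point is shattered, but no two points are: no threshold can label the smaller point positive and the larger one negative, so the dichotomy (True, False) is never achieved.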
Thank You
