
UNIT 4 COMPUTATIONAL LEARNING THEORY

Computational learning theory (CoLT) is a branch of AI concerned with applying mathematical methods to the design and analysis of computer learning programs. It involves using mathematical frameworks for the purpose of quantifying learning tasks and algorithms.

It seeks to use the tools of theoretical computer science to quantify learning problems. This includes
characterizing the difficulty of learning specific tasks.

Computational learning theory can be considered an extension of statistical learning theory, or SLT for short, that makes use of formal methods for the purpose of quantifying learning algorithms.

 Computational Learning Theory (CoLT): Formal study of learning tasks.


 Statistical Learning Theory (SLT): Formal study of learning algorithms.

This division of learning tasks vs. learning algorithms is arbitrary, and in practice, there is quite a large
degree of overlap between these two fields.

Computational learning theory is essentially a sub-field of artificial intelligence (AI) that focuses on
studying the design and analysis of machine learning algorithms.
Sample Complexity

The sample complexity of a machine learning algorithm represents the number of training samples that it needs in order to successfully learn a target function.

More precisely, the sample complexity is the number of training samples that we need to supply to the algorithm so that the function returned by the algorithm is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1.
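As a rough formalization (a sketch in standard PAC-learning notation, not stated in these notes), for a learning algorithm A, hypothesis class H, and a fixed input-output distribution D this reads:

\[
m_{H}(\varepsilon, \delta; D) \;=\; \min\Big\{\, m \;:\; \Pr_{S \sim D^{m}}\Big[\, \operatorname{err}_{D}\big(A(S)\big) \;\le\; \min_{h \in H} \operatorname{err}_{D}(h) + \varepsilon \,\Big] \;\ge\; 1 - \delta \,\Big\}
\]

where ε is the allowed excess error and 1 − δ is the required confidence. The two variants below differ in whether D is held fixed or the worst case over all D is taken.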

There are two variants of sample complexity:

 The weak variant fixes a particular input-output distribution;


 The strong variant takes the worst-case sample complexity over all input-output
distributions.
The no-free-lunch theorem, discussed below, proves that, in general, the strong sample complexity is infinite, i.e. there is no algorithm that can learn the globally optimal target function using a finite number of training samples.

However, if we are only interested in a particular class of target functions (e.g., only linear functions), then the sample complexity is finite, and it depends linearly on the VC dimension of the class of target functions.
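As a hedged illustration of this linear dependence (a standard PAC-learning bound, not stated explicitly in these notes), for a hypothesis class of VC dimension d in the realizable setting, on the order of

\[
m \;=\; O\!\left( \frac{d \log(1/\varepsilon) + \log(1/\delta)}{\varepsilon} \right)
\]

training samples suffice to reach error at most ε with probability at least 1 − δ, and a matching Ω((d + log(1/δ))/ε) lower bound shows that the linear dependence on d cannot be removed.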

VC Dimension
The Vapnik-Chervonenkis (VC) dimension is a measure of the capacity of a hypothesis set to fit
different data sets. It was introduced by Vladimir Vapnik and Alexey Chervonenkis in the 1970s
and has become a fundamental concept in statistical learning theory. The VC dimension is a
measure of the complexity of a model, which can help us understand how well it can fit different
data sets.
The VC dimension of a hypothesis set H is the largest number of points that can be shattered by
H. A hypothesis set H shatters a set of points S if, for every possible labeling of the points in S,
there exists a hypothesis in H that correctly classifies the points. In other words, a hypothesis set
shatters a set of points if it can fit any possible labeling of those points.
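To make shattering concrete, here is a small brute-force check in Python (an illustrative construction, not part of the original notes). It uses 1-D threshold classifiers as the hypothesis set and shows that they can shatter any single point but not two points, so their VC dimension is 1:

from itertools import product

def shatters(points, hypotheses):
    """Return True if some hypothesis realizes every possible labeling of the points."""
    for labeling in product([0, 1], repeat=len(points)):
        realized = any(all(h(x) == y for x, y in zip(points, labeling))
                       for h in hypotheses)
        if not realized:
            return False
    return True

# Hypothesis set: 1-D threshold classifiers h_t(x) = 1 if x >= t else 0,
# over a small illustrative grid of thresholds.
H = [lambda x, t=t: int(x >= t) for t in (-2, -1, 0, 1, 2)]

print(shatters([0.5], H))       # True: any single point can be shattered
print(shatters([0.5, 1.5], H))  # False: the labeling (1, 0) cannot be realized
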
Bounds of the VC Dimension
The VC dimension provides both upper and lower bounds on the number of training examples required to achieve a given level of accuracy. Both bounds grow roughly linearly with the VC dimension: the upper bound on the required number of examples is linear in the VC dimension up to logarithmic factors in the accuracy and confidence parameters, and the lower bound is linear in the VC dimension as well.
Applications of the VC Dimension
The VC dimension has a wide range of applications in machine learning and statistics. For
example, it is used to analyze the complexity of neural networks, support vector machines, and
decision trees. The VC dimension can also be used to design new learning algorithms that are
robust to noise and can generalize well to unseen data.
Ensemble Learning

 Ensemble learning is a machine learning technique that combines the predictions from
multiple individual models to obtain a better predictive performance than any single
model. The basic idea behind ensemble learning is to leverage the wisdom of the crowd
by aggregating the predictions of multiple models, each of which may have its own
strengths and weaknesses. This can lead to improved performance and generalization.

 Ensemble methods can be thought of as compensating for weak individual learning algorithms: an ensemble is computationally more expensive than a single model, but it is often more effective than a single non-ensemble model that has gone through a great deal of training. This unit gives an overview of why ensemble learning matters and how it works, the different types of ensemble classifiers, advanced ensemble learning techniques, and some algorithms (such as Random Forest and XGBoost) that clarify the common ensemble classifiers and their uses in practice.

 Several individual base models (experts) are fitted to learn from the same data and
produce an aggregation of output based on which a final decision is taken. These base
models can be machine learning algorithms such as decision trees (mostly used), linear
models, support vector machines (SVM), neural networks, or any other model that is
capable of making predictions.

 The most commonly used ensemble techniques include Bagging, which is used to build Random Forest algorithms, and Boosting, which is used to build algorithms such as AdaBoost and XGBoost.

There are two techniques, given below, that are used to build ensembles of decision trees.
Bagging
Bagging is used when our objective is to reduce the variance of a decision tree. Here the concept is to create several subsets of data from the training sample, chosen randomly with replacement. Each subset of data is then used to train its own decision tree, so we end up with an ensemble of different models. The average of the predictions from the numerous trees is used, which is more robust than a single decision tree.
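As a minimal sketch of bagging, assuming scikit-learn decision trees and NumPy as the tooling (the notes do not name a library), each bootstrap sample trains its own tree and the predictions are combined by majority vote:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

rng = np.random.default_rng(0)
trees = []
for _ in range(25):                               # number of bootstrap rounds
    idx = rng.integers(0, len(X), size=len(X))    # sample rows with replacement
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Aggregate: majority vote over the ensemble (mean of 0/1 votes >= 0.5).
votes = np.stack([t.predict(X) for t in trees])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy:", (ensemble_pred == y).mean())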

Random Forest is an extension of bagging. It takes one additional step: in addition to drawing a random subset of the data for each tree, it also makes a random selection of features rather than using all features to grow the trees. When we have numerous such random trees, the result is called a Random Forest.

These are the steps taken to implement a Random Forest:

 Consider a training data set with X observations and Y features. First, a sample is drawn randomly from the training data set with replacement.
 Each tree is grown to its largest possible extent.
 These steps are repeated for n trees, and the final prediction is based on the aggregation of the predictions from the n trees (a minimal sketch follows this list).

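A minimal Random Forest sketch, again assuming scikit-learn (an assumption; the notes do not name a library); here max_features controls the random subset of features tried at each split, which is the extra randomization Random Forest adds on top of bagging:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# n_estimators = number of trees grown on bootstrap samples;
# max_features="sqrt" = random subset of features tried at each split.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                random_state=0).fit(X, y)
print("training accuracy:", forest.score(X, y))
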
Advantages of using the Random Forest technique:

 It handles higher-dimensional data sets very well.

 It handles missing values and maintains accuracy for missing data.

Disadvantages of using the Random Forest technique:

Since the final prediction is the average of the predictions from the subset trees, it will not give precise values for regression problems.

Boosting
Boosting is another ensemble procedure used to build a collection of predictors. In other words, we fit trees sequentially (usually on re-weighted or random samples), and at each step the objective is to reduce the net error left by the prior trees.

If a given input is misclassified by a hypothesis, its weight is increased so that the next hypothesis is more likely to classify it correctly. Combining the entire set of hypotheses at the end converts weak learners into better-performing models.
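A minimal boosting sketch, assuming scikit-learn's AdaBoost as an illustrative choice (the notes mention AdaBoost but give no code); each round re-weights the misclassified samples so that the next weak learner focuses on them:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# The default base learner is a depth-1 decision tree (a "stump"); boosting
# combines 50 such weak learners into a performance-weighted vote.
model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print("training accuracy:", model.score(X, y))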

Gradient Boosting is an expansion of the boosting procedure.

1. Gradient Boosting = Gradient Descent + Boosting


It utilizes a gradient descent algorithm that can optimize any differentiable loss function. The ensemble of trees is built one tree at a time, and the individual trees are summed successively: each new tree tries to recover the remaining loss (the difference between the actual and predicted values) left by the trees before it.
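A minimal from-scratch sketch of this idea for the squared loss, assuming scikit-learn regression trees as base learners (an illustrative assumption): each new tree is fitted to the current residuals, i.e. the negative gradient of the loss, and its prediction is added with a small learning rate:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)

pred = np.zeros_like(y)              # start from a constant zero prediction
learning_rate = 0.1
trees = []
for _ in range(100):
    residual = y - pred              # negative gradient of the squared loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred += learning_rate * tree.predict(X)   # add the new tree's contribution
    trees.append(tree)

print("training MSE:", np.mean((y - pred) ** 2))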

Advantages of using Gradient Boosting methods:

 It supports different loss functions.


 It works well with interactions.

Disadvantages of using Gradient Boosting methods:

 It requires cautious tuning of different hyper-parameters.

Bagging vs. Boosting

Bagging: Various training data subsets are randomly drawn with replacement from the whole training dataset.
Boosting: Each new subset contains the components that were misclassified by previous models.

Bagging: Attempts to tackle the over-fitting issue.
Boosting: Tries to reduce bias.

Bagging: If the classifier is unstable (high variance), we apply bagging.
Boosting: If the classifier is steady and straightforward (high bias), we apply boosting.

Bagging: Every model receives an equal weight.
Boosting: Models are weighted by their performance.

Bagging: The objective is to decrease variance, not bias.
Boosting: The objective is to decrease bias, not variance.

Bagging: It is the easiest way of combining predictions that belong to the same type.
Boosting: It is a way of combining predictions that belong to different types.

Bagging: Every model is constructed independently.
Boosting: New models are affected by the performance of the previously developed model.
