
Module 10- Part II

Boosting Models
AdaBoost, GBM, XGBoost

Prof. Pedram Jahangiry

Class Modules
• Module 1- Introduction to Machine Learning
• Module 2- Setting up Machine Learning Environment
• Module 3- Linear Regression (Econometrics approach)
• Module 4- Machine Learning Fundamentals
• Module 5- Linear Regression (Machine Learning approach)
• Module 6- Penalized Regression (Ridge, LASSO, Elastic Net)
• Module 7- Logistic Regression
• Module 8- K-Nearest Neighbors (KNN)
• Module 9- Classification and Regression Trees (CART)
• Module 10- Bagging and Boosting
• Module 11- Dimensionality Reduction (PCA)
• Module 12- Clustering (KMeans – Hierarchical)

Road map: ML Algorithms

Supervised
• Regression: Linear / Polynomial regression, Penalized regression, KNN, SVR, Tree-based regression models
• Classification: Logistic regression, KNN, SVM (SVC), Tree-based classification models

Unsupervised
• Dimensionality Reduction: Principal Component Analysis (PCA)
• Clustering: K-Means, Hierarchical

Tree-based models:
1. Decision Trees (DTs)
2. Bagging, Random Forests
3. Boosting

Topics
Part I
1. Bagging vs Boosting
2. AdaBoost
3. Gradient Boosting Machine (GBM)
4. XGBoost

Part II
Pros and Cons

Part I
1. Bagging vs Boosting
2. AdaBoost
3. Gradient Boosting Machine (GBM)
4. XGBoost

Bagging vs Boosting

• Bagging consists of creating many “copies” of the training data (each copy slightly different from the others), applying the weak learner to each copy to obtain multiple weak models, and then combining them.
• In bagging, the bootstrapped trees are independent of each other.

• Boosting consists of using the “original” training data and iteratively creating multiple models with a weak learner. Each new model differs from the previous ones in that the weak learner, in building each new model, tries to “fix” the errors the previous models made.
• In boosting, each tree is grown using information from the previous trees (see the sketch below).
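
To make the contrast concrete, here is a minimal sketch using scikit-learn (assuming version 1.2 or later, where the base learner is passed as `estimator`); the synthetic dataset and hyperparameter values are illustrative choices, not taken from the slides.

```python
# Bagging vs boosting with the same weak learner (a decision stump).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: each stump is fit on an independent bootstrap "copy" of the data.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100,
    random_state=0,
).fit(X_train, y_train)

# Boosting: stumps are fit sequentially, each one focusing on the
# observations the previous ones got wrong.
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100,
    random_state=0,
).fit(X_train, y_train)

print("bagging :", bagging.score(X_test, y_test))
print("boosting:", boosting.score(X_test, y_test))
```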

AdaBoost (Adaptive Boosting)
• A forest of weak learners: trees with only a single split on one feature (stumps).
• Each tree (stump) depends on the previous tree’s errors rather than being independent.

1) Start with the usual splitting criteria.
2) Each tree (stump) gets a different weight based on its prediction accuracy.
3) Each observation gets a weight inversely related to how well it was predicted (e.g., misclassified observations get more weight).
4) Aggregation is done based on each weak learner’s weight, as in the sketch below.

Source: Towards Data Science
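
A from-scratch sketch of these four steps for binary labels in {-1, +1}, assuming scikit-learn stumps as the weak learner; it is written for illustration, not as a reference implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=50):
    """Discrete AdaBoost for labels y in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                     # step 1: equal observation weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)   # one-split weak learner
        stump.fit(X, y, sample_weight=w)
        miss = stump.predict(X) != y
        err = np.clip(w[miss].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)   # step 2: stump weight from its accuracy
        w *= np.exp(np.where(miss, alpha, -alpha))    # step 3: upweight misclassified obs
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    # step 4: aggregate as a weighted vote of the weak learners
    votes = sum(a * s.predict(X) for s, a in zip(stumps, alphas))
    return np.sign(votes)
```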

AdaBoost
Key features:
• Adaptive: updates the weights of misclassified instances at each step.
• Tends to be sensitive to noise and outliers.
• Can be used with various base classifiers, but is most commonly used with decision stumps.

• AdaBoost has a long history: it is a popular boosting technique introduced by Yoav Freund and Robert Schapire in 1996.

Gradient Boosting Machine (GBM)

• In gradient boosting, each weak learner corrects its predecessor’s errors.
• Unlike AdaBoost, the weights of the training instances are not tweaked; instead, each predictor is trained using the residual errors of its predecessor as labels.
• Unlike AdaBoost, each tree can be larger than a stump, but the trees are still small. By fitting a small tree to the residuals, the GBM slowly improves the estimate f̂ in areas where it does not perform well.

• The learning rate shrinks the contribution of each tree, so there is a trade-off between the learning rate and the number of trees. A smaller learning rate slows the process down further, allowing more (and differently shaped) trees to attack the residuals.
• Aggregation is done by adding the first tree’s predictions and a scaled (shrunk) version of the following trees’ predictions, as in the sketch below.

Source: GeeksforGeeks
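
A minimal regression sketch of this loop under squared-error loss, assuming scikit-learn trees; the tree size, learning rate, and number of trees are illustrative values.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbm_fit(X, y, n_trees=100, learning_rate=0.1, max_depth=2):
    f0 = y.mean()                    # initial prediction: the sample mean
    f = np.full(len(y), f0)
    trees = []
    for _ in range(n_trees):
        residuals = y - f            # errors of the current ensemble
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)       # small tree fit to the residuals
        f += learning_rate * tree.predict(X)   # shrunken update to f-hat
        trees.append(tree)
    return f0, trees

def gbm_predict(X, f0, trees, learning_rate=0.1):
    # first prediction plus the scaled (shrunk) contributions of the trees
    return f0 + learning_rate * sum(t.predict(X) for t in trees)
```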

Extreme Gradient Boosting (XGBoost)
• XGBoost is a refined and customized version of a gradient boosting decision tree system, created with performance and speed in mind.
• “Extreme” refers to the fact that the algorithms and methods have been customized to push the limit of what is possible for gradient boosting algorithms.
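
A minimal usage sketch, assuming the xgboost Python package is installed (`pip install xgboost`) and a synthetic dataset; the parameter values are illustrative, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    n_estimators=200,    # number of boosting rounds
    max_depth=3,         # depth of each tree
    learning_rate=0.1,   # shrinkage applied to each tree's contribution
    reg_lambda=1.0,      # L2 regularization on leaf weights
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```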

Put it all together!
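
As a hedged way to put the three models side by side, the sketch below cross-validates AdaBoost, GBM, and XGBoost on the same synthetic dataset; the default settings and resulting scores are illustrative, not a benchmark.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, random_state=0)
models = {
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
    "XGBoost": XGBClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:8s} mean CV accuracy: {scores.mean():.3f}")
```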

Part II
Pros and Cons

XGBoost’s Pros and Cons

Pros:
• Designed with performance and speed in mind (as noted above).

Cons:
• XGBoost is more difficult to understand, visualize, and tune than AdaBoost and random forests: there is a multitude of hyperparameters that can be tuned to increase performance (see the tuning sketch below).
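
To illustrate the tuning burden, here is a small grid-search sketch over a few of XGBoost's hyperparameters, assuming scikit-learn's GridSearchCV; the grid is a tiny illustrative subset of what can be tuned.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, random_state=0)
grid = GridSearchCV(
    XGBClassifier(n_estimators=100),
    param_grid={
        "max_depth": [2, 3, 4],
        "learning_rate": [0.05, 0.1, 0.3],
        "subsample": [0.8, 1.0],
    },
    cv=3,
)
grid.fit(X, y)
print("best params:", grid.best_params_)
```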

Class Modules
✓ Module 1- Introduction to Machine Learning
✓ Module 2- Setting up Machine Learning Environment
✓ Module 3- Linear Regression (Econometrics approach)
✓ Module 4- Machine Learning Fundamentals
✓ Module 5- Linear Regression (Machine Learning approach)
✓ Module 6- Penalized Regression (Ridge, LASSO, Elastic Net)
✓ Module 7- Logistic Regression
✓ Module 8- K-Nearest Neighbors (KNN)
✓ Module 9- Classification and Regression Trees (CART)
✓ Module 10- Bagging and Boosting
• Module 11- Dimensionality Reduction (PCA)
• Module 12- Clustering (KMeans – Hierarchical)
