XGBoost: Unleashing the Power of Gradient Boosting
● What is XGBoost?
○ An open-source library for gradient boosting
○ Combines decision trees with regularization to create highly accurate models
○ Widely used in machine learning competitions and real-world applications
● XGBoost stands out as an advanced boosting algorithm known for its efficiency and
performance.
How Boosting Works
In machine learning, weak learners are simple models, like decision trees. On their own, they might
not be incredibly accurate. But boosting comes in and harnesses their collective power. Here's how:
Stage 1: The First Foothold: Start with a single weak learner (decision tree) trained on your data. It
makes predictions, but inevitably gets some wrong.
Stage 2: Boosting the Signal: Focus on the errors! Give greater weight to the data points the first tree
got wrong and train a new weak learner to specifically address them; in gradient boosting, this is done
by fitting the new tree to the residual errors of the current ensemble.
Stage 3: Climbing Together: Combine the predictions of both trees. The first tree provides a rough
direction, and the second fine-tunes it based on the mistakes. Repeat!
Stage N: Reaching the Peak: With each iteration, build a new weak learner focused on the
remaining errors, weighted more for those the previous trees missed. Combine all predictions into a
final, powerful ensemble model.
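To make these stages concrete, here is a minimal sketch of the residual-fitting loop that gradient boosting performs, using shallow scikit-learn trees as the weak learners; the dataset, number of rounds, and learning rate are illustrative assumptions rather than values from this material.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data: y = x^2 plus noise (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = X[:, 0] ** 2 + rng.normal(0, 0.3, size=200)

n_rounds = 50        # number of boosting stages
learning_rate = 0.1  # shrinks each tree's contribution (the eta parameter)

# Stage 1: start from a simple baseline prediction (the mean).
prediction = np.full_like(y, y.mean())
trees = []

for _ in range(n_rounds):
    # Stage 2..N: fit a shallow tree to the current residuals (errors).
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    trees.append(tree)

    # Combine: add the new tree's shrunken correction to the ensemble.
    prediction += learning_rate * tree.predict(X)

print("training MSE:", np.mean((y - prediction) ** 2))

Each round adds a small, shrunken correction aimed at whatever error remains, which is the climb described above.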
Key Observations
● The weak learners are typically simple models like decision trees, but can be any
type of learner.
● Each new learner focuses on the residuals (errors) left by the previous ones.
● Boosting primarily reduces the bias (systematic error) of the model, and with shallow trees and
shrinkage it can also help control variance (random error).
● This iterative process results in an ensemble model that is often much more accurate
than any individual weak learner.
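To illustrate the last observation, the sketch below compares a single shallow tree with a boosted XGBoost ensemble on a synthetic dataset; the data and parameter values are illustrative assumptions.

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from xgboost import XGBRegressor

# Synthetic regression data (illustrative only).
X, y = make_regression(n_samples=2000, n_features=20, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A single weak learner: one shallow tree.
single_tree = DecisionTreeRegressor(max_depth=2).fit(X_train, y_train)

# An ensemble of many such trees, built iteratively by boosting.
ensemble = XGBRegressor(n_estimators=300, max_depth=2, learning_rate=0.1)
ensemble.fit(X_train, y_train)

print("single tree R^2:     ", single_tree.score(X_test, y_test))
print("boosted ensemble R^2:", ensemble.score(X_test, y_test))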
Objective function
XGBoost's objective function is typically expressed as:

Objective = Σ Loss(y, y_pred) + α * Σ |w| + β * Σ w²

- Loss represents the training loss function. Typical loss functions are mean squared error (for
regression), logistic loss (for binary classification), and multiclass log loss (for multiclass
classification).
- y denotes the true labels and y_pred signifies the model's predicted values.
- w denotes the learned leaf weights of the trees.
- alpha and beta control the strength of the L1 and L2 regularization terms, respectively.
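In the xgboost Python package, these pieces map onto the objective, reg_alpha (L1), and reg_lambda (L2) parameters; the sketch below uses illustrative values, and matching alpha/beta above to those two parameters is an assumption.

import numpy as np
from xgboost import XGBRegressor

# Tiny illustrative dataset.
X = np.random.rand(100, 5)
y = X.sum(axis=1)

model = XGBRegressor(
    objective="reg:squarederror",  # mean squared error training loss
    reg_alpha=0.1,                 # L1 penalty on leaf weights (alpha above)
    reg_lambda=1.0,                # L2 penalty on leaf weights (beta above)
)
model.fit(X, y)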
Key Features
● Speed and Efficiency: Optimized algorithms for parallel processing
● Regularization: Built-in mechanisms to prevent overfitting. Controls
model complexity through penalties on tree structure and leaf values
● Flexibility: Supports various objective functions for regression and
classification tasks and can handle missing values and categorical features
● Sparsity Awareness: Efficiently handles sparse data with many zero
features
● Distributed Learning: Scales to large datasets using distributed
computing frameworks like Spark, enabling training across multiple
machines
● Interpretability: Feature importance scores help identify key drivers of
the model
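A brief sketch of two of these features in the Python API, missing-value handling and feature importance scores; the toy data and parameters are illustrative assumptions.

import numpy as np
from xgboost import XGBClassifier

# Toy data with missing entries; XGBoost learns a default direction for
# NaNs at each split, so no imputation is needed (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[rng.random(X.shape) < 0.1] = np.nan
y = (np.nan_to_num(X[:, 0]) + np.nan_to_num(X[:, 1]) > 0).astype(int)

model = XGBClassifier(n_estimators=100, max_depth=3)
model.fit(X, y)

# Importance scores highlight which features drive the model's splits.
print(model.feature_importances_)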
Hyperparameters - Learning Task Parameters
● objective: Specifies the learning task and the corresponding objective function (e.g.,
'reg:squarederror' for regression, 'binary:logistic' for binary classification,
'multi:softmax' for multiclass classification).
● eval_metric: Evaluation metric used for the validation data (e.g., 'rmse' for
regression, 'logloss' for binary classification).
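For example, recent versions of the xgboost scikit-learn API accept these parameters directly in the model constructor; the data and train/validation split below are illustrative assumptions.

import numpy as np
from xgboost import XGBClassifier

# Illustrative data and split.
X = np.random.rand(200, 5)
y = (X[:, 0] > 0.5).astype(int)

model = XGBClassifier(
    objective="binary:logistic",  # learning task and objective function
    eval_metric="logloss",        # metric reported on the validation data
)
model.fit(X[:150], y[:150], eval_set=[(X[150:], y[150:])], verbose=False)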
Hyperparameters - Tree Booster Parameters
● eta (learning rate): Shrinks the contribution of each tree, preventing overfitting.
Smaller values lead to slower learning but more robust models. Typical values range
from 0.01 to 0.3.
● gamma: Minimum loss reduction required to make a further partition on a leaf node.