
Boosting in Machine Learning | Boosting and AdaBoost

Last Updated : 14 May, 2025

Boosting is an ensemble learning technique that sequentially combines multiple weak classifiers to create a single strong classifier. A first model is trained on the training data and evaluated; the next model is then built to correct the errors made by the first. This procedure continues, with new models added, until either the entire training data set is predicted correctly or a predefined number of iterations is reached.

Think of it like a classroom where the teacher gives extra attention to the weaker students to improve their performance; boosting works the same way, concentrating each new model on the examples the previous models got wrong, as the short sketch below illustrates.
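To make the sequential idea concrete, here is a minimal sketch of a boosting-style loop in which each new weak model is fitted to the errors left by the models trained so far (the residual-fitting flavour used by gradient boosting). The synthetic data, model choice and settings are illustrative assumptions, not part of the article's example.

```python
# Minimal sketch: each new weak model is trained on the residual errors
# of the ensemble built so far. Illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

n_rounds, learning_rate = 20, 0.3
prediction = np.zeros_like(y)        # the ensemble starts with no knowledge
weak_models = []

for _ in range(n_rounds):
    residual = y - prediction                      # errors of the current ensemble
    weak = DecisionTreeRegressor(max_depth=2)      # a deliberately weak learner
    weak.fit(X, residual)                          # the next model targets those errors
    prediction += learning_rate * weak.predict(X)  # add its (scaled) contribution
    weak_models.append(weak)

print("Mean squared error after boosting:", np.mean((y - prediction) ** 2))
```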

AdaBoost and How It Works

AdaBoost (Adaptive Boosting) is a boosting technique that initially assigns equal weights to all training samples and then iteratively increases the weights of misclassified data points so that the next model focuses on them. It effectively reduces bias and variance, making it useful for classification tasks, but it can be sensitive to noisy data and outliers.
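Before walking through the diagram step by step, here is a minimal sketch of how AdaBoost is typically used in practice via scikit-learn's AdaBoostClassifier. The synthetic dataset and hyperparameters are illustrative assumptions rather than part of the article's example.

```python
# Minimal AdaBoost usage sketch with scikit-learn; data and settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# 50 weak learners (decision stumps by default), each trained on re-weighted data
model = AdaBoostClassifier(n_estimators=50, learning_rate=1.0, random_state=42)
model.fit(X_train, y_train)

print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```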

[Diagram: Training a boosting model]

The diagram above explains the AdaBoost algorithm in a simple way. Let's walk through it step by step (a small code sketch after Step 4 mirrors these weight updates):

Step 1: Initial Model (B1)

  • The dataset consists of multiple data points (red, blue and green circles).
  • Equal weight is assigned to each data point.
  • The first weak classifier attempts to create a decision boundary.
  • 8 data points are wrongly classified.

Step 2: Adjusting Weights (B2)

  • The misclassified points from B1 are assigned higher weights (shown as darker points in the next step).
  • A new classifier is trained with a refined decision boundary focusing more on the previously misclassified points.
  • Some previously misclassified points are now correctly classified.
  • 6 data points are wrongly classified.

Step 3: Further Adjustment (B3)

  • The newly misclassified points from B2 receive higher weights to ensure better classification.
  • The classifier adjusts again using an improved decision boundary and 4 data points remain misclassified.

Step 4: Final Strong Model (B4 - Ensemble Model)

  • The final ensemble classifier combines B1, B2 and B3, drawing on the strengths of all the weak classifiers.
  • By aggregating multiple models, the ensemble achieves higher accuracy than any individual weak model.
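The sketch below mirrors steps B1 to B4 from scratch: it starts with equal sample weights, re-weights the misclassified points after every round and combines the stumps with a weighted vote. It is a simplified illustration on made-up data and ignores edge cases (such as a round with zero error); a library implementation like sklearn.ensemble.AdaBoostClassifier should be preferred in practice.

```python
# Simplified from-scratch AdaBoost loop illustrating the weight updates (B1-B4).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y == 0, -1, 1)               # AdaBoost's math uses labels in {-1, +1}

n_samples, n_rounds = len(y), 10
w = np.full(n_samples, 1.0 / n_samples)   # B1: equal weight for every data point
stumps, alphas = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1)        # weak learner (decision stump)
    stump.fit(X, y, sample_weight=w)
    pred = stump.predict(X)

    err = np.sum(w[pred != y])                         # weighted error rate
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))    # this stump's vote weight

    w *= np.exp(-alpha * y * pred)                     # B2/B3: boost misclassified points
    w /= w.sum()                                       # renormalise the weights

    stumps.append(stump)
    alphas.append(alpha)

# B4: the final strong classifier is a weighted vote over all weak learners
ensemble_pred = np.sign(sum(a * s.predict(X) for a, s in zip(alphas, stumps)))
print("Training accuracy:", np.mean(ensemble_pred == y))
```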

Now that we have seen how boosting works using AdaBoost, let's look at some other widely used boosting algorithms.

Types Of Boosting Algorithms

There are several types of boosting algorithms; some of the most popular and useful ones are listed below, with a short usage sketch after the list:

  1. Gradient Boosting: Gradient Boosting builds models sequentially, where each weak learner is fitted to the residual errors of the previous ones by following the gradient of a loss function. Instead of re-weighting samples like AdaBoost, it reduces the error directly by optimizing that loss.
  2. XGBoost: XGBoost is an optimized implementation of Gradient Boosting that adds regularization to prevent overfitting. It is fast and efficient and can handle both numerical and categorical variables.
  3. CatBoost: CatBoost is particularly effective for datasets with categorical features. It uses symmetric decision trees and a target-based encoding scheme, which lets it handle categorical data well without extensive preprocessing.
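The sketch below shows how these three boosters are typically called. It assumes the third-party packages xgboost and catboost are installed alongside scikit-learn (e.g. pip install xgboost catboost); the dataset and hyperparameters are illustrative only.

```python
# Illustrative usage of Gradient Boosting, XGBoost and CatBoost on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier          # third-party package: xgboost
from catboost import CatBoostClassifier    # third-party package: catboost

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=100, random_state=42),
    "XGBoost": XGBClassifier(n_estimators=100, eval_metric="logloss", random_state=42),
    "CatBoost": CatBoostClassifier(iterations=100, verbose=0, random_state=42),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "accuracy:", accuracy_score(y_test, model.predict(X_test)))
```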

Boosting vs Bagging 

Bagging is another ensemble learning technique used to improve accuracy. Unlike bagging, where models are trained independently on different random subsets of the data, boosting trains models sequentially, with each model correcting the errors of its predecessor. Here is an in-depth comparison between them, followed by a short code sketch contrasting the two approaches:

| Feature | Boosting | Bagging |
|---|---|---|
| Combination Type | Combines predictions of different weak models | Combines predictions of the same type of model |
| Goal | Reduces bias | Reduces variance |
| Model Dependency | New models depend on previous models' errors | Models are trained independently of each other |
| Weighting | Models are weighted based on performance | All models have equal weight |
| Training Data Sampling | Each new model focuses more on the misclassified examples | Each model is trained on a random subset of the data |
| Error Handling | Focuses on correcting errors made by previous models | Averages out errors from multiple models |
| Parallelism | Models are built sequentially, so training is less parallelizable | Models can be built in parallel |
| Overfitting | Less prone to overfitting with proper regularization | Can be prone to overfitting with complex base models |
| Model Complexity | Typically uses simpler models (like decision stumps) | Can use complex models (like full decision trees) |
| Example | AdaBoost, Gradient Boosting, XGBoost, LightGBM | Random Forest, Bagged Decision Trees |
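As a quick illustration of the comparison above, the sketch below trains a boosting ensemble (AdaBoost over decision stumps) and a bagging ensemble (BaggingClassifier, whose default base learner is a full decision tree) on the same synthetic data. The numbers it prints are illustrative and will vary with the data and settings; it is not a benchmark.

```python
# Boosting vs bagging on the same synthetic dataset; results are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=7)

# Boosting: shallow stumps trained sequentially, each focusing on earlier mistakes
boosting = AdaBoostClassifier(n_estimators=100, random_state=7)

# Bagging: full decision trees (the default base learner) trained independently
# on bootstrap samples of the data
bagging = BaggingClassifier(n_estimators=100, random_state=7)

print("Boosting 5-fold CV accuracy:", cross_val_score(boosting, X, y, cv=5).mean())
print("Bagging  5-fold CV accuracy:", cross_val_score(bagging, X, y, cv=5).mean())
```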

Advantages of Boosting 

  • Improved Accuracy: By combining multiple weak learners, boosting enhances predictive accuracy for both classification and regression tasks.
  • Robustness to Overfitting: Unlike many traditional models, it dynamically adjusts sample weights, which helps limit overfitting when properly regularized.
  • Handles Imbalanced Data Well: It prioritizes misclassified points, making it effective for imbalanced datasets.
  • Better Interpretability: The sequential nature of boosting breaks the decision-making process into steps, making the model easier to interpret.

By understanding Boosting and its applications, we can use its capabilities to solve complex real-world problems effectively.

