This document summarizes key hyperparameters for XGBoost models, including parameters that control regularization and complexity (e.g. learning rate, max depth), hyperparameters related to tree construction (e.g. min child weight, subsample), and other parameters like the objective function and number of estimators. It provides definitions and recommended value ranges for hyperparameters like learning rate (0.01-0.3), number of estimators (100-1000), and max depth (3-10). The document also notes that hyperparameters should be tuned based on the specific task and dataset.


 Learning Rate (or eta):

o Definition: Controls the step size shrinkage used in updating the weights of the model
during each boosting iteration.
o Where to use: Central to controlling the step size during boosting and preventing
overfitting.
o When to use: Lower values make the boosting process more conservative and require
more boosting rounds to converge, while higher values may lead to overfitting.
o XGBoost Hyperparameter: learning_rate
o Recommended values: Typically in the range [0.01, 0.3].
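For orientation, a minimal sketch of how the step size is set through the scikit-learn wrapper; the values are arbitrary picks from the range above, not tuned choices, and a smaller learning_rate is typically paired with more boosting rounds:

    from xgboost import XGBClassifier

    # Conservative: small step size, compensated with more boosting rounds.
    conservative = XGBClassifier(learning_rate=0.05, n_estimators=500)
    # Aggressive: larger steps, fewer rounds, higher risk of overfitting.
    aggressive = XGBClassifier(learning_rate=0.3, n_estimators=100)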
 Number of Estimators (n_estimators):
o Definition: Number of boosting rounds or trees to build.
o Where to use: Dictates the number of boosting rounds and the overall complexity of
the model.
o When to use: Higher values can improve model performance, but increasing the
number of estimators also increases computation time.
o XGBoost Hyperparameter: n_estimators
o Recommended values: Depends on the size of the dataset and computational
resources, but typically in the range [100, 1000].
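Rather than fixing the number of rounds up front, a common practice is to set a generous upper limit and let early stopping choose the effective number. A rough sketch with the native training API on synthetic data (the split sizes and round counts are assumptions, not recommendations):

    import xgboost as xgb
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split

    # Synthetic data purely for illustration.
    X, y = make_classification(n_samples=2000, n_features=20, random_state=42)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
    dtrain = xgb.DMatrix(X_tr, label=y_tr)
    dvalid = xgb.DMatrix(X_val, label=y_val)

    params = {"objective": "binary:logistic", "learning_rate": 0.05, "max_depth": 6}
    # Allow up to 1000 rounds; stop once the validation metric has not
    # improved for 50 consecutive rounds.
    booster = xgb.train(params, dtrain, num_boost_round=1000,
                        evals=[(dvalid, "valid")], early_stopping_rounds=50)
    print(booster.best_iteration)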
 Maximum Depth (max_depth):
o Definition: Maximum depth of a tree in the ensemble.
o Where to use: Controls the depth of individual trees and the complexity of the model.
o When to use: Higher values allow for more complex trees, but too high may lead to
overfitting.
o XGBoost Hyperparameter: max_depth
o Recommended values: Typically in the range [3, 10].
 Minimum Child Weight (min_child_weight):
o Definition: Minimum sum of instance weight (hessian) required in a child node. It helps
prevent overfitting by controlling the minimum effective size of child nodes.
o Where to use: Ensures that each leaf node has a minimum number of instances, thus
reducing the complexity of the model.
o When to use: Higher values make the algorithm more conservative and reduce the
risk of overfitting.
o XGBoost Hyperparameter: min_child_weight
o Recommended values: Typically in the range [1, 10].
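Because max_depth and min_child_weight both limit how complex individual trees can become, they are often tuned together. A minimal sketch using scikit-learn's grid search; the grid values are illustrative points from the ranges quoted above:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

    param_grid = {"max_depth": [3, 6, 10], "min_child_weight": [1, 5, 10]}
    search = GridSearchCV(XGBClassifier(n_estimators=200, learning_rate=0.1),
                          param_grid, cv=3, scoring="roc_auc")
    search.fit(X, y)
    print(search.best_params_)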
 Subsample:
o Definition: Fraction of observations to be randomly sampled for each tree. It
introduces randomness and reduces overfitting.
o Where to use: Controls the randomness of the data sampling process for each tree.
o When to use: Lower values make the model more robust to noise but may lead to
underfitting.
o XGBoost Hyperparameter: subsample
o Recommended values: Typically in the range [0.5, 1.0].
 Colsample bytree:
o Definition: Fraction of features to be randomly sampled for each tree. It introduces
randomness and reduces overfitting.
o Where to use: Controls the randomness of feature selection for each tree.
o When to use: Lower values reduce overfitting by introducing more randomness in
feature selection.
o XGBoost Hyperparameter: colsample_bytree
o Recommended values: Typically in the range [0.5, 1.0].
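A short sketch of how the two sampling fractions are set together; 0.8 is just a mid-range illustrative value for both:

    from xgboost import XGBClassifier

    # Each tree is built on a random 80% of the rows and 80% of the columns.
    model = XGBClassifier(subsample=0.8, colsample_bytree=0.8,
                          n_estimators=300, learning_rate=0.1)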
 Gamma:
o Definition: Minimum loss reduction required to make a further partition on a leaf
node. It acts as regularization by controlling the complexity of trees.
o Where to use: Helps prevent overfitting by penalizing overly complex trees.
o When to use: Higher values make the algorithm more conservative.
o XGBoost Hyperparameter: gamma
o Recommended values: Typically in the range [0, 0.2].
 Regularization Parameters (reg_alpha and reg_lambda):
o Definition: L1 and L2 regularization terms applied to the weights. They help prevent
overfitting by penalizing large parameter values.
o Where to use: Controls the amount of regularization applied to the model.
o When to use: Increase values to increase regularization and reduce overfitting.
o XGBoost Hyperparameters: reg_alpha, reg_lambda
o Recommended values: Typically in the range [0, 0.5].
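The three regularization-style knobs (gamma, reg_alpha, reg_lambda) can be combined; the values below are arbitrary points inside the quoted ranges, shown only to indicate where the parameters go:

    from xgboost import XGBClassifier

    model = XGBClassifier(
        gamma=0.1,        # minimum loss reduction required to split a leaf
        reg_alpha=0.1,    # L1 penalty on leaf weights
        reg_lambda=0.5,   # L2 penalty on leaf weights
        n_estimators=300,
    )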

 Colsample bylevel / bynode:
o Definition: Fraction of features to be randomly sampled at each tree level
(colsample_bylevel) or at each split (colsample_bynode). They introduce additional
randomness and reduce overfitting.
o Where to use: Similar to colsample_bytree, but these parameters control feature
sampling at a finer granularity within each tree.
o When to use: Can be useful for further fine-tuning the column-sampling process at the
level or split granularity.
o XGBoost Hyperparameters: colsample_bylevel, colsample_bynode
o Recommended values: Typically in the range [0.5, 1.0].
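A sketch of how the per-tree, per-level, and per-split column-sampling fractions stack; they multiply, so these illustrative values leave roughly half of the features available at any individual split:

    from xgboost import XGBClassifier

    # 0.8 * 0.8 * 0.8 = 0.512 of the features are candidates at each split.
    model = XGBClassifier(colsample_bytree=0.8,
                          colsample_bylevel=0.8,
                          colsample_bynode=0.8)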
 Lambda:
o Definition: L2 regularization term on weights. It penalizes large coefficients and
helps prevent overfitting.
o Where to use: lambda is the native (learning API) name for the same setting that the
scikit-learn wrapper exposes as reg_lambda; it is an alias, not a separate knob.
o When to use: Increase it to apply stronger L2 regularization and further control
overfitting.
o XGBoost Hyperparameter: lambda
o Recommended values: Typically in the range [0, 0.5].
 Alpha:
o Definition: L1 regularization term on weights. It encourages sparsity in the weight
vectors.
o Where to use: alpha is the native (learning API) name for the same setting that the
scikit-learn wrapper exposes as reg_alpha; it is an alias, not a separate knob.
o When to use: Useful when dealing with high-dimensional data or when you suspect
that many features are irrelevant.
o XGBoost Hyperparameter: alpha
o Recommended values: Typically in the range [0, 0.5].
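To make the aliasing concrete: the native training API takes lambda and alpha in its params dict, while the scikit-learn wrapper spells the same settings reg_lambda and reg_alpha (lambda is a reserved word in Python, hence the prefix). A hedged sketch of both spellings with arbitrary values:

    import xgboost as xgb
    from xgboost import XGBClassifier

    # Native (learning) API: parameter names are "lambda" and "alpha".
    params = {"objective": "binary:logistic", "lambda": 0.5, "alpha": 0.1}
    # booster = xgb.train(params, dtrain)  # dtrain would be an xgb.DMatrix

    # scikit-learn wrapper: the same settings as constructor keywords.
    model = XGBClassifier(reg_lambda=0.5, reg_alpha=0.1)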
 Scale Pos Weight:
o Definition: Controls the balance of positive and negative weights. It is useful for
imbalanced classification tasks.
o Where to use: Can be used to address class imbalance by assigning different weights
to positive and negative examples.
o When to use: Relevant for binary classification tasks with imbalanced class
distributions; for multi-class problems, per-instance sample weights are typically used instead.
o XGBoost Hyperparameter: scale_pos_weight
o Recommended values: Typically set to the ratio of negative examples to positive
examples.
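A minimal sketch of that heuristic, computed on synthetic imbalanced data (the roughly 9:1 imbalance is an assumption chosen only to show the calculation):

    import numpy as np
    from sklearn.datasets import make_classification
    from xgboost import XGBClassifier

    # Roughly 9 negatives for every positive, for illustration.
    X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)

    ratio = float(np.sum(y == 0)) / np.sum(y == 1)  # negatives / positives
    model = XGBClassifier(scale_pos_weight=ratio, objective="binary:logistic")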

 Tree Booster Parameters:


o Definition: Parameters specific to the tree booster (XGBoost's default booster).
o Where to use: Control the behavior and performance of individual trees in the
ensemble.
o When to use: When fine-tuning the boosting algorithm for specific requirements.
o XGBoost Hyperparameters:
 tree_method: Method used to grow trees. Options include auto, exact,
approx, hist, and gpu_hist.
 grow_policy: Controls how trees are added during training. Options include
depthwise and lossguide.
 max_leaves: Maximum number of leaves for each tree. Can be used instead of
max_depth.
 sample_type and normalize_type: Dropout-related parameters used by the DART
booster (booster='dart'), not by histogram-based tree methods.
 rate_drop and skip_drop: Dropout rate and skip probability, also specific to the
DART booster.
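A sketch of two parameter dictionaries for the native API: one exercising the tree-growing options above, and one showing that the dropout parameters only take effect under the DART booster. All values are illustrative assumptions:

    # Histogram-based trees grown leaf-wise, capped by max_leaves rather than max_depth.
    tree_params = {
        "tree_method": "hist",
        "grow_policy": "lossguide",
        "max_leaves": 64,
    }

    # Dropout settings apply only when the DART booster is selected.
    dart_params = {
        "booster": "dart",
        "sample_type": "uniform",
        "normalize_type": "tree",
        "rate_drop": 0.1,
        "skip_drop": 0.5,
    }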

 Objective Function:
o Definition: The loss function to be optimized during training.
o Where to use: Define the specific task and type of problem (e.g., regression,
classification).
o When to use: When you need to specify a custom loss function or use a non-default
objective.
o XGBoost Hyperparameters:
 objective: The objective function to optimize. Common options include
reg:squarederror for regression tasks and binary:logistic or
multi:softmax for classification tasks.
 eval_metric: Evaluation metric used during training. Different from the objective
function, it's used to monitor performance during training. Common options include rmse,
mae, logloss, error, auc, etc.
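A sketch of pairing an objective with a monitoring metric for each task type; recent versions of the scikit-learn wrapper accept eval_metric in the constructor, and the metric pairings here are common conventions rather than requirements:

    from xgboost import XGBRegressor, XGBClassifier

    # Regression: squared-error objective, RMSE monitored during training.
    reg = XGBRegressor(objective="reg:squarederror", eval_metric="rmse")

    # Binary classification: logistic objective, AUC monitored during training.
    clf = XGBClassifier(objective="binary:logistic", eval_metric="auc")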
 Other Parameters:
o Definition: Additional parameters that control various aspects of the XGBoost
algorithm.
o Where to use: Fine-tuning specific aspects of the algorithm or handling edge cases.
o When to use: Depending on the specific requirements of the task.
o XGBoost Hyperparameters:
 booster: The type of boosting model to use (default is gbtree for tree booster).
 verbosity: Controls the level of details printed during training.
 nthread: Number of threads to use for parallel computation.
 random_state: Seed for random number generation.
These additional hyperparameters offer more flexibility and control over the behavior and
performance of the XGBoost model. When tuning hyperparameters, it's essential to consider
the specific requirements of your task and dataset, and to experiment with different
combinations to find the optimal settings.
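As a closing illustration, one hedged configuration that touches most of the parameters covered above; every value is an assumption to be revisited for the task and dataset at hand, not a recommended setting:

    from xgboost import XGBClassifier

    model = XGBClassifier(
        booster="gbtree",          # default tree booster
        n_estimators=500,
        learning_rate=0.05,
        max_depth=6,
        min_child_weight=3,
        subsample=0.8,
        colsample_bytree=0.8,
        gamma=0.1,
        reg_alpha=0.1,
        reg_lambda=0.5,
        objective="binary:logistic",
        eval_metric="auc",
        verbosity=1,               # level of detail printed during training
        n_jobs=4,                  # scikit-learn alias for nthread
        random_state=42,           # seed for reproducibility
    )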
