
UNIT-IV

Ensemble Learning and Random Forest: Introduction to Ensemble Learning, Basic Ensemble Techniques (Max Voting,
Averaging, Weighted Average), Voting Classifiers, Bagging and Pasting, Out-of-Bag Evaluation, Random Patches and
Random Subspaces, Random Forests (Extra-Trees, Feature Importance), Boosting (AdaBoost, Gradient Boosting),
Stacking.

# Introduction to Ensemble Learning


Ensemble learning is a machine learning technique that combines multiple individual models to solve a problem and
improve the overall performance compared to a single model. The basic idea is that by combining multiple models,
the weaknesses of one model can be compensated by the strengths of others. This approach typically leads to more
accurate and robust predictions.
Key Concepts of Ensemble Learning:
1. Diversity: For an ensemble method to be effective, the models combined in the ensemble must be diverse,
meaning they make different kinds of errors. This diversity allows the ensemble to correct individual model
errors, improving the overall prediction.
2. Combining Models: The results from multiple models are combined to form a final prediction. The
combination methods can include:
o Averaging: For regression tasks, the predictions of individual models are averaged.
o Voting: For classification tasks, the class with the majority of votes is chosen.
3. Improved Accuracy: The ensemble model usually performs better than individual models by reducing
variance (overfitting) and bias (underfitting).
Types of Ensemble Learning Methods:
1. Bagging (Bootstrap Aggregating):
o In bagging, multiple models (usually the same type) are trained on different subsets of the training
data, often generated by bootstrapping (sampling with replacement).
o Each model votes or averages its predictions, and the final result is based on this aggregated output.
o Example: Random Forest is a well-known bagging algorithm, where multiple decision trees are
trained independently, and their predictions are averaged or voted on.
2. Boosting:
o Boosting focuses on combining weak learners (models that perform slightly better than random
guessing) in a sequential manner.
o Each new model is trained to correct the errors made by previous models, giving more weight to the
misclassified data points.
o The final prediction is a weighted combination of the individual model predictions.
o Example: AdaBoost, Gradient Boosting, and XGBoost are popular boosting algorithms.
3. Stacking:
o Stacking (or stacked generalization) involves training multiple models and using another model,
known as a meta-model or blender, to combine the predictions of the base models.
o Unlike bagging and boosting, where models are combined directly, stacking uses the predictions of
the base models as input features for the meta-model.
o This approach allows for a more sophisticated combination of models.
Advantages of Ensemble Learning:
 Improved Accuracy: By combining several models, ensemble learning often leads to better performance,
especially in complex problems.
 Reduced Overfitting: It can help reduce overfitting (variance) by averaging out individual model errors.
 Robustness: Ensemble methods are more robust and less likely to perform poorly compared to a single
model.
Disadvantages of Ensemble Learning:
 Increased Complexity: Ensembles can be computationally expensive, requiring more time and resources for
training and prediction.
 Interpretability: Ensemble methods, especially those like boosting and stacking, can be harder to interpret
because they combine multiple models.
Common Applications:
 Classification: Ensemble methods are widely used in classification tasks, such as spam detection, image
recognition, and sentiment analysis.
 Regression: For regression tasks like predicting house prices or stock prices, ensemble methods can improve
the accuracy of predictions.
 Anomaly Detection: Ensemble learning can also be applied in anomaly detection to identify unusual patterns
in data.
Conclusion:
Ensemble learning is a powerful technique that enhances the predictive performance by combining multiple models.
By leveraging the strengths and compensating for the weaknesses of individual models, ensemble methods like
bagging, boosting, and stacking can help address various machine learning challenges, from overfitting to increasing
accuracy and robustness.

# Basic Ensemble Techniques


Ensemble learning combines the predictions of multiple models to improve performance. The key to making an
ensemble work effectively is how the individual model outputs are combined. Below are three common techniques
used for combining predictions: Max Voting, Averaging, and Weighted Average.
1. Max Voting (Majority Voting)
 Definition: Max voting is typically used in classification problems. In this approach, each individual model in
the ensemble casts a vote for a class, and the class that receives the most votes is chosen as the final
prediction.
 How it works:
o Suppose you have an ensemble of N models.
o Each model outputs a class label for a given data point.
o The class that appears most frequently across all the models is the predicted class for that data point.
 Example:
o Model 1: Class 0
o Model 2: Class 1
o Model 3: Class 0
o Model 4: Class 0
o Model 5: Class 1
o Max Voting Result: Class 0 wins because it has the most votes (3 votes for Class 0 vs. 2 votes for Class
1).
 Use Case: This technique is commonly used in bagging methods like Random Forest, where each decision
tree in the forest "votes" for a class.
2. Averaging
 Definition: Averaging is used in regression tasks. In this technique, the outputs of all the models in the
ensemble are averaged to make the final prediction. This helps to smooth out errors made by individual
models, leading to more stable predictions.
 How it works:
o Suppose you have an ensemble of N models, and each model predicts a numeric value for the
given data point.
o The final prediction is the average of these individual predictions.
 Example:
o Model 1: 10
o Model 2: 12
o Model 3: 11
o Model 4: 9
o Model 5: 10
o Averaging Result: The final prediction is the average of all the predictions: (10 + 12 + 11 + 9 + 10) / 5 = 52 / 5 = 10.4
 Use Case: Averaging is commonly used in bagging methods like Random Forest Regression, where multiple
regression trees provide their predictions, and the final prediction is the average of all.
3. Weighted Average
 Definition: The weighted average method is a more refined version of the averaging technique. In this
method, each model's prediction is given a different weight, and the final prediction is the weighted average
of all model predictions. The weight can be based on the model's performance or confidence in the
prediction.
 How it works:
o Each model's prediction is multiplied by a weight that reflects the model's accuracy or reliability.
o The final prediction is calculated as the sum of all the weighted predictions divided by the total sum
of the weights.
 Example:
o Suppose you have the following models and their predictions:
 Model 1: Prediction = 10, Weight = 0.4
 Model 2: Prediction = 12, Weight = 0.3
 Model 3: Prediction = 11, Weight = 0.2
 Model 4: Prediction = 9, Weight = 0.1
o The weighted average prediction is: (0.4 × 10 + 0.3 × 12 + 0.2 × 11 + 0.1 × 9) / (0.4 + 0.3 + 0.2 + 0.1) = 10.7 / 1.0 = 10.7
 Use Case: Weighted averaging can be used when models have different performances or when certain
models are known to be more reliable for certain types of data. For instance, models with lower error rates
might be assigned higher weights.
| Technique | Type of Problem | How it Works | Example |
|---|---|---|---|
| Max Voting | Classification | Each model votes for a class, and the class with the most votes is the final prediction. | Random Forest (classification) |
| Averaging | Regression | The final prediction is the average of all model predictions. | Random Forest Regression |
| Weighted Average | Regression/Classification | Predictions are weighted by the model's performance, and the final prediction is a weighted average. | Used in more sophisticated ensembles like boosting |
Summary:
These ensemble techniques—max voting, averaging, and weighted averaging—are foundational methods for
combining predictions from multiple models, and they serve as the core of more advanced ensemble methods like
Random Forests (bagging), Boosting algorithms (like AdaBoost or Gradient Boosting), and Stacking.
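As an illustration of how these three combination rules behave, here is a minimal Python sketch using the made-up model outputs from the examples above. No real models are trained; it only assumes NumPy is available.
python
from collections import Counter
import numpy as np

# Max voting: class labels predicted by five hypothetical classifiers for one data point
class_votes = [0, 1, 0, 0, 1]
print(Counter(class_votes).most_common(1)[0][0])             # 0 -> Class 0 wins with 3 of 5 votes

# Averaging: numeric predictions from five hypothetical regression models
predictions = np.array([10, 12, 11, 9, 10])
print(predictions.mean())                                    # 10.4

# Weighted average: each prediction is weighted by the model's assumed reliability
weighted_predictions = np.array([10, 12, 11, 9])
weights = np.array([0.4, 0.3, 0.2, 0.1])
print(np.average(weighted_predictions, weights=weights))     # 10.7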

# Voting Classifiers
A Voting Classifier is a type of ensemble learning method that combines multiple machine learning models (also
known as "base models" or "weak learners") to make a final prediction based on the majority voting principle. It is
primarily used for classification tasks, where the goal is to combine the predictions of multiple classifiers to increase
accuracy and robustness.
Key Concept of Voting Classifiers:
 Each base model in the ensemble makes a prediction for a given instance.
 Voting occurs to decide the final predicted class:
o The class that receives the most votes from the individual models is chosen as the final prediction.
Voting classifiers are particularly useful when combining different models with complementary strengths, which
leads to improved overall performance.
Types of Voting Classifiers:
1. Hard Voting (Majority Voting):
o Definition: In hard voting, each classifier in the ensemble casts a vote for a class, and the class with
the most votes is chosen as the final prediction. If there is a tie (e.g., two classes have the same
number of votes), a predefined tie-breaking rule may be used.
o How it works:
 Suppose you have an ensemble of N classifiers.
 Each classifier assigns a predicted class label to the input data point.
 The final prediction is the class label that appears most frequently across all classifiers.
o Example:
 Model 1: Class 0
 Model 2: Class 1
 Model 3: Class 0
 Model 4: Class 0
 Model 5: Class 1
 Hard Voting Result: Class 0 is chosen because it has the most votes (3 votes for Class 0 vs. 2 votes for Class 1).
o Use Case: This approach is simple, and often used when combining multiple classifiers like decision
trees, logistic regression, or support vector machines (SVMs) in an ensemble.
2. Soft Voting:
o Definition: Soft voting is a more advanced version of voting. Instead of using hard class labels, soft
voting relies on the predicted probabilities (class probabilities) for each class and takes a weighted
average of these probabilities to make the final prediction.
o How it works:
 For each classifier, the predicted probability of each class is computed.
 The predicted probabilities of each class are averaged (or summed, depending on the
method) across all classifiers.
 The final predicted class is the one with the highest average (or summed) probability.
o Example:
 Model 1: Probability for Class 0 = 0.6, Probability for Class 1 = 0.4
 Model 2: Probability for Class 0 = 0.7, Probability for Class 1 = 0.3
 Model 3: Probability for Class 0 = 0.5, Probability for Class 1 = 0.5
 Soft Voting Result:
 Average probability for Class 0 = (0.6 + 0.7 + 0.5) / 3 = 0.60
 Average probability for Class 1 = (0.4 + 0.3 + 0.5) / 3 = 0.40
 Final prediction: Class 0 (since it has the higher average probability).

o Use Case: Soft voting is more effective than hard voting when the models in the ensemble produce
well-calibrated class probabilities. This is often the case with models like Logistic Regression or Naive
Bayes, which provide probabilities as output.
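To make the probability-averaging step concrete, the numbers from the worked example above can be combined by hand. The short NumPy sketch below is only an illustration of the arithmetic; VotingClassifier performs this averaging internally.
python
import numpy as np

# Rows = the three models in the example above, columns = [P(Class 0), P(Class 1)]
probabilities = np.array([[0.6, 0.4],
                          [0.7, 0.3],
                          [0.5, 0.5]])

average_probabilities = probabilities.mean(axis=0)           # [0.6, 0.4]
print(average_probabilities)
print("Predicted class:", average_probabilities.argmax())    # 0, i.e. Class 0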
Example of Using a Voting Classifier in Python:
Here’s an example of how to implement a Voting Classifier using scikit-learn in Python, with both hard voting and
soft voting.
python
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load a dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Define base classifiers
clf1 = LogisticRegression(max_iter=200)
clf2 = DecisionTreeClassifier(random_state=42)
clf3 = SVC(probability=True, random_state=42)

# Create a Voting Classifier (hard voting)
voting_clf_hard = VotingClassifier(estimators=[('lr', clf1), ('dt', clf2), ('svc', clf3)], voting='hard')

# Create a Voting Classifier (soft voting)
voting_clf_soft = VotingClassifier(estimators=[('lr', clf1), ('dt', clf2), ('svc', clf3)], voting='soft')

# Train the hard voting classifier
voting_clf_hard.fit(X_train, y_train)

# Train the soft voting classifier
voting_clf_soft.fit(X_train, y_train)

# Evaluate the models
print("Hard Voting Classifier Accuracy: ", voting_clf_hard.score(X_test, y_test))
print("Soft Voting Classifier Accuracy: ", voting_clf_soft.score(X_test, y_test))

Advantages of Voting Classifiers:


 Improved Accuracy: By combining multiple models, a voting classifier often outperforms individual
classifiers.
 Robustness: Voting classifiers are more robust to errors because they combine different perspectives (from
different models) to make a final decision.
 Versatility: Voting classifiers can combine a variety of different model types (e.g., decision trees, support
vector machines, logistic regression), leveraging their unique strengths.
Disadvantages of Voting Classifiers:
 Complexity: The ensemble models are often more computationally expensive to train and use, especially
with a large number of classifiers.
 Diminished Returns: If the base models are very similar or have high bias/variance, the ensemble’s
performance might not significantly improve compared to the individual models.
 Interpretability: With multiple models combined in an ensemble, it may be harder to interpret the decision-
making process compared to using a single model.
Use Cases:
 Ensemble for Classification Problems: Voting classifiers are effective in combining different classifiers (e.g.,
decision trees, logistic regression, SVMs) for tasks like sentiment analysis, image classification, and spam
detection.
 Improved Prediction Stability: In cases where individual models might have high variance (e.g., decision
trees), a voting classifier can reduce the variance and increase stability.
Conclusion:
A Voting Classifier is an effective and simple ensemble method that improves classification performance by
combining multiple models. Hard voting works by taking the majority class prediction, while soft voting averages the
predicted probabilities to make a more refined decision. By using these techniques, voting classifiers can produce
more accurate, robust, and stable predictions than individual classifiers, especially in complex or noisy datasets.

# Bagging and Pasting in Machine Learning


Bagging and Pasting are two ensemble learning techniques that aim to improve the performance of machine
learning models by combining multiple base models trained on different subsets of data. Both methods are forms of
Bootstrap Aggregating (Bagging) but differ in how the training data subsets are generated.
Let’s explore both methods in more detail.

1. Bagging (Bootstrap Aggregating)


Bagging, short for Bootstrap Aggregating, is an ensemble learning method that helps to reduce variance and
improve the performance of machine learning models, particularly those that tend to overfit, such as decision trees.
Key Concept of Bagging:
 Data Sampling: In bagging, multiple models are trained on random subsets of the original training data.
These subsets are generated by bootstrapping, which means sampling the data with replacement. This
allows some data points to be repeated in the same subset, while others may not appear at all.
 Combining Predictions: Once the individual models are trained, their predictions are combined to produce a
final output:
o Classification: For classification tasks, the most common method is majority voting (for each data
point, the class predicted by the majority of models is selected).
o Regression: For regression tasks, the predictions are usually averaged to get the final result.
How Bagging Works:
1. Data Subsets: From the original training dataset, N bootstrap samples (random subsets of data with
replacement) are drawn.
2. Train Models: Each bootstrap sample is used to train an individual model, such as a decision tree, on the
data.
3. Combine Results:
o For Classification: Each model "votes" on the predicted class for a given input, and the class with the
majority of votes is chosen.
o For Regression: The predictions of all models are averaged to obtain the final prediction.
Example: Random Forest
 Random Forest is one of the most popular algorithms based on bagging. It builds a collection of decision
trees, each trained on a bootstrap sample of the data. The final prediction is made based on majority voting
(for classification) or averaging (for regression).
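The three steps above can be sketched from scratch with scikit-learn decision trees on the Iris dataset. This is only an illustration of the mechanics; in practice BaggingClassifier or RandomForestClassifier handle the sampling and aggregation internally.
python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

n_estimators = 25
rng = np.random.default_rng(42)
trees = []

# Steps 1 and 2: draw bootstrap samples (with replacement) and train one tree per sample
for _ in range(n_estimators):
    idx = rng.choice(len(X_train), size=len(X_train), replace=True)
    trees.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Step 3: combine results by majority vote across all trees
all_preds = np.array([tree.predict(X_test) for tree in trees])    # shape (n_estimators, n_test)
majority_vote = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, all_preds)
print("Manual bagging accuracy:", (majority_vote == y_test).mean())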
Advantages of Bagging:
 Reduces Overfitting: Bagging is particularly effective for high-variance models (e.g., decision trees) as it
reduces variance without increasing bias significantly.
 Improves Accuracy: By combining multiple models, bagging typically results in a more robust model than any
individual model.
 Parallelism: Since each model is trained independently, bagging can be parallelized, making it
computationally efficient.
Disadvantages of Bagging:
 Computationally Expensive: Training multiple models on different data subsets can require significant
computational resources.
 Model Interpretability: When using a large ensemble of complex models (like decision trees), interpreting
the overall decision-making process becomes challenging.

2. Pasting
Pasting is a variation of bagging with a subtle but important difference in how the training subsets are created.
Key Concept of Pasting:
 Data Sampling: Unlike bagging, pasting generates training subsets by sampling without replacement from
the original training data. In other words, no data point can appear more than once in any given training
subset.
 Combining Predictions: After training the individual models on different subsets of the data, the predictions
are combined in the same way as in bagging: using majority voting for classification or averaging for
regression.
How Pasting Works:
1. Data Subsets: From the original dataset, N samples are drawn without replacement to create each
training subset.
2. Train Models: Each subset is used to train an individual model, just like in bagging.
3. Combine Results:
o For Classification: Each model casts a vote for a predicted class, and the majority vote is taken as the
final prediction.
o For Regression: The predictions from all models are averaged.
Key Difference Between Bagging and Pasting:
 Bagging samples data with replacement, meaning a data point can appear multiple times in the same subset.
 Pasting samples data without replacement, meaning each subset contains unique data points and no data
point appears more than once.
Example:
Imagine a dataset with 10 data points:
 In Bagging: A training subset could contain, for example, data points {1, 2, 2, 4, 5, 7, 8, 8, 9, 10} (with
repetitions).
 In Pasting: A training subset would contain {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} (with no repetitions).
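The difference between the two sampling schemes can be seen directly with NumPy. The sketch below uses an index set of 10 points mirroring the example above; the subset sizes are arbitrary choices for illustration.
python
import numpy as np

rng = np.random.default_rng(0)
data_points = np.arange(1, 11)                       # the 10 data points {1, ..., 10}

# Bagging-style subset: drawn with replacement, so duplicates can occur
bagging_subset = rng.choice(data_points, size=10, replace=True)

# Pasting-style subset: drawn without replacement, so every selected point is unique
pasting_subset = rng.choice(data_points, size=7, replace=False)

print(sorted(bagging_subset))    # duplicates are likely
print(sorted(pasting_subset))    # 7 distinct points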
Advantages of Pasting:
 No Redundancy: Since there are no repeated data points in each training subset, pasting might be more
efficient in utilizing the available training data.
 Good for High-Variance Models: Like bagging, pasting helps reduce overfitting by combining multiple
models.
Disadvantages of Pasting:
 Limited Diversity: Since no data points are repeated in each subset, the subsets might have more in common
with each other than in bagging. This could lead to a reduced level of model diversity, which can impact
performance.
 Computational Cost: Like bagging, pasting requires training multiple models, which can be computationally
expensive.

Comparison of Bagging and Pasting:


| Aspect | Bagging | Pasting |
|---|---|---|
| Sampling Method | Sampling with replacement | Sampling without replacement |
| Training Data Subsets | Some data points may appear multiple times in each subset | Each data point appears at most once in a subset |
| Model Diversity | Higher model diversity due to repeated data points in subsets | Lower diversity compared to bagging due to unique data points in each subset |
| Computational Efficiency | Can be parallelized effectively | Can be parallelized, but subsets are typically smaller |
| Use Case | Best for reducing high variance in models prone to overfitting (e.g., decision trees) | Suitable when the dataset is large and data points should not be reused |

Example of Bagging and Pasting in Python (with scikit-learn):


Both bagging and pasting can be implemented using the BaggingClassifier in scikit-learn. Although the class is named
BaggingClassifier, the only difference between bagging and pasting here is the sampling method, controlled via the
bootstrap parameter: bootstrap=True (the default) samples with replacement (bagging), while bootstrap=False samples
without replacement (pasting). The max_samples parameter sets how large each training subset is.
python
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Bagging (Bootstrap Aggregating) using a decision tree classifier
# bootstrap=True is the default, so each tree is trained on a sample drawn with replacement
bagging_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=42)
bagging_clf.fit(X_train, y_train)
print(f"Bagging Accuracy: {bagging_clf.score(X_test, y_test)}")

# Pasting using a decision tree classifier
# bootstrap=False samples without replacement, so no data point is repeated within a subset;
# max_samples is set below 1.0 so the subsets differ from one another
pasting_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, max_samples=0.8,
                                bootstrap=False, random_state=42)
pasting_clf.fit(X_train, y_train)
print(f"Pasting Accuracy: {pasting_clf.score(X_test, y_test)}")
Conclusion:
 Bagging and Pasting are ensemble methods that improve the performance of machine learning models by
reducing overfitting and increasing robustness.
 Bagging uses bootstrap sampling (sampling with replacement), while Pasting uses sampling without
replacement.
 Both methods work by training multiple models on different subsets of data and combining their predictions,
but bagging tends to have more model diversity due to repeated samples in the training sets, while pasting
ensures no redundancy in the data subsets.

# Out-of-Bag (OOB) Evaluation

Out-of-Bag (OOB) Evaluation is a technique used to estimate the performance of an ensemble model, particularly in
methods like Bagging (Bootstrap Aggregating) and Random Forests, without needing a separate validation set or
cross-validation. It leverages the inherent structure of bootstrapping to evaluate model performance on data that
wasn't used in training individual models.
Key Concept of Out-of-Bag Evaluation:
In bagging, each model is trained on a bootstrap sample—a random subset of the training data with replacement.
Since each subset can have repeated data points, some data points are left out of the training set for each model.
These left-out data points are called Out-of-Bag (OOB) samples.
Out-of-Bag Evaluation uses these OOB samples to estimate the performance of the model. The main idea is:
 For each data point in the training set, you can track how often it is left out of the bootstrap samples during
model training.
 When predicting the class (in classification) or the value (in regression) for a data point, only the models that
did not see that point during training are used.
 This allows for a validation-like process without needing to reserve a separate validation set.
How Out-of-Bag (OOB) Evaluation Works:
1. Bootstrapping: During the training phase, each model in the ensemble is trained on a bootstrap sample
(random subset with replacement). For each model, a portion of the data is left out (OOB samples).
2. OOB Prediction: Each data point has multiple models that did not see it during training (since the data point
was left out of the bootstrap sample). These models are used to predict the outcome for that data point.
3. Performance Estimation: The predictions made by the models that did not use a particular data point are
compared to the actual label (for classification) or value (for regression) of that data point. The performance
(e.g., accuracy, mean squared error) is averaged across all data points.
Example:
Consider a dataset with 1000 samples:
 Each decision tree in the random forest is trained on a bootstrap sample, and each tree leaves out some
samples from the original dataset (because of sampling with replacement).
 For each data point in the dataset, we can see how often it was left out and which trees are available to make
predictions for that data point.
 If a data point was not in a tree’s training set, that tree can be used to predict the class or value of that point.
The final prediction is often the average or majority vote of the predictions from all trees that did not use
that point.
 The OOB error rate is then calculated as the average error of all predictions made using OOB samples.
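The sketch below illustrates this empirically with a BaggingClassifier on the Iris dataset: on average roughly 37% of the training points are out-of-bag for each estimator (since (1 − 1/n)^n approaches 1/e), and those points can be scored manually using the estimators_samples_ attribute. The manual figure is close to, but not necessarily identical to, the built-in oob_score_, which averages predicted probabilities rather than hard votes.
python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
n_samples, n_classes = len(X), len(np.unique(y))

bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100,
                        oob_score=True, random_state=42).fit(X, y)

# Fraction of the data left out of each bootstrap sample (about 0.37 on average)
oob_fractions = [1 - len(np.unique(idx)) / n_samples for idx in bag.estimators_samples_]
print("Average OOB fraction per estimator:", np.mean(oob_fractions))

# Manual OOB accuracy: each point is predicted only by the trees that never saw it
votes = np.zeros((n_samples, n_classes))
for tree, idx in zip(bag.estimators_, bag.estimators_samples_):
    oob_mask = np.ones(n_samples, dtype=bool)
    oob_mask[idx] = False
    preds = tree.predict(X[oob_mask]).astype(int)
    votes[oob_mask] += np.eye(n_classes)[preds]      # one hard vote per OOB prediction

print("Manual OOB accuracy:", (votes.argmax(axis=1) == y).mean())
print("Built-in oob_score_ :", bag.oob_score_)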
Advantages of Out-of-Bag Evaluation:
 No Need for a Separate Validation Set: OOB evaluation effectively uses the data points that were left out
during the training process to estimate the model's performance, so it eliminates the need for an extra
holdout validation set. This is especially useful when the dataset is small.
 Efficient: Since the models are trained on different subsets of the data, OOB evaluation can be done without
additional data splits, which saves time and computation.
 Accurate Estimate: The OOB estimate can be as accurate as other validation techniques like cross-validation.
In fact, for Random Forests, it is often as good or better because the OOB process inherently tests each
model on data that it has not seen.
 Reduces Overfitting: By using the OOB samples for evaluation, the model's tendency to overfit to the training
data is reduced. The OOB samples act as a kind of pseudo-validation set that helps provide an unbiased
estimate of model performance.
Example of Out-of-Bag Evaluation in Random Forests (with Python):
In scikit-learn, RandomForestClassifier and RandomForestRegressor provide built-in support for OOB evaluation.
Here's an example using RandomForestClassifier.
python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load dataset
X, y = load_iris(return_X_y=True)

# Split into train and test sets (no validation set used here)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Initialize Random Forest Classifier with OOB evaluation enabled
rf = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=42)

# Train the model
rf.fit(X_train, y_train)

# Access the Out-of-Bag score (similar to accuracy)
print(f"Out-of-Bag Score: {rf.oob_score_}")

# Evaluate on test data
test_accuracy = rf.score(X_test, y_test)
print(f"Test Accuracy: {test_accuracy}")

Interpretation of the Output:


 The OOB Score provides the accuracy of the Random Forest classifier on the out-of-bag samples (those data
points that were not included in each bootstrap sample). It is used as an estimate of the model's
performance on unseen data.
 The Test Accuracy shows how well the model performs on the test set, which is typically used as a final
evaluation.
Use Case:
 Random Forests: OOB evaluation is particularly useful in Random Forests, where multiple trees are trained
on bootstrapped data samples. Each tree in the forest is evaluated on the data points that were not included
in its own training set. This provides an unbiased estimate of performance without needing to use a separate
validation set.
 Other Ensemble Models: Although OOB evaluation is most commonly used in Random Forests, it can also be
applied to other bagging algorithms, like BaggingClassifier and BaggingRegressor.
Summary of Key Points:
 Out-of-Bag (OOB) Evaluation is an efficient method to estimate the performance of an ensemble model like
Random Forests.
 It works by using data points that were not included in the bootstrap samples during training.
 It is an advantage over using a separate validation set because it makes full use of the training data without
overfitting, saving both time and computation.
 In scikit-learn, training with oob_score=True and then reading the oob_score_ attribute retrieves the OOB score for Random Forest (and Bagging) models.
 OOB evaluation provides a reliable estimate of model performance, often comparable to cross-validation.
