
Randomized Decision Trees II

compiled by Alvin Wan from Professor Jitendra Malik's lecture

1 Feature Selection

Note that a depth-limited tree admits only a finite number of possible combinations of splits.

1.1 Randomized Feature Selection

Suppose X is 1000-dimensional. We can randomly select a subset of the features and use it to create a new decision tree. This is called randomized feature selection. If a feature is nominal, such as hair color, the questions simply compare categories: is it black? is it brown?
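As an illustrative sketch (not from the lecture), here is one way this could look in Python: each tree is grown on its own random subset of the 1000 features. The function name, the subset size, and the use of scikit-learn's DecisionTreeClassifier are assumptions made for the example.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_randomized_tree(X, y, n_subset=32, max_depth=5, seed=None):
    # X has shape (n_samples, 1000); keep only n_subset randomly chosen columns.
    rng = np.random.default_rng(seed)
    feature_idx = rng.choice(X.shape[1], size=n_subset, replace=False)
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(X[:, feature_idx], y)
    return tree, feature_idx

# An ensemble is just many such trees, each with its own feature subset;
# at prediction time each tree looks only at its own columns: tree.predict(X[:, feature_idx]).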

2 Boosting

Our first intuition is the wisdom of the crowds. Our second is that we want experts for
different types of samples. In other words, some trees perform better on particular samples.
How do we give each tree a different weight? This note will cover only the algorithm and
not the proof.

2.1 Intuition

Let us consider a trimmed version of the boosting algorithm as it was first proposed.

1. Train weak learner.

2. Get weak hypothesis $h_t : X \to \{-1, +1\}$.

3. Choose $\alpha_t$.

Here, $h_t$ is a single decision tree with error rate $\epsilon_t$, and $\alpha_t$ is the weighting for decision tree $t$.

\[ \alpha_t = \frac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t} \]

The first thing to notice is that the accuracy of a weak learner is at least 0.5 for a binary classification problem, i.e. $\epsilon_t \le 0.5$. Any less (say, an accuracy of 0.45) and we can simply invert the classification for a higher accuracy ($1 - 0.45 = 0.55$). Consider the worst-case scenario, where $\epsilon_t = 0.5$. Plugging in $\epsilon_t = 0.5$, we get $\alpha_t = 0$, as expected.
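As a quick numerical sketch of the weighting (my own check, not part of the notes; the error rates are illustrative):

import math

def alpha(eps):
    # alpha_t = 0.5 * ln((1 - eps_t) / eps_t)
    return 0.5 * math.log((1 - eps) / eps)

print(alpha(0.5))   # 0.0   -> a coin-flip learner gets zero weight
print(alpha(0.25))  # ~0.55 -> a better learner gets a larger weight
print(alpha(0.1))   # ~1.10 -> an even stronger learner gets more weight still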

We pick this weighting according to the error rate of each decision tree. Create a classifier $h_1$, compute its error $\epsilon_1$ and weight $\alpha_1$. Repeat this for the second, third, etc., trees. Here is our scheme for training an expert: take the samples that $h_1$ classified incorrectly and train $h_2$ on those; take the samples that $h_2$ classified incorrectly and train $h_3$ on those. We can continue in this fashion to produce an expert. This is the intuition, but in reality we will instead give more weight to the samples that $h_1$ classified incorrectly.
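A minimal sketch of this intuition-only scheme, assuming scikit-learn decision stumps (in practice, boosting reweights the samples rather than discarding the correctly classified ones):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_sequential_experts(X, y, n_rounds=3):
    experts = []
    X_cur, y_cur = X, y
    for _ in range(n_rounds):
        h = DecisionTreeClassifier(max_depth=1).fit(X_cur, y_cur)
        experts.append(h)
        wrong = h.predict(X_cur) != y_cur          # samples this learner missed
        if not wrong.any():
            break                                  # nothing left to specialize on
        X_cur, y_cur = X_cur[wrong], y_cur[wrong]  # next learner trains only on the mistakes
    return experts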

2.2 Full Algorithm

Now, let us consider the original boosting algorithm in its full glory.

1. Train weak learner with distribution $D_t$.

2. Get weak hypothesis $h_t : X \to \{-1, +1\}$.

3. Choose $\alpha_t$.

4. Update $D_t \to D_{t+1}$.

We compute a probability distribution $D_t$ over the samples. We know that $\sum_i D_t(i) = 1$, and we can initialize $D_1$ to be

\[ D_1(i) = \frac{1}{n}, \quad \forall i \]

For each sample, we multiply its old weight by a factor, which effectively gives more weight to examples that were classified incorrectly. If $h_t$ classifies sample $i$ correctly, the factor is $e^{-\alpha_t}$, scaling its weight in $D_{t+1}$ down; if it misclassifies the sample, the factor is $e^{\alpha_t}$, scaling its weight up. The better the classifier (the higher $\alpha_t$), the stronger this reweighting.

\[ D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t} & \text{if } h_t(x_i) = y_i \\ e^{\alpha_t} & \text{if } h_t(x_i) \neq y_i \end{cases} \]

We can summarize this update as follows.

\[ D_{t+1}(i) = \frac{D_t(i) \exp(-\alpha_t \, y_i \, h_t(x_i))}{Z_t} \]
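Putting the four steps together, here is a compact sketch of the full algorithm in Python (my own rendering, not the lecture's code). It assumes labels in {-1, +1}, scikit-learn decision stumps as the weak learners, and the standard sign-of-weighted-vote combination for the final classifier, which the note does not write out explicitly.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, T=50):
    # y must contain labels -1 and +1.
    n = len(y)
    D = np.full(n, 1.0 / n)                      # D_1(i) = 1/n for all i
    hypotheses, alphas = [], []
    for t in range(T):
        # 1. Train weak learner with distribution D_t (passed as sample weights).
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        # 2. Weighted error rate eps_t of the weak hypothesis.
        eps = np.clip(np.sum(D[pred != y]), 1e-10, 1 - 1e-10)
        # 3. Choose alpha_t = 0.5 * ln((1 - eps_t) / eps_t).
        alpha = 0.5 * np.log((1 - eps) / eps)
        # 4. Update D_{t+1}(i) = D_t(i) * exp(-alpha_t * y_i * h_t(x_i)) / Z_t.
        D = D * np.exp(-alpha * y * pred)
        D = D / D.sum()                          # Z_t normalizes D_{t+1} to sum to 1
        hypotheses.append(h)
        alphas.append(alpha)
    return hypotheses, alphas

def predict(hypotheses, alphas, X):
    # Final classifier: sign of the alpha-weighted vote of the weak hypotheses.
    votes = sum(a * h.predict(X) for h, a in zip(hypotheses, alphas))
    return np.sign(votes)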
