Decision Tree
A decision tree builds classification or regression models in the form of a tree structure. It breaks down a dataset into smaller and smaller subsets while, at the same time, an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision node (e.g., Outlook) has two or more branches (e.g., Sunny, Overcast and Rainy). A leaf node (e.g., Play) represents a classification or decision. The topmost decision node in a tree, which corresponds to the best predictor, is called the root node. Decision trees can handle both categorical and numerical data.
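To make the structure concrete, here is a hypothetical rendering of such a tree as a nested Python dict, with decision nodes as keys, branches as attribute values, and leaves as Play labels. The Humidity and Windy attributes are assumed from the classic Play Golf data behind the Outlook/Play example and do not appear in the text above:

```python
# A sketch, not taken from this page: decision nodes are dict keys,
# branch labels are attribute values, and leaves are Play decisions.
tree = {
    "Outlook": {
        "Overcast": "Yes",  # a pure branch ends in a leaf node
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Rainy": {"Windy": {"True": "No", "False": "Yes"}},
    }
}
```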
Algorithm
The core algorithm for building decision trees, called ID3 and developed by J. R. Quinlan, employs a top-down, greedy search through the space of possible branches with no backtracking. ID3 uses Entropy and Information Gain to construct a decision tree. In the ZeroR model there is no predictor; in the OneR model we try to find the single best predictor; naive Bayesian includes all predictors using Bayes' rule and the independence assumptions between predictors; but a decision tree includes all predictors with the dependence assumptions between predictors.
Entropy
A decision tree is built top-down from a root node and involves partitioning the data into subsets that contain instances with similar values (homogeneous). The ID3 algorithm uses entropy to calculate the homogeneity of a sample. If the sample is completely homogeneous the entropy is zero, and if the sample is equally divided it has an entropy of one.
To build a decision tree, we need to calculate two types of entropy using frequency tables, as follows:
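In the usual notation, the first is the entropy of the target using the frequency table of one attribute, and the second is the entropy of the target using the frequency table of two attributes (the weighted entropy across the branches of a split on attribute $X$):

$$E(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$$

$$E(T, X) = \sum_{c \in X} P(c)\, E(c)$$

A minimal Python sketch of the first formula, using the 9 "Yes" / 5 "No" Play Golf target that the Outlook/Play example above comes from:

```python
import math
from collections import Counter

def entropy(labels):
    """E(S) = sum over classes of -p * log2(p)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

play = ["Yes"] * 9 + ["No"] * 5
print(round(entropy(play), 3))  # 0.94: an almost evenly divided sample
```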
Information Gain
The information gain is based on the decrease in entropy after a dataset is split on an attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches).
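In symbols, with $E(T)$ the entropy of the target before the split and $E(T, X)$ the weighted entropy after splitting on attribute $X$:

$$Gain(T, X) = E(T) - E(T, X)$$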
Step 1: Calculate the entropy of the target.

Step 2: The dataset is then split on the different attributes. The entropy for each branch is calculated. Then it is added proportionally, to get the total entropy for the split. The resulting entropy is subtracted from the entropy before the split. The result is the Information Gain, or decrease in entropy.
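A sketch of Steps 1 and 2 in Python, reusing the entropy helper above. The rows reproduce the Outlook column of the classic Play Golf table (5 Sunny with 2 Yes, 4 Overcast all Yes, 5 Rainy with 3 Yes), so the printed gain matches the well-known value of about 0.247:

```python
def info_gain(rows, attr, target):
    """Gain(T, X): entropy before the split minus weighted entropy after it."""
    total = len(rows)
    before = entropy([r[target] for r in rows])            # Step 1
    after = 0.0                                            # Step 2
    for value in set(r[attr] for r in rows):
        branch = [r[target] for r in rows if r[attr] == value]
        after += len(branch) / total * entropy(branch)
    return before - after

outlook = ["Sunny"] * 5 + ["Overcast"] * 4 + ["Rainy"] * 5
play = ["No", "No", "No", "Yes", "Yes"] + ["Yes"] * 4 \
     + ["Yes", "Yes", "Yes", "No", "No"]
rows = [{"Outlook": o, "Play": p} for o, p in zip(outlook, play)]
print(round(info_gain(rows, "Outlook", "Play"), 3))  # 0.247
```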
Step 3: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches, and repeat the same process on every branch.
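Step 3 is a one-line selection once information gain is available; a minimal sketch using the info_gain helper above:

```python
def best_attribute(rows, attrs, target):
    """Pick the attribute with the largest information gain (Step 3)."""
    return max(attrs, key=lambda a: info_gain(rows, a, target))
```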
Step 4a: A branch with an entropy of 0 is a leaf node.

Step 4b: A branch with an entropy of more than 0 needs further splitting.
Step 5: The ID3 algorithm is run recursively on the non-leaf branches, until all data is classified.
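Putting Steps 1 through 5 together, a compact recursive sketch (not Quinlan's original implementation) that builds the nested-dict tree shown earlier, reusing the helpers above:

```python
def id3(rows, attrs, target):
    """Recursively grow a decision tree as nested dicts; leaves are labels."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:              # Step 4a: pure branch -> leaf node
        return labels[0]
    if not attrs:                          # no attributes left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = best_attribute(rows, attrs, target)      # Step 3
    tree = {best: {}}
    for value in set(r[best] for r in rows):        # Steps 4b and 5: recurse
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best], target)
    return tree
```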
A decision tree can easily be transformed into a set of rules by mapping the paths from the root node to the leaf nodes one by one.
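As an illustration of that mapping, a small helper (a sketch, assuming the nested-dict trees produced by the id3 function above) that walks each root-to-leaf path and emits one IF/THEN rule per leaf:

```python
def tree_to_rules(tree, conditions=()):
    """Turn a nested-dict tree into rules, one per root-to-leaf path."""
    if not isinstance(tree, dict):         # a leaf: a class label
        cond = " AND ".join(f"{a} = {v}" for a, v in conditions)
        return [f"IF {cond} THEN class = {tree}"]
    rules = []
    (attr, branches), = tree.items()       # a decision node holds one attribute
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + ((attr, value),)))
    return rules

for rule in tree_to_rules(id3(rows, ["Outlook"], "Play")):
    print(rule)  # e.g. IF Outlook = Overcast THEN class = Yes
```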