DECISION TREE
Machine Learning Algorithms
Dr. Adven

Description
A Decision Tree has many analogies in real life and, as it turns out, it has influenced a wide area of Machine Learning, covering both Classification and Regression. In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making.

What is a Decision Tree?

A decision tree is a map of the possible outcomes of a series of related choices. It allows an individual or organization to weigh possible actions against one another based on their costs, probabilities, and benefits.
A decision tree typically starts with a single node, which branches into possible outcomes. Each of those outcomes leads to additional nodes, which branch off into other possibilities. This gives it a tree-like shape.

Advantages of Decision Trees

- Decision trees generate understandable rules.
- Decision trees perform classification without requiring much computation.
- Decision trees are capable of handling both continuous and categorical variables.
- Decision trees provide a clear indication of which features are most important for prediction or classification.

Disadvantages of Decision Trees

- Decision trees are less appropriate for estimation tasks where the goal is to predict the value of a continuous attribute.
- Decision trees are prone to errors in classification problems with many classes and a relatively small number of training examples.
- Decision trees can be computationally expensive to train.

Creating a Decision Tree

Let us consider a scenario where a new planet is discovered by a group of astronomers. Now the question is whether it could be 'the next Earth'. The answer to this question will revolutionize the way people live. There are a number of deciding factors that need to be thoroughly researched to make an intelligent decision, e.g.:

- whether water is present on the planet
- what the temperature is
- whether the surface is prone to continuous storms
- whether flora and fauna survive the climate

Here's the decision tree.

Decision Tree Constituents
A decision tree has the following constituents:

- Root Node: the factor of 'temperature' is considered the root in this case.
- Internal Node: a node with one incoming edge and two or more outgoing edges.
- Leaf Node: a terminal node with no outgoing edge.

Classification Rules

Classification rules are the cases in which all the scenarios are taken into consideration and a class variable is assigned to each.

Class Variable: each leaf node is assigned a class variable; the class variable is the final output that leads to our decision.

Implementation of the Algorithm

When you start to implement the algorithm, the first question is: 'How do we pick the starting test condition?' The answer lies in the values of 'Entropy' and 'Information Gain'.

What are Entropy and Information Gain?

Entropy: entropy in a decision tree measures the homogeneity of a node. If the data is completely homogeneous, the entropy is 0; if the data is divided evenly (50-50%) between two classes, the entropy is 1.

Information Gain: information gain is the decrease in entropy when a node is split on an attribute.

Note: the attribute with the highest information gain is selected for splitting. Based on the computed values of Entropy and Information Gain, we choose the best attribute at any particular step.
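The entropy and information-gain definitions above can be sketched in Python. This is a minimal illustration (not from the original slides); the function names are our own:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels.
    0 for a pure node, 1 for an even 50-50 binary split."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(attr_values, labels):
    """Decrease in entropy obtained by splitting on one attribute:
    parent entropy minus the size-weighted entropy of the children."""
    n = len(labels)
    split_entropy = 0.0
    for v in set(attr_values):
        subset = [lab for a, lab in zip(attr_values, labels) if a == v]
        split_entropy += len(subset) / n * entropy(subset)
    return entropy(labels) - split_entropy
```

At each step the attribute with the largest `information_gain` is chosen as the test condition.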
Creating a Tree (Example)

Sr. No. | A1 | A2 | Class
1       | T  | T  | +
2       | T  | T  | +
3       | T  | F  | -
4       | F  | F  | +
5       | F  | T  | -
6       | F  | T  | -

Tree Construction

Example 2 (Gini Index)

Dealing with Numerical Valued Attributes

Overfitting in DTs

Geometric Intuition of Overfitting

Underfitting
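As a sketch (not from the original slides), the root-attribute choice for the example table above can be worked out by hand; the same snippet also previews the Gini index used in the later examples:

```python
import math

def H(pos, neg):
    """Binary entropy of a node containing `pos` '+' and `neg` '-' examples."""
    total = pos + neg
    return -sum(k / total * math.log2(k / total) for k in (pos, neg) if k)

parent = H(3, 3)  # full set: 3 '+' and 3 '-' -> entropy 1.0

# A1 = T covers rows 1-3 (2 '+', 1 '-'); A1 = F covers rows 4-6 (1 '+', 2 '-')
gain_a1 = parent - (3 / 6) * H(2, 1) - (3 / 6) * H(1, 2)   # ~ 0.082

# A2 = T covers rows 1, 2, 5, 6 (2 '+', 2 '-'); A2 = F covers rows 3, 4
gain_a2 = parent - (4 / 6) * H(2, 2) - (2 / 6) * H(1, 1)   # 0.0

# Gini alternative: impurity = 1 - p_plus^2 - p_minus^2
def gini(pos, neg):
    total = pos + neg
    return 1 - (pos / total) ** 2 - (neg / total) ** 2

gini_gain_a1 = gini(3, 3) - (3 / 6) * gini(2, 1) - (3 / 6) * gini(1, 2)
```

A1 gives the higher information gain (about 0.082 versus 0 for A2), so A1 becomes the root test; the Gini criterion agrees here.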
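The last slide topics (numerical attributes, overfitting, underfitting) can be sketched with scikit-learn. This is an illustrative example, not part of the original slides, and it assumes scikit-learn is installed; it shows that numeric features are handled by threshold splits and that `max_depth` controls the underfitting/overfitting trade-off:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=0)

# A depth-1 "stump" tends to underfit; an unconstrained tree can overfit
# (perfect training accuracy, usually lower test accuracy).
for depth in (1, 3, None):
    clf = DecisionTreeClassifier(criterion="entropy", max_depth=depth,
                                 random_state=0)
    clf.fit(X_train, y_train)
    print(depth, clf.score(X_train, y_train), clf.score(X_test, y_test))

# The learned rules are human-readable threshold tests on numeric features.
print(export_text(clf, feature_names=iris.feature_names))
```

Comparing the train and test scores across depths makes the geometric intuition of over- and underfitting concrete: the decision boundary is a set of axis-aligned threshold cuts, and deeper trees carve ever finer regions around the training points.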