
Machine Learning & Applications

Decision Trees
& ID3 Algorithm
INSTRUCTOR: JAWAD RASHEED
Agenda
▪ How does the decision-making process evolve?
▪ Decision Trees
▪ ID3 Algorithm
▪ Statistical Test
▪ Entropy and Information Gain
▪ Finding the attribute that is the best classifier



Decision-Making
Process



Decision-Making process
● What to do this weekend?
● If my friends are visiting, we will go to see the downtown area of the city.
● If not:
  ● If it's sunny, I'll go to the park and play.
  ● If it's windy and I'm rich, I'll go shopping.
  ● If it's rainy, I'll stay in.



Decision Tree



Decision Tree
● Decision tree learning is one of the most widely used and practical methods for inductive inference

● Decision tree learning is a method for approximating discrete-valued target functions, in which the
learned function is represented by a decision tree.



Decision Tree
● Learned trees can also be re-represented as sets of if-then rules to improve human readability.

● It is robust to noisy data and capable of learning disjunctive expressions.

● It has been successfully applied to a broad range of tasks from learning to diagnose medical cases to
learning to assess the credit risk of loan applicants.



Decision Tree Representation
● Decision trees classify instances by sorting them down the tree from the root to some leaf node,
which provides the classification of the instance.

● Each node in the tree specifies a test of some attribute of the instance.

● An instance is classified by starting at the root node of the tree, testing the attribute specified by this
node, then moving down the tree branch corresponding to the value of the attribute in the given
example. This process is repeated for the subtree rooted at the new node, until a leaf is reached.



Decision Tree Representation

● Test the following instance for the given decision tree:


(Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)
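The slide's tree figure is not reproduced here; as a minimal sketch, the Python below assumes the classic PlayTennis tree (Outlook at the root, Humidity tested under Sunny, Wind tested under Rain) and walks the instance above down to a leaf:

```python
# A decision tree as nested dicts: an internal node maps an attribute
# name to {value: subtree}; a leaf is just a class label.
# Assumption: this is the standard PlayTennis tree, standing in for the
# figure on the slide.
tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(node, instance):
    # Walk from the root: at each internal node, test its attribute and
    # follow the branch matching the instance's value for that attribute.
    while isinstance(node, dict):
        attribute = next(iter(node))          # attribute tested at this node
        node = node[attribute][instance[attribute]]
    return node

instance = {"Outlook": "Sunny", "Temperature": "Hot",
            "Humidity": "High", "Wind": "Strong"}
print(classify(tree, instance))               # -> No
```

With that assumed tree, the instance is sorted down the Outlook = Sunny branch, then the Humidity = High branch, and is classified No.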
Appropriate problems for Decision Tree Learning
● Decision tree learning is generally best suited to problems with the following characteristics:
● Instances are represented by attribute-value pairs
● The target function has discrete output values
● Disjunctive descriptions may be required
● The training data may contain errors
● The training data may contain missing attribute values



ID3 Algorithm



ID3 algorithm
● In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan.
● ID3 learns decision trees by constructing them top-down.
● It begins with the question "which attribute should be tested at the root of the tree?"
● To answer this question, each instance attribute is evaluated using a statistical test.
● Statistical Test: Which attribute is the best?

[Photo caption: Ross Quinlan, School of Computer Science & Engineering, University of New South Wales, Sydney, Australia. Published the ID3 paper in Machine Learning, 1986.]
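As a minimal sketch of this top-down construction (not Quinlan's exact formulation): examples are assumed to be dicts mapping attribute names to values, and the attribute-selection test is passed in as a parameter `choose`, since the statistical test ID3 actually uses (information gain) is defined on the following slides.

```python
from collections import Counter

def id3(examples, attributes, target, choose):
    """Grow a decision tree top-down. `choose` picks the attribute to
    test at each node; ID3 plugs in the information-gain test defined
    on the following slides."""
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:
        return labels[0]                              # pure node -> leaf label
    if not attributes:
        return Counter(labels).most_common(1)[0][0]   # majority-vote leaf
    a = choose(examples, attributes, target)          # attribute for this node
    subtree = {}
    for v in sorted({ex[a] for ex in examples}):      # one branch per value
        subset = [ex for ex in examples if ex[a] == v]
        rest = [b for b in attributes if b != a]
        subtree[v] = id3(subset, rest, target, choose)
    return {a: subtree}
```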



Statistical Test



Statistical test – which attribute is the best?
● We would like to select the attribute that is most useful for classifying examples.

● What is a good quantitative measure of the worth of an attribute?

● We will define a statistical property, called information gain, that measures how well a given
attribute separates the training examples according to their target classification.

● ID3 uses this information gain measure to select among the candidate attributes at each step while
growing the tree.



Entropy measures
● In order to define information gain precisely, we begin by defining a measure commonly used in
information theory, called entropy.

● It characterizes the (im)purity of an arbitrary collection of examples.


Two-class case: $\mathrm{Entropy}(S) \equiv -p_{\oplus} \log_2 p_{\oplus} - p_{\ominus} \log_2 p_{\ominus}$

General case ($c$ classes): $\mathrm{Entropy}(S) \equiv \sum_{i=1}^{c} -p_i \log_2 p_i$



Entropy measures
Two-class case: $\mathrm{Entropy}(S) \equiv -p_{\oplus} \log_2 p_{\oplus} - p_{\ominus} \log_2 p_{\ominus}$

General case ($c$ classes): $\mathrm{Entropy}(S) \equiv \sum_{i=1}^{c} -p_i \log_2 p_i$

● For Example: Suppose S is a collection of 14 examples of some Boolean concept, including 9 positive
and 5 negative examples

$\mathrm{Entropy}([9\oplus, 5\ominus]) = -\tfrac{9}{14} \log_2 \tfrac{9}{14} - \tfrac{5}{14} \log_2 \tfrac{5}{14}$
$= -(0.6429)(-0.6374) - (0.3571)(-1.4854)$
$= 0.4098 + 0.5305$
$= 0.940$
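As a quick check of this arithmetic, a minimal Python sketch (the function name and the counts-as-input convention are choices made here, not from the slides):

```python
import math

def entropy(counts):
    """Entropy of a collection from its per-class counts, e.g. [9, 5]."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total)
                for c in counts if c > 0)   # skip empty classes: 0*log2(0) := 0

print(round(entropy([9, 5]), 3))            # -> 0.94, matching the slide
```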



Entropy measures
● Notice that the entropy is 0 if all members of S belong to the same class.

● If all members are positive, then $p_{\ominus}$ is 0 and (adopting the convention that $0 \cdot \log_2 0 = 0$):

$\mathrm{Entropy}(S) = -1 \cdot \log_2 1 - 0 \cdot \log_2 0 = -1 \cdot 0 - 0 = 0$

● Note: Entropy is 1 when the collection contains an equal number of positive and negative examples.



Information gain
● Given entropy as a measure of the impurity of a collection of training examples, we can now define a measure of the effectiveness of an attribute in classifying the training data.
● The measure we will use is called information gain.
● The information gain, $\mathrm{Gain}(S, A)$, of an attribute $A$ relative to a collection of examples $S$ is defined as

$\mathrm{Gain}(S, A) \equiv \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, \mathrm{Entropy}(S_v)$

where $\mathrm{Values}(A)$ is the set of all possible values of attribute $A$, and $S_v$ is the subset of $S$ for which attribute $A$ has value $v$.
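A minimal Python sketch of this formula (an illustration, not library code; examples are again assumed to be dicts, and the helper names are choices made here):

```python
import math

def entropy_of(examples, target):
    """Entropy(S) over the target attribute's class distribution."""
    counts = {}
    for ex in examples:
        counts[ex[target]] = counts.get(ex[target], 0) + 1
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """Gain(S, A) = Entropy(S) - sum over v of |S_v|/|S| * Entropy(S_v)."""
    total = len(examples)
    remainder = 0.0
    for v in {ex[attribute] for ex in examples}:      # Values(A) seen in S
        subset = [ex for ex in examples if ex[attribute] == v]   # S_v
        remainder += len(subset) / total * entropy_of(subset, target)
    return entropy_of(examples, target) - remainder
```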



Information gain

Day   Outlook    Temperature  Humidity  Wind    PlayTennis
D1    Sunny      Hot          High      Weak    No
D2    Sunny      Hot          High      Strong  No
D3    Overcast   Hot          High      Weak    Yes
D4    Rain       Mild         High      Weak    Yes
D5    Rain       Cool         Normal    Weak    Yes
D6    Rain       Cool         Normal    Strong  No
D7    Overcast   Cool         Normal    Strong  Yes
D8    Sunny      Mild         High      Weak    No
D9    Sunny      Cool         Normal    Weak    Yes
D10   Rain       Mild         Normal    Weak    Yes
D11   Sunny      Mild         Normal    Strong  Yes
D12   Overcast   Mild         High      Strong  Yes
D13   Overcast   Hot          Normal    Weak    Yes
D14   Rain       Mild         High      Strong  No



Example: Information gain calculation

$\mathrm{Gain}(S, A) \equiv \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, \mathrm{Entropy}(S_v)$

Let's do it on the board ☺

Day   Outlook    Temperature  Humidity  Wind    PlayTennis
D1    Sunny      Hot          High      Weak    No
D2    Sunny      Hot          High      Strong  No
D3    Overcast   Hot          High      Weak    Yes
D4    Rain       Mild         High      Weak    Yes
D5    Rain       Cool         Normal    Weak    Yes
D6    Rain       Cool         Normal    Strong  No
D7    Overcast   Cool         Normal    Strong  Yes
D8    Sunny      Mild         High      Weak    No
D9    Sunny      Cool         Normal    Weak    Yes
D10   Rain       Mild         Normal    Weak    Yes
D11   Sunny      Mild         Normal    Strong  Yes
D12   Overcast   Mild         High      Strong  Yes
D13   Overcast   Hot          Normal    Weak    Yes
D14   Rain       Mild         High      Strong  No



Information gain: Training examples
$\mathrm{Entropy}([9\oplus, 5\ominus]) = -\tfrac{9}{14} \log_2 \tfrac{9}{14} - \tfrac{5}{14} \log_2 \tfrac{5}{14} = 0.940$
Day Outlook Temperature Humidity Wind PlayTennis
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
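As a worked sketch, the table can be fed to the `information_gain` function from the earlier slide (that definition is assumed to be in scope here) to score each candidate attribute:

```python
# The 14 PlayTennis examples from the table above.
columns = ["Outlook", "Temperature", "Humidity", "Wind", "PlayTennis"]
rows = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
examples = [dict(zip(columns, r)) for r in rows]

# Reuses information_gain() from the earlier sketch.
for a in ["Outlook", "Humidity", "Wind", "Temperature"]:
    print(a, round(information_gain(examples, a, "PlayTennis"), 3))
# Expected output (to 3 d.p.): Outlook 0.247, Humidity 0.152,
# Wind 0.048, Temperature 0.029 -- so Outlook is chosen as the root.
```

Outlook has the highest gain and becomes the root test, matching the classic worked example in Mitchell's textbook; repeating the same computation on the Sunny subset then selects Humidity for the next node.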



Which attribute is the best classifier?



Information gain



Which attribute should be tested next?



Decision Tree Representation

● Test the following instance for the given decision tree:


(Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)
For queries: [email protected]

