
CSC354 – Machine Learning
Dr Muhammad Sharjeel
Lecture 03: Decision Trees
 The general motive of a Decision Tree (DT) is to create a training model that can predict the class (or value) of the target variable by learning decision rules inferred from prior (training) data
 In a DT, each node represents a feature (attribute), each link (branch) a decision (rule), and each leaf an outcome

 DTs belong to the family of supervised learning algorithms
 They can be used to solve both classification and regression problems
 DTs are transparent algorithms: their decisions can be read and understood

 Algorithm pseudocode
1. Place the best attribute of the dataset (complete training set) at the root of the tree
2. Split the training set into subsets such that each subset contains data with the same value for an attribute
3. Repeat steps 1 and 2 on each subset until leaf nodes are reached in all branches of the tree

 To create a DT
 Shortlist a root node among all the nodes (nodes are ‘features/attributes’ in the dataset)
 Determine the node (attribute) that best classifies the training data and use it as the root
 Repeat the process for each branch

 Three implementations used to create DTs
 ID3
 C4.5
 CART

 ID3 (Iterative Dichotomiser) uses information gain as its metric
 Dichotomisation means dividing something into two completely opposite things
 ID3 iteratively divides attributes into two groups (dominant vs others) to construct a tree
 Dominant attributes are selected based on information gain
 It performs a top-down, greedy search through the space of possible decision trees
 Top-down means it starts building the tree from the top
 Greedy means at each iteration it selects the best feature at the present moment to create a node

 Which attribute (node) best classifies the training data?
 Most dominant attribute would be the one with the highest information gain
 Information gain calculates the reduction in entropy
 Entropy (uncertainty) of a dataset is the measure of disorder in the target attribute
 Information gain therefore measures
 How well a given attribute/feature separates (or classifies) the target classes
 Attribute with the highest information gain is selected as the best one

 Entropy is the measurement of the impurity or randomness in the values of a dataset
 Low (or no) disorder implies a low level of impurity
 Values lie between 0 and 1 (for a binary target); a ‘1’ signifies the highest level of disorder, i.e., maximum impurity

 Formulae to calculate Entropy and Information Gain
 Entropy(S) = ∑ −p(i) . log2 p(i), summed over the classes i
 Gain(S, A) = Entropy(S) − ∑v (|Sv|/|S|) . Entropy(Sv), summed over the values v of attribute A
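
As a concrete illustration, here is a minimal Python sketch of these two formulas, assuming the dataset is represented as a list of (feature-dict, label) pairs; the function and variable names are illustrative, not from the slides:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Entropy(S): sum over classes i of -p(i) * log2 p(i)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(examples, attribute):
    """Gain(S, A): Entropy(S) minus the weighted entropy of each subset S_v."""
    labels = [label for _, label in examples]
    gain = entropy(labels)
    for value in {features[attribute] for features, _ in examples}:
        subset = [label for features, label in examples
                  if features[attribute] == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain
```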

 Compute the entropy [Entropy(S)] for the entire dataset
 For each attribute/feature:
 Calculate entropy [Entropy(A)] for each value of the attribute
 Calculate average information entropy (IE) for the attribute
 Calculate information gain (IG) for the attribute
 Pick the highest gain attribute
 Repeat until the complete tree is formed
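
The loop above can be written as a short recursive sketch, continuing the previous snippet's helpers; this is an illustrative reading of the procedure, not a complete ID3 implementation:

```python
from collections import Counter

def id3(examples, attributes):
    """Recursively build a tree as nested dicts: {attribute: {value: subtree}}."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:            # pure subset -> leaf node
        return labels[0]
    if not attributes:                   # nothing left to split on -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a))
    tree = {best: {}}
    for value in {features[best] for features, _ in examples}:
        subset = [(f, l) for f, l in examples if f[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining)
    return tree
```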

 Example dataset, 14 instances, 4 input attributes
No. Outlook Temperature Humidity Wind PlayGolf
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No

 Compute the entropy [Entropy(S)] for the entire dataset
 Entropy(S) = −p(Yes) . log2 p(Yes) − p(No) . log2 p(No)
 Entropy(S) = −(9/14) . log2(9/14) − (5/14) . log2(5/14) = 0.940
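
A quick check with the entropy() sketch from earlier (9 ‘Yes’ and 5 ‘No’ labels):

```python
print(round(entropy(["Yes"] * 9 + ["No"] * 5), 3))  # prints 0.94
```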

 For each attribute/feature: (let's say, Outlook)
 Calculate entropy [Entropy(A)] for each value of the attribute, i.e., in case of Outlook: 'Sunny', 'Rain', 'Overcast'

Outlook = Sunny: No, No, No, Yes, Yes
Outlook = Rain: Yes, Yes, No, Yes, No
Outlook = Overcast: Yes, Yes, Yes, Yes

Calculations for Outlook (Sunny):
Entropy(Sunny) = −(2/5) . log2(2/5) − (3/5) . log2(3/5)
             = −(0.4) . (−1.322) − (0.6) . (−0.737)
             = 0.529 + 0.442 = 0.971

Outlook    Positive  Negative  Entropy
Sunny      2         3         0.971
Rain       3         2         0.971
Overcast   4         0         0

 For each attribute/feature:
 Calculate average information entropy (IE) for the attribute (i.e., Outlook)
 IE(Outlook) = ((2+3)/(9+5)) . 0.971 + ((3+2)/(9+5)) . 0.971 + ((4+0)/(9+5)) . 0
 IE(Outlook) = 0.693

 Calculate information gain (IG) for the attribute (i.e., Outlook)
 IG(Outlook) = 0.940 − 0.693 = 0.247
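
The same number falls out of the information_gain() sketch from earlier; the Outlook column and PlayGolf labels below are copied from the table above:

```python
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
           "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]
examples = [({"Outlook": o}, p) for o, p in zip(outlook, play)]
print(round(information_gain(examples, "Outlook"), 3))  # prints 0.247
```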

 Pick the highest gain attribute, in this case, Outlook

Attribute     Gain
Outlook       0.247
Temperature   0.029
Humidity      0.152
Wind          0.048

Outlook therefore becomes the root node.

 Outlook (Overcast) only contains examples of ‘Yes’
 Outlook (Sunny, Rain) contains both ‘Yes’ and ‘No’ examples

Outlook
Sunny → ?   Overcast → Yes   Rain → ?

 Repeat until the complete tree is formed

 Outlook (Overcast) only contains examples of ‘Yes’
 Outlook (Sunny, Rain) contains both ‘Yes’ and ‘No’ examples

Outlook
Sunny → ?   Overcast → Yes   Rain → ?

Outlook (Sunny) subset:
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Sunny Mild Normal Strong Yes

Outlook (Rain) subset:
Outlook Temperature Humidity Wind PlayGolf
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Rain Mild Normal Weak Yes
Rain Mild High Strong No

 Outlook (Sunny)
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Sunny Mild Normal Strong Yes

 Entropy(S) = 0.971
 Entropy(A)[Temperature](Cool) = 0
 Entropy(A)[Temperature](Hot) = 0
 Entropy(A)[Temperature](Mild) = 1
 IE(Temperature) = 0.400
 IG(Temperature) = 0.571

 Outlook (Sunny)
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Sunny Mild Normal Strong Yes

 Entropy(S) = 0.971
 Entropy(A)[Humidity](High) = 0
 Entropy(A)[Humidity](Normal) = 0
 IE(Humidity) = 0
 IG(Humidity) = 0.971

 Outlook (Sunny)
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Sunny Mild Normal Strong Yes

 Entropy(S) = 0.971
 Entropy(A)[Wind](Strong) = 1
 Entropy(A)[Wind](Weak) = 0.918
 IE(Wind) = 0.951
 IG(Wind) = 0.020

 Pick the highest gain attribute, in this case, Humidity

Outlook
Sunny → Humidity   Overcast → Yes   Rain → ?
Humidity: Normal → Yes, High → No

 Outlook (Rain)
Outlook Temperature Humidity Wind PlayGolf
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Rain Mild Normal Weak Yes
Rain Mild High Strong No

 Entropy(S) = 0.971
 Entropy(A)[Temperature](Cool) = 1
 Entropy(A)[Temperature](Mild) = 0.918
 IE(Temperature) = 0.951
 IG(Temperature) = 0.020

 Outlook (Rain)
Outlook Temperature Humidity Wind PlayGolf
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Rain Mild Normal Weak Yes
Rain Mild High Strong No

 Entropy(S) = 0.971
 Entropy(A)[Humidity](High) = 1
 Entropy(A)[Humidity](Normal) = 0.918
 IE(Humidity) = 0.951
 IG(Humidity) = 0.020

 Outlook (Rain)
Outlook Temperature Humidity Wind PlayGolf
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Rain Mild Normal Weak Yes
Rain Mild High Strong No

 Entropy(S) = 0.971
 Entropy(A)[Wind](Weak) = 0
 Entropy(A)[Wind](Strong) = 0
 IE(Wind) = 0
 IG(Wind) = 0.971

 Pick the highest gain attribute, in this case, Wind

Outlook
Sunny → Humidity   Overcast → Yes   Rain → Wind
Humidity: Normal → Yes, High → No
Wind: Weak → Yes, Strong → No

 Use the final DT (ID3) to classify an unseen example
 Outlook = Sunny, Temperature = Cool, Humidity = High, Wind = Strong
 Output = No

 Shortcomings of ID3
 Information gain only measures the reduction in entropy due to the selection of a particular attribute
 It is biased toward attributes with a large number of distinct values, which might lead to overfitting
 The tree continues to go deeper and deeper (builds many branches) to reduce the training error, but this results in an increased test error
 Overfitting: the model fits the training data well but fails to generalize
 Underfitting: the model is too simple to find the patterns in the data

 Improving ID3
 Pruning is a mechanism that reduces the size and complexity of a DT by removing unnecessary nodes
 Pre-pruning stops the tree construction a bit early
 Do not split a node if its goodness measure is below a threshold value
 Post-pruning: once a DT is complete, cross-validation is performed to test whether expanding a node makes an improvement
 If it shows an improvement, continue expanding the node
 If it shows a reduction in accuracy, the node is converted to a leaf node
 To overcome the problems with information gain, the information gain ratio is used (C4.5)

 C4.5 is the improved version of ID3
 Creates more generalized models
 Works with continuous data
 Can handle missing data
 Avoids overfitting
 Also known as J48 (a Java implementation of C4.5 release 8 in the Weka tool)
 Uses the information gain ratio as the metric to split the dataset
 Information gain (used in ID3) tends to prefer attributes with more categories
 Splitting on such attributes yields subsets with lower entropy
 This results in overfitting
 Gain ratio mitigates this issue by penalising attributes with more categories
 It uses split information (or intrinsic information)

 Information gain ratio
 GainRatio(A) = Gain(A) / SplitInfo(A)
 Split information
 SplitInfo(A) = −∑ (|Dj|/|D|) . log2(|Dj|/|D|), summed over the partitions Dj of D
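
A minimal sketch of these two formulas, reusing the entropy()/information_gain() helpers from the ID3 snippets (names illustrative):

```python
from collections import Counter
from math import log2

def split_info(examples, attribute):
    """SplitInfo(A): -sum over values j of (|Dj|/|D|) * log2(|Dj|/|D|)."""
    total = len(examples)
    counts = Counter(features[attribute] for features, _ in examples)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def gain_ratio(examples, attribute):
    """GainRatio(A) = Gain(A) / SplitInfo(A)."""
    return information_gain(examples, attribute) / split_info(examples, attribute)
```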

 Example dataset, 14 instances, 4 input attributes
No. Outlook Temperature Humidity Wind PlayGolf
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No

 Split information for the Outlook attribute
 Sunny = 5, Overcast = 4, Rain = 5
 SplitInfo(Outlook) = −(5/14).log2(5/14) − (4/14).log2(4/14) − (5/14).log2(5/14) = 1.577
 GainRatio(Outlook) = 0.247/1.577 = 0.156

 The entropy of the whole dataset, the Outlook attribute entropy, and the information gain of Outlook were already calculated (ID3)
 Entropy(S) = 0.940
 IE(Outlook) = 0.693
 IG(Outlook) = 0.940 − 0.693 = 0.247
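
Reusing the examples list built in the ID3 section, the sketches above reproduce these values:

```python
print(round(split_info(examples, "Outlook"), 3))  # prints 1.577
print(round(gain_ratio(examples, "Outlook"), 3))  # prints 0.156
```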

 Gain ratio for Temperature attribute
 Hot = 4, Mild = 6, Cool = 4
 SplitInfo(Temperature) = - (4/14).log2(4/14) - (6/14).log2(6/14) - (4/14).log2(4/14) = 1.556
 GainRatio(Temperature) = 0.029/1.556 = 0.018
 Gain ratio for Humidity attribute
 High = 7, Normal = 7
 SplitInfo(Humidity) = - (7/14).log2(7/14) - (7/14).log2(7/14) = 1
 GainRatio(Humidity) = 0.152/1 = 0.152
 Gain ratio for Wind attribute
 Weak = 8, Strong = 6
 SplitInfo(Wind) = - (8/14).log2(8/14) - (6/14).log2(6/14) = 0.985
 GainRatio(Wind) = 0.048/0.985 = 0.048

 Gain ratio of Outlook is the highest, so it will be the root node

Outlook
Sunny → ?   Overcast → Yes   Rain → ?

Outlook (Sunny) subset:
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Sunny Mild Normal Strong Yes

Outlook (Rain) subset:
Outlook Temperature Humidity Wind PlayGolf
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Rain Mild Normal Weak Yes
Rain Mild High Strong No

 Outlook (Sunny)
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Sunny Mild Normal Strong Yes

 GainRatio(Temperature) = 0.571/1.521 = 0.375
 GainRatio(Humidity) = 0.971/0.971 = 1
 GainRatio(Wind) = 0.020/0.971 = 0.021

 Outlook (Rain)
Outlook Temperature Humidity Wind PlayGolf
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Rain Mild Normal Weak Yes
Rain Mild High Strong No

 GainRatio(Temperature) = 0.020/0.971 = 0.021
 GainRatio(Humidity) = 0.020/0.971 = 0.021
 GainRatio(Wind) = 0.971/0.971 = 1

 Final DT using C4.5

Outlook
Sunny → Humidity   Overcast → Yes   Rain → Wind
Humidity: Normal → Yes, High → No
Wind: Weak → Yes, Strong → No

 Use the final DT (C4.5) to classify an unseen example
 Outlook = Rain, Temperature = Cool, Humidity = High, Wind = Weak
 Output = Yes

 Some drawbacks of C4.5
 Split information is higher for multi-valued attributes (more outcomes)
 It tends to prefer unbalanced splits in which one partition is much smaller than the others
 Classification And Regression Tree (CART) uses the gini index as its metric
 If a dataset D contains examples from n classes, the gini index is defined as
 Gini(D) = 1 − Σ (pi)^2, for i = 1 to n (number of classes)
 CART creates a binary tree
 If an attribute has more than two outcomes, its values are grouped into two subsets (D1, D2) and the gini index of the split is
 GiniA(D) = (|D1|/|D|) . Gini(D1) + (|D2|/|D|) . Gini(D2)
 Reduction in impurity
 Gini(A) = Gini(D) − GiniA(D)
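
A hedged sketch of these gini computations (illustrative names, same conventions as the earlier snippets):

```python
from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum over classes i of p_i^2."""
    total = len(labels)
    return 1 - sum((n / total) ** 2 for n in Counter(labels).values())

def gini_split(d1_labels, d2_labels):
    """Weighted gini index of a binary partition D1 | D2."""
    total = len(d1_labels) + len(d2_labels)
    return (len(d1_labels) / total) * gini(d1_labels) \
         + (len(d2_labels) / total) * gini(d2_labels)
```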

 Example dataset, 14 instances, 4 input attributes
No. Outlook Temperature Humidity Wind PlayGolf
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
3 Overcast Hot High Weak Yes
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
7 Overcast Cool Normal Strong Yes
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
11 Sunny Mild Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
14 Rain Mild High Strong No

 Total 14 examples, 9 positive, 5 negative
 Gini(D) = 1 − ((9/14)^2 + (5/14)^2) = 0.459
 Compute the gini index of each attribute
 Start with Outlook (Sunny, Overcast, Rain)
 The attribute has three values, so it will have 6 subsets
 {(Sunny, Overcast), (Overcast, Rain), (Sunny, Rain), (Sunny), (Overcast), (Rain)}
 The empty and full subsets are not used
 Gini(S,O), R = (9/14) x [1 − ((6/9)^2 + (3/9)^2)] + (5/14) x [1 − ((3/5)^2 + (2/5)^2)] = 0.457
 Gini(O,R), S = (9/14) x [1 − ((7/9)^2 + (2/9)^2)] + (5/14) x [1 − ((2/5)^2 + (3/5)^2)] = 0.393
 Gini(S,R), O = (10/14) x [1 − ((5/10)^2 + (5/10)^2)] + (4/14) x [1 − ((4/4)^2 + (0/4)^2)] = 0.357
 Gini(A) = 0.459 − 0.357 = 0.102
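
For instance, the (Sunny, Rain) | (Overcast) partition above can be checked with the sketch: ten examples with 5 Yes / 5 No against four examples that are all Yes:

```python
print(round(gini(["Yes"] * 9 + ["No"] * 5), 3))                     # prints 0.459
print(round(gini_split(["Yes"] * 5 + ["No"] * 5, ["Yes"] * 4), 3))  # prints 0.357
```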

 Temperature (Hot, Mild, Cool)
 The attribute has three values, so it will have 6 subsets
 {(Hot, Mild), (Hot, Cool), (Mild, Cool), (Hot), (Mild), (Cool)}
 Gini(H,M), C = (10/14) x [1 − ((6/10)^2 + (4/10)^2)] + (4/14) x [1 − ((3/4)^2 + (1/4)^2)] = 0.450
 Gini(H,C), M = (8/14) x [1 − ((5/8)^2 + (3/8)^2)] + (6/14) x [1 − ((4/6)^2 + (2/6)^2)] = 0.458
 Gini(M,C), H = (10/14) x [1 − ((7/10)^2 + (3/10)^2)] + (4/14) x [1 − ((2/4)^2 + (2/4)^2)] = 0.442
 Gini(A) = 0.459 − 0.442 = 0.016

 Humidity (High, Normal)
 The attribute has only two values
 Gini(H, N) = (7/14) x [1 − ((6/7)^2 + (1/7)^2)] + (7/14) x [1 − ((3/7)^2 + (4/7)^2)] = 0.367
 Gini(A) = 0.459 − 0.367 = 0.092

 Wind (Weak, Strong)
 The attribute has only two values
 Gini(W, S) = (8/14) x [1 − ((6/8)^2 + (2/8)^2)] + (6/14) x [1 − ((3/6)^2 + (3/6)^2)] = 0.428
 Gini(A) = 0.459 − 0.428 = 0.031

 The attribute with the highest impurity reduction, Gini(A), is Outlook; hence, it is chosen as the root node
 Within Outlook, the partition [(Sunny, Rain), (Overcast)] [Gini(S,R), O] has the lowest gini index, so it defines the binary split

 Partial DT using CART

Outlook
(Sunny, Rain) → ?   Overcast → Yes

Outlook (Sunny, Rain) subset:
Outlook Temperature Humidity Wind PlayGolf
Sunny Hot High Weak No
Sunny Hot High Strong No
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Rain Mild High Strong No

 Calculate the gini index for the following subset Outlook (Sunny, Rain)

Outlook Temperature Humidity Wind PlayGolf


Sunny Hot High Weak No
Sunny Hot High Strong No
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Rain Mild High Strong No

 Information Gain: biased toward high-branching features (many distinct values)
 Gain Ratio: prefers splits in which some partitions are much smaller than the others
 Gini Index: balanced around 0.5 (its maximum for a two-class split, reached when the classes are evenly mixed)

 C4.5 with continuous (numeric) data
 Example dataset, 14 instances, 4 input attributes, 2 attributes with continuous data
No. Outlook Temperature Humidity Wind PlayGolf
1 Sunny 85 85 Weak No
2 Sunny 80 90 Strong No
3 Overcast 83 78 Weak Yes
4 Rain 70 96 Weak Yes
5 Rain 68 80 Weak Yes
6 Rain 65 70 Strong No
7 Overcast 64 65 Strong Yes
8 Sunny 72 95 Weak No
9 Sunny 69 70 Weak Yes
10 Rain 75 80 Weak Yes
11 Sunny 75 70 Strong Yes
12 Overcast 72 90 Strong Yes
13 Overcast 81 75 Weak Yes
14 Rain 71 80 Strong No

 Outlook and Wind are nominal attributes
 Gain ratio for Wind = 0.048
 Gain ratio for Outlook = 0.156
 Humidity and Temperature are continuous attributes
 Convert continuous values to nominal ones
 Perform binary split based on a threshold value
 Threshold should be a value which offers maximum gain for an attribute

 Separate the dataset into two parts
 Instances less than or equal to the threshold (<=)
 Instances greater than the threshold (>)
 How?
 Sort the attribute values in ascending order
 Calculate the gain ratio for every candidate value
 The value which maximizes the gain is chosen as the threshold (separator)
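
A minimal sketch of this threshold search, reusing entropy() from the earlier snippets; it scores each candidate by information gain, as the slides do (names illustrative). Applied to the Humidity column on the next slide, it returns a threshold of 80:

```python
def best_threshold(values, labels):
    """Return (threshold, gain) maximizing the gain of a <= / > binary split."""
    base = entropy(labels)
    best_gain, best_t = -1.0, None
    for t in sorted(set(values))[:-1]:   # the largest value would leave the '>' side empty
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        gain = base - (len(left) / len(labels)) * entropy(left) \
                    - (len(right) / len(labels)) * entropy(right)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain
```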

 Sort the Humidity values smallest to largest
Humidity PlayGolf

65 Yes
70 No
70 Yes
70 Yes
75 Yes
78 Yes
80 Yes
80 Yes
80 No
85 No
90 No
90 Yes
95 No
96 Yes

 Humidity (65)
 Entropy(Humidity<=65) = 0 (the single instance with Humidity <= 65 is pure)
 Entropy(Humidity>65) = −(5/13).log2(5/13) − (8/13).log2(8/13) = 0.961
 Gain(Humidity<=,>65) = 0.940 − (1/14).0 − (13/14).(0.961) = 0.048
 SplitInfo(Humidity<=,>65) = −(1/14).log2(1/14) − (13/14).log2(13/14) = 0.371
 GainRatio(Humidity<=,>65) = 0.126
 Humidity (70)
 Entropy(Humidity<=70) = – (1/4).log2(1/4) – (3/4).log2(3/4) = 0.811
 Entropy(Humidity>70) = – (4/10).log2(4/10) – (6/10).log2(6/10) = 0.970
 Gain(Humidity<=,> 70) = 0.940 – (4/14).(0.811) – (10/14).(0.970) = 0.014
 SplitInfo(Humidity<=,> 70) = -(4/14).log2(4/14) -(10/14).log2(10/14) = 0.863
 GainRatio(Humidity<=,> 70) = 0.016

 GainRatio(Humidity<=,> 75) = 0.047
 GainRatio(Humidity <=,> 78) = 0.090
 GainRatio(Humidity <=,> 80) = 0.107
 GainRatio(Humidity <=,> 85) = 0.027
 GainRatio(Humidity <=,> 90) = 0.016
 GainRatio(Humidity <=,> 95) = 0.128

 No gain ratio is calculated for Humidity (96) because no instance has a value greater than it, so the split would be trivial
 Gain is maximum when the threshold is Humidity (80)

 Apply the same process to Temperature as its values are continuous too
 Gain is maximum when the threshold is Temperature (83)
 GainRatio(Temperature<=,>83) = 0.305
 The gain ratio for all the attributes is summarized in the following table

Attribute            GainRatio
Wind                 0.048
Outlook              0.156
Humidity <=,>80      0.107
Temperature <=,>83   0.305

 Temperature will be the root node as it has the highest gain ratio value
 Can you build the complete DT?

 Famous DT implementations
 CHAID (1980)
 CART (1984)
 ID3 (1986)
 C4.5 (1993)

 CHAID (CHi-square Automatic Interaction Detection)
 Uses chi-square tests to find the most dominant feature
 It checks whether there is a relationship between two variables and chooses the independent variable that has the strongest interaction with the dependent variable
 Chi-square value of a cell = √((y − y')^2 / y'), where y is the actual and y' the expected value

 How to construct a DT using CHAID?
 Find the most dominant feature in the dataset (here, chi-square is computed on the Yes/No PlayGolf decisions for the same 14 instances)
No. Outlook Temperature Humidity Wind Hour-Played
1 Sunny Hot High Weak 25
2 Sunny Hot High Strong 30
3 Overcast Hot High Weak 46
4 Rain Mild High Weak 45
5 Rain Cool Normal Weak 52
6 Rain Cool Normal Strong 23
7 Overcast Cool Normal Strong 43
8 Sunny Mild High Weak 35
9 Sunny Cool Normal Weak 38
10 Rain Mild Normal Weak 46
11 Sunny Mild Normal Strong 48
12 Overcast Mild High Strong 52
13 Overcast Hot Normal Weak 44
14 Rain Mild High Strong 30

 Outlook
 3 possible values (Sunny, Rain, and Overcast)
 2 decisions (Yes and No)
 Chi-square(Yes, Sunny) = √((2 − 2.5)^2 / 2.5) = 0.316

Value     Yes  No  Total  Expected  Chi-square (Yes)  Chi-square (No)
Sunny     2    3   5      2.5       0.316             0.316
Rain      3    2   5      2.5       0.316             0.316
Overcast  4    0   4      2         1.414             1.414

 Chi-square (Outlook) = 0.316 + 0.316 + 0.316 + 0.316 + 1.414 + 1.414 = 4.092
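
A small sketch reproducing this computation; the expected counts follow the slides' convention of an even Yes/No split per attribute value (names illustrative; the slide's 4.092 is the same value truncated):

```python
from math import sqrt

def chaid_chi_square(counts):
    """Sum sqrt((observed - expected)^2 / expected) over the Yes/No cells."""
    total_chi = 0.0
    for yes, no in counts:
        expected = (yes + no) / 2        # even split assumed, as in the table
        total_chi += sqrt((yes - expected) ** 2 / expected)
        total_chi += sqrt((no - expected) ** 2 / expected)
    return total_chi

print(round(chaid_chi_square([(2, 3), (3, 2), (4, 0)]), 3))  # prints 4.093
```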

 Outlook = 0.316+0.316+1.414+1.414+0.316+0.316 = 4.092
 Temperature = 0 + 0 + 0.577 + 0.577 + 0.707 + 0.707 = 2.569
 Humidity = 0.267 + 0.267 + 1.336 + 1.336 = 3.207
 Wind = 0.802 + 0.802 + 0 + 0 = 1.604

 Outlook has the highest chi-square value (most significant feature) and will be
the root node
 Can you build the complete DT?

 How to construct a DT when the output attribute is a numeric value?
 Regression problems are solved by using the metric ‘standard deviation’
No. Outlook Temperature Humidity Wind Hour-Played
1 Sunny Hot High Weak 25
2 Sunny Hot High Strong 30
3 Overcast Hot High Weak 46
4 Rain Mild High Weak 45
5 Rain Cool Normal Weak 52
6 Rain Cool Normal Strong 23
7 Overcast Cool Normal Strong 43
8 Sunny Mild High Weak 35
9 Sunny Cool Normal Weak 38
10 Rain Mild Normal Weak 46
11 Sunny Mild Normal Strong 48
12 Overcast Mild High Strong 52
13 Overcast Hot Normal Weak 44
14 Rain Mild High Strong 30

 Regression problems are solved by using the metric ‘standard deviation’
 Hours-Played = {25, 30, 46, 45, 52, 23, 43, 35, 38, 46, 48, 52, 44, 30}
 Average = 39.78
 Standard deviation = 9.32

 Outlook (SD of Hours-Played per value)
 Overcast = 3.49
 Rain = 10.87
 Sunny = 7.78
 Weighted SD (Outlook) = (4/14) x 3.49 + (5/14) x 10.87 + (5/14) x 7.78 = 7.66
 SD reduction (Outlook) = 9.32 − 7.66 = 1.66
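
A short sketch of the standard-deviation-reduction metric, using Python's statistics.pstdev (population SD, which matches the slide's numbers); the group values are copied from the dataset above:

```python
from statistics import pstdev

def sd_reduction(groups):
    """SD of all target values minus the size-weighted SD of each value group."""
    everything = [v for group in groups for v in group]
    weighted = sum((len(g) / len(everything)) * pstdev(g) for g in groups)
    return pstdev(everything) - weighted

overcast = [46, 43, 52, 44]   # Outlook = Overcast
rain = [45, 52, 23, 46, 30]   # Outlook = Rain
sunny = [25, 30, 35, 38, 48]  # Outlook = Sunny
print(round(sd_reduction([overcast, rain, sunny]), 2))  # prints 1.66
```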

 Regression problems are solved by using the metric ‘standard deviation’
 SD reduction (Outlook) = 9.32 – 7.66 = 1.66
 SD reduction (Temperature) = 9.32 – 8.84 = 0.47
 SD reduction (Humidity) = 9.32 – 9.04 = 0.27
 SD reduction (Wind) = 9.32 – 9.03 = 0.29

 Outlook will be the root node as it has the highest SD reduction value
 Can you build the complete DT?

Thanks
