How to compute the complexity parameter α?: Study Notes of CART

The document discusses how to compute the complexity parameter (α) for CART decision tree models. It presents a formula to calculate α as the ratio of the difference between the risk of a node (R(t)) and its subtree (R(Tt)) to the number of terminal nodes in the subtree minus one. It proves this formula works by showing that increasing α increases the risk of subtrees faster than individual nodes, until their risks are equal. An example calculation on a sample dataset with 5 terminal nodes demonstrates applying the formula to compute α for each node.

Study Notes of CART (2):

How to compute the complexity parameter α?


by Yin Zhao
School of Mathematical Sciences
USM, Penang, Malaysia
December 2013

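Proposition: For an internal node $t$ with branch $T_t$, the complexity parameter $\alpha$ (known as cp in the R "rpart" package, where it is rescaled by the risk of the root node) is

$$\alpha = \frac{R(t) - R(T_t)}{|\tilde{T}_t| - 1}$$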

Proof:
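Recall the definition $R_\alpha(T) = R(T) + \alpha|\tilde{T}|$, where $|\tilde{T}|$ is the number of terminal nodes of $T$, and let $T_t$ denote the branch of $T$ rooted at node $t$. For any single node $t \in T$, we have

$$R_\alpha(t) = R(t) + \alpha \qquad (1)$$

since a single node has exactly one terminal node.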


Similarly, for any branch $T_t$ of $T$, we have

$$R_\alpha(T_t) = R(T_t) + \alpha|\tilde{T}_t| \qquad (2)$$

When $\alpha = 0$, $R_0(t) = R(t) > R(T_t) = R_0(T_t)$. This inequality is guaranteed because the first step of pruning is to prune off every pair of terminal nodes $t_L$, $t_R$ whose parent $t$ satisfies $R(t) = R(t_L) + R(t_R)$. That is, each remaining internal node must satisfy $R(t) > R(t_L) + R(t_R)$ (the details can be found in the previous notes). Furthermore, the inequality still holds for sufficiently small $\alpha$.

If we then gradually increase $\alpha$, $R_\alpha(T_t)$ increases faster than $R_\alpha(t)$, since its coefficient $|\tilde{T}_t|$ is greater than 1. In other words, at a certain $\alpha$ we will have $R_\alpha(T_t) = R_\alpha(t)$. Equating (1) and (2), we have

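$$
\begin{aligned}
& R(t) + \alpha = R(T_t) + \alpha|\tilde{T}_t| \\
\Rightarrow\; & (|\tilde{T}_t| - 1) \cdot \alpha = R(t) - R(T_t) \\
\Rightarrow\; & \alpha = \frac{R(t) - R(T_t)}{|\tilde{T}_t| - 1}
\end{aligned}
$$

as desired.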
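
The result can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names `cost_complexity` and `critical_alpha` and the numbers $R(t) = 3/10$, $R(T_t) = 1/10$, $|\tilde{T}_t| = 3$ are made up for illustration. It confirms that the branch is cheaper below the critical $\alpha$, the two costs agree at it, and the single node is cheaper above it.

```python
from fractions import Fraction


def cost_complexity(risk, n_leaves, alpha):
    """R_alpha = R + alpha * (number of terminal nodes)."""
    return risk + alpha * n_leaves


def critical_alpha(risk_node, risk_branch, n_leaves):
    """alpha at which R_alpha(t) equals R_alpha(T_t)."""
    if n_leaves < 2:
        raise ValueError("a branch must have at least two terminal nodes")
    return (risk_node - risk_branch) / (n_leaves - 1)


# Hypothetical values, chosen only for illustration.
r_node, r_branch, leaves = Fraction(3, 10), Fraction(1, 10), 3
alpha = critical_alpha(r_node, r_branch, leaves)

# A single node counts as one terminal node, as in equation (1).
below, above = alpha / 2, alpha * 2
branch_wins_below = cost_complexity(r_branch, leaves, below) < cost_complexity(r_node, 1, below)
tie_at_alpha = cost_complexity(r_branch, leaves, alpha) == cost_complexity(r_node, 1, alpha)
node_wins_above = cost_complexity(r_branch, leaves, above) > cost_complexity(r_node, 1, above)

print(alpha, branch_wins_below, tie_at_alpha, node_wins_above)
```

Exact fractions are used instead of floats so that the equality at the critical $\alpha$ can be tested without rounding error.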
Example:
This example shows how to calculate the complexity parameter $\alpha$ (see Figure 1 below). The data set has 2 classes, say A and B, and 200 samples in all. $T_1$ is a subtree of the whole tree $T$ with 5 terminal nodes, say $t_5$, $t_6$, $t_7$, $t_8$, and $t_9$.

Figure 1: Subtree $T_1$ has 5 leaves

According to the formula, we have

$$\alpha(T_1(t_1)) = \frac{R(t_1) - R(T_{t_1})}{5 - 1} = \frac{100/200 - 0}{4} = 1/8$$

$$\alpha(T_1(t_2)) = \frac{R(t_2) - R(T_{t_2})}{3 - 1} = \frac{10/200 - 0}{2} = 1/40$$

$$\alpha(T_1(t_3)) = \frac{R(t_3) - R(T_{t_3})}{2 - 1} = \frac{60/200 - 0}{1} = 3/10$$

$$\alpha(T_1(t_4)) = \frac{R(t_4) - R(T_{t_4})}{2 - 1} = \frac{2/200 - 0}{1} = 1/100$$

$\alpha(T_1(t_4))$ is the first value of $\alpha$ because it is the smallest of the four. That is, we prune the tree below node $t_4$, so that $t_4$ becomes a terminal node. The same calculation is then repeated on the pruned tree to find the next value of $\alpha$, and the tree is pruned once again.
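
The example can be reproduced with a short Python sketch. The first part uses only the numbers given above. The second part illustrates the next iteration and rests on an assumption that the text does not state: that $t_2$ and $t_3$ are the children of $t_1$ and that $t_4$ is a child of $t_2$ (this shape is consistent with the leaf counts 5, 3, 2, 2, but it is not confirmed here). The names `critical_alpha`, `first_pass` and `second_pass` are illustrative only.

```python
from fractions import Fraction

N = 200  # total number of samples in the example


def critical_alpha(err_node, err_branch, n_leaves, n_samples=N):
    """(R(t) - R(T_t)) / (|T~_t| - 1), with risks given as counts out of n_samples."""
    return Fraction(err_node - err_branch, n_samples) / (n_leaves - 1)


def weakest_link(table):
    """Return ({node: alpha}, node with the smallest alpha)."""
    alphas = {name: critical_alpha(*vals) for name, vals in table.items()}
    return alphas, min(alphas, key=alphas.get)


# node -> (numerator of R(t), numerator of R(T_t), number of leaves in T_t)
first_pass = {
    "t1": (100, 0, 5),
    "t2": (10, 0, 3),
    "t3": (60, 0, 2),
    "t4": (2, 0, 2),
}
alphas_1, prune_1 = weakest_link(first_pass)

# ASSUMED tree shape (see the lead-in): t4 lies under t2, which lies under t1.
# After pruning below t4, t4 is a leaf with risk 2/200, so every branch that
# contains t4 gains 2 in its numerator and loses one leaf.
second_pass = {
    "t1": (100, 2, 4),
    "t2": (10, 2, 2),
    "t3": (60, 0, 2),
}
alphas_2, prune_2 = weakest_link(second_pass)

for name, a in alphas_1.items():
    print("pass 1", name, a)
print("pass 1 prune below", prune_1)
for name, a in alphas_2.items():
    print("pass 2", name, a)
print("pass 2 prune below", prune_2)
```

Under the assumed shape, the second pass would select $t_2$, since its value $(10/200 - 2/200)/(2 - 1) = 1/25$ is the smallest remaining one. A different tree shape would change the second-pass table but not the first.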
