0% found this document useful (0 votes)

12 views14 pages

Merging Result-Merged

This homework assignment involves building classification models using Naive Bayes and decision trees. It provides a dataset for each modeling task and asks students to: 1) Estimate probabilities and build a Naive Bayes model to classify data points from the given dataset. 2) Build a decision tree model for another classification dataset using a simplified CART algorithm. Students are asked to calculate subsets, Gini indices, and make predictions using the decision tree. 3) The assignment is due on May 30th, 2023 and is out of 100 total points. It asks students to include their code and figures in a single PDF submission.

Uploaded by

ellison0930

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

12 views14 pages

Merging Result-Merged

Uploaded by

ellison0930

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

Homework Assignment 4

CSE 151A: Introduction to Machine Learning

Due: May 30th, 2023, 9:30am (Pacific Time)

Instructions: Please answer the questions below, attach your code in the document, and
insert figures to create a single PDF file. You may search information online but you will
need to write code/find solutions to answer the questions yourself.

Grade: out of 100 points

1 (40 points) Naı̈ve Bayes

In this question, we would like to build a Naı̈ve Bayes model for a classification task. Assume
there is a classification dataset S = {(x(i) , y (i) ), i = 1, ..., 8} where each data point (x, y)
contains a feature vector x = (x1 , x2 , x3 ); x1 , x2 , x3 2 {0, 1} and a ground-truth label y 2
{0, 1}. The dataset S can be read from the table below:

i x1 x2 x3 y
1 0 0 1 1
2 0 1 1 1
3 1 1 0 1
4 0 0 1 1
5 0 1 0 0
6 1 1 0 0
7 1 0 0 0
8 0 0 1 0

In Naı̈ve Bayes model, we use random variable Xi 2 {0, 1} to represent i-th dimension of the
feature vector x, and random variable Y 2 {0, 1} to represent the class label y. Thus, we
can estimate probabilities P (Y ), P (Xi |Y ) and P (Xi , Y ) by counting data points in dataset
S, for example:

#{data points with y = 1} 4

P (Y = 1) = = = 0.5
#{all data points} 8
#{data points with x1 = 1 and y = 0} 2
P (X1 = 1|Y = 0) = = = 0.5
#{data points with y = 0} 4
P (X1 = 1, Y = 1) = P (X1 = 1|Y = 1)P (Y = 1)
#{data points with x1 = 1 and y = 1} 1
= = = 0.125
#{all data points} 8

1
It is noteworthy that only probabilities P (Y ), P (Xi |Y ) and P (Xi , Y ) can be directly esti-
mated from dataset S in Naı̈ve Bayes model. Other joint probabilities (e.g. P (X1 , X2 ) and
P (X1 , X2 , X3 )) should not be estimated by directly counting the data points.

Next, we can use the probabilities P (Y ) and P (Xi |Y ) to build our Naı̈ve Bayes model for
classification: For a feature vector x = (x1 , x2 , x3 ), we can estimate the probability P (Y =
y|X1 = x1 , X2 = x2 , X3 = x3 ) with the conditional independence assumptions:

P (X1 = x1 , X2 = x2 , X3 = x3 , Y = y)
P (Y = y|X1 = x1 , X2 = x2 , X3 = x3 ) =
P (X1 = x1 , X2 = x2 , X3 = x3 )
P (X1 = x1 , X2 = x2 , X3 = x3 |Y = y)P (Y = y)
=
P (X1 = x1 , X2 = x2 , X3 = x3 )
⇣Q ⌘
3
i=1 P (X i = x i |Y = y) P (Y = y)
=
P (X1 = x1 , X2 = x2 , X3 = x3 )

where the joint probability P (X1 = x1 , X2 = x2 , X3 = x3 ) can be calculated as:

1
X
P (X1 = x1 , X2 = x2 , X3 = x3 ) = P (X1 = x1 , X2 = x2 , X3 = x3 , Y = y)
y=0
1 ⇣
X ⌘
= P (X1 = x1 , X2 = x2 , X3 = x3 |Y = y)P (Y = y)
y=0
!
1
X ⇣Y
3 ⌘
= P (Xi = xi |Y = y) P (Y = y)
y=0 i=1

Finally, if we find:

P (Y = 1|X1 = x1 , X2 = x2 , X3 = x3 ) > P (Y = 0|X1 = x1 , X2 = x2 , X3 = x3 )

then we can predict the class of feature vector x = (x1 , x2 , x3 ) to be 1, otherwise 0. It is

noteworthy that although conditional independence assumptions are made in Naı̈ve Bayes
model, P (Y = 1|X1 = x1 , X2 = x2 , X3 = x3 ) + P (Y = 0|X1 = x1 , X2 = x2 , X3 = x3 ) should
still be 1.

1. (15 pts) Please estimate the following probabilities:

(1) P (X1 = 1, Y = 0), (2) P (Y = 0), (3) P (X1 = 1|Y = 1).

Note that these probabilities can be directly estimated by counting from dataset S.

2
2. (18 pts) Please calculate the probability P (Y = 1|X1 = 1, X2 = 1, X3 = 0) in Naı̈ve
Bayes model using conditional independence assumptions.

3. (7 pts) Please calculate the probability P (Y = 0|X1 = 1, X2 = 1, X3 = 0) in Naı̈ve

Bayes model and predict the class of feature vector x = (1, 1, 0).

3
2 (40 points) Decision Tree
In this question, we would like to create a decision tree model for a binary classification task.
Assume there is a classification dataset T = {(x(i) , y (i) ), i = 1, ..., 5} where each data point
(x, y) contains a feature vector x = (x1 , x2 ) 2 R2 and a ground-truth label y 2 {0, 1}. The
dataset T can be read from the table below:
i x1 x2 y
1 1.0 2.0 1
2 2.0 2.0 1
3 3.0 2.0 0
4 2.0 3.0 0
5 1.0 3.0 0
To build the decision tree model, we use a simplified CART algorithm, which is a recursive
procedure as follows:
• Initialize a root node with dataset T and set it as current node.
• Start a procedure for current node:
– Step 1: Assume the dataset in current node is Tcur . Check if all data points in
Tcur are in the same class:
∗ If it is true, set current node as a leaf node to predict the common class in
Tcur , and then terminate current procedure.
∗ If it is false, continue the procedure.
– Step 2: Traverse all possible splitting rules. Each splitting rule is represented by
a vector (j, t), which compares feature xj and threshold t to split the dataset Tcur
into two subsets T1 , T2 :
T1 = {(x, y) 2 Tcur where xj  t},
T2 = {(x, y) 2 Tcur where xj > t}.
We will traverse the rules over all feature dimensions j 2 {0, 1} and thresholds
t 2 {xj |(x, y) 2 Tcur }.
– Step 3: Decide the best splitting rule. The best splitting rule (j ⇤ , t⇤ ) minimizes
the weighted sum of Gini indices of T1 , T2 :
|T1 |Gini(T1 ) + |T2 |Gini(T2 )
(j ⇤ , t⇤ ) = arg min
j,t |T1 | + |T2 |
where the Gini(·) is defined as:
1
X
Gini(Ti ) = 1 P (Y = y)2 ,
y=0

#{data points with label y in Ti }

P (Y = y) = .
#{data points in Ti }
– Step 4: We split the dataset Tcur into two subsets T1⇤ , T2⇤ following the best
splitting rule (j ⇤ , t⇤ ). Then we set current node as a branch node and create child
nodes with the subsets T1⇤ , T2⇤ respectively. For each child node, start from Step 1
again recursively.

4
If we run the above decision tree building procedure on dataset T and find the generated tree
is shown below:

T
x2  2.0 x2 > 2.0

T1⇤ T2⇤
x1  2.0 x1 > 2.0

⇤ ⇤
T11 T12

Please answer the questions:

1. (16 pts) Calculate the subsets T1⇤ , T2⇤ , T11

⇤ ⇤
, T12 using the given decision tree.

2. (12 pts) Calculate Gini(T1⇤ ) and Gini(T2⇤ ).

5
3. (12 pts) With the given tree, we can predict the class of a feature vector x = (x1 , x2 ):

• Start from the root node of the tree:

– Step 1: If current node is a branch node, we evaluate conditions on branch
edges with x, choose the satisfied branch to go through, and repeat Step 1.
– Step 2: If current node is a leaf node, the common class of the subset in the
leaf node will be used as prediction.

Please predict the following feature vectors using the given tree:
(1) x = (2, 1),
(2) x = (3, 1),
(3) x = (3, 3).

4. (Bonus Question, 10 pts extra) In this question, you need to implement the decision
tree algorithm. Please download the Jupyter notebook HW4 Decision Tree.ipynb and
fill in the blanks. Note that since the same dataset T is used in the notebook, you can
use the code to check if your previous answers are correct or not. Please attach your
code and results in Gradescope submission.

6
3 (20 points) Bagging and Boosting
Assume we obtain T linear classifiers {ht , t = 1, ..., T } where each classifier h : R2 ! {+1, 1}
predicts the class ŷ 2 {+1, 1} with given feature vector x = (x1 , x2 ) as follows:
(
+1 if a 0,
ŷ = h(x) = sign(w1 x1 + w2 x2 + b) where sign(a) =
1 if a < 0.

where w1 , w2 , b 2 R are the parameters.

• In a bagging model Hbagging of the T linear classifiers, we calculate the average prediction
using classifiers {ht }, and then use it to predict the class ŷbagging :

⇣1 X
T ⌘
ŷbagging = Hbagging (x) = sign ht (x)
T t=1

• In a boosting model Hboosting of the T linear classifiers, we calculate the weighted sum
of predictions using classifiers {ht }, and then use it to predict the class ŷboosting :

⇣X
T ⌘
ŷboosting = Hboosting (x) = sign ↵t ht (x)
t=1

where {↵t , t = 1, ..., T } are the weight coefficients.

In this problem, suppose we have 3 linear classifiers (i.e. T = 3):

h1 (x) = sign(x1 + x2 + 1), h2 (x) = sign(x1 x2 ), h3 (x) = sign(x1 2x2 + 1).

Please answer the questions below:

1. (10 pts) Please calculate the ŷbagging of feature vector x = (1, 2) using bagging on these
three classifiers.

7
2. (10 pts) Please calculate the ŷboosting of feature vector x = (1, 2) using boosting on these
three classifiers. The weight coefficients are ↵1 = 0.8, ↵2 = 0.2, ↵3 = 0.3.

8
5/25/23, 4:06 PM HW4_Decision_Tree

Part I. Implement a decision tree algorithm and make

predictions.
In [16]: import numpy as np

In [30]: class TreeNode:

""" Node class in the decision tree. """
def __init__(self, T):
self.type = 'leaf' # Type of current node. Could be 'leaf' or 'branch' (
self.left = None # Left branch of the tree (for leaf node, it is None)
self.right = None # Right branch of the tree (for leaf node, it is None
self.dataset = T # Dataset of current node, which is a tuple (X, Y).
# X is the feature array and Y is the label vector.

def set_as_leaf(self, common_class):

""" Set current node as leaf node. """
self.type = 'leaf'
self.left = None
self.right = None
self.common_class = common_class

def set_as_branch(self, left_node, right_node, split_rule):

""" Set current node as branch node. """
self.type = 'branch'
self.left = left_node
self.right = right_node
# split_rule should be a tuple (j, t).
# When x_j <= t, it goes to left branch.
# When x_j > t, it goes to right branch.
self.split_rule = split_rule

In [31]: # Prepare for dataset.

def get_dataset():
X = np.array(
[[1.0, 2.0],
[2.0, 2.0],
[3.0, 2.0],
[2.0, 3.0],
[1.0, 3.0]
])
Y = np.array(
[1,
1,
0,
0,
0])
T = (X, Y) # The dataset T is a tuple of feature array X and label vector Y.
return T

T = get_dataset()

In this part, you are required to implement the decision tree algorithm shown in the
problem description of Q2 in HW4:

localhost:8888/nbconvert/html/HW4_Decision_Tree.ipynb?download=false 1/6
5/25/23, 4:06 PM HW4_Decision_Tree

The 4 steps are marked in comments of the following code. Please fill in the missing
blanks (e.g. "...") in the TODOs:

In [32]: # Initialization.
root_node = TreeNode(T)

In [88]: # Procedure for current node.

def build_decision_tree_procedure(node_cur, depth=0):
# Step 1. Check if all data points in T_cur are in the same class
# - If it is true, set current node as a *leaf node* to predict the
# and then terminate current procedure.
# - If it is false, continue the procedure.

T_cur = node_cur.dataset
X_cur, Y_cur = T_cur # Get current feature array X_cur and label vector Y_c
if (Y_cur == 1).all():
print(' ' * depth + '+-> leaf node (predict 1).')
print(' ' * depth + ' Gini: {:.3f}'.format(Gini(T_cur)))
print(' ' * depth + ' samples: {}'.format(len(X_cur)))
node_cur.set_as_leaf(1)
return
elif (Y_cur == 0).all():
print(' ' * depth + '+-> leaf node (predict 0).')
print(' ' * depth + ' Gini: {:.3f}'.format(Gini(T_cur)))
print(' ' * depth + ' samples: {}'.format(len(X_cur)))
node_cur.set_as_leaf(0)
return

# Step 2. Traverse all possible splitting rules.

# - We will traverse the rules over all feature dimensions j in {0,
# thresholds t in X_cur[:, j] (i.e. all x_j in current feature arr
all_rules = []

#### TODO 1 STARTS ###

# Please traverse the rules over all feature dimensions j in {0, 1} and
# thresholds t in X_cur[:, j] (i.e. all x_j in current feature array X_cur
# and save all rules in all_rules variable.
# The all_rules variable should be a list of tuples such as [(0, 1.0), (0, 2

for j in range(2):
for t in range(len(X_cur[:, j])):
all_rules.append((j, X_cur[t, j]))
all_rules_set = set(all_rules)
all_rules = list(all_rules_set)
all_rules.sort()
#### TODO 1 ENDS ###

#print('All rules:', all_rules) # Code for debugging.

# Step 3. Decide the best splitting rule.

best_rule = (_, _)
best_weighted_sum = 1.0
for (j, t) in all_rules:

#### TODO 2 STARTS ###

# For each splitting rule (j, t), we use it to split the dataset T_cur i
# Hint: You may refer to Step 4 to understand how to set inds1, X1, Y1,

# - Create subset T1.

localhost:8888/nbconvert/html/HW4_Decision_Tree.ipynb?download=false 2/6
5/25/23, 4:06 PM HW4_Decision_Tree

inds1 = X_cur[:,j] <= t # Indices vector for those data po

X1 = X_cur[inds1] # Feature array with inds1 in X_cur
Y1 = Y_cur[inds1] # Label vector with inds1 in Y_cur.
T1 = (X1, Y1) # Subset T1 contains feature array and label ve
len_T1 = len(T1[0]) # Size of subset T1.
# - Create subset T2.
inds2 = X_cur[:,j] > t # Indices vector for those data points with
X2 = X_cur[inds2] # Feature array with inds2 in X_cur
Y2 = Y_cur[inds2] # Label vector with inds2 in Y_cur.
T2 = (X2, Y2) # Subset T2 contains feature array and label
len_T2 = len(T2[0]) # Size of subset T2.
#### TODO 2 ENDS ###

# Calculate weighted sum and try to find the best one.

weighted_sum = (len_T1*Gini(T1) + len_T2*Gini(T2)) / (len_T1 + len_T2)

# print('Rule:', (j, t), 'len_T1, len_T2:', len(T1), len(T2), 'weighted_

if weighted_sum < best_weighted_sum:

#### TODO 3 STARTS ####

# Update the best rule and best weighted sum with current ones.

best_rule = (j, t)
best_weighted_sum = weighted_sum
#### TODO 3 ENDS ####

# Step 4. - We split the dataset T_cur into two subsets best_T1, best_T2 fol
# the best splitting rule (best_j, best_t).
# - Then we set current node as a *branch* node and create child nod
# the subsets best_T1, best_T2 respectively.
# - For each child node, start from *Step 1* again recursively.

best_j, best_t = best_rule

# - Create subset best_T1 and corresponding child node.
best_inds1 = X_cur[:,best_j] <= best_t
best_X1 = X_cur[best_inds1]
best_Y1 = Y_cur[best_inds1]
best_T1 = (best_X1, best_Y1)
node1 = TreeNode(best_T1)
# - Create subset best_T2 and corresponding child node.
best_inds2 = X_cur[:,best_j] > best_t
best_X2 = X_cur[best_inds2]
best_Y2 = Y_cur[best_inds2]
best_T2 = (best_X2, best_Y2)
node2 = TreeNode(best_T2)
# - Set current node as branch node and create child nodes.
node_cur.set_as_branch(left_node=node1, right_node=node2, split_rule=best_ru
print(' ' * depth + '+-> branch node')
print(' ' * depth + ' Gini: {:.3f}'.format(Gini(T_cur)))
print(' ' * depth + ' samples: {}'.format(len(X_cur)))
# - For each child node, start from Step 1 again recursively.
print(' ' * (depth + 1) + '|-> left branch: x_{} <= {} (with {} data poin
build_decision_tree_procedure(node1, depth+1) # Note: The depth is only used
print(' ' * (depth + 1) + '|-> right branch: x_{} > {} (with {} data poin
build_decision_tree_procedure(node2, depth+1)

def Gini(Ti):
""" Calculate the Gini index given dataset Ti. """
Xi, Yi = Ti # Get the feature array Xi and label vector Yi.

localhost:8888/nbconvert/html/HW4_Decision_Tree.ipynb?download=false 3/6
5/25/23, 4:06 PM HW4_Decision_Tree

if len(Yi) == 0: # If the dataset Ti is empty, it simply returns 0.

return 0

num = 0
for i in range(len(Yi)):
if Yi[i] == 1 :
num += 1

#### TODO 4 STARTS ####

# Implement the Gini index function.

P_Y1 = num / len(Yi) # Estimate probability P(Y=1) in Yi

P_Y0 = (len(Yi) - num) / len(Yi) # Estimate probability P(Y=0) in Yi
Gini_Ti = 1 - P_Y1**2 - P_Y0**2 # Calculate Gini index: Gini_Ti = 1 - P(Y=1
#### TODO 4 ENDS ####

return Gini_Ti

After you finish the above code blank filling, you can use the following code to build the
decision tree. The following code also shows the structure of the tree.

In [90]: # Build the decision tree.

build_decision_tree_procedure(root_node)

# If your code is correct, you should output:

#
# +-> branch node
# Gini: 0.480
# samples: 5
# |-> left branch: x_1 <= 2.0 (with 3 data point(s)).
# +-> branch node
# Gini: 0.444
# samples: 3
# .....
#
# You can also use the sklearn results to validate your decision tree
# (the threshold could be slightly different but the structure of the tree shoul

+-> branch node

Gini: 0.480
samples: 5
|-> left branch: x_1 <= 2.0 (with 3 data point(s)).
+-> branch node
Gini: 0.444
samples: 3
|-> left branch: x_0 <= 2.0 (with 2 data point(s)).
+-> leaf node (predict 1).
Gini: 0.000
samples: 2
|-> right branch: x_0 > 2.0 (with 1 data point(s)).
+-> leaf node (predict 0).
Gini: 0.000
samples: 1
|-> right branch: x_1 > 2.0 (with 2 data point(s)).
+-> leaf node (predict 0).
Gini: 0.000
samples: 2

With the obtained decision tree, you can predict the class of new feature vectors:

localhost:8888/nbconvert/html/HW4_Decision_Tree.ipynb?download=false 4/6
5/25/23, 4:06 PM HW4_Decision_Tree

In [91]: def decision_tree_predict(node_cur, x):

if node_cur.type == 'leaf':
return node_cur.common_class
else:
j, t = node_cur.split_rule
if x[j] <= t:
return decision_tree_predict(node_cur.left, x)
else:
return decision_tree_predict(node_cur.right, x)

In [92]: for x in [(2,1), (3,1), (3,3)]:

y_pred = decision_tree_predict(root_node, x)
print('Prediction of {} is {}'.format(x, y_pred))

Prediction of (2, 1) is 1
Prediction of (3, 1) is 0
Prediction of (3, 3) is 0

Part II. Use Scikit-learn to build the tree and make

predictions.
The following code uses Scikit-learn to build the decision tree. You can use it to check if
your previous implementation is correct or not.

In [93]: # Ref: https://scikit-learn.org/stable/modules/tree.html#tree-algorithms-id3-c4-

from sklearn import tree
X, Y = T
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)

The following code illustrates the obtained decision tree. It should have same structure
and similar rules compared with the tree in your own implementation.

In [94]: # Plotting the tree.

tree.plot_tree(clf)

Out[94]: [Text(0.6, 0.8333333333333334, 'x[1] <= 2.5\ngini = 0.48\nsamples = 5\nvalue =

[3, 2]'),
Text(0.4, 0.5, 'x[0] <= 2.5\ngini = 0.444\nsamples = 3\nvalue = [1, 2]'),
Text(0.2, 0.16666666666666666, 'gini = 0.0\nsamples = 2\nvalue = [0, 2]'),
Text(0.6, 0.16666666666666666, 'gini = 0.0\nsamples = 1\nvalue = [1, 0]'),
Text(0.8, 0.5, 'gini = 0.0\nsamples = 2\nvalue = [2, 0]')]

localhost:8888/nbconvert/html/HW4_Decision_Tree.ipynb?download=false 5/6
5/25/23, 4:06 PM HW4_Decision_Tree

The following code makes the predictions using the obtained decision tree. It should
have identical results as the ones for your own implementaion.

In [95]: # Predict the class.

for x in [(2,1), (3,1), (3,3)]:
y_pred = clf.predict(np.array([x]))[0]
print('Prediction of {} is {}'.format(x, y_pred))

Prediction of (2, 1) is 1
Prediction of (3, 1) is 0
Prediction of (3, 3) is 0

localhost:8888/nbconvert/html/HW4_Decision_Tree.ipynb?download=false 6/6

MLT - Lab - Manual FINAL
No ratings yet
MLT - Lab - Manual FINAL
38 pages
Optimization Technique MCQ with Answer
No ratings yet
Optimization Technique MCQ with Answer
24 pages
Ai Combined Update
No ratings yet
Ai Combined Update
274 pages
ML Unit II_Final
No ratings yet
ML Unit II_Final
138 pages
ML5_Implementation
No ratings yet
ML5_Implementation
32 pages
Slide 3
No ratings yet
Slide 3
23 pages
P4-DTRF 1
No ratings yet
P4-DTRF 1
63 pages
22MCA1008 - Varun ML LAB ASSIGNMENTS
100% (1)
22MCA1008 - Varun ML LAB ASSIGNMENTS
41 pages
04. UNIT-IV(DMWH6EM)
No ratings yet
04. UNIT-IV(DMWH6EM)
33 pages
21 Decision Trees
No ratings yet
21 Decision Trees
62 pages
Aiml Practical
No ratings yet
Aiml Practical
17 pages
MANUAL (1)
No ratings yet
MANUAL (1)
34 pages
AI Assignment: Vishal Batch 10 17SCSE101611
No ratings yet
AI Assignment: Vishal Batch 10 17SCSE101611
4 pages
Machine Learning Lab Manual
No ratings yet
Machine Learning Lab Manual
12 pages
decision_trees_implementation (1)
No ratings yet
decision_trees_implementation (1)
13 pages
AI Lab M.Tech
No ratings yet
AI Lab M.Tech
29 pages
Lab Program 3
No ratings yet
Lab Program 3
6 pages
ml4
No ratings yet
ml4
5 pages
ANACONDA EX-7
No ratings yet
ANACONDA EX-7
3 pages
ML Recap
No ratings yet
ML Recap
96 pages
Exam1 Practice Solutions
No ratings yet
Exam1 Practice Solutions
25 pages
Advance Machine Learning
No ratings yet
Advance Machine Learning
28 pages
L3 - Decision Trees
No ratings yet
L3 - Decision Trees
28 pages
Data Mining NOTES
No ratings yet
Data Mining NOTES
57 pages
ML lab manual
No ratings yet
ML lab manual
25 pages
M01 Tree-Based Methods
No ratings yet
M01 Tree-Based Methods
38 pages
IT Interview Questions a Primer for the IT Job Interviews (Concepts, Problems and Interview Questions) (Karumanchi, Narasimha) (Z-Library)
No ratings yet
IT Interview Questions a Primer for the IT Job Interviews (Concepts, Problems and Interview Questions) (Karumanchi, Narasimha) (Z-Library)
511 pages
Lab 4_Logistic Regression_kNN_Notes
No ratings yet
Lab 4_Logistic Regression_kNN_Notes
6 pages
Pra 5 ML
No ratings yet
Pra 5 ML
5 pages
Homework Solution 01 KNN DT
No ratings yet
Homework Solution 01 KNN DT
4 pages
Classification Problems
No ratings yet
Classification Problems
53 pages
Import Import Def
No ratings yet
Import Import Def
2 pages
Decision Trees
No ratings yet
Decision Trees
11 pages
ML Priyesha - 778
No ratings yet
ML Priyesha - 778
23 pages
Random Forest: The Algorithm in A Nutshell
No ratings yet
Random Forest: The Algorithm in A Nutshell
10 pages
Programs Lab Bca
No ratings yet
Programs Lab Bca
16 pages
DT-2023-24-sols
No ratings yet
DT-2023-24-sols
8 pages
Week 7 - Graded
No ratings yet
Week 7 - Graded
17 pages
DECISION TREES
No ratings yet
DECISION TREES
7 pages
ML File
No ratings yet
ML File
13 pages
Mlsp Lab Exp4
No ratings yet
Mlsp Lab Exp4
9 pages
Classification and Prediction
No ratings yet
Classification and Prediction
81 pages
ML Lab PT
No ratings yet
ML Lab PT
25 pages
Machine Learning Laboratory Record Book: 1 Find S Algorithm
No ratings yet
Machine Learning Laboratory Record Book: 1 Find S Algorithm
22 pages
Lab_Manual2 (2)
No ratings yet
Lab_Manual2 (2)
6 pages
ML Lab Record
No ratings yet
ML Lab Record
33 pages
178 hw1
No ratings yet
178 hw1
4 pages
DA_LAB3_221IT064
No ratings yet
DA_LAB3_221IT064
6 pages
P 4 Andp 5
No ratings yet
P 4 Andp 5
4 pages
ai int-1
No ratings yet
ai int-1
6 pages
hw1
No ratings yet
hw1
4 pages
Data Mining Assignment No. 1
No ratings yet
Data Mining Assignment No. 1
7 pages
ML_UNIT3
No ratings yet
ML_UNIT3
24 pages
Data Structures & Algorithms in Python 1st Edition John Canning - eBook PDF all chapter instant download
100% (5)
Data Structures & Algorithms in Python 1st Edition John Canning - eBook PDF all chapter instant download
66 pages
Machine Learning Unit4
No ratings yet
Machine Learning Unit4
8 pages
AD3461 ML lab manual
No ratings yet
AD3461 ML lab manual
32 pages
Unit-3 Alt
No ratings yet
Unit-3 Alt
24 pages
Machine Learning Lab Record: Dr. Sarika Hegde
No ratings yet
Machine Learning Lab Record: Dr. Sarika Hegde
23 pages
ML Lab
No ratings yet
ML Lab
7 pages
ML Unit 2
No ratings yet
ML Unit 2
8 pages
19EAC203 LAB REPORT Combined
No ratings yet
19EAC203 LAB REPORT Combined
73 pages
Machine Learning
No ratings yet
Machine Learning
44 pages
Recursion RT
No ratings yet
Recursion RT
18 pages
601 sp09 Midterm Solutions
No ratings yet
601 sp09 Midterm Solutions
14 pages
OR New
No ratings yet
OR New
8 pages
Arpita Shaw - CA1 - DSA
No ratings yet
Arpita Shaw - CA1 - DSA
9 pages
CS293 Tutorial Solutions
No ratings yet
CS293 Tutorial Solutions
15 pages
Daa PDF
No ratings yet
Daa PDF
16 pages
Solve Linear Programming Problems - MATLAB Linprog - MathWorks India
No ratings yet
Solve Linear Programming Problems - MATLAB Linprog - MathWorks India
24 pages
Myhashtable
No ratings yet
Myhashtable
4 pages
00 CS 312 Comp Algo Course Description
No ratings yet
00 CS 312 Comp Algo Course Description
7 pages
sparky notes AP CS 2015
No ratings yet
sparky notes AP CS 2015
9 pages
Prefix Postfix Expression
No ratings yet
Prefix Postfix Expression
12 pages
17 Query Processing PDF
No ratings yet
17 Query Processing PDF
23 pages
DAA-Practical List
No ratings yet
DAA-Practical List
2 pages
Arrays Student Notes PDF
No ratings yet
Arrays Student Notes PDF
2 pages
Priority Queue
No ratings yet
Priority Queue
3 pages
Finxter Python Cheat Sheet Complex Data Types
No ratings yet
Finxter Python Cheat Sheet Complex Data Types
1 page
Lecture 1 (Fundamental of Algorithms)
No ratings yet
Lecture 1 (Fundamental of Algorithms)
26 pages
Creating Arrays - Arrays Are Declared by Using The Array Type Name, Optionally Followed by "Of"
No ratings yet
Creating Arrays - Arrays Are Declared by Using The Array Type Name, Optionally Followed by "Of"
3 pages
CSE-304 Design & Analysis of Algorithm: Recurrence Relation
No ratings yet
CSE-304 Design & Analysis of Algorithm: Recurrence Relation
5 pages
CS3491 Set6
No ratings yet
CS3491 Set6
2 pages
Certified Global Minima
100% (1)
Certified Global Minima
8 pages
Trainlm (Neural Network Toolbox)
No ratings yet
Trainlm (Neural Network Toolbox)
3 pages
450 Questions
No ratings yet
450 Questions
15 pages
Fast Rls Algorithm PDF
No ratings yet
Fast Rls Algorithm PDF
2 pages
Stack Ques
No ratings yet
Stack Ques
6 pages
Data Structures
No ratings yet
Data Structures
2 pages
Worked Examples in Mathematics for Scientists and Engineers
From Everand
Worked Examples in Mathematics for Scientists and Engineers
G. Stephenson
No ratings yet
Calculus I Essentials
From Everand
Calculus I Essentials
Editors of REA
1/5 (1)

Merging Result-Merged

Uploaded by

Merging Result-Merged

Uploaded by

Homework Assignment 4

CSE 151A: Introduction to Machine Learning

Due: May 30th, 2023, 9:30am (Pacific Time)

Grade: out of 100 points

1 (40 points) Naı̈ve Bayes

#{data points with y = 1} 4

where the joint probability P (X1 = x1 , X2 = x2 , X3 = x3 ) can be calculated as:

P (Y = 1|X1 = x1 , X2 = x2 , X3 = x3 ) > P (Y = 0|X1 = x1 , X2 = x2 , X3 = x3 )

then we can predict the class of feature vector x = (x1 , x2 , x3 ) to be 1, otherwise 0. It is

1. (15 pts) Please estimate the following probabilities:

(1) P (X1 = 1, Y = 0), (2) P (Y = 0), (3) P (X1 = 1|Y = 1).

3. (7 pts) Please calculate the probability P (Y = 0|X1 = 1, X2 = 1, X3 = 0) in Naı̈ve

#{data points with label y in Ti }

Please answer the questions:

1. (16 pts) Calculate the subsets T1⇤ , T2⇤ , T11

2. (12 pts) Calculate Gini(T1⇤ ) and Gini(T2⇤ ).

• Start from the root node of the tree:

where w1 , w2 , b 2 R are the parameters.

where {↵t , t = 1, ..., T } are the weight coefficients.

In this problem, suppose we have 3 linear classifiers (i.e. T = 3):

h1 (x) = sign(x1 + x2 + 1), h2 (x) = sign(x1 x2 ), h3 (x) = sign(x1 2x2 + 1).

Please answer the questions below:

Part I. Implement a decision tree algorithm and make

In [30]: class TreeNode:

def set_as_leaf(self, common_class):

def set_as_branch(self, left_node, right_node, split_rule):

In [31]: # Prepare for dataset.

In [88]: # Procedure for current node.

# Step 2. Traverse all possible splitting rules.

#### TODO 1 STARTS ###

#print('All rules:', all_rules) # Code for debugging.

# Step 3. Decide the best splitting rule.

#### TODO 2 STARTS ###

# - Create subset T1.

inds1 = X_cur[:,j] <= t # Indices vector for those data po

# Calculate weighted sum and try to find the best one.

# print('Rule:', (j, t), 'len_T1, len_T2:', len(T1), len(T2), 'weighted_

if weighted_sum < best_weighted_sum:

#### TODO 3 STARTS ####

best_j, best_t = best_rule

if len(Yi) == 0: # If the dataset Ti is empty, it simply returns 0.

#### TODO 4 STARTS ####

P_Y1 = num / len(Yi) # Estimate probability P(Y=1) in Yi

In [90]: # Build the decision tree.

# If your code is correct, you should output:

+-> branch node

In [91]: def decision_tree_predict(node_cur, x):

In [92]: for x in [(2,1), (3,1), (3,3)]:

Part II. Use Scikit-learn to build the tree and make

In [93]: # Ref: https://scikit-learn.org/stable/modules/tree.html#tree-algorithms-id3-c4-

In [94]: # Plotting the tree.

Out[94]: [Text(0.6, 0.8333333333333334, 'x[1] <= 2.5\ngini = 0.48\nsamples = 5\nvalue =

In [95]: # Predict the class.

You might also like