
Experiment-3

Exp. No: 3

Aim: Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use
an appropriate dataset for building the decision tree and apply this knowledge to classify a new
sample.

Description:
Decision Tree: It is a supervised learning technique that can be used for both classification and
regression problems, but it is mostly preferred for classification problems.

It is a tree-structured classifier, where internal nodes represent the features of the dataset,
branches represent decision rules, and each leaf node represents an outcome.

A decision tree has two types of nodes: decision nodes and leaf nodes. Decision nodes perform a
test and have multiple branches, whereas leaf nodes hold the outcomes of those decisions and do
not branch any further. The tests are performed on the features of the given dataset.

It is a graphical representation of all the possible solutions to a problem or decision based on
the given conditions.

ID3 Algorithm: The ID3 algorithm is specifically designed for building decision trees from a
given dataset. Its primary objective is to construct a tree that best explains the relationship
between attributes in the data and their corresponding class labels.

Entropy: Entropy is a measure of disorder or uncertainty in a set of data. ID3 uses it to measure
the impurity of a dataset; the objective is to minimize entropy by dividing the data into subsets
that are as homogeneous as possible.

H(S) = −Σᵢ pᵢ · log₂(pᵢ)

where pᵢ is the proportion of examples in S that belong to class i.
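
For example, Dataset 1 below contains 6 Yes and 4 No examples, so its entropy is
H(S) = −0.6 · log₂(0.6) − 0.4 · log₂(0.4) ≈ 0.971.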

Information Gain: Information Gain measures how well a given attribute reduces uncertainty when
the data is split on it.

IG(S, A) = H(S) − Σᵥ (|Sᵥ| / |S|) × H(Sᵥ)

where Sᵥ is the subset of S in which attribute A takes the value v.

At each stage, ID3 splits the data on the attribute that maximizes Information Gain, which is
computed as the difference between the entropy before the split and the weighted entropy of the
subsets after it.
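
For example, splitting Dataset 1 below on Student yields two subsets of 5 examples each:
Student = yes (4 Yes, 1 No; entropy ≈ 0.722) and Student = no (2 Yes, 3 No; entropy ≈ 0.971),
so IG(S, Student) = 0.971 − (5/10 × 0.722 + 5/10 × 0.971) ≈ 0.125.
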
Dataset 1:

RID   Age      Income   Student   Credit      Class
1     youth    high     no        fair        No
2     youth    high     no        excellent   No
3     middle   high     no        fair        Yes
4     senior   medium   no        fair        Yes
5     senior   low      yes       fair        Yes
6     senior   low      yes       excellent   No
7     middle   low      yes       excellent   Yes
8     youth    medium   no        fair        No
9     youth    low      yes       fair        Yes
10    senior   medium   yes       fair        Yes

Program:
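
The following is a minimal ID3 sketch in Python using only the standard library. It builds the
tree from Dataset 1 above and classifies a new sample, as the aim requires. The label-column name
"Class", the nested-dictionary tree representation, and the test sample at the end are
illustrative assumptions, not a prescribed listing.

import math
from collections import Counter

# Dataset 1 from the table above; "Class" holds the label (assumed column name).
DATA = [
    {"Age": "youth",  "Income": "high",   "Student": "no",  "Credit": "fair",      "Class": "No"},
    {"Age": "youth",  "Income": "high",   "Student": "no",  "Credit": "excellent", "Class": "No"},
    {"Age": "middle", "Income": "high",   "Student": "no",  "Credit": "fair",      "Class": "Yes"},
    {"Age": "senior", "Income": "medium", "Student": "no",  "Credit": "fair",      "Class": "Yes"},
    {"Age": "senior", "Income": "low",    "Student": "yes", "Credit": "fair",      "Class": "Yes"},
    {"Age": "senior", "Income": "low",    "Student": "yes", "Credit": "excellent", "Class": "No"},
    {"Age": "middle", "Income": "low",    "Student": "yes", "Credit": "excellent", "Class": "Yes"},
    {"Age": "youth",  "Income": "medium", "Student": "no",  "Credit": "fair",      "Class": "No"},
    {"Age": "youth",  "Income": "low",    "Student": "yes", "Credit": "fair",      "Class": "Yes"},
    {"Age": "senior", "Income": "medium", "Student": "yes", "Credit": "fair",      "Class": "Yes"},
]
TARGET = "Class"

def entropy(rows):
    # H(S) = -sum(p_i * log2(p_i)) over the class proportions p_i.
    counts = Counter(r[TARGET] for r in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(rows, attr):
    # IG(S, A) = H(S) - sum(|S_v| / |S| * H(S_v)) over the values v of attr.
    total = len(rows)
    remainder = 0.0
    for v in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder

def id3(rows, attrs):
    labels = [r[TARGET] for r in rows]
    if len(set(labels)) == 1:             # pure subset -> leaf node
        return labels[0]
    if not attrs:                         # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a))   # ties go to list order
    tree = {best: {}}
    for v in set(r[best] for r in rows):  # one branch per observed value
        subset = [r for r in rows if r[best] == v]
        tree[best][v] = id3(subset, [a for a in attrs if a != best])
    return tree

def classify(tree, sample):
    # Walk the nested dicts; assumes every test value was seen in training.
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][sample[attr]]
    return tree

tree = id3(DATA, ["Age", "Income", "Student", "Credit"])
print("Decision tree:", tree)

new_sample = {"Age": "youth", "Income": "medium", "Student": "yes", "Credit": "fair"}
print("Predicted class:", classify(tree, new_sample))

Running the script prints the learned tree as nested dictionaries, where each key is the attribute
chosen by Information Gain at that node, followed by the class predicted for the new sample.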
