DM Lab 04
Decision trees
1) Useful Concepts
1. Basic Concepts of Classification
Classification is a supervised learning task where the objective is to predict the label (or class) of
a given input based on its features. In classification, the output is discrete (e.g., Yes/No, 0/1, or
categories).
Example:
• Problem: Predict whether a student will pass or fail based on study hours.
• Features: Study hours, Attendance.
• Target: Pass or Fail.
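To make this concrete, the student example could be encoded in Python roughly as follows; the numbers below are invented purely for illustration:

# Hypothetical encoding of the student example: each row is one student,
# columns are [study hours, attendance (%)], and y holds the class labels.
X = [
    [2.0, 60],   # studied 2 hours, 60% attendance
    [8.5, 95],   # studied 8.5 hours, 95% attendance
    [5.0, 80],
]
y = ["Fail", "Pass", "Pass"]  # discrete target: Pass or Fail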
2. Introduction to Decision Trees
A decision tree is a flowchart-like structure used for classification (and regression). It breaks
down the dataset into smaller subsets by asking a sequence of questions based on feature values,
ultimately leading to a decision.
• Internal Nodes: Represent a decision based on a feature.
• Branches: Represent the outcome of that decision.
• Leaf Nodes: Represent the final class or prediction.
3. Building a Decision Tree
Decision trees are built by recursively partitioning the data based on the feature that best
separates the classes. The objective is to minimize impurity within each subset.
• Impurity: A measure of how mixed the classes are within a node.
o Gini Impurity
o Entropy
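As a rough sketch (not part of the lab code), both measures can be computed from the class proportions within a node:

import math

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions p_i."""
    n = len(labels)
    props = [labels.count(c) / n for c in set(labels)]
    return 1 - sum(p ** 2 for p in props)

def entropy(labels):
    """Entropy: -sum(p_i * log2(p_i)) over the class proportions p_i."""
    n = len(labels)
    props = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in props)

node = ["Pass", "Pass", "Fail", "Fail"]  # a perfectly mixed node
print(gini(node))     # 0.5 (maximum for two classes)
print(entropy(node))  # 1.0 (maximum for two classes)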
4. Hunt’s Algorithm
Hunt’s Algorithm is one of the earliest methods to build decision trees. It follows a recursive
partitioning approach.
Steps of Hunt's Algorithm:
1. If all records at a node belong to the same class, label it as a leaf node.
2. If records are mixed, select a feature to split them.
3. Recur for each subset created by the split.
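A simplified sketch of this recursion in Python is given below; choose_split is a hypothetical helper that picks a feature test and partitions the records, so this illustrates the idea rather than a complete implementation:

def hunts(records, labels, choose_split):
    """Simplified sketch of Hunt's Algorithm.

    records      : list of feature dicts
    labels       : list of class labels, one per record
    choose_split : hypothetical helper that picks a feature test and
                   partitions the records into subsets
    """
    # Step 1: all records belong to the same class -> label it a leaf node
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}

    # Step 2: classes are mixed -> select a feature test to split on
    test, subsets = choose_split(records, labels)

    # Step 3: recur for each subset created by the split
    return {
        "test": test,
        "children": [hunts(r, l, choose_split) for r, l in subsets],
    }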
5. Attribute Conditions in Decision Trees
Attributes (or features) can be either categorical or numerical. The conditions to split data differ
based on the attribute type:
• For Categorical Features: The condition checks for specific category membership.
o Example: If color == 'red'.
• For Numerical Features: The condition checks for a threshold value.
o Example: If age > 30.
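To illustrate the two condition types, here is a small sketch using pandas boolean masks on made-up data:

import pandas as pd

# Made-up data purely for illustration
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green"],
    "age":   [25, 42, 31, 19],
})

# Categorical feature: test for membership in a specific category
left  = df[df["color"] == "red"]   # records satisfying the condition
right = df[df["color"] != "red"]   # the remaining records

# Numerical feature: test against a threshold value
older   = df[df["age"] > 30]
younger = df[df["age"] <= 30]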
6. Best Split and Decision Tree Algorithm
The best split is the one that maximizes class separation (reduces impurity the most). Commonly
used metrics to find the best split are:
• Gini Index: Measures how impure (mixed) a subset is; lower values mean purer subsets.
• Entropy: Measures the amount of uncertainty in the subset.
• Information Gain: The reduction in entropy after the split.
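As a sketch, information gain for a candidate split can be computed from the entropy function shown earlier (repeated here so the snippet is self-contained); the labels are invented for illustration:

import math

def entropy(labels):
    n = len(labels)
    props = [labels.count(c) / n for c in set(labels)]
    return -sum(p * math.log2(p) for p in props)

def information_gain(parent_labels, child_label_lists):
    """Information gain = entropy(parent) - weighted average entropy of the children."""
    n = len(parent_labels)
    weighted = sum(len(c) / n * entropy(c) for c in child_label_lists)
    return entropy(parent_labels) - weighted

parent   = ["Yes", "Yes", "No", "No"]
children = [["Yes", "Yes"], ["No", "No"]]   # a perfect split
print(information_gain(parent, children))   # 1.0: entropy drops from 1.0 to 0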
7. Characteristics of Decision Tree Induction
• Easy to interpret and visualize.
• No need for data scaling.
• Prone to overfitting (pruning may be required).
• Handles categorical and numerical data.
8. Using Decision Trees in Python (Scikit-Learn)
We will use Python’s Scikit-learn library to build a decision tree.
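A minimal sketch of this workflow, reusing the made-up student data from Section 1 (the values are invented, not lab data):

from sklearn.tree import DecisionTreeClassifier

# Tiny made-up training set: columns are [study hours, attendance (%)]
X = [[2.0, 60], [8.5, 95], [5.0, 80], [1.0, 40], [7.0, 90], [3.0, 70]]
y = ["Fail", "Pass", "Pass", "Fail", "Pass", "Fail"]

# Fit a shallow tree using Gini impurity as the split criterion
clf = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
clf.fit(X, y)

print(clf.predict([[6.0, 85]]))  # predict the class of a new student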
Activity 2: Loan Approval Prediction
Problem: Predict whether a loan application will be approved based on various features
like income, loan amount, credit history, and marital status.
Dataset Features:
• Applicant Income
• Loan Amount
• Credit History (0: Bad, 1: Good)
• Marital Status (Single, Married)
• Loan Status (Approved or Not Approved - Target Variable)
Importing Libraries and Dataset:
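One plausible way to import the libraries and construct a small in-memory stand-in for this dataset is sketched below; the column names and values are assumptions for illustration, not the lab's actual data file:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical in-memory stand-in for the loan approval dataset
data = pd.DataFrame({
    "ApplicantIncome": [4500, 3000, 6000, 2500, 8000, 3500],
    "LoanAmount":      [120, 100, 200, 90, 250, 110],
    "CreditHistory":   [1, 0, 1, 0, 1, 1],            # 0: Bad, 1: Good
    "MaritalStatus":   ["Married", "Single", "Married",
                        "Single", "Married", "Single"],
    "LoanStatus":      ["Approved", "Not Approved", "Approved",
                        "Not Approved", "Approved", "Approved"],
})

# Encode the categorical feature and separate features from the target
X = pd.get_dummies(data.drop(columns="LoanStatus"), columns=["MaritalStatus"])
y = data["LoanStatus"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)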
Graded Task: Implement a Decision Tree Classifier using Scikit-Learn
Objective: In this task, you will implement a decision tree classifier using the scikit-learn library
in Python. You will work with the Iris dataset, which is a well-known dataset for classification
tasks. The goal is to train and evaluate your model, visualize the decision tree, and report the
model's accuracy.
Task Steps:
• Test the classifier on the test dataset and calculate the accuracy.
• Print the accuracy of the model.
• Experiment with different values for max_depth and min_samples_split to see how they
impact the model's accuracy and structure.
• Comment on the effects of these parameters.
• Provide an interpretation of your results.
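One possible starting skeleton for this task is sketched below; the parameter values (max_depth=3, min_samples_split=2) are only starting points for your own experiments:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Load the Iris dataset and split it into training and test sets
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42)

# Train a decision tree; vary max_depth and min_samples_split and compare
clf = DecisionTreeClassifier(max_depth=3, min_samples_split=2, random_state=42)
clf.fit(X_train, y_train)

# Test the classifier on the test set and print its accuracy
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))

# Visualize the fitted tree
plot_tree(clf, feature_names=iris.feature_names,
          class_names=iris.target_names, filled=True)
plt.show()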