Machine Learning
GridSearchCV
Prashant Sundge
LinkedIn GitHub
If you found this kernel helpful, please consider giving it an upvote! Your support motivates me to create
more valuable content.
I'd also love to hear your feedback and comments. Let me know if you have any questions, suggestions,
or insights. Your input is highly appreciated!
Happy coding!
Table of Contents
1. Introduction
2. Definition of a Decision Tree
3. Function of a Decision Tree
4. Import Libraries
5. Read Dataset
6. IF ELSE Representation of Decision Tree
7. Tree Representation of Dataset
8. Decision Tree Basics
9. Decision Tree Terminologies
10. Decision Tree Terminology Naming Conventions
11. Example of Decision Tree
12. ROOT NODE IF ELSE NODE
13. Building the Tree
14. Entropy
15. Entropy and Gini Impurity Ranges
16. Information Gain
17. Iris Dataset With Plain Decision Tree
18. Tree Function Definitions
19. Model Evaluations
20. Confusion Matrix Function Definition
21. Cancer Dataset with Entropy in Decision Tree
22. Accuracy and Model Evaluations
23. Gini Impurity
24. Gini Impurity Formula
25. Hyperparameters and GridSearchCV
26. What are Hyperparameters
27. The Power of GridSearchCV
28. Putting It All Together
29. Play_tennis Dataset with Gini Impurity and Grid Search
30. Label Encoder
31. What Are Regression Trees?
32. Mean Square Error
33. Building a Regression Tree
34. Step 1: Initial Split and Calculation of Predicted Outputs and Mean Square Error
35. Step 2: Repeated Split and Mean Square Error Calculation
36. Step 3: Choosing the Split Point
37. Regression Dataset Model Predictions
38. Regression Tree plotted
39. Regression confusion matrix plotted
40. What happens when there are multiple independent variables?
41. Reference
Introduction
Definition of a Decision Tree:
A Decision Tree is a hierarchical and tree-like structure used in machine learning for both classification and
regression tasks. It systematically divides data into subsets based on the values of input features, ultimately
leading to decisions or predictions. It consists of nodes, branches, and leaves, where nodes represent
feature attributes, branches represent decision rules, and leaves represent the final outcomes or predictions.
In classification, a Decision Tree helps classify data into different categories or classes, while in regression, it
predicts numerical values based on the input features. The simplicity and interpretability of Decision Trees
make them valuable tools in machine learning, allowing users to understand and visualize decision-making
processes.
Import Libraries
In [1]: import pandas as pd
Read Dataset
In [2]: example=pd.read_excel("Example1.xlsx")
In [3]: example
0  F  Student     ENGINEER
1  F  Programmer  JAVA
2  M  Programmer  PYTHON
3  F  Programmer  JAVA
4  M  Student     ENGINEER
5  M  Student     ENGINEER
• Root Node: The initial node at the beginning of a decision tree, where the entire population or dataset
starts dividing based on various features or conditions.
• Decision Nodes: Nodes resulting from the splitting of root nodes are known as decision nodes. These
nodes represent intermediate decisions or conditions within the tree.
• Leaf Nodes: Nodes where further splitting is not possible, often indicating the final classification or
outcome. Leaf nodes are also referred to as terminal nodes.
• Sub-Tree: Similar to a subsection of a graph being called a sub-graph, a sub-section of a decision tree
is referred to as a sub-tree. It represents a specific portion of the decision tree.
• Pruning: The process of removing or cutting down specific nodes in a decision tree to prevent
overfitting and simplify the model.
• Branch / Sub-Tree: A subsection of the entire decision tree is referred to as a branch or sub-tree. It
represents a specific path of decisions and outcomes within the tree.
• Parent and Child Node: In a decision tree, a node that is divided into sub-nodes is known as a parent
node, and the sub-nodes emerging from it are referred to as child nodes. The parent node represents a
decision or condition, while the child nodes represent the potential outcomes or further decisions
based on that condition.
Decision Tree Terminology Naming Conventions
In [7]: play_tennis
• If Outlook is Sunny:
▪ Subnode: Humidity
◦ If Humidity is High: Don't Play Tennis
◦ If Humidity is Normal: Play Tennis
• If Outlook is Overcast: Play Tennis
• If Outlook is Rainy:
▪ Subnode: Wind
◦ If Wind is Weak: Play Tennis
◦ If Wind is Strong: Don't Play Tennis
A decision tree is a machine learning algorithm that makes decisions based on the values of attributes
(features) in a dataset. It works as follows:
Splitting: To create child nodes, the algorithm selects an attribute and splits the data based on its values.
The attribute selection is based on criteria like Gini Impurity and Entropy.
Entropy:
• Entropy measures the average information content in a dataset. For a node, it's calculated as:

$$Entropy(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$$

where $Entropy(S)$ is the entropy of node $S$, $c$ is the number of classes, and $p_i$ is the probability of a randomly chosen data point belonging to class $i$.
Entropy is minimized when all data points in the node belong to a single class, and it is a measure of the
disorder in the data.
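A minimal Python sketch of this calculation (illustrative, not the notebook's own code):

import numpy as np
import pandas as pd

def entropy(labels):
    # Entropy(S) = -sum over classes of p_i * log2(p_i)
    p = pd.Series(labels).value_counts(normalize=True)
    return float(-(p * np.log2(p)).sum())

print(entropy(['yes'] * 9 + ['no'] * 5))   # ≈ 0.94 for a 9-positive / 5-negative node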
In [8]: play_tennis
Example
• To illustrate the equation, we will do an example that calculates the entropy of our dataset
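The worked calculation itself is not shown in this extract; for the play-tennis data, which has 9 positive and 5 negative instances (see the information-gain section below), it would be:

$$Entropy(S) = -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \approx 0.94$$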
When a dataset is completely homogeneous (all data points belong to a single class), it has zero impurity,
and the entropy is zero (Equation 1.4).
In contrast, if the dataset can be equally divided into two classes, it is entirely non-homogeneous, resulting
in maximum impurity of 100%, and the entropy is one (Equation 1.3).
Impurity and entropy are measures used to quantify the level of disorder or uncertainty in a dataset, with
higher values indicating greater impurity and uncertainty, and lower values indicating greater homogeneity
and certainty.
Information Gain:
• Information Gain is a measure used to assess the effectiveness of an attribute in classifying a training
dataset. It quantifies the expected reduction in entropy achieved by partitioning the dataset based on
this attribute.
• Information Gain, denoted as Gain(S, A), is a function of an attribute A relative to a collection of data S:

$$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|}\, Entropy(S_v)$$

Where:
• $Values(A)$ is the set of all possible values of attribute $A$.
• $S_v$ is the subset of $S$ for which attribute $A$ takes the value $v$.
• The Information Gain measures how much uncertainty or impurity is reduced when you split the
dataset based on attribute A. A higher Information Gain indicates that attribute A is more effective in
making distinctions within the dataset.
• Decision tree algorithms use Information Gain (or similar criteria) to determine the best attribute for
splitting the data at each node, aiming to create a tree structure that maximizes the reduction in
impurity as it grows.
To make this clearer, let's use this equation to measure the information gain of the attribute Wind from
the dataset of Figure 1. The dataset has 14 instances, so the sample space is 14, where the sample has 9
positive and 5 negative instances. The attribute Wind can take the values Weak or Strong. Therefore,
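assuming Figure 1 is the classic play-tennis table (Wind = Weak on 8 days with 6 positives, and Wind = Strong on 6 days with 3 positives), the calculation would be:

$$Entropy(S_{Weak}) = -\tfrac{6}{8}\log_2\tfrac{6}{8} - \tfrac{2}{8}\log_2\tfrac{2}{8} \approx 0.811,\qquad Entropy(S_{Strong}) = 1.0$$

$$Gain(S, Wind) = 0.940 - \tfrac{8}{14}(0.811) - \tfrac{6}{14}(1.0) \approx 0.048$$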
These two calculations should make it clear how we can compute information gain. The information
gain of the 4 attributes of the Figure 1 dataset are:
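$$Gain(S, Outlook) \approx 0.246,\quad Gain(S, Humidity) \approx 0.151,\quad Gain(S, Wind) \approx 0.048,\quad Gain(S, Temperature) \approx 0.029$$

(These figures assume Figure 1 is the classic play-tennis table, consistent with the 9/5 split quoted above.)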
It's crucial to keep in mind that the primary objective of assessing information gain is to pinpoint the
attribute that is most valuable for classifying the training set. Our ID3 algorithm will employ this selected
attribute as the root from which to construct the decision tree. Subsequently, it will once more compute
information gain to determine the attribute for the next node.
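A minimal Python sketch of the information-gain computation (the entropy helper is repeated so the snippet is self-contained; the column names are assumptions):

import numpy as np
import pandas as pd

def entropy(labels):
    # Entropy over the class proportions of the target column
    p = pd.Series(labels).value_counts(normalize=True)
    return float(-(p * np.log2(p)).sum())

def information_gain(df, attribute, target='play'):
    # Gain(S, A) = Entropy(S) - sum over values v of |S_v|/|S| * Entropy(S_v)
    total = entropy(df[target])
    weighted = sum(
        len(subset) / len(df) * entropy(subset[target])
        for _, subset in df.groupby(attribute)
    )
    return total - weighted

# e.g. information_gain(play_tennis, 'outlook') should be the largest of the four attributes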
Based on our calculations, it is evident that the attribute providing the most substantial information gain is
"Outlook." This attribute will serve as the foundational root of our decision tree.
In Figure 3, we present a visual representation of the decision tree constructed during the initial stage of the
ID3 algorithm. Here's a breakdown of the process:
• The training examples are effectively sorted into their respective descendant nodes within the tree
structure.
• One of the descendant nodes, labeled as "Overcast," contains only positive instances and, as a result, is
transformed into a leaf node with the classification "Yes."
• For the remaining two nodes, a critical question emerges: Which attribute should be chosen for further
testing? To address this, we extend these nodes by selecting attributes that offer the highest
information gain concerning the new subset of examples.
• The subsequent step involves identifying the attribute that is most suitable for testing within the
"Sunny" descendant node.
The Dataset in Figure 1 has the value Sunny on Day1, Day2, Day8, Day9, Day11. So the Sample Space S=5
here.
We can now measure the information gain of Temperature and Wind by following the same way we
measured Gain(S, Humidity). Finally, we will get:
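$$Gain(S_{sunny}, Humidity) \approx 0.970,\quad Gain(S_{sunny}, Temperature) \approx 0.570,\quad Gain(S_{sunny}, Wind) \approx 0.019$$

(Again assuming the classic play-tennis table behind Figure 1.)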
At this stage, Humidity emerges as the attribute that yields the highest information gain. Therefore, in the
"Sunny" descendant node following "Outlook," the attribute chosen is "Humidity."
• The "High" descendant node exclusively contains negative examples, while the "Normal" descendant
node exclusively contains positive examples. Consequently, both of these nodes transition into leaf
nodes and cannot be expanded further.
• If we apply the same procedure to extend the "Rain" descendant, we find that the attribute "Wind"
provides the most information. I'll leave this part for readers to perform the calculations themselves.
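The cells that load the iris data and create the train/test split are not shown in this extract; a minimal sketch of the assumed setup (test_size and random_state are assumptions):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()   # Bunch object with .data (features) and .target (class labels)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)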
In [11]: x=iris.data
y=iris.target
In [12]: y
Out[12]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
In [14]: clf=DecisionTreeClassifier()
clf.fit(X_train, y_train)
Out[14]: DecisionTreeClassifier()
In [15]: y_pred=clf.predict(X_test)
Accuracy: 1.00
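The cell that printed this accuracy is not shown; it presumably resembled:

from sklearn.metrics import accuracy_score
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")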
Tree Function Definitions
In [17]: import matplotlib.pyplot as plt
         from sklearn.tree import plot_tree

         def display_tree(model):
             # Create a figure with a larger size and set the background color
             plt.figure(figsize=(15, 10))
             plt.rcParams['axes.facecolor'] = 'lightgray'
             # Render the fitted tree; these keyword arguments come from the original cell
             plot_tree(
                 model,
                 rounded=True,
                 proportion=True,
                 precision=2,
                 fontsize=12,
             )
             plt.show()
In [18]: display_tree(clf)
Model Evaluations
In [19]: from sklearn.metrics import confusion_matrix,classification_report, accuracy_score, ConfusionMatrixDisplay
accuracy 1.00 30
macro avg 1.00 1.00 1.00 30
weighted avg 1.00 1.00 1.00 30
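The confusion_matrix_fun helper called later for the cancer model is not shown in this extract; a minimal sketch consistent with the imports above:

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

def confusion_matrix_fun(y_true, y_pred):
    # Compute the confusion matrix and draw it as a heatmap
    cm = confusion_matrix(y_true, y_pred)
    ConfusionMatrixDisplay(confusion_matrix=cm).plot(cmap='Blues')
    plt.show()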
load_breast_cancer
load_diabetes
load_digits
load_files
load_iris
load_linnerud
load_sample_image
load_sample_images
load_svmlight_file
load_svmlight_files
load_wine
In [24]: cancer=load_breast_cancer()
In [25]: x=cancer.data
y=cancer.target
In [26]: y
Out[26]: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0,
1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0,
1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1,
1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0,
0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1,
1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0,
0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0,
1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1,
1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0,
0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0,
0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0,
1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1,
1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1,
1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0,
1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,
1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1,
1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1])
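The cells that split the cancer data and fit the entropy-criterion tree are not shown; an assumed sketch (the split parameters are assumptions):

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
cancer_model = DecisionTreeClassifier(criterion='entropy')   # matches the Out[29] repr that follows
cancer_model.fit(X_train, y_train)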
Out[29]: DecisionTreeClassifier(criterion='entropy')
In [30]: y_cancer_predict=cancer_model.predict(X_test)
Accuracy: 0.97
In [32]: display_tree(cancer_model)
In [33]: confusion_matrix_fun(y_test, y_cancer_predict)
Gini Impurity
Que- How can we determine the optimal feature for partitioning the dataset and what criteria should we
use to evaluate the quality of these partitions when constructing a decision tree?
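The formula itself is not visible in this extract; for a node whose classes occur with probabilities $p_i$, Gini Impurity is defined as:

$$Gini(S) = 1 - \sum_{i=1}^{c} p_i^2$$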
The next step is to calculate the Gini Impurity for the 4 features (outlook, temp, humidity, windy), and
decide which feature will be the root node.
Let's calculate the Gini Impurity for Outlook. As you may notice, the outlook feature is a categorical variable
with three possible values (sunny, overcast, and rainy).
When outlook = sunny the split is (2 yes / 3 no), for outlook = overcast it is (4 yes / 0 no), and finally for
outlook = rainy it is (3 yes / 2 no).
We'll calculate the Gini Impurity of outlook by weighting the impurity of each branch by how many
elements it has.
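Working through this with the counts above (5 sunny, 4 overcast, and 5 rainy rows out of 14):

$$Gini(sunny) = 1 - \left(\tfrac{2}{5}\right)^2 - \left(\tfrac{3}{5}\right)^2 = 0.48,\quad Gini(overcast) = 0,\quad Gini(rainy) = 0.48$$

$$Gini(Outlook) = \tfrac{5}{14}(0.48) + \tfrac{4}{14}(0) + \tfrac{5}{14}(0.48) \approx 0.343$$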
Congratulations! You have just calculated the Gini Impurity for the first feature. The next step is to calculate
the Gini Gain, which is obtained by subtracting the weighted impurities of the branches from the original impurity.
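Using the dataset's 9-yes / 5-no class split:

$$Gini(S) = 1 - \left(\tfrac{9}{14}\right)^2 - \left(\tfrac{5}{14}\right)^2 \approx 0.459,\qquad GiniGain(S, Outlook) \approx 0.459 - 0.343 = 0.116$$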
The best split is chosen by maximizing the Gini Gain or by minimizing the Gini Impurity.
In our example, Outlook has the minimum Gini Impurity value and the maximum Gini Gain value, so it
will be chosen as the root decision to split our data.
• Criterion: Think of this as the decision-making principle for your model. It can be either "gini" or
"entropy," determining how your model chooses which questions to ask during training.
• Splitter: This hyperparameter is all about how the model makes choices. It can "split" by selecting the
"best" feature or do it "randomly." Like flipping a coin to decide.
• Max Depth: Imagine this as a tree in your backyard. The "max depth" is like deciding how tall this tree
can grow. You can set it to a number (like 10), or you can let it grow as tall as it wants (None).
• Min Samples Split: This hyperparameter tells the model how many samples need to be at a branch
before it splits. It's like saying, "Hey, only split if there are at least 5 apples on this branch."
• Min Samples Leaf: Now, think of this as a rule for when to stop growing a branch. It tells the model
not to make a new branch if there are fewer than a certain number of samples left.
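A short sketch showing these hyperparameters on a DecisionTreeClassifier (the values are illustrative, not recommendations):

from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(
    criterion='gini',        # or 'entropy'
    splitter='best',         # or 'random'
    max_depth=10,            # None lets the tree grow fully
    min_samples_split=5,     # a node needs at least 5 samples before it can split
    min_samples_leaf=2,      # every leaf must keep at least 2 samples
)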
• Grid Search: It's like having a map of the entire haystack, marking specific spots where you think the
needle might be. In our case, these spots are different combinations of hyperparameter values.
• Random Search: Imagine instead of a map, you randomly drop pins into the haystack. Grid Search
checks every marked spot, while Random Search explores some of them, hoping to find the needle
faster.
• Manual Search: Sometimes, you don't need a map; you know the haystack well. You manually choose
where to look for the needle. This is like setting hyperparameters based on your intuition.
• Bayesian Optimization: This is like having a detective that learns from previous attempts. It doesn't
waste time revisiting spots where the needle isn't. It adapts and focuses on promising areas.
• Genetic Algorithms: Think of this as evolution. It starts with a population of possibilities and creates
new ones by mixing and mutating them. Over time, it gets closer to finding the best hyperparameters.
You define a grid of hyperparameters, setting the values you want to explore.
• You define a grid of hyperparameters, setting the values you want to explore.
• GridSearchCV, or one of the other methods, goes through each combination of hyperparameters and
trains your model with them.
• It evaluates the model's performance using cross-validation and selects the best set of
hyperparameters based on an evaluation metric like accuracy or error.
• Finally, you train your model using the best hyperparameters, making it perform at its peak on your
specific data.
Hyperparameter tuning is like finding the best settings for your machine learning model. It's a bit like
tuning a musical instrument - finding just the right notes to play. With techniques like GridSearchCV, you
can make your models sing beautifully on your data, and that's what makes you a powerful data scientist.
Remember, finding the right hyperparameters is not a one-time thing. It's an iterative process that requires
experimentation and fine-tuning. So, keep exploring, keep learning, and keep improving your models to
achieve the best results. Happy modeling!
Label Encoder
In [36]: from sklearn.preprocessing import LabelEncoder
In [37]: le=LabelEncoder()
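The cell that applies the encoder to the tennis columns is not shown; a plausible sketch (the source frame and column names are assumptions):

# Assumed: tennis is a copy of the play_tennis frame with its categorical columns encoded
tennis = play_tennis.copy()
for col in ['outlook', 'temp', 'humidity', 'windy', 'play']:
    tennis[col] = le.fit_transform(tennis[col])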
In [39]: tennis
    outlook  temp  humidity  windy  play
0         2     1         0      1     0
1         2     1         0      0     0
2         0     1         0      1     1
3         1     2         0      1     1
4         1     0         1      1     1
5         1     0         1      0     0
6         0     0         1      0     1
7         2     2         0      1     0
8         2     0         1      1     1
9         1     2         1      1     1
10        2     2         1      0     1
11        0     0         1      0     1
12        2     2         1      0     1
13        1     2         0      0     0
In [40]: y=tennis['play']
Out[42]: 0 0
1 0
2 1
3 1
4 1
5 0
6 1
7 0
8 1
9 1
10 1
11 1
12 1
13 0
Name: play, dtype: int32
In [43]: x
    outlook  temp  humidity  windy
0         2     1         0      1
1         2     1         0      0
2         0     1         0      1
3         1     2         0      1
4         1     0         1      1
5         1     0         1      0
6         0     0         1      0
7         2     2         0      1
8         2     0         1      1
9         1     2         1      1
10        2     2         1      0
11        0     0         1      0
12        2     2         1      0
13        1     2         0      0
In [47]: tennis_model=DecisionTreeClassifier()
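The cells defining param_grid and the train/test split for the tennis data are not shown; a plausible sketch (the grid values and split parameters are assumptions):

from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical grid; the notebook's actual values are not visible
param_grid = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [None, 2, 4, 6],
    'min_samples_split': [2, 5, 10],
}
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)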
In [48]: # Use GridSearchCV to find the best hyperparameters
grid_search = GridSearchCV(tennis_model, param_grid, cv=5)
grid_search.fit(x, y)
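The best_estimator used in the next cell would be taken from the fitted search, for example:

best_estimator = grid_search.best_estimator_   # tree refit with the best hyperparameters found
print(grid_search.best_params_)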
In [49]: tennis_pred_y=best_estimator.predict(X_test)
Accuracy: 0.97
In [51]: display_tree(best_estimator)
accuracy 1.00 3
macro avg 1.00 1.00 1.00 3
weighted avg 1.00 1.00 1.00 3
What Are Regression Trees?
A regression tree is a machine learning model that is used for regression tasks, where the goal is to predict
continuous, numerical values (outputs) rather than discrete categories. It functions similarly to a decision
tree, but instead of making categorical decisions, it makes splits and decisions to estimate and predict
numeric outcomes.
• A classification tree chooses its splits using measures like Entropy and Information Gain. However, when
we're predicting continuous values, we can't use the same approach.
• We need a different way to measure how much our predictions differ from the actual target, and that's
where the Mean Square Error comes in.
It helps us understand how far off our predictions are from the real values we want to predict.
Source: https://medium.com/analytics-vidhya/regression-trees-decision-tree-for-regression-machine-learning-e4d7525d8047
Step 1: Initial Split and Calculation of Predicted Outputs and Mean Square
Error
• Sort the data based on X (already sorted in this case).
• Calculate the average of the first 2 rows in variable X, which is (1+2)/2 = 1.5 according to the given
dataset.
• Divide the dataset into two parts (Part A and Part B) based on the condition: X < 1.5 and X ≥ 1.5.
• Part A consists of only one point, which is the first row (1,1), and all the other points are in Part B.
• Calculate the average of all Y values in Part A and Part B separately. These two values are the predicted
output of the decision tree for X < 1.5 and X ≥ 1.5, respectively.
• Using the predicted and original values, calculate the Mean Square Error (MSE) and note it down.
This process creates a decision tree for regression by iteratively finding the best split points based on the
lowest Mean Square Error, which helps in making accurate predictions for continuous variables.
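A minimal sketch of this split-and-score loop on a toy 1-D dataset (the values are illustrative, not the notebook's df_reg):

import numpy as np

# Toy data, already sorted by X
X = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1.0, 1.2, 3.5, 3.7, 4.0])

def split_mse(threshold):
    # Predict the mean of each side of the split and score the whole set with MSE
    left, right = y[X < threshold], y[X >= threshold]
    pred = np.where(X < threshold, left.mean(), right.mean())
    return float(np.mean((y - pred) ** 2))

# Candidate thresholds are the midpoints between consecutive X values
thresholds = (X[:-1] + X[1:]) / 2
best = min(thresholds, key=split_mse)
print(best, split_mse(best))   # the split with the lowest MSE becomes the root decision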
In [57]: x=df_reg
Out[60]: DecisionTreeClassifier()
In [61]: y_pred_reg=reg_model.predict(X_test)
Accuracy: 0.67
accuracy 0.67 3
macro avg 0.50 0.50 0.50 3
weighted avg 0.67 0.67 0.67 3
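Note that the cells above fit a DecisionTreeClassifier and report classification metrics on the regression data. A sketch of the regression counterpart, assuming the same kind of train/test split (the variable names are assumptions):

from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

reg_tree = DecisionTreeRegressor()   # split points are chosen by minimising squared error
reg_tree.fit(X_train, y_train)
print(mean_squared_error(y_test, reg_tree.predict(X_test)))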
The logic behind the algorithm itself is not rocket science. All we are doing is splitting the dataset by
selecting the points that best split the data and minimise the mean square error. The way we select these
points is by going through an iterative process of calculating the mean square error for all the splits and
choosing the split that has the least value for the MSE. So it's only natural that this works.
What happens when there are multiple independent variables?
• Let us consider that there are 3 variables similar to the independent variable X from fig 2.2.
• At each node, all 3 variables would go through the same process that X went through in the above
example. The data would be sorted based on the 3 variables separately.
• The split points that minimise the MSE are calculated for all 3 variables. Out of the 3 variables and the
points calculated for them, the one that yields the least MSE would be chosen.
References
www.analyticsvidhya.com
https://medium.com/@jairiidriss
https://scikit-learn.org
https://medium.com/analytics-vidhya/regression-trees-decision-tree-for-regression-machine-learning-e4d7525d8047