

Fundamentals of Machine Learning for Predictive Data Analytics
Chapter 4: Information-based Learning
Sections 4.4, 4.5

John Kelleher and Brian Mac Namee and Aoife D’Arcy




1 Alternative Feature Selection Metrics

2 Handling Continuous Descriptive Features

3 Predicting Continuous Targets

4 Noisy Data, Overfitting and Tree Pruning

5 Model Ensembles
Boosting
Bagging

6 Summary

Alternative Feature Selection Metrics

Entropy-based information gain prefers features with many values.
One way of addressing this issue is to use the information gain ratio, which is computed by dividing the information gain of a feature by the amount of information used to determine the value of the feature:
$$GR(d, D) = \frac{IG(d, D)}{-\sum_{l \in levels(d)} P(d = l) \times \log_2\big(P(d = l)\big)} \qquad (1)$$

IG(STREAM, D) = 0.3060
IG(SLOPE, D) = 0.5774
IG(ELEVATION, D) = 0.8774

$$H(STREAM, D) = -\sum_{l \in \{\text{'true'}, \text{'false'}\}} P(STREAM = l) \times \log_2\big(P(STREAM = l)\big) = -\Big(\tfrac{4}{7}\log_2\tfrac{4}{7} + \tfrac{3}{7}\log_2\tfrac{3}{7}\Big) = 0.9852 \text{ bits}$$

$$H(SLOPE, D) = -\sum_{l \in \{\text{'flat'}, \text{'moderate'}, \text{'steep'}\}} P(SLOPE = l) \times \log_2\big(P(SLOPE = l)\big) = -\Big(\tfrac{1}{7}\log_2\tfrac{1}{7} + \tfrac{1}{7}\log_2\tfrac{1}{7} + \tfrac{5}{7}\log_2\tfrac{5}{7}\Big) = 1.1488 \text{ bits}$$

$$H(ELEVATION, D) = -\sum_{l \in \{\text{'low'}, \text{'medium'}, \text{'high'}, \text{'highest'}\}} P(ELEVATION = l) \times \log_2\big(P(ELEVATION = l)\big) = -\Big(\tfrac{1}{7}\log_2\tfrac{1}{7} + \tfrac{2}{7}\log_2\tfrac{2}{7} + \tfrac{3}{7}\log_2\tfrac{3}{7} + \tfrac{1}{7}\log_2\tfrac{1}{7}\Big) = 1.8424 \text{ bits}$$

$$GR(STREAM, D) = \frac{0.3060}{0.9852} = 0.3106$$

$$GR(SLOPE, D) = \frac{0.5774}{1.1488} = 0.5026$$

$$GR(ELEVATION, D) = \frac{0.8774}{1.8424} = 0.4762$$
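
To make the computation concrete, here is a short Python sketch (illustrative, not from the slides) that reproduces these information gain and gain ratio values; the per-instance levels of the categorical ELEVATION feature are taken from the partition table later in this section.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Vegetation dataset d1..d7: (STREAM, SLOPE, ELEVATION, VEGETATION)
data = [
    ("false", "steep",    "high",    "chapparal"),   # d1
    ("true",  "moderate", "low",     "riparian"),    # d2
    ("true",  "steep",    "medium",  "riparian"),    # d3
    ("false", "steep",    "medium",  "chapparal"),   # d4
    ("false", "flat",     "high",    "conifer"),     # d5
    ("true",  "steep",    "highest", "conifer"),     # d6
    ("true",  "steep",    "high",    "chapparal"),   # d7
]
target = [row[3] for row in data]

for name, idx in [("STREAM", 0), ("SLOPE", 1), ("ELEVATION", 2)]:
    values = [row[idx] for row in data]
    # remainder: weighted entropy of the target within each partition
    rem = sum((values.count(v) / len(data))
              * entropy([t for val, t in zip(values, target) if val == v])
              for v in set(values))
    ig = entropy(target) - rem    # information gain
    gr = ig / entropy(values)     # gain ratio: divide by the feature's own entropy
    print(f"{name}: IG = {ig:.4f}, GR = {gr:.4f}")
# STREAM:    IG = 0.3060, GR = 0.3105  (slides show 0.3106, computed from rounded inputs)
# SLOPE:     IG = 0.5774, GR = 0.5026
# ELEVATION: IG = 0.8774, GR = 0.4762
```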

Figure: The vegetation classification decision tree generated using information gain ratio.
  Slope = 'flat' → conifer
  Slope = 'moderate' → riparian
  Slope = 'steep' → Elevation:
    Elevation = 'low' → chaparral
    Elevation = 'medium' → Stream: 'true' → riparian; 'false' → chaparral
    Elevation = 'high' → chaparral
    Elevation = 'highest' → conifer


Another commonly used measure of impurity is the Gini index:

$$Gini(t, D) = 1 - \sum_{l \in levels(t)} P(t = l)^2 \qquad (2)$$

The Gini index can be thought of as calculating how often you would misclassify an instance in the dataset if you classified it based on the distribution of classifications in the dataset.
Information gain can be calculated using the Gini index by replacing the entropy measure with the Gini index.

$$Gini(VEGETATION, D) = 1 - \sum_{l \in \{\text{'chapparal'}, \text{'riparian'}, \text{'conifer'}\}} P(VEGETATION = l)^2 = 1 - \Big(\big(\tfrac{3}{7}\big)^2 + \big(\tfrac{2}{7}\big)^2 + \big(\tfrac{2}{7}\big)^2\Big) = 0.6531$$
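
The same style of sketch can be used for the Gini index; the per-instance feature values below match the partition table that follows, and the printed values reproduce the Gini-based gains in that table (illustrative code, not from the slides).

```python
from collections import Counter

def gini(labels):
    """Gini index: 1 minus the sum of squared class probabilities."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

# Per-instance feature values for d1..d7, matching the partition table below
stream    = ["false", "true", "true", "false", "false", "true", "true"]
slope     = ["steep", "moderate", "steep", "steep", "flat", "steep", "steep"]
elevation = ["high", "low", "medium", "medium", "high", "highest", "high"]
target    = ["chapparal", "riparian", "riparian",
             "chapparal", "conifer", "conifer", "chapparal"]

def gini_gain(feature, target):
    """Information gain computed with the Gini index in place of entropy."""
    n = len(target)
    rem = sum((feature.count(v) / n)
              * gini([t for f, t in zip(feature, target) if f == v])
              for v in set(feature))
    return gini(target) - rem

print(round(gini(target), 4))   # 0.6531
for name, col in [("STREAM", stream), ("SLOPE", slope), ("ELEVATION", elevation)]:
    print(name, round(gini_gain(col, target), 4))
# STREAM 0.1054, SLOPE 0.2531, ELEVATION 0.3197 (0.3198 in the table, from rounded inputs)
```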

Table: Partition sets (Part.), Gini index, remainder (Rem.), and information gain (Info. Gain) by feature.

Split by Feature | Level      | Part. | Instances          | Gini Index | Rem.   | Info. Gain
STREAM           | 'true'     | D1    | d2, d3, d6, d7     | 0.625      | 0.5476 | 0.1054
                 | 'false'    | D2    | d1, d4, d5         | 0.4444     |        |
SLOPE            | 'flat'     | D3    | d5                 | 0          | 0.4    | 0.2531
                 | 'moderate' | D4    | d2                 | 0          |        |
                 | 'steep'    | D5    | d1, d3, d4, d6, d7 | 0.56       |        |
ELEVATION        | 'low'      | D6    | d2                 | 0          | 0.3333 | 0.3198
                 | 'medium'   | D7    | d3, d4             | 0.5        |        |
                 | 'high'     | D8    | d1, d5, d7         | 0.4444     |        |
                 | 'highest'  | D9    | d6                 | 0          |        |

Handling Continuous Descriptive Features

The easiest way to handle continuous-valued descriptive features is to turn them into Boolean features by defining a threshold and using this threshold to partition the instances based on their value of the continuous descriptive feature.
How do we set the threshold?

1 The instances in the dataset are sorted according to the continuous feature values.
2 The adjacent instances in the ordering that have different classifications are then selected as possible threshold points.
3 The optimal threshold is found by computing the information gain for each of these classification transition boundaries and selecting the boundary with the highest information gain as the threshold (see the sketch below).
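
A minimal Python sketch of this three-step procedure for the continuous ELEVATION feature used in the following slides. It assumes the candidate thresholds are the midpoints between adjacent class-transition instances, which is what the candidate values ≥750, ≥1,350, ≥2,250 and ≥4,175 shown later correspond to.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# (ELEVATION, VEGETATION) pairs from the vegetation dataset with continuous elevation
rows = [(3900, "chapparal"), (300, "riparian"), (1500, "riparian"),
        (1200, "chapparal"), (4450, "conifer"), (5000, "conifer"),
        (3000, "chapparal")]
rows.sort()                     # 1. sort by the continuous feature
target = [t for _, t in rows]

# 2. candidate thresholds: midpoints between adjacent instances whose class changes
thresholds = [(rows[i][0] + rows[i + 1][0]) / 2
              for i in range(len(rows) - 1)
              if rows[i][1] != rows[i + 1][1]]

# 3. pick the threshold with the highest information gain
def info_gain(threshold):
    below = [t for e, t in rows if e < threshold]
    above = [t for e, t in rows if e >= threshold]
    rem = (len(below) / len(rows)) * entropy(below) \
        + (len(above) / len(rows)) * entropy(above)
    return entropy(target) - rem

best = max(thresholds, key=info_gain)
print(thresholds)                        # [750.0, 1350.0, 2250.0, 4175.0]
print(best, round(info_gain(best), 4))   # 4175.0 0.8631
```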

Once a threshold has been set the dynamically created


new boolean feature can compete with the other
categorical features for selection as the splitting feature at
that node.
This process can be repeated at each node as the tree
grows.

Table: Dataset for predicting the vegetation in an area with a continuous ELEVATION feature (measured in feet).

ID  STREAM  SLOPE     ELEVATION  VEGETATION
1   false   steep     3,900      chapparal
2   true    moderate  300        riparian
3   true    steep     1,500      riparian
4   false   steep     1,200      chapparal
5   false   flat      4,450      conifer
6   true    steep     5,000      conifer
7   true    steep     3,000      chapparal

Table: Dataset for predicting the vegetation in an area sorted by the continuous ELEVATION feature.

ID  STREAM  SLOPE     ELEVATION  VEGETATION
2   true    moderate  300        riparian
4   false   steep     1,200      chapparal
3   true    steep     1,500      riparian
7   true    steep     3,000      chapparal
1   false   steep     3,900      chapparal
5   false   flat      4,450      conifer
6   true    steep     5,000      conifer

Table: Partition sets (Part.), entropy, remainder (Rem.), and information gain (Info. Gain) for the candidate ELEVATION thresholds: ≥750, ≥1,350, ≥2,250 and ≥4,175.

Split by Threshold | Part. | Instances              | Entropy | Rem.   | Info. Gain
≥750               | D1    | d2                     | 0.0     | 1.2507 | 0.3060
                   | D2    | d4, d3, d7, d1, d5, d6 | 1.4591  |        |
≥1,350             | D3    | d2, d4                 | 1.0     | 1.3728 | 0.1839
                   | D4    | d3, d7, d1, d5, d6     | 1.5219  |        |
≥2,250             | D5    | d2, d4, d3             | 0.9183  | 0.9650 | 0.5917
                   | D6    | d7, d1, d5, d6         | 1.0     |        |
≥4,175             | D7    | d2, d4, d3, d7, d1     | 0.9710  | 0.6935 | 0.8631
                   | D8    | d5, d6                 | 0.0     |        |

Figure: The vegetation classification decision tree after the dataset has been split using ELEVATION ≥ 4,175. The root tests Elevation; the < 4,175 branch contains partition D7 (d2, d4, d3, d7, d1) and the ≥ 4,175 branch contains partition D8 (d5, d6), whose two instances are both conifer.

Figure: The decision tree that would be generated for the vegetation classification dataset with the continuous ELEVATION feature using information gain.
  Elevation ≥ 4,175 → conifer
  Elevation < 4,175 → Stream:
    Stream = 'true' → Elevation: < 2,250 → riparian; ≥ 2,250 → chaparral
    Stream = 'false' → chaparral

Predicting Continuous Targets



Regression trees are constructed so as to reduce the variance in the set of training examples at each of the leaf nodes in the tree.
We can do this by adapting the ID3 algorithm to use a measure of variance rather than a measure of classification impurity (entropy) when selecting the best attribute.

The impurity (variance) at a node can be calculated using the following equation:

$$var(t, D) = \frac{\sum_{i=1}^{n} \big(t_i - \bar{t}\big)^2}{n - 1} \qquad (3)$$

We select the feature to split on at a node by selecting the feature that minimizes the weighted variance across the resulting partitions:

$$\mathbf{d}[best] = \underset{d \in \mathbf{d}}{\operatorname{argmin}} \sum_{l \in levels(d)} \frac{|D_{d=l}|}{|D|} \times var(t, D_{d=l}) \qquad (4)$$

Figure: (a) A set of instances on a continuous number line; (b), (c), and (d) depict some of the potential groupings that could be applied to these instances: (b) underfitting, (c) a "Goldilocks" grouping, and (d) overfitting.

Table: A dataset listing the number of bike rentals per day.

ID  SEASON  WORK DAY  RENTALS      ID  SEASON  WORK DAY  RENTALS
1   winter  false     800          7   summer  false     3,000
2   winter  false     826          8   summer  true      5,800
3   winter  true      900          9   summer  true      6,200
4   spring  false     2,100        10  autumn  false     2,910
5   spring  true      4,740        11  autumn  false     2,880
6   spring  true      4,900        12  autumn  true      2,820

Table: The partitioning of the bike-rentals dataset above based on the SEASON and WORK DAY features and the computation of the weighted variance for each partitioning.

Split by Feature | Level    | Part. | Instances                | |D_d=l|/|D| | var(t, D)     | Weighted Variance
SEASON           | 'winter' | D1    | d1, d2, d3               | 0.25        | 2,692         | 1,379,331 1/3
                 | 'spring' | D2    | d4, d5, d6               | 0.25        | 2,472,533 1/3 |
                 | 'summer' | D3    | d7, d8, d9               | 0.25        | 3,040,000     |
                 | 'autumn' | D4    | d10, d11, d12            | 0.25        | 2,100         |
WORK DAY         | 'true'   | D5    | d3, d5, d6, d8, d9, d12  | 0.50        | 4,026,346 2/3 | 2,551,813 1/3
                 | 'false'  | D6    | d1, d2, d4, d7, d10, d11 | 0.50        | 1,077,280     |
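
The following short sketch (illustrative, not from the slides) reproduces the weighted variances in this table using Python's statistics.variance, which computes the same sample variance as Equation (3).

```python
from statistics import variance

# Bike-rentals target values from the table above, keyed by feature level
rentals_by_season = {
    "winter": [800, 826, 900],
    "spring": [2100, 4740, 4900],
    "summer": [3000, 5800, 6200],
    "autumn": [2910, 2880, 2820],
}
rentals_by_workday = {
    "true":  [900, 4740, 4900, 5800, 6200, 2820],
    "false": [800, 826, 2100, 3000, 2910, 2880],
}

def weighted_variance(partitions):
    """Weighted sum of the sample variances of the target within each partition."""
    n = sum(len(vals) for vals in partitions.values())
    return sum((len(vals) / n) * variance(vals) for vals in partitions.values())

print(round(weighted_variance(rentals_by_season)))   # 1379331 -> SEASON gives the lower value
print(round(weighted_variance(rentals_by_workday)))  # 2551813
```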

Figure: The decision tree resulting from splitting the bike-rentals data using the feature SEASON. Each branch ('winter', 'spring', 'summer', 'autumn') leads to one of the partitions D1 through D4 listed in the table above.

Season

winter autumn

Work Day spring summer Work Day

true false true false

ID Rentals Pred. ID Rentals Pred.


ID Rentals Pred. ID Rentals Pred.
1 800 Work Day Work Day 10 2,910
3 900 900 813 12 2,820 2,820 2,895
2 826 11 2,880

true false true false

ID Rentals Pred. ID Rentals Pred.


ID Rentals Pred. ID Rentals Pred.
5 4,740 8 5,800
4,820 4 2,100 2,100 6,000 7 3,000 3,000
6 4,900 9 6,200

Figure: The final decision tree induced from the dataset in Table 5
[25]
. To illustrate how the tree generates predictions, this tree lists the
instances that ended up at each leaf node and the prediction (PRED.)
made by each leaf node.

Noisy Data, Overfitting and Tree Pruning

In the case of a decision tree, over-fitting involves splitting


the data on an irrelevant feature.

The likelihood of over-fitting occurring increases as a tree gets


deeper because the resulting classifications are based on
smaller and smaller subsets as the dataset is partitioned after
each feature test in the path.

Pre-pruning: stop the recursive partitioning early.


Pre-pruning is also known as forward pruning.
Common Pre-pruning Approaches
1 early stopping
2 χ² pruning (see the sketch below)
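
χ² pruning tests whether the distribution of target levels across a candidate split's partitions could plausibly have arisen by chance; if so, the split is not made. A minimal sketch of this idea using SciPy's chi2_contingency (the table of counts is illustrative, not from the slides):

```python
from scipy.stats import chi2_contingency  # assumes SciPy is available

# Hypothetical class counts at a candidate split:
# rows = child partitions, columns = target levels.
observed = [
    [8, 2],   # partition for feature level A: 8 positive, 2 negative
    [7, 3],   # partition for feature level B: 7 positive, 3 negative
]

chi2, p_value, dof, expected = chi2_contingency(observed)
# If the observed distribution is not significantly different from what we would
# expect by chance, the split is likely fitting noise, so we stop growing here.
if p_value > 0.05:
    print("stop: split not statistically significant")
else:
    print("keep growing: split is significant")
```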

Post-pruning: allow the algorithm to grow the tree as


much as it likes and then prune the tree of the branches
that cause over-fitting.

Common Post-pruning Approach
Using the validation set, evaluate the prediction accuracy achieved by both the fully grown tree and the pruned copy of the tree. If the pruned copy of the tree performs no worse than the fully grown tree, the node is a candidate for pruning.
Figure: Misclassification rate (0.1 to 0.5) on the training set and on the validation set plotted against training iteration (0 to 200).

Table: An example validation set for the post-operative patient routing task.

ID  CORE-TEMP  STABLE-TEMP  GENDER  DECISION
1   high       true         male    gen
2   low        true         female  icu
3   high       false        female  icu
4   high       false        male    icu
5   low        false        female  icu
6   low        true         male    icu

Figure: The decision tree for the post-operative patient routing task.
  Core-Temp [icu]:
    'low' → Gender [icu]: 'male' → icu; 'female' → gen
    'high' → Stable-Temp [gen]: 'true' → gen; 'false' → icu

Figure: The iterations of reduced error pruning for the decision tree in the previous figure using the validation set in the table above. The subtree being considered for pruning in each iteration is highlighted. The prediction returned by each non-leaf node is listed in square brackets, and the number of validation-set errors for each node is given in round brackets. In iteration (a) the Gender subtree (2 errors across its leaves) is replaced by an [icu] leaf (0 errors); in iteration (b) the Stable-Temp subtree (0 errors) is kept because replacing it with a [gen] leaf would give 2 errors; in iteration (c) the root is kept because replacing it with an [icu] leaf would give 1 error while the pruned subtree gives 0.
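
Putting the pieces together, here is a minimal sketch of reduced error pruning on a hand-coded copy of the routing tree above, evaluated on the validation set above. The dictionary layout and field names are hypothetical choices made for this sketch.

```python
# Nodes are dicts holding the feature tested, the majority prediction, and children;
# leaves are plain class labels.
tree = {
    "feature": "core_temp", "majority": "icu",
    "children": {
        "low":  {"feature": "gender", "majority": "icu",
                 "children": {"male": "icu", "female": "gen"}},
        "high": {"feature": "stable_temp", "majority": "gen",
                 "children": {"true": "gen", "false": "icu"}},
    },
}

# Validation set from the table above
validation = [
    {"core_temp": "high", "stable_temp": "true",  "gender": "male",   "decision": "gen"},
    {"core_temp": "low",  "stable_temp": "true",  "gender": "female", "decision": "icu"},
    {"core_temp": "high", "stable_temp": "false", "gender": "female", "decision": "icu"},
    {"core_temp": "high", "stable_temp": "false", "gender": "male",   "decision": "icu"},
    {"core_temp": "low",  "stable_temp": "false", "gender": "female", "decision": "icu"},
    {"core_temp": "low",  "stable_temp": "true",  "gender": "male",   "decision": "icu"},
]

def predict(node, instance):
    while isinstance(node, dict):
        node = node["children"][instance[node["feature"]]]
    return node

def errors(node, instances):
    return sum(predict(node, q) != q["decision"] for q in instances)

def prune(node, instances):
    """Bottom-up: replace a subtree with its majority-class leaf if the pruned
    version performs no worse on the validation instances reaching it."""
    if not isinstance(node, dict):
        return node
    for level, child in node["children"].items():
        reaching = [q for q in instances if q[node["feature"]] == level]
        node["children"][level] = prune(child, reaching)
    if errors(node["majority"], instances) <= errors(node, instances):
        return node["majority"]   # pruning does not hurt: keep the leaf
    return node

print(prune(tree, validation))
# The Gender subtree collapses to an 'icu' leaf; the Stable-Temp subtree and root survive.
```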

Advantages of pruning:
Smaller trees are easier to interpret
Increased generalization accuracy when there is noise in
the training data (noise dampening).

Model Ensembles

Rather than creating a single model, ensemble methods generate a set of models and then make predictions by aggregating the outputs of these models.
A prediction model that is composed of a set of models is called a model ensemble.
In order for this approach to work, the models in the ensemble must be different from each other.

There are two standard approaches to creating ensembles:
1 boosting
2 bagging

Boosting

Boosting works by iteratively creating models and adding them to the ensemble.
The iteration stops when a predefined number of models have been added.
When we use boosting, each new model added to the ensemble is biased to pay more attention to instances that previous models misclassified.
This is done by incrementally adapting the dataset used to train the models. To do this we use a weighted dataset.

Boosting

Weighted Dataset
Each instance has an associated weight w_i ≥ 0, initially set to 1/n, where n is the number of instances in the dataset.
After each model is added to the ensemble, it is tested on the training data; the weights of the instances the model gets correct are decreased and the weights of the instances the model gets incorrect are increased.
These weights are used as a distribution over which the dataset is sampled to create a replicated training set, where the replication of an instance is proportional to its weight.

Boosting

During each training iteration the algorithm:
1 Induces a model and calculates the total error, ε, by summing the weights of the training instances for which the predictions made by the model are incorrect.
2 Increases the weights for the instances misclassified using:

$$\mathbf{w}[i] \leftarrow \mathbf{w}[i] \times \frac{1}{2 \times \varepsilon} \qquad (5)$$

3 Decreases the weights for the instances correctly classified:

$$\mathbf{w}[i] \leftarrow \mathbf{w}[i] \times \frac{1}{2 \times (1 - \varepsilon)} \qquad (6)$$

4 Calculates a confidence factor, α, for the model such that α increases as ε decreases:

$$\alpha = \frac{1}{2} \times \log_e\left(\frac{1 - \varepsilon}{\varepsilon}\right) \qquad (7)$$
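
A small sketch of one iteration's weight update and confidence factor, implementing Equations (5) to (7) directly; the four-instance example is illustrative only.

```python
import math

def boosting_update(weights, correct):
    """One boosting iteration: weights sum to 1; correct[i] is True when the
    current model classified instance i correctly."""
    epsilon = sum(w for w, ok in zip(weights, correct) if not ok)  # total error
    new_weights = [
        w / (2 * (1 - epsilon)) if ok else w / (2 * epsilon)       # Eqs. (5) and (6)
        for w, ok in zip(weights, correct)
    ]
    alpha = 0.5 * math.log((1 - epsilon) / epsilon)                # Eq. (7)
    return new_weights, alpha

# Four instances with uniform initial weights; the model misclassifies instance 3
weights = [0.25, 0.25, 0.25, 0.25]
new_w, alpha = boosting_update(weights, [True, True, False, True])
print([round(w, 4) for w in new_w])   # [0.1667, 0.1667, 0.5, 0.1667]: the miss gains weight
print(round(sum(new_w), 6))           # 1.0: the update keeps the weights normalised
print(round(alpha, 4))                # 0.5493
```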

Boosting

Once the set of models has been created, the ensemble makes predictions using a weighted aggregate of the predictions made by the individual models.
The weights used in this aggregation are simply the confidence factors associated with each model.
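
A sketch of this weighted aggregation for a classification ensemble; the toy models and confidence factors below are illustrative only.

```python
from collections import defaultdict

def ensemble_predict(models, alphas, query):
    """Weighted majority vote: each model's prediction counts with weight alpha."""
    votes = defaultdict(float)
    for model, alpha in zip(models, alphas):
        votes[model(query)] += alpha
    return max(votes, key=votes.get)

# Three toy 'models' that each return a class label, with their confidence factors
models = [lambda q: "icu", lambda q: "gen", lambda q: "icu"]
alphas = [0.9, 0.4, 0.2]
print(ensemble_predict(models, alphas, query=None))   # 'icu' (0.9 + 0.2 > 0.4)
```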

Bagging

When we use bagging (or bootstrap aggregating), each model in the ensemble is trained on a random sample of the dataset known as a bootstrap sample.
Each random sample is the same size as the dataset, and sampling with replacement is used.
Consequently, every bootstrap sample will be missing some of the instances from the dataset, so each bootstrap sample will be different; this means that models trained on different bootstrap samples will also be different.
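
A sketch of drawing a bootstrap sample (illustrative only):

```python
import random

def bootstrap_sample(dataset):
    """A bootstrap sample: same size as the dataset, drawn with replacement."""
    return [random.choice(dataset) for _ in range(len(dataset))]

instance_ids = list(range(1, 13))           # e.g. the 12 bike-rentals instance IDs
sample = bootstrap_sample(instance_ids)
print(sorted(sample))                       # some IDs appear more than once ...
print(set(instance_ids) - set(sample))      # ... and some are missing entirely
```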

Bagging

When bagging is used with decision trees, each bootstrap sample only uses a randomly selected subset of the descriptive features in the dataset. This is known as subspace sampling.
The combination of bagging, subspace sampling, and decision trees is known as a random forest model.
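
For reference, scikit-learn's RandomForestClassifier combines these ideas; note that it performs the subspace sampling at each split rather than once per bootstrap sample, which is a common variant. This is a sketch only, with X and y as placeholders for a feature matrix and target vector.

```python
from sklearn.ensemble import RandomForestClassifier  # assumes scikit-learn is installed

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees, each grown on its own bootstrap sample
    bootstrap=True,        # bagging: sample the training set with replacement
    max_features="sqrt",   # subspace sampling: random feature subset per split
)
# forest.fit(X, y)
# forest.predict(X_new)
```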

Bagging

Figure: The process of creating a model ensemble using bagging and subspace sampling. The original dataset (descriptive features F1, F2, F3 and a target) is sampled with replacement to create several bootstrap samples, each restricted to a random subset of the features (e.g. {F1, F3} or {F2, F3}). A machine learning algorithm then induces one tree from each bootstrap sample, and together these trees form the model ensemble.

Bagging

Which approach should we use? Bagging is simpler to implement and parallelize than boosting and so may be better with respect to ease of use and training time.
Empirical results indicate:
boosted decision tree ensembles were the best performing model of those tested for datasets containing up to 4,000 descriptive features.
random forest ensembles (based on bagging) performed better for datasets containing more than 4,000 features.

Summary

The decision tree model makes predictions based on sequences of tests on the descriptive feature values of a query.
The ID3 algorithm is a standard algorithm for inducing decision trees from a dataset.

Decision Trees: Advantages
interpretable.
handle both categorical and continuous descriptive features.
able to model the interactions between descriptive features (diminished if pre-pruning is employed).
relatively robust to the curse of dimensionality.
relatively robust to noise in the dataset if pruning is used.

Decision Trees: Potential Disadvantages
trees become large when dealing with continuous features.
decision trees are very expressive and sensitive to the dataset; as a result, they can overfit the data if there are a lot of features (curse of dimensionality).
eager learner (concept drift).

