
CS 43105 Data Mining Techniques


Chapter 3. Output: Knowledge Representation
Xiang Lian
Department of Computer Science
Kent State University
Email: [email protected]
Homepage: http://www.cs.kent.edu/~xlian/

Outline
• Knowledge Representation
  • Tables
  • Linear Models
  • Trees
  • Rules
  • Instance-based Representation
  • Clusters
Source: Data Mining: Practical Machine Learning Tools and Techniques (Chapter 3)

Output: Knowledge Representation

• Tables
• Linear models
• Trees
• Rules
  • Classification rules
  • Association rules
  • Rules with exceptions
  • More expressive rules
• Instance-based representation
• Clusters

Output: Representing Structural Patterns

• Many different ways of representing patterns
  • Decision trees, rules, instance-based, …
• Also called “knowledge” representation
• Representation determines inference method
• Understanding the output is the key to understanding the underlying learning methods
• Different types of output for different learning problems (e.g. classification, regression, …)

Tables

• Simplest way of representing output: use the same format as input!
• Decision table for the weather problem:

  Outlook   Humidity  Play
  Sunny     High      No
  Sunny     Normal    Yes
  Overcast  High      Yes
  Overcast  Normal    Yes
  Rainy     High      No
  Rainy     Normal    No

• Main problem: selecting the right attributes
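A decision table like this is essentially a lookup structure. A minimal Python sketch (the dict-based encoding is our illustration, not from the book):

# Decision table as a dict: key = tuple of attribute values, value = class.
decision_table = {
    ("Sunny", "High"): "No",
    ("Sunny", "Normal"): "Yes",
    ("Overcast", "High"): "Yes",
    ("Overcast", "Normal"): "Yes",
    ("Rainy", "High"): "No",
    ("Rainy", "Normal"): "No",
}

def predict(outlook, humidity):
    # Returns None when no row matches the new instance.
    return decision_table.get((outlook, humidity))

print(predict("Sunny", "Normal"))  # -> Yes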




Linear Models
• Another simple representation
• Regression model
• Inputs (attribute values) and output are all numeric
• Output is the sum of weighted attribute values
• The trick is to find good values for the weights
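A minimal sketch of finding those weights by ordinary least squares, using numpy on made-up data (all values here are illustrative):

import numpy as np

# Rows are instances, columns are numeric attribute values.
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5],
              [4.0, 3.0]])
y = np.array([5.0, 4.5, 7.0, 11.0])   # numeric target

# Prepend a column of ones so the model can learn an intercept.
X1 = np.column_stack([np.ones(len(X)), X])

# "Good values for the weights" = weights minimizing squared error.
weights, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Output is the sum of weighted attribute values (plus the intercept).
predictions = X1 @ weights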

A Linear Regression Function for the CPU Performance Data

PRP = 37.06 + 2.47 · CACH



Linear Models for Classification


• Binary classification
• Line separates the two classes
• Decision boundary - defines where the decision
changes from one class value to the other
• Prediction is made by plugging in observed
values of the attributes into the expression
• Predict one class if output ≥ 0, and the other class if
output < 0
• Boundary becomes a high-dimensional plane
(hyperplane) when there are multiple attributes

Separating Setosas From Versicolors

2.0 − 0.5 · PETAL-LENGTH − 0.8 · PETAL-WIDTH = 0
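Plugging observed attribute values into this expression and checking the sign gives the prediction. A small sketch (which class falls on which side follows from typical petal measurements; the function name is ours):

def classify(petal_length, petal_width):
    # Decision boundary: expression = 0; the sign picks the class.
    output = 2.0 - 0.5 * petal_length - 0.8 * petal_width
    return "Iris-setosa" if output >= 0 else "Iris-versicolor"

print(classify(1.4, 0.2))  # typical setosa      -> Iris-setosa
print(classify(4.5, 1.4))  # typical versicolor  -> Iris-versicolor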



Trees

• “Divide-and-conquer” approach produces tree
• Nodes involve testing a particular attribute
  • Usually, attribute value is compared to constant
  • Other possibilities: comparing values of two attributes, or using a function of one or more attributes
• Leaves assign classification, set of classifications, or probability distribution to instances
• Unknown instance is routed down the tree
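A minimal sketch of routing in Python (the Node class and its fields are our own illustration):

class Node:
    # Internal nodes test one attribute against a constant; leaves hold a class.
    def __init__(self, attribute=None, threshold=None,
                 left=None, right=None, leaf=None):
        self.attribute, self.threshold = attribute, threshold
        self.left, self.right, self.leaf = left, right, leaf

def route(node, instance):
    # Send an unknown instance down the tree until a leaf is reached.
    while node.leaf is None:
        goes_left = instance[node.attribute] < node.threshold
        node = node.left if goes_left else node.right
    return node.leaf

tree = Node("petal-length", 2.45,
            left=Node(leaf="Iris-setosa"),
            right=Node(leaf="Iris-versicolor"))
print(route(tree, {"petal-length": 1.4}))  # -> Iris-setosa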

Nominal and Numeric Attributes

• Nominal: number of children usually equal to number of values
  ⇒ attribute won’t get tested more than once
  • Other possibility: division into two subsets
• Numeric: test whether value is greater or less than constant
  ⇒ attribute may get tested several times
  • Other possibility: three-way split (or multi-way split)
    • Integer: less than, equal to, greater than
    • Real: below, within, above

Missing Values

• Does absence of value have some significance?
  • Yes ⇒ “missing” is a separate value
  • No ⇒ “missing” must be treated in a special way
• Solution A: assign instance to most popular branch
• Solution B: split instance into pieces
  • Pieces receive weight according to fraction of training instances that go down each branch
  • Classifications from leaf nodes are combined using the weights that have percolated to them
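A sketch of Solution B, reusing the Node class from the earlier tree sketch, here assumed to be extended with a left_fraction field (the fraction of training instances that went down the left branch; all names are illustrative):

def classify_with_missing(node, instance, weight=1.0):
    # Returns {class: weight}, summing over all leaves the pieces reach.
    if node.leaf is not None:
        return {node.leaf: weight}
    value = instance.get(node.attribute)
    if value is None:
        # Missing value: split the instance into weighted pieces.
        result = {}
        for child, fraction in ((node.left, node.left_fraction),
                                (node.right, 1.0 - node.left_fraction)):
            piece = classify_with_missing(child, instance, weight * fraction)
            for cls, w in piece.items():
                result[cls] = result.get(cls, 0.0) + w
        return result
    child = node.left if value < node.threshold else node.right
    return classify_with_missing(child, instance, weight)

The class with the largest combined weight becomes the prediction.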



Trees for Numeric Prediction


• Regression: the process of computing an expression
that predicts a numeric quantity
• Regression tree: “decision tree” where each leaf
predicts a numeric quantity
• Predicted value is average value of training instances that
reach the leaf
• Model tree: “regression tree” with linear regression
models at the leaf nodes
• Linear patches approximate continuous function
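A regression-tree leaf's prediction can be computed once from the training instances that reach it. A tiny sketch (the target values are made up):

import statistics

# Training targets of the instances that reached this leaf.
leaf_training_targets = [28.0, 33.5, 41.0, 37.2]

# The leaf predicts their average for every instance routed to it.
leaf_prediction = statistics.mean(leaf_training_targets)
print(leaf_prediction)  # -> 34.925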

Linear Regression for the CPU Data

PRP =
- 56.1
+ 0.049 MYCT
+ 0.015 MMIN
+ 0.006 MMAX
+ 0.630 CACH
- 0.270 CHMIN
+ 1.46 CHMAX
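Transcribed directly into a function (attribute names follow the slide; the function name and argument order are ours):

def predicted_prp(myct, mmin, mmax, cach, chmin, chmax):
    # PRP from the fitted regression weights above.
    return (-56.1
            + 0.049 * myct
            + 0.015 * mmin
            + 0.006 * mmax
            + 0.630 * cach
            - 0.270 * chmin
            + 1.46 * chmax)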

Regression Tree for the CPU Data



Model Tree for the CPU Data



Classification Rules

• Popular alternative to decision trees
• Antecedent (pre-condition): a series of tests (just like the tests at the nodes of a decision tree)
  • Tests are usually logically ANDed together (but may also be general logical expressions)
• Consequent (conclusion): classes, set of classes, or probability distribution assigned by rule
• Individual rules are often logically ORed together
  • Conflicts arise if different conclusions apply
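A minimal sketch of rules as data, interpreted here as an ordered list where the first matching rule wins (the two weather rules are illustrative):

# A rule = (antecedent, consequent). The antecedent is a list of
# attribute tests that are ANDed together.
rules = [
    ([("outlook", "Sunny"), ("humidity", "High")], "No"),
    ([("outlook", "Overcast")], "Yes"),
]

def fires(antecedent, instance):
    return all(instance.get(attr) == value for attr, value in antecedent)

def classify(instance, rules):
    for antecedent, consequent in rules:
        if fires(antecedent, instance):
            return consequent
    return None  # no rule applies

print(classify({"outlook": "Sunny", "humidity": "High"}, rules))  # -> No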

From Trees to Rules

• Easy: converting a tree into a set of rules
• One rule for each leaf:
  • Antecedent contains a condition for every node on the path from the root to the leaf
  • Consequent is class assigned by the leaf
• Produces rules that are unambiguous
  • Doesn’t matter in which order they are executed
• But: resulting rules are unnecessarily complex
  • Pruning to remove redundant tests/rules
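A sketch of the conversion (without pruning), reusing the Node class from the tree slide; the condition encoding is our own:

def tree_to_rules(node, path=()):
    # One rule per leaf: the antecedent collects every test on the
    # path from the root; the consequent is the leaf's class.
    if node.leaf is not None:
        return [(list(path), node.leaf)]
    left_test = (node.attribute, "<", node.threshold)
    right_test = (node.attribute, ">=", node.threshold)
    return (tree_to_rules(node.left, path + (left_test,)) +
            tree_to_rules(node.right, path + (right_test,)))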

From Rules to Trees

• More difficult: transforming a rule set into a tree
  • Tree cannot easily express disjunction between rules
• Example: rules which test different attributes

  If a and b then x
  If c and d then x

• Symmetry needs to be broken
• Corresponding tree contains identical subtrees (→ “replicated subtree problem”)

A Tree for a Simple Disjunction



The Exclusive-OR Problem

If x = 1 and y = 0 then class = a
If x = 0 and y = 1 then class = a
If x = 0 and y = 0 then class = b
If x = 1 and y = 1 then class = b

A Tree with a Replicated Subtree

If x = 1 and y = 1 then class = a
If z = 1 and w = 1 then class = a
Otherwise class = b

“Nuggets” of Knowledge

• Are rules independent pieces of knowledge? (It seems easy to add a rule to an existing rule base.)
• Problem: ignores how rules are executed
• Two ways of executing a rule set:
  • Ordered set of rules (“decision list”)
    • Order is important for interpretation
  • Unordered set of rules
    • Rules may overlap and lead to different conclusions for the same instance

Interpreting Rules

• What if two or more rules conflict?
  • Give no conclusion at all?
  • Go with rule that is most popular on training data?
• What if no rule applies to a test instance?
  • Give no conclusion at all?
  • Go with class that is most frequent in training data?


Special Case: Boolean Class

• Assumption: if instance does not belong to class “yes”, it belongs to class “no”
• Trick: only learn rules for class “yes” and use default rule for “no”

  If x = 1 and y = 1 then class = a
  If z = 1 and w = 1 then class = a
  Otherwise class = b

• Order of rules is not important. No conflicts!
• Rule can be written in disjunctive normal form
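Written out as a sketch, the learned rules plus the default rule collapse into one disjunctive-normal-form test:

def classify(x, y, z, w):
    # DNF: an OR of ANDed conditions for class "a"; "b" is the default.
    is_a = (x == 1 and y == 1) or (z == 1 and w == 1)
    return "a" if is_a else "b"

print(classify(1, 1, 0, 0))  # -> a
print(classify(0, 0, 0, 0))  # -> b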




Association Rules
• Association rules…
• … can predict any attribute and combinations of attributes
• … are not intended to be used together as a set
• Problem: immense number of possible associations
• Output needs to be restricted to show only the most
predictive associations ⇒ only those with high support and
high confidence

Support and Confidence of a Rule

• Support: number of instances predicted correctly
• Confidence: number of correct predictions, as proportion of all instances that rule applies to
• Example: 4 cool days with normal humidity

  If temperature = cool then humidity = normal

  Support = 4, confidence = 100%

• Normally: minimum support and confidence pre-specified (e.g. 58 rules with support ≥ 2 and confidence ≥ 95% for weather data)
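A minimal sketch of computing both numbers for a rule over a list of instances (the five-instance toy data mirrors the slide's example):

def support_and_confidence(instances, antecedent, consequent):
    # Support: instances the rule predicts correctly.
    # Confidence: correct predictions / instances the antecedent applies to.
    applies = [x for x in instances
               if all(x[a] == v for a, v in antecedent)]
    correct = [x for x in applies
               if all(x[a] == v for a, v in consequent)]
    confidence = len(correct) / len(applies) if applies else 0.0
    return len(correct), confidence

weather = 4 * [{"temperature": "cool", "humidity": "normal"}] \
          + [{"temperature": "hot", "humidity": "high"}]
print(support_and_confidence(weather,
                             [("temperature", "cool")],
                             [("humidity", "normal")]))  # -> (4, 1.0)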

Rules with Exceptions

• Idea: allow rules to have exceptions
• Example: rule for iris data

  If petal-length ≥ 2.45 and petal-length < 4.45 then Iris-versicolor

• New instance:

  Sepal length  Sepal width  Petal length  Petal width  Type
  5.1           3.5          2.6           0.2          Iris-setosa

• Modified rule:

  If petal-length ≥ 2.45 and petal-length < 4.45 then Iris-versicolor
  EXCEPT if petal-width < 1.0 then Iris-setosa
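The modified rule translates directly into nested tests; a sketch (returning None for "rule does not apply" is our choice):

def classify(petal_length, petal_width):
    if 2.45 <= petal_length < 4.45:
        if petal_width < 1.0:          # the EXCEPT clause fires first
            return "Iris-setosa"
        return "Iris-versicolor"
    return None  # rule does not apply

print(classify(2.6, 0.2))  # the new instance above -> Iris-setosa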

A More Complex Example

• Exceptions to exceptions to exceptions …

  default: Iris-setosa
  except if petal-length ≥ 2.45 and petal-length < 5.355 and petal-width < 1.75
         then Iris-versicolor
         except if petal-length ≥ 4.95 and petal-width < 1.55
                then Iris-virginica
         else if sepal-length < 4.95 and sepal-width ≥ 2.45
                then Iris-virginica
  else if petal-length ≥ 3.35
         then Iris-virginica
         except if petal-length < 4.85 and sepal-length < 5.95
                then Iris-versicolor

Advantages of Using Exceptions


• Rules can be updated incrementally
• Easy to incorporate new data
• Easy to incorporate domain knowledge
• People often think in terms of exceptions
• Each conclusion can be considered just in the
context of rules and exceptions that lead to it
• Locality property is important for understanding large rule
sets
• “Normal” rule sets don’t offer this advantage

More on Exceptions
• Default...except if...then...
is logically equivalent to
if...then...else
(where the else specifies what the default did)
• But: exceptions offer a psychological advantage
• Assumption: defaults and tests early on apply
more widely than exceptions further down
• Exceptions reflect special cases

Rules Involving Relations

• So far: all rules involved comparing an attribute value to a constant (e.g. temperature < 45)
• These rules are called “propositional” because they have the same expressive power as propositional logic
• What if problem involves relationships between examples (e.g. family tree problem from above)?
  • Can’t be expressed with propositional rules
  • More expressive representation required

Instance-based Representation
• Simplest form of learning: rote learning
• Training instances are searched for instance that most
closely resembles new instance
• The instances themselves represent the knowledge
• Also called instance-based learning
• Similarity function defines what’s “learned”
• Instance-based learning is lazy learning
• Methods: nearest-neighbor, k-nearest-neighbor, …

The Distance Function

• Simplest case: one numeric attribute
  • Distance is the difference between the two attribute values involved (or a function thereof)
• Several numeric attributes (attributes may be normalized):
  • Manhattan distance (L1-norm)
  • Euclidean distance (L2-norm)
  • L∞-norm
  • …
• Nominal attributes: distance is set to 1 if values are different, 0 if they are equal
• Are all attributes equally important?
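A sketch combining these pieces with the previous slide's nearest-neighbor idea: numeric attributes contribute squared differences, nominal attributes contribute 0/1, and the stored instance at the smallest distance supplies the prediction (all data here is made up):

import math

def distance(x, y):
    total = 0.0
    for a in x:
        if isinstance(x[a], (int, float)):      # numeric attribute
            total += (x[a] - y[a]) ** 2
        else:                                   # nominal attribute
            total += 0.0 if x[a] == y[a] else 1.0
    return math.sqrt(total)

def nearest_neighbor(training, new_instance):
    # training = list of (instance, class) pairs; 1-NN prediction.
    inst, cls = min(training, key=lambda t: distance(t[0], new_instance))
    return cls

training = [({"temp": 64, "outlook": "sunny"}, "Play"),
            ({"temp": 85, "outlook": "rainy"}, "Don't play")]
print(nearest_neighbor(training, {"temp": 66, "outlook": "sunny"}))  # -> Play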


Learning Prototypes

• Only those instances involved in a decision need to be considered
• Noisy instances should be filtered out
• Idea: only use prototypical examples

Rectangular Generalizations

• Nearest-neighbor rule is used outside rectangles
• Rectangles are rules! (But they can be more conservative than “normal” rules.)
• Nested rectangles are rules with exceptions

Representing Clusters I

• Simple 2-D representation
• Venn diagram (overlapping clusters)

Representing Clusters II

• Probabilistic assignment:

       1    2    3
  a  0.4  0.1  0.5
  b  0.1  0.8  0.1
  c  0.3  0.3  0.4
  d  0.1  0.1  0.8
  e  0.4  0.2  0.4
  f  0.1  0.4  0.5
  g  0.7  0.2  0.1
  h  0.5  0.4  0.1

• Dendrogram

NB: dendron is the Greek word for tree
