Chapter 1
Data vs information
Structural descriptions
Rules: classification and association
Decision trees
Datasets
Weather, contact lens, CPU performance, labor negotiation data,
soybean classification
Fielded applications
Ranking web pages, loan applications, screening images, load
forecasting, machine fault diagnosis, market basket analysis
Generalization as search
Extracting implicit, previously unknown, potentially useful information from data
Operational definition:
Things learn when they change their behavior in a way that makes them perform better in the future.
Does a slipper learn?
Classification rule:
predicts value of a given attribute (the classification of an example)
Association rule:
predicts value of arbitrary attribute (or combination)
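The distinction can be sketched in a few lines. This is a minimal illustration only: the rows are made up (the attribute names merely echo the weather data), and `rule_stats` is a hypothetical helper, not something defined in the slides.

```python
# Hypothetical toy instances; the rows are invented for illustration.
ROWS = [
    {"outlook": "sunny", "humidity": "high", "windy": False, "play": "no"},
    {"outlook": "sunny", "humidity": "normal", "windy": False, "play": "yes"},
    {"outlook": "overcast", "humidity": "high", "windy": True, "play": "yes"},
    {"outlook": "rainy", "humidity": "normal", "windy": True, "play": "no"},
    {"outlook": "rainy", "humidity": "normal", "windy": False, "play": "yes"},
]


def rule_stats(rows, antecedent, consequent):
    """Return (covered, correct) for the rule 'if antecedent then consequent'.

    Both arguments are dicts mapping attribute names to required values.
    """
    covered = [r for r in rows if all(r[a] == v for a, v in antecedent.items())]
    correct = [r for r in covered if all(r[a] == v for a, v in consequent.items())]
    return len(covered), len(correct)


# Classification rule: the consequent is always the class attribute "play".
classification = rule_stats(
    ROWS, {"outlook": "sunny", "humidity": "high"}, {"play": "no"}
)

# Association rule: the consequent may be any attribute or combination.
association = rule_stats(
    ROWS, {"windy": False}, {"humidity": "normal", "play": "yes"}
)
```

The only structural difference is which attributes may appear in the consequent: one fixed attribute for a classification rule, any attribute or combination for an association rule.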
Attribute                        Type                       1     2     3     40
Duration                         (Number of years)          1     2     3     2
Wage increase first year         Percentage                 2%    4%    4.3%  4.5
Wage increase second year        Percentage                 ?     5%    4.4%  4.0
Wage increase third year         Percentage                 ?     ?     ?     ?
Cost of living adjustment        {none,tcf,tc}              none  tcf   ?     none
Working hours per week           (Number of hours)          28    35    38    40
Pension                          {none,ret-allw,empl-cntr}  none  ?     ?     ?
Standby pay                      Percentage                 ?     13%   ?     ?
Shift-work supplement            Percentage                 ?     5%    4%    4
Education allowance              {yes,no}                   yes   ?     ?     ?
Statutory holidays               (Number of days)           11    15    12    12
Vacation                         {below-avg,avg,gen}        avg   gen   gen   avg
Long-term disability assistance  {yes,no}                   no    ?     ?     yes
Dental plan contribution         {none,half,full}           none  ?     full  full
Bereavement assistance           {yes,no}                   no    ?     ?     yes
Health plan contribution         {none,half,full}           none  ?     full  half
Acceptability of contract        {good,bad}                 bad   good  good  good
20 attributes:
age
years with current employer
years at current address
years with the bank
other credit cards possessed,
Attributes:
size of region
shape, area
intensity
sharpness and jaggedness of boundaries
proximity of other regions
info about background
Constraints:
Few training examples: oil slicks are rare!
Unbalanced data: most dark regions aren't slicks
Regions from same image form a batch
Requirement: adjustable false-alarm rate
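One way to meet an adjustable false-alarm requirement is to threshold a classifier's score and pick the threshold from the scores of known non-slick regions. The sketch below assumes such a score exists. The function names and the score values are hypothetical and do not describe the fielded system.

```python
def threshold_for_false_alarm_rate(negative_scores, max_rate):
    """Return a threshold such that flagging scores strictly above it
    gives a false-alarm rate of at most max_rate on the negatives."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(max_rate * len(ranked))  # number of false alarms tolerated
    if allowed >= len(ranked):
        return float("-inf")  # every region may be flagged
    # The first negative that must NOT be flagged sets the threshold.
    return ranked[allowed]


def false_alarm_rate(negative_scores, threshold):
    """Fraction of negatives that would be flagged at this threshold."""
    flagged = sum(s > threshold for s in negative_scores)
    return flagged / len(negative_scores)


# Hypothetical scores for dark regions known not to be slicks.
NEGATIVES = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]

threshold = threshold_for_false_alarm_rate(NEGATIVES, max_rate=0.2)
rate = false_alarm_rate(NEGATIVES, threshold)
```

Lowering `max_rate` raises the threshold, trading missed slicks for fewer false alarms. With unbalanced data this trade-off is usually more informative than raw accuracy.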
Data Mining: Practical Machine Learning Tools and Techniques (Chapter 1) 26
Load forecasting
Attributes:
temperature
humidity
wind speed
cloud cover readings
plus difference between actual load and predicted load
Applications:
Customer loyalty:
identifying customers who are likely to defect by detecting changes in their behavior (e.g. banks/phone companies)
Special offers:
identifying profitable customers (e.g. reliable owners of credit cards who need extra money during the holiday season)
Leo Breiman
Developed decision trees
1984: Classification and Regression Trees. Wadsworth.
Simple solution:
enumerate the concept space
eliminate descriptions that do not fit examples
surviving descriptions contain target concept
Important question:
is the description language universal,
or does it restrict what can be learned?
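The enumerate-and-eliminate idea can be sketched for a very small concept language. The assumptions here are illustrative: descriptions are conjunctions in which each attribute is either fixed to a value or left as a wildcard, and the domains, examples, and function names are hypothetical.

```python
from itertools import product

# Hypothetical attribute domains and labeled examples (not from the slides).
DOMAINS = {"outlook": ["sunny", "rainy"], "windy": [True, False]}
EXAMPLES = [
    ({"outlook": "sunny", "windy": False}, True),
    ({"outlook": "rainy", "windy": False}, True),
    ({"outlook": "sunny", "windy": True}, False),
]
ANY = "?"  # wildcard: the attribute is not tested


def concept_space(domains):
    """Enumerate every conjunctive description: each attribute is either
    fixed to one of its values or left as a wildcard."""
    names = list(domains)
    choices = [[ANY] + list(domains[n]) for n in names]
    for combo in product(*choices):
        yield dict(zip(names, combo))


def matches(description, instance):
    """True if the instance satisfies every non-wildcard condition."""
    return all(v == ANY or instance[a] == v for a, v in description.items())


def consistent(description, examples):
    """A description fits if it covers the positives and no negatives."""
    return all(matches(description, x) == label for x, label in examples)


# Eliminate every description that does not fit the examples.
survivors = [d for d in concept_space(DOMAINS) if consistent(d, EXAMPLES)]
```

Enumeration is only feasible because this space has nine descriptions. The language also shows the restriction in question: it contains no disjunctions, so a target concept such as "sunny or windy" could leave no surviving description at all.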
Search heuristic
Greedy search: performing the best single step
Beam search: keeping several alternatives
Direction of search
General-to-specific
E.g. specializing a rule by adding conditions
Specific-to-general
E.g. generalizing an individual instance into a rule
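A general-to-specific search with a beam can be sketched as follows. This is a minimal illustration under stated assumptions: rules are sets of attribute-value conditions, the score is the fraction of covered examples that are positive, and the data and function names are hypothetical. Setting `beam_width=1` gives greedy search.

```python
def covers(rule, instance):
    """A rule is a frozenset of (attribute, value) conditions."""
    return all(instance[a] == v for a, v in rule)


def score(rule, examples):
    """Fraction of covered examples that are positive (0.0 if none covered)."""
    covered = [label for x, label in examples if covers(rule, x)]
    return sum(covered) / len(covered) if covered else 0.0


def specializations(rule, conditions):
    """General-to-specific step: add one condition on an unused attribute."""
    used = {a for a, _ in rule}
    return [rule | {(a, v)} for a, v in conditions if a not in used]


def beam_search(examples, conditions, beam_width, max_conditions):
    """Keep the beam_width best rules at each step; beam_width=1 is greedy."""
    beam = [frozenset()]  # the empty rule covers everything
    best = beam[0]
    for _ in range(max_conditions):
        candidates = {s for r in beam for s in specializations(r, conditions)}
        if not candidates:
            break
        # Sort by descending score; break ties by the rule's conditions.
        ranked = sorted(candidates, key=lambda r: (-score(r, examples), sorted(r)))
        beam = ranked[:beam_width]
        if score(beam[0], examples) > score(best, examples):
            best = beam[0]
    return best


# Hypothetical examples: only sunny, calm days are positive.
EXAMPLES = [
    ({"outlook": "sunny", "windy": "no"}, True),
    ({"outlook": "sunny", "windy": "yes"}, False),
    ({"outlook": "rainy", "windy": "no"}, False),
    ({"outlook": "rainy", "windy": "yes"}, False),
]
CONDITIONS = [
    ("outlook", "sunny"), ("outlook", "rainy"),
    ("windy", "no"), ("windy", "yes"),
]

greedy_rule = beam_search(EXAMPLES, CONDITIONS, beam_width=1, max_conditions=2)
beam_rule = beam_search(EXAMPLES, CONDITIONS, beam_width=2, max_conditions=2)
```

A wider beam costs more evaluations per step but is less likely to be misled by an early condition that looks best in isolation. On this toy data both settings reach the same rule.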
Important questions:
Who is permitted access to the data?
For what purpose was the data collected?
What kind of conclusions can be legitimately drawn from it?