231

This document is the table of contents of a data mining textbook. Chapter 1 defines data mining, discusses its origins and growth, and outlines the structure of the book. Chapter 2 gives an overview of the data mining process, including preliminary steps such as data organization, preprocessing, and partitioning, and introduces supervised and unsupervised learning. The remaining parts cover data exploration and dimension reduction, performance evaluation, prediction and classification methods, mining relationships among records, time series forecasting, and cases.
Copyright
© Attribution Non-Commercial (BY-NC)

Contents

Foreword
Preface to the second edition
Preface to the first edition
Acknowledgments

Part I PRELIMINARIES

Chapter 1 Introduction
1.1 What Is Data Mining?
1.2 Where Is Data Mining Used?
1.3 Origins of Data Mining
1.4 Rapid Growth of Data Mining
1.5 Why Are There So Many Different Methods?
1.6 Terminology and Notation
1.7 Road Maps to This Book
    Order of Topics

Chapter 2 Overview of the Data Mining Process
2.1 Introduction
2.2 Core Ideas in Data Mining
    Classification
    Prediction
    Association Rules
    Predictive Analytics
    Data Reduction
    Data Exploration
    Data Visualization
2.3 Supervised and Unsupervised Learning
2.4 Steps in Data Mining
2.5 Preliminary Steps
    Organization of Datasets
    Sampling from a Database
    Oversampling Rare Events
    Preprocessing and Cleaning the Data
    Use and Creation of Partitions
2.6 Building a Model: Example with Linear Regression
    Boston Housing Data
    Modeling Process
2.7 Using Excel for Data Mining
Problems

Part II DATA EXPLORATION AND DIMENSION REDUCTION

Chapter 3 Data Visualization
3.1 Uses of Data Visualization
3.2 Data Examples
    Example 1: Boston Housing Data
    Example 2: Ridership on Amtrak Trains
3.3 Basic Charts: Bar Charts, Line Graphs, and Scatterplots
    Distribution Plots: Boxplots and Histograms
    Heatmaps: Visualizing Correlations and Missing Values
3.4 Multidimensional Visualization
    Adding Variables: Color, Size, Shape, Multiple Panels, and Animation
    Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, and Panning, and Filtering
    Reference: Trend Lines and Labels
    Scaling up: Large Datasets
    Multivariate Plot: Parallel Coordinates Plot
    Interactive Visualization
3.5 Specialized Visualizations
    Visualizing Networked Data
    Visualizing Hierarchical Data: Treemaps
    Visualizing Geographical Data: Map Charts
3.6 Summary of Major Visualizations and Operations, According to Data Mining Goal
    Prediction
    Classification
    Time Series Forecasting
    Unsupervised Learning
Problems

Chapter 4 Dimension Reduction
4.1 Introduction
4.2 Practical Considerations
    Example 1: House Prices in Boston
4.3 Data Summaries
    Summary Statistics
    Pivot Tables
4.4 Correlation Analysis
4.5 Reducing the Number of Categories in Categorical Variables
4.6 Converting a Categorical Variable to a Numerical Variable
4.7 Principal Components Analysis
    Example 2: Breakfast Cereals
    Principal Components
    Normalizing the Data
    Using Principal Components for Classification and Prediction
4.8 Dimension Reduction Using Regression Models
4.9 Dimension Reduction Using Classification and Regression Trees
Problems

Part III PERFORMANCE EVALUATION

Chapter 5 Evaluating Classification and Predictive Performance
5.1 Introduction
5.2 Judging Classification Performance
    Benchmark: The Naive Rule
    Class Separation
    Classification Matrix
    Using the Validation Data
    Accuracy Measures
    Cutoff for Classification
    Performance in Unequal Importance of Classes
    Asymmetric Misclassification Costs
    Oversampling and Asymmetric Costs
    Classification Using a Triage Strategy
5.3 Evaluating Predictive Performance
    Benchmark: The Average
    Prediction Accuracy Measures
Problems

Part IV PREDICTION AND CLASSIFICATION METHODS

Chapter 6 Multiple Linear Regression
6.1 Introduction
6.2 Explanatory versus Predictive Modeling
6.3 Estimating the Regression Equation and Prediction
    Example: Predicting the Price of Used Toyota Corolla Automobiles
6.4 Variable Selection in Linear Regression
    Reducing the Number of Predictors
    How to Reduce the Number of Predictors
Problems

Chapter 7 k-Nearest Neighbors (k-NN)
7.1 k-NN Classifier (Categorical Outcome)
    Determining Neighbors
    Classification Rule
    Example: Riding Mowers
    Choosing k
    Setting the Cutoff Value
    k-NN with More Than Two Classes
7.2 k-NN for a Numerical Response
7.3 Advantages and Shortcomings of k-NN Algorithms
Problems

Chapter 8 Naive Bayes
8.1 Introduction
    Example 1: Predicting Fraudulent Financial Reporting
8.2 Applying the Full (Exact) Bayesian Classifier
    Practical Difficulty with the Complete (Exact) Bayes Procedure
    Solution: Naive Bayes
    Example 2: Predicting Fraudulent Financial Reports, Two Predictors
    Example 3: Predicting Delayed Flights
8.3 Advantages and Shortcomings of the Naive Bayes Classifier
Problems

Chapter 9 Classification and Regression Trees
9.1 Introduction
9.2 Classification Trees
    Recursive Partitioning
    Example 1: Riding Mowers
9.3 Measures of Impurity
    Tree Structure
    Classifying a New Observation
9.4 Evaluating the Performance of a Classification Tree
    Example 2: Acceptance of Personal Loan
9.5 Avoiding Overfitting
    Stopping Tree Growth: CHAID
    Pruning the Tree
9.6 Classification Rules from Trees
9.7 Classification Trees for More Than Two Classes
9.8 Regression Trees
    Prediction
    Measuring Impurity
    Evaluating Performance
9.9 Advantages, Weaknesses, and Extensions
Problems

Chapter 10 Logistic Regression
10.1 Introduction
10.2 Logistic Regression Model
    Example: Acceptance of Personal Loan
    Model with a Single Predictor
    Estimating the Logistic Model from Data: Computing Parameter Estimates
    Interpreting Results in Terms of Odds
10.3 Evaluating Classification Performance
    Variable Selection
    Impact of Single Predictors
10.4 Example of Complete Analysis: Predicting Delayed Flights
    Data Preprocessing
    Model Fitting and Estimation
    Model Interpretation
    Model Performance
    Variable Selection
10.5 Appendix: Logistic Regression for Profiling
    Appendix A: Why Linear Regression Is Inappropriate for a Categorical Response
    Appendix B: Evaluating Goodness of Fit
    Appendix C: Logistic Regression for More Than Two Classes
Problems

Chapter 11 Neural Nets
11.1 Introduction
11.2 Concept and Structure of a Neural Network
11.3 Fitting a Network to Data
    Example 1: Tiny Dataset
    Computing Output of Nodes
    Preprocessing the Data
    Training the Model
    Example 2: Classifying Accident Severity
    Avoiding Overfitting
    Using the Output for Prediction and Classification
11.4 Required User Input
11.5 Exploring the Relationship Between Predictors and Response
11.6 Advantages and Weaknesses of Neural Networks
Problems

Chapter 12 Discriminant Analysis
12.1 Introduction
    Example 1: Riding Mowers
    Example 2: Personal Loan Acceptance
12.2 Distance of an Observation from a Class
12.3 Fisher's Linear Classification Functions
12.4 Classification Performance of Discriminant Analysis
12.5 Prior Probabilities
12.6 Unequal Misclassification Costs
12.7 Classifying More Than Two Classes
    Example 3: Medical Dispatch to Accident Scenes
12.8 Advantages and Weaknesses
Problems

Part V MINING RELATIONSHIPS AMONG RECORDS

Chapter 13 Association Rules
13.1 Introduction
13.2 Discovering Association Rules in Transaction Databases
    Example 1: Synthetic Data on Purchases of Phone Faceplates
13.3 Generating Candidate Rules
    The Apriori Algorithm
13.4 Selecting Strong Rules
    Support and Confidence
    Lift Ratio
    Data Format
    Process of Rule Selection
    Interpreting the Results
    Statistical Significance of Rules
    Example 2: Rules for Similar Book Purchases
13.5 Summary
Problems

Chapter 14 Cluster Analysis
14.1 Introduction
    Example: Public Utilities
14.2 Measuring Distance Between Two Records
    Euclidean Distance
    Normalizing Numerical Measurements
    Other Distance Measures for Numerical Data
    Distance Measures for Categorical Data
    Distance Measures for Mixed Data
14.3 Measuring Distance Between Two Clusters
14.4 Hierarchical (Agglomerative) Clustering
    Minimum Distance (Single Linkage)
    Maximum Distance (Complete Linkage)
    Average Distance (Average Linkage)
    Centroid Distance (Average Group Linkage)
    Ward's Method
    Dendrograms: Displaying Clustering Process and Results
    Validating Clusters
    Limitations of Hierarchical Clustering
14.5 Nonhierarchical Clustering: The k-Means Algorithm
    Initial Partition into k Clusters
Problems

Part VI FORECASTING TIME SERIES

Chapter 15 Handling Time Series
15.1 Introduction
15.2 Explanatory versus Predictive Modeling
15.3 Popular Forecasting Methods in Business
    Combining Methods
15.4 Time Series Components
    Example: Ridership on Amtrak Trains
15.5 Data Partitioning
Problems

Chapter 16 Regression-Based Forecasting
16.1 Model with Trend
    Linear Trend
    Exponential Trend
    Polynomial Trend
16.2 Model with Seasonality
16.3 Model with Trend and Seasonality
16.4 Autocorrelation and ARIMA Models
    Computing Autocorrelation
    Improving Forecasts by Integrating Autocorrelation Information
    Evaluating Predictability
Problems

Chapter 17 Smoothing Methods
17.1 Introduction
17.2 Moving Average
    Centered Moving Average for Visualization
    Trailing Moving Average for Forecasting
    Choosing Window Width (w)
17.3 Simple Exponential Smoothing
    Choosing Smoothing Parameter
    Relation between Moving Average and Simple Exponential Smoothing
17.4 Advanced Exponential Smoothing
    Series with a Trend
    Series with a Trend and Seasonality
    Series with Seasonality (No Trend)
Problems

Part VII CASES

Chapter 18 Cases
18.1 Charles Book Club
    The Book Industry
    Database Marketing at Charles
    Data Mining Techniques
    Assignment
18.2 German Credit
    Assignment
18.3 Tayko Software Cataloger
    Background
    The Mailing Experiment
    Data
    Assignment
18.4 Segmenting Consumers of Bath Soap
    Business Situation
    Key Problems
    Data
    Measuring Brand Loyalty
    Assignment
    Appendix
18.5 Direct-Mail Fundraising
    Background
    Data
    Assignment
18.6 Catalog Cross Selling
    Background
    Assignment
18.7 Predicting Bankruptcy
    Predicting Corporate Bankruptcy
    Assignment
18.8 Time Series Case: Forecasting Public Transportation Demand
    Background
    Problem Description
    Available Data
    Assignment Goal
    Assignment
    Tips and Suggested Steps

References

Index