Data Mining by Worapoj Kreesuradej
[Figure: enterprise intelligence architecture. Labels: Custom Application, ERP, Packaged Application, Data Warehouse, Query & Reporting tool, OLAP, Intelligence Enterprise]
Business Impact
[Figure: pyramid of layers, listed here from top to bottom; business impact increases toward the top]
• Data Mining: Information Discovery
• Data Exploration: OLAP, Statistical Analysis, Querying and Reporting
• Data Sources: Paper, Files, Information Providers, Database Systems, OLTP
Potential Applications of Data Mining
• Market analysis and management
  - purchasing patterns over time
  - cross-selling
  - customer profiling
  - direct mail campaigns
  - market segmentation
Potential Applications of Data Mining
• Risk analysis and management
  - forecasting
  - credit scoring for loan application processing
  - profile of attrition (churn management)
Potential Applications of Data Mining
• Fraud detection and management
  - money laundering: detecting suspicious money transactions
  - detecting inappropriate medical treatments
Potential Applications of Data Mining
• Web mining
  - Web Usage Mining
  - Web Content Mining (e.g., automatic classification of Web documents)
  - Web Structure Mining
Potential Applications of Data Mining
• Text mining
  - Dividing documents into groups
  - Document feature extraction
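Document feature extraction can be illustrated with a simple bag-of-words count. This is a minimal sketch, not a method given in the slides: the function name `term_frequencies` and the sample sentence are hypothetical, and real text mining systems typically add stop-word removal, stemming, and weighting.

```python
import re
from collections import Counter

def term_frequencies(text):
    """Turn unstructured text into a structured feature: word -> count."""
    # Lowercase the text and keep alphabetic tokens only (a simplifying assumption).
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

# Hypothetical sample document.
features = term_frequencies("Data mining finds patterns in data.")
```

The resulting counts can then serve as input for grouping documents, since documents with similar word counts tend to cover similar topics.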
[Figure: structured data vs. unstructured data]
Data mining process
[Figure: process flow with stages labeled Business Objective, Databases, Selection, Target Data, Data Preparation, Preprocessed Data, Data Mining, Pattern Evaluation]
Data mining process
• Business Objectives Determination
  - Identify the business problem or opportunity.
• Data Selection
  - Identify all internal or external sources of information and select which subset of the data is needed for the data mining application.
Data mining process
• Data Preprocessing
  - The goal of data preprocessing is to ensure the quality of the selected data.
  - Issues: current data set, sampling data, unit conversion, representation formats, detecting missing values
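Two of the preprocessing tasks above, detecting missing values and unit conversion, can be sketched as follows. This is an illustrative example under stated assumptions: the records, the field names, and the helper functions `count_missing` and `inches_to_cm` are all hypothetical.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"age": 34, "income": 52000.0, "height_in": 70.0},
    {"age": None, "income": 61000.0, "height_in": 65.0},
    {"age": 45, "income": None, "height_in": None},
]

def count_missing(rows):
    """Return the number of missing (None) values per field."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts[field] = counts.get(field, 0) + (value is None)
    return counts

def inches_to_cm(rows, field="height_in", new_field="height_cm"):
    """Unit conversion: replace an inch field with a centimetre field.

    Missing values stay missing rather than being silently filled in.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        value = new_row.pop(field)
        new_row[new_field] = None if value is None else round(value * 2.54, 1)
        converted.append(new_row)
    return converted

missing = count_missing(records)
clean = inches_to_cm(records)
```

Counting missing values per field first helps decide whether to drop a field, drop records, or fill in values before mining.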
Data mining process
• Data Transformation
  - The goal is to transform data to suit the intended analysis and the data formats required by the data mining algorithms, many of which have particular requirements.
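As an illustration of such format requirements, many algorithms expect numeric inputs on a comparable scale. The sketch below shows two common transformations, min-max scaling and one-hot encoding; the slides do not name specific techniques, so both helper functions and the sample values are assumptions.

```python
def min_max_scale(values):
    """Rescale numeric values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors.

    Returns the encoded rows and the sorted category list that defines
    the column order.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories

# Hypothetical inputs: customer ages and membership levels.
scaled = min_max_scale([20, 30, 40])
encoded, categories = one_hot(["gold", "basic", "gold"])
```

Scaling matters for distance-based methods such as K-means, while one-hot encoding lets algorithms that require numeric input handle categorical fields.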
Data mining process
• Data Mining
  - Select modeling technique
  - Data Mining Operations: Predictive Modeling, Database Segmentation, Link Analysis, Visualization
Predictive Modeling
• Finding models that describe and distinguish classes or concepts for future prediction
  - Models: decision tree, neural network
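To make the idea concrete, the sketch below trains a one-level decision tree (a decision stump) on a single numeric feature. It is a deliberate simplification, not a full decision-tree algorithm: it assumes the rule "predict 1 if the value exceeds a threshold", and the income and approval data are hypothetical.

```python
def train_stump(xs, ys):
    """Pick the threshold that minimises misclassifications for the
    rule: predict 1 if x > threshold, else 0."""
    best_threshold, best_errors = None, len(xs) + 1
    candidates = sorted(set(xs))
    # Try the midpoint between each pair of adjacent distinct values.
    for a, b in zip(candidates, candidates[1:]):
        t = (a + b) / 2
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def predict(x, threshold):
    """Apply the learned rule to a new value."""
    return int(x > threshold)

# Hypothetical training data: income (in thousands) and loan approval (1 = approved).
incomes = [15, 22, 28, 40, 55, 70]
approved = [0, 0, 0, 1, 1, 1]
threshold = train_stump(incomes, approved)
```

A full decision tree repeats this kind of split recursively on each resulting subset and over several features.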
Database Segmentation (clustering)
• Partitioning a database into segments of similar records, that is, records that share a number of properties.
  - Models: K-means, Kohonen neural networks
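A minimal K-means sketch follows, assuming records described by two numeric fields (age and annual income in thousands). The data are hypothetical, and the initialisation simply takes the first k points, which is a simplification; practical implementations usually use random or smarter seeding.

```python
def kmeans(points, k, iterations=10):
    """Partition points into k clusters by alternating assignment and
    centroid update steps for a fixed number of iterations."""
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical records: (age, annual income in thousands).
customers = [(25, 30), (27, 32), (26, 31), (55, 90), (57, 95), (56, 92)]
centroids, clusters = kmeans(customers, k=2)
```

On this data the procedure separates the younger, lower-income records from the older, higher-income ones, which is the kind of segment the definition above describes.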
Database Segmentation
[Figure: plot of segmented records; axes: Age, Annual Income]
Link Analysis
• Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories
  - Model: Apriori algorithm
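The Apriori idea can be sketched in a few lines: count candidate itemsets level by level, and only extend itemsets that are already frequent. This is a compact illustration rather than an optimised implementation; the basket data and the `min_support` value are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset that appears in at
    least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent = apriori(baskets, min_support=2)
```

The prune step is what makes the search tractable: an itemset cannot be frequent if any of its subsets is infrequent, so most candidates are never counted.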
[Figure: visualization of link analysis software]
Visualization
[Figure: visualization of a decision tree in MineSet 3.0]
Data mining process
• Analysis of results
  - Interpret and evaluate the output from data mining.
  - Have we found something that is interesting, valid, and actionable?
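For association rules, one common way to judge whether a pattern is interesting and valid is to compute its support, confidence, and lift. The slides do not prescribe these measures, so treat this as an assumed example; the function `rule_metrics` and the basket data are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(a <= t for t in transactions)           # contain the antecedent
    n_c = sum(c <= t for t in transactions)           # contain the consequent
    n_both = sum((a | c) <= t for t in transactions)  # contain both
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    # Lift compares confidence with the consequent's baseline frequency.
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread"},
    {"milk"},
    {"eggs"},
]
support, confidence, lift = rule_metrics(baskets, {"bread"}, {"milk"})
```

A lift above 1 suggests the items occur together more often than chance alone would predict; whether the rule is actionable still depends on the business objective.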
Data mining process
• Assimilation of knowledge
  - The objective is to act on the new, valid, and actionable information from the previous process steps.
Effort Required for Each Data Mining Process Step
Methodology for data mining
• CRISP-DM
  - Cross-Industry Standard Process for Data Mining (CRISP-DM)
• Consortium of data miners from various industries: manufacturing, marketing, and government
Examples of Data Mining Systems
• IBM Intelligent Miner
  - A wide range of data mining algorithms
  - Scalable mining algorithms
  - Toolkits: neural network algorithms, statistical methods, data preparation, and data visualization tools
  - Tight integration with IBM's DB2 relational database system
Examples of Data Mining Systems
• SAS Enterprise Miner
  - A variety of statistical analysis tools
  - Data warehouse tools and multiple data mining algorithms
• Clementine (from SPSS)
  - Multiple data mining algorithms and advanced statistics
Examples of Data Mining Systems
• SQL Server 2005
  - Multiple data mining modules: discovery-driven OLAP analysis, association, classification, and clustering
  - Tight integration with SQL Server relational database system
Examples of Data Mining Systems
• Oracle Data Miner
  - Multiple data mining modules: discovery-driven OLAP analysis, association, classification, and clustering
Examples of Data Mining Systems
• DBMiner (DBMiner Technology Inc.)
  - Multiple data mining modules: discovery-driven OLAP analysis, association, classification, and clustering
  - Efficient association and sequential-pattern mining functions, and a visual classification tool
  - Mining both relational databases and data warehouses
Trends in Data Mining
• Application exploration
  - Development of application-specific data mining systems
  - Invisible data mining (mining as a built-in function)
Trends in Data Mining
• Scalable data mining methods
  - Constraint-based mining: use of constraints to guide data mining systems in their search for interesting patterns
• Integration of data mining with database systems, data warehouse systems, and Web database systems
Trends in Data Mining
• Standardization of data mining language
  - A standard will facilitate systematic development, improve interoperability, and promote the education and use of data mining systems in industry and society
• Web mining
Data Mining: Confluence of Multiple Disciplines
[Figure: Data Mining at the center, surrounded by Database Technology, Statistics, Machine Learning, Visualization, Information Science, and Other Disciplines]
Thank you !!!