Unit-1 Data Mining Metrics

Uploaded by

Suja Mary
Copyright
© All Rights Reserved

Unit-1

DATA MINING METRICS


• Different metrics can be used for different techniques, and the choice also depends on the
level of interest.
• From an overall business or usefulness perspective, a measure such as return on investment
(ROI) could be used.
• ROI examines the difference between what the data mining technique costs and what the
savings or benefits from its use are.
• It could be measured as increased sales, reduced advertising expenditure, or both.
• A more computer science/database perspective can instead be used to measure and compare
the various data mining approaches.
• Under this perspective, it is assumed that business management has determined that a
particular data mining application should be built. Management will subsequently determine
the overall effectiveness of the approach using some ROI (or related) strategy.
• The metrics used include the traditional metrics of space and time based on complexity
analysis.
• In some cases, such as accuracy in classification, more specific metrics targeted to a data
mining task may be used.
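
As a rough illustration of the ROI idea above, the sketch below computes the difference between benefits and cost, and also that difference divided by cost, which is a commonly used normalized form. The function names and all monetary figures are hypothetical and chosen only for the example.

```python
def net_benefit(benefit: float, cost: float) -> float:
    """Difference between what the technique returns and what it costs."""
    return benefit - cost


def roi(benefit: float, cost: float) -> float:
    """Net benefit expressed as a fraction of the cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return net_benefit(benefit, cost) / cost


# Hypothetical figures, for illustration only.
increased_sales = 80_000.0
reduced_advertising = 20_000.0
project_cost = 40_000.0

# The benefit may come from increased sales, reduced advertising, or both.
benefit = increased_sales + reduced_advertising

print(net_benefit(benefit, project_cost))  # 60000.0
print(roi(benefit, project_cost))          # 1.5
```

In practice the benefit side is the hard part to quantify, so any such number is only as reliable as the estimates fed into it.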

SOCIAL IMPLICATIONS OF DATA MINING


• Data mining applications can derive much demographic information concerning
customers that was previously not known or was hidden in the data.
• The unauthorized use of such data could result in the disclosure of information that is
deemed to be confidential.
• Data mining techniques have been targeted at applications such as fraud detection,
identifying criminal suspects, and prediction of potential terrorists.
• These can be viewed as types of classification problems.
• Many classification techniques work by identifying the attribute values that
commonly occur for the target class. Subsequent records are then classified
based on these attribute values.
• These approaches to classification are imperfect, so mistakes can be made.
• Just because an individual makes a series of credit card purchases similar to those often
made when a card is stolen does not mean that the card is stolen or that the individual
is a criminal.
• Users of data mining techniques must be sensitive to these issues and must not violate
any privacy directives or guidelines.
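
The classification idea above, and the way it can go wrong, can be sketched as follows. This is a deliberately naive illustration, not a real fraud detection method: the attributes, records, labels, and function names are all hypothetical. It builds a profile from the attribute values most common in the target class, classifies new records against that profile, and then counts the mistakes.

```python
from collections import Counter

# Hypothetical training records: (merchant_type, amount_band, label).
training = [
    ("electronics", "high", "fraud"),
    ("electronics", "high", "fraud"),
    ("jewelry", "high", "fraud"),
    ("grocery", "low", "legit"),
    ("grocery", "low", "legit"),
    ("fuel", "low", "legit"),
    ("electronics", "high", "legit"),
]


def common_values(records, target):
    """Most frequent value of each attribute among records of the target class."""
    rows = [r[:-1] for r in records if r[-1] == target]
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def classify(attributes, profile):
    """Label a record 'fraud' only if it matches the target-class profile exactly."""
    return "fraud" if attributes == profile else "legit"


profile = common_values(training, "fraud")

# Hypothetical new records with their true labels.
new_records = [
    ("electronics", "high", "fraud"),
    ("electronics", "high", "legit"),  # honest customer who looks like the profile
    ("grocery", "low", "legit"),
    ("jewelry", "high", "fraud"),      # real fraud that the profile misses
    ("fuel", "low", "legit"),
]

predictions = [classify(r[:-1], profile) for r in new_records]
truth = [r[-1] for r in new_records]

correct = sum(p == t for p, t in zip(predictions, truth))
accuracy = correct / len(new_records)
false_positives = sum(p == "fraud" and t == "legit" for p, t in zip(predictions, truth))

print(profile)          # ('electronics', 'high')
print(accuracy)         # 0.6
print(false_positives)  # 1
```

The single false positive is the case the notes warn about: a legitimate purchase pattern that resembles stolen-card behaviour is still legitimate.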

DATA MINING FROM A DATABASE PERSPECTIVE

Data mining can be studied from many different perspectives.


IR researcher - concentrates on the use of data mining techniques to access text data.
Statistician - might look primarily at the historical techniques, including time series analysis,
hypothesis testing, and applications of Bayes' theorem.
Machine learning specialist - might be interested primarily in data mining algorithms that learn.
Algorithms researcher - interested in studying and comparing algorithms based on type and
complexity.
From a database perspective, the following implementation issues are of particular concern:
• Scalability: Algorithms that do not scale up to perform well with massive real-world
datasets are of limited application. Related to this is the fact that techniques should work
regardless of the amount of available main memory.
• Real-world data: Real-world data are noisy and have many missing attribute values.
Algorithms should be able to work even in the presence of these problems.
• Update: Many data mining algorithms work with static datasets. This is not a realistic
assumption, because real datasets change over time.
• Ease of use: Although some algorithms may work well, they may not be well received by
users if they are difficult to use or understand.
These issues are crucial if applications are to be accepted and used in the workplace.
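
A minimal sketch of how three of these issues can interact in code, assuming a numeric attribute that arrives one value at a time and uses None for missing values (the class name, the stream, and the numbers are all hypothetical). The running mean keeps only a few counters, so memory use does not grow with the dataset; it skips missing values instead of failing; and it can absorb new records without recomputing from scratch.

```python
from typing import Iterator, Optional


class RunningMean:
    """Mean of a stream, computed in one pass with constant memory."""

    def __init__(self) -> None:
        self.count = 0      # number of usable values seen so far
        self.missing = 0    # number of missing values skipped
        self.mean = 0.0

    def update(self, value: Optional[float]) -> None:
        if value is None:
            # Real-world data: tolerate missing attribute values.
            self.missing += 1
            return
        self.count += 1
        # Incremental update: no need to store earlier values.
        self.mean += (value - self.mean) / self.count


def stream() -> Iterator[Optional[float]]:
    """Stand-in for records read one at a time from disk or a database."""
    yield from [10.0, None, 20.0, 30.0, None]


stats = RunningMean()
for value in stream():
    stats.update(value)

print(stats.mean, stats.count, stats.missing)  # 20.0 3 2

# Update: a new record arrives later and is folded in directly.
stats.update(40.0)
print(stats.mean)  # 25.0
```

Simply skipping missing values is only one possible policy; whether it is appropriate depends on the application.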

• Data mining today is in a state similar to that of databases in the early 1960s.
• The rise of DBMSs occurred in the early 1970s. Their success has been due partly to
the abstraction of data definition and access primitives to a small core of needed
requirements.
• One crucial part of the database abstraction is query processing support. One reason
relational databases are so popular today is the development of SQL.
• SQL is easy to use and has become a standard query language implemented by all major
DBMS vendors.
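
To make the point about SQL concrete, here is a small sketch using Python's built-in sqlite3 module. The table, column names, and rows are hypothetical. The query states what result is wanted (total sales per customer) and leaves the choice of how to compute it to the DBMS query processor.

```python
import sqlite3

# In-memory database, so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("ann", 120.0), ("bob", 40.0), ("ann", 60.0)],
)

# Declarative query: describe the result, not the access path.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM sales GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)  # [('ann', 180.0), ('bob', 40.0)]
```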
