Data Description For Data Mining

This document discusses different types of data that are commonly collected and stored in databases for data mining purposes. It describes scientific data, personal and medical data, data about games and athletes, CAD and software engineering data, business transactions, surveillance videos and pictures, data from satellites, and repositories on the world wide web. It then discusses different types of data that can be mined, including flat files, relational databases, data warehouses, transaction databases, spatial databases, multimedia databases, and time-series databases.

Uploaded by

Kimondo King

DATA DESCRIPTION FOR DATA MINING

Types of Information Collected

Every day we collect a myriad of data, ranging from simple numerical measurements and
text documents to more complex information such as hypertext documents, scientific data,
spatial data and multimedia channels. The following is a non-exhaustive list of the kinds of
information often collected in digital form in databases and flat files.

1. Scientific Data

Our society is gathering great amounts of scientific data that need to be analyzed.
Examples include a Swiss nuclear accelerator laboratory counting particles, a South Pole
iceberg station gathering data about oceanic activity, an American university investigating
human psychology and a Canadian forest study recording readings from a grizzly bear radio
collar. Unfortunately, we can capture and store new data faster than we can analyze the old
data that have already accumulated.

2. Personal and Medical Data

From personal and medical data to government records, very large amounts of information
are continuously collected. Governments, individuals and organizations such as hospitals and
schools stockpile large quantities of very important personal data every day to help them
manage human resources, better understand a market, or simply assist clients. Regardless of
the privacy issues this type of data raises, the information is collected, used and even
shared. When combined with other data, this information can shed more light on customer
behavior and preferences.

3. Games

Our society gathers data and statistics about games, players and athletes at a tremendous
rate. These range from car racing, swimming and hockey scores to football and basketball
passes, chess positions and boxers' pushes; all of these data are stored. Trainers and
athletes use the data to improve their performance and to better understand their
opponents, while journalists and commentators use it for reporting.

4. CAD and Software Engineering Data

Many types of Computer Assisted Design (CAD) systems are used by architects and
engineers to design buildings and to picture system components or circuits. These systems
generate a great amount of data. Software engineering is also a source of data, in the form
of code, function libraries and objects, all of which need powerful tools for management and
maintenance.

5. Business Transaction

Every business transaction is usually recorded for the sake of continuity. These
transactions are usually related and can be inter-business deals such as banking, purchasing,
exchanges and stocks, or intra-business operations such as the management of in-house wares
and assets. Large department stores, for example, store millions of transactions every day
with the help of barcodes.
Storage space does not pose a problem, as the price of hard disks keeps dropping, but
using the data effectively within a reasonable time frame for competitive decision-making
is certainly the most important problem for businesses struggling in a competitive world.

6. Surveillance Video and Pictures

With the dramatic fall in video camera prices, video cameras are becoming very common.
Video tapes from surveillance cameras are usually recycled, so their content is lost. Today,
however, there is a tendency to store the tapes and even digitize them for future use and
analysis.

7. Satellite Sensing

Countless satellites circle the globe: some are geostationary above a region and others
orbit the Earth, but all send a non-stop stream of data to the surface. NASA, which
controls a large number of satellites, receives more data per second than all its engineers
and researchers can cope with. Many of the pictures and data captured by satellites are made
public as soon as they are received, in the hope that other researchers can analyze them.

8. World Wide Web (WWW) Repositories

Since the World Wide Web became widely available in the early 1990s, documents of
different formats, contents and descriptions have been collected and interconnected with
hyperlinks, making it the largest repository of data ever built. The World Wide Web is the
most important data collection regularly used for reference because of the wide variety of
topics covered and the countless contributions of resources and authors. Many even believe
that the World Wide Web is a compilation of human knowledge.

Types of Data Mined

Data mining can be applied to any kind of information in a repository, though algorithms
and approaches may differ when applied to different types of data, and the challenges
posed by different types of data vary extensively. Data mining is used and studied for
databases including relational databases, object-relational and object-oriented databases,
data warehouses, transactional databases, unstructured and semi-structured repositories
such as the World Wide Web, advanced databases such as spatial, multimedia and
time-series databases, and flat files. Some of these are discussed in more detail below.

a) Flat Files

These are the most common data source for data mining algorithms, especially at the
research level. Flat files are simply data files in text or binary format with a structure
known to the data mining algorithm to be applied. The data in these files can be
transactions, time-series data, scientific measurements, etc.
b) Relational Databases

This is the most popular type of database system in use today. It stores data in a series
of two-dimensional tables called relations. A relational database consists of a set of tables
containing either values of entity attributes or values of attributes from entity
relationships. Tables have columns and rows, where columns represent attributes and rows
represent tuples. A tuple in a relational table corresponds to either an object or a
relationship between objects and is identified by a set of attribute values representing a
unique key. The table below presents a relation with the attributes student name,
registration number, department and grade in Data Mining, representing fictitious student
grades in a class. This relation is just a subset of what could be a database of student
scores.

c) Relational Database

Student Name   Registration     Department          Grade in Data Mining
Ken            BUS/05/MLD/101   Business            A
John           MKT/05/MLD/105   Marketing           B
Nancy          BFN/05/MLD/203   Banking & Finance   A
Mary           ACC/05/MLD/102   Accountancy         C
Victor         BUS/05/MLD/200   Business            B

The most commonly used query language for relational databases is Structured Query
Language (SQL). It allows retrieval and manipulation of the data stored in the tables, as
well as the calculation of aggregate functions such as sum, min, max and count. Data mining
algorithms that use relational databases can be more versatile than algorithms designed
specifically for flat files, because they can take advantage of the structure inherent in
relational databases. While data mining can benefit from SQL for data selection,
transformation and consolidation, it goes beyond what SQL can provide, for example by
predicting, comparing and detecting deviations.
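
As a minimal sketch of the SQL selection and aggregation described above, the following
uses Python's built-in sqlite3 module with the fictitious grade relation. The table name
student_grade and the column names are assumptions made for illustration only.

```python
import sqlite3

# In-memory database holding the fictitious student grade relation.
# Table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_grade ("
    "name TEXT, registration TEXT PRIMARY KEY, department TEXT, grade TEXT)"
)
rows = [
    ("Ken", "BUS/05/MLD/101", "Business", "A"),
    ("John", "MKT/05/MLD/105", "Marketing", "B"),
    ("Nancy", "BFN/05/MLD/203", "Banking & Finance", "A"),
    ("Mary", "ACC/05/MLD/102", "Accountancy", "C"),
    ("Victor", "BUS/05/MLD/200", "Business", "B"),
]
conn.executemany("INSERT INTO student_grade VALUES (?, ?, ?, ?)", rows)

# Aggregate function: count the students in each department.
per_department = conn.execute(
    "SELECT department, COUNT(*) FROM student_grade "
    "GROUP BY department ORDER BY department"
).fetchall()

# Selection: names of the students who obtained grade A.
grade_a = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student_grade WHERE grade = 'A' ORDER BY name"
    )
]

print(per_department)
print(grade_a)
```

Queries of this kind select and consolidate data; a mining algorithm would then work on
the selected rows.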

d) Data Warehouses

A data warehouse (a storehouse) is a repository of data gathered from multiple, often
heterogeneous, data sources and designed to be used as a whole under one unified schema.
A data warehouse makes it possible to analyze data from different sources under the same
roof. The most efficient data warehousing architecture will be able to incorporate, or at
least reference, all management systems using designated technology suitable for corporate
database management, e.g. Sybase or MS SQL Server.

e) Transaction Databases

This is a set of records that represent transactions, each with a time stamp, an identifier
and a set of items. A transaction file may also have associated descriptive data for the
items.
Rentals
Transaction ID   Date       Time    Customer   Item List
TI               14/09/04   14.40   12         10, 11, 30, 110, ...
...              ...        ...     ...        ...

Fragment of a Transaction Database for Rentals in a Store

The figure above represents a transaction database. Each record is a rental contract with
a customer identifier, a date and a list of items rented. Because relational databases do
not allow nested tables (that is, a set as an attribute value), transactions are usually
stored in flat files or in two normalized tables, one for the transactions and one for the
transaction items. A typical analysis of such data is the so-called market basket analysis,
or association rule mining, in which associations between items occurring together or in
sequence are studied.
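
The split into two normalized tables can be sketched as follows. This is only an
illustration: the field names, the identifiers T1 and T2 and the second record are
hypothetical, and plain Python lists stand in for database tables.

```python
# Rental records with a nested item list, which a single relational row cannot hold.
# Field names, identifiers and the second record are hypothetical.
rentals = [
    {"tid": "T1", "date": "14/09/04", "time": "14.40", "customer": 12,
     "items": [10, 11, 30, 110]},
    {"tid": "T2", "date": "15/09/04", "time": "09.15", "customer": 7,
     "items": [11, 30]},
]


def normalize(records):
    """Split nested records into a transactions table and a transaction-items table."""
    transactions = []
    transaction_items = []
    for record in records:
        # One row per transaction, without the nested list.
        transactions.append(
            (record["tid"], record["date"], record["time"], record["customer"])
        )
        # One row per (transaction, item) pair.
        for item in record["items"]:
            transaction_items.append((record["tid"], item))
    return transactions, transaction_items


transactions, transaction_items = normalize(rentals)
print(transactions)
print(transaction_items)
```

The transaction identifier appears in both tables and acts as the link between them.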

f) Spatial Databases

These are databases that, in addition to the usual data, store geographical information
such as maps and global or regional positioning. This type of database presents new
challenges to data mining algorithms.

g) Multimedia Databases

Multimedia databases include audio, video, image and text media. They can be stored in
extended object-relational or object-oriented databases, or simply on a file system.
Multimedia data are characterized by high dimensionality, which makes data mining more
challenging. Data mining on multimedia repositories may require computer vision, computer
graphics, image interpretation and natural language processing methodologies.

h) Time-Series Databases

This type of database contains time-related data such as stock market data or logged
activities. A time-series database usually receives a continuous flow of new data, which
sometimes calls for challenging real-time analysis. Data mining in these databases often
includes the study of trends and of correlations between the evolution of different
variables, as well as the prediction of trends and movements of the variables over time.
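
One simple way to expose a trend in a time series is a moving average. The sketch below
is an assumption-laden illustration, not a method prescribed by these notes; the function
name moving_average and the price values are invented for the example.

```python
def moving_average(values, window):
    """Return the averages of each run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical daily closing prices.
prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
smoothed = moving_average(prices, 3)

# A rising smoothed series suggests an upward trend despite the dip on day 4.
upward = smoothed[-1] > smoothed[0]
print(smoothed, upward)
```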

i) World Wide Web

The World Wide Web is the most heterogeneous and dynamic repository available. A large
number of authors and publishers continuously contribute to its growth and metamorphosis,
and a massive number of users access its resources daily. The data in the World Wide Web
are organized in interconnected documents, which can be text, audio, video, raw data and
even applications. The World Wide Web comprises three major components: the content of
the web, which encompasses the documents available; the structure of the web, which covers
the hyperlinks and the relationships between documents; and the usage of the web, which
describes how and when the resources are accessed.
A fourth dimension can be added relating to the dynamic nature or evolution of the
documents. Data mining in the World Wide Web, or web mining, addresses all these issues
and is often divided into web content mining, web structure mining and web usage mining.

DATA MINING FUNCTIONALITIES

Data mining functionalities specify the kinds of patterns to be found in a data mining
task. Users very often do not have a clear idea of the kinds of patterns they can, or need
to, discover from the data at hand. It is therefore crucial to have a versatile and
inclusive data mining system that allows the discovery of different kinds of knowledge at
different levels of abstraction. This also makes interactivity an important attribute of a
data mining system.

The data mining functionalities and the variety of knowledge they discover are described in
this section as follows:

1. Classification

Classification, also referred to as supervised classification, is a learning function that
maps (i.e. classifies) an item into one of several given classes. It uses given class labels
to order the objects in the data collection. Classification approaches normally use a
training set in which all objects are already associated with known class labels. The
classification algorithm learns from the training set and builds a model, which is then used
to classify new objects. Examples of classification in data mining applications include the
classification of trends in financial markets and the automated identification of objects of
interest in large image databases. Figure 2.2 shows a simple partitioning of loan data into
two class regions; a linear decision boundary may separate them only imperfectly. The bank
may use the classification regions to decide automatically whether future loan applicants
will be given a loan.
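
The idea of learning from a labeled training set and then classifying new objects can be
sketched with a nearest-centroid classifier, which for two classes yields a linear decision
boundary. This is a swapped-in illustrative technique, not the method behind Figure 2.2;
the loan figures (income and debt, in thousands) and the labels are hypothetical.

```python
from math import dist


def train(samples):
    """Build a model mapping each class label to the centroid of its training points."""
    groups = {}
    for features, label in samples:
        groups.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(points) for column in zip(*points))
        for label, points in groups.items()
    }


def classify(model, features):
    """Assign the label whose centroid is closest to the new object."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training set: (income, debt) pairs with known class labels.
training_set = [
    ((60, 10), "approve"), ((80, 20), "approve"), ((70, 5), "approve"),
    ((20, 30), "reject"), ((30, 40), "reject"), ((25, 35), "reject"),
]
model = train(training_set)

print(classify(model, (65, 15)))
print(classify(model, (28, 33)))
```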
2. Characterization

Data characterization, also called summarization, involves methods for finding a
compact description (general features) of a subset of data or target class, and produces
what are called characteristic rules. The data relevant to a user-specified class are
normally retrieved by a database query and run through a summarization module to
extract the essence of the data at different levels of abstraction. A simple example
would be tabulating the mean and standard deviation of all fields. More sophisticated
methods involve the derivation of summary rules (Usama et al., 1996; Agrawal et al.,
1996), multivariate visualization techniques and the discovery of functional
relationships between variables. Summarization techniques are often applied to
interactive exploratory data analysis and automated report generation
(Usama et al., 1996).
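
The simple example of tabulating the mean and standard deviation of all fields might look
like the sketch below. The records and field names are hypothetical, and the sample
standard deviation from Python's statistics module is assumed.

```python
from statistics import mean, stdev

# Hypothetical records belonging to one target class.
records = [
    {"age": 25, "income": 30.0},
    {"age": 35, "income": 50.0},
    {"age": 45, "income": 70.0},
]


def summarize(rows):
    """Return {field: (mean, sample standard deviation)} over all rows."""
    fields = rows[0].keys()
    return {
        field: (
            mean([row[field] for row in rows]),
            stdev([row[field] for row in rows]),
        )
        for field in fields
    }


summary = summarize(records)
for field, (average, deviation) in summary.items():
    print(field, average, deviation)
```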

3. Clustering

Clustering is similar to classification in that it organizes data into classes. Unlike
classification, however, the class labels in clustering are not predefined (they are
unknown), and it is up to the clustering algorithm to discover acceptable classes.
Clustering is also referred to as unsupervised classification because the classification
is not dictated by given class labels. There are many clustering approaches, all based on
the principle of maximizing the similarity between objects in the same class (intra-class
similarity) and minimizing the similarity between objects of different classes
(inter-class similarity).
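
As one possible illustration of discovering classes without labels, here is a small
k-means sketch. K-means is only one of the many approaches mentioned above; the points
are hypothetical and the initialization (taking the first k points as centroids) is a
simplifying assumption.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly assigning each to its nearest centroid."""
    centroids = list(points[:k])  # naive, deterministic initialization (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the mean of its members; keep it if the cluster is empty.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Hypothetical unlabeled objects described by two numeric attributes.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = k_means(points, 2)
print(clusters)
```

Objects within each resulting cluster are close to one another and far from the objects of
the other cluster.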

4. Prediction (Regression)

Prediction involves learning a function that maps a data item to a real-valued prediction
variable. This method has attracted considerable attention given the potential
implications of successful forecasting in a business context. There are two major types
of prediction: one can either try to predict some unavailable data values or pending
trends, or predict a class label for some data (the latter is tied to classification).
Once a classification model has been built from a training set, the class label of an
object can be predicted from the attribute values of the object and the attribute values
of the classes. Prediction more often refers to forecasting missing numerical values or
increasing/decreasing trends in time-related data. In summary, the main idea of
prediction is to use a large number of past values to estimate probable future values.
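
Using past values to estimate a future value can be sketched with an ordinary
least-squares line fit. The monthly sales figures and the function name fit_line are
hypothetical, and a straight line is only one assumed form for the prediction function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past values: sales recorded in months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # estimated sales for month 6
print(slope, intercept, forecast)
```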
5. Discrimination

Data discrimination generates what are called discriminant rules and is basically the
comparison of the general features of objects between two classes, referred to as the
target class and the contrasting class. For instance, we may want to compare the rental
customers who rented more than 50 movies last year with those who rented fewer than 10.
The techniques used for data discrimination are similar to those used for data
characterization, except that data discrimination results include comparative measures.
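
A comparative measure between a target class and a contrasting class can be sketched as
follows. The customer records, the age attribute and the choice of a difference of means
as the comparative measure are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical rental customers with a yearly rental count and an age attribute.
customers = [
    {"rentals": 62, "age": 24}, {"rentals": 75, "age": 28},
    {"rentals": 55, "age": 26}, {"rentals": 4, "age": 47},
    {"rentals": 8, "age": 51}, {"rentals": 30, "age": 35},
]

# Target class: more than 50 rentals. Contrasting class: fewer than 10 rentals.
target = [c for c in customers if c["rentals"] > 50]
contrast = [c for c in customers if c["rentals"] < 10]


def compare(feature):
    """Compare the mean of one feature between the target and contrasting classes."""
    target_mean = mean(c[feature] for c in target)
    contrast_mean = mean(c[feature] for c in contrast)
    return {
        "target_mean": target_mean,
        "contrast_mean": contrast_mean,
        "difference": target_mean - contrast_mean,
    }


age_comparison = compare("age")
print(age_comparison)
```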

6. Association Analysis
Association analysis is the discovery of what are commonly called association rules.
It studies the frequency of items occurring together in transactional databases and,
based on a threshold called support, identifies the frequent item sets. Another
threshold, confidence, which is the conditional probability that an item appears in a
transaction when another item appears, is used to pinpoint association rules.
Association analysis is commonly used for market basket analysis because it searches
for relationships between variables. For example, a supermarket might gather data on
what each customer buys. Using association rule learning, the supermarket can work out
which products are frequently bought together, which is useful for marketing purposes.
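
Support and confidence as defined above can be computed directly from a list of
transactions. The baskets below are hypothetical, and the function names are chosen for
this sketch only.

```python
# Hypothetical supermarket baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the item set."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(antecedent, consequent):
    """Conditional probability of the consequent given the antecedent."""
    return support(antecedent | consequent) / support(antecedent)


bread_milk_support = support({"bread", "milk"})
bread_to_milk = confidence({"bread"}, {"milk"})
print(bread_milk_support, bread_to_milk)
```

A rule such as bread implies milk would be reported only if both values exceed the chosen
support and confidence thresholds.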

7. Outlier Analysis

Outliers are also referred to as exceptions or surprises. They are data elements that
cannot be grouped into a given class or cluster. Although outliers can be considered
noise and discarded in some applications, they can reveal important knowledge in other
domains, which makes them significant and their analysis valuable.
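
One common way to flag candidate outliers, assumed here purely for illustration, is to
mark values that lie more than a chosen number of standard deviations from the mean. The
readings and the threshold of 2 are hypothetical.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds threshold * std deviation."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - center) / spread > threshold]


# Hypothetical sensor readings with one surprising value.
readings = [10, 11, 9, 10, 12, 10, 50]
outliers = find_outliers(readings)
print(outliers)
```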

8. Evolution and Deviation Analysis

Evolution and deviation analysis deals with the study of time-related data that change
over time. Evolution analysis models evolutionary trends in data, which makes it possible
to characterize, compare, classify or cluster time-related data. Deviation analysis, on
the other hand, is concerned with the differences between measured values and expected
values, and attempts to find the cause of the deviations from the expected values.
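
The first step of deviation analysis, comparing measured values with expected values, can
be sketched as below. The values, the tolerance and the function name are hypothetical;
finding the cause of a deviation is beyond this sketch.

```python
def find_deviations(measured, expected, tolerance):
    """Return (position, measured - expected) for deviations larger than the tolerance."""
    return [
        (index, m - e)
        for index, (m, e) in enumerate(zip(measured, expected))
        if abs(m - e) > tolerance
    ]


# Hypothetical expected and measured values for four time periods.
expected = [100, 100, 100, 100]
measured = [101, 99, 120, 100]

deviations = find_deviations(measured, expected, tolerance=5)
print(deviations)
```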
