Mining Kind of data

Data mining can be applied to various forms of data, including relational databases, data warehouses, and multimedia data. Relational databases are structured collections of tables that can be accessed using SQL, while data warehouses integrate information from multiple sources to support decision-making. Key components of a data warehouse include a central database, ETL tools, metadata, and access tools, and they provide a multidimensional view of data through structures like data cubes.

Uploaded by

virat18kohli360

WHAT KINDS OF DATA CAN BE MINED?
• As a general technology, data mining can be applied to any kind of data as long as the data are meaningful for a target application.
• The most basic forms of data for mining applications are relational databases, object-relational databases and object-oriented databases, data warehouse data, and transactional data.
• Data mining can also be applied to other forms of data, e.g., data streams, ordered/sequence data, graph or networked data, spatial data, text data, multimedia data, and the WWW.
Relational Database
• A relational database is defined as a collection of data organized in tables with rows and columns.
• Briefly, a relational database is a collection of tables, each of which is assigned a unique name.
• Each table consists of columns and rows, where columns represent attributes and rows (records) represent tuples.
• Each tuple in a relational table represents an object identified by a unique key and described by a set of attribute values.
Database Schema
• A database schema is the skeleton structure that represents the logical view of the entire database.
• It defines how the data are organized and how the relations among the data are associated.
• It formulates all the constraints that are to be applied to the data.
• A database schema defines the entities of the database and the relationships among them.
• It contains descriptive detail about the database.
Example of a relational schema for a relational database
• customer (cust_ID, name, address, age, occupation, annual_income, credit_information, category, . . .)
• item (item_ID, brand, category, type, price, place_made, supplier, cost, . . .)
• employee (empl_ID, name, category, group, salary, commission, . . .)
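As a rough illustration of how the customer and item relations listed above map onto tables, the sketch below creates two of them in an in-memory SQLite database using Python's standard sqlite3 module. The column types, the subset of attributes, and the sample row are assumptions made for the example; the slides do not specify them.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Column types and the chosen subset of attributes are assumptions.
conn.executescript("""
CREATE TABLE customer (
    cust_id       INTEGER PRIMARY KEY,   -- unique key identifying each tuple
    name          TEXT NOT NULL,
    address       TEXT,
    age           INTEGER,
    occupation    TEXT,
    annual_income REAL
);
CREATE TABLE item (
    item_id  INTEGER PRIMARY KEY,
    brand    TEXT,
    category TEXT,
    type     TEXT,
    price    REAL
);
""")

# One tuple (row) described by a set of attribute values (hypothetical data).
conn.execute(
    "INSERT INTO customer VALUES (1, 'Asha', '12 Park Road', 34, 'engineer', 52000.0)"
)
row = conn.execute("SELECT name, age FROM customer WHERE cust_id = 1").fetchone()

# Column names of the customer table, read back from the schema.
columns = [info[1] for info in conn.execute("PRAGMA table_info(customer)")]

# The unique key rejects a second tuple with the same cust_id.
try:
    conn.execute(
        "INSERT INTO customer VALUES (1, 'Ravi', '9 Lake View', 41, 'teacher', 38000.0)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Each CREATE TABLE statement names a relation and its attributes, and the primary key plays the role of the unique key that identifies a tuple.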
Categories of Database schema
• A database schema can be divided broadly into two categories:
• Physical schema: describes how the data are actually stored (files, indices, storage structures).
• Logical schema: defines the logical structure of the data, such as the tables, their attributes, the integrity constraints, and the relationships among tables.
E-R diagram
• A semantic data model, such as an entity-relationship (ER) data model, is often constructed for relational databases.
• An ER data model represents the database as a set of entities and their relationships.
SQL
• Relational data can be accessed by database queries written in a relational query language such as SQL.
• SQL is the standard interface to a relational database.
• SQL is the most commonly used query language for relational databases. It allows retrieval and manipulation of the data stored in the tables, as well as the calculation of aggregate functions such as average, sum, min, max and count.
• For instance, an SQL query to count the videos in each category would be:
SELECT category, COUNT(*) FROM Items WHERE type = 'video' GROUP BY category;
• Data mining algorithms that work on relational databases can be more versatile than those written for flat files, since they can take advantage of the structure inherent in relational databases.
• While data mining can benefit from SQL for data selection, transformation and consolidation, it goes beyond what SQL can provide, for example predicting, comparing, and detecting deviations.
• Applications: data mining, the ROLAP (relational OLAP) model, etc.
• Relational databases are one of the most commonly available and richest information repositories, and thus they are a major data form in the study of data mining.
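The aggregate query discussed in this section can be tried out with a small sketch like the following, which uses Python's standard sqlite3 module. The Items table layout and its rows are invented for the example, and AVG, MIN and MAX are included only to show the other aggregate functions mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical Items table; only the columns the query needs.
conn.execute(
    "CREATE TABLE Items (item_id INTEGER PRIMARY KEY, type TEXT, category TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO Items (type, category, price) VALUES (?, ?, ?)",
    [
        ("video", "comedy", 10.0),
        ("video", "comedy", 14.0),
        ("video", "drama", 12.0),
        ("book", "drama", 8.0),   # filtered out by the WHERE clause
    ],
)

# Count the videos in each category, plus a few other aggregates.
rows = conn.execute(
    "SELECT category, COUNT(*), AVG(price), MIN(price), MAX(price) "
    "FROM Items WHERE type = 'video' "
    "GROUP BY category ORDER BY category"
).fetchall()

for category, n, avg_price, low, high in rows:
    print(category, n, avg_price, low, high)
```

Note that the string literal 'video' must be quoted and that the grouping column appears in the SELECT list, so each count is labelled with its category.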
Data warehouse
• A data warehouse is a repository of information collected from multiple sources, stored under a unified schema, and usually residing at a single site.
• A data warehouse is an integrated, subject-oriented, and time-variant repository of information in support of management's decision-making process.
• Data warehouses are constructed via a process of data cleaning, data integration, data transformation, data loading, and periodic data refreshing.
• In other words, data from the different stores are loaded, cleaned, transformed and integrated together.
• To facilitate decision-making and multi-dimensional views, data warehouses are usually modelled by a multi-dimensional data structure.
Characteristics of Data warehouse
• Subject-oriented: A data warehouse is organized around subjects (topics) of interest rather than around the overall day-to-day processes of a business.
• Integrated: A data warehouse is developed by integrating data from varied sources into a consistent format.
• Time-variant: The data in a data warehouse are associated with a specific time period, so the warehouse provides information from a historical perspective.
• Non-volatile: Data once entered into a data warehouse must remain unchanged. All data is read-only. Previous data is not erased when current data is entered.
Structure of data warehouse system
Three types of Data warehouse
• An enterprise data warehouse (EDW) is a system for structuring and storing all of a company's business data for analytics, querying and reporting.
• A data mart is a smaller form of data warehouse, which serves some specific data analysis needs.
• A virtual warehouse is a set of views over an operational database for efficient query processing.
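The idea of a virtual warehouse as a set of views can be sketched with SQLite through Python's sqlite3 module. The sales table, the sales_by_region view, and the data are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny "operational" table plus a summary view defined over it.
conn.executescript("""
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL);
INSERT INTO sales (region, amount)
    VALUES ('north', 100.0), ('north', 50.0), ('south', 70.0);
CREATE VIEW sales_by_region AS
    SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

query = "SELECT region, total FROM sales_by_region ORDER BY region"
before = conn.execute(query).fetchall()

# The view stores no data of its own: a new operational row shows up
# the next time the view is queried.
conn.execute("INSERT INTO sales (region, amount) VALUES ('south', 30.0)")
after = conn.execute(query).fetchall()
```

Because a plain view is recomputed on each query, it always reflects the current operational data, in contrast to a physical warehouse that is loaded and refreshed periodically.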
DW Architecture
Key components of a data warehouse
• A typical data warehouse has four main components:
  • Central database
  • ETL (extract, transform, load) tools
  • Metadata
  • Access tools
• Building the warehouse involves extracting information from the source systems with an ETL process and storing it in a staging database. The daily changes also arrive in the staging area.
• Another ETL process transforms information from the staging area to populate the operational data store (ODS).
• The ODS then supplies information, via another ETL process, to the data warehouse, which in turn feeds a number of data marts that generate the reports required by management.
• The data in a data warehouse are organized around major subjects to facilitate decision making.
• The data are stored to provide information from a historical perspective and are typically summarized.
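The extract, transform and load flow described above can be sketched in plain Python. This is a simplified illustration only: the two source systems, their field names, and the records are invented, and the separate ODS and data mart stages are collapsed into a single warehouse list.

```python
from collections import defaultdict

# Two hypothetical source systems with inconsistent field names and formats.
store_a = [
    {"cust": "C1", "amount": "120.50", "date": "2024-01-05"},
    {"cust": "C2", "amount": None, "date": "2024-01-06"},   # incomplete record
]
store_b = [
    {"customer_id": "C1", "total_cents": 3000, "day": "2024-02-01"},
]

def extract():
    """Copy raw records into a staging area, tagged with their source."""
    return [("a", rec) for rec in store_a] + [("b", rec) for rec in store_b]

def transform(staging):
    """Clean the staged records and map them onto one unified schema."""
    unified = []
    for source, rec in staging:
        if source == "a":
            if rec["amount"] is None:      # cleaning: drop incomplete records
                continue
            unified.append({"cust_id": rec["cust"],
                            "amount": float(rec["amount"]),
                            "month": rec["date"][:7]})
        else:
            unified.append({"cust_id": rec["customer_id"],
                            "amount": rec["total_cents"] / 100,
                            "month": rec["day"][:7]})
    return unified

def load(warehouse, records):
    """Append only: existing warehouse rows are never modified."""
    warehouse.extend(records)

def summarize(warehouse):
    """Total amount per customer and month (a subject-oriented summary)."""
    totals = defaultdict(float)
    for rec in warehouse:
        totals[(rec["cust_id"], rec["month"])] += rec["amount"]
    return dict(totals)

warehouse = []
load(warehouse, transform(extract()))
summary = summarize(warehouse)
```

The month field keeps the time dimension with every record, which is what allows the warehouse to answer questions from a historical perspective.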
Data Cube
• A data warehouse is usually modelled by a multidimensional data structure, called a data cube, in which each dimension corresponds to an attribute or a set of attributes in the schema, and each cell stores the value of some aggregate measure such as count or sum.
• A data cube provides a multidimensional view of data and allows the precomputation and fast access of summarized data.
Example: A cube with the dimensions Country x Degree x Semester
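To make the Country x Degree x Semester example concrete, the sketch below precomputes a small cube in plain Python, with count as the aggregate measure. The enrolment records are invented, and the "*" marker for "all values of this dimension" is a convention chosen for the example.

```python
from collections import Counter
from itertools import product

ALL = "*"  # stands for "all values" of a dimension

# Hypothetical fact records: (country, degree, semester), one per student.
enrolments = [
    ("India", "BSc", "S1"),
    ("India", "BSc", "S1"),
    ("India", "MSc", "S2"),
    ("Canada", "BSc", "S1"),
    ("Canada", "MSc", "S1"),
]

def build_cube(facts):
    """Precompute the count for every combination of dimension values,
    including combinations where some dimensions are aggregated away."""
    cube = Counter()
    for fact in facts:
        # Each dimension is either kept or replaced by ALL: 2**3 cells per fact.
        for hide_mask in product([False, True], repeat=len(fact)):
            key = tuple(ALL if hide else value
                        for value, hide in zip(fact, hide_mask))
            cube[key] += 1
    return cube

cube = build_cube(enrolments)

print(cube[("India", "BSc", "S1")])   # one base cell
print(cube[("India", ALL, ALL)])      # all students from India
print(cube[(ALL, ALL, ALL)])          # grand total
```

Once the cube is built, every summary is a single dictionary lookup, which is the point of precomputation. The price is storage, since the number of cells grows quickly with the number of dimensions.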

Data Cube Operations
• Roll-up: zooming out on the data cube, i.e., summarizing the data at a coarser level.
• Drill-down: zooming in on the data to a more detailed level; it is therefore the reverse of roll-up.
• Slice and dice: operations for browsing the data in the cube.
  • A slice is a subset of the cube corresponding to a single value of one dimension.
  • A dice is obtained by performing a selection on two or more dimensions.
• Pivot: The pivot operation is used when the user wishes to re-orient the view of the data cube.
  • It may involve swapping the rows and columns.
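The cube operations can be sketched over a dictionary of cells in plain Python. The cell values and the function names (roll_up, slice_cube, dice, pivot) are illustrative assumptions, not a standard API. Drill-down is not shown as a function because it simply means returning to the more detailed cells from which a roll-up was computed.

```python
from collections import defaultdict

DIMS = ("country", "degree", "semester")

# Hypothetical base cells: (country, degree, semester) -> student count.
cells = {
    ("India", "BSc", "S1"): 2,
    ("India", "MSc", "S2"): 1,
    ("Canada", "BSc", "S1"): 1,
    ("Canada", "MSc", "S1"): 1,
}

def roll_up(cells, dims, drop):
    """Zoom out by aggregating one dimension away."""
    keep = [i for i, d in enumerate(dims) if d != drop]
    out = defaultdict(int)
    for key, value in cells.items():
        out[tuple(key[i] for i in keep)] += value
    return dict(out)

def slice_cube(cells, dims, dim, value):
    """Keep the cells where one dimension has a single fixed value."""
    i = dims.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}

def dice(cells, dims, allowed):
    """Select on two or more dimensions; allowed maps dimension -> set of values."""
    idx = {dims.index(d): values for d, values in allowed.items()}
    return {k: v for k, v in cells.items()
            if all(k[i] in values for i, values in idx.items())}

def pivot(cells, dims, new_order):
    """Re-orient the cube by reordering its dimensions."""
    order = [dims.index(d) for d in new_order]
    return {tuple(k[i] for i in order): v for k, v in cells.items()}

by_country_semester = roll_up(cells, DIMS, drop="degree")
first_semester = slice_cube(cells, DIMS, "semester", "S1")
india_bsc_msc = dice(cells, DIMS, {"country": {"India"}, "degree": {"BSc", "MSc"}})
reoriented = pivot(cells, DIMS, ("semester", "country", "degree"))
```

In this representation a pivot only changes the order of the key components; the cell values themselves are untouched.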
Benefits of data warehouse
• Provides a single version of truth about enterprise information
• Speeds up ad hoc reports and queries
• Improved data consistency
• Better business decisions
• Easier access to enterprise data for end-users
• Better documentation of data
• Reduced computer costs and higher productivity
