DM 5th unit ppt

Uploaded by

Sandhya Rani

Mining Time Series Data:

• A time series database consists of sequences of values or events
obtained over repeated measurements of time.
Ex: a reading taken every 2 minutes
• The values are typically measured at equal time intervals.
Ex: hourly, daily, weekly
• A time series database is a kind of “sequence database”: a sequence
database holds ordered events, with or without explicit timestamps.
• Time series databases are popular in many applications, such as
- Stock market analysis
- Observation of natural phenomena, such as atmosphere, temperature,
wind and earthquakes
- Scientific and engineering experiments
- Medical treatments
• Every organization generates a high volume of data every single day,
for example:
- Sales figures
- Revenue
- Traffic or operating costs
• Time series data mining can generate valuable information for long-term
business decisions.
• Time series are very frequently plotted as line charts.
• Time series “forecasting” is the use of a model to predict future
values based on previously observed values.
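A minimal Python sketch of forecasting as defined above. These notes do not name a model, so two simple ones are assumed here: a moving average and simple exponential smoothing. The function names, the `window` and `alpha` parameters and the sales figures are illustrative, not part of the original material.

```python
def moving_average_forecast(series, window=3):
    """Predict the next value as the mean of the last `window` observations."""
    if window <= 0 or len(series) < window:
        raise ValueError("need at least `window` observations")
    recent = series[-window:]
    return sum(recent) / window


def exponential_smoothing_forecast(series, alpha=0.5):
    """Predict the next value with simple exponential smoothing.

    Each new observation pulls the running level towards itself by a
    fraction `alpha` (0 < alpha <= 1), so recent values weigh more.
    """
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical daily sales figures, used only to exercise the functions.
daily_sales = [120, 130, 125, 140, 150]
print(moving_average_forecast(daily_sales, window=3))
print(exponential_smoothing_forecast(daily_sales, alpha=0.5))
```

Both models use only previously observed values, which is why they struggle when the past alone does not determine the future.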
Strengths:
• Many well-established algorithms are available.
• Fast computation is possible.
Weaknesses:
• Forecasting a time series can be very hard because of the inherent
uncertainty of the underlying system.
• Correct input data can still give a wrong result, as in some weather
reports.
• Sometimes the past of a time series is not enough to predict its
future.
• Outliers are hard to deal with efficiently.
• Multiple periodicities are hard to deal with efficiently.
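One possible way to approach the outlier problem listed above is to flag values that lie far from the median of the series. The sketch below assumes the modified z-score based on the median absolute deviation (MAD). The constant 0.6745 and the cut-off 3.5 are a common convention, not something these notes prescribe, and the readings are invented.

```python
from statistics import median


def flag_outliers(series, threshold=3.5):
    """Return the indices whose modified z-score exceeds `threshold`."""
    med = median(series)
    # Median absolute deviation: a spread measure that outliers barely move.
    mad = median([abs(x - med) for x in series])
    if mad == 0:
        return []  # no spread to compare against
    return [i for i, x in enumerate(series)
            if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical sensor readings with one obvious spike.
readings = [10, 11, 10, 12, 11, 95, 10, 11]
print(flag_outliers(readings))
```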
Applications of Time series Analysis:
• Economic forecasting
• Sales forecasting
• Budgetary forecasting
• Stock market Analysis
• Yield Projections
• Process and Quality Control
• Inventory Studies
• Work Load Projections
• Utility Studies
• Census Analysis
Examples of Time Series:

• Sales and profits of a company's product in different years.
• National income measured over the most recent 10 years.
• Monthly bank deposits and bank clearings.
• Daily sales of milk and milk products at a dairy over a month.
• Share prices on a stock exchange for each day of a week.
Spatial Data Mining:
It is the process of discovering potentially useful patterns from large
spatial datasets.
Spatial Database:
It stores a large amount of space-related data, such as maps,
preprocessed remote sensing or medical imaging data, and VLSI chip
layouts.
E.g.: GIS data, ISRO and NASA data, radar data, etc.
Properties of Spatial Data Mining:
• Exploring new models
• New objective functions
• New patterns
Applications of Spatial Data Mining:
• Used in geographical modelling
• Used in analysis
• Used to provide business intelligence
• Used for research purposes
Spatial Data Mining Tasks:
The basic tasks of spatial data mining are:
• Classification
• Association Rules
• Characteristic Rules
• Discriminant Rules
• Clustering
• Trend Detection
1. Classification:
Finds a set of rules that determine the class of an object according
to its attributes.
2. Association Rules:
Finds rules from the database. Association rules describe patterns
that occur often in the database.
3. Characteristic Rules:
Describe some part of the database.
E.g.: “A bridge is an object in a place where a road crosses a river.”
4. Discriminant Rules:
Describe differences between two parts of the database.
E.g.: find differences between cities with high and low unemployment rates.
5. Clustering:
Groups objects so that objects in one cluster are similar and objects
from different clusters are dissimilar.
6. Trend Detection:
• Finds trends in the database.
• A trend is a temporal pattern in some time series data.
• A spatial trend is defined as a pattern of change of a non-spatial
attribute in the neighborhood of a spatial object.
Spatial Data Mining Techniques:
• There is no unique way of classifying spatial data mining (SDM) techniques.
• The various kinds of patterns that can be discovered from databases can be
presented in different forms.
SDM techniques are as follows:
1. Clustering & outlier detection
a) Partitioning Method
b) Hierarchical Method
c) Density-Based Method
d) Grid-Based Method
2. Association & co-location
3. Classification
4. Trend Detection
1. Clustering and Outlier Detection:
• Spatial clustering is the process of grouping a set of spatial objects into
groups called clusters.
• Objects within a cluster show a high degree of similarity, whereas
the clusters are as dissimilar as possible.
• Clustering is a very well-known technique in statistics, and clustering
algorithms have been adapted to deal with large geographical datasets.
Clustering algorithms can be separated into four general categories:
a) Partitioning Method
b) Hierarchical Method
c) Density-Based Method
d) Grid-Based Method
a) Partitioning Method:
• A partitioning algorithm organizes the objects into clusters such that the
total deviation of each object from its cluster center is minimized.
• At the beginning, k initial cluster centers are chosen and each object is
assigned to its nearest center; the centers and assignments are then
refined iteratively.
• K-means is a commonly used, fundamental partitioning algorithm.
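A minimal sketch of the k-means idea in Python for 2-D points. It assumes the first k points serve as initial centers, which keeps the example deterministic; random initialisation is more common in practice. The function name, the fixed iteration count and the sample coordinates are illustrative.

```python
import math


def k_means(points, k, iterations=10):
    """Cluster 2-D points with a basic k-means loop.

    Returns (centers, labels), where labels[i] is the index of the
    center assigned to points[i].
    """
    centers = list(points[:k])  # assumption: first k points as initial centers
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (sum(x for x, _ in members) / len(members),
                              sum(y for _, y in members) / len(members))
    return centers, labels


# Hypothetical locations forming two visible groups.
locations = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centers, labels = k_means(locations, k=2)
print(centers, labels)
```

Each update step moves a center to the mean of its members, which is what reduces the total deviation of objects from their cluster centers.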
b) Hierarchical Method:
• A hierarchical method decomposes the dataset hierarchically by splitting
or merging clusters until a stopping criterion is met.
• Two well-known hierarchical clustering algorithms are BIRCH (Balanced
Iterative Reducing and Clustering using Hierarchies) and CURE (Clustering
Using Representatives).
c) Density-Based Method:
• The method regards clusters as dense regions of objects that are separated by
regions of low density.
• In contrast to partitioning methods, clusters of arbitrary shape can be
discovered.
• Density-based methods can be used to filter out noise and outliers.
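The notes do not name a specific density-based algorithm. The sketch below assumes DBSCAN, one widely used example, in a simplified form that scans all points for every neighbourhood query. The parameter names `eps` (neighbourhood radius) and `min_pts` (minimum neighbours for a dense region) and the sample points are illustrative.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE for outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point of a cluster
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical points: two dense groups and one isolated point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(sample, eps=1.5, min_pts=3))
```

The isolated point never gathers enough neighbours, so it keeps the NOISE label. This is how a density-based method filters out outliers.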
d) Grid-Based Method:
• Grid-based clustering algorithms first quantize the clustering space into a
finite number of cells and then perform the required operations on the grid
structure.
• Cells that contain more than a certain number of points are treated as dense.
• The main advantage of the approach is its fast processing time, since the
time is independent of the number of data objects and depends only on the
number of cells.
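The grid-based idea can be sketched in Python as two steps: count the points per cell, then group adjacent dense cells. This is a simplified illustration under assumed names (`cell_size`, `min_points`) and an assumed rule that cells sharing an edge belong to the same cluster. It does not reproduce any particular published algorithm.

```python
from collections import Counter


def dense_cells(points, cell_size, min_points):
    """Return the grid cells that hold more than `min_points` points."""
    counts = Counter((int(x // cell_size), int(y // cell_size))
                     for x, y in points)
    return {cell for cell, n in counts.items() if n > min_points}


def merge_dense_cells(cells):
    """Group edge-adjacent dense cells into clusters (flood fill)."""
    remaining = set(cells)
    clusters = []
    while remaining:
        stack = [remaining.pop()]
        cluster = set(stack)
        while stack:
            cx, cy = stack.pop()
            for nb in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.add(nb)
                    stack.append(nb)
        clusters.append(cluster)
    return clusters


# Hypothetical points: six fall into two neighbouring cells, one is isolated.
sample = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.1),
          (1.1, 0.3), (1.5, 0.5), (1.9, 0.9), (5.5, 5.5)]
cells = dense_cells(sample, cell_size=1.0, min_points=2)
print(merge_dense_cells(cells))
```

After the single counting pass, `merge_dense_cells` touches only cells, never individual points, which reflects the processing-time advantage described above.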
2. Association & Co-location:
• When clustering methods are applied to the data, we can find only
characteristic rules, which describe spatial objects according to their
non-spatial attributes.
• Association and co-location mining instead looks for rules that associate
one or more spatial objects with other spatial objects.
3. Classification:
• Every data object stored in a database is characterized by its attributes.
• Classification is a technique whose aim is to find rules that describe the
partition of the database into an explicitly given set of classes.
4. Trend Detection:
• A spatial trend is a regular change of one or more non-spatial attributes
when spatially moving away from a start object.
• Spatial trend detection is a technique for finding patterns of attribute
changes with respect to the neighborhood of some spatial object.
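One simple way to quantify a spatial trend is to regress a non-spatial attribute on the distance from the start object. The sketch below assumes an ordinary least-squares line; the notes do not specify a model. The function name, the data layout and the land-price figures are hypothetical.

```python
import math


def spatial_trend(start, neighbours):
    """Fit attribute = a + b * distance by least squares and return slope b.

    `start` is the (x, y) position of the start object; `neighbours` is a
    list of ((x, y), attribute_value) pairs for nearby objects.
    """
    dists = [math.dist(start, pos) for pos, _ in neighbours]
    values = [value for _, value in neighbours]
    n = len(dists)
    mean_d = sum(dists) / n
    mean_v = sum(values) / n
    sxx = sum((d - mean_d) ** 2 for d in dists)
    if sxx == 0:
        raise ValueError("all objects are at the same distance")
    sxy = sum((d - mean_d) * (v - mean_v) for d, v in zip(dists, values))
    return sxy / sxx


# Hypothetical land prices that fall when moving away from a city centre.
city_centre = (0, 0)
plots = [((1, 0), 90), ((2, 0), 80), ((0, 3), 70), ((4, 0), 60)]
print(spatial_trend(city_centre, plots))
```

A clearly negative slope indicates that the attribute decreases regularly with distance from the start object; a slope near zero suggests no trend.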
Challenges of Web Mining:
a) Complexity of web pages:
• Web pages do not have a unifying structure.
• They are far more complex than traditional text documents.
b) The web is a dynamic data source:
• The data on the internet is updated quickly, e.g. news, weather,
shopping, financial news, sports, etc.
c) Diversity of client networks:
• The client network on the web is expanding quickly.
• These clients have different interests, backgrounds and usage purposes.
d) Relevancy of data:
• A specific person is generally concerned with only a small portion of
the web, while the rest of the web contains data that is not relevant to
that user and may lead to unwanted results.
e) The web is too broad:
• The size of the web is tremendous and rapidly increasing.
• The web appears to be too huge for data warehousing and data
mining.
Web Mining:
Fig: Web mining is divided into Web Content Mining, Web Usage Mining and
Web Structure Mining. The figure also lists three techniques: Clustering,
Classification and Association.

Text Mining:
It is the process of “extracting required data” from large collections
of documents from various sources.
Ex: news articles, research papers, books, digital libraries, electronic
publications, e-mail messages, electronic documents, web pages, etc.
Goal: Finding patterns and trends across multiple documents.
• Text mining is the part of data mining that involves processing text
from documents.
• The text is used to “gather high-quality information”.
• Computational linguistics principles are used to evaluate text.
• In text mining, the data is stored in an unstructured format.
• It is used in fields like bioscience and consumer profile analysis.
• Text mining is basically an AI technology that involves processing the
data from various text documents.
Text Mining Process:
a) Text Transformation:
• A text transformation is a technique that is used to control the
capitalization of the text.
• Two ways of representing a document are:
- bag of words
- vector space
b) Text Preprocessing:
Used for extracting useful information and knowledge from unstructured text
data.
Feature selection:
• The process of reducing the input for processing, or of finding the
essential information sources.
• Feature selection is also called “variable selection”.
Evaluate:
Computational linguistics principles are used to evaluate text.
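The two document representations named under Text Transformation can be sketched briefly in Python. The example assumes lower-casing as the capitalization control and cosine similarity as the way to compare documents in the vector space. Both choices, and the function names, are illustrative.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count how often each word occurs."""
    return Counter(text.lower().split())


def cosine_similarity(bag_a, bag_b):
    """Treat two bags as vectors over the vocabulary and compare them."""
    dot = sum(bag_a[w] * bag_b[w] for w in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents.
doc_a = bag_of_words("Data mining finds patterns in data")
doc_b = bag_of_words("Text mining finds patterns in documents")
print(doc_a)
print(cosine_similarity(doc_a, doc_b))
```

In the bag-of-words view, word order is discarded and only counts remain. In the vector-space view, those counts become coordinates, so documents can be compared geometrically.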
Applications:
- Online library catalogue system.
- Online library document management system.
- Web search engines.
Basic Measures:
• Precision
• Recall
• F-Score
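1. Precision:
$$\text{Precision} = \frac{|\{\text{Relevant}\} \cap \{\text{Retrieved}\}|}{|\{\text{Retrieved}\}|}$$
2. Recall:
$$\text{Recall} = \frac{|\{\text{Relevant}\} \cap \{\text{Retrieved}\}|}{|\{\text{Relevant}\}|}$$
3. F-Score:
$$\text{F-Score} = \frac{\text{Recall} \times \text{Precision}}{(\text{Recall} + \text{Precision})/2}$$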
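The three measures can be computed directly from the sets of relevant and retrieved documents. The sketch below follows the definitions above; the function name and the document ids are made up for illustration.

```python
def precision_recall_f(relevant, retrieved):
    """Return (precision, recall, f_score) for two collections of doc ids."""
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)  # |{Relevant} ∩ {Retrieved}|
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = (recall * precision) / ((recall + precision) / 2)
    return precision, recall, f_score


# Hypothetical query: 4 relevant documents, 3 retrieved, 2 in common.
print(precision_recall_f(relevant={1, 2, 3, 4}, retrieved={3, 4, 5}))
```

The F-score is the harmonic mean of precision and recall, so it is high only when both are high.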
Fig: Relationship between the set of relevant documents and the set of
retrieved documents. Within the set of all documents, the relevant documents
and the retrieved documents overlap in the region "relevant and retrieved".
Text Retrieval Methods:
1. Document Selection Method:
Uses the Boolean retrieval model (AND / OR / NOT).
2. Document Ranking Method: The goal is to approximate the
degree of relevance of a document with a score computed from
information such as the frequency of words in the document and in
the whole collection.
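The two retrieval methods can be contrasted in a few lines of Python. Document selection is shown as a Boolean AND over an inverted index. Document ranking is shown with a simple TF-IDF score, one common way to combine word frequency in the document with frequency in the whole collection. The scoring formula, the function names and the tiny corpus are assumptions for illustration.

```python
import math
from collections import Counter


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return index


def boolean_and(index, terms):
    """Document selection: ids of documents containing every query term."""
    sets = [index.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def rank(docs, terms):
    """Document ranking: order document ids by a simple TF-IDF score."""
    n = len(docs)
    counts = {d: Counter(text.lower().split()) for d, text in docs.items()}
    scores = {}
    for d, tf in counts.items():
        score = 0.0
        for t in terms:
            df = sum(1 for c in counts.values() if t in c)  # document frequency
            if df:
                score += tf[t] * math.log(n / df)
        scores[d] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical three-document collection.
corpus = {1: "data mining of text data", 2: "web mining", 3: "text retrieval"}
print(boolean_and(build_index(corpus), ["text", "mining"]))
print(rank(corpus, ["data"]))
```

Selection returns an unordered set of exact matches, while ranking returns every document ordered by estimated relevance.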
Tokenization:
Stop list: regularly used terms that are ignored, e.g. a, the, for, with.
Word stem: words that share a common stem are treated as one term,
e.g. long, longer, longest.
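A toy Python tokenizer showing a stop list and stemming together. The stop list holds only the four terms named above, and the stemmer strips just two suffixes, which is enough for the long / longer / longest example. Real stemmers use far larger rule sets, so this is an illustration rather than a usable stemmer.

```python
import re

STOP_LIST = {"a", "the", "for", "with"}  # the terms named in the notes
SUFFIXES = ("est", "er")                 # toy rule set, an assumption


def stem(word):
    """Strip one known suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def tokenize(text):
    """Split into lower-case words, drop stop-list terms, then stem."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in words if w not in STOP_LIST]


print(tokenize("The longest road, with a longer bridge for long trucks"))
```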
Multimedia Mining:
• Multimedia data mining is used for extracting interesting
information from multimedia data sets.
• Multimedia mining is a subfield of data mining that is used to find
interesting information or implicit knowledge in multimedia
databases.
Multimedia data includes:
• Audio data
• Video data
• Image data
• Graphical data
• Speech data
• Text data
Categories of Multimedia Data Mining:
Fig: Multimedia data mining divides into static media (text mining, image
mining) and dynamic media (audio mining, video mining).
• Multimedia data mining is classified into two categories: static media
and dynamic media.
• Static media contains text (digital libraries, creating SMS and MMS) and
images (photos and media images).
• Dynamic media contains audio (music and MP3 sounds) and video (such as
movies).
Applications of Multimedia Mining:
• Digital Library
• Traffic video sequences
• Media Analysis
• Customer Perception
• Media making and broadcasting
• Mobiles
• Digital cameras
• Internet, etc.
Multimedia Data Mining Processing:
• Data collection is the initial stage of the learning system.
• Pre-processing extracts significant features from the raw data; it
includes data cleaning, transformation, normalization, feature
extraction, etc.
• Learning can be direct if informative types can be recognized at the
pre-processing stage.
• The complete process depends heavily on the nature of the raw data and
on the problem domain.
• The product of pre-processing is the training set.
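A small Python sketch of the pre-processing stage, assuming numeric feature rows have already been extracted from the raw media. Cleaning is shown as dropping incomplete rows and normalization as min-max scaling. Both are just one reasonable choice, and the function name and data are hypothetical.

```python
def preprocess(raw_rows):
    """Turn raw feature rows into a normalised training set.

    Cleaning: rows containing None are dropped.
    Normalisation: each column is min-max scaled to the range [0, 1].
    """
    cleaned = [row for row in raw_rows if None not in row]
    if not cleaned:
        return []
    columns = list(zip(*cleaned))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in cleaned
    ]


# Hypothetical raw features (e.g. duration, file size) with one missing value.
raw = [[10, 200], [20, None], [30, 400], [20, 300]]
training_set = preprocess(raw)
print(training_set)
```

The returned list plays the role of the training set that is handed to the learning stage.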
Fig: Multimedia data mining process. Stages shown: Data Collection, Raw
Data, Data Preprocessing (Data Cleaning, Feature Extraction, Feature
Selection), Training Set, Machine Learning, Model.
Architecture of Multimedia Data Mining:
• The architecture has several components:
1. Input
2. Multimedia content
3. Spatiotemporal segmentation
4. Feature extraction
5. Finding similar patterns
Fig: Architecture of multimedia data mining. Stages shown: Input,
Multimedia Contents (Text, Image, Audio, Video), Spatiotemporal
Segmentation, Feature Extraction, Finding Similar Patterns, Evaluation of
Results.

Multimedia Data Mining:
a) Similarity search in multimedia data
b) Multidimensional analysis of multimedia data
c) Classification and prediction analysis of multimedia data
d) Mining associations in multimedia data
e) Image analysis, pattern recognition, digital image content mining
f) Mining associations in multimedia data includes “associations between
image content and non-image content”.
g) Associations among image contents related to spatial relationships.
Tool: MultiMediaMiner, an extension of the DBMiner system.
Procedure: Feature extraction produces descriptors, i.e. only descriptions
of each image are stored.
• The layout descriptor is computed on an image grid (8*8, 4*4, ...); for
the 8*8 grid, 64 cells are stored.
Feature: The Image Excavator extracts images and uses image context
information such as HTML tags.
• Hierarchies of keywords are derived from the directories that are searched.
• Multimedia is a combination of text, graphics, sound, animation and video
that is delivered interactively to the user by electronic or digitally
manipulated means.
Spatial Data Mining:
• The process of discovering interesting and previously unknown but
potentially useful patterns from large spatial databases.
• A spatial database stores a large amount of space-related data, such as
maps and preprocessed remote sensing or medical imaging data.
• A spatial database is a database system that is optimized to store and
query basic spatial objects.
• Example of a point object: a house, a city, a moving car.
