DM Ch3: Data Preprocessing

INF 489: Data Mining
Instructor: Dr. Mohamed H. Farrag

[Textbook cover: DATA MINING - Concepts and Techniques, Jiawei Han | Micheline Kamber | Jian Pei]


Textbook

Main textbook:
Data Mining: Concepts and Techniques (3rd ed.)
Jiawei Han, Micheline Kamber, and Jian Pei
University of Illinois at Urbana-Champaign & Simon Fraser University

Introduction to Data Mining, 2nd Edition
Tan, Steinbach, Karpatne, Kumar

Modified for Introduction to Data Mining by Dr. Mohamed H. Farrag


Chapter 3
Data Preprocessing


Data Preprocessing



Data Preprocessing
• Aggregation

• Sampling

• Dimensionality Reduction

• Feature subset selection

• Feature creation

• Discretization and Binarization

• Attribute Transformation



Aggregation
• Combining two or more attributes (or objects) into a
single attribute (or object)
• Purpose
- Data reduction
• Reduce the number of attributes or objects
- Change of scale
• Cities aggregated into regions, states, countries, etc.
• Days aggregated into weeks, months, or years
- More "stable" data
• Aggregated data tends to have less variability
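A minimal sketch of aggregation with pandas (the table and column names "city", "state", "date", and "precip" are illustrative assumptions, not from the slides): days are rolled up into months and cities into states, which both reduces the data and changes its scale.

import pandas as pd

# Hypothetical daily data: one row per city per day
df = pd.DataFrame({
    "city":   ["Sydney", "Sydney", "Perth", "Perth"],
    "state":  ["NSW", "NSW", "WA", "WA"],
    "date":   pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-01", "2020-01-02"]),
    "precip": [5.0, 0.0, 1.2, 3.4],
})

# Change of scale: aggregate days into months and cities into states
monthly_by_state = (
    df.groupby(["state", df["date"].dt.to_period("M")])["precip"]
      .sum()                                   # data reduction: many rows become one per group
      .reset_index(name="monthly_precip")
)
print(monthly_by_state)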



Example: Precipitation in Australia
[Figure: histograms of the standard deviation of average monthly precipitation and of the standard deviation of average yearly precipitation across Australia]



Sampling
• Sampling is the main technique employed for data reduction.
- It is often used for both the preliminary investigation of the data and the final data analysis.
• Statisticians often sample because obtaining the entire set of data of interest is too expensive or time consuming.
• Sampling is typically used in data mining because processing the entire set of data of interest is too expensive or time consuming.



Sampling ...
• The key principle for effective sampling is the following:
- Using a sample will work almost as well as using the entire data set, if the sample is representative
- A sample is representative if it has approximately the same properties (of interest) as the original set of data



Types of Sampling
• Simple random sampling
- There is an equal probability of selecting any particular item
• Sampling without replacement
- Once an object is selected, it is removed from the population
• Sampling with replacement
- A selected object is not removed from the population
• Stratified sampling
- Partition the data set, and draw samples from each partition (proportionally, i.e., approximately the same percentage of the data)
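A brief sketch of these sampling schemes with pandas (the data frame and the "label" column are illustrative assumptions): simple random sampling with and without replacement, and stratified sampling that keeps roughly the same class proportions as the full data.

import pandas as pd

df = pd.DataFrame({"x": range(100), "label": ["a"] * 70 + ["b"] * 30})

without_repl = df.sample(n=10, replace=False, random_state=0)  # selected rows leave the pool
with_repl    = df.sample(n=10, replace=True,  random_state=0)  # the same row may be drawn twice

# Stratified sampling: draw ~10% from each 'label' partition
stratified = (
    df.groupby("label", group_keys=False)
      .apply(lambda g: g.sample(frac=0.1, random_state=0))
)
print(stratified["label"].value_counts())  # proportions mirror the full data: 7 'a', 3 'b'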



Sample Size

[Figure: the same two-dimensional data set drawn with 8000 points, 2000 points, and 500 points]



Sampling: With or without Replacement



Sampling: Cluster or Stratified Sampling

[Figure: raw data (left) and a cluster/stratified sample (right)]



Curse of Dimensionality
• When dimensionality increases, data becomes increasingly sparse in the space that it occupies
• Definitions of density and distance between points, which are critical for clustering and outlier detection, become less meaningful

[Figure: difference between the maximum and minimum distance between any pair of 500 randomly generated points, plotted against the number of dimensions (5 to 50)]
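A small sketch of the experiment described in the figure (my own reconstruction, not the authors' code): generate 500 random points and watch the relative gap between the maximum and minimum pairwise distance shrink as the number of dimensions grows.

import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
for dim in (2, 10, 50):
    points = rng.random((500, dim))              # 500 random points in [0,1]^dim
    d = pdist(points)                            # all pairwise Euclidean distances
    print(dim, (d.max() - d.min()) / d.min())    # relative spread shrinks as dim grows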
Dimensionality Reduction
• Purpose:
—Avoid curse of dimensionality
—Reduce amount of time and memory required by data
mining algorithms
—Allow data to be more easily visualized
—May help to eliminate irrelevant features or reduce noise

• Techniques
—Principal Components Analysis (PCA)
—Singular Value Decomposition
—Others: supervised and non-linear techniques
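As one concrete example, PCA is available in scikit-learn; a minimal sketch, assuming the data are already in a numeric array:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 20)          # 200 objects, 20 attributes
pca = PCA(n_components=2)            # keep the 2 directions of largest variance
X_reduced = pca.fit_transform(X)     # shape (200, 2): easier to visualize and process
print(pca.explained_variance_ratio_)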



Feature Subset Selection
• Another way to reduce the dimensionality of the data
• Redundant features
- Duplicate much or all of the information contained in
one or more other attributes
- Example: purchase price of a product and the amount of
sales tax paid
• Irrelevant features
- Contain no information that is useful for the data
mining task at hand
- Example: students' ID is often irrelevant to the task of
predicting students' GPA
• Many techniques developed, especially for classification
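A rough sketch of two simple filters in this spirit (the data, column names, and correlation-based rule are my own illustrative assumptions, not a method from the slides): a feature that is almost perfectly correlated with another is redundant, and a feature with essentially no relationship to the target is irrelevant.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
price = rng.uniform(10, 100, 500)
df = pd.DataFrame({
    "price":      price,
    "sales_tax":  price * 0.07,               # redundant: duplicates 'price'
    "student_id": np.arange(500),             # irrelevant to the target
    "gpa":        rng.normal(3.0, 0.5, 500),  # target
})

corr = df.corr().abs()
print(corr.loc["price", "sales_tax"])   # ~1.0 -> redundant, drop one of the two
print(corr.loc["student_id", "gpa"])    # ~0.0 -> irrelevant, drop
reduced = df.drop(columns=["sales_tax", "student_id"])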



Feature Creation
• Create new attributes that can capture the important
information in a data set much more efficiently than the
original attributes
• Three general methodologies:
- Feature extraction (creation of a new set of features from the
original data)
• Typically domain-specific: feature extraction techniques apply only in particular domains
- Feature construction (one or more new features constructed
out of the original features can be more useful than the original)
• Example: dividing mass by volume to get density
- Mapping data to new space (totally different view of the data
can reveal important and interesting features)
• Example: Fourier and wavelet analysis
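A tiny sketch of feature construction and of mapping data to a new space (the columns and the signal are made up for illustration): density built from mass and volume, and a Fourier transform that exposes a dominant frequency hidden by noise.

import numpy as np
import pandas as pd

# Feature construction: density = mass / volume
df = pd.DataFrame({"mass": [2.0, 10.0, 4.5], "volume": [1.0, 2.0, 1.5]})
df["density"] = df["mass"] / df["volume"]

# Mapping to a new space: a noisy 5 Hz sine looks simple in the frequency domain
t = np.linspace(0, 1, 400, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t) + 0.3 * np.random.randn(400)
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(400, d=t[1] - t[0])
print(freqs[spectrum.argmax()])   # ~5.0, the hidden frequency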



Discretization
• Discretization is the process of converting a continuous attribute into a categorical attribute
- A potentially infinite number of values are mapped into a small number of categories
- Discretization is commonly used in classification
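A short sketch with pandas (the bin counts and labels are illustrative assumptions): equal-width and equal-frequency discretization of a continuous attribute.

import numpy as np
import pandas as pd

values = pd.Series(np.random.default_rng(0).normal(50, 15, 1000))

# Equal-width bins: split the value range into 4 intervals of the same width
equal_width = pd.cut(values, bins=4, labels=["low", "medium", "high", "very high"])

# Equal-frequency bins: each category gets roughly the same number of objects
equal_freq = pd.qcut(values, q=4, labels=["q1", "q2", "q3", "q4"])

print(equal_width.value_counts())
print(equal_freq.value_counts())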



Binarization
• Binarization maps a continuous or categorical attribute into one or more binary variables
• Typically used for association analysis
• Often, a continuous attribute is first converted to a categorical attribute, and the categorical attribute is then converted to a set of binary attributes
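A minimal sketch of both steps with pandas (attribute names and bin edges are invented): a continuous attribute is first discretized, and the resulting categorical attribute is then expanded into one binary attribute per category, the form association analysis expects.

import pandas as pd

df = pd.DataFrame({"income": [12_000, 48_000, 95_000, 30_000]})

# Step 1: continuous -> categorical
df["income_level"] = pd.cut(df["income"], bins=[0, 25_000, 60_000, float("inf")],
                            labels=["low", "medium", "high"])

# Step 2: categorical -> set of binary attributes (one column per category)
binary = pd.get_dummies(df["income_level"], prefix="income")
print(binary.astype(int))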



Attribute Transformation
• An attribute transform is a function that maps the entire set of values of a given attribute to a new set of replacement values such that each old value can be identified with one of the new values
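Two common examples, sketched in Python (my own illustrations, not from the slides): a simple functional transform such as log, and standardization (z-score). Each maps every old value to exactly one new value.

import numpy as np

x = np.array([1.0, 10.0, 100.0, 1000.0])

log_x = np.log10(x)                 # simple functional transform: 0, 1, 2, 3
z     = (x - x.mean()) / x.std()    # standardization: mean 0, standard deviation 1
print(log_x, z)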



Similarity and Dissimilarity Measures
• Similarity measure
- Numerical measure of how alike two data objects are.
- Is higher when objects are more alike.
- Often falls in the range [0,1]
• Dissimilarity measure
- Numerical measure of how different two data objects
are
- Lower when objects are more alike
- Minimum dissimilarity is often 0
- Upper limit varies
- Proximity refers to a similarity or dissimilarity

Similarity/Dissimilarity for Simple Attributes

The following table shows the similarity and dissimilarity between two objects, x and y, with respect to a single, simple attribute.

Attribute Type      Dissimilarity                            Similarity
Nominal             d = 0 if x = y; d = 1 if x != y          s = 1 if x = y; s = 0 if x != y
Ordinal             d = |x - y| / (n - 1)                    s = 1 - d
                    (values mapped to integers 0 to n-1,
                    where n is the number of values)
Interval or Ratio   d = |x - y|                              s = -d,  s = 1/(1 + d),  s = e^(-d),
                                                             s = 1 - (d - min_d)/(max_d - min_d)
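The table can be read as a small dispatch on the attribute type; a sketch in Python (the function name is mine, and for interval/ratio attributes only one of the listed similarity choices is used):

import math

def similarity(x, y, attr_type, n_values=None):
    """Similarity of two values of a single, simple attribute, per the table above."""
    if attr_type == "nominal":
        return 1.0 if x == y else 0.0
    if attr_type == "ordinal":                  # values assumed mapped to 0..n-1
        d = abs(x - y) / (n_values - 1)
        return 1.0 - d
    if attr_type in ("interval", "ratio"):
        d = abs(x - y)
        return 1.0 / (1.0 + d)                  # one of the listed choices for s
    raise ValueError(attr_type)

print(similarity("red", "blue", "nominal"))      # 0.0
print(similarity(1, 3, "ordinal", n_values=5))   # 0.5
print(similarity(2.0, 4.5, "interval"))          # ~0.29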

Data Mining: Exploring Data



What is data exploration?
A preliminary exploration of the data to better understand its characteristics.

• Key motivations of data exploration include
- Helping to select the right tool for preprocessing or analysis
- Making use of humans' abilities to recognize patterns
• People can recognize patterns not captured by data analysis tools



Techniques Used in Data Exploration

• In our discussion of data exploration, we focus on
- Summary statistics
- Visualization
- Online Analytical Processing (OLAP)



Summary Statistics
• Summary statistics are numbers that summarize properties of the data
- Summarized properties include frequency, location, and spread
• Examples: location - mean; spread - standard deviation
- Most summary statistics can be calculated in a single pass through the data



Frequency and Mode
• The frequency of an attribute value is the percentage of times the value occurs in the data set
- For example, given the attribute 'gender' and a representative population of people, the gender 'female' occurs about 50% of the time.
• The mode of an attribute is the most frequent attribute value
• Range is the difference between the max and min
• The variance or standard deviation is the most common measure of the spread of a set of points
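These statistics are one-liners in pandas; a quick sketch on a made-up attribute:

import pandas as pd

values = pd.Series([4, 7, 7, 2, 9, 7, 4])

freq = values.value_counts(normalize=True)  # frequency of each value (as a fraction)
mode = values.mode()[0]                     # most frequent value -> 7
rng  = values.max() - values.min()          # range -> 7
var  = values.var()                         # variance (spread)
std  = values.std()                         # standard deviation
print(freq, mode, rng, var, std)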



Visualization
Visualization is the conversion of data into a visual or
tabular format so that the characteristics of the data and
the relationships among data items or attributes can be
analyzed or reported.

• Visualization of data is one of the most powerful and appealing techniques for data exploration.
—Humans have a well developed ability to analyze large
amounts of information that is presented visually
—Can detect general patterns and trends
—Can detect outliers and unusual patterns



Visualization Techniques: Histograms
• Histogram
- Usually shows the distribution of values of a single variable
- Divide the values into bins and show a bar plot of the number of objects in each bin
- The height of each bar indicates the number of objects
- Shape of histogram depends on the number of bins
• Example: Petal Width (10 and 20 bins, respectively)

[Figure: histograms of Iris petal width with 10 bins and 20 bins]
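A minimal matplotlib sketch of the idea (random data stands in for the Iris petal widths shown in the figure): the same values plotted with 10 bins and with 20 bins.

import numpy as np
import matplotlib.pyplot as plt

petal_width = np.random.default_rng(0).normal(1.2, 0.75, 150)  # stand-in for Iris petal width

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].hist(petal_width, bins=10)   # coarser shape
axes[1].hist(petal_width, bins=20)   # finer, possibly noisier shape
axes[0].set_xlabel("petal width")
axes[1].set_xlabel("petal width")
plt.show()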



Histogram from Weka

[Screenshot: the Weka Explorer Preprocess tab showing the 'weather' relation (14 instances, 5 attributes) and the bar chart for the nominal attribute 'outlook' with the values overcast, rainy, and sunny]



Two-Dimensional Histograms

[Figure: two-dimensional histogram of petal width and petal length]



Visualization Techniques: Box Plots
• Box Plots
- Invented by J. Tukey
- Another way of displaying the distribution of data
- The following figure shows the basic parts of a box plot

[Figure: box plot annotated with outlier, 90th percentile, 75th percentile, 50th percentile, 25th percentile, and 10th percentile]



Example of Box Plots
• Box plots can be used to compare attributes

[Figure: box plots of sepal length, sepal width, petal length, and petal width]



Visualization Techniques: Scatter Plots
• Scatter plots
- Attribute values determine the position
- Two-dimensional scatter plots are most common, but there can also be three-dimensional scatter plots
- Often, additional attributes can be displayed by using the size, shape, and color of the markers that represent the objects
- Arrays of scatter plots can compactly summarize the relationships of several pairs of attributes
• See example on the next slide
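A short matplotlib sketch (with synthetic data standing in for the Iris attributes): position comes from two attributes, while marker color and size encode two more.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
sepal_length = rng.normal(5.8, 0.8, 150)
petal_length = rng.normal(3.7, 1.7, 150)
species      = rng.integers(0, 3, 150)        # third attribute -> marker color
petal_width  = rng.normal(1.2, 0.7, 150)      # fourth attribute -> marker size

plt.scatter(sepal_length, petal_length,
            c=species, s=20 + 40 * np.abs(petal_width), cmap="viridis")
plt.xlabel("sepal length")
plt.ylabel("petal length")
plt.show()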



Scatter Plot Array of Iris Attributes
[Figure: matrix of pairwise scatter plots of sepal length, sepal width, petal length, and petal width, with Setosa, Versicolour, and Virginica shown with different markers]



Parallel Coordinates Plots for Iris Data

[Figure: parallel coordinates plots of the Iris data (Setosa, Versicolour, Virginica) for two attribute orderings: sepal length, sepal width, petal length, petal width; and sepal width, sepal length, petal length, petal width]



Star Plots for Iris Data

[Figure: star plots of individual Iris flowers from the Setosa, Versicolour, and Virginica classes]



OLAP
• On-Line Analytical Processing (OLAP) was proposed by E. F. Codd, the father of the relational database.
• Relational databases put data into tables, while OLAP uses a multidimensional array representation.
- Such representations of data previously existed in statistics and other fields
• There are a number of data analysis and data exploration operations that are easier with such a data representation.



Example

[Figure: the Iris data represented as a multidimensional array, with petal width discretized into low, medium, and high and the species (Setosa, Versicolour, Virginica) as another dimension]



OLAP Operations: Data Cube
• A data cube is a multidimensional representation of data, together with all possible aggregates.
• By all possible aggregates, we mean the aggregates that result by selecting a proper subset of the dimensions and summing over all remaining dimensions.

[Figure: a data cube with Date and Product ID among its dimensions]



OLAP Operations: Slicing and Dicing
• Slicing is selecting a group of cells from the entire multidimensional array by specifying a specific value for one or more dimensions.
• Dicing involves selecting a subset of cells by specifying a range of attribute values.
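With a multidimensional array representation, slicing and dicing are just array indexing; a sketch with NumPy (the cube, its dimension order, and its sizes are illustrative assumptions):

import numpy as np

# Hypothetical sales cube: dimensions (product, location, date)
cube = np.arange(3 * 4 * 12).reshape(3, 4, 12)

slice_ = cube[1, :, :]       # slicing: fix one dimension (product id 1)
dice   = cube[:, 0:2, 0:3]   # dicing: a range of locations and a range of dates

rollup_over_dates = cube.sum(axis=2)   # aggregate (roll up) over the date dimension
print(slice_.shape, dice.shape, rollup_over_dates.shape)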



OLAP Operations: Roll-up and Drill-down
• Attribute values often have a hierarchical structure.
- Each date is associated with a year, month, and week.
- A location is associated with a continent, country, state (province, etc.), and city.
- Products can be divided into various categories, such as clothing, electronics, and furniture.
• Note that these categories often nest and form a tree or lattice
- A year contains months, which contain days
- A country contains states, which contain cities



Summary
• Data attribute types: nominal, binary, ordinal, interval-scaled, ratio-scaled
• Many types of data sets, e.g., numerical, text, graph, Web, image
• Gain insight into the data by:
- Basic statistical data description: central tendency, dispersion, graphical displays
- Data visualization: map data onto graphical primitives
- Measuring data similarity
• The above steps are the beginning of data preprocessing.
• Many methods have been developed, but this is still an active area of research.



References
• W. Cleveland. Visualizing Data. Hobart Press, 1993
• T. Dasu and T. Johnson. Exploratory Data Mining and Data Cleaning. John Wiley, 2003
• U. Fayyad, G. Grinstein, and A. Wierse. Information Visualization in Data Mining and Knowledge Discovery. Morgan Kaufmann, 2001
• L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, 1990
• H. V. Jagadish, et al. Special Issue on Data Reduction Techniques. Bulletin of the Technical Committee on Data Engineering, 20(4), Dec. 1997
• D. A. Keim. Information Visualization and Visual Data Mining. IEEE Trans. on Visualization and Computer Graphics, 8(1), 2002
• D. Pyle. Data Preparation for Data Mining. Morgan Kaufmann, 1999
• S. Santini and R. Jain. "Similarity Measures". IEEE Trans. on Pattern Analysis and Machine Intelligence, 21(9), 1999
• E. R. Tufte. The Visual Display of Quantitative Information, 2nd ed. Graphics Press, 2001
• C. Yu, et al. Visual Data Mining of Multimedia Data for Social and Behavioral Studies. Information Visualization, 8(1), 2009



References
• D. P. Ballou and G. K. Tayi. Enhancing Data Quality in Data Warehouse Environments. Comm. of ACM, 42:73-78, 1999
• A. Bruce, D. Donoho, and H.-Y. Gao. Wavelet Analysis. IEEE Spectrum, Oct. 1996
• T. Dasu and T. Johnson. Exploratory Data Mining and Data Cleaning. John Wiley, 2003
• J. Devore and R. Peck. Statistics: The Exploration and Analysis of Data. Duxbury Press, 1997
• H. Galhardas, D. Florescu, D. Shasha, E. Simon, and C.-A. Saita. Declarative Data Cleaning: Language, Model, and Algorithms. VLDB'01
• M. Hua and J. Pei. Cleaning Disguised Missing Data: A Heuristic Approach. KDD'07
• H. V. Jagadish, et al. Special Issue on Data Reduction Techniques. Bulletin of the Technical Committee on Data Engineering, 20(4), Dec. 1997
• H. Liu and H. Motoda (eds.). Feature Extraction, Construction, and Selection: A Data Mining Perspective. Kluwer Academic, 1998
• J. E. Olson. Data Quality: The Accuracy Dimension. Morgan Kaufmann, 2003
• D. Pyle. Data Preparation for Data Mining. Morgan Kaufmann, 1999
• V. Raman and J. Hellerstein. Potter's Wheel: An Interactive Framework for Data Cleaning and Transformation. VLDB'01
• T. Redman. Data Quality: The Field Guide. Digital Press (Elsevier), 2001
• R. Wang, V. Storey, and C. Firth. A Framework for Analysis of Data Quality Research. IEEE Trans. Knowledge and Data Engineering, 7:623-640, 1995

