Data Mining
mcs2100440
MSC IT 4th
Assignment # 1
Subject
Data Mining
Submitted
To
Sir, Ghulam Jillani
Q.No:1
How can we mine multilevel and multidimensional associations?
Explain both with suitable examples.
Association rules generated by mining data at different levels of abstraction are called
multiple-level or multilevel association rules.
Multilevel association rules can be mined efficiently using concept hierarchies under a
support-confidence framework.
Rules at a high concept level may only add to common sense, while rules at a low concept level
may not always be useful.
When a uniform minimum support threshold is used, the search procedure is
simplified.
The method is also simple, in that users are required to specify only a
single minimum support threshold.
The same minimum support threshold is used when mining at each level of abstraction (for
example, when mining from “computer” down to “laptop computer”). Both “computer” and
“laptop computer” are found to be frequent, while “desktop computer” is not.
Sometimes, at a low level of abstraction, the data does not show any significant pattern,
but useful information is hidden behind it.
The aim is to find the hidden information within or between levels of abstraction.
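The uniform-support idea above can be sketched in a few lines of Python. This is only an illustrative sketch: the concept hierarchy, the transactions, and the 40% threshold are hypothetical values chosen for the example, and only single items (not full rules) are counted.

```python
from collections import Counter

# Hypothetical concept hierarchy: each specific item maps to its parent concept.
PARENT = {
    "laptop computer": "computer",
    "desktop computer": "computer",
    "inkjet printer": "printer",
}

# Hypothetical transactions recorded at the lowest level of abstraction.
TRANSACTIONS = [
    {"laptop computer", "inkjet printer"},
    {"laptop computer"},
    {"desktop computer"},
    {"laptop computer", "inkjet printer"},
    {"inkjet printer"},
]


def generalize(transaction):
    """Replace each item by its parent concept (one level up the hierarchy)."""
    return {PARENT.get(item, item) for item in transaction}


def frequent_items(transactions, min_support):
    """Return {item: support} for the single items that meet the threshold."""
    counts = Counter(item for t in transactions for item in t)
    n = len(transactions)
    return {item: c / n for item, c in counts.items() if c / n >= min_support}


MIN_SUPPORT = 0.4  # assumed value; the same threshold is used at every level

# Level 1: general concepts. Level 2: the specific items themselves.
level1 = frequent_items([generalize(t) for t in TRANSACTIONS], MIN_SUPPORT)
level2 = frequent_items(TRANSACTIONS, MIN_SUPPORT)

print("Level 1:", level1)
print("Level 2:", level2)
```

With this sample data, the general concept and one of its children pass the single threshold while the other child does not, which is the situation a uniform minimum support produces.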
Multidimensional Association Rules:
In multidimensional association rules, attributes can be categorical or quantitative.
Quantitative attributes are numeric and have an implicit ordering among their values.
Numeric attributes must be discretized.
A multidimensional association rule involves more than one dimension (predicate).
Example – buys(X, “IBM Laptop computer”) → buys(X, “HP Inkjet Printer”)
Example –
If, in a data cube, the 3D cuboid (age, income, buys) is frequent, this implies that
(age, income), (age, buys), and (income, buys) are also frequent.
Note –
Data cubes are well suited for mining since they make mining faster. The
cells of an n-dimensional data cuboid correspond to the predicate sets.
Example –
age(X, "20..25") ∧ income(X, "30K..41K") → buys(X, "Laptop Computer")
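As a rough illustration of how such a rule could be evaluated, the sketch below discretizes the numeric attributes into the intervals used in the rule and then computes its support and confidence. The customer records and the helper names (age_bin, income_bin, rule_stats) are hypothetical and exist only for this example.

```python
# Hypothetical customer records: (age, income, item bought).
RECORDS = [
    (22, 35_000, "Laptop Computer"),
    (24, 40_000, "Laptop Computer"),
    (23, 31_000, "Inkjet Printer"),
    (35, 38_000, "Laptop Computer"),
    (21, 55_000, "Inkjet Printer"),
    (25, 33_000, "Laptop Computer"),
]


def age_bin(age):
    """Discretize age into the interval used by the rule, or 'other'."""
    return "20..25" if 20 <= age <= 25 else "other"


def income_bin(income):
    """Discretize income into the interval used by the rule, or 'other'."""
    return "30K..41K" if 30_000 <= income <= 41_000 else "other"


def rule_stats(records, age_label, income_label, item):
    """Support and confidence of age ∧ income → buys(item)."""
    antecedent = [
        r for r in records
        if age_bin(r[0]) == age_label and income_bin(r[1]) == income_label
    ]
    both = [r for r in antecedent if r[2] == item]
    support = len(both) / len(records)
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    return support, confidence


support, confidence = rule_stats(RECORDS, "20..25", "30K..41K", "Laptop Computer")
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```

Here the intervals are fixed in advance (static discretization); other binning strategies would change only the two bin functions.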
Quantitative Associations:
Quantitative association rules refer to a special type of association rules in the form of X → Y,
with X and Y each consisting of a set of numerical and/or categorical attributes. Unlike general
association rules, where both the left-hand and the right-hand sides of the rule are categorical
(nominal or discrete) attributes, a quantitative association rule must involve at least one
numerical attribute (on the left or the right side). Examples of this type of association rule can be
categorized into the following two classes, depending on whether the rules are measured by the
frequency of the supporting data records or by some distributional features of some numerical
attributes.
Negative Correlations:
Two people or situations (known as variables) with a negative correlation have an inverse
relationship, which means one increases as the other decreases. Think of school absences, for
example: The higher the number of absences, the lower a student's grades will be. Although
negative correlation is a common part of psychological and statistical analysis, you can also find
examples of negative correlation all around you every day.
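The absences example can be checked numerically with the Pearson correlation coefficient, which is negative when one variable tends to fall as the other rises. The sketch below is only an illustration: the absence counts and grades are made-up values, and Pearson's r is one common way to measure correlation, not the only one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: number of absences and the grade of five students.
absences = [0, 2, 4, 6, 8]
grades = [90, 85, 78, 70, 60]

r = pearson(absences, grades)
print(f"r = {r:.3f}")  # a value close to -1 indicates a strong negative correlation
```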
Problem Statement
In this section, we first introduce a new distance measure on closed frequent patterns, and
then discuss the clustering criterion.
Distance Measure:
Let P1 and P2 be two closed patterns. The distance of P1 and P2 is defined as:
D(P1, P2) = 1 − |T(P1) ∩ T(P2)| / |T(P1) ∪ T(P2)|
Example:
Let P1 and P2 be two patterns: T(P1) = {t1, t2, t3, t4, t5} and T(P2) = {t1, t2, t3, t4, t6},
where ti is a transaction in the database. The distance between P1 and P2 is
D(P1, P2) = 1 − 4/6 = 1/3.
Theorem 1: The distance measure D is a valid distance metric, such that:
1. D(P1, P2) > 0, ∀ P1 ≠ P2
2. D(P1, P2) = 0, ∀ P1 = P2
3. D(P1, P2) = D(P2, P1)
4. D(P1, P2) + D(P2, P3) ≥ D(P1, P3), ∀ P1, P2, P3
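The distance measure can be reproduced with Python sets, as in the minimal sketch below. The function name pattern_distance and the third transaction set T_P3 are hypothetical additions for illustration; exact fractions are used so the result can be compared with the worked example directly.

```python
from fractions import Fraction


def pattern_distance(t1, t2):
    """D = 1 - |T1 ∩ T2| / |T1 ∪ T2| for two sets of supporting transactions."""
    union = t1 | t2
    if not union:
        # Assumption: two empty transaction sets are treated as identical.
        return Fraction(0)
    return 1 - Fraction(len(t1 & t2), len(union))


# Transaction sets from the example.
T_P1 = {"t1", "t2", "t3", "t4", "t5"}
T_P2 = {"t1", "t2", "t3", "t4", "t6"}
# Hypothetical third pattern, used only to spot-check the triangle inequality.
T_P3 = {"t1", "t2", "t6", "t7"}

d12 = pattern_distance(T_P1, T_P2)
print(d12)  # 1/3

# Spot checks of the metric properties on this sample (not a proof).
assert pattern_distance(T_P1, T_P1) == 0
assert d12 == pattern_distance(T_P2, T_P1)
assert d12 + pattern_distance(T_P2, T_P3) >= pattern_distance(T_P1, T_P3)
```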