Overview of Data Mining Techniques and Image Segmentation
Number   Items   Support
1        XYZ     Total transactions = 5
2        XYW     Support {XY} = 2/5 = 40%
3        YZ      Support {YZ} = 3/5 = 60%
4        XZ      Support {XYZ} = 1/5 = 20%
5        YZW
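As a quick check of the support column, the following Python sketch recomputes the three values from the five transactions. The function name `support` and the set-based representation are illustrative choices, not part of the original paper.

```python
# The five transactions from the table, each stored as a set of items.
transactions = [
    {"X", "Y", "Z"},
    {"X", "Y", "W"},
    {"Y", "Z"},
    {"X", "Z"},
    {"Y", "Z", "W"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# set("XY") is {"X", "Y"}, so a string of item letters works as an itemset.
for items in ("XY", "YZ", "XYZ"):
    print(f"Support {{{items}}} = {support(items, transactions):.0%}")
```

The subset test `itemset <= t` counts a transaction only when it contains all items of the itemset, which is what the table's 2/5, 3/5 and 1/5 express.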
Confidence: It measures how often items in Y
appear in transactions that contain X.
Confidence {X => Y} = Number of transactions
containing both X and Y / Number of transactions
containing X.
Number   Items   Confidence
1        XYZ     Confidence {X => Y} = 2/3 = 67%
2        XYW     Confidence {Y => Z} = 3/4 = 75%
3        YZ      Confidence {XY => Z} = 1/2 = 50%
4        XZ
5        YZW
This powerful exploratory technique has a wide
range of applications in many areas of business
practice and research, from the analysis of consumer
preferences or human resource management to the
history of language. These techniques enable analysts
and researchers to uncover hidden patterns in large
data sets, such as "customers who order product A
often also order product B or C".
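To make the "A often also B" pattern concrete, here is a minimal Python sketch that recomputes the confidence values for the five example transactions and then lists the single-item rules above a threshold. The helper names and the threshold `MIN_CONFIDENCE = 0.7` are assumptions chosen for illustration; the paper does not specify a threshold.

```python
from itertools import permutations

# The five example transactions, each stored as a set of items.
transactions = [
    {"X", "Y", "Z"},
    {"X", "Y", "W"},
    {"Y", "Z"},
    {"X", "Z"},
    {"Y", "Z", "W"},
]

def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def confidence(antecedent, consequent):
    """Confidence {A => C} = count(A together with C) / count(A)."""
    both = set(antecedent) | set(consequent)
    return count(both) / count(antecedent)

print(f"Confidence {{X => Y}}  = {confidence('X', 'Y'):.0%}")
print(f"Confidence {{Y => Z}}  = {confidence('Y', 'Z'):.0%}")
print(f"Confidence {{XY => Z}} = {confidence('XY', 'Z'):.0%}")

# Hypothetical threshold, not taken from the paper.
MIN_CONFIDENCE = 0.7

# Brute-force scan over all ordered pairs of single items.
items = sorted(set().union(*transactions))
rules = [
    (a, c, confidence(a, c))
    for a, c in permutations(items, 2)
    if confidence(a, c) >= MIN_CONFIDENCE
]
for a, c, conf in rules:
    print(f"{a} => {c} (confidence {conf:.0%})")
```

The brute-force scan is only practical for a handful of items; it is meant to show what a rule is, not how rules are mined at scale.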
B. Clustering
Cluster analysis groups objects (observations,
events) based on the information found in the data
describing the objects or their relationships. The goal
is that the objects in a group will be similar (or
related) to one another and different from (or
unrelated to) the objects in other groups. The greater
the similarity (or homogeneity) within a group, and
the greater the difference between groups, the better
or more distinct the clustering.
Clustering methods are classified into five approaches:
partitioning algorithms, hierarchical algorithms,
density-based methods, grid-based methods, and
model-based methods.
Partitioning Based Algorithm
This algorithm minimizes the clustering criterion
by iteratively relocating data points between
clusters until a (locally) optimal partition is
obtained. For a data set D with n objects and k
clusters to be formed, the partitioning based
algorithm organizes the objects into k partitions
(k ≤ n), where each partition represents a cluster
C1, ..., Ck. K-means and k-medoids are commonly used
partitioning based algorithms. In k-means each
cluster is characterized by the center of the cluster,
and in k-medoids each cluster is characterized by one
of the objects in the cluster.
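The difference between the two cluster representatives can be seen on a small example. The Python sketch below uses made-up 2-D points (an assumption, not data from the paper) and computes both the centroid, which is the coordinate-wise mean, and the medoid, taken here as the member with the smallest total distance to the other members.

```python
import math

# Hypothetical cluster: four nearby points and one outlier.
cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (4.0, 1.0), (15.0, 1.0)]

def centroid(points):
    """Coordinate-wise mean; usually not one of the input points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points):
    """The member whose summed Euclidean distance to all members is smallest."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

print("centroid:", centroid(cluster))  # (5.0, 1.0), pulled toward the outlier
print("medoid:  ", medoid(cluster))    # (3.0, 1.0), an actual member
```

Because the medoid must be an existing object, it is less affected by the outlier than the mean is.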
K-means Algorithm
In the K-means technique, the centroid of a cluster
is treated as its center point. The centroid Ci is
found as the mean (or, in k-medoids, the medoid) of
the objects assigned to the cluster. The difference
between an object p (belonging to Ci) and Ci is
calculated by Dist (p, Ci), where the distance is
measured by Euclidean distance. The main aim of
k-means clustering is to minimize the total
intra-cluster variance, or squared error function:
$$E = \sum_{i=1}^{k} \sum_{p \in C_i} \mathrm{Dist}(p, C_i)^2$$
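The procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the sample points are made up, the first k points are used as the initial centroids (a simplification; random initialization is more common), and an empty cluster keeps its previous centroid.

```python
import math

def kmeans(points, k, max_iter=100):
    """Alternate assignment and mean-update steps until centroids stop moving."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [points[i] for i in range(k)]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid (Euclidean).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its members.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Squared error: sum of squared distances of points to their centroid.
    sse = sum(
        math.dist(p, centroids[i]) ** 2
        for i, members in enumerate(clusters)
        for p in members
    )
    return centroids, clusters, sse

# Hypothetical 2-D data set.
points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
centroids, clusters, sse = kmeans(points, k=2)
print("centroids:", centroids)
print("squared error:", round(sse, 3))
```

On this data the loop converges after three passes. The result is only a local optimum, and a different initialization can give a different partition.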