http://www.iaeme.com/IJARET/index.asp 78 editor@iaeme.com
International Journal of Advanced Research in Engineering and Technology (IJARET)
Volume 8, Issue 1, January- February 2017, pp. 78–85, Article ID: IJARET_08_01_008
Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=8&IType=1
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
© IAEME Publication
COMPARATIVE STUDY OF DISTRIBUTED
FREQUENT PATTERN MINING ALGORITHMS FOR
BIG SALES DATA
Dinesh J. Prajapati
Research Scholar, Department of Computer Science & Engineering,
Institute of Technology, Nirma University, Ahmedabad, India
ABSTRACT
Association rule mining plays an important role in decision support systems. In the internet era,
online marketing and social networking sites generate enormous amounts of structured and
semi-structured data in the form of sales data, tweets, emails, web pages and so on. This data is so
large that processing and analyzing it with traditional systems is complex and time-consuming. This
paper addresses the main-memory bottleneck of a single computing system. The paper has two major
goals. First, a big sales dataset of AMUL dairy is preprocessed using Hadoop MapReduce, which
converts it into a transactional dataset. After the null transactions are removed, the distributed
frequent pattern mining algorithm MR-DARM (MapReduce based Distributed Association Rule Mining) is
used to find the most frequent itemsets, and strong association rules are generated from these
frequent itemsets. Second, the paper compares the time efficiency of the MR-DARM algorithm with the
existing Count Distribution Algorithm (CDA) and Fast Distributed Mining (FDM) distributed frequent
pattern mining algorithms. The compared algorithms are presented together with the experimental
results that lead to the final conclusions.
Key words: Association rule, distributed frequent pattern mining, Hadoop, MapReduce.
Cite this Article: Dinesh J. Prajapati, Comparative Study of Distributed Frequent Pattern Mining
Algorithms for Big Sales Data. International Journal of Advanced Research in Engineering and
Technology, 8(1), 2017, pp 78–85.
http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=8&IType=1
1. INTRODUCTION
Data mining extracts useful information and patterns for the knowledge discovery process. One of its
techniques is association rule mining, the task of uncovering relationships in the data. It is a
popular model in the retail sales industry, where a company is interested in identifying items that
are frequently purchased together. An association rule is expressed in the form X → Y, where X and Y
are itemsets. The rule exposes the relationship between itemset X and itemset Y. The interestingness
of the rule X → Y is measured by its support and confidence [1, 2]. The rule X → Y satisfies the
minimum support min_sup if at least min_sup percent of the transactions contain X ∪ Y, and it holds
with minimum confidence min_conf if at least min_conf percent of the transactions that contain X
also contain Y [3, 4]. Association rule mining basically consists of two steps: (i) finding all
frequent itemsets that satisfy the minimum support threshold, and (ii) generating strong association
rules from the derived frequent itemsets. Big data is the term for a collection of large data sets
that are complex and difficult to process using traditional data processing tools [5].
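The support and confidence measures defined above can be illustrated with a minimal Python sketch. The toy transactions, the item names and the function names `support` and `confidence` are assumptions made for illustration only; they are not taken from the AMUL dataset.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(itemset: Iterable[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Transaction]) -> float:
    """Fraction of the transactions containing X that also contain Y."""
    x_items = frozenset(x)
    support_x = support(x_items, transactions)
    if support_x == 0:
        return 0.0  # X never occurs, so the rule X -> Y has no evidence
    return support(x_items | frozenset(y), transactions) / support_x


# Hypothetical toy transactions (illustrative only).
TOY_TRANSACTIONS = [
    frozenset({"milk", "butter"}),
    frozenset({"milk", "butter", "cheese"}),
    frozenset({"milk", "cheese"}),
    frozenset({"butter"}),
]

if __name__ == "__main__":
    print(support({"milk", "butter"}, TOY_TRANSACTIONS))
    print(confidence({"milk"}, {"butter"}, TOY_TRANSACTIONS))
```

In this toy data, two of the four transactions contain both milk and butter, and three contain milk, so the rule milk → butter has support 2/4 and confidence 2/3.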
In brief, the contribution of this paper is summarized in three steps: (i) the distributed frequent
itemset mining algorithms CDA, FDM and MR-DARM are used to generate the complete set of frequent
itemsets and the results are compared; (ii) the proposed framework mines not only frequent itemsets
but also distributors' sales association rules in transactional datasets, to analyze total sales by
distributor; (iii) finally, based on user-defined thresholds, the complete set of strong
distributors' sales association rules is generated with the interesting patterns. The CDA, FDM and
MR-DARM distributed frequent pattern mining algorithms are tested on the sales dataset of AMUL Dairy.
The remainder of the paper is organized as follows. Related work is given in Section 2. Section 3
presents the proposed methodology. In Section 4, the performance of the CDA, FDM and MR-DARM
algorithms is evaluated on the sales dataset of AMUL dairy. Finally, the conclusion and future scope
are drawn in Section 5.
2. RELATED WORK
The authors in [6] proposed performance analysis factors such as heterogeneity and autonomy. They
also proposed a theorem that characterizes the features of both the big data revolution and the big
data processing model, and analyzed the challenging issues in the data mining model and in big data
analysis. The authors in [7] offered insights into big data mining infrastructure and analytics at
Twitter. That paper discusses two major topics. First, schemas are insufficient for understanding
petabytes or terabytes of data. Second, a major challenge in analyzing the data is the heterogeneity
of the various components. Its objective is to share the authors' experience of analyzing Twitter
data in a production environment. The authors in [8] proposed an optimized distributed association
rule mining approach to reduce the communication cost for geographically distributed data. Both
communication and computation time are considered to achieve an improved response time. The
performance analysis is based on scalability with the number of processors in a distributed
environment. The authors in [9] proposed a distributed trie-based algorithm (DTFIM) to find frequent
itemsets. It adapts Bodon's algorithm to a distributed computing environment with no shared memory,
and the resulting algorithm is further revised with ideas from existing frequent itemset mining
algorithms. The authors in [10] proposed a distributed system for mining transactional datasets using
an improved MapReduce framework. They implemented the "Associated-Correlated-Independent" algorithm
to find the complete set of customers' purchase patterns along with the correlated, associated,
associated-correlated, and independent purchase patterns.
The PARMA algorithm proposed in [11] greatly improves the runtime of finding association rules.
PARMA achieves this by using probabilistic results, so it only approximates the answers. Another
statistical approach was presented in [12]. This solution uses clustering to create groups of
transactions and chooses candidate sets from the representative itemsets in the clusters. The authors
in [13] present an improved version of the frequent itemset mining algorithm as well as its
generalized version. They introduce optimized formulas for generating valid candidates by reducing
the number of invalid candidates. By reusing the computations of previous steps made by other
processing nodes, the algorithm avoids generating redundant candidates. The authors also suggest
running the same algorithm in a parallel or distributed system.
The Count Distribution Algorithm (CDA) [14] is a fundamental distributed association rule mining
algorithm. In CDA, each node holds a large number of frequent itemsets and counts the candidate
itemsets locally. These count values are stored in the local database, which also maintains the
incoming count values. All computing nodes execute the Apriori algorithm locally and, after reading
the count values from the local database, broadcast their count values to the remaining nodes. Each
node can then generate the new candidate itemsets based on the global counts. The FDM (Fast
Distributed Mining) algorithm [15] generates candidate sets in a manner similar to Apriori. The
relationship between locally and globally frequent itemsets is used to generate a reduced set of
candidates for each iteration. Thus the number of messages exchanged between nodes is reduced. Once
the candidate sets are generated, local reduction and global reduction techniques are applied to
eliminate some candidate sets from each site.
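The relationship between locally and globally frequent itemsets that FDM relies on can be sketched as follows: an itemset that is frequent over the whole database must be frequent in at least one partition, so candidates that no site finds locally frequent can be discarded before any global count is assembled. The code below is a simplified single-process illustration of that idea only; it does not reproduce the message protocol or the reduction techniques of [15], and the function names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

Itemset = FrozenSet[str]
Partition = List[FrozenSet[str]]


def local_counts(candidates: Iterable[Itemset],
                 partition: Partition) -> Dict[Itemset, int]:
    """Count, at one site, how many local transactions contain each candidate."""
    return {c: sum(1 for t in partition if c <= t) for c in candidates}


def fdm_style_round(candidates: Iterable[Itemset],
                    partitions: List[Partition],
                    min_sup: float) -> Dict[Itemset, int]:
    """One simplified round: local pruning first, then a global support check.

    `min_sup` is a relative threshold (for example 0.5 means 50%).
    """
    candidates = list(candidates)
    per_site = [local_counts(candidates, p) for p in partitions]

    # Local pruning: keep a candidate only if some site finds it locally frequent.
    survivors: Set[Itemset] = set()
    for counts, partition in zip(per_site, partitions):
        threshold = min_sup * len(partition)
        survivors |= {c for c, n in counts.items() if n >= threshold}

    # Global check: sum the per-site counts of the surviving candidates only.
    total = sum(len(p) for p in partitions)
    frequent: Dict[Itemset, int] = {}
    for c in survivors:
        n = sum(counts[c] for counts in per_site)
        if n >= min_sup * total:
            frequent[c] = n
    return frequent
```

In a real deployment only the counts of the surviving candidates would need to travel between sites, which is where the reduction in messages comes from.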
In big data analysis, mining long patterns is more important for transactional databases with
unique itemsets. However, none of the above-mentioned works deals with the problem of data
transformation and elimination of null transactions using MapReduce. Therefore, transforming the
data and then finding and eliminating null transactions from further consideration form the initial
part of the proposed methodology. After the null transactions are removed, a distributed frequent
pattern mining algorithm is applied to generate useful patterns. The existing CDA and FDM algorithms
generate large candidate sets, pass more messages, and have higher execution times when mining big
data. The MR-DARM algorithm addresses these drawbacks of CDA and FDM and generates useful patterns.
The objective of this work is to remove the drawbacks of the relational database and to use the
existing MapReduce framework to generate the complete set of frequent itemsets with smaller
candidate sets, less message passing and improved execution time.
3. PROPOSED METHODOLOGY
The CDA and FDM algorithms are data-parallel algorithms [15]. In CDA, the dataset is divided into n
partitions and each partition is given to a separate node. Each node counts the candidates and then
broadcasts its counts to the remaining nodes. Each node then determines the global counts, which are
used to determine the large itemsets and to generate the candidates for the next iteration. In FDM,
candidate sets are generated as in the Apriori algorithm. To reduce the number of candidates at each
iteration, local and global frequent itemsets are used, which reduces the number of messages
exchanged between nodes. Once the candidate sets are generated, local reduction and global reduction
techniques are applied at each site to eliminate redundant candidate sets. The main drawback of CDA
and FDM is that both generate large candidate sets, pass more messages, and have higher execution
times when mining big data. MapReduce can mitigate these drawbacks, so a new approach is developed.
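The count-distribution idea described above (every node counts the same candidates on its own partition, and only counts are exchanged) can be sketched in a single Python process. This is an illustrative simulation, not the implementation evaluated in this paper: the broadcast is replaced by summing per-node counters, the candidate generator is a brute-force stand-in for Apriori's join and prune steps, and all names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, FrozenSet, List, Set

Itemset = FrozenSet[str]
Partition = List[FrozenSet[str]]


def count_candidates(candidates: Set[Itemset], partition: Partition) -> Counter:
    """Local step: one node counts every candidate on its own partition."""
    counts: Counter = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:
                counts[candidate] += 1
    return counts


def apriori_gen(frequent: Set[Itemset], k: int) -> Set[Itemset]:
    """Build k-item candidates whose (k-1)-item subsets are all frequent."""
    items = sorted({item for itemset in frequent for item in itemset})
    return {
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    }


def count_distribution(partitions: List[Partition],
                       min_count: int) -> Dict[Itemset, int]:
    """Simulated CDA: all nodes share the candidates, only counts are exchanged."""
    items = {item for p in partitions for t in p for item in t}
    candidates = {frozenset([item]) for item in items}
    frequent: Dict[Itemset, int] = {}
    k = 1
    while candidates:
        local = [count_candidates(candidates, p) for p in partitions]
        global_counts = sum(local, Counter())  # stands in for broadcast + sum
        level = {c: n for c, n in global_counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        candidates = apriori_gen(set(level), k)
    return frequent
```

Because every node derives the next candidate set from the same global counts, the nodes stay synchronized without ever shipping transactions to each other.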
The MR-DARM algorithm is used to find frequent itemsets from the actual transactional dataset. Once
the actual transactional dataset is stored in HDFS, the entire dataset is split into smaller
segments and each segment is transferred to a data node. The map function is executed on each data
segment and produces <key, value> pairs for each record of the database. The MapReduce framework
groups all <key, value> pairs that have the same key and calls the reduce function, passing the
value list, to generate candidate itemsets. In each database scan, the map function generates local
candidate itemsets, and the reduce function then generates global counts by adding the local count
values. The overall computation requires multiple MapReduce iterations. Each iteration produces a
set of frequent itemsets, and the iteration continues until no further frequent itemsets are found.
The reduce function adds up all the values produced by the mappers and generates a count for the
candidate itemset. The main advantage of this approach is that the nodes do not exchange data; they
only exchange count values. The MR-DARM algorithm, shown in Fig. 1, uses the notation Ck for the set
of candidate k-itemsets and Lk for the set of frequent k-itemsets.
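The iterative map and reduce steps described above can be sketched as an in-memory Python simulation. This is not Hadoop code and not the author's implementation: the shuffle phase is replaced by a dictionary, the candidate join is a simple pairwise union assumed for illustration, and the names `map_fn`, `reduce_fn`, `run_round` and `mr_darm_sketch` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Iterator, List, Set, Tuple

Itemset = FrozenSet[str]
Split = List[FrozenSet[str]]


def map_fn(transaction: FrozenSet[str],
           candidates: Set[Itemset]) -> Iterator[Tuple[Itemset, int]]:
    """Emit <candidate, 1> for every candidate contained in the transaction."""
    for candidate in candidates:
        if candidate <= transaction:
            yield candidate, 1


def reduce_fn(candidate: Itemset, values: Iterable[int],
              min_count: int) -> Iterator[Tuple[Itemset, int]]:
    """Add up the emitted 1s and keep the candidate if it reaches min_count."""
    count = sum(values)
    if count >= min_count:
        yield candidate, count


def run_round(splits: List[Split], candidates: Set[Itemset],
              min_count: int) -> Dict[Itemset, int]:
    """One MapReduce iteration; the dictionary stands in for the shuffle phase."""
    grouped: Dict[Itemset, List[int]] = defaultdict(list)
    for split in splits:
        for transaction in split:
            for key, value in map_fn(transaction, candidates):
                grouped[key].append(value)
    frequent: Dict[Itemset, int] = {}
    for key, values in grouped.items():
        frequent.update(reduce_fn(key, values, min_count))
    return frequent


def join(frequent: List[Itemset], k: int) -> Set[Itemset]:
    """Assumed candidate step: unions of two frequent itemsets that have size k."""
    return {a | b for a, b in combinations(frequent, 2) if len(a | b) == k}


def mr_darm_sketch(splits: List[Split], min_count: int) -> Dict[Itemset, int]:
    """Repeat map/reduce rounds until no larger candidates can be generated."""
    items = {item for split in splits for t in split for item in t}
    candidates = {frozenset([item]) for item in items}
    all_frequent: Dict[Itemset, int] = {}
    k = 1
    while candidates:
        level = run_round(splits, candidates, min_count)
        all_frequent.update(level)
        k += 1
        candidates = join(list(level), k)
    return all_frequent
```

Only `<itemset, count>` pairs cross the boundary between the map and reduce phases in this sketch, which mirrors the point that nodes exchange count values rather than transactions.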
Input: Transactional Database in HDFS (D),
       Minimum Support Threshold (min_sup)
Output: Frequent Itemsets (L)
Method:
    L1 = find frequent 1-itemsets from D.
    For each frequent k-itemset do
        Ck = Lk-1 ⋈ Lk-1. // Generates candidate itemset
        Ct = Map(). // Generates itemset occurrence
        Lk = Reduce(). // Gets the subset of frequent itemsets
    L = ∪k Lk.
Map Function:
    Input: Set of Transaction (Ti)
    Output: <Candidate Itemset, Value>
    Method:
        For each transaction Ti ∈ D do
            For each itemset Ii in Candidate Itemset Ck do
                If (Ii ⊆ Ti) then
                    Generate the output <Ii, 1> as <Key, Value> pair.
Reduce Function:
    Input: <candidate itemset, list>
    Output: <frequent itemset, support_count>
    Method:
        count = 0.
        For each number in list do
            count += number.
        If (count >= min_sup) then
            Generate the output <frequent itemset, count> as <key, value> pair.
Figure 1 The MR-DARM Algorithm
The transactional data is given as input to the Mapper line by line. Each line is split into items,
and the output <key, value> pair consists of the item and the value 1. This is the local frequency
of the item. The reduce task starts with the itemsets of length 1 and generates candidates of length
2. During step k of the algorithm it starts with the itemsets of length k and generates candidate
itemsets of length k + 1. If the reduce task cannot generate bigger candidate itemsets, it stops the
whole computation. Frequent itemsets are calculated for different values of the minimum support
threshold. The support decision system checks for the appropriate support count value for generating
strong association rules.
3.1. Association Rule Generation
The output of the distributed frequent pattern mining algorithm is the set of frequent itemsets,
which is given as input to the association rule generator module to generate strong association
rules that satisfy the minimum confidence threshold. Association rules can be generated as
follows [16].
• For each frequent itemset l, generate all non-empty subsets of l.
• For every non-empty subset s of l, output the rule "s → (l − s)" if (Support(l) / Support(s)) >=
min_conf, where min_conf is the minimum confidence threshold.
Since the rules are generated from frequent itemsets, each rule automatically satisfies the minimum
support.
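The two rule-generation steps of Section 3.1 can be sketched in Python as follows. The sketch assumes that the support counts of all frequent itemsets are available in a dictionary, as a frequent itemset miner would produce them, and it considers only proper subsets so that the consequent is never empty. The function name `generate_rules` and the toy counts are illustrative assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]
Rule = Tuple[Itemset, Itemset, float]  # (antecedent s, consequent l - s, confidence)


def generate_rules(support_counts: Dict[Itemset, int],
                   min_conf: float) -> List[Rule]:
    """Emit s -> (l - s) whenever Support(l) / Support(s) >= min_conf."""
    rules: List[Rule] = []
    for l, l_count in support_counts.items():
        if len(l) < 2:
            continue  # a 1-itemset has no non-empty proper subset
        for size in range(1, len(l)):
            for combo in combinations(sorted(l), size):
                s = frozenset(combo)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is present in the dictionary.
                conf = l_count / support_counts[s]
                if conf >= min_conf:
                    rules.append((s, l - s, conf))
    return rules


# Hypothetical support counts for a toy set of frequent itemsets.
TOY_COUNTS = {
    frozenset({"milk"}): 3,
    frozenset({"butter"}): 3,
    frozenset({"cheese"}): 2,
    frozenset({"milk", "butter"}): 2,
    frozenset({"milk", "cheese"}): 2,
}

if __name__ == "__main__":
    for antecedent, consequent, conf in generate_rules(TOY_COUNTS, 0.7):
        print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

With these toy counts and min_conf = 0.7, only the rule cheese → milk qualifies, because both transactions that contain cheese also contain milk, while the other candidate rules reach a confidence of only 2/3.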
4. EXPERIMENTAL SETUP & RESULTS
For the experiments, a cluster of four desktop machines, each with an i5 processor and 4 GB of DDR3
RAM, is used. The Ubuntu 12.04 LTS operating system is installed on all four computers. Since the
JVM is usually not part of Ubuntu 12.04, it is also installed on all four computers. A multi-node
cluster is configured on three computers and a single-node cluster is configured on a single
computer using Apache Hadoop packages.
For this experiment, the sales database of AMUL dairy, with more than 1500 different dairy products
and a total size of 5 GB, is used. In the dairy dataset, sales of the dairy products follow a
concept hierarchy. The product is first sent to the distributor, which in turn distributes it to the
retailer, and finally the retailer sells the dairy product to the customer.
4.1. Comparative Study of CDA, FDM and MR-DARM Algorithms
After the transactional dataset is transformed into the actual transactional dataset, the actual
transaction file is given as input to the frequent pattern mining algorithm to find the frequent
itemsets. The CDA, FDM and MR-DARM algorithms are applied to AMUL datasets of varying size (256 MB,
512 MB, 1 GB, 2 GB and 5 GB) using single-node, two-node and three-node clusters with a minimum
support threshold of 1%; the results are shown in Fig. 2, 3 and 4 respectively. The results show
that the performance of the algorithms depends on the number of nodes and the size of the dataset.
For a dataset of size 5 GB on a single node, the execution times of the CDA, FDM and MR-DARM
algorithms are 5670 seconds, 3680 seconds and 373 seconds respectively, while the same dataset
distributed on a three-node cluster produced execution times of 3490 seconds, 2280 seconds and 269
seconds respectively. So, to obtain comparatively small execution times, the number of nodes must be
increased as the database size increases. It is noticeable that the performance of the algorithms
improves as the number of nodes increases, and that the proposed algorithm performs much better than
both CDA and FDM when the dataset is large.
Figure 2 Dataset Size Vs Execution Time for Single Node Cluster with 1% Minimum Support
Figure 3 Dataset Size Vs Execution Time for Two Node Cluster with 1% Minimum Support
Figure 4 Dataset Size Vs Execution Time for Three Node Cluster with 1% Minimum Support
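The 5 GB execution times reported in Section 4.1 can be turned into speedup figures with a short calculation. The sketch below uses only the six timings stated in the text; the dictionary layout and the helper names `node_speedup` and `ratio_to_mr_darm` are illustrative assumptions.

```python
# Execution times in seconds for the 5 GB dataset, as reported in Section 4.1.
# Keys of the inner dictionaries are the number of nodes in the cluster.
REPORTED_SECONDS = {
    "CDA": {1: 5670, 3: 3490},
    "FDM": {1: 3680, 3: 2280},
    "MR-DARM": {1: 373, 3: 269},
}


def node_speedup(algorithm: str) -> float:
    """Single-node time divided by three-node time for one algorithm."""
    times = REPORTED_SECONDS[algorithm]
    return times[1] / times[3]


def ratio_to_mr_darm(algorithm: str, nodes: int) -> float:
    """How many times longer `algorithm` runs than MR-DARM on `nodes` nodes."""
    return REPORTED_SECONDS[algorithm][nodes] / REPORTED_SECONDS["MR-DARM"][nodes]


if __name__ == "__main__":
    for name in REPORTED_SECONDS:
        print(f"{name}: speedup from 1 to 3 nodes = {node_speedup(name):.2f}")
    for name in ("CDA", "FDM"):
        for nodes in (1, 3):
            print(f"{name} / MR-DARM on {nodes} node(s) = "
                  f"{ratio_to_mr_darm(name, nodes):.1f}")
```

On these figures, moving from one to three nodes shortens the run by a factor of about 1.6 for CDA and FDM and about 1.4 for MR-DARM, while CDA takes roughly 15 times as long as MR-DARM on a single node.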
5. CONCLUSION AND FUTURE SCOPE
HDFS and MapReduce play an important role in handling and analyzing large datasets. However, most
algorithms are limited in processing speed. In this paper, a Hadoop-based distributed approach is
presented that splits the transactional dataset into partitions and transfers the task to all
participating nodes. The purpose is to reduce inter-node message passing in the cluster. In
preprocessing with Hadoop MapReduce, it has been observed that the execution time decreases
significantly as the number of reducers increases. The experimental results show that the parallel
processing task scales linearly with the number of nodes and the size of the dataset. The MR-DARM
algorithm is implemented to find distributed frequent itemsets. As the number of nodes increases,
the performance improves when lower minimum support values and larger database sizes are considered.
The proposed algorithm generates a smaller candidate set and uses less message passing than the CDA
and FDM algorithms, so its execution time is lower than theirs. The proposed algorithm is a more
flexible, scalable and efficient distributed frequent pattern mining algorithm for mining large
data.
The time efficiency of the algorithm may be improved by using FP-tree based data structures for
candidate itemset generation.
6. ACKNOWLEDGEMENTS
The authors take this opportunity to thank all the researchers in the domain of big data analysis
for their immense knowledge and kind support throughout the work. We would also like to thank our
institute for its resources and constant inspiration. Special thanks go to the authorities of AMUL
dairy, located in Anand district, for providing the sales dataset. Finally, heartfelt thanks go to
our family and friends for encouraging us to make this a success.
REFERENCES
[1] Srikumar, K. and Bhasker, B. 2005. Metamorphosis: Mining Maximal Frequent Sets in Dense Domains.
Int. Journal of Artificial Intelligence Tools, Vol. 14, Issue 3, 491-506.
[2] Agrawal, R., Imielinski, T., and Swami, A. 1993. Mining association rules between sets of items in large
databases. Proc. Int. Conf. of ACM-SIGMOD on Management of Data, 207-216.
[3] Olson, D. L. and Delen, D. 2008. Advanced Data Mining Techniques. Springer.
[4] Han, J. and Kamber, M. 2004. Data Mining Concepts & Techniques. San Francisco: Morgan Kaufmann
Publishers.
[5] Agrawal, D., Das, S. and Abbadi, A. 2011. Big data and cloud computing: current state and future
opportunities. Proc. 14th Int. Conf. Extending Database Technology, ACM, 530-533.
[6] Wu, X., Zhu, X., Wu, G. and Ding W. 2013. Data Mining with Big Data. IEEE Transactions on
Knowledge and Data Engineering, Vol. 26, Issue 1, 97-107.
[7] Lin, J., & Ryaboy, D. 2013. Scaling big data mining infrastructure: the twitter experience. ACM
SIGKDD Explorations Newsletter, 14, 6-19.
[8] Mottalib, M. A., Arefin, K. S., Islam, M. M., Rahman, M. A. and Abeer, S. A. 2011. Performance
Analysis of Distributed Association Rule Mining with Apriori Algorithm. Int. Journal of Computer
Theory and Engineering, Vol. 3, No. 4, 484-488.
[9] Ansari, E., Dastghaibifard, G. H., Keshtkaran, M. and Kaabi, H. 2008. Distributed Frequent Itemset
Mining using Trie Data Structure. Int. Journal of Computer Science (IJCS).
[10] Karim, M. R., Ahmed, C. F., Jeong, B. and Choi, H. 2013. An efficient Distributed Programming Model
for Mining Useful Patterns in Big Datasets. IETE Technical Review, Vol. 30, Issue 1, 53-63.
[11] Riondato, M., DeBrabant, J. A., Fonseca, R. and Upfal, E. 2012. Parma: A parallel randomized
algorithm for approximate association rules mining in MapReduce. Proc. 21st Int. Conf. Information
and Knowledge Management (CIKM ’12), ACM, USA, 85–94.
[12] Malek, M. and Kadima, H. 2013. Searching frequent itemsets by clustering data: Towards a parallel
approach using MapReduce. Web Information Systems Engineering WISE 2011 and 2012, Springer,
Berlin Heidelberg, 7652, 251–258.
[13] Butincu, C. N. and Craus, M. 2015. An improved version of the frequent itemset mining algorithm.
Proc. 14th IEEE Int. Conf. Networking in Education and Research, Craiova, 184-189.
[14] Agrawal, R. and Shafer, J. C. 1996. Parallel mining of association rules. IEEE Trans. on Knowledge and
Data Engineering, 8, 962-969.
[15] Cheung, D. W., Han, J., Ng, V. T. and Fu, A. W. 1996. A fast distributed algorithm for mining
association rules. Proc. 4th IEEE Int. Conf. Parallel and Distributed Information Systems, 31-42.
[16] Ban, T., Eto, M., Guo, S., Inoue, D., Nakao, K. and Huang, R. 2015. A study on association rule mining
of darknet big data. Proc. IEEE Int. Joint Conf. on Neural Network (IJCNN), 1-7.
[17] Doshi, M. and Roy, B. 2014. Efficient Processing of Ajax Data Using Mining Algorithms.
International Journal of Computer Engineering and Technology (IJCET), 5(8), 48–54.
[18] Chamatkar, A. J. and Butey, P. K. 2015. Performance Analysis of Data Mining Algorithms with
Neural Network. International Journal of Computer Engineering and Technology, 6(1), 01–11.

IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
IAEME Publication
 
IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdfIAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME_Publication_Call_for_Paper_September_2022.pdf
IAEME Publication
 
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
MODELING AND ANALYSIS OF SURFACE ROUGHNESS AND WHITE LATER THICKNESS IN WIRE-...
IAEME Publication
 
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURSA STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
A STUDY ON THE REASONS FOR TRANSGENDER TO BECOME ENTREPRENEURS
IAEME Publication
 
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURSBROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
BROAD UNEXPOSED SKILLS OF TRANSGENDER ENTREPRENEURS
IAEME Publication
 
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONSDETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
DETERMINANTS AFFECTING THE USER'S INTENTION TO USE MOBILE BANKING APPLICATIONS
IAEME Publication
 
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONSANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
ANALYSE THE USER PREDILECTION ON GPAY AND PHONEPE FOR DIGITAL TRANSACTIONS
IAEME Publication
 
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINOVOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
VOICE BASED ATM FOR VISUALLY IMPAIRED USING ARDUINO
IAEME Publication
 
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IMPACT OF EMOTIONAL INTELLIGENCE ON HUMAN RESOURCE MANAGEMENT PRACTICES AMONG...
IAEME Publication
 
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMYVISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
VISUALISING AGING PARENTS & THEIR CLOSE CARERS LIFE JOURNEY IN AGING ECONOMY
IAEME Publication
 
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
A STUDY ON THE IMPACT OF ORGANIZATIONAL CULTURE ON THE EFFECTIVENESS OF PERFO...
IAEME Publication
 
GANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICEGANDHI ON NON-VIOLENT POLICE
GANDHI ON NON-VIOLENT POLICE
IAEME Publication
 
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
A STUDY ON TALENT MANAGEMENT AND ITS IMPACT ON EMPLOYEE RETENTION IN SELECTED...
IAEME Publication
 
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
ATTRITION IN THE IT INDUSTRY DURING COVID-19 PANDEMIC: LINKING EMOTIONAL INTE...
IAEME Publication
 
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
INFLUENCE OF TALENT MANAGEMENT PRACTICES ON ORGANIZATIONAL PERFORMANCE A STUD...
IAEME Publication
 
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
A STUDY OF VARIOUS TYPES OF LOANS OF SELECTED PUBLIC AND PRIVATE SECTOR BANKS...
IAEME Publication
 
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
EXPERIMENTAL STUDY OF MECHANICAL AND TRIBOLOGICAL RELATION OF NYLON/BaSO4 POL...
IAEME Publication
 
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
ROLE OF SOCIAL ENTREPRENEURSHIP IN RURAL DEVELOPMENT OF INDIA - PROBLEMS AND ...
IAEME Publication
 
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
OPTIMAL RECONFIGURATION OF POWER DISTRIBUTION RADIAL NETWORK USING HYBRID MET...
IAEME Publication
 
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
APPLICATION OF FRUGAL APPROACH FOR PRODUCTIVITY IMPROVEMENT - A CASE STUDY OF...
IAEME Publication
 
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENTA MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
A MULTIPLE – CHANNEL QUEUING MODELS ON FUZZY ENVIRONMENT
IAEME Publication
 

Recently uploaded (20)

DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Structural Response of Reinforced Self-Compacting Concrete Deep Beam Using Fi...
Journal of Soft Computing in Civil Engineering
 
The Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLabThe Gaussian Process Modeling Module in UQLab
The Gaussian Process Modeling Module in UQLab
Journal of Soft Computing in Civil Engineering
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...
IJCSES Journal
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
Smart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptxSmart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptx
rushikeshnavghare94
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)
samueljackson3773
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E..."Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
Infopitaara
 
DSP and MV the Color image processing.ppt
DSP and MV the  Color image processing.pptDSP and MV the  Color image processing.ppt
DSP and MV the Color image processing.ppt
HafizAhamed8
 
introduction to machine learining for beginers
introduction to machine learining for beginersintroduction to machine learining for beginers
introduction to machine learining for beginers
JoydebSheet
 
theory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptxtheory-slides-for react for beginners.pptx
theory-slides-for react for beginners.pptx
sanchezvanessa7896
 
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G..."Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
"Feed Water Heaters in Thermal Power Plants: Types, Working, and Efficiency G...
Infopitaara
 
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptxExplainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
Explainable-Artificial-Intelligence-XAI-A-Deep-Dive (1).pptx
MahaveerVPandit
 
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design ThinkingDT REPORT by Tech titan GROUP to introduce the subject design Thinking
DT REPORT by Tech titan GROUP to introduce the subject design Thinking
DhruvChotaliya2
 
QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)QA/QC Manager (Quality management Expert)
QA/QC Manager (Quality management Expert)
rccbatchplant
 
IntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdfIntroSlides-April-BuildWithAI-VertexAI.pdf
IntroSlides-April-BuildWithAI-VertexAI.pdf
Luiz Carneiro
 
15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...15th International Conference on Computer Science, Engineering and Applicatio...
15th International Conference on Computer Science, Engineering and Applicatio...
IJCSES Journal
 
Degree_of_Automation.pdf for Instrumentation and industrial specialist
Degree_of_Automation.pdf for  Instrumentation  and industrial specialistDegree_of_Automation.pdf for  Instrumentation  and industrial specialist
Degree_of_Automation.pdf for Instrumentation and industrial specialist
shreyabhosale19
 
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
211421893-M-Tech-CIVIL-Structural-Engineering-pdf.pdf
inmishra17121973
 
Smart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptxSmart_Storage_Systems_Production_Engineering.pptx
Smart_Storage_Systems_Production_Engineering.pptx
rushikeshnavghare94
 
Artificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptxArtificial Intelligence (AI) basics.pptx
Artificial Intelligence (AI) basics.pptx
aditichinar
 
AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)AI-assisted Software Testing (3-hours tutorial)
AI-assisted Software Testing (3-hours tutorial)
Vəhid Gəruslu
 
International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)International Journal of Distributed and Parallel systems (IJDPS)
International Journal of Distributed and Parallel systems (IJDPS)
samueljackson3773
 
fluke dealers in bangalore..............
fluke dealers in bangalore..............fluke dealers in bangalore..............
fluke dealers in bangalore..............
Haresh Vaswani
 
railway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forgingrailway wheels, descaling after reheating and before forging
railway wheels, descaling after reheating and before forging
Javad Kadkhodapour
 
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E..."Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
"Boiler Feed Pump (BFP): Working, Applications, Advantages, and Limitations E...
Infopitaara
 

COMPARATIVE STUDY OF DISTRIBUTED FREQUENT PATTERN MINING ALGORITHMS FOR BIG SALES DATA

http://www.iaeme.com/IJARET/index.asp 78 editor@iaeme.com
International Journal of Advanced Research in Engineering and Technology (IJARET)
Volume 8, Issue 1, January- February 2017, pp. 78–85, Article ID: IJARET_08_01_008
Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=8&IType=1
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
© IAEME Publication
COMPARATIVE STUDY OF DISTRIBUTED FREQUENT PATTERN MINING ALGORITHMS FOR BIG SALES DATA
Dinesh J. Prajapati
Research Scholar, Department of Computer Science & Engineering, Institute of Technology, Nirma University, Ahmedabad, India
ABSTRACT
Association rule mining plays an important role in decision support systems. In the era of the internet, online marketing sites and social networking sites generate enormous amounts of structured and semi-structured data in the form of sales data, tweets, emails, web pages and so on. This data is so large that processing and analyzing it with traditional systems becomes complex and time-consuming. This paper overcomes the main memory bottleneck of a single computing system. There are two major goals of this paper. A big sales dataset of AMUL dairy is preprocessed using Hadoop Map Reduce, which converts it into a transactional dataset. Then, after the null transactions are removed, the distributed frequent pattern mining algorithm MR-DARM (Map Reduce based Distributed Association Rule Mining) is used to find the most frequent item sets. Finally, strong association rules are generated from the frequent item sets. The paper also compares the time efficiency of the MR-DARM algorithm with the existing Count Distributed Algorithm (CDA) and Fast Distributed Mining (FDM) distributed frequent pattern mining algorithms. The compared algorithms are presented together with experimental results that lead to the final conclusions.
Key words: Association rule, distributed frequent pattern mining, Hadoop, Map Reduce.
Cite this Article: Dinesh J. Prajapati, Comparative Study of Distributed Frequent Pattern Mining Algorithms for Big Sales Data. International Journal of Advanced Research in Engineering and Technology, 8(1), 2017, pp 78–85. http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=8&IType=1
1. INTRODUCTION
Data mining extracts useful information and patterns for the knowledge discovery process. One of the techniques used in data mining is association rule mining, the task of uncovering relationships in the data. It is a popular model in the retail sales industry, where a company is interested in identifying items that are frequently purchased together. An association rule is expressed in the form X → Y, where X and Y are itemsets. This rule exposes the relationship between itemset X and itemset Y. The interestingness of the rule X → Y is measured by its support and confidence [1, 2]. The rule X → Y has minimum support value min_sup if min_sup percent of transactions support X ∪ Y, and it holds with minimum confidence value min_conf if min_conf percent of the transactions that support X also support Y [3, 4]. The association rule mining process basically consists of two steps: (i) finding all the frequent itemsets that satisfy the minimum support threshold, and (ii) generating strong association rules from the derived frequent itemsets. Big data is the term for a collection of large data sets that are complex and difficult to process using traditional data processing tools [5].
In brief, the contribution of this paper is summarized in three steps: (i) the distributed frequent itemset mining algorithms CDA, FDM and MR-DARM are used to generate the complete set of frequent itemsets, and the results are compared; (ii) the proposed framework mines not only frequent itemsets but also distributors' sales association rules in transactional datasets, to analyze total sales by distributor; (iii) finally, based on user-defined thresholds, the complete set of strong distributor sales association rules is generated along with the interesting patterns. The CDA, FDM and MR-DARM distributed frequent mining algorithms are tested on the sales dataset of AMUL Dairy.
The remainder of the paper is organized as follows. Related work is given in Section 2. Section 3 presents the proposed methodology. In Section 4, the performance of the CDA, FDM and MR-DARM algorithms is evaluated on the sales dataset of AMUL dairy. Finally, the conclusion and future scope are given in Section 5.
2. RELATED WORK
The authors in [6] proposed performance analysis factors such as heterogeneity and autonomy. They also proposed a theorem that characterizes the features of both the big data revolution and the big data processing model, and they analyze the challenging issues in the data mining model and in big data analysis. The authors in [7] discuss big data mining infrastructures and the analysis of Twitter data. Two major topics are covered in that paper. First, schemas are insufficient to provide an understanding of petabytes or terabytes of data.
Second, a major challenge in analyzing the data is the heterogeneity of the various components. The objective of that paper is to share the authors' experience of analyzing Twitter data in a production environment.
The authors in [8] proposed an optimized distributed association rule mining approach to reduce the communication cost for geographically distributed data. Both communication and computation time are considered in order to achieve an improved response time. The performance analysis is based on the scalability of processors in a distributed environment. The authors in [9] proposed a distributed trie-based algorithm (DTFIM) to find frequent item sets. It adapts Bodon's algorithm to a distributed computing environment with no shared memory. The proposed algorithm is revised with some frequent data mining algorithms.
The authors in [10] proposed a distributed system for mining transactional datasets using an improved Map Reduce framework. They implemented the "Associated-Correlated-Independent" algorithm to find the complete set of customers' purchase patterns, along with the correlated, associated, associated-correlated, and independent purchase patterns. The PARMA algorithm proposed in [11] greatly improves the runtime of finding association rules. PARMA achieves this by using probabilistic results, so it only approximates the answers. Another statistical approach was presented in [12]. This solution uses clustering to create groups of transactions and chooses candidate sets from the representative item sets in the clusters.
The authors in [13] present an improved version of the frequent item set mining algorithm as well as its generalized version. They introduce optimized formulas for generating valid candidates by reducing the number of invalid candidates. By using the computations of previous steps performed by other processing nodes, the algorithm avoids generating redundant candidates.
The authors also suggested running the same algorithm in a parallel or distributed system.
The Count Distribution Algorithm (CDA) [14] is a fundamental distributed association rule algorithm. In this algorithm, each node holds a large number of frequent item sets and counts the candidate item sets locally. These count values are stored in the local database, which also maintains the incoming count values. All computing nodes execute the Apriori algorithm locally and, after reading the count values from the local database, broadcast their respective counts to the remaining nodes. Each node can then generate new candidate itemsets based on the global counts.
The FDM (Fast Distributed Mining) algorithm [15] generates candidate sets in a manner similar to Apriori. It uses an interesting property relating local and global frequent itemsets to generate a reduced set of candidates for each iteration, so the number of messages exchanged between nodes is reduced. Once the candidate sets are generated, local reduction and global reduction techniques are applied to eliminate some candidate sets from each site.
In big data analysis, mining long patterns is more important for transactional databases with unique item sets. However, none of the above-mentioned work deals with the problem of data transformation and elimination of null transactions using Map Reduce. Therefore, transforming the data and then finding and eliminating null transactions is the initial part of the proposed methodology. After the null transactions are removed, a distributed frequent mining algorithm is applied to generate useful patterns. The existing CDA and FDM algorithms generate large candidate sets, pass more messages, and have higher execution times when mining big data. The MR-DARM algorithm addresses these drawbacks of CDA and FDM and generates useful patterns. The objective of this work is to remove the drawbacks of the relational database and make use of the existing Map Reduce framework to generate the complete set of frequent itemsets with smaller candidate sets, less message passing, and improved execution time.
3. PROPOSED METHODOLOGY
The CDA and FDM algorithms are data-parallel algorithms [15]. In the CDA algorithm, the dataset is divided into n partitions, and each partition is given to a separate node. Each node counts the candidates and then broadcasts its counts to the remaining nodes. Each node then determines the global counts. The global counts are used to determine the large item sets and to generate the candidates for the next iteration. In the FDM algorithm, the candidate set is generated in a manner similar to the Apriori algorithm.
To reduce the size of the candidate set at each iteration, local and global frequent item sets are used, which reduces the number of messages exchanged between nodes. Once the candidate sets are generated, local reduction and global reduction techniques are applied at each site to eliminate redundant candidate sets. The main drawback of the CDA and FDM algorithms is that both generate large candidate sets, pass more messages, and have higher execution times when mining big data. These drawbacks can be reduced with Map Reduce, so a new approach is developed.
The MR-DARM algorithm is used to find frequent item sets from the actual transactional dataset. Once the transactional dataset is stored in HDFS, the entire dataset is split into smaller segments, and each segment is transferred to a data node. The map function is executed on each data segment and produces <key, value> pairs for each record of the database. The Map Reduce framework groups all <key, value> pairs that have the same items and calls the reduce function, passing it the value list, to generate candidate item sets. In each database scan, the map function generates local candidate item sets, and the reduce function then generates global counts by adding the local count values. The overall computation requires multiple iterations of the map and reduce functions. Each Map Reduce iteration k produces the set of frequent k-itemsets, and the iteration continues until no further frequent item sets are found. The reduce function adds up all the values produced by the Mapper and generates a count for the candidate item. The main advantage of this approach is that it does not exchange data between nodes; it only exchanges the count values. The MR-DARM algorithm uses the notation Ck for the set of candidate k-itemsets and Lk for the set of frequent k-itemsets, as shown in Fig. 1.
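The map and reduce rounds described above can be illustrated with a small single-process sketch. This is not the author's Hadoop implementation: the function names, the in-memory `segments` list standing in for HDFS splits, the item names, and the use of an absolute `min_count` instead of a percentage threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(segment, candidates):
    """Mapper: emit (itemset, 1) for every candidate contained in a transaction."""
    for transaction in segment:
        for cand in candidates:
            if cand <= transaction:
                yield cand, 1


def reduce_phase(pairs, min_count):
    """Reducer: sum the local counts per itemset and keep the frequent ones."""
    totals = defaultdict(int)
    for itemset, one in pairs:
        totals[itemset] += one
    return {s: c for s, c in totals.items() if c >= min_count}


def next_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({i for s in frequent for i in s})
    return [frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))]


def mine_frequent_itemsets(segments, min_count):
    """Run one map/reduce round per itemset length until nothing frequent remains."""
    items = sorted({i for seg in segments for t in seg for i in t})
    candidates = [frozenset([i]) for i in items]
    result, k = {}, 1
    while candidates:
        # Only (itemset, count) pairs leave the mappers, never the transactions.
        pairs = [p for seg in segments for p in map_phase(seg, candidates)]
        frequent = reduce_phase(pairs, min_count)
        if not frequent:
            break
        result.update(frequent)
        k += 1
        candidates = next_candidates(frequent, k)
    return result


# Two hypothetical data segments, each holding two transactions.
segments = [
    [frozenset({"milk", "butter"}), frozenset({"milk", "ghee"})],
    [frozenset({"milk", "butter", "ghee"}), frozenset({"butter"})],
]
frequent = mine_frequent_itemsets(segments, min_count=2)
```

In this sketch the only values passed from the map step to the reduce step are itemset counts, which mirrors the stated advantage that nodes exchange count values rather than data.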
Input: Transactional Database in HDFS (D), Minimum Support Threshold (min_sup)
Output: Frequent Itemsets (L)
Method:
  L1 = find frequent 1-itemsets from D.
  For each frequent k-itemset do
    Ck = Lk-1. // Generates candidate itemset
    Ct = Map(). // Generates itemset occurrence
    Lk = Reduce(). // Gets the subset of frequent itemsets
  L = ∪k Lk.

Map Function:
Input: Set of Transaction (Ti)
Output: <Candidate Itemset, Value>
Method:
  For each transaction Ti ∈ D do
    For each itemset Ii in Candidate Itemset Ck do
      If (Ii ∈ Ti) then
        Generate the output <Ii, 1> as <Key, Value> pair.

Reduce Function:
Input: <candidate itemset, list>
Output: <frequent itemset, support_count>
Method:
  count = 0.
  For each number in list do
    count += number.
  If (count >= Min_sup) then
    Generate the output <frequent itemset, count> as <key, value> pair.

Figure 1 The MR-DARM Algorithm

The transactional data is given as input to the Mapper line by line. Each line is split into items, and the output <key, value> pair consists of the item and the value 1. This is the local frequency of the item. The reduce task starts with the itemsets of length 1 and generates candidates of length 2. During step k of the algorithm, it starts with itemsets of length k and generates candidate itemsets of length k + 1. If the reduce task cannot generate bigger candidate itemsets, it stops the whole computation. Frequent itemsets are calculated for different values of the minimum support threshold. A support decision system checks for the appropriate support count value for generating strong association rules.

3.1. Association Rule Generation
The output of the distributed frequent mining algorithm is the set of frequent itemsets, which is given as input to the association rule generator module to generate strong association rules that satisfy the minimum confidence threshold. Association rules can be generated as follows [16].
• For each frequent itemset, l, generate all non-empty subsets of l.
• For every non-empty subset s of l, output the rule "s → (l − s)" if (Support(l) / Support(s)) >= min_conf, where min_conf is the minimum confidence threshold.
Since the rules are generated from frequent itemsets, each rule automatically satisfies minimum support.

4. EXPERIMENTAL SETUP & RESULTS
For the experiments, a cluster of four desktop machines, each with an i5 processor and 4 GB of DDR-3 RAM, is used. The Ubuntu 12.04 LTS operating system is installed on all four computers. A JVM is usually not part of Ubuntu 12.04, so a JVM is also installed on all four computers. A multi-node cluster is configured on three computers and a single-node cluster is configured on a single computer using Apache Hadoop packages.

For this experiment, the sales database of AMUL dairy, with more than 1500 different dairy products and a total size of 5GB, is used. In the dairy dataset, sales of dairy products follow a concept hierarchy: the product is first sent to the distributor, which in turn distributes it to the retailer, and finally the retailer sells it to the customer.

4.1. Comparative Study of CDA, FDM and MR-DARM Algorithm
After the transactional dataset is transformed into the actual transactional dataset, the actual transaction file is given as input to the frequent pattern mining algorithm to find the frequent itemsets. The CDA, FDM and MR-DARM algorithms are applied to AMUL datasets of varying size (256MB, 512MB, 1GB, 2GB and 5GB) on single-node, two-node and three-node clusters with a minimum support threshold of 1%; the results are shown in Fig. 2, 3 and 4 respectively. The results show that the performance of the algorithm depends on the number of nodes and the size of the dataset. For a dataset of size 5GB distributed on a single node, the execution times of the CDA, FDM and MR-DARM algorithms are 5670 seconds, 3680 seconds and 373 seconds respectively, while the same dataset distributed on a three-node cluster produced execution times of 3490 seconds, 2280 seconds and 269 seconds respectively. So, in order to obtain comparatively small execution times, the number of nodes must be increased as the database size increases. It is noticeable that the performance of the algorithms improves as the number of nodes increases, and the proposed algorithm gives much better performance than CDA as well as FDM when the dataset is large.

Figure 2 Dataset Size Vs Execution Time for Single Node Cluster with 1% Minimum Support
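The rule generation procedure of Section 3.1 can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `generate_rules` and the support counts are hypothetical, and the lookup `support[s]` assumes that the support of every subset is available, which holds because every subset of a frequent itemset is itself frequent. Only proper subsets are used as antecedents so that the consequent (l − s) is never empty.

```python
from itertools import combinations


def generate_rules(support, min_conf):
    """Return (antecedent, consequent, confidence) for every strong rule.

    `support` maps each frequent itemset (a frozenset) to its support count.
    """
    rules = []
    for itemset, itemset_support in support.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                s = frozenset(subset)
                # confidence(s -> l - s) = Support(l) / Support(s)
                confidence = itemset_support / support[s]
                if confidence >= min_conf:
                    rules.append((s, itemset - s, confidence))
    return rules


# Hypothetical support counts, not taken from the AMUL dataset.
support = {
    frozenset({"milk"}): 3,
    frozenset({"butter"}): 3,
    frozenset({"ghee"}): 2,
    frozenset({"milk", "butter"}): 2,
    frozenset({"milk", "ghee"}): 2,
}
rules = generate_rules(support, min_conf=0.8)
```

With these counts, only the rule from "ghee" to "milk" reaches the 0.8 confidence threshold; the other candidate rules have confidence 2/3.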
Figure 3 Dataset Size Vs Execution Time for Two Node Cluster with 1% Minimum Support

Figure 4 Dataset Size Vs Execution Time for Three Node Cluster with 1% Minimum Support

5. CONCLUSION AND FUTURE SCOPE
HDFS and MapReduce play an important role in handling and analyzing large datasets. However, most of the algorithms are limited in processing speed. In this paper, a Hadoop-based distributed approach is presented which divides the transactional dataset into partitions and transfers the task to all participating nodes. The purpose is to reduce inter-node message passing in the cluster. In preprocessing using Hadoop MapReduce, it has been observed that as the number of reducers increases, the execution time decreases significantly. The experimental results show that the parallel processing task scales linearly with the number of nodes and the size of the dataset. In this paper, the MR-DARM algorithm is implemented to find distributed frequent itemsets. As the number of nodes is increased, the performance improves when a lower minimum support factor and a large database size are considered. The proposed algorithm generates a smaller candidate set and uses less message passing than the CDA and FDM algorithms; thus the execution time of the proposed algorithm is lower than that of the others. The proposed algorithm is a more flexible, scalable and efficient distributed frequent pattern mining algorithm for mining large data.
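The execution times quoted in Section 4.1 for the 5GB dataset can be turned into ratios to make the comparison concrete. The following is a small sketch; the variable names are illustrative, and only the six timings come from the paper.

```python
# Execution times in seconds for the 5GB dataset, as reported in Section 4.1.
single_node = {"CDA": 5670, "FDM": 3680, "MR-DARM": 373}
three_node = {"CDA": 3490, "FDM": 2280, "MR-DARM": 269}

# Speedup of each algorithm when moving from one node to three nodes.
speedup = {name: single_node[name] / three_node[name] for name in single_node}

# How many times faster MR-DARM is than each baseline on the same cluster.
advantage_single = {n: single_node[n] / single_node["MR-DARM"] for n in ("CDA", "FDM")}
advantage_three = {n: three_node[n] / three_node["MR-DARM"] for n in ("CDA", "FDM")}

for name, value in speedup.items():
    print(f"{name}: {value:.2f}x speedup from one to three nodes")
for name in ("CDA", "FDM"):
    print(f"MR-DARM vs {name}: {advantage_single[name]:.1f}x (one node), "
          f"{advantage_three[name]:.1f}x (three nodes)")
```

Ratios of this kind are only as reliable as the individual timings, so they are best read alongside the full curves in Fig. 2, 3 and 4.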
The time efficiency of the algorithm may be improved by using FP-tree based data structures for the candidate itemset generation.

6. ACKNOWLEDGEMENTS
The authors take this opportunity to thank all the researchers from the domain of big data analysis for their immense knowledge and kind support throughout the work. We would also like to thank our institute for its resources and constant inspiration. Special thanks go to the authority of AMUL dairy, located in Anand district, for providing the sales dataset. Finally, heartiest thanks to our family and friends for encouraging us to make this a success.

REFERENCES
[1] Srikumar, K. and Bhasker, B. 2005. Metamorphosis: Mining Maximal Frequent Sets in Dense Domains. Int. Journal of Artificial Intelligence Tools, Vol. 14, Issue 3, 491-506.
[2] Agrawal, R., Imielinski, T. and Swami, A. 1993. Mining association rules between sets of items in large databases. Proc. Int. Conf. of ACM-SIGMOD on Management of Data, 207-216.
[3] Olson, D. L. and Delen, D. 2008. Advanced Data Mining Techniques. Springer.
[4] Han, J. and Kamber, M. 2004. Data Mining Concepts & Techniques. San Francisco: Morgan Kaufmann Publishers.
[5] Agrawal, D., Das, S. and Abbadi, A. 2011. Big data and cloud computing: current state and future opportunities. Proc. 14th Int. Conf. Extending Database Technology, ACM, 530-533.
[6] Wu, X., Zhu, X., Wu, G. and Ding, W. 2013. Data Mining with Big Data. IEEE Transactions on Knowledge and Data Engineering, Vol. 26, Issue 1, 97-107.
[7] Lin, J. and Ryaboy, D. 2013. Scaling big data mining infrastructure: the twitter experience. ACM SIGKDD Explorations Newsletter, 14, 6-19.
[8] Mottalib, M. A., Arefin, K. S., Islam, M. M., Rahman, M. A. and Abeer, S. A. 2011. Performance Analysis of Distributed Association Rule Mining with Apriori Algorithm. Int. Journal of Computer Theory and Engineering, Vol. 3, No. 4, 484-488.
[9] Ansari, E., Dastghaibifard, G. H., Keshtkaran, M. and Kaabi, H. 2008. Distributed Frequent Itemset Mining using Trie Data Structure. Int. Journal of Computer Science (IJCS).
[10] Karim, M. R., Ahmed, C. F., Jeong, B. and Choi, H. 2013. An efficient Distributed Programming Model for Mining Useful Patterns in Big Datasets. IETE Technical Review, Vol. 30, Issue 1, 53-63.
[11] Riondato, M., DeBrabant, J. A., Fonseca, R. and Upfal, E. 2012. Parma: A parallel randomized algorithm for approximate association rules mining in MapReduce. Proc. 21st Int. Conf. Information and Knowledge Management (CIKM '12), ACM, USA, 85-94.
[12] Malek, M. and Kadima, H. 2013. Searching frequent itemsets by clustering data: Towards a parallel approach using MapReduce. Web Information Systems Engineering WISE 2011 and 2012, Springer, Berlin Heidelberg, 7652, 251-258.
[13] Butincu, C. N. and Craus, M. 2015. An improved version of the frequent itemset mining algorithm. Proc. 14th IEEE Int. Conf. Networking in Education and Research, Craiova, 184-189.
[14] Agrawal, R. and Shafer, J. C. 1996. Parallel mining of association rules. IEEE Trans. on Knowledge and Data Engineering, 8, 962-969.
[15] Cheung, D. W., Han, J., Ng, V. T. and Fu, A. W. 1996. A fast distributed algorithm for mining association rules. Proc. 4th IEEE Int. Conf. Parallel and Distributed Information Systems, 31-42.
[16] Ban, T., Eto, M., Guo, S., Inoue, D., Nakao, K. and Huang, R. 2015. A study on association rule mining of darknet big data. Proc. IEEE Int. Joint Conf. on Neural Network (IJCNN), 1-7.
[17] Doshi, M. and Roy, B. 2014. Efficient Processing of Ajax Data Using Mining Algorithms. Int. Journal of Computer Engineering and Technology (IJCET), 5(8), 48-54.
[18] Chamatkar, A. J. and Butey, P. K. 2015. Performance Analysis of Data Mining Algorithms with Neural Network. Int. Journal of Computer Engineering and Technology, 6(1), 01-11.