An Efficient Technique For Protecting Sensitive Information: Abstract
ISSN:2320-0790
Abstract: Data mining services require accurate input data for their results to be meaningful, but privacy concerns may
lead users to provide spurious information. To preserve client privacy in the data mining process, a variety of techniques
based on random perturbation of data records have been proposed recently. An important task in data mining is
discovering association rules from a database of transactions, where each transaction consists of a set of items. Two
important measures, support and confidence, are associated with each association rule. A rule is called sensitive
if its disclosure risk is above a certain privacy threshold. Sometimes sensitive rules should not be disclosed to the public
for reasons of confidentiality. Many approaches to hiding association rules take support and confidence as the basis of
their algorithms. The proposed work is likewise based on reducing the support and confidence of
sensitive rules, but it does not edit or disturb the given database of transactions directly. The proposed algorithm uses
modified definitions of support and confidence so that it can hide any desired sensitive association rule without side
effects. The enhanced technique obtains association rules with the same method as earlier approaches, but
with the modified definitions of support and confidence.
Keywords: Data mining, Data hiding, Support, Confidence, Association rules
I. INTRODUCTION
Many government agencies, businesses and non-profit
organizations, in order to support their short- and long-term
planning activities, are searching for ways to collect,
store, analyze and report data about individuals,
households or businesses. Information systems therefore
contain confidential information such as social security
numbers, income, credit ratings, type of disease, customer
purchases, etc., that must be properly protected. Let us
suppose that, as purchasing directors of Big Mart, a large
supermarket chain, we are negotiating a deal with Dedtrees
Paper Company. They offer their products at a reduced
price if we agree to give them access to our database of
customer purchases. We accept the deal and Dedtrees starts
mining our data. Using an association rule mining tool,
they find that people who purchase skim milk also
purchase Green paper.
Dedtrees now runs a coupon marketing campaign saying
that you can get 50 cents off skim milk with every
purchase of a Dedtrees product. This campaign cuts
heavily into the sales of Green paper, which increases the
COMPUSOFT, An international journal of advanced computer technology, 2 (3), March-2013 (Volume-II, Issue-III)
Fig.1
Items
ABD
B
ACD
AB
ABD

Rule      (Support, Confidence)
A → B     (60%, 75%)
B → A     (60%, 75%)
A → D     (60%, 75%)
D → A     (60%, 100%)
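
The support and confidence values listed for these four rules can be checked against the five transactions with a short script. This is a minimal sketch that assumes the standard definitions: support is the fraction of transactions containing all items of the rule, and confidence is that support divided by the support of the left-hand side. The helper names `support`, `confidence` and `rule_stats` are illustrative and do not come from the paper, and the modified definitions used by the proposed algorithm are not reproduced here.

```python
from fractions import Fraction

# The five example transactions; each string is treated as a set of items.
transactions = [set("ABD"), set("B"), set("ACD"), set("AB"), set("ABD")]

def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = set(itemset)
    return Fraction(sum(itemset <= t for t in db), len(db))

def confidence(lhs, rhs, db):
    """Standard confidence: support(lhs union rhs) / support(lhs)."""
    return support(set(lhs) | set(rhs), db) / support(lhs, db)

def rule_stats(lhs, rhs, db):
    """Return (support, confidence) of lhs -> rhs as whole percentages."""
    s = support(set(lhs) | set(rhs), db)
    c = confidence(lhs, rhs, db)
    return round(s * 100), round(c * 100)

for lhs, rhs in [("A", "B"), ("B", "A"), ("A", "D"), ("D", "A")]:
    print(lhs, "->", rhs, rule_stats(lhs, rhs, transactions))
```

Under these assumed definitions the script gives (60, 75) for the first three rules and (60, 100) for D → A, matching the listed values. Exact fractions are used so that rounding to whole percentages is not affected by floating-point error.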
Now suppose there is a need to hide A. After one pass, the status of the database is as follows:
Rule      (Support, Confidence)
A → B     (20%, 25%)
B → A     (50%, 60%)
A → D     (20%, 25%)
D → A     (43%, 60%)