Lecture 09 DifferentialPrivacy

The document discusses differential privacy and the challenges of ensuring privacy in data release, highlighting methods such as anonymization and k-anonymity. It emphasizes the risks of privacy violations through carefully crafted queries that can infer sensitive information about individuals. The conclusion suggests that while complete privacy cannot be guaranteed, measures can be taken to minimize the impact of individual data on released results.


Differential Privacy

ISEC411: Privacy & Anonymity

Course Instructor: Dr. Hanane LAMAAZI


Approach for private release

• Consent or anonymize
• Anonymization
  • removing identifiers (safe harbor)
  • k-anonymity and l-diversity

• Both have problems.
Solution?

• Hide the database (no user can see the database)
• Allow users to query the database
• Users will only see the output of their queries

• What do you think?
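The setup above can be sketched as a minimal query interface (an illustrative toy, not code from the lecture; the class, records, and field names are all assumed): users never read the records, they only receive aggregate answers.

```python
# Minimal sketch of the "hidden database" idea: the records are kept
# private, and users may only submit count queries and see the answer.
# (Illustrative code; the table contents are made up.)

class QueryInterface:
    def __init__(self, records):
        self._records = records  # hidden from users

    def count(self, predicate):
        """Return how many hidden records satisfy the predicate."""
        return sum(1 for r in self._records if predicate(r))

db = QueryInterface([
    {"age": 20, "zipcode": 15000, "income": 85_000},
    {"age": 34, "zipcode": 15000, "income": 60_000},
    {"age": 20, "zipcode": 16000, "income": 45_000},
])

# A user sees only the aggregate answer, never the rows:
print(db.count(lambda r: r["age"] >= 30))  # 1
```

Even though no row is ever shown, the next slides show this interface alone does not guarantee privacy.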
Basic Setting

[Diagram] DB = {x1, x2, x3, …, xn-1, xn} sits behind a query interface. Users (government, researchers, marketers, …) submit queries 1 … T and receive only the corresponding answers 1 … T; they never see the database itself.

What are the privacy risks in such a mechanism?

Can an attacker use this mechanism to breach the privacy of participants?
Risk to Privacy

An attacker knows the age (20) and zipcode (15000) of Alice. To infer something about Alice’s income, s/he issues 2 queries against Table 1:

• q0: SELECT COUNT(*) FROM T WHERE Age ∈ [20, 20] AND Zipcode ∈ [15k, 15k] AND Income ∈ [80k, +∞)
  Answer: 1

• q’0: SELECT COUNT(*) FROM T WHERE Age ∈ [20, 20] AND Zipcode ∈ [15k, 15k] AND Income ∈ (-∞, 80k)
  Answer: 0

Table 1: (table shown on the slide; not extracted)
Conclusion?
• Carefully crafted queries can lead to privacy violations.

Topic 21: Data Privacy 9
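The two count queries above can be replayed on a toy table to see why the answers pin down Alice’s income (the table contents are assumed for illustration; the slide’s Table 1 is not reproduced here):

```python
# Sketch of the attack from the slides: the attacker knows Alice's
# age (20) and zipcode (15000) and issues two count queries whose
# predicates isolate her record. (Table contents are made up.)

records = [
    {"age": 20, "zipcode": 15000, "income": 85_000},  # Alice (assumed)
    {"age": 34, "zipcode": 15000, "income": 60_000},
    {"age": 20, "zipcode": 16000, "income": 45_000},
]

def count(pred):
    return sum(1 for r in records if pred(r))

# q0:  Age = 20 AND Zipcode = 15000 AND Income >= 80k
q0 = count(lambda r: r["age"] == 20 and r["zipcode"] == 15000
           and r["income"] >= 80_000)

# q'0: Age = 20 AND Zipcode = 15000 AND Income < 80k
q0_prime = count(lambda r: r["age"] == 20 and r["zipcode"] == 15000
                 and r["income"] < 80_000)

print(q0, q0_prime)  # 1 0
```

Since q0 + q’0 = 1, exactly one person matches Alice’s known age and zipcode, so that person must be Alice, and q0 = 1 reveals her income is at least 80k.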


A running example: Justin Bieber

To understand the guarantee and what it protects against, suppose you are handed a survey:

If your music taste is sensitive information, what would make you feel safe? Anonymity?
DB of surveys is hidden

[Diagram] The same setting as before: the DB of surveys {x1, x2, …, xn} is hidden behind the query interface, and users (government, researchers, marketers, …) interact with it only through queries.

A user poses T queries.

Results R = {Answers: 1, 2, 3, …, T}

What will make you participate in the survey?
Problems?
What do we want?

I would feel safe submitting a survey if…

I knew that my answer had no impact on the released results.

➢ Q(Di − me) = Q(Di)

I knew that any attacker looking at the published results R couldn’t learn (with any high probability) any new information about me.

➢ Prob(secret(me) | R) = Prob(secret(me))
Why can’t we have it?

❑ If individual answers had no impact on the released results, then the results would have no utility.
Problems?
What do we want?

I would feel safe submitting a survey if…

I knew that my answer had no impact on the released results.

➢ Q(Di − me) = Q(Di)

I knew that any attacker looking at the published results R couldn’t learn (with any high probability) any new information about me.

➢ Prob(secret(me) | R) = Prob(secret(me))

• Example: say I am a 20-year-old female. If R says that 90% of females 20-25 like Justin Bieber, then:
  • Does this conclusion seem a private one?
  • Is it allowed according to our suggested requirement?
More examples

• If the attacker knows facts about me that are tied to general facts about the population:
  • I have smoked shisha for 10 years
  • I am male

• Then, by just knowing these facts, attackers can infer specific information about me (for example, if a study publishes that males who smoke shisha for over 10 years have a 50% increased chance of lung cancer), even if I do not participate in any studies.


Disappointing fact

• We can’t promise my data won’t affect the results.

• We can’t promise that an attacker won’t be able to learn new information about me, given proper background information.

• What can we do?


One more try
• I’d feel safe submitting a survey…

• If I knew that the chance of seeing any particular privatized released result R would be nearly the same, whether or not I submitted my information.

• Example:
  • The research findings show the connection between (gender, age) and how one listens to Bieber (for example, 90% of boys age 21 like Justin Bieber):
    • They will be almost the same regardless of my participation.
    • If I am a 21-year-old boy, then the result will implicate me even if I do not participate.
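One standard way to obtain this "nearly the same" behavior is to add random noise to each released count, scaled to the query’s sensitivity and a privacy parameter epsilon. The slides stop before introducing mechanisms, so the following is a forward-looking sketch of the well-known Laplace mechanism, with illustrative counts and parameters:

```python
import math
import random

# Sketch of the Laplace mechanism for a count query (standard DP
# technique; the counts and epsilon below are illustrative).

def laplace_noise(scale):
    # Sample Laplace(0, scale) via the inverse-CDF transform.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count, epsilon):
    # A count query has sensitivity 1: adding or removing one
    # person's record changes the true count by at most 1.
    return true_count + laplace_noise(1.0 / epsilon)

with_me    = private_count(90, epsilon=0.5)  # database including my record
without_me = private_count(89, epsilon=0.5)  # neighboring database without it
print(with_me, without_me)  # two noisy releases, typically close together
```

Because the noise is large relative to any single record’s contribution of ±1, the distribution of the released answer is nearly identical with or without my record, which is exactly the "I’d feel safe" condition stated above.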


Recall…

• K-anonymity was introduced as a reaction to the famous Sweeney attack.
