5. Privacy Models Differential Privacy I

The document provides an overview of differential privacy, a privacy model that ensures individual data points do not significantly affect the output of data analysis. It discusses the advantages and disadvantages of different privacy mechanisms, including cryptography, anonymization, and perturbation, and highlights the importance of privacy budget and sensitivity in achieving differential privacy. Additionally, it explores local differential privacy and its applications, while emphasizing the balance between privacy guarantees and utility in real-world scenarios.


Privacy models: differential privacy
Roadmap
• Differential privacy background
• Differentially Private Applications
• New applications: more than privacy
• Conclusion

1. Privacy Preserving Background

Privacy Model
• A set of rules/assumptions used to describe and measure the privacy of a dataset.
• Privacy (informal definition): an adversary cannot learn anything new about a particular individual after accessing the dataset.

[Diagram: individuals (Alice, Bob, Cathy) contribute their data to a curator, who holds the original dataset D and applies a privacy model (PM) before releasing/sharing data with public users or entities.]
Typical Privacy Models

• Cryptography
  • Advantage: does not decrease accuracy; suitable for multi-party computation
  • Disadvantage: high computational complexity; people may not want to participate in the computation
• Anonymization
  • Advantage: easy to understand; easy to implement for low-dimensional datasets
  • Disadvantage: no guarantee on the quality of the dataset; NP-hard for high-dimensional datasets; weak privacy guarantee
• Perturbation
  • Advantage: provides a high privacy level
  • Disadvantage: the noise size is subjective
Weakness of Traditional Privacy Models
• The privacy level is difficult to measure and compare
• The privacy guarantee can hardly be proved theoretically
• Susceptible to background-knowledge attacks
Background Attack

[Diagram: the dataset D contains records x1, ..., xn. The attacker queries D over (x1, ..., xn), then combines the query result with background information about (x1, ..., xn-1) to infer the remaining record xn.]
Differential Privacy
• Whether an individual is in or out of the database should make little difference to the analytical output. (Dwork, C. (2006). Differential Privacy.)

[Diagram: a query f is run on dataset D = (x1, ..., xn) and on its neighboring dataset D' = (x1, ..., xn-1), which differ in a single record; both produce the output S. Why does it work? If the adversary cannot tell the difference between the output of D and the output of D', then xn is safe.]
2. Differential Privacy

Privacy Definition: DP
• Definition: a mechanism M satisfies ε-differential privacy if, for all pairs of neighboring datasets D and D' and for all possible outputs S,

  Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D') ∈ S]

• ε is the privacy budget; equivalently, the ratio Pr[M(D) ∈ S] / Pr[M(D') ∈ S] is bounded between e^−ε and e^ε.
How to achieve the definition?
• The curator does not release the dataset; instead, users submit statistical queries to the curator, and the curator replies with query answers.
• Add uncertainty to the output
• The true answer is unique, but the DP answer is a distribution.

[Plot: probability distribution of the DP answer over possible outputs S, centered at the true answer.]
How to achieve the definition?
• Add uncertainty to the output
• Perturbation: how much is enough?
  • Which privacy level? → the privacy budget ε
  • How much of the difference between query results on neighboring datasets should be hidden? → the sensitivity, max |f(D) − f(D')|

[Plot: two probability distributions over possible outputs S, one centered at the true answer for D and one at the true answer for D'.]
Privacy Budget
• ε controls the privacy guarantee level of the mechanism.
• A smaller ε indicates a higher privacy level.
• Normally, ε is less than 1.
• Every DP step consumes a part of ε, until it is used up (see the sketch below).
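The last bullet is sequential composition: the ε values of successive DP releases add up against the total budget. A minimal Python sketch of a budget tracker; the class and method names (PrivacyAccountant, spend, remaining) are chosen here for illustration, not taken from the slides:

```python
class PrivacyAccountant:
    """Tracks the privacy budget under basic sequential composition:
    the epsilons of successive DP releases simply add up."""

    def __init__(self, total_epsilon: float):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> None:
        """Charge one DP query against the budget; refuse if it would overrun."""
        if self.spent + epsilon > self.total_epsilon:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon

    def remaining(self) -> float:
        return self.total_epsilon - self.spent


# Total budget 1.0; after two queries at epsilon = 0.4 each, only 0.2 remains,
# so a third query at 0.4 would be refused.
accountant = PrivacyAccountant(1.0)
accountant.spend(0.4)
accountant.spend(0.4)
print(accountant.remaining())  # 0.2
```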
Sensitivity
• Sensitivity is a parameter determining how much perturbation is required in a mechanism.
• For a query f, the sensitivity is Δf = max |f(D) − f(D')| over all pairs of neighboring datasets D and D'.

[Plot: two probability distributions over possible outputs S, one centered at the true answer for D and one at the true answer for D'; the distance between their centers is at most Δf.]
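As a concrete illustration, the sketch below (function and variable names are mine) measures how much a counting query can change when any single record of one particular dataset is removed; the global sensitivity is the maximum of this quantity over all datasets, which for a counting query is 1:

```python
from typing import Callable, Dict, List

def max_change_on_removal(f: Callable[[List[Dict]], float],
                          dataset: List[Dict]) -> float:
    """Largest change in f(dataset) when any one record is removed
    (removal is one common notion of 'neighboring' datasets)."""
    base = f(dataset)
    return max(abs(base - f(dataset[:i] + dataset[i + 1:]))
               for i in range(len(dataset)))

# Counting query: how many records report HIV?
count_hiv = lambda rows: sum(1 for row in rows if row["disease"] == "HIV")

data = [{"disease": "Hepatitis"}, {"disease": "HIV"}, {"disease": "Flu"}]
print(max_change_on_removal(count_hiv, data))  # 1
```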
Principal Differential Privacy Mechanisms
• Laplace Mechanism: add Laplace noise to the query result.
  • e.g. how many people in this room have blue eyes?
  • Noise: Laplace(sensitivity / privacy budget)

[Plot: Laplace distribution centered at 0, shown over the range −4 to 5.]

• Exponential Mechanism: adjust the probabilities of the possible outputs.
  • e.g. what is the most common eye color in this room?
  • Output probability proportional to exp(ε · q / (2 · sensitivity)), where q is the quality score of the output
  • e.g. R = {Brown, Blue, Black, Green}: the deterministic answer is 0%, 0%, 100%, 0%; the exponential mechanism outputs 10%, 5%, 80%, 5%

Dwork, C., McSherry, F., Nissim, K., & Smith, A. (2006). Calibrating Noise to Sensitivity in Private Data Analysis. Theory of Cryptography, 265-284.
McSherry, F., & Talwar, K. (2007). Mechanism Design via Differential Privacy. FOCS.
Laplace Mechanism
• Let f(D) be a numeric query on dataset D
  • e.g. how many people in this room have blue eyes?
• The sensitivity of f: Δf = max |f(D) − f(D')| over neighboring datasets D, D'

[Diagram: the mechanism M takes the dataset D = (x1, ..., xn), computes the true answer f(D), adds noise, and releases the noisy output S.]

• A Laplace mechanism M satisfies ε-differential privacy:

  M(D) = f(D) + Lap(Δf / ε)

  (true answer plus Laplace noise with scale Δf / ε; a Python sketch follows below)
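A minimal sketch of the mechanism above, using NumPy's Laplace sampler; the function name laplace_mechanism is mine:

```python
import numpy as np

def laplace_mechanism(true_answer: float, sensitivity: float, epsilon: float) -> float:
    """Release true_answer + Lap(sensitivity / epsilon), which satisfies
    epsilon-differential privacy for a query with the given global sensitivity."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_answer + noise

# "How many people in this room have blue eyes?" is a counting query, so sensitivity = 1.
print(laplace_mechanism(true_answer=12, sensitivity=1.0, epsilon=0.5))
```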
Laplace Example

Job       Sex     Age  Disease
Engineer  Male    35   Hepatitis
Engineer  Male    38   Hepatitis
Lawyer    Male    38   HIV
Writer    Female  30   Flu
Writer    Female  30   HIV
Dancer    Female  30   HIV
Dancer    Female  30   HIV

• Query: how many people have HIV?
• DP answer = true answer + noise: M(D) = f(D) + Lap(Δf / ε)
• The sensitivity is 1, because the answer changes by at most 1 if one record is deleted.
• If we set the privacy budget ε = 1, the noise is sampled from Lap(1).
• DP answer M(D):
  • 4 + 1 = 5 (higher probability)
  • 4 − 1 = 3 (higher probability)
  • 4 − 3 = 1 (lower probability)
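The example can be reproduced with a few lines of Python (the record layout is a direct transcription of the table above; ε and the sensitivity are set as on the slide):

```python
import numpy as np

records = [
    ("Engineer", "Male",   35, "Hepatitis"),
    ("Engineer", "Male",   38, "Hepatitis"),
    ("Lawyer",   "Male",   38, "HIV"),
    ("Writer",   "Female", 30, "Flu"),
    ("Writer",   "Female", 30, "HIV"),
    ("Dancer",   "Female", 30, "HIV"),
    ("Dancer",   "Female", 30, "HIV"),
]

true_count = sum(1 for *_, disease in records if disease == "HIV")  # 4
epsilon, sensitivity = 1.0, 1.0

# One DP release: answers near 4 (such as 3 or 5) are likely, answers far away (such as 1) are not.
dp_answer = true_count + np.random.laplace(0.0, sensitivity / epsilon)
print(round(dp_answer))
```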
Exponential Mechanism
• The exponential mechanism is suitable for non-numeric outputs r ∈ R
  • e.g. what is the most common eye color in this room?
  • R = {Brown, Blue, Black, Green}: the deterministic answer is 0%, 0%, 100%, 0%; the exponential mechanism outputs 10%, 5%, 80%, 5%
• It is paired with a quality score q(D, r), which represents how good an output r is for dataset D
• The sensitivity of q: Δq = max |q(D, r) − q(D', r)| over neighboring datasets D, D' and all outputs r
• An exponential mechanism M satisfies ε-differential privacy if it outputs r with probability proportional to exp(ε · q(D, r) / (2Δq))
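A minimal sketch of the sampling step; the helper name exponential_mechanism is mine, and it assumes the quality score of every candidate output has already been computed:

```python
import numpy as np

def exponential_mechanism(scores: dict, epsilon: float, q_sensitivity: float = 1.0) -> str:
    """Sample one output r with probability proportional to
    exp(epsilon * q(D, r) / (2 * sensitivity of q))."""
    options = list(scores)
    weights = np.array([np.exp(epsilon * scores[r] / (2 * q_sensitivity)) for r in options])
    probabilities = weights / weights.sum()
    return np.random.choice(options, p=probabilities)

# Quality score = how many people in the room have each eye color.
eye_scores = {"Brown": 23, "Blue": 9, "Black": 27, "Green": 0}
print(exponential_mechanism(eye_scores, epsilon=1.0))  # usually "Black", occasionally "Brown"
```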
Exponential Example
• What is the most common eye color in this room?
  • R = {Brown, Blue, Black, Green}: the deterministic answer is 0%, 0%, 100%, 0%; with ε = 1, the exponential mechanism outputs roughly 11.8%, 0.01%, 88%, 0.0001%
• Sensitivity of q: the impact of changing a single record (here Δq = 1)

Sampling probabilities:

Option   Score   ε = 0   ε = 0.1   ε = 1
Brown    23      0.25    0.34      0.12
Blue     9       0.25    0.16      10^-4
Black    27      0.25    0.40      0.88
Green    0       0.25    0.10      10^-6
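The table can be reproduced directly from exp(ε · q / (2Δq)) normalized over the four options; a short check, assuming Δq = 1 (the small differences from the slide, e.g. 0.33 vs 0.34 for Brown at ε = 0.1, are rounding):

```python
import numpy as np

scores = {"Brown": 23, "Blue": 9, "Black": 27, "Green": 0}

for epsilon in (0.0, 0.1, 1.0):
    weights = np.array([np.exp(epsilon * s / 2) for s in scores.values()])
    probabilities = weights / weights.sum()
    print(epsilon, {name: round(p, 4) for name, p in zip(scores, probabilities)})

# epsilon = 0.0 -> every option 0.25
# epsilon = 0.1 -> Brown 0.33, Blue 0.16, Black 0.40, Green 0.10
# epsilon = 1.0 -> Brown 0.12, Blue ~1e-4, Black 0.88, Green ~1e-6
```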
Local Differential Privacy (LDP)

[Diagram: each individual (Alice, Bob, Cathy) perturbs their own data locally under LDP before sending it to the curator; the curator collects the perturbed data into the dataset D and releases/shares results with public users or entities.]
From DP to LDP: Formal Definition
• Idea of DP: any output should be about as likely regardless of whether or not I am in the dataset.
  • A randomized algorithm M satisfies ε-differential privacy iff for any two neighboring datasets D and D', and for any output S of M,
    Pr[M(D) ∈ S] ≤ e^ε · Pr[M(D') ∈ S]
  • Run by the server.
• Idea of LDP: any output should be about as likely regardless of my secret.
  • A randomized algorithm M satisfies ε-local differential privacy iff for any two inputs v and v', and for any output y of M,
    Pr[M(v) = y] ≤ e^ε · Pr[M(v') = y]
  • Run by each single person.
• ε is also called the privacy budget; a smaller ε means stronger privacy.
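Randomized response is the classic mechanism that satisfies this local definition, and variants of it underlie industrial deployments such as Google's RAPPOR. A minimal sketch for a single yes/no attribute (not taken from the slides; all names are mine):

```python
import math
import random

def randomized_response(true_bit: bool, epsilon: float) -> bool:
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise flip it.
    The likelihood ratio between any two inputs is at most e^eps, so this
    satisfies epsilon-local differential privacy."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_bit if random.random() < p_truth else not true_bit

def estimate_count(reports: list, epsilon: float) -> float:
    """Unbiased estimate of how many users truly hold the attribute,
    correcting for the known flipping probability."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    n = len(reports)
    return (sum(reports) - n * (1.0 - p)) / (2.0 * p - 1.0)

# 10,000 users, 30% of whom truly have the attribute, epsilon = 1.
truth = [random.random() < 0.3 for _ in range(10_000)]
reports = [randomized_response(bit, epsilon=1.0) for bit in truth]
print(estimate_count(reports, epsilon=1.0))  # close to 3000, with error on the order of sqrt(n)
```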
Key difference between DP and LDP
• DP concerns two neighboring datasets.
• LDP concerns any two values.
• As a result, the amount of noise in the aggregated result of a counting query is different:
  • The noise in DP is O(1/ε) (the sensitivity is constant).
  • In LDP, even though the noise added by each user is constant, the error of the aggregated result grows on the order of √n [1].

[1] T.-H. H. Chan, E. Shi, and D. Song. Optimal Lower Bound for Differentially Private Multi-Party Aggregation.
Apple Differential Privacy

Other adoptions
Advantages of Differential Privacy
• Differential privacy is a promising privacy model that can provide a provable privacy guarantee.
• It also has potential for development across various research communities such as data mining, machine learning, etc.
Disadvantages of Differential Privacy
• It is quite successful as a mathematical theory, but it brings a large utility loss in real applications:
  • High noise
  • Large sensitivity
  • Sparse datasets
  • Limited privacy budget
  • …
