
Distributed Differential Privacy via Mixnets*

Albert Cheu† Adam Smith‡ Jonathan Ullman§


David Zeber¶ Maxim Zhilyaev||

April 13, 2019



Abstract
We consider the problem of designing scalable, robust protocols for computing statistics about
sensitive data. Specifically, we look at how best to design differentially private protocols in a distributed
setting, where each user holds a private datum. The literature has mostly considered two models: the
“central” model, in which a trusted server collects users’ data in the clear, which allows greater accuracy;
and the “local” model, in which users individually randomize their data, and need not trust the server,
but accuracy is limited. Attempts to achieve the accuracy of the central model without a trusted server
have so far focused on variants of cryptographic secure function evaluation, which limits scalability.
In this paper, we propose a mixnet model for distributed differentially private algorithms, which
lies between the local and central models. This simple-to-implement model augments the local model
with an anonymous channel that randomly permutes a set of user-supplied messages. For summation
queries, we show that this model provides the power of the central model while avoiding the need to
trust a central server and the complexity of cryptographic secure function evaluation. More generally,
we give evidence that the power of the mixnet model lies strictly between those of the central and local
models: for a natural restriction of the model, we show that mixnet protocols for a widely studied
selection problem require exponentially higher sample complexity than do central-model protocols.

* Some of these results are based on previous, unpublished work by two of the authors [32].
† College of Computer and Information Science, Northeastern University. Research supported by NSF award CCF-1718088.
[email protected]
‡ Computer Science Department, Boston University. Research supported by NSF awards IIS-1447700 and AF-1763786 and a
Sloan Foundation Research Award. [email protected].
§ College of Computer and Information Science, Northeastern University. Research supported by NSF awards CCF-1718088 and
CCF-1750640 and a Google Faculty Research Award. [email protected]
¶ Mozilla Foundation. [email protected]
|| [email protected]

Contents

1 Introduction
  1.1 Related Work
  1.2 Mixnets as a Primitive for Private Data Analysis

2 Overview of Results
  2.1 Algorithmic Results
  2.2 Negative Results

3 Model and Preliminaries
  3.1 Differential Privacy
  3.2 Differential Privacy in the Mixnet Model
      3.2.1 Local Differential Privacy

4 A Protocol for Boolean Sums
  4.1 The Protocol
  4.2 Privacy Analysis
  4.3 Setting the Parameter λ
  4.4 Accuracy Analysis

5 A Protocol for Sums of Real Numbers
  5.1 The Protocol
  5.2 Warmup: r = 1
  5.3 Privacy Analysis
  5.4 Accuracy Analysis

6 Lower Bounds for the Mixnet Model
  6.1 Mixnet Randomizers Satisfy Local Differential Privacy
  6.2 Applications of Theorem 6.1
      6.2.1 The Selection Problem
      6.2.2 Histograms

A From Approximate DP to Pure DP for Local Protocols
  A.1 Privacy Analysis
  A.2 Accuracy Analysis

B Concentration Inequalities

1 Introduction
The past few years have seen a wave of commercially deployed systems [18, 28] for analysis of users’
sensitive data in the local model of differential privacy (LDP). LDP systems have several features that make
them attractive in practice, and limit the barriers to adoption. Each user only sends private data to the
data collector, so users do not need to fully trust the collector, and the collector is not saddled with legal
or ethical obligations. Moreover, these protocols are relatively simple and scalable, typically requiring
each party to asynchronously send a single short message.
However, the local model imposes strong constraints on the utility of the algorithm—precluding the
most useful differentially private algorithms, which require a central model where the users’ data is sent in
the clear, and the data collector is trusted to perform only differentially private computations. Compared
to the central model, the local model requires enormous amounts of data, both in theory and in practice
(e.g. [21]). Unsurprisingly, the local model has so far only been used by large corporations like Apple and
Google with billions of users.
In principle, there is no dilemma between the central and local models, as any algorithm can be
implemented without a trusted data collector using cryptographic multiparty computation (MPC). However,
despite dramatic recent progress in the area of practical MPC, existing techniques still require large costs
in terms of computation, communication, and number of rounds of interaction between the users and
data collector, and are considerably more difficult for companies to extend and maintain.
In this work, we initiate a systematic study of an intermediate model for distributed differential
privacy called the mixnet model. This model augments the standard model of local differential privacy
with an anonymous channel (also called a shuffler) that collects messages from the users, randomly
permutes them, and then forwards them to the data collector for analysis. For certain applications, this
model overcomes the limitations on accuracy of local algorithms while preserving many of their desirable
features. However, under natural constraints, this model is dramatically weaker than the central model.
In more detail, we make two primary contributions:

• We give a simple, non-interactive algorithm in the mixnet model for estimating a single Boolean-
valued statistical query (that is, a counting query) that essentially matches the error achievable
by centralized algorithms. We also show how to extend this algorithm to estimate a bounded
real-valued statistical query, albeit at an additional cost in communication. These protocols are
sufficient to implement any algorithm in the statistical queries model [22], which includes methods
such as gradient descent.

• We consider the ubiquitous variable-selection problem—a simple but canonical optimization problem.
Given a set of counting queries, the variable-selection problem is to identify the query with nearly
largest value (an “approximate argmax” problem). We prove that the sample complexity of this
problem in a natural restriction of the mixnet model is exponentially larger than in the central
model. The restriction is that each user sends only a single message into the shuffler, as opposed to
a set of messages—we call this the one-message mixnet model. Our positive results show that the
sample complexity in the mixnet model is polynomially smaller than in the local model. Taken
together, our results give evidence that the central, mixnet, and local models are strictly ordered
in the accuracy they can achieve for selection. Our lower bounds follow from a structural result
showing that any algorithm that is private in the one-message mixnet model is also private in the
local model with weak-but-non-trivial parameters.

1.1 Related Work

Models for Differentially Private Algorithms. Differential privacy [16] is a restriction on the algorithm
that processes a dataset to provide statistical summaries or other output. It ensures that, no matter what

an attacker learns by interacting with the algorithm, it would have learned nearly the same thing whether
or not the dataset contained any particular individual’s data [20]. Differential privacy is now widely
studied, and algorithms satisfying the criterion are increasingly deployed [2, 1, 18].
There are two well-studied models for implementing differentially-private algorithms. In the central
model, raw data are collected at a central server where they are processed by a differentially private
algorithm. In the local model [31, 19, 16], each individual applies a differentially private algorithm locally
to their data and shares only the output of the algorithm—called a report or response—with a server
that aggregates users’ reports. The local model allows individuals to retain control of their data since
privacy guarantees are enforced directly by their devices. It avoids the need for a single, widely-trusted
entity and the resulting single point of security failure. The local model has witnessed an explosion of
research in recent years, ranging from theoretical work to deployed implementations; a complete survey
is beyond the scope of this paper.
Unfortunately, for most tasks there is a large, unavoidable gap between the accuracy that is achievable
in the two models. Beimel et al. [5] and Chan et al. [9] show that estimating the sum of bits, one held by each player, requires error Ω(√n/ε) in the local model, while an error of just O(1/ε) is possible in the central model. [14] extended this lower bound to a wide range of natural problems, showing that the error must blow up by at least Ω(√n), and often by an additional factor growing with the data dimension.
More abstractly, Kasiviswanathan et al. [21] showed that the power of the local model is equivalent to
the statistical query model [22] from learning theory. They used this to show an exponential separation
between the accuracy and sample complexity of local and central algorithms. Subsequently, an even more
natural separation arose for the variable-selection problem [14, 29], which we also consider in this work.
Implementing Central-Model Algorithms in Distributed Models. In principle, one could also use the
powerful, general tools of modern cryptography, such as multiparty computation (MPC), or secure
function evaluation, to simulate central model algorithms in a setting without a trusted server [15], but
such algorithms currently impose bandwidth and liveness constraints that make them impractical for
large deployments. In contrast, Google [18] now collects certain usage statistics from hundreds of millions
of users’ devices subject to local differential privacy.
A number of specific solutions have been proposed to get around these efficiency limitations. Google’s
Prochlo architecture [6] allows a semi-trusted entity (such as Google itself) to collect and analyze data in
such a way that the entity does not have direct access to raw data, but only sufficiently coarse summaries.
The system enables various trust boundaries, but the main guarantee is provided by a combination of
user randomization (as in the local model) and trusted hardware (Intel’s SGX). The privacy guarantees
provided by Prochlo are not formalized in [6]. To the best of our understanding, they correspond to the
central model, with the SGX unit playing the role of the trusted server. Unfortunately, truly trustworthy
hardware is notoriously difficult to design, as demonstrated by recent attacks on SGX [12].
A number of specific, efficient MPC algorithms have also been proposed for differentially private
functionalities. They generally either (a) focus on simple summations and require a single “semi-
honest” (a.k.a. “honest-but-curious”) server that aggregates user answers, as in Shi et al. [25], Chan et al.
[10], Bonawitz et al. [7]; or (b) allow general computations but require a network of servers, a majority of
whom are assumed to behave honestly, as in Corrigan-Gibbs and Boneh [13]. As they currently stand,
these approaches have a number of drawbacks: they either require users to trust that a server maintained
by a service provider is behaving (semi-)honestly, or they require that a coalition of service providers
collaborate to run protocols that reveal to each other who their users are and what computations they
are performing on their users’ data. While it is possible, in principle, to avoid these issues by combining
anonymous communication layers and MPC protocols for universal circuits, such modifications destroy
the efficiency gains relative to generic MPC.
Mixnets. A mix network, or mixnet, is a protocol involving several computers that takes as input a sequence
of encrypted messages, and outputs a uniformly random permutation of those messages’ plaintexts.

[Figure: Prototypical (one-message) protocols in the local model (left) and the mixnet model (right). Each user i applies a local randomizer to their datum x_i to produce a message z_i; in the mixnet model, the messages additionally pass through a uniformly random permutation π before reaching the aggregator, which computes f̂(z) ≈ f(x).]

Introduced by [11], the basic idea now exists in many variations. In its simplest instantiation, the network
consists of a sequence of servers, whose identities and ordering are public information.1 Messages, each
one encrypted with all the servers’ keys, are submitted by users to the first server. Once enough messages
have been submitted, each server in turn performs a shuffle in which the server removes one layer of
encryption and sends a permutation of the messages to the next server. In a verifiable shuffle, the server
also produces a cryptographic proof that the shuffle preserved the multi-set of messages. The final server
sends the messages to their final recipients, which might be different for each message. A variety of
efficient implementations of mixnets with verifiable shuffles exist (see, e.g., [23] and citations therein).

1.2 Mixnets as a Primitive for Private Data Analysis


This paper studies how to use a mixnet as a cryptographic primitive to implement differentially-
private algorithms. Relative to general MPC, mixnets provide several advantages: First, there already
exist a number of highly efficient, implemented protocols. Second, their trust model is simple and highly
robust—as long as a single one of the servers performs its shuffle honestly, the entire process is a uniformly
random permutation, and our protocols’ privacy guarantees will hold. The approach also provides, for
free, secrecy of a company’s user base (since each company’s users could use that company’s server
as their first hop) and computations (since the computation is still done by the company). Finally, the
architecture and trust guarantees are easy to explain to nonexperts—while the guarantees of MPC require
considerable expertise to understand, the guarantees of a mixnet are easily explained with metaphors of
shuffled envelopes or a shell game.
Understanding the possibilities and limitations of mixnet protocols for private data analysis is inter-
esting from both theoretical and practical perspectives. It provides an intermediate abstraction, and we
give evidence that it lies strictly between the central and local models. Thus, it sheds light on the mini-
mal cryptographic primitives needed to get the central model’s accuracy. It also provides an attractive
platform for near-term deployment, for the reasons listed above.
For the remainder of this paper, we treat the mixnet protocol as an abstract service that randomly
permutes a set of messages. We leave a discussion of the many engineering, social, and cryptographic
implementation considerations to future work.

2 Overview of Results
The Mixnet Model. In our model, there are n users, each with data x_i ∈ X. Each user applies some encoder R : X → Y^m to their data and sends the messages (y_{i,1}, ..., y_{i,m}) = R(x_i). In the one-message mixnet model, each user sends m = 1 message. The n·m messages y_{i,j} are sent to a shuffler S : Y* → Y* that takes these messages and outputs them in a uniformly random order. The shuffled set of messages is then passed through some analyzer A : Y* → Z to estimate some function f(x_1, ..., x_n). Thus, the protocol P consists of the tuple (R, S, A). We say that the algorithm is (ε, δ)-differentially private in the mixnet model if the algorithm M_R(x_1, ..., x_n) = S(∪_{i=1}^n R(x_i)) satisfies (ε, δ)-differential privacy. For more detail, see the discussion leading to Definition 3.4.

¹Variations on this idea based on onion routing allow the user to specify a secret path through a network of mixes.
In contrast to the local model, differential privacy is now a property of all n users’ messages, and the
(ε, δ) may be functions of n. However, if an adversary were to inject additional messages, then it would
not degrade privacy, provided that those messages are independent of the honest users’ data. Thus, we
may replace n, in our results, with a lower bound on the number of honest users in the system. For example, if we have a protocol that is private for n users, but instead we have n/p users of which we assume at least a p fraction are honest, the protocol will continue to satisfy differential privacy.

2.1 Algorithmic Results


Our main result shows how to estimate any bounded, real-valued linear statistic (a statistical query) in
the mixnet model with error that nearly matches the best possible utility achievable in the central model.

Theorem 2.1. For every ε ∈ (0, 1), every δ ≳ εn·2^{−εn}, and every function f : X → [0, 1], there is a protocol P in the mixnet model that is (ε, δ)-differentially private and satisfies, for every n and every X = (x_1, ..., x_n) ∈ X^n, E[ |P(X) − Σ_{i=1}^n f(x_i)| ] = O( (1/ε)·log(n/δ) ). Each user sends m = Θ(ε√n) one-bit messages.

For comparison, in the central model, the Laplace mechanism achieves (ε, 0)-differential privacy and error O(1/ε). In contrast, error Ω(√n/ε) is necessary in the local model. Thus, for answering statistical queries, this protocol essentially has the best properties of the local and central models.
In the special case of estimating a sum of bits (or a Boolean-valued linear statistic), our protocol has a
slightly nicer guarantee and form.

Theorem 2.2. For every ε ∈ (0, 1), every δ ≳ 2^{−εn}, and every function f : X → {0, 1}, there is a protocol P in the mixnet model that is (ε, δ)-differentially private and satisfies, for every n and every X = (x_1, ..., x_n) ∈ X^n, E[ |P(X) − Σ_{i=1}^n f(x_i)| ] = O( (1/ε)·√(log(1/δ)) ). Each user sends a single one-bit message.

The protocol corresponding to Theorem 2.2 is extremely simple:

1. For some appropriate choice of p ∈ (0, 1), each user i with input x_i outputs y_i = x_i with probability 1 − p and a uniformly random bit y_i with probability p. When ε is not too small, p ≈ log(1/δ)/(ε²n).

2. The analyzer collects the shuffled messages y_1, ..., y_n and outputs (1/(1−p))·(Σ_{i=1}^n y_i − pn/2).
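For concreteness, the two steps above can be simulated in a few lines of Python. This is an illustrative sketch, not the authors' implementation; the heuristic choice of p follows the approximation above and assumes ε is large enough that p < 1:

    import math
    import random

    def randomizer(x, p):
        # With probability p output a uniformly random bit; otherwise report x truthfully.
        return random.randint(0, 1) if random.random() < p else x

    def analyzer(messages, p):
        # Debias the sum: E[sum(y_i)] = (1 - p) * sum(x_i) + p * n / 2.
        n = len(messages)
        return (sum(messages) - p * n / 2) / (1 - p)

    def bit_sum_protocol(xs, eps, delta):
        n = len(xs)
        p = math.log(1 / delta) / (eps ** 2 * n)  # heuristic from the text; assumes p < 1
        messages = [randomizer(x, p) for x in xs]
        random.shuffle(messages)  # the shuffler S; privacy is a property of the shuffled multiset
        return analyzer(messages, p)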


Communication Complexity. Our protocol for real-valued queries requires Θ(ε√n) bits per user. In contrast, the local model requires just a single bit, but incurs error Ω((1/ε)·√n). A generalization of Theorem 2.1 gives error O( √n/r + (1/ε)·log(r/δ) ) and sends r bits per user, but we do not know if this tradeoff is necessary. Closing this gap is an interesting open question.

2.2 Negative Results


We also prove negative results for algorithms in the one-message mixnet model. These results hinge on
a structural characterization of private protocols in the one-message mixnet model.

Theorem 2.3. If a protocol P = (R, S, A) satisfies (ε, δ)-differential privacy in the one-message mixnet model, then
R satisfies (ε + ln n, δ)-differential privacy. Therefore, P is (ε + ln n, δ)-differentially private in the local model.

Using Theorem 6.1 (and a transformation of [8] from (ε, δ)-DP to (O(ε), 0)-DP in the local model),
we can leverage existing lower bounds for algorithms in the local model to obtain lower bounds on
algorithms in the mixnet model.
Variable Selection. In particular, consider the following variable-selection problem: given a dataset x ∈ {0,1}^{n×d}, output Ĵ such that

    Σ_{i=1}^n x_{i,Ĵ} ≥ max_{j∈[d]} Σ_{i=1}^n x_{i,j} − n/10.

(The n/10 approximation term is somewhat arbitrary—any sufficiently small constant fraction of n will lead to the same lower bounds and separations.)
Any local algorithm (with ε = 1) for selection requires n = Ω(d log d), whereas in the central model the
exponential mechanism [24] solves this problem for n = O(log d). The following lower bound shows that
for this ubiquitous problem, the one-message mixnet model cannot match the central model.

Theorem 2.4. If P is a (1, 1/n^{10})-differentially private protocol in the one-message mixnet model that solves the selection problem (with high probability), then n = Ω(d^{1/17}). Moreover, this lower bound holds even if x is drawn i.i.d. from a product distribution over {0,1}^d.

In Section 6, we also prove lower bounds for the well studied histogram problem, showing that any
one-message mixnet-model protocol for this problem must have error growing (polylogarithmically)
with the size of the data domain. In contrast, in the central model it is possible to release histograms with
no dependence on the domain size, even for infinite domains.
We remark that our lower bound proofs do not apply if the algorithm sends multiple messages through the mixnet. However, we do not know whether beating the bounds is actually possible. Applying our bit-sum protocol d times (together with differential privacy's composition property) shows that n = Õ(√d) samples suffice in the multi-round mixnet model. We also do not know if this bound can be improved. We leave it as an interesting direction for future work to fully characterize the power of the mixnet model.

3 Model and Preliminaries


3.1 Differential Privacy
Let X ∈ X^n be a dataset consisting of elements from some universe X. We say two datasets X, X′ are neighboring if they differ on at most one user's data, and denote this X ∼ X′.

Definition 3.1 (Differential Privacy [16]). An algorithm M : X* → Z is (ε, δ)-differentially private if for every X ∼ X′ ∈ X* and every T ⊆ Z,

    P[M(X) ∈ T] ≤ e^ε · P[M(X′) ∈ T] + δ,

where the probability is taken over the randomness of M.

Differential privacy satisfies two extremely useful properties:

Lemma 3.2 (Post-Processing [16]). If M is (ε, δ)-differentially private, then for every A, A ◦ M is (ε, δ)-
differentially private.

Lemma 3.3 (Composition [16, 17]). If M_1, ..., M_T are each (ε, δ)-differentially private, then the composed algorithm

    M̃(X) = (M_1(X), ..., M_T(X))

is (ε′, δ′ + Tδ)-differentially private for every δ′ > 0 and ε′ = ε(e^ε − 1)T + ε·√(2T·log(1/δ′)).
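To make the bound concrete, here is a small Python helper that simply transcribes the formula in Lemma 3.3 (an illustrative sketch, nothing more):

    import math

    def composed_privacy(eps, delta, T, delta_prime):
        # Privacy of the T-fold composition per Lemma 3.3: (eps', delta' + T*delta).
        eps_prime = eps * (math.exp(eps) - 1) * T + eps * math.sqrt(2 * T * math.log(1 / delta_prime))
        return eps_prime, delta_prime + T * delta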

3.2 Differential Privacy in the Mixnet Model
In our model, there are n users, each of whom holds data x_i ∈ X. We will use X = (x_1, ..., x_n) ∈ X^n to denote the dataset of all n users' data. We say two datasets X, X′ are neighboring if they differ on at most one user's data, and denote this X ∼ X′.
The protocols we consider consist of three algorithms:
• R : X → Y^m is a randomized encoder that takes as input a single user's data x_i and outputs a set of m messages y_{i,1}, ..., y_{i,m} ∈ Y. If m = 1, then P is in the one-message mixnet model.

• S : Y* → Y* is a mixnet or shuffler that takes a set of messages and outputs these messages in a uniformly random order. Specifically, on input y_1, ..., y_N, S chooses a uniformly random permutation π : [N] → [N] and outputs y_{π(1)}, ..., y_{π(N)}.

• A : Y* → Z is some analysis function or analyzer that takes a set of messages y_1, ..., y_N and attempts to estimate some function f(x_1, ..., x_n) from these messages.
We denote the overall protocol P = (R, S, A). The mechanism by which we achieve privacy is

    Π_R(x_1, ..., x_n) = S(∪_{i=1}^n R(x_i)) = S(y_{1,1}, ..., y_{n,m}),

where both R and S are randomized. We will use P(X) = A ∘ Π_R(X) to denote the output of the protocol.
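In code, the pipeline P(X) = A(S(∪_i R(x_i))) amounts to a few lines of Python (a sketch with hypothetical helper names; any concrete randomizer R and analyzer A can be plugged in):

    import random

    def run_mixnet_protocol(R, A, xs):
        # Execute P = (R, S, A): encode each user's datum, shuffle all messages, analyze.
        messages = [m for x in xs for m in R(x)]  # the multiset of all n*m messages
        random.shuffle(messages)                  # the shuffler S: a uniformly random order
        return A(messages)                        # P(X) = A(Π_R(X))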
However, by the post-processing property of differential privacy (Lemma 3.2), it will suffice to consider
the privacy of ΠR (X), which will imply the privacy of P (X). We are now ready to define differential
privacy for protocols in the mixnet model.
Definition 3.4 (Differential Privacy in the Mixnet Model). A protocol P = (R, S, A) is (ε, δ)-differentially
private if the algorithm ΠR (x1 , . . . , xn ) = S(R(x1 ), . . . , R(xn )) is (ε, δ)-differentially private (Definition 3.1).
In this model, privacy is a property of the entire set of users’ messages and of the shuffler, and thus
ε, δ may depend on the number of users n. When we wish to refer to P or Π with a specific number of
users n, we will denote this by Pn or Πn .
We remark that if an adversary were to inject additional messages, then it would not degrade privacy,
provided that those messages are independent of the honest users’ data. Thus, we may replace n, in our
results, with an assumed lower bound on the number of honest users in the system.
In some of our results it will be useful to have a generic notion of accuracy for a protocol P .
Definition 3.5 (Accuracy for Distributed Protocols). A protocol P = (R, S, A) is (α, β)-accurate for the
function f : X ∗ → Z if, for every X ∈ X ∗ , P [d(P (X), f (X)) ≤ α] ≥ 1 − β where d : Z × Z → R is some
application-dependent distance measure.
As with the privacy guarantees, the accuracy of the protocol may depend on the number of users n,
and we will use Pn when we want to refer to the protocol with a specific number of users.
Composition of Differential Privacy. We will use the following useful composition property for proto-
cols in the mixnet model, which is an immediate consequence of Lemma 3.3 and the post-processing
Lemma 3.2. This lemma allows us to directly compose protocols in the mixnet model while only using
the shuffler once, rather than using the shuffler independently for each protocol being composed.
Lemma 3.6 (Composition of Protocols in the Mixnet Model). If Π_1 = (R_1, S), ..., Π_T = (R_T, S) for R_t : X → Y^m are each (ε, δ)-differentially private in the mixnet model, and R̃ : X → Y^{mT} is defined as

    R̃(x_i) = (R_1(x_i), ..., R_T(x_i)),

then, for every δ′ > 0, the composed protocol Π̃ = (R̃, S) is (ε′, δ′ + Tδ)-differentially private in the mixnet model for ε′ = ε(e^ε − 1)T + ε·√(2T·log(1/δ′)).

3.2.1 Local Differential Privacy
If the shuffler S were replaced with the identity function (i.e. if it did not randomly permute the
messages) then we would be left with exactly the local model of differential privacy. That is, a locally
differentially private protocol is a pair of algorithms P = (R, A), and the output of the protocol is P (X) =
A(R(x1 ), . . . , R(xn )). A protocol P is differentially private in the local model if and only if the algorithm R
is differentially private. In Section 6 we will see that if P = (R, S, A) is a differentially private protocol in
the one-message mixnet model, then R itself must satisfy local differential privacy for non-trivial (ε, δ),
and thus (R, A ◦ S) is a differentially private local protocol for the same problem.

4 A Protocol for Boolean Sums


In this section we describe and analyze a protocol for computing a sum of bits, establishing Theo-
rem 2.2 in the introduction.

4.1 The Protocol


In our model, the data domain is X = {0, 1} and the function being computed is f(x_1, ..., x_n) = Σ_{i=1}^n x_i. Our protocol, P_λ, is specified by a parameter λ ∈ [0, n] that allows us to trade off the level of privacy and accuracy. Note that λ may be a function of the number of users n. We will discuss in Section 4.3 how to set this parameter to achieve a desired level of privacy. For intuition, one may wish to think of the parameter as λ ≈ 1/ε² when ε is not too small.

The basic outline of P_λ is as follows. Roughly, a random set of λ users will choose y_i randomly, and the remaining n − λ will choose y_i to be their input bit x_i. The output of each user is the single message y_i. The outputs are then shuffled, and the output of the protocol is the sum Σ_{i=1}^n y_i, shifted and scaled so that it is an unbiased estimator of Σ_{i=1}^n x_i.
The protocol is described in Algorithm 1. The full name of this protocol is P^{0/1}_λ, where the superscript serves to distinguish it from the real-sum protocol P^R_{λ,r} (Section 5). Because the context of this section is clear, we drop the superscript. Since the analysis of both the privacy and accuracy of the algorithm will depend on the number of users n, we will use P_{n,λ}, R_{n,λ}, A_{n,λ} to denote the protocol and its components in the case where the number of users is n.

Algorithm 1: A mixnet protocol P^{0/1}_{n,λ} = (R^{0/1}_{n,λ}, S, A^{0/1}_{n,λ}) for computing the sum of bits

// Local Randomizer
R^{0/1}_{n,λ}(x):
    Input: x ∈ {0, 1}, parameters n ∈ ℕ, λ ∈ (0, n).
    Output: y ∈ {0, 1}
    Let b ← Ber(λ/n)
    If b = 0: Return y ← x
    ElseIf b = 1: Return y ← Ber(1/2)

// Analyzer
A^{0/1}_{n,λ}(y_1, ..., y_n):
    Input: (y_1, ..., y_n) ∈ {0, 1}^n, parameters n ∈ ℕ, λ ∈ (0, n).
    Output: z ∈ [0, n]
    Return z ← (n/(n−λ)) · ( Σ_{i=1}^n y_i − λ/2 )

4.2 Privacy Analysis
In this section we will prove that P_λ satisfies (ε, δ)-differential privacy. Note that if λ = n then each user's output is independent of their input, so the protocol trivially satisfies (0, 0)-differential privacy, and thus our goal is to prove an upper bound on the parameter λ that suffices to achieve a given (ε, δ).

Theorem 4.1 (Privacy of P_λ). There are absolute constants κ_1, ..., κ_5 such that the following holds for P_λ. For every n ∈ ℕ, δ ∈ (0, 1), and κ_2·log(1/δ)/n ≤ ε ≤ 1, there exists a λ = λ(n, ε, δ) such that P_{n,λ} is (ε, δ)-differentially private and

    λ ≤ { κ_4·log(1/δ)/ε²                 if ε ≥ √(κ_3·log(1/δ)/n)
        { n − κ_5·εn^{3/2}/√(log(1/δ))    otherwise

In the remainder of this section we will prove Theorem 4.1.


The first step in the proof is the observation that the output of the shuffler depends only on Σ_i y_i. It will be more convenient to analyze the algorithm C_λ (Algorithm 2) that simulates S(R_λ(x_1), ..., R_λ(x_n)). Claim 4.2 shows that the output distribution of C_λ is indeed the same as that of the sum Σ_i y_i. Therefore, privacy of C_λ carries over to P_λ.

Algorithm 2: C_λ(x_1, ..., x_n)
    Input: (x_1, ..., x_n) ∈ {0, 1}^n, parameter λ ∈ (0, n).
    Output: y ∈ {0, 1, 2, ..., n}
    Sample s ← Bin(n, λ/n)
    Define H_s = {H ⊆ [n] : |H| = s} and choose H ← H_s uniformly at random
    Return y ← Σ_{i∉H} x_i + Bin(s, 1/2)

Claim 4.2. For every n ∈ ℕ, x ∈ {0, 1}^n, and every r ∈ {0, 1, 2, ..., n}, P[C_λ(X) = r] = P[ Σ_{i=1}^n R_{n,λ}(x_i) = r ].

Proof. Fix any r ∈ {0, 1, 2, ..., n}. Then

    P[C_λ(X) = r] = Σ_{H⊆[n]} P[ C_λ(X) = r ∩ H = H ]
        = Σ_{H⊆[n]} P[ Σ_{i∉H} x_i + Bin(|H|, 1/2) = r ] · (λ/n)^{|H|} · (1 − λ/n)^{n−|H|}
        = Σ_{H⊆[n]} P[ Σ_{i∉H} x_i + Σ_{i∈H} Ber(1/2) = r ] · (λ/n)^{|H|} · (1 − λ/n)^{n−|H|}        (1)

Let G denote the (random) set of people for whom b_i = 1 in P_λ. Notice that

    P[ Σ_{i=1}^n R_{n,λ}(x_i) = r ] = Σ_{G⊆[n]} P[ Σ_i R_{n,λ}(x_i) = r ∩ G = G ]
        = Σ_{G⊆[n]} P[ Σ_{i∉G} x_i + Σ_{i∈G} Ber(1/2) = r ] · (λ/n)^{|G|} · (1 − λ/n)^{n−|G|}

which is the same as (1). This concludes the proof.
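A quick Monte Carlo comparison illustrates Claim 4.2 empirically (a sketch using only the Python standard library; the inputs and trial count are arbitrary):

    import random
    from collections import Counter

    def sum_of_randomizers(xs, lam):
        # Distribution of sum_i R_{n,λ}(x_i).
        n = len(xs)
        return sum(random.randint(0, 1) if random.random() < lam / n else x for x in xs)

    def c_lambda(xs, lam):
        # Algorithm 2: s ~ Bin(n, λ/n), random H of size s, output sum_{i∉H} x_i + Bin(s, 1/2).
        n = len(xs)
        s = sum(random.random() < lam / n for _ in range(n))
        H = set(random.sample(range(n), s))
        return sum(x for i, x in enumerate(xs) if i not in H) + sum(random.randint(0, 1) for _ in range(s))

    xs, lam, trials = [1, 0, 1, 1, 0, 1, 0, 0], 3.0, 100_000
    print(Counter(sum_of_randomizers(xs, lam) for _ in range(trials)))
    print(Counter(c_lambda(xs, lam) for _ in range(trials)))  # the two histograms should match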

Now we establish that in order to demonstrate privacy of Pn,λ , it suffices to analyze Cλ .

Claim 4.3. If C_λ is (ε, δ)-differentially private, then P_{n,λ} is (ε, δ)-differentially private.

Proof. Fix any number of users n. Consider the randomized algorithm T : {0, 1, 2, ..., n} → {0,1}^n that takes a number r and outputs a uniformly random string z that has r ones. If C_λ is differentially private, then the output of T ∘ C_λ is (ε, δ)-differentially private by the post-processing lemma.

To complete the proof, we show that for any X ∈ X^n the output of (T ∘ C_λ)(X) has the same distribution as S(R_λ(x_1), ..., R_λ(x_n)). Fix some vector Z ∈ {0,1}^n with sum r. Then

    P_{T,C_λ}[T(C_λ(X)) = Z] = P[T(r) = Z] · P[C_λ(X) = r]
        = (n choose r)^{-1} · P[C_λ(X) = r]
        = (n choose r)^{-1} · P[ Σ_i R_{n,λ}(x_i) = r ]        (Claim 4.2)
        = (n choose r)^{-1} · Σ_{Y ∈ {0,1}^n : |Y|=r} P[ R_{n,λ}(X) = Y ]
        = Σ_{Y ∈ {0,1}^n : |Y|=r} P[ R_{n,λ}(X) = Y ] · P[S(Y) = Z]
        = P_{R_{n,λ},S}[ S(R_{n,λ}(X)) = Z ]

This completes the proof of Claim 4.3.


We will analyze the privacy of C_λ in three steps. First, we show that for any sufficiently large set H, the final step of the algorithm ensures differential privacy with some parameters. We then show that for any sufficiently large value s and H chosen randomly with |H| = s, the privacy parameters actually improve significantly in the regime where s is close to n. Finally, we show that when s is chosen randomly, then s is sufficiently large with high probability.

Algorithm 3: C_H(x_1, ..., x_n)
    Input: (x_1, ..., x_n) ∈ {0, 1}^n, parameter H ⊆ [n].
    Output: y_H ∈ {0, 1, 2, ..., n}
    Let B ← Bin(|H|, 1/2)
    Return y_H ← Σ_{i∉H} x_i + B

Claim 4.4. For any δ > 0 and any H ⊆ [n] such that |H| > 8·log(4/δ), C_H is (ε, δ/2)-differentially private for

    ε = ln( 1 + √(32·log(4/δ)/|H|) ) < √( 32·log(4/δ)/|H| )

Proof. Fix neighboring datasets X ∼ X′ ∈ {0,1}^n, any H ⊆ [n] such that |H| > 8·log(4/δ), and any δ > 0. If the point at which X, X′ differ lies within H, the two distributions C_H(X), C_H(X′) are identical. Hence, without loss of generality we assume that x_j = 0 and x′_j = 1 for some j ∉ H.

Define u := √((1/2)·|H|·log(4/δ)) and I_u := [ (1/2)|H| − u, (1/2)|H| + u ], so that by Hoeffding's inequality (Theorem B.2), P[B ∉ I_u] < δ/2. For any W ⊆ {0, 1, 2, ..., n} we have

    P[C_H(X) ∈ W] = P[C_H(X) ∈ W ∩ B ∈ I_u] + P[C_H(X) ∈ W ∩ B ∉ I_u]
                  ≤ P[C_H(X) ∈ W ∩ B ∈ I_u] + δ/2
                  = Σ_{r ∈ W∩I_u} P[ B + Σ_{i∉H} x_i = r ] + δ/2

Thus to complete the proof, it suffices to show that for any H and r ∈ W ∩ I_u,

    P[ B + Σ_{i∉H} x_i = r ] / P[ B + Σ_{i∉H} x′_i = r ] ≤ 1 + √( 32·log(4/δ)/|H| )        (2)

Because x_j = 0, x′_j = 1, and j ∉ H, we have Σ_{i∉H} x_i = Σ_{i∉H} x′_i − 1. Thus,

    P[ B + Σ_{i∉H} x_i = r ] / P[ B + Σ_{i∉H} x′_i = r ]
        = P[ B + Σ_{i∉H} x′_i − 1 = r ] / P[ B + Σ_{i∉H} x′_i = r ]
        = P[ B = r − Σ_{i∉H} x′_i + 1 ] / P[ B = r − Σ_{i∉H} x′_i ]

Now we define k = r − Σ_{i∉H} x′_i so that

    P[ B = r − Σ_{i∉H} x′_i + 1 ] / P[ B = r − Σ_{i∉H} x′_i ] = P[B = k + 1] / P[B = k].

Then we can calculate

    P[B = k + 1] / P[B = k] = (|H| − k)/(k + 1)                                       (B is binomial)
        ≤ ( |H| − ((1/2)|H| − u) ) / ( (1/2)|H| − u + 1 )                             (r ∈ I_u so k ≥ (1/2)|H| − u)
        < ( (1/2)|H| + u ) / ( (1/2)|H| − u ) = ( u²/log(4/δ) + u ) / ( u²/log(4/δ) − u )        (u = √((1/2)|H|·log(4/δ)))
        = ( u + log(4/δ) ) / ( u − log(4/δ) ) = 1 + 2·log(4/δ)/( u − log(4/δ) )
        ≤ 1 + 4·log(4/δ)/√((1/2)|H|·log(4/δ)) = 1 + √( 32·log(4/δ)/|H| )              (|H| > 8·log(4/δ))

which completes the proof.

Next, we consider the case where H is a random subset of [n] with a fixed size s. In this case we will use an amplification via sampling argument [21, 26] to argue that the randomness of H improves the privacy parameters by a factor of roughly (1 − s/n), which will be crucial when s ≈ n.

Algorithm 4: C_s(x_1, ..., x_n)
    Input: (x_1, ..., x_n) ∈ {0, 1}^n, parameter s ∈ {0, 1, 2, ..., n}.
    Output: y_s ∈ {0, 1, 2, ..., n}
    Define H_s = {H ⊆ [n] : |H| = s} and choose H ← H_s uniformly at random
    Return y_s ← C_H(x)

Claim 4.5. For any δ > 0 and any s > 8·log(4/δ), C_s is (ε, δ/2)-differentially private for

    ε = √( 32·log(4/δ)/s ) · ( 1 − s/n )
Proof. As in the previous section, fix X ∼ X′ ∈ {0,1}^n where x_j = 0, x′_j = 1. C_s(X) selects H uniformly from H_s and runs C_H(X); let H denote the realization of H. To enhance readability, we will use the shorthand ε_0(s) := √(32·log(4/δ)/s). For any W ⊆ {0, 1, 2, ..., n}, we aim to show that

    ( P_{H,C_H}[C_H(X) ∈ W] − δ/2 ) / P_{H,C_H}[C_H(X′) ∈ W] ≤ exp( ε_0(s)·(1 − s/n) )

First, we have

    ( P[C_H(X) ∈ W] − δ/2 ) / P[C_H(X′) ∈ W]
        = ( P[C_H(X) ∈ W | j ∈ H]·P[j ∈ H] + P[C_H(X) ∈ W | j ∉ H]·P[j ∉ H] − δ/2 )
          / ( P[C_H(X′) ∈ W | j ∈ H]·P[j ∈ H] + P[C_H(X′) ∈ W | j ∉ H]·P[j ∉ H] )
        = ( (1 − p)·γ(X) + p·ζ(X) − δ/2 ) / ( (1 − p)·γ(X′) + p·ζ(X′) )        (3)

where p := P[j ∉ H] = 1 − s/n,

    γ(X) := P_{C_H}[C_H(X) ∈ W | j ∈ H]    and    ζ(X) := P_{C_H}[C_H(X) ∈ W | j ∉ H].

When user j outputs a uniformly random bit, their private value has no impact on the distribution. Hence, γ(X) = γ(X′), and

    (3) = ( (1 − p)·γ(X) + p·ζ(X) − δ/2 ) / ( (1 − p)·γ(X) + p·ζ(X′) )        (4)

Since s = |H| is sufficiently large, by Claim 4.4 we have ζ(X) ≤ (1 + ε_0(s))·min{ζ(X′), γ(X)} + δ/2. Thus

    (4) ≤ ( (1 − p)·γ(X) + p·( (1 + ε_0(s))·min{ζ(X′), γ(X)} + δ/2 ) − δ/2 ) / ( (1 − p)·γ(X) + p·ζ(X′) )
        ≤ ( (1 − p)·γ(X) + p·(1 + ε_0(s))·min{ζ(X′), γ(X)} ) / ( (1 − p)·γ(X) + p·ζ(X′) )
        = ( (1 − p)·γ(X) + p·min{ζ(X′), γ(X)} + p·ε_0(s)·min{ζ(X′), γ(X)} ) / ( (1 − p)·γ(X) + p·ζ(X′) )
        ≤ ( (1 − p)·γ(X) + p·ζ(X′) + p·ε_0(s)·min{ζ(X′), γ(X)} ) / ( (1 − p)·γ(X) + p·ζ(X′) )
        = 1 + ( p·ε_0(s)·min{ζ(X′), γ(X)} ) / ( (1 − p)·γ(X) + p·ζ(X′) )        (5)

Observe that min{ζ(X′), γ(X)} ≤ (1 − p)·γ(X) + p·ζ(X′), so

    (5) ≤ 1 + p·ε_0(s) = 1 + ε_0(s)·(1 − s/n) ≤ exp( ε_0(s)·(1 − s/n) ) = exp( √(32·log(4/δ)/s) · (1 − s/n) )

which completes the proof.

We now come to the actual algorithm C_λ, where s is not fixed but random. The analysis of C_s yields a bound on the privacy parameter that decreases with s, so we will complete the analysis of C_λ by using the fact that, with high probability, s is almost as large as λ.

Claim 4.6. For any δ > 0 and n ≥ λ ≥ 14·log(4/δ), C_λ is (ε, δ)-differentially private where

    ε = √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) ) · ( 1 − ( λ − √(2λ·log(2/δ)) )/n )

Proof. Fix any X ∼ X′ ∈ {0,1}^n and any W ⊆ [n].

    P[C_λ(X) ∈ W] = P[ C_λ(X) ∈ W ∩ s ≥ λ − √(2λ·log(2/δ)) ] + P[ C_λ(X) ∈ W ∩ s < λ − √(2λ·log(2/δ)) ]
                  ≤ P[ C_λ(X) ∈ W ∩ s ≥ λ − √(2λ·log(2/δ)) ] + δ/2        (Chernoff bound)
                  = Σ_{s ≥ λ − √(2λ·log(2/δ))} P[C_s(X) ∈ W]·P[s = s] + δ/2        (6)

Because λ is sufficiently large, λ − √(2λ·log(2/δ)) > 8·log(4/δ). Claim 4.5 thus applies to each term in the sum:

    P[C_s(X) ∈ W] ≤ exp( √(32·log(4/δ)/s) · (1 − s/n) ) · P[C_s(X′) ∈ W] + δ/2

For notational convenience, we will use the shorthand ε_1(s) := √(32·log(4/δ)/s) · (1 − s/n). Therefore,

    (6) ≤ ( Σ_{s ≥ λ − √(2λ·log(2/δ))} ( e^{ε_1(s)}·P[C_s(X′) ∈ W] + δ/2 )·P[s = s] ) + δ/2
        ≤ ( Σ_{s ≥ λ − √(2λ·log(2/δ))} e^{ε_1(s)}·P[C_s(X′) ∈ W]·P[s = s] ) + δ
        ≤ max_{s ≥ λ − √(2λ·log(2/δ))} e^{ε_1(s)} · P[C_λ(X′) ∈ W] + δ

Because ε_1(s) decreases with s, the above is maximized at the lower bound on s:

    P[C_λ(X) ∈ W] ≤ exp( √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) ) · ( 1 − ( λ − √(2λ·log(2/δ)) )/n ) ) · P[C_λ(X′) ∈ W] + δ

which completes the proof.

From Claim 4.3, Cλ and Pn,λ share the same privacy guarantees. Hence, Claim 4.6 implies the
following:
Corollary 4.7. For any δ ∈ (0, 1), n ∈ ℕ, and λ ∈ [14·log(4/δ), n], P_{n,λ} is (ε, δ)-differentially private, where

    ε = √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) ) · ( 1 − ( λ − √(2λ·log(2/δ)) )/n )

4.3 Setting the Parameter λ
Corollary 4.7 gives a bound on the privacy of Pn,λ in terms of the number of users n and the random-
ization parameter λ. While this may be enough on its own, in order to understand the tradeoff between ε
and the accuracy of the protocol, we want to identify a suitable choice of λ to achieve a desired privacy
guarantee (ε, δ). To complete the proof of Theorem 4.1, we prove such a bound.
For the remainder of this section, fix some δ ∈ (0, 1). Corollary 4.7 states that for any n and λ ∈ [14·log(4/δ), n], P_{n,λ} satisfies (ε*(λ), δ)-differential privacy, where

    ε*(λ) = √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) ) · ( 1 − ( λ − √(2λ·log(2/δ)) )/n )

Let λ*(ε) be the inverse of ε*, i.e. the minimum λ ∈ [0, n] such that ε*(λ) ≤ ε. Note that ε*(λ) is decreasing as λ → n while λ*(ε) increases as ε → 0. By definition, P_{n,λ} satisfies (ε, δ)-privacy if λ ≥ λ*(ε); the following lemma gives such an upper bound:

Lemma 4.8. For all δ ∈ (0, 1), n ≥ 14·log(4/δ), and ε ∈ ( √3456·log(4/δ)/n, 1 ), P_{n,λ} is (ε, δ)-differentially private if

    λ = { (64/ε²)·log(4/δ)                if ε ≥ √((192/n)·log(4/δ))
        { n − εn^{3/2}/√(432·log(4/δ))    otherwise        (7)
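The parameter choice in Eq. (7) is straightforward to compute; a small Python helper transcribing the formula (valid only for the range of (n, ε, δ) stated in the lemma):

    import math

    def choose_lambda(n, eps, delta):
        # Randomization parameter λ from Eq. (7) of Lemma 4.8.
        if eps >= math.sqrt(192 / n * math.log(4 / delta)):
            return 64 / eps ** 2 * math.log(4 / delta)
        return n - eps * n ** 1.5 / math.sqrt(432 * math.log(4 / delta))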

We'll prove the lemma in two claims, each of which corresponds to one of the two cases of our bound on λ*(ε). The first bound applies when ε is relatively large.

Claim 4.9. For all δ ∈ (0, 1), n ≥ 14·log(4/δ), and ε ∈ [ √((192/n)·log(4/δ)), 1 ], if λ = (64/ε²)·log(4/δ), then P_{n,λ} is (ε, δ)-private.

Proof. Let λ = (64/ε²)·log(4/δ) as in the statement. Corollary 4.7 states that P_{n,λ} satisfies (ε*(λ), δ)-privacy for

    ε*(λ) = √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) ) · ( 1 − ( λ − √(2λ·log(2/δ)) )/n )
          ≤ √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) )        (λ ≤ n)
          ≤ √( 64·log(4/δ)/λ )                                (λ ≥ 8·log(2/δ))
          = ε

This completes the proof of the claim.



The value of λ in the previous claim can be as large as n when ε approaches 1/√n. We now give a meaningful bound for smaller values of ε.

Claim 4.10. For all δ ∈ (0, 1), n ≥ 14·log(4/δ), and ε ∈ ( √3456·log(4/δ)/n, √((192/n)·log(4/δ)) ), if

    λ = n − εn^{3/2}/√(432·log(4/δ))

then P_{n,λ} is (ε, δ)-private.

Proof. Let λ = n − εn^{3/2}/√(432·log(4/δ)) as in the statement. Note that for this ε regime, we have n/3 < λ < n. Corollary 4.7 states that P_{n,λ} satisfies (ε*(λ), δ)-privacy for

    ε*(λ) = √( 32·log(4/δ) / ( λ − √(2λ·log(2/δ)) ) ) · ( 1 − ( λ − √(2λ·log(2/δ)) )/n )
          ≤ √( 64·log(4/δ)/λ ) · ( 1 − λ/n + √(2λ·log(2/δ))/n )                        (λ ≥ 8·log(2/δ))
          = √( 64·log(4/δ)/λ ) · ( ε√n/√(432·log(4/δ)) + √(2λ·log(2/δ))/n )
          ≤ √( 64·log(4/δ)/λ ) · ( ε√n/√(432·log(4/δ)) + √(2·log(2/δ)/n) )             (λ ≤ n)
          ≤ √( 192·log(4/δ)/n ) · ( ε√n/√(432·log(4/δ)) + √(2·log(2/δ)/n) )            (λ ≥ n/3)
          = (2/3)·ε + √( 384·log(4/δ)·log(2/δ) )/n
          < (2/3)·ε + (√384/n)·log(4/δ)
          < (2/3)·ε + (1/3)·ε = ε                                                       (ε > √3456·log(4/δ)/n)

which completes the proof.

4.4 Accuracy Analysis


In this section, we will bound the error of P_λ(X) with respect to Σ_i x_i. Recall that, to clean up notational clutter, we will often write f(X) = Σ_i x_i. As with the previous section, our statements will at first be in terms of λ but the section will end with a statement in terms of ε, δ.

Theorem 4.11. For every n ∈ ℕ, β > 0, n > λ ≥ 2·log(2/β), and x ∈ {0,1}^n,

    P[ |P_{n,λ}(x) − Σ_i x_i| > √(2λ·log(2/β)) · n/(n−λ) ] ≤ β

Observe that, using the choice of λ specified in Theorem 4.1, we conclude that for every 1/n ≲ ε ≲ 1 and every δ the protocol P_λ satisfies

    P[ |P_{n,λ}(x) − Σ_i x_i| > O( √(log(1/δ)·log(1/β))/ε ) ] ≤ β

To see how this follows from Theorem 4.11, consider two parameter regimes:

1. When ε ≫ 1/√n, then λ ≈ log(1/δ)/ε² ≪ n, so the bound in Theorem 4.11 is O(√(λ·log(1/β))), which yields the desired bound.

2. When ε ≪ 1/√n, then n − λ ≈ εn^{3/2}/√(log(1/δ)) ≪ n, so the bound in Theorem 4.11 is O( n^{3/2}·√(log(1/β))/(n−λ) ), which yields the desired bound.

We formalize this analysis in Corollary 4.15 to obtain Theorem 2.2 in the introduction.


We begin by determining the mean and variance of the messages y_i.

Claim 4.12. For any n ∈ ℕ, 0 < λ ≤ n, and x ∈ {0, 1},

    E[R_{n,λ}(x)] = λ/(2n) + (1 − λ/n)·x
    Var[R_{n,λ}(x)] = (λ/(2n))·(1 − λ/(2n))

Proof. For shorthand, write y = R_{n,λ}(x). The calculation of the expectation is not long:

    E[y] = (λ/n)·E[Ber(1/2)] + (1 − λ/n)·x
         = λ/(2n) + (1 − λ/n)·x

If x = 0, then y is a Bernoulli random variable with probability (1/2)·(λ/n) of being 1. The variance of Ber(p) is p(1 − p), which is here (λ/(2n))·(1 − λ/(2n)). A symmetric argument applies to the case where x = 1. This concludes the proof.

Using the above, one can compute the mean and variance of the protocol's estimate using linearity of expectation and the fact that the output of the protocol is a (rescaled) sum of independent messages:

Claim 4.13. For any n ∈ ℕ, 0 < λ ≤ n, and X ∈ {0,1}^n,

    E[P_{n,λ}(X)] = Σ_{i=1}^n x_i
    Var[P_{n,λ}(X)] = (n/(n−λ))² · (λ/2) · (1 − λ/(2n))

We omit the proof for space. From the previous claim, and the fact that the protocol's output is a (rescaled) sum of independent bits, we can obtain a high-probability bound on the error.
Corollary 4.14. For any n ∈ ℕ, 0 < β < 1, and (16/9)·log(2/β) < λ < n, the protocol P_{n,λ} is (α, β)-accurate for

    α = (n/(n−λ))·√(2λ·log(2/β))

Proof. Fix any X ∈ {0,1}^n. Let d_i denote the random variable R_{n,λ}(x_i) − λ/(2n) − (1 − λ/n)·x_i. It has maximum 1 − λ/(2n) < 1 and minimum −1 + λ/(2n) > −1. From Claim 4.12, E[d_i] = 0 and Var[d_i] = (λ/(2n))·(1 − λ/(2n)). Because λ is sufficiently large, the variance is larger than (4/(9n))·log(2/β). By assumption, there are n honest users. These facts imply that Bernstein's inequality (Theorem B.3) applies:

    P[ |Σ_{i=1}^n d_i| > √( 2λ·(1 − λ/(2n))·log(2/β) ) ] < β        (8)

Define y_i := R_λ(x_i) for shorthand. Observe that

    Σ_i d_i = (Σ_i y_i) − λ/2 − (1 − λ/n)·f(x)

and therefore

    (n/(n−λ))·Σ_i d_i = (n/(n−λ))·( (Σ_i y_i) − λ/2 ) − f(x)
                      = (A_λ ∘ S)(y_1, ..., y_n) − f(x)        (S only permutes)
                      = P_{n,λ}(x) − f(x)                       (9)

Substitution of (9) in (8) yields

    P[ |P_{n,λ}(x) − f(x)| > (n/(n−λ))·√( 2λ·(1 − λ/(2n))·log(2/β) ) ] < β

The claim follows from the fact that λ > 0. This concludes the proof.

When λ is set to the piecewise function in Lemma 4.8, the error of P_λ with respect to the bit-sum is of the same order as that of the Gaussian mechanism:

Corollary 4.15. For any n ∈ ℕ, 0 < δ < 1, √3456·log(4/δ)/n < ε < 1, and δ < β < 1, there exists a λ ∈ [0, n] such that P_{n,λ} is (ε, δ)-differentially private and for every X ∈ {0,1}^n, with probability at least 1 − β,

    | P_{n,λ}(X) − Σ_{i=1}^n x_i | ≤ (30/ε)·√( log(2/β)·log(4/δ) )

Proof. Fix any X ∈ {0,1}^n. Let

    λ = { (64/ε²)·log(4/δ)                if ε ≥ √((192/n)·log(4/δ))
        { n − εn^{3/2}/√(432·log(4/δ))    otherwise

By Lemma 4.8, P_λ is (ε, δ)-private. By Theorem 4.11, P_{n,λ} is (α, β)-accurate for

    α = √(2λ·log(2/β)) · n/(n−λ)

We have to consider both ranges of ε. If ε < √((192/n)·log(4/δ)), then

    α = √(2λ·log(2/β)) · n / ( n − ( n − εn^{3/2}/√(432·log(4/δ)) ) )
      = √(2λ·log(2/β)) · √(432·log(4/δ)) / (ε√n)
      < (30/ε)·√( (λ/n)·log(2/β)·log(4/δ) )
      ≤ (30/ε)·√( log(2/β)·log(4/δ) )        (λ ≤ n)

If ε ≥ √((192/n)·log(4/δ)), then

    α = √( 2·(64/ε²)·log(4/δ)·log(2/β) ) · n/(n−λ)
      ≤ (1/ε)·√( 128·log(4/δ)·log(2/β) ) · (3/2)        (λ < n/3)
      < (17/ε)·√( log(4/δ)·log(2/β) )

Combining the two bounds completes the proof.

5 A Protocol for Sums of Real Numbers
In this section, we show how to extend our protocol to compute sums of bounded real numbers. In this case the data domain is X = [0, 1], but the function we wish to compute is still f(x) = Σ_i x_i. The main idea of the protocol is to randomly round each number x_i to a Boolean value b_i ∈ {0, 1} with expected value x_i. However, since the randomized rounding introduces additional error, we may need to round multiple times and estimate several sums. As a consequence, this protocol is not one-message.

5.1 The Protocol


Our algorithm is described in two parts: an encoder E_r that performs the randomized rounding (Algorithm 5) and a mixnet protocol P^R_{λ,r} (Algorithm 6) that is the composition of many copies of our protocol for the binary case, P^{0/1}_λ. The encoder takes a number x ∈ [0, 1] and a parameter r ∈ ℕ and outputs a vector (b_1, ..., b_r) ∈ {0,1}^r such that E[(1/r)·Σ_j b_j] = x and Var[(1/r)·Σ_j b_j] = O(1/r²). To clarify, we give two examples of the encoding procedure:

• If r = 1 then the encoder simply sets b = Ber(x). The mean and variance of b are x and x(1 − x) ≤ 1/4, respectively.

• If x = .4 and r = 4 then the encoder sets b = (1, Ber(.6), 0, 0). The mean and variance of (1/4)·(b_1 + b_2 + b_3 + b_4) are .4 and .015, respectively.

After doing the rounding, we then run the bit-sum protocol P^{0/1}_λ on the bits b_{1,j}, ..., b_{n,j} for each j ∈ [r] and average the results to obtain an estimate of the quantity

    Σ_i (1/r)·Σ_j b_{i,j} ≈ Σ_i x_i

To analyze privacy we use the fact that the protocol is a composition of bit-sum protocols, which are each private, and thus we can analyze privacy via the composition properties of differential privacy. Much like in the bit-sum protocol, we use P^R_{n,λ,r}, R^R_{n,λ,r}, A^R_{n,λ,r} to denote the real-sum protocol and its components when n users participate.

Algorithm 5: An encoder E_r(x)
    Input: x ∈ [0, 1], a parameter r ∈ ℕ.
    Output: (b_1, ..., b_r) ∈ {0,1}^r
    Let µ ← ⌈x·r⌉ and p ← x·r − µ + 1
    For j = 1, ..., r:
        b_j = { 1        if j < µ
              { Ber(p)   if j = µ
              { 0        if j > µ
    Return (b_1, ..., b_r)
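A direct Python transcription of the encoder may help make the rounding concrete (an illustrative sketch following the pseudocode exactly, including the boundary cases x = 0 and x = 1):

    import math
    import random

    def encode(x, r):
        # Randomized rounding E_r: the r bits average to an unbiased estimate of x.
        mu = math.ceil(x * r)
        p = x * r - mu + 1
        bits = []
        for j in range(1, r + 1):
            if j < mu:
                bits.append(1)
            elif j == mu:
                bits.append(1 if random.random() < p else 0)  # Ber(p)
            else:
                bits.append(0)
        return bits

    # Example from the text: x = .4, r = 4 yields (1, Ber(.6), 0, 0).
    print(encode(0.4, 4))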

Theorem 5.1. For every δ = δ(n) such that e^{−Ω(n^{1/4})} < δ(n) < 1/n and poly(log n)/n < ε < 1, and every sufficiently large n, there exist parameters λ ∈ [0, n], r ∈ ℕ such that P^R_{n,λ,r} is (ε, δ)-differentially private and, for every β > 0 and every X = (x_1, ..., x_n) ∈ [0, 1]^n,

    P[ |P^R_{n,λ,r}(X) − Σ_{i=1}^n x_i| > O( (1/ε)·log(1/δ)·√(log(1/β)) ) ] ≤ β

Algorithm 6: The protocol P^R_{λ,r} = (R^R_{λ,r}, S, A^R_{λ,r})

// Local Randomizer
R^R_{n,λ,r}(x):
    Input: x ∈ [0, 1], parameters n, r ∈ ℕ, λ ∈ (0, n).
    Output: (y_1, ..., y_r) ∈ {0,1}^r
    (b_1, ..., b_r) ← E_r(x)
    Return (y_1, ..., y_r) ← ( R^{0/1}_{n,λ}(b_1), ..., R^{0/1}_{n,λ}(b_r) )

// Analyzer
A^R_{n,λ,r}(y_{1,1}, ..., y_{n,r}):
    Input: (y_{1,1}, ..., y_{n,r}) ∈ {0,1}^{n·r}, parameters n, r ∈ ℕ, λ ∈ (0, n).
    Output: z ∈ [0, n]
    Return z ← (1/r) · (n/(n−λ)) · ( Σ_j Σ_i y_{i,j} − λ·r/2 )
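Putting the encoder and the bit randomizer together, a self-contained Python sketch of the real-sum protocol might look as follows (illustrative only; λ and r must still be chosen per Theorem 5.4 and Corollary 5.8):

    import math
    import random

    def real_sum_protocol(xs, lam, r):
        # Sketch of P^R_{λ,r}: encode each x_i into r bits (Algorithm 5), randomize each
        # bit as in Algorithm 1, shuffle all n*r messages at once, then debias (Algorithm 6).
        n = len(xs)
        messages = []
        for x in xs:
            mu = math.ceil(x * r)
            p = x * r - mu + 1
            bits = [1 if j < mu else (1 if j == mu and random.random() < p else 0)
                    for j in range(1, r + 1)]
            for b in bits:
                # R^{0/1}_{n,λ}: with probability λ/n replace the bit with a fair coin flip.
                messages.append(random.randint(0, 1) if random.random() < lam / n else b)
        random.shuffle(messages)  # a single invocation of the shuffler S
        return (1 / r) * (n / (n - lam)) * (sum(messages) - lam * r / 2)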

5.2 Warmup: r = 1

To simplify the discussion, we will first handle the case where ε < 1/√n is quite small, in which case it suffices to consider r = 1. In this case the protocol is exactly the bit-sum protocol run on the bits b_1, ..., b_n. We then have two sources of error, the rounding and the bit-sum protocol itself, and we can simply analyze the combination. The error of the rounding is bounded by Hoeffding's inequality.

Claim 5.2. For every n ∈ ℕ, x_1, ..., x_n ∈ [0, 1], and β > 0, P[ |Σ_{i=1}^n x_i − Σ_{i=1}^n E_1(x_i)| > √((1/2)·n·log(2/β)) ] ≤ β.

Using this claim, combined with Corollary 4.15, we immediately obtain the following:

Theorem 5.3. For every n ∈ ℕ, δ ∈ (0, 1), ε ∈ ( √3456·log(4/δ)/n, 1/√n ), and every β > 0, there is a λ such that the protocol P_{n,λ,1} is (ε, δ)-differentially private and for every x_1, ..., x_n ∈ [0, 1],

    P[ |P_{n,λ,1}(x) − Σ_{i=1}^n x_i| > α ] ≤ β

for

    α = O( √n + (1/ε)·√(log(1/δ)·log(1/β)) ) = O( (1/ε)·√(log(1/δ)·log(1/β)) )

Summing up, when ε < 1/√n, the error term coming from rounding is smaller than the error already present in the bit-sum protocol. Thus, we have established Theorem 5.1 for the regime where ε < 1/√n. However, when ε is larger, the bit-sum protocol has much less than √n error, so we will need to perform the more elaborate rounding with r > 1.

5.3 Privacy Analysis


Privacy will follow immediately from the composition properties of mixnet protocols (Lemma 3.6) and the privacy of the bit-sum protocol P_{n,λ}. One technical nuisance is that the composition properties are naturally stated in terms of ε, whereas the protocol is described in terms of the parameter λ, and the relationship between ε, λ, and n is somewhat complex. Thus, we will state our guarantees in terms of the level of privacy that each individual bit-sum protocol achieves with parameter λ. To this end, define the function λ*(n, ε, δ) to be the minimum value of λ such that the bit-sum protocol with n users satisfies (ε, δ)-differential privacy. We will state the privacy guarantee in terms of this function.
Theorem 5.4. For every ε, δ ∈ (0, 1) and n, r ∈ ℕ, define

    ε_0 = ε / √(8r·log(2/δ)),    δ_0 = δ/(2r),    λ* = λ*(n, ε_0, δ_0).

For every λ ≥ λ*, P^R_{n,λ,r} is (ε, δ)-differentially private.
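The reduction from the overall budget (ε, δ) to the per-copy budget is mechanical; a small Python helper transcribing the definitions in Theorem 5.4 (hypothetical helper name, an illustrative sketch):

    import math

    def per_copy_budget(eps, delta, r):
        # (ε_0, δ_0) that each of the r bit-sum protocols must satisfy (Theorem 5.4).
        return eps / math.sqrt(8 * r * math.log(2 / delta)), delta / (2 * r)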

5.4 Accuracy Analysis


In this section, we bound the error of P^R_{λ,r}(X) with respect to Σ_i x_i. Recall that f(X) = Σ_i x_i. Similar to the analysis of P^{0/1}_λ, our statements will at first be in terms of λ, r but the section will end with a statement in terms of ε, δ.

Observe that there are two sources of randomness: the encoding of the input X = (x_1, ..., x_n) as bits and the execution of R^{0/1}_{n,λ} on that encoding. We first show that the bit encoding lends itself to an unbiased and concentrated estimator of f(X). Then we show that the output of P_{n,λ,r} is concentrated around any value that estimator takes.

It will help to establish notation. Throughout this section, we fix some X = (x_1, ..., x_n) ∈ [0, 1]^n. For any i ∈ [n], the vector of bits (b_{i,1}, ..., b_{i,r}) denotes the randomized encoding E_r(x_i). The set of all such bits b_{1,1}, ..., b_{n,r} is denoted B ∈ {0,1}^{n·r}.
Claim 5.5. For every n, r ∈ ℕ, X ∈ (0, 1)^n, and 0 < β < 1,

    P_B[ |f(X) − (1/r)·Σ_{i=1}^n Σ_{j=1}^r b_{i,j}| > (√2/r)·√(n·log(2/β)) ] ≤ β        (10)

Proof. Fix some i ∈ [n]. Only one bit among b_{i,1}, ..., b_{i,r} is random, the one at index µ_i. The remainder have sum ⌊x_i·r⌋. Hence,

    | x_i − (1/r)·Σ_{j=1}^r b_{i,j} | ≤ 1/r        (11)

We show that the average (1/r)·(b_{i,1} + ··· + b_{i,r}) has expected value x_i:

    E[ (1/r)·Σ_{j=1}^r b_{i,j} ] = (1/r)·E[ b_{i,µ_i} + Σ_{j≠µ_i} b_{i,j} ]
        = (1/r)·( E[b_{i,µ_i}] + µ_i − 1 )
        = (1/r)·( p_i + µ_i − 1 )
        = (1/r)·( x_i·r − µ_i + 1 + µ_i − 1 )
        = (1/r)·x_i·r = x_i        (12)

From (11), (12), and Hoeffding's inequality (Theorem B.2), the sum over all i of x_i − (1/r)·Σ_{j=1}^r b_{i,j} is concentrated:

    P_B[ | Σ_{i=1}^n x_i − Σ_{i=1}^n (1/r)·Σ_{j=1}^r b_{i,j} | > (√2/r)·√(n·log(2/β)) ] ≤ β

which is equivalent to (10). This concludes the proof.

Condition on an encoding b1,1 , . . . , bn,r of the n real-valued inputs. When we treat the output of the
protocol as an estimator of 1r j i bi,j , we find that it is unbiased and concentrated:
P P

16
Claim 5.6. For every β > 0, n ≥ λ ≥ 9 log β2 , X ∈ (0, 1)n , r ∈ N and every fixed set of bits b1,1 , . . . , bn,r ∈ {0, 1}n·r ,
 s 

 R 1 XX n λ 2 
P  Pn,λ,r (X) − bi,j > 2 log B = b1,1 , . . . , bn,r  < β (13)

 r n−λ r β 
j i
 
λ
Proof. As in the statement, fix any b_{1,1}, …, b_{n,r} ∈ {0, 1}^{n·r}. Let d_{i,j} denote R_{n,λ}(b_{i,j}) − λ/(2n) − (1 − λ/n)·b_{i,j}. Its
value is in [−1, 1]. Applying Claim 4.12 here, we have E[d_{i,j}] = 0 and Var[d_{i,j}] = (λ/2n)·(1 − λ/2n). From
Bernstein's inequality (Theorem B.3),

    P[ |Σ_j Σ_i d_{i,j}| > √(2λr·(1 − λ/2n)·log(2/β)) ] < β.    (14)

Let y_{i,j} denote the random variable output by R_{n,λ}(b_{i,j}). Observe that

    Σ_j Σ_i d_{i,j} = (Σ_j Σ_i y_{i,j}) − λr/2 − (1 − λ/n)·Σ_j Σ_i b_{i,j},

so, multiplying both sides by (1/r)·(n/(n−λ)) and using (n/(n−λ))·(1 − λ/n) = 1,

    (1/r)·(n/(n−λ))·Σ_j Σ_i d_{i,j} = (1/r)·(n/(n−λ))·[(Σ_j Σ_i y_{i,j}) − λr/2] − (1/r)·Σ_j Σ_i b_{i,j}
        = A^R_{n,λ,r}(y_{1,1}, …, y_{n,r}) − (1/r)·Σ_j Σ_i b_{i,j}    (defn. of A^R_{n,λ,r})
        = (A^R_{n,λ,r} ∘ S)(y_{1,1}, …, y_{n,r}) − (1/r)·Σ_j Σ_i b_{i,j}    (S only permutes)
        = (A^R_{n,λ,r} ∘ S ∘ R^{0/1}_{n,λ})(b_{1,1}, …, b_{n,r}) − (1/r)·Σ_j Σ_i b_{i,j}.    (15)

When executing P^R_{n,λ,r}, condition on B = b_{1,1}, …, b_{n,r}. By substituting (15) into (14), we have that, conditioned on B = b_{1,1}, …, b_{n,r},

    P[ |P^R_{n,λ,r}(X) − (1/r)·Σ_j Σ_i b_{i,j}| > (n/(n−λ))·√((2λ/r)·(1 − λ/2n)·log(2/β))  |  B = b_{1,1}, …, b_{n,r} ] < β.

Now, (13) follows from the fact that λ > 0, so that 1 − λ/(2n) < 1 and the threshold in (13) is no smaller. This concludes the proof.
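The chain of equalities above also pins down what the analyzer A^R_{n,λ,r} does: sum all n·r reported bits, subtract the λr/2 bits of expected noise, and rescale by (1/r)·n/(n − λ). A minimal sketch of that debiasing step (our own rendering, assuming, as in the Section 4 randomizer, that each bit is replaced by a uniformly random bit with probability λ/n):

def analyzer(reports: list[int], n: int, lam: float, r: int) -> float:
    """Debiased estimator A^R_{n,lam,r}: the raw sum of the n*r reported bits
    carries an expected lam*r/2 of pure noise and a (1 - lam/n) attenuation
    of the true bits; both are undone here."""
    return (1.0 / r) * (n / (n - lam)) * (sum(reports) - lam * r / 2.0)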

By a union bound over (10) and (13), we obtain:

Corollary 5.7. For every β > 0, n ≥ λ ≥ (16/9)·log(2/β), r ∈ ℕ, and X ∈ [0, 1]ⁿ,

    P[ |P^R_{n,λ,r}(X) − f(X)| ≥ (√2/r)·√(n·log(2/β)) + (n/(n−λ))·√((2λ/r)·log(2/β)) ] < 2β.
The above bounds the error of P^R_{n,λ,r} in terms of λ and r. When λ and r are chosen such that P^R_{n,λ,r} satisfies
(ε, δ)-differential privacy, we can bound the error of P^R_{n,λ,r} in terms of ε and δ. Theorem 5.1 follows from the
following statement:

Corollary 5.8. For any n > 10⁴, δ ∈ (8e^{−0.03n}, 1/n), β ∈ (δ, 1), and ε ∈ ((122/n^{1/4})·√(log(8/δ)·log(2/β)), 1), there exist parameters
λ ∈ [0, n] and r ∈ ℕ such that P_{n,λ,r} is (ε, δ)-differentially private and, for any X ∈ [0, 1]ⁿ,

    P[ |P^R_{n,λ,r}(X) − f(X)| > (122/ε)·log(8/δ)·√(log(2/β)) ] < 2β.

Proof. Let r = ⌈ε·√n⌉. Define

    ε₀ = ε/√(8r·log(2/δ)),   δ₀ = δ/(2r).

In an identical fashion to Lemma 4.8, assign λ such that P^{0/1}_{n,λ} satisfies (ε₀, δ₀)-differential privacy. From Theorem 5.4,
this in turn implies that P_{n,λ,r} is (ε, δ)-differentially private.

For these values of r and λ, Corollary 5.7 bounds the error as

    P[ |P^R_{n,λ,r}(X) − f(X)| ≥ (√2/r)·√(n·log(2/β)) + (n/(n−λ))·√((2λ/r)·log(2/β)) ] ≤ 2β.    (16)

Because r ≥ ε·√n, it is immediate from substitution of r that

    (√2/r)·√(n·log(2/β)) ≤ (√2/ε)·√(log(2/β)).    (17)

Following the same steps as the proof of Corollary 4.15, it can be shown that

    (n/(n−λ))·√((2λ/r)·log(2/β)) ≤ (30/ε₀)·(1/r)·√(log(4/δ₀)·log(2/β))
        = 30·(√(8r·log(2/δ))/(ε·r))·√(log(4/δ₀)·log(2/β))    (defn. of ε₀)
        = (30/ε)·√((8/r)·log(2/δ)·log(4/δ₀)·log(2/β))
        = (30/ε)·√((8/r)·log(2/δ)·log(8r/δ)·log(2/β))    (defn. of δ₀)
        = (30/ε)·√((8/r)·log(2/δ)·log(8ε√n/δ)·log(2/β))    (defn. of r)
        < (30/ε)·√(16·log(8/δ)·log(8/δ)·log(2/β))    (ε < 1, δ < 8/√n)
        = (120/ε)·log(8/δ)·√(log(2/β)).    (18)

Observe that the ratio of the bound in (18) to the bound in (17) is 60√2·log(8/δ) < 85·log(8/δ). Hence,

    (17) + (18) < (17)·(1 + 85·log(8/δ))
        < (17)·86·log(8/δ)    (δ < 1)
        = (86√2/ε)·log(8/δ)·√(log(2/β))
        < (122/ε)·log(8/δ)·√(log(2/β)).

The above can be substituted into (16) and we arrive at

    P[ |P^R_{n,λ,r}(X) − f(X)| ≥ (122/ε)·log(8/δ)·√(log(2/β)) ] ≤ 2β,

which is precisely the target claim.

6 Lower Bounds for the Mixnet Model


In this section, we prove separations between central model algorithms and mixnet model protocols
where each user’s local randomizer is identical and sends one indivisible message to the shuffler (the
one-message model).

Theorem 6.1 (Mixnet-to-Local Transformation). Let P_S be a protocol in the one-message mixnet model that is

• (ε_S, δ_S)-differentially private in the mixnet model for some ε_S ≤ 1 and δ_S = δ_S(n) < n^{−8}, and

• (α, β)-accurate with respect to f for some β = Ω(1).

Then there exists a protocol P_L in the local model that is

• (ε_L, 0)-differentially private in the local model for ε_L = 8(ε_S + ln n), and

• (α, 4β)-accurate with respect to f (when n is larger than some absolute constant).

This means that an impossibility result for approximating f in the local model implies a related
impossibility result for approximating f in the mixnet model. In Section 6.2 we combine this result with
existing lower bounds for local differential privacy to obtain several strong separations between the
central model and the one-message mixnet model.
The key to Theorem 6.1 is to show that if PS = (RS , S, AS ) is a protocol in the one-message mixnet
model satisfying (εS , δS )-differential privacy, then the algorithm RS itself satisfies (εL , δS )-differential
privacy without use of the shuffler S. Therefore, the local protocol PL = (RS , AS ◦ S) is (εL , δS )-private in
the local model and has the exact same output distribution, and thus the exact same accuracy, as PS . To
complete the proof, we use (a slight generalization of) a transformation of Bun, Nelson, and Stemmer [8]
to turn R_S into a related algorithm R′ satisfying (8(ε_S + ln n), 0)-differential privacy with only a slight loss
of accuracy. We prove the latter result in Appendix A.
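The first step of this transformation is purely a re-wiring, and a short sketch may make it concrete (our own illustration with hypothetical names): the shuffler is folded into the analyzer, so the server permutes the reports itself before running A_S.

import random

def local_protocol_from_mixnet(R_S, A_S):
    """Given a one-message mixnet protocol (R_S, S, A_S), return the local
    protocol P_L = (R_S, A_S o S): the same randomizer, plus an analyzer
    that applies a uniformly random permutation before calling A_S."""
    def analyzer_with_shuffle(reports):
        shuffled = list(reports)
        random.shuffle(shuffled)      # the shuffler S, now run server-side
        return A_S(shuffled)
    return R_S, analyzer_with_shuffle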

6.1 Mixnet Randomizers Satisfy Local Differential Privacy


The following theorem is the key step in the proof of Theorem 6.1; it states that for any symmetric
mixnet protocol, the local randomizer R must satisfy local differential privacy with weak, but still
non-trivial, privacy parameters.

Theorem 6.2. Let P = (R, S, A) be a protocol in the one-message mixnet model. If n ∈ ℕ is such that P_n satisfies
(ε_S, δ_S)-differential privacy, then the algorithm R satisfies (ε_L, δ_S)-differential privacy for ε_L = ε_S + ln n. Therefore,
the symmetric local protocol P_L = (R, A ∘ S) satisfies (ε_L, δ_S)-differential privacy.

Proof. By assumption, P_n is (ε_S, δ_S)-differentially private. Let ε be the supremum of all values ε′ such that R : X → 𝒴 is not (ε′, δ_S)-differentially
private. We will bound ε. If R is not (ε, δ_S)-differentially private, there exist a set Y ⊂ 𝒴
and inputs x, x′ ∈ X such that

    P[R(x′) ∈ Y] > exp(ε)·P[R(x) ∈ Y] + δ_S.

For brevity, define p := P[R(x) ∈ Y] and p′ := P[R(x′) ∈ Y], so that we have

    p′ > exp(ε)·p + δ_S.    (19)

We will show that if ε is too large, then (19) implies that P_n is not (ε_S, δ_S)-differentially private,
which contradicts our assumption. To this end, define the set 𝒲 := {(w₁, …, wₙ) ∈ 𝒴ⁿ | ∃i, w_i ∈ Y}, and define two
neighboring datasets X ∼ X′ as

    X := (x, …, x)  (n copies of x)   and   X′ := (x′, x, …, x)  (x′ followed by n − 1 copies of x).
Because P_n is (ε_S, δ_S)-differentially private,

    P[P_n(X′) ∈ 𝒲] ≤ exp(ε_S)·P[P_n(X) ∈ 𝒲] + δ_S.    (20)

Now we have

    P[P_n(X) ∈ 𝒲] = P[S(R(x), …, R(x)) ∈ 𝒲]
        = P[(R(x), …, R(x)) ∈ 𝒲]    (𝒲 is symmetric)
        = P[∃i such that the i-th message lies in Y]
        ≤ n·P[R(x) ∈ Y]    (union bound)
        = np,

where the second equality holds because the set 𝒲 is closed under permutation, so we can remove the
random permutation S without changing the probability. Similarly, we have

    P[P_n(X′) ∈ 𝒲] = P[(R(x′), R(x), …, R(x)) ∈ 𝒲]
        ≥ P[R(x′) ∈ Y] = p′ > exp(ε)·p + δ_S.    (by (19))

Now, plugging the previous two inequalities into (20), we have

    exp(ε)·p + δ_S < P[P_n(X′) ∈ 𝒲] ≤ exp(ε_S)·P[P_n(X) ∈ 𝒲] + δ_S ≤ exp(ε_S)·np + δ_S.

By rearranging and canceling terms in the above, we obtain the conclusion

    ε ≤ ε_S + ln n.

Therefore R must satisfy (ε_S + ln n, δ_S)-differential privacy.

Claim 6.3. If the mixnet protocol P_S = (R, S, A) is (α, β)-accurate for some function f, then the local protocol
P_L = (R, A ∘ S) is (α, β)-accurate for f, where

    (A ∘ S)(y₁, …, yₙ) = A(S(y₁, …, yₙ)).

We do not present a proof of Claim 6.3, as it is immediate that the distributions of P_S(x) and P_L(x) are
identical, since A ∘ S incorporates the shuffler.
We conclude this section with a slight extension of a result of Bun, Nelson, and Stemmer [8] showing
how to transform any local algorithm satisfying (ε, δ)-differential privacy into one satisfying (O(ε), 0)-
differential privacy with only a small decrease in accuracy. Our extension covers the case where ε > 2/3,
whereas their result as stated requires ε ≤ 1/4.

Theorem 6.4 (Extension of [8]). Suppose the local protocol P_L = (R, A) is (ε, δ)-differentially private and (α, β)-accurate
with respect to f. If ε > 2/3 and

    δ < β/(8n·ln(n/β)·exp(6ε)),

then there exists another local protocol P′_L = (R′, A) that is (8ε, 0)-differentially private and (α, 4β)-accurate with
respect to f.

Theorem 6.1 now follows by combining Theorem 6.2 and Claim 6.3 with Theorem 6.4.

6.2 Applications of Theorem 6.1


In this section, we define two problems and present known lower bounds in the central and local
models. By applying Theorem 6.1, we derive lower bounds in the one-message mixnet model. These
bounds imply large separations between the central and one-message mixnet models.

6.2.1 The Selection Problem


We define the selection problem as follows. The data universe is X = {0, 1}^d, where d is the dimension
of the problem and the main parameter of interest. Given a dataset x = (x₁, …, xₙ) ∈ X^n, the goal is to
identify a coordinate j such that the sum of the users' j-th bits is approximately as large as possible; that
is, a coordinate j ∈ [d] such that

    Σ_{i=1}^n x_{i,j} ≥ max_{j′∈[d]} Σ_{i=1}^n x_{i,j′} − n/10.    (21)

We say that an algorithm solves the selection problem with probability 1 − β if for every dataset x, with
probability at least 1 − β, it outputs j satisfying (21).
We would like to understand the minimum n (as a function of d) such that there is a differentially
private algorithm that can solve the selection problem with constant probability of failure. We remark
that this is a very weak notion of accuracy, but since we are proving a negative result, using a weak
notion of accuracy only strengthens our results.
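As a concrete reading of guarantee (21), the following sketch (our own verification aid, not part of any protocol) checks whether a reported coordinate j is acceptable for a given dataset.

def solves_selection(x: list[list[int]], j: int) -> bool:
    """Check (21): the column sum at coordinate j must be within n/10 of
    the largest column sum, where x is an n-by-d matrix of 0/1 values."""
    n, d = len(x), len(x[0])
    col_sums = [sum(row[col] for row in x) for col in range(d)]
    return col_sums[j] >= max(col_sums) - n / 10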

Function (Parameters)        | Central                  | Local         | One-Message Mixnet (this paper) | General Mixnet (this paper)
-----------------------------+--------------------------+---------------+---------------------------------+----------------------------
Mean, X = {0,1} (Accuracy α) | Θ(1/(αε))                | Θ(1/(α²ε²))   | O(√(log(1/δ))/(αε))             |
Mean, X = [0,1] (Accuracy α) |                          |               | O(1/α² + √(log(1/δ))/(αε))      | O(log(1/δ)/(αε))
Selection (Dimension d)      | Θ(log d)                 | Θ(d·log d)    | Ω(d^{1/17})                     | O(√d·log d·log(d/δ))
Histograms (Domain Size D)   | Θ(min{log(1/δ), log D})  | Θ(log D)      | Ω(log^{1/17} D)                 | O(√(log D))

Table 1: Comparisons Between Models. When a parameter is unspecified, the reader may substitute
ε = 1, δ = 0, α = β = .01. All results are presented as the minimum dataset size n for which we can hope
to achieve the desired privacy and accuracy, as a function of the relevant parameter for the problem.

The following lower bound for locally differentially private protocols for selection is from [29], and is
implicit in the work of [14].²

Theorem 6.5. If P_L = (R_L, A_L) is a local protocol that satisfies (ε, 0)-differential privacy and P_L solves the selection
problem with probability 9/10 for datasets x ∈ ({0, 1}^d)^n, then n = Ω(d·log d/(e^ε − 1)²).

By applying Theorem 6.1 we immediately obtain the following corollary.


Corollary 6.6. If P_S = (R_S, S, A_S) is a (1, δ)-differentially private protocol in the one-message mixnet model, for
δ = δ(n) < n^{−8}, and P_S solves the selection problem with probability 99/100, then n = Ω((d·log d)^{1/17}).

Using a multi-message mixnet protocol³, we can solve selection with Õ((1/ε)·√d) samples. By contrast, in
the local model n = Θ((1/ε²)·d·log d) samples are necessary and sufficient. In the central model, this problem is
solved by the exponential mechanism [24] with a dataset of size just n = O((1/ε)·log d), and this is optimal [3, 27].
These results are summarized in Table 1.

6.2.2 Histograms

We define the histogram problem as follows. The data universe is X = [D], where D is the domain size of
the problem and the main parameter of interest. Given a dataset x = (x₁, …, xₙ) ∈ X^n, the goal is to build
a vector of size D such that for every j ∈ [D] the j-th element is as close as possible to the frequency of j in x; that is, a
vector v ∈ [0, n]^D such that

    max_{j∈[D]} | v_j − Σ_{i=1}^n 1(x_i = j) | ≤ n/10,    (22)

where 1(conditional) is defined to be 1 if conditional evaluates to true and 0 otherwise.
Similar to the selection problem, an algorithm solves the histogram problem with probability 1 − β if for
every dataset x, with probability at least 1 − β, it outputs v satisfying (22). We would like to find the
minimum n such that a differentially private algorithm can solve the histogram problem; the following
lower bound for locally differentially private protocols for histograms is from [4].

² These works assume that the dataset x consists of independent samples from some distribution D, and define accuracy for
selection with respect to the mean of that distribution. By standard arguments, a lower bound for the distributional version implies a
lower bound for the version we have defined.
³ The idea is to simulate multiple rounds of our protocol for binary sums, one round per dimension.

Theorem 6.7. If P_L = (R_L, A_L) is a local protocol that satisfies (ε, 0)-differential privacy and P_L solves the histogram
problem with probability 9/10 for any x ∈ [D]^n, then n = Ω(log D/(e^ε − 1)²).

By applying Theorem 6.1, we immediately obtain the following corollary.

Corollary 6.8. If P_S = (R_S, S, A_S) is a (1, δ)-differentially private protocol in the one-message mixnet model, for
δ = δ(n) < n^{−8}, and P_S solves the histogram problem with probability 99/100, then n = Ω(log^{1/17} D).

In the mixnet model, we can solve this problem using our protocol for bit-sums by having each user
encode their data as a "histogram" of just their value x_i ∈ [D] and then running the bit-sum protocol D
times, once for each value j ∈ [D], which incurs error O((1/ε)·√(log(1/δ)·log D)).⁴ But in the central model, this
problem can be solved with error O(min{log(1/δ), log D}), which is optimal (see, e.g., [30]). Thus, the central
and one-message mixnet models are qualitatively different with respect to computing histograms: D may
be infinite in the former, whereas D must be bounded in the latter.

⁴ Note that changing one user's data can only change two entries of their local histogram, so we only have to scale ε and δ by a factor
of 2 rather than a factor that grows with D.
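Concretely, the reduction just described runs one bit-sum instance per domain element on one-hot encodings of the users' values. A minimal sketch (our own illustration; bit_sum_protocol stands in for the private bit-sum protocol of this paper, treated as a black box):

def histogram_via_bit_sums(data: list[int], D: int, bit_sum_protocol) -> list[float]:
    """Estimate the histogram of data over domain [D] by running a private
    bit-sum protocol once per value j on the indicator bits 1(x_i = j).
    Changing one user's value alters at most two of their D indicator bits,
    which is why the privacy budget only needs to be scaled by 2."""
    return [bit_sum_protocol([1 if x == j else 0 for x in data]) for j in range(D)]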

References
[1] Apple tries to peek at user habits without violating privacy. The Wall Street Journal, 2016.

[2] J. M. Abowd. The U.S. Census Bureau adopts differential privacy. In Proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD ’18, pages 2867–2867,
New York, NY, USA, 2018. ACM.

[3] M. Bafna and J. Ullman. The price of selection in differential privacy. In Conference on Learning
Theory, pages 151–168, 2017.

[4] R. Bassily and A. Smith. Local, private, efficient protocols for succinct histograms. In Proceedings of
the Forty-Seventh Annual ACM on Symposium on Theory of Computing, pages 127–135. ACM, 2015.

[5] A. Beimel, K. Nissim, and E. Omri. Distributed private data analysis: Simultaneously solving how
and what. In Annual International Cryptology Conference, pages 451–468. Springer, 2008.

[6] A. Bittau, U. Erlingsson, P. Maniatis, I. Mironov, A. Raghunathan, D. Lie, M. Rudominer, U. Kode,


J. Tinnes, and B. Seefeld. PROCHLO: Strong privacy for analytics in the crowd. In Proceedings of the
Symposium on Operating Systems Principles (SOSP), 2017.

[7] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal,


and K. Seth. Practical secure aggregation for privacy preserving machine learning. IACR Cryptology
ePrint Archive, 2017.

[8] M. Bun, J. Nelson, and U. Stemmer. Heavy hitters and the structure of local privacy. In ACM
SIGMOD/PODS Conference International Conference on Management of Data (PODS 2018), 2018.

[9] T. H. Chan, E. Shi, and D. Song. Optimal lower bound for differentially private multi-party
aggregation. In Algorithms - ESA 2012 - 20th Annual European Symposium, Ljubljana, Slovenia, September
10-12, 2012. Proceedings, pages 277–288, 2012.

[10] T.-H. H. Chan, E. Shi, and D. Song. Privacy-preserving stream aggregation with fault tolerance. In
Financial Cryptography, pages 200–214, 2012.

[11] D. L. Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms. Commun.
ACM, 24(2):84–90, Feb. 1981.

[12] G. Chen, S. Chen, Y. Xiao, Y. Zhang, Z. Lin, and T. H. Lai. Sgxpectre attacks: Leaking enclave secrets
via speculative execution. CoRR, abs/1802.09085, 2018.

[13] H. Corrigan-Gibbs and D. Boneh. Prio: Private, robust, and scalable computation of aggregate
statistics. In Proceedings of the 14th USENIX Conference on Networked Systems Design and Implementation,
NSDI’17, pages 259–282, Berkeley, CA, USA, 2017. USENIX Association. ISBN 978-1-931971-37-9.

[14] J. C. Duchi, M. I. Jordan, and M. J. Wainwright. Local privacy and statistical minimax rates. In
Foundations of Computer Science (FOCS), 2013 IEEE 54th Annual Symposium on, pages 429–438. IEEE,
2013.

[15] C. Dwork, K. Kenthapadi, F. McSherry, I. Mironov, and M. Naor. Our data, ourselves: Privacy via
distributed noise generation. In EUROCRYPT, 2006.

[16] C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data
analysis. In Theory of Cryptography Conference (TCC), 2006.

[17] C. Dwork, G. N. Rothblum, and S. P. Vadhan. Boosting and differential privacy. In FOCS, pages
51–60. IEEE, 2010.

[18] Ú. Erlingsson, V. Pihur, and A. Korolova. RAPPOR: Randomized aggregatable privacy-preserving
ordinal response. In ACM Conference on Computer and Communications Security (CCS), 2014.

[19] A. Evfimievski, J. Gehrke, and R. Srikant. Limiting privacy breaches in privacy preserving data
mining. In PODS, pages 211–222. ACM, 2003.

[20] S. P. Kasiviswanathan and A. Smith. On the 'semantics' of differential privacy: A Bayesian formulation. CoRR, arXiv:0803.3946 [cs.CR], 2008.

[21] S. P. Kasiviswanathan, H. K. Lee, K. Nissim, S. Raskhodnikova, and A. Smith. What can we learn
privately? In Foundations of Computer Science (FOCS). IEEE, 2008.

[22] M. J. Kearns. Efficient noise-tolerant learning from statistical queries. In STOC, pages 392–401. ACM,
May 16-18 1993.

[23] A. Kwon, D. Lazar, S. Devadas, and B. Ford. Riffle: An efficient communication system with strong
anonymity. PoPETs, 2016(2):115–134, 2016.

[24] F. McSherry and K. Talwar. Mechanism design via differential privacy. In IEEE Foundations of
Computer Science (FOCS), 2007.

[25] E. Shi, T. H. Chan, E. G. Rieffel, R. Chow, and D. Song. Privacy-preserving aggregation of time-series
data. In Proceedings of the Network and Distributed System Security Symposium, (NDSS) 2011, 2011.

[26] A. Smith. Differential privacy and the secrecy of the sample, 2009.

[27] T. Steinke and J. Ullman. Tight lower bounds for differentially private selection. In Foundations of
Computer Science (FOCS), 2017 IEEE 58th Annual Symposium on, pages 552–563. IEEE, 2017.

[28] A. G. Thakurta, A. H. Vyrros, U. S. Vaishampayan, G. Kapoor, J. Freudiger, V. R. Sridhar, and


D. Davidson. Learning new words, May 9 2017. US Patent 9,645,998.

[29] J. Ullman. Tight lower bounds for locally differentially private selection. CoRR, abs/1802.02638,
2018.

[30] S. Vadhan. The complexity of differential privacy. https://privacytools.seas.harvard.edu/publications/complexity-differential-privacy, 2016.

[31] S. L. Warner. Randomized response: A survey technique for eliminating evasive answer bias. Journal
of the American Statistical Association, 60(309):63–69, 1965.

[32] M. Zhilyaev and D. Zeber. Sufficient differential privacy. Unpublished manuscript. Available at
https://github.com/mozilla/k-randomization/blob/dave/notes/sufficient-dp, December 2017.

A From Approximate DP to Pure DP for Local Protocols


In this section, we show that for any (ε, δ)-private local protocol, there exists an (8ε, 0)-private
counterpart with roughly the same accuracy guarantees. [8] proved this theorem when ε ≤ 1/4, but we
need the theorem when ε ≫ 1. Our proof follows their approach almost exactly, but we include it for
completeness to verify that their result can be modified to hold for larger ε.

Theorem A.1 (Extension of [8]). Let P_n = (R_n, A_n) be a local protocol for n ≥ 3 users that is (ε, δ)-differentially
private and (α, β)-accurate with respect to f. If ε > 2/3 and

    δ < β/(8n·ln(n/β)·e^{6ε}),

then there exists another local protocol that is (8ε, 0)-differentially private and (α, 4β)-accurate with respect to f.

The conditions ε > 2/3 and n ≥ 3 are not essential, but are used to simplify the statement. We will
prove Theorem A.1 by construction: given a local randomizer R, Algorithm 7 transforms it into another
randomizer R_{k,T}. The parameters will be set later to achieve the desired privacy and accuracy.

Algorithm 7: A local randomizer R_{k,T}

Input: x ∈ X; parameters k ∈ (0, 2e^{−2ε}) and T ∈ ℕ; black-box access to R : X → 𝒴
Output: y_{k,T} ∈ 𝒴

Let c be some (publicly known) fixed element of X.
Define GoodInt := [½·exp(−2ε), ½·exp(2ε)]
For t ∈ [T]:
    v_t ← R(c)
    p_t ← ½ · P[R(x) = v_t] / P[R(c) = v_t]
    If p_t ∉ GoodInt: p_t ← ½
    b_t ← Ber(p_t · k)
If ∃t, b_t = 1: sample j uniformly over {t ∈ [T] | b_t = 1}
Else: sample j uniformly over [T]
Return y_{k,T} ← v_j
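A direct Python rendering of Algorithm 7 may help (our own sketch, not code from the paper). It assumes black-box access to R through a sampler and through its output probabilities: sample_R(z) draws from R(z) and prob_R(z, v) returns P[R(z) = v], both hypothetical helpers that the algorithm needs in order to form p_t.

import math
import random

def R_kT(x, eps: float, k: float, T: int, sample_R, prob_R, c):
    """Algorithm 7: given an (eps, delta)-DP local randomizer R (accessed via
    sample_R and prob_R) and a fixed public element c, produce a pure-DP
    randomizer. Requires 0 < k < 2*exp(-2*eps)."""
    lo, hi = 0.5 * math.exp(-2 * eps), 0.5 * math.exp(2 * eps)   # GoodInt
    vs, bs = [], []
    for _ in range(T):
        v = sample_R(c)                                # v_t ~ R(c)
        p = 0.5 * prob_R(x, v) / prob_R(c, v)          # p_t
        if not (lo <= p <= hi):                        # p_t outside GoodInt
            p = 0.5
        vs.append(v)
        bs.append(1 if random.random() < p * k else 0) # b_t ~ Ber(p_t * k)
    ones = [t for t in range(T) if bs[t] == 1]
    j = random.choice(ones) if ones else random.randrange(T)
    return vs[j]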
A.1 Privacy Analysis

First, we establish that the transformed local randomizer is indeed (8ε, 0)-differentially private.

Claim A.2. For any ε > 0, any (ε, δ)-differentially private algorithm R, any k ∈ (0, 2e^{−2ε}), and any T ∈ ℕ, the
algorithm R_{k,T} is (8ε, 0)-differentially private.

Proof. Define L := ½·exp(−2ε)·k and U := ½·exp(2ε)·k. Note that [L, U] ⊂ [0, 1] and that P[b_t = 1] ∈ [L, U] for every t.
Fix x ∼ x′ ∈ X, j ∈ [T], and y ∈ 𝒴. Let V_y = {V ∈ 𝒴^T | ∃j, v_j = y}. Then
    P[R_{k,T}(x) = y] = P[R_{k,T}(x) = y ∩ V ∈ V_y]
        = Σ_{V∈V_y} P[V = V]·P[R_{k,T}(x) = y | V = V]
        = Σ_{V∈V_y} P[V = V]·( Σ_{{j : v_j = y}} P[j = j | V = V] ).    (23)

Recall that b_j takes a random binary value and that the distribution of j depends on these bits. In the
analysis below, we omit the condition that V = V for brevity:

    P[j = j] = P[j = j ∩ b_j = 0] + P[j = j ∩ b_j = 1].    (24)

We will upper bound each summand separately. If b_j is set to zero, then the only way for the user to
choose j is for all other bits b_t to be zero as well, and in that case the choice is uniform over [T]:

    P[j = j ∩ b_j = 0] = (1/T)·P[∀t, b_t = 0]
        = (1/T)·Π_{t=1}^T P[b_t = 0]    (independence)
        ≤ (1/T)·Π_{t=1}^T (1 − L)    (P[b_t = 1] ≥ L)
        = (1/T)·(1 − L)^T.    (25)
If b_j is set to one, then j is uniform over the indices whose bits are set to one, the number of which is itself a random variable:

    P[j = j ∩ b_j = 1] = P[b_j = 1]·P[j = j | b_j = 1]
        = P[b_j = 1]·Σ_{s=1}^{T} (1/s)·P[Σ_{t≠j} b_t = s − 1]
        = P[b_j = 1]·Σ_{s=0}^{T−1} (1/(s+1))·P[Σ_{t≠j} b_t = s]
        = P[b_j = 1]·E[1/(1 + Σ_{t≠j} b_t)].    (26)
Observe that the term Σ_{t≠j} b_t is a sum of Bernoulli random variables with different expectations, all
residing in [L, U]. As a corollary of Claim 6.3 in [8], we have the following:

Claim A.3. If random variables b₁, …, b_T are each drawn independently from Ber(p₁), …, Ber(p_T), where L ≤
p_t ≤ U for every t ∈ [T], then

    E[1/(1 + Bin(T, U))] ≤ E[1/(1 + Σ_t b_t)] ≤ E[1/(1 + Bin(T, L))].
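Claim A.3 is a stochastic-dominance statement about the binomial endpoints, and it is easy to sanity-check numerically; the sketch below (our own verification aid) computes E[1/(1 + Bin(T, p))] by direct summation.

from math import comb

def expected_inv_one_plus_binom(T: int, p: float) -> float:
    """E[1/(1+Bin(T,p))] by direct summation; by the same binomial identity
    used to reach (27), this equals (1 - (1-p)**(T+1)) / ((T+1)*p) for p > 0."""
    return sum(comb(T, s) * p**s * (1 - p)**(T - s) / (s + 1) for s in range(T + 1))

# Monotonicity behind Claim A.3: a larger success probability can only
# shrink the expectation.
T, L, U = 50, 0.01, 0.05
assert expected_inv_one_plus_binom(T, U) <= expected_inv_one_plus_binom(T, L)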

Hence,

    (26) ≤ P[b_j = 1]·E[1/(1 + Bin(T − 1, L))]
        = P[b_j = 1]·Σ_{s=0}^{T−1} (1/(s+1))·(T−1 choose s)·L^s·(1 − L)^{T−1−s}
        = P[b_j = 1]·(1/(TL))·(1 − (1 − L)^T)
        ≤ (U/(TL))·(1 − (1 − L)^T).    (27)
From (25), (27), and (24),

    P[j = j] ≤ (U/(TL))·(1 − (1 − L)^T) + (1/T)·(1 − L)^T
        ≤ (U/(TL))·(1 − (1 − L)^T) + (U/(TL))·(1 − L)^T    (U > L)
        = U/(TL).    (28)
Recall (23):

    P[R_{k,T}(x) = y] = Σ_{V∈V_y} P[V = V]·( Σ_{{j : v_j = y}} P[j = j | V = V] )
        < (U/(TL))·Σ_{V∈V_y} P[V = V]·#{j : v_j = y}    (from (28))
        = (U/(TL))·E_{V←R(c)^T}[#{j : v_j = y}],    (29)

where we use #{j : v_j = y} to indicate the number of elements of V that have value y.
We remark that the distribution of V is wholly independent of the private value x. Hence, by a completely
symmetric series of steps,

    P[R_{k,T}(x′) = y] > (L/(TU))·E_{V←R(c)^T}[#{j : v_j = y}].    (30)

From (29) and (30),

    P[R_{k,T}(x) = y] / P[R_{k,T}(x′) = y] < U²/L² = ((1/4)·exp(4ε)·k²)/((1/4)·exp(−4ε)·k²) = exp(8ε),

which completes the proof.

A.2 Accuracy Analysis

Next we show that, for suitable parameters, the protocol P_{n,k,T} = (R_{n,k,T}, A_n) remains essentially as
accurate as P_n = (R_n, A_n).

Claim A.4. If P_n = (R_n, A_n) is (α, β)-accurate for f, and R_n is (ε, δ)-differentially private for ε > 2/3 and

    δ < β/(8n·ln(n/β)·exp(6ε)),

then there exist T ∈ ℕ and k ∈ (0, 2e^{−2ε}) such that the local protocol P_{n,k,T} = (R_{n,k,T}, A_n) is (α, 4β)-accurate for f.

Claim A.2 and Claim A.4 together imply Theorem 6.4.


For the purposes of this section, fix any X ∈ X^n where X = (x₁, …, xₙ). Let y_{k,T}[i] denote⁵ the random
output of R_{n,k,T}(x_i), and let Y_{k,T} denote the ordered set y_{k,T}[1], …, y_{k,T}[n]. Let y[i] denote the random
output of R_n(x_i), and let Y denote the ordered set y[1], …, y[n].

As described in Algorithm 7, R_{k,T} defines variables b_t, v_t for every t ∈ [T]. In the context of P_n, there
are 2n·T such variables, 2T for each user i; we use b_{i,t}, v_{i,t} to disambiguate between users.

As a first step to proving Claim A.4, we show that the distribution of Y_{k,T} is similar to that of Y
(Claim A.5). Then we show that running the same analysis function A_n on both Y and Y_{k,T} yields similar
accuracy guarantees (Claim A.8). The notion of "similar" is statistical distance, quantified in terms of the parameters
n, ε, δ, k, T; when the parameters are suitably constrained, the distance simplifies to 3β. Because β + 3β = 4β, we
obtain (α, 4β)-accuracy (Claim A.9). We close the section with a particular setting of k and T that achieves such a
bound.

As we have stated, we start by relating Y_{k,T} to Y:
Claim A.5. If ε > 0, 0 < δ < (1 − exp(−ε))/(4·exp(ε)·n), T ∈ ℕ, and k ∈ (0, 2e^{−2ε}), then for any 𝒲 ⊆ 𝒴ⁿ,

    P[Y_{k,T} ∈ 𝒲] < P[Y ∈ 𝒲] + n·(1 − ½·exp(−2ε)·k)^T + (T + 2)·2nδ·exp(ε)/(1 − exp(−ε)).    (31)

Proof. Consider an execution of the protocol P_{n,k,T}(X). Let E₁ denote the event that for some user i, all
bits b_{i,t} are set to 0:

    P[E₁] = P[∃i, ∀t, b_{i,t} = 0]
        ≤ n·max_{i∈[n]} P[∀t, b_{i,t} = 0]    (union bound)
        ≤ n·(1 − ½·exp(−2ε)·k)^T.    (32)

Recall that GoodInt = [½·e^{−2ε}, ½·e^{2ε}]. For any x, x′ ∈ X, let Good(x, x′) ⊂ 𝒴 denote the set consisting of all
τ satisfying ½·P[R_n(x′) = τ]/P[R_n(x) = τ] ∈ GoodInt. The following is a property of private algorithms:

Lemma A.6. Fix a value of c ∈ X. If R : X → 𝒴 is (ε, δ)-differentially private, then for any x ∈ X,

    P[R(c) ∉ Good(c, x)] ≤ 2δ·exp(ε)/(1 − exp(−ε)).
⁵ Brackets are used for indices in order to avoid collision with the parameters k, T in the subscript.

A proof can be found in [4, Claim 5.4]. Let E₂ denote the event that for some i and some t, v_{i,t} ∉
Good(c, x_i). Then

    P[E₂] ≤ Σ_{i=1}^n Σ_{t=1}^T P[v_{i,t} ∉ Good(c, x_i)]    (union bound)
        ≤ nT·max_{i∈[n]} P[R(c) ∉ Good(c, x_i)]    (v_{i,t} ← R(c))
        ≤ 2nTδ·exp(ε)/(1 − exp(−ε)).    (33)

The last inequality is an application of Lemma A.6 to the (ε, δ)-private R_n.
Fix any 𝒲 ⊂ 𝒴ⁿ. Then

    P[Y_{k,T} ∈ 𝒲] ≤ P[Y_{k,T} ∈ 𝒲 ∩ E₁] + P[Y_{k,T} ∈ 𝒲 ∩ E₂] + P[Y_{k,T} ∈ 𝒲 ∩ (¬E₁ ∩ ¬E₂)]
        ≤ P[E₁] + P[E₂] + P[Y_{k,T} ∈ 𝒲 ∩ (¬E₁ ∩ ¬E₂)]
        ≤ n·(1 − ½·exp(−2ε)·k)^T + 2nTδ·exp(ε)/(1 − exp(−ε)) + P[Y_{k,T} ∈ 𝒲 ∩ (¬E₁ ∩ ¬E₂)],    (34)

where the last inequality is simply substitution of (32) and (33).
Fix some i ∈ [n]. Notice that if ¬E₂ occurs, then for every t ∈ [T] we have v_{i,t} ∈ Good(c, x_i); because
y_{k,T}[i] is selected from among the v_{i,t}, it must lie in Good(c, x_i).
Let Good ⊂ 𝒴ⁿ denote Good(c, x₁) × ⋯ × Good(c, xₙ). If ¬E₂ occurs, then Y_{k,T} ∈ Good. We use this to
analyze the third summand in (34):

    P[Y_{k,T} ∈ 𝒲 ∩ (¬E₁ ∩ ¬E₂)] = Σ_{W∈𝒲} P[Y_{k,T} = W ∩ (¬E₁ ∩ ¬E₂)]
        = Σ_{W∈𝒲∩Good} P[Y_{k,T} = W ∩ (¬E₁ ∩ ¬E₂)]
        = Σ_{W∈𝒲∩Good} Π_{i=1}^n P[y_{k,T}[i] = w[i] ∩ (¬E₁ ∩ ¬E₂)]    (independence)
        ≤ Σ_{W∈𝒲∩Good} Π_{i=1}^n P[y_{k,T}[i] = w[i] | (¬E₁ ∩ ¬E₂)].    (35)

We will later prove the following equivalence:

Claim A.7. For any ε > 0, k ∈ (0, 2·exp(−2ε)), T ∈ ℕ, and c, x ∈ X, for all g ∈ Good(c, x),

    P_{y_{k,T}←R_{n,k,T}(x)}[y_{k,T} = g | (¬E₁ ∩ ¬E₂)] = P_{y←R_n(x)}[y = g | y ∈ Good(c, x)].

By substitution,

    (35) = Σ_{W∈𝒲∩Good} Π_{i=1}^n P[y[i] = w[i] | y[i] ∈ Good(c, x_i)]
        = Σ_{W∈𝒲∩Good} P[Y = W | Y ∈ Good]    (independence)
        = P[Y ∈ 𝒲 | Y ∈ Good]
        ≤ (1/P[Y ∈ Good])·P[Y ∈ 𝒲]
        = (1/(1 − P[Y ∉ Good]))·P[Y ∈ 𝒲].    (36)

Notice that Y ∉ Good when, for some i, ½·P[R_n(x_i) = y[i]]/P[R_n(c) = y[i]] ∉ GoodInt. We obtain P[Y ∉ Good] ≤ 2nδ·exp(ε)/(1 − exp(−ε)) by
an argument similar⁶ to that of (33). Therefore,

    (36) ≤ (1/(1 − 2nδ·exp(ε)/(1 − exp(−ε))))·P[Y ∈ 𝒲]
        ≤ (1 + 4nδ·exp(ε)/(1 − exp(−ε)))·P[Y ∈ 𝒲]    (since 2nδ·exp(ε)/(1 − exp(−ε)) < ½)
        ≤ P[Y ∈ 𝒲] + 4nδ·exp(ε)/(1 − exp(−ε)).
When we return to (34), we have

    P[Y_{k,T} ∈ 𝒲] ≤ P[Y ∈ 𝒲] + n·(1 − ½·exp(−2ε)·k)^T + 2nTδ·exp(ε)/(1 − exp(−ε)) + 4nδ·exp(ε)/(1 − exp(−ε)),

which is equivalent to (31). This concludes the proof, modulo Claim A.7.

Here, we prove Claim A.7.

Proof of Claim A.7. Fix any ε > 0, k ∈ (0, 2·exp(−2ε)), T ∈ ℕ, (c, x) ∈ X², and g ∈ Good(c, x). Sample y from
R_n(x) and y_{k,T} from R_{n,k,T}(x). By a corresponding argument advanced by [8],

    P[y_{k,T} = g | (¬E₁ ∩ ¬E₂)] = P[b₁ = 1 ∩ v₁ = g] / P[b₁ = 1 ∩ v₁ ∈ Good(c, x)].    (37)

We first expand the numerator:

    P[b₁ = 1 ∩ v₁ = g] = P[v₁ = g]·P[b₁ = 1 | v₁ = g]
        = P[v₁ = g]·½·(P[y = g]/P[v₁ = g])·k    (defn. of R_{k,T})
        = ½·P[y = g]·k.    (38)

We now analyze the denominator:

    P[b₁ = 1 ∩ v₁ ∈ Good(c, x)] = Σ_{τ∈Good(c,x)} P[v₁ = τ]·P[b₁ = 1 | v₁ = τ]
        = Σ_{τ∈Good(c,x)} P[v₁ = τ]·½·(P[y = τ]/P[v₁ = τ])·k    (defn. of R_{k,T})
        = ½·Σ_{τ∈Good(c,x)} P[y = τ]·k
        = ½·P[y ∈ Good(c, x)]·k.    (39)

Therefore,

    (37) = (38)/(39) = (½·P[y = g]·k)/(½·P[y ∈ Good(c, x)]·k) = P[y = g | y ∈ Good(c, x)],

which completes the proof.


⁶ Notice the absence of T: to arrive at (33), we union bound over |V| = nT random variables, but here |Y| = n.

The preceding bound on the statistical distance between R_n(X) and R_{n,k,T}(X) implies a bound on the error
of the transformed protocol P_{n,k,T}:

Claim A.8. Suppose P_n = (R_n, A_n) is (ε, δ)-differentially private and (α, β)-accurate. If ε > 0, δ ∈ (0, (1 − exp(−ε))/(4·exp(ε)·n)), T ∈
ℕ, and k ∈ (0, 2e^{−2ε}), then P_{n,k,T} = (R_{n,k,T}, A_n) is (α, β_{k,T})-accurate, where

    β_{k,T} = β + n·(1 − ½·exp(−2ε)·k)^T + (T + 2)·2nδ·exp(ε)/(1 − exp(−ε)).    (40)

Proof. If A_n is deterministic, the claim is immediate from Claim A.5. Otherwise, the randomness of the output is
sourced from both A_n and R_n.

For any X = (x₁, …, xₙ) ∈ X^n, we again use Y to denote the random variable output by R_n(x₁), …, R_n(xₙ),
and likewise Y_{k,T} for the random variable output by R_{n,k,T}(x₁), …, R_{n,k,T}(xₙ). We will use u to denote the
random variable A_n(Y), which is the output of the original protocol, and u_{k,T} to denote the random
variable A_n(Y_{k,T}), which is the output of the transformed protocol.
For any Y ∈ 𝒴ⁿ, let Δ_Y := P[Y_{k,T} = Y] − P[Y = Y]. Let I denote the subset of 𝒴ⁿ containing exactly
those Y such that P[Y_{k,T} = Y] > P[Y = Y]; equivalently, those Y where Δ_Y > 0. We will use I to analyze
the probability of exceeding α error; assuming we are interested in approximating a real-valued f(X),

    P[|u_{k,T} − f(X)| > α] = P[|u_{k,T} − f(X)| > α ∩ Y_{k,T} ∈ I] + P[|u_{k,T} − f(X)| > α ∩ Y_{k,T} ∉ I].    (41)

We bound each term in the sum separately. For the first,

    P[|u_{k,T} − f(X)| > α ∩ Y_{k,T} ∈ I] = Σ_{Y∈I} P[|A_n(Y) − f(X)| > α]·P[Y_{k,T} = Y]
        = Σ_{Y∈I} P[|A_n(Y) − f(X)| > α]·(Δ_Y + P[Y = Y])
        ≤ Σ_{Y∈I} ( P[|A_n(Y) − f(X)| > α]·P[Y = Y] + Δ_Y )
        = Σ_{Y∈I} P[|A_n(Y) − f(X)| > α ∩ Y = Y] + Σ_{Y∈I} Δ_Y
        = P[|u − f(X)| > α ∩ Y ∈ I] + ( Σ_{Y∈I} Δ_Y ),    (42)

where the inequality uses the facts that Δ_Y > 0 and that each probability is at most 1. For the second,

    P[|u_{k,T} − f(X)| > α ∩ Y_{k,T} ∉ I] = Σ_{Y∉I} P[|A_n(Y) − f(X)| > α]·P[Y_{k,T} = Y]
        ≤ Σ_{Y∉I} P[|A_n(Y) − f(X)| > α]·P[Y = Y]
        = P[|u − f(X)| > α ∩ Y ∉ I],    (43)

where the inequality comes from the definition of I: for Y ∉ I, P[Y_{k,T} = Y] ≤ P[Y = Y].
From (41), (42), and (43) we have

    P[|u_{k,T} − f(X)| > α] ≤ P[|u − f(X)| > α] + ( Σ_{Y∈I} Δ_Y )
        ≤ β + ( Σ_{Y∈I} Δ_Y )    (P_n is (α, β)-accurate)
        = β + ( P[Y_{k,T} ∈ I] − P[Y ∈ I] )
        < β + n·(1 − ½·exp(−2ε)·k)^T + (T + 2)·2nδ·exp(ε)/(1 − exp(−ε)).    (Claim A.5)

This completes the proof.
Finally, we show that the preceding error probability simplifies to 4β provided that the parameters n, ε, δ
obey some constraints.

Claim A.9. For all ε > 2/3, k ∈ (2·exp(−3ε), 2·exp(−2ε)), and n ≥ 3, if

    0 < δ < (β/(8n·ln(n/β)))·(1/exp(6ε)),    (44)

then there exists T ∈ ℕ such that

    n·(1 − ½·exp(−2ε)·k)^T + 2nTδ·exp(ε)/(1 − exp(−ε)) + 4nδ·exp(ε)/(1 − exp(−ε)) < 3β.    (45)

Proof. (45) holds when each term in the sum is at most β.

We begin with the term n·(1 − ½·exp(−2ε)·k)^T. Because k > 2·exp(−3ε), it will suffice to have

    β > n·(1 − exp(−5ε))^T,

or, taking logarithms,

    ln(n/β) < T·ln(1/(1 − exp(−5ε))) = T·ln(1 + 1/(exp(5ε) − 1)).    (46)

The following is fairly easy to prove: if 0 < τ < 1, then 1 + τ > exp(τ/2). Here, set τ := (exp(5ε) − 1)^{−1};
we are ensured that τ ∈ (0, 1) because ε > ln(2)/5. Therefore, the following condition on T is stronger
than (46):

    ln(n/β) < T·ln(exp(½·(exp(5ε) − 1)^{−1})) = T·½·(exp(5ε) − 1)^{−1},

that is,

    T > ln(n/β)·2·(exp(5ε) − 1).    (47)

We also want the second term of (45) to be bounded by β:

    β > 2nTδ·exp(ε)/(1 − exp(−ε)),   i.e.,   T < β·(1 − exp(−ε))/(2nδ·exp(ε)).    (48)

Both (47) and (48) need to be true for the same value of T. Hence, it suffices that

    ln(n/β)·2·(exp(5ε) − 1) < β·(1 − exp(−ε))/(2nδ·exp(ε)),

equivalently,

    δ < β·(1 − exp(−ε))/(2n·exp(ε)·ln(n/β)·2·(exp(5ε) − 1)) = (β/(4n·ln(n/β)))·(exp(ε) − 1)/(exp(2ε)·(exp(5ε) − 1)).    (49)

Because ε > 2/3, one can show that

    (exp(ε) − 1)/(exp(2ε)·(exp(5ε) − 1)) > ½·exp(−6ε),

which means that any δ satisfying (44) satisfies (49).

The final term in (45) is 4nδ·exp(ε)/(1 − exp(−ε)); for this to be bounded by β, it will suffice for

    δ < β·0.1/(n·exp(ε))    (using ε > 2/3).

This constraint on δ is not as tight as (44) whenever n > e. Because we have shown all three terms in (45)
are bounded by β, this concludes the proof.

Setting parameters k, T. We now provide parameter values to ensure our transformation is (α, 4β)-accurate,
thereby proving Claim A.4. Suppose the parameters k and T are assigned as follows:

    k ← 2·exp(−2.5ε),   T ← ⌈ln(n/β)·2·(exp(5ε) − 1)⌉.

Because k ∈ (0, 2·exp(−2ε)) and T ∈ ℕ, Claim A.8 implies that the protocol P_{n,k,T} = (R_{n,k,T}, A_n) is
(α, β_{k,T})-accurate, where β_{k,T} is defined in (40). Claim A.9 implies that there is an integer value of T for which
β_{k,T} ≤ 4β; by (47), T is assigned such a value. Hence, P_{n,k,T} is (α, 4β)-accurate.
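The parameter recipe is easy to sanity-check numerically; the sketch below (our own verification aid) evaluates the three summands of (45) at the stated k and T, which should each come out at most β whenever δ satisfies (44).

import math

def residual_terms(n: int, eps: float, delta: float, beta: float):
    """Evaluate the three summands of (45) at k = 2*exp(-2.5*eps) and
    T = ceil(ln(n/beta) * 2 * (exp(5*eps) - 1))."""
    k = 2 * math.exp(-2.5 * eps)
    T = math.ceil(math.log(n / beta) * 2 * (math.exp(5 * eps) - 1))
    t1 = n * (1 - 0.5 * math.exp(-2 * eps) * k) ** T
    t2 = 2 * n * T * delta * math.exp(eps) / (1 - math.exp(-eps))
    t3 = 4 * n * delta * math.exp(eps) / (1 - math.exp(-eps))
    return t1, t2, t3

# Example with eps = 1, n = 10**4, beta = 0.01 and delta at half the (44) cap.
n, eps, beta = 10**4, 1.0, 0.01
cap = beta / (8 * n * math.log(n / beta) * math.exp(6 * eps))
t1, t2, t3 = residual_terms(n, eps, 0.5 * cap, beta)  # each well below beta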

B Concentration Inequalities

In this appendix, we formally state the three concentration inequalities used in this paper.

Theorem B.1 (Chernoff bound). If x₁, …, xₙ are independent {0, 1}-valued random variables, each with mean µ,
then, for every β > 0,

    P[ µn − Σ x_i < √(2µn·log(1/β)) ] ≥ 1 − β   and   P[ Σ x_i − µn < √(3µn·log(1/β)) ] ≥ 1 − β.

Theorem B.2 (Hoeffding's inequality). If x₁, …, xₙ are independent random variables, each with mean µ and
bounded in (a, b), then, for every β > 0,

    P[ |Σ x_i − µn| < (b − a)·√((n/2)·log(2/β)) ] > 1 − β.

Theorem B.3 (Bernstein's inequality). If x₁, …, xₙ are independent random variables, each with mean 0 and
variance σ² ≥ (4/(9n))·log(2/β), and bounded in [−1, 1], then, for every β > 0,

    P[ |Σ x_i| < 2σ·√(n·log(2/β)) ] > 1 − β.
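For intuition about how these bounds are used, here is a quick Monte Carlo check of Theorem B.2 (our own illustration): for uniform x_i the empirical failure rate falls far below β.

import math
import random

def hoeffding_failure_rate(n: int = 200, beta: float = 0.05, trials: int = 20_000) -> float:
    """Fraction of trials in which |sum(x) - mu*n| exceeds the Theorem B.2
    threshold (b - a) * sqrt(n/2 * log(2/beta)) for x_i uniform on (0, 1)."""
    mu = 0.5
    thresh = math.sqrt(0.5 * n * math.log(2 / beta))   # b - a = 1
    fails = 0
    for _ in range(trials):
        s = sum(random.random() for _ in range(n))
        fails += abs(s - mu * n) > thresh
    return fails / trials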