Extending Partial Differential Private Mechanisms
Abstract—An (ε, δ)-DP mechanism is a mapping defined as follows. The domain of the mechanism is a finite set of objects (also called data points) on which a symmetric neighborhood relation is defined. The mechanism maps each data point to a probability distribution over another set. Furthermore, neighboring data points must be mapped to distributions that are not far apart. This parametric notion of distance between two distributions, in terms of the parameters (ε, δ), was first introduced in the context of privacy theory by Dwork and her collaborators.
In this paper, we study the following problem. Given a finite set D of data points, a neighboring relation, parameters ε, δ, and a partial mechanism defined over a subset D′ ⊆ D, is there an extension of the mechanism to the entire set D that is identical to the partial mechanism on D′ and is (ε, δ)-differentially private? We show that there exists an algorithm that answers this question and runs in time polynomial in the size of the input. Our result generalizes a result of Medard et al. about optimum mechanism extension with respect to preferential query ordering.
Index Terms—Differential privacy, linear programming, privacy, mechanism extension, rainbow differential privacy
I. INTRODUCTION
Dwork et al. introduced the concept of differential privacy in the seminal paper [DMNS06]. Differential privacy is a mathematical framework for defining the concept of privacy formally and rigorously. Soon after, other researchers in privacy and related disciplines started to investigate this notion further.
The rich theoretical aspects of differential privacy, together with its significant impact in practice, made it a central concept in the field. Moreover, the ever-increasing need for privacy has made differential privacy even more attractive.
The main intuition behind the definition of differential privacy is that if a server has access to a data set containing private data, and a client sends queries about the data to the server, then in order to preserve privacy the server must respond probabilistically. That is, introducing randomness into the responses to the queries can help the server keep the important data private.
In practice, neighboring data points are data sets which differ in precisely one field. Therefore, not being able to distinguish two neighboring data points from the output means that the presence or absence of that field can be hidden. The crucial observation is that since the mechanism is non-deterministic, even if we know that one of two neighboring data points is the actual input of the mechanism, we cannot determine with certainty which one it was only by observing the output. Clearly, the closer the output distributions at two neighboring points are, the harder it is to distinguish them.
In Section II, we formally define the notion of (ε, δ)-DP. Roughly speaking, a mechanism is a random function which associates a probability distribution to each vertex such that the distributions associated with two neighboring data points are close. The measure of closeness of two distributions will be defined later. This notion is parameterized by two quantities ε, δ.
In this paper, we consider the following problem. Let G = (D, E) be a graph with vertex set D and edge set E. Let R be a finite set and U ⊆ D. A partial mechanism M is a randomized function defined over the set U such that each point of U is mapped to a probability distribution over the set R.
Problem 1: Given a graph G and a subset U of the vertices, as above, two parameters ε, δ, and a partial mechanism M on U, is there an (ε, δ)-DP mechanism M̂ over the entire set D such that M̂ is an extension of M (i.e., they produce the same distribution at every point v ∈ U)?
A. Prior Works
Prior to this work, the problem of mechanism extension has been studied in [DMS21], [TES+22], [ZGD+22], [TES+24]. However, in those works, the objective is to decide whether an extension exists and, if one exists, to find the “best one”. By the best mechanism, we mean a mechanism with the highest possible utility, where the utility of a mechanism is a number associated with the mechanism.
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY CALICUT. Downloaded on January 03,2025 at 09:33:06 UTC from IEEE Xplore. Restrictions apply.
2024 Iran Workshop on Communication and Information Theory (IWCIT)
For instance, in [TES+22], the authors proposed a polynomial time algorithm which, given a binary partial mechanism on an input graph G, two non-negative real numbers ε, δ, and a specific type of utility function called a reasonable utility function, decides whether an (ε, δ)-DP extension exists. If such an extension exists, it outputs one with the highest possible utility. Further extending this discussion, [TES+24] delves deeper into binary partial mechanisms. The authors introduce a refined framework that encapsulates the lower and upper bounds implied by the differential privacy constraints, providing a new solution to the extension problem for binary-valued mechanisms.
This work is in fact a generalization of the result of [DMS21], which holds under a homogeneity assumption on the partial mechanism.
In the above, the term binary partial mechanism means a mechanism whose value at each point is a distribution over a set of size two.
In [ZGD+22], the authors relaxed the condition of a binary range to a range of arbitrary finite size. In Section II, we thoroughly describe the problem and the result of that work.
B. Our Contribution
The contributions of this paper are as follows.
• We show that there exists a polynomial time algorithm which does the following. First, the algorithm decides whether an (ε, δ)-DP extension of the input exists. Second, if an extension exists, it outputs one such extension.
• We modify the algorithm such that for any choice of preferential ordering at the nodes, the output of the algorithm is optimal with respect to that ordering, provided that at least one private extension exists. In other words, for the problem of rainbow differential privacy, we propose a linear programming based algorithm which runs in time polynomial in the input parameters.
Our result generalizes that of [DMS21] in the sense that, while their result holds under certain homogeneity conditions, we impose no constraint on the values of the partial mechanism. Furthermore, all results of [DMS21], [TES+22], [TES+24] make extra assumptions on the subset of data points at which the partial mechanism is defined. In contrast, our result is valid regardless of the shape of the domain of the input partial mechanism.
C. Organization of the Paper
In the rest of this paper, we first overview the basic terminology of graph theory, linear programming and differential privacy. Next, we formally state the main result of this work and present a proof. Finally, we conclude the paper.
II. PRELIMINARIES
In this section, we overview the basics of graph theory, linear programming and differential privacy and introduce some notation. We also review the problem of rainbow differential privacy.
A. Graph Theory
A graph G is a pair G = (D, E) comprising a set D of vertices or nodes together with a set E of edges. Each edge is a 2-element subset of D. In this article, the graphs are undirected.
For a broader introduction to graph theory, we refer the interested reader to [Wes00].
B. Probability
For a finite set R, we represent a probability distribution over the outcomes in R by an element of ∆(R). Here, ∆(R) is the simplex over R:
\[ \Delta(R) = \Big\{ x \in \mathbb{R}^{R} : \sum_{r \in R} x_r = 1,\ \forall r \in R : x_r \ge 0 \Big\} \tag{1} \]
Given a randomized mechanism M from D to a set of outputs R, we can represent M as a function from D to the simplex ∆(R), because the result of M on an input x ∈ D can be interpreted as a probability distribution M(x) over R. Therefore, for a subset S ⊆ R and x ∈ D, we write Pr(M(x) ∈ S) for Σ_{r∈S} M(x)_r.
Given a domain D and a space R of possible outputs, we say that 𝒰 is a partial randomized mechanism if 𝒰 is a function from a subset U ⊆ D to ∆(R). If M : D → ∆(R) is another mechanism, we say that M is an extension of 𝒰 if, for every x ∈ U, we have M(x) = 𝒰(x).
C. Differential Privacy
As pointed out earlier, differential privacy is a robust privacy framework that provides a mathematical guarantee of privacy. It is a commitment made by a data curator to a data subject, ensuring that the subject’s privacy is not compromised by their participation in the dataset. We now define this concept formally.
Definition 1 ((ε, δ)-DP mechanism): Let D be the set of possible datasets and let R be a finite set of results, representing the range of outputs. Let ∼ be a symmetric relation on the set D, called the neighboring relation. For real numbers ε, δ > 0, a function M : D → ∆(R) is called an (ε, δ)-differentially private mechanism ((ε, δ)-DP, for short) if for every v, u ∈ D such that v ∼ u and for every S ⊆ R, the following holds:
\[ \Pr(M(v) \in S) \le e^{\varepsilon} \Pr(M(u) \in S) + \delta \tag{2} \]
In this paper, we consider a graph G = (D, E), where the vertices of G, represented by D, correspond to our possible datasets. The neighboring relation is defined by adjacency in the graph. More formally, for any v, u ∈ D, we say that v and u are neighbors, denoted by v ∼ u, if and only if (v, u) ∈ E. This setup allows us to model the relationships between different datasets within the context of graph theory.
A good introduction to the field of differential privacy, and in particular to its algorithmic aspects, is the book [DR+14] by Dwork et al. Another reference on this topic is [Dwo08] by Dwork.
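As an illustration of Definition 1, the following minimal Python sketch checks inequality (2) on every edge of a small graph. The function names, the dictionary encoding of the mechanism, and the example values are assumptions made for illustration and are not taken from the paper. The sketch relies on the elementary fact that, for a fixed ordered pair (v, u), the quantity Pr(M(v) ∈ S) − e^ε Pr(M(u) ∈ S) is largest when S contains exactly the outcomes r with M(v)_r > e^ε M(u)_r, so a single sum of positive parts can replace the enumeration of all subsets. A brute-force variant that tests every subset is included for comparison.

```python
import math
from itertools import combinations


def max_violation(p, q, eps):
    """Largest value of P(S) - e^eps * Q(S) over all subsets S of outcomes.

    The maximum is attained by S = {r : p[r] > e^eps * q[r]}, so it equals
    the sum of the positive parts of p[r] - e^eps * q[r].
    """
    return sum(max(0.0, p[r] - math.exp(eps) * q[r]) for r in range(len(p)))


def is_dp(mechanism, edges, eps, delta, tol=1e-12):
    """Check inequality (2) for both orientations of every undirected edge.

    mechanism: dict mapping each vertex to a list of probabilities over R
               (hypothetical encoding chosen for this sketch).
    edges:     iterable of pairs (v, u) of neighboring vertices.
    """
    for v, u in edges:
        for a, b in ((v, u), (u, v)):
            if max_violation(mechanism[a], mechanism[b], eps) > delta + tol:
                return False
    return True


def is_dp_bruteforce(mechanism, edges, eps, delta, tol=1e-12):
    """Same check, testing every subset S of R literally (exponential in |R|)."""
    for v, u in edges:
        for a, b in ((v, u), (u, v)):
            p, q = mechanism[a], mechanism[b]
            n = len(p)
            for size in range(n + 1):
                for S in combinations(range(n), size):
                    lhs = sum(p[r] for r in S)
                    rhs = math.exp(eps) * sum(q[r] for r in S) + delta
                    if lhs > rhs + tol:
                        return False
    return True


# Illustrative data: a path a - b - c with two possible outputs.
example_mechanism = {"a": [0.9, 0.1], "b": [0.6, 0.4], "c": [0.3, 0.7]}
example_edges = [("a", "b"), ("b", "c")]
eps = math.log(2)

print(is_dp(example_mechanism, example_edges, eps, delta=0.25))
print(is_dp(example_mechanism, example_edges, eps, delta=0.1))
```

In this example the only binding pair is (b, a): the second outcome has probability 0.4 at b but only 0.1 at a, so a slack of about 0.2 is needed when e^ε = 2. The check therefore succeeds for δ = 0.25 and fails for δ = 0.1.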
D. Rainbow Differential Privacy
In the context of differential privacy, our objective is to construct a differentially private mechanism that approximates a given correct function for each dataset. In the setting of differential privacy with graph vertices as input, Zhou et al. [ZGD+22] defined a preference function over mechanisms. Specifically, for each dataset x ∈ D, they assumed that we have a permutation of the possible results R, denoted by π(x). The goal is for our mechanism to prefer to output the results of R in the total ordering defined by π(x). In the following paragraphs, we discuss the formal definition of rainbow differential privacy.
Definition 2 (Rainbow): Let R be a finite set. A rainbow on R is a total ordering of R. If Sym(R) is the set of all permutations of R, we can denote a rainbow as p ∈ Sym(R).
We can then define a preference function π : D → Sym(R) that assigns a rainbow to each dataset x ∈ D. The aim is to define an (ε, δ)-DP mechanism that outputs an element of R for each x ∈ D. We can define a utility function using the ordering enforced by the preference function.
Definition 3 (Ordering forced by a permutation): For a finite set R = {1, 2, …, |R|}, x ∈ ∆(R), and p ∈ Sym(R), we define the ordering enforced by p as
\[ x^{p} = \big(x_{p(1)}, x_{p(2)}, \dots, x_{p(|R|)}\big) \in \mathbb{R}^{|R|}. \]
Definition 4 (Lexicographic ordering): The lexicographical ordering on two vectors x, y ∈ ℝ^n is denoted by x ⪯ y and holds if x = y or there exists a k ∈ {1, 2, …, n} such that for all i < k, x_i ≤ y_i and x_k < y_k.
Definition 5 (Optimal (ε, δ)-DP mechanism): For a graph G = (D, E), a finite output set R, a preference function π : D → Sym(R), and ε, δ ≥ 0, we say an (ε, δ)-DP mechanism M_1 dominates another (ε, δ)-DP mechanism M_2 if for all d ∈ D, we have:
\[ M_2^{\pi(d)}(d) \preceq M_1^{\pi(d)}(d), \]
where M^{π(d)}(d) denotes the vector M(d) reordered by π(d) as in Definition 3.
We then say an (ε, δ)-DP mechanism M is optimal with respect to π if no other (ε, δ)-DP mechanism dominates it. One of our goals in the subsequent sections is to find an optimal (ε, δ)-DP mechanism.
E. Linear Programming
Linear programming is a method to achieve the best outcome of a linear function in a mathematical model whose requirements are represented by linear relationships. The linear function is called the objective function and the requirements are referred to as the constraints.
In a linear programming problem, we have a set of variables V, a set of constraints C, each of which is a linear equality or inequality in the variables V, and a linear objective function in the variables V which we want to minimize or maximize. There is a vast literature in the field of linear programming and linear programming modeling.
If there is an assignment of the variables V for which all the constraints are satisfied, we say that the assignment is a feasible point and the linear program (LP) is feasible; otherwise, we say that it is infeasible. The set of all feasible points is called a polyhedron.
Linear programming is a well-studied field with numerous proposed algorithms which, given the objective function and the constraints, output an optimum point at which the objective function is minimized or maximized.
A notable contribution in this area is the paper [JSWZ20] by Jiang et al., who established that the feasibility and optimality of a linear program’s solution can be determined in polynomial time. This is formally stated in the following proposition:
Proposition 1: Given a linear program l with n variables that can be encoded using L bits, there exists an algorithm that can solve it in O(n^{2+1/18} L) time.
This result highlights the efficiency of linear programming solutions and their suitability for tackling large-scale problems. A comprehensive source for linear programming is [Sch98].
III. MAIN RESULT
In this section, we propose a solution to Problem 1. That is, we present a polynomial time algorithm to decide whether or not a partial mechanism on an input graph can be extended to an (ε, δ)-DP mechanism over the entire input graph.
Before we present the proof, we give a high-level explanation of our method. We start by transforming the problem into a feasibility question about a polyhedron. In other words, we present a system of linear inequalities such that an extension exists if and only if the polyhedron defined by the linear constraints is non-empty.
At this point, any linear programming algorithm can solve the problem. The difficulty arises from the fact that the number of inequalities describing the polyhedron is exponential in the size of the range of the mechanism (i.e., |R|), which is prohibitive for large R.
Then, we tweak the form of the linear inequalities (hence, the polyhedron) to derive a new system of linear inequalities that still has the property that a partial mechanism is extendable if and only if the polyhedron is non-empty. In addition, the new polyhedron has a polynomial description; that is, it can be described using at most polynomially many linear inequalities.
Once we have solved the mechanism extension problem, we are able to solve the rainbow extension problem in the following steps.
• We first decide whether the input mechanism is extendable. If it is not, we are done.
• Otherwise, we find an extension such that the sum of the probabilities that the mechanism outputs the most preferred value at each node is maximized. This task can be done in polynomial time because it amounts to maximizing a linear function over a polyhedron that has a polynomial description (i.e., the polyhedron is the intersection of polynomially many half-spaces).
• Once we obtain an optimum point, we add a new constraint to the previous ones as follows. Among all the
mechanism extensions that are optimum solutions of the previous program, we find an extension that maximizes the sum of the probabilities that the mechanism outputs the second most preferred value at each node.
• We repeat this until we reach an extension which is provably the best extension with respect to the preference ordering of the nodes.
In the sequel, we formally state and prove the results.
A. Mechanism Extension Problem as an LP Problem
In this subsection, we start by demonstrating that a mechanism extension problem can be formulated as an LP problem. Then, we explain the main weakness of such a formulation, which is the complexity of describing the feasible region (i.e., we require exponentially many constraints to describe the feasible polyhedron). Finally, we fix this issue by introducing another LP which also solves the problem but does not suffer from an exponential description.
Assume M : D → ∆(R) is a mechanism. To ensure that M is an (ε, δ)-DP mechanism and extends a mechanism 𝒰 : U → ∆(R), we must check some constraints. All of these constraints are linear equalities or inequalities in the values M(v)_r for v ∈ D, r ∈ R. If the variable x_{v,r} represents the value of M(v)_r, the feasibility of the following linear program guarantees the extensibility of the partial mechanism 𝒰.
LP 1 Extension Check (Non-efficient)
variables:
\[ x_{v,r} \qquad \forall v \in D,\ r \in R \tag{3} \]
constraints:
\[ x_{v,r} \ge 0 \qquad \forall v \in D,\ r \in R \tag{4} \]
\[ \sum_{r \in R} x_{v,r} = 1 \qquad \forall v \in D \tag{5} \]
\[ x_{v,r} = \mathcal{U}(v)_r \qquad \forall v \in U,\ r \in R \tag{6} \]
\[ \sum_{r \in S} x_{v,r} \le \delta + e^{\varepsilon} \sum_{r \in S} x_{u,r} \qquad \forall (v,u) \in E,\ S \subseteq R \tag{7} \]
As mentioned above, the main issue with LP 1 is that the number of constraints corresponding to the subsets of R is not polynomial in the size of the input. At first glance, it is not clear how to check the feasibility of this LP efficiently. We fix this issue by adding slack variables to this LP and using them to reduce the number of constraints significantly.
To this end, we present the second LP formulation in LP 2 and introduce a polynomial time algorithm to solve Problem 1.
LP 2 Extension Check (Efficient)
variables:
\[ x_{v,r} \qquad \forall v \in D,\ r \in R \tag{8} \]
\[ d_{v,u,r} \ \text{(slack variables)} \qquad \forall (v,u) \in E,\ r \in R \tag{9} \]
constraints:
\[ x_{v,r} \ge 0 \qquad \forall v \in D,\ r \in R \tag{10} \]
\[ \sum_{r \in R} x_{v,r} = 1 \qquad \forall v \in D \tag{11} \]
\[ x_{v,r} = \mathcal{U}(v)_r \qquad \forall v \in U,\ r \in R \tag{12} \]
\[ d_{v,u,r} \ge 0 \qquad \forall (v,u) \in E,\ r \in R \tag{13} \]
\[ x_{v,r} \le e^{\varepsilon} x_{u,r} + d_{v,u,r} \qquad \forall (v,u) \in E,\ r \in R \tag{14} \]
\[ \sum_{r \in R} d_{v,u,r} \le \delta \qquad \forall (v,u) \in E \tag{15} \]
Theorem 1: Given a graph G = (D, E), a subset U of the vertices, a finite set of possible outputs R, parameters ε, δ ≥ 0, and a partial mechanism 𝒰, Problem 1 has a positive answer if and only if the constraints of LP 2 have a feasible solution. Furthermore, the mechanism M : D → ∆(R) which satisfies M(v)_r = x_{v,r} for all v ∈ D and r ∈ R is an (ε, δ)-DP extension of 𝒰.
Proof. It is straightforward to see that constraints 10 and 11 ensure that the result of M is a probability distribution (i.e., lies in ∆(R)). Constraint 12 ensures that the resulting mechanism is an extension of 𝒰.
Now, we need to check the (ε, δ)-DP constraints. Assume that there exists an extension M of 𝒰 which is (ε, δ)-DP. Let x_{v,r} = Pr(M(v) = r) and let d_{v,u,r} = max(0, x_{v,r} − e^ε x_{u,r}). We aim to show that x, d is a feasible solution of the theorem’s LP. Constraints 10, 11, 12, 13, and 14 are trivially satisfied by the definition of the variables.
For each (v, u) ∈ E, let S = {r ∈ R : d_{v,u,r} > 0}. By the definition of d_{v,u,r}, for all r ∈ S we have d_{v,u,r} = x_{v,r} − e^ε x_{u,r}. Since M is (ε, δ)-DP, Pr(M(v) ∈ S) ≤ e^ε Pr(M(u) ∈ S) + δ, that is, Σ_{r∈S} x_{v,r} ≤ Σ_{r∈S} e^ε x_{u,r} + δ, and hence Σ_{r∈S} d_{v,u,r} ≤ δ. Since d_{v,u,r} = 0 for all r ∉ S, we have Σ_{r∈R} d_{v,u,r} ≤ δ. Thus, constraint 15 also holds and x, d is a feasible solution for the LP.
On the other hand, assume x, d is a feasible solution for LP 2. We must show that the constructed M is an (ε, δ)-DP mechanism which is an extension of 𝒰. The fact that M is a randomized mechanism and an extension is straightforward, as mentioned earlier in the proof. We only need to show that M has the (ε, δ)-DP property.
We must show that for every (v, u) ∈ E and for every S ⊆ R we have
\[ \Pr(M(v) \in S) - e^{\varepsilon} \Pr(M(u) \in S) \le \delta. \]
This is equivalent to Σ_{r∈S} (x_{v,r} − e^ε x_{u,r}) ≤ δ. By constraint 14, for every r ∈ R, x_{v,r} − e^ε x_{u,r} ≤ d_{v,u,r}, so we have
\[ \sum_{r \in S} \big(x_{v,r} - e^{\varepsilon} x_{u,r}\big) \le \sum_{r \in S} d_{v,u,r}. \]
By constraint 13, d_{v,u,r} ≥ 0, so Σ_{r∈S} d_{v,u,r} ≤ Σ_{r∈R} d_{v,u,r}. By constraint 15, Σ_{r∈R} d_{v,u,r} ≤ δ. Thus, Σ_{r∈S} (x_{v,r} − e^ε x_{u,r}) ≤ δ and the (ε, δ)-DP property is proven. □
Corollary 1: Given a graph G = (D, E), a subset U of the vertices, a finite set of possible outputs R, parameters ε, δ ≥ 0,
and a partial mechanism 𝒰, we can find the answer to Problem 1 in time polynomial in |D|, |E| and |R|.
Proof. This is straightforward because the numbers of variables and constraints of LP 2 are polynomial in |D|, |E| and |R|. By Proposition 1, we can solve this LP in polynomial time. □
B. Optimal Mechanism
In this section, we propose an algorithm that iteratively solves a linear program (LP) and adds constraints to it in order to find an optimal (ε, δ)-DP mechanism. Note that our algorithm finds one optimal mechanism; if there are multiple, it finds only one of them.
Theorem 2: Given a graph G = (D, E), a subset U of the vertices, a finite set of possible outputs R = {1, 2, …, |R|}, parameters ε, δ ≥ 0, a partial mechanism 𝒰, and a preference function π : D → Sym(R), Algorithm 1 finds a solution for
Now, let x, d be the solution of the final LP. For each k ∈ {1, 2, …, |R|}, let C_k be the constraints at the start of the k-th step of the loop, and let its solution be x^{(k)}, d^{(k)}. The constraints added at each step are equalities that the solution of that step satisfies. Thus, if the first LP is feasible, the LPs at all steps will be feasible.
We prove the statement by contradiction. Assume there exists an (ε, δ)-DP mechanism N : D → ∆(R) which dominates M. Let k ∈ {1, 2, …, |R|} be the smallest value for which there exists a u ∈ D such that for all i < k, M(u)_{π(u)_i} = N(u)_{π(u)_i} and M(u)_{π(u)_k} < N(u)_{π(u)_k}. By the domination of M by N and the minimality of k, we have M(v)_{π(v)_i} = N(v)_{π(v)_i} for all i < k and all v ∈ D. Thus, if we define x′, d′ as the LP variables which represent N (this representation is shown in the proof of Theorem 1), x′, d′ is a feasible solution of the LP with constraints C_k.
Now, by the optimality of x^{(k)}, we know that Σ_{v∈D} x^{(k)}_{v,π(v)_k} ≥ Σ_{v∈D} x′_{v,π(v)_k}, and by the added constraints