Simple and Efficient Binomial Protocols for Differential Privacy in MPC
Meta
bmcase@meta.com
Mozilla
mt@lowentropy.net
Security
Privacy Preserving Measurement
oneplusoneplusoneplus
noise
counting coins
A method for computing binomial noise in Multiparty Computation (MPC) is
described. The binomial mechanism for differential privacy (DP) is a simple
mechanism that is well suited to MPC, where computation of more complex
algorithms can be expensive. This document describes how to select the correct
parameters and apply binomial noise in MPC.
About This Document
The latest revision of this draft can be found at .
Status information for this document may be found at .
Discussion of this document takes place on the
Privacy Preserving Measurement Working Group mailing list (),
which is archived at .
Subscribe at .
Source for this draft and an issue tracker can be found at
.
Introduction
Using Multiparty Computation (MPC) to compute aggregate statistics has some very
promising privacy characteristics. MPC provides strong assurances about the
confidentiality of input values, relying only on the assumption that the parties
performing the computation do not collude. Depending on the MPC system in use,
the cryptographic assumptions involved can be conservative. For instance, MPC
is the basis of the Verifiable Distributed Aggregation Functions (VDAFs)
used in the Distributed Aggregation Protocol (DAP).
Depending on how the system is used, particularly for systems where the MPC
system offers some flexibility in how it can be queried, concrete privacy
guarantees are harder to provide. Multiple aggregations over similar input data
might be computed, leading to aggregates that can be compared to reveal
aggregates over a small set of inputs or even the value of specific inputs.
Differential privacy (DP) offers a framework
for both analyzing and protecting privacy that can be applied to this problem to
great effect. By adding some amount of noise to aggregates, strong guarantees
can be made about the amount of privacy loss that applies to any given input.
There are multiple methods for applying noise to aggregates. The one that
offers the lowest amount of noise, and therefore the most useful outputs, is
one where a single entity samples and adds noise, known as central
DP. Alternatives include local DP, where noise is added to each input to
the aggregation, or shuffle DP, which reduces noise requirements for local DP by
shuffling inputs.
Applying noise in a single place ensures that the amount of noise is directly
proportional to the sensitivity (that is, the maximum amount that any input
might contribute to the output) rather than being in some way proportional to
the number of inputs. The amount of noise relative to aggregates decreases as
the number of inputs increases, meaning that central DP effectively provides an
optimal amount of noise.
DP Noise in MPC
There are several approaches to adding noise in MPC.
Use of local or shuffle DP is possible. As noted, these methods can add more
noise than is ideal.
Noise can be added by each party independently. Each party adds noise in a
fraction that is based on its understanding of the number of honest parties
present. In two-party MPC, each party has to assume the other is dishonest, so
each adds the entire noise quantity, ultimately doubling the overall noise that
is added. In a three-party honest-majority MPC, each party can add half of the
required noise on the assumption that one other party is honest, resulting in a
50% increase in the amount of noise.
Finally, an MPC protocol can be executed to add noise. The primary drawback of
this approach is that there is an increased cost to generating the noise in MPC.
However, MPC protocols can avoid having to include additional noise in order to
compensate for the risk of information leakage from a dishonest participant.
Adding noise using MPC provides strong assurances that noise is not known to any
party, including the parties that perform the computation, up to the limits of
the MPC scheme in use. In addition, the costs of computation in MPC scale only
with the privacy parameters for the differential privacy, not the number of
inputs. Amortizing this cost over large sets of inputs can make the additional
cost small.
Binomial Noise
Adding noise drawn from a binomial distribution provides approximate
differential privacy (DP). This is sometimes named (epsilon,
delta)-differential privacy or (ε, δ)-differential privacy. The epsilon value
in approximate DP bounds privacy loss for most contributions to the output; the
delta value is a non-zero bound on the probability that a higher privacy loss
occurs.
A binomial distribution, Bin(N, p), describes the number of successes out of N
independent Bernoulli trials, where each Bernoulli trial is a coin flip with
success probability p.
Due to the central limit theorem, a binomial distribution with large N is a
close approximation of a Normal or Gaussian distribution, which has a number of
useful properties.
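As a minimal sketch of this definition, the following Python samples Bin(N, 0.5) by summing N fair coin flips and compares the empirical mean and variance with the expected values N·p and N·p·(1-p). The function name `sample_binomial` and the sample sizes are illustrative assumptions; this runs in the clear and is not an MPC protocol.

```python
import secrets

def sample_binomial(n: int) -> int:
    """Sample Bin(n, 0.5) as the sum of n fair coin flips."""
    # Each bit of a uniformly random n-bit integer is one Bernoulli(0.5)
    # trial; counting the set bits sums the trials.
    return bin(secrets.randbits(n)).count("1")

N = 10_000  # illustrative number of trials
samples = [sample_binomial(N) for _ in range(2_000)]
mean = sum(samples) / len(samples)
variance = sum((x - mean) ** 2 for x in samples) / len(samples)

# For p = 0.5 the expected mean is N/2 and the expected variance is N/4.
print(f"mean={mean:.1f} expected={N * 0.5}")
print(f"variance={variance:.1f} expected={N * 0.25}")
```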
This document describes a simple MPC protocol, with several instantiations, for
efficiently computing binomial noise.
Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14 when, and only when, they
appear in all capitals, as shown here.
The Binomial Mechanism for MPC
The binomial mechanism for DP generates binomial noise in MPC and adds it to
outputs before they are released.
Our parameter choices rely on an analysis from the cpSGD paper (see Informative
References), which provides more comprehensive formulae for a range of parameters.
To sample from a Bin(N, p) distribution in MPC, two things are needed:

A protocol for Bernoulli trials, or coinflipping protocol, that produces a
value of 1 with probability p and 0 otherwise.

A means to sum the value of N trials.
This protocol sets p to 0.5. This value of p provides both an optimal
privacy/utility trade-off and good efficiency for computation in MPC. Each
Bernoulli sample requires a single, uniformly distributed bit, which can be
generated very efficiently. Using p = 0.5 also requires the fewest samples for
any set of parameters, except for cases with extremely low variance
requirements, which we consider to be out of scope; see Section 2 of the cpSGD
paper.
There are several ways to instantiate a coin flipping protocol in MPC, depending
on which MPC protocol is being used. The "Coin Flipping and Aggregation
Protocols" section describes some basic protocol instantiations.
For any given set of privacy parameters (epsilon, delta) and for a known
sensitivity, the "Determining number of Bernoulli trials" section describes how
to determine the number of Bernoulli samples needed.
To count the number of successes across these N trials, the MPC helpers simply
run an aggregation circuit over the secret shared results of the N Bernoulli
trials, each of which is either 0 or 1. The result of this sum is a sample from
a Bin(N, p) distribution. This binomial noise value is then added to the
output inside the MPC and then the final noised result is revealed to the
appropriate output parties. That is, if the MPC computes f(D), it outputs
shares of the result f(D) + Bin(N,p).
The party receiving the output can then post-process this output to get an
unbiased estimate for f(D) by subtracting the mean of the Bin(N,p)
distribution, which is N·p.
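The flow above can be sketched with simple additive secret sharing. This is only an illustration under stated assumptions: the modulus `PRIME`, the helper names, and the locally sampled coin are all hypothetical stand-ins. In a real deployment the coin would come from a coin flipping protocol so that no single party learns it.

```python
import secrets

PRIME = 2**61 - 1  # hypothetical field modulus, far larger than any sum here
PARTIES = 3

def share(value: int) -> list[int]:
    """Split a value into additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(PARTIES - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reveal(shares: list[int]) -> int:
    """Recombine additive shares."""
    return sum(shares) % PRIME

def noised_output(f_d: int, n: int) -> list[int]:
    """Return shares of f(D) + Bin(n, 0.5)."""
    totals = share(f_d)
    for _ in range(n):
        # Stand-in for the MPC coin flip: sampled locally, then shared.
        coin = share(secrets.randbits(1))
        # Addition of additive shares is local to each party.
        totals = [(t + c) % PRIME for t, c in zip(totals, coin)]
    return totals

f_d, n = 1_000, 4_096  # illustrative aggregate and number of trials
out = reveal(noised_output(f_d, n))
estimate = out - n * 0.5  # subtract the mean N·p to remove the bias
print(out, estimate)
```

The revealed value always lies between f(D) and f(D) + N, so the unbiased estimate is within N/2 of the true aggregate.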
Document Organization
The remainder of this document is organized as follows:

The "Quantization Scale" section introduces an additional quantization scaling
parameter that can be used to optimize the privacy/utility trade-off.

The "Determining number of Bernoulli trials" section details how to determine
the optimal number of trials, N, for a given function f() and privacy parameters.

The "Coin Flipping and Aggregation Protocols" section describes some
instantiations of the coin flipping protocol and the aggregation protocol.

The "Performance Characteristics" section includes a cost analysis of different
instantiations in both computation and communication costs.

The "Comparison with Alternative Approaches" section compares the binomial
mechanism to other DP approaches that might be used in MPC.
Quantization Scale
The cpSGD paper provides an additional "quantization scale" parameter, s, for the
binomial mechanism that can be tuned to make it more closely approximate the
Gaussian mechanism and get an improved privacy/utility trade-off.
The paper defines the application of the binomial mechanism as:
    f(D) + s·(X - N·p)
where f(D) is the value that is protected and X is a sample from a Bin(N,
p) distribution. This produces a scaled and unbiased output.
The value of s is typically smaller than one, meaning that the sample noise is
effectively able to produce non-integer values. However, operating on
non-integer values in MPC is more complex, so this document uses a modified
version where the MPC computes:
    o = f(D)/s + X
For an MPC system, the output of the system is shares of this scaled and biased
value. The recipient can reconstruct an unbiased, unscaled, noised value by:

Adding the shares it receives: o = sum(o₁, o₂, …)

Correcting for bias: o - N·p

Scaling the value: f′(D) = s · (o - N·p)
The resulting value is always within N·p·s of the computed aggregate, but it
could be negative if that aggregate is smaller than N·p·s.
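The recipient's post-processing can be sketched as follows. To keep every value inside the MPC an integer, this sketch assumes that 1/s is an integer (the hypothetical parameter `inv_s`), so that f(D)/s is computed as f(D)·inv_s. The function names are illustrative and the noise is sampled in the clear.

```python
import secrets

def mpc_output(f_d: int, inv_s: int, n: int) -> int:
    """Stand-in for the revealed MPC output o = f(D)/s + X, with s = 1/inv_s."""
    x = bin(secrets.randbits(n)).count("1")  # X ~ Bin(n, 0.5)
    return f_d * inv_s + x

def recover(o: int, inv_s: int, n: int, p: float = 0.5) -> float:
    """Correct for bias, then scale: f'(D) = s * (o - N*p)."""
    return (o - n * p) / inv_s

f_d, inv_s, n = 500, 4, 1_000  # illustrative values; s = 0.25
o = mpc_output(f_d, inv_s, n)
estimate = recover(o, inv_s, n)
print(o, estimate)
```

With these values, N·p·s = 125, so the recovered estimate is always within 125 of the true aggregate of 500.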
Determining number of Bernoulli trials
Applying noise for differential privacy requires understanding the function
being computed, f(), and the private dataset, D. For an f that is a
d-dimensional query with integer outputs, the output vector is in ℤ^d. That
is, the output is a d-dimensional vector of integers.
The binomial mechanism requires understanding the sensitivity of the result
under three separate norms.
Sensitivity
Differential privacy describes sensitivity in terms of databases. In this,
databases are considered "neighboring" if the addition, removal, or sometimes
the substitution of inputs related to a single subject turns one database into
the other.
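For two neighboring databases D₁, D₂, the sensitivity of f under the p-norm is:
    sensitivity_p = max(‖f(D₁) - f(D₂)‖_p)
where the maximum is taken over all pairs of neighboring databases.
For f(D) that produces output that is a d-dimensional vector of integer
values, the p-norms of interest for use with the binomial mechanism are the L1,
L2, and L∞ (or L-infty) norms.
The L1 norm of x∊ℤ^d is:
    ‖x‖₁ = |x₁| + |x₂| + … + |x_d|
The L2 norm is:
    ‖x‖₂ = sqrt(x₁² + x₂² + … + x_d²)
Finally, the L∞ norm is:
    ‖x‖∞ = max(|x₁|, |x₂|, …, |x_d|)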
These properties of the function f() are all specific to the use case and need
to be known.
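As an illustration of the three norms, the sketch below evaluates them for the difference between the outputs of one hypothetical pair of neighboring databases. The histogram values are invented for the example. The sensitivity itself is the maximum of each norm over all neighboring pairs, which has to be derived from the use case and not from a single pair.

```python
import math

def l1_norm(x: list[int]) -> int:
    return sum(abs(v) for v in x)

def l2_norm(x: list[int]) -> float:
    return math.sqrt(sum(v * v for v in x))

def linf_norm(x: list[int]) -> int:
    return max(abs(v) for v in x)

# Hypothetical histogram over d = 4 buckets. Substituting one subject's
# input moves a single count from the second bucket to the third.
f_d1 = [10, 7, 3, 0]
f_d2 = [10, 6, 4, 0]
diff = [a - b for a, b in zip(f_d1, f_d2)]

print(l1_norm(diff), l2_norm(diff), linf_norm(diff))
```

For this pair the L1, L2, and L∞ norms of the difference are 2, sqrt(2), and 1 respectively.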
Epsilon and Delta Constraints
The privacy parameters for approximate DP are epsilon, ε, and delta, δ. These
parameters determine the bounds on privacy loss.
Epsilon may vary considerably, though is typically in the range [0.01,
10]. Often, multiple queries spend proportions of a larger epsilon privacy
budget. For example, a privacy budget of epsilon=3 might be spent in three
separate queries with epsilon 1 or as four queries with epsilon of 2, 0.1, 0.3,
and 0.6.
Delta is often fixed in the range
(2^{-29}..2^{-24}). Typically, the only constraint on delta is
to ensure that 1/delta > population; that is, the expected number of
contributions that will suffer privacy loss greater than epsilon is kept less
than one. For MPC functions that include very large numbers of inputs, delta
might need to be reduced.
Theorem 1 of the cpSGD paper gives a way to determine the fewest Bernoulli
trials, N, needed to achieve approximate DP. There are two main constraints
that need to be satisfied, which we call the delta_constraint and
epsilon_delta_constraint.
As the number of Bernoulli trials increases, each constraint monotonically
allows for smaller epsilon and delta values to be achieved. To find the smallest
number of Bernoulli trials that simultaneously satisfies both constraints, find
the minimum N determined by the delta_constraint and the minimum N determined
by the epsilon_delta_constraint and then take the maximum of these two values.
A possible approach to satisfying both constraints is to perform a binary search
over N to find the smallest one satisfying both constraints simultaneously. A
search might be acceptable as the computation only needs to be performed once
for each set of parameters. An alternative and more direct approach is
described in the following sections.
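A minimal sketch of the search approach follows. It assumes only that the combined constraint check is monotone in N (once satisfied, it stays satisfied for all larger N). The predicate passed in is a hypothetical placeholder for a real check of both constraints, and the `limit` parameter is an illustrative safety bound.

```python
from typing import Callable

def smallest_n(satisfied: Callable[[int], bool], limit: int = 1 << 40) -> int:
    """Find the smallest N >= 1 for which a monotone predicate holds."""
    hi = 1
    # Double until the predicate holds, so no upper bound is needed up front.
    while not satisfied(hi):
        hi *= 2
        if hi > limit:
            raise ValueError("no N within limit satisfies the constraints")
    lo = hi // 2  # either 0 or a value known to fail the predicate
    # Invariant: lo fails (or is 0) and hi satisfies the predicate.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if satisfied(mid):
            hi = mid
        else:
            lo = mid
    return hi

# Placeholder predicate combining two monotone conditions.
print(smallest_n(lambda n: n >= 37 and n * n >= 5000))
```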
Bounding N by delta_constraint
The delta_constraint is a function of delta, the dimension, d, the
sensitivity_infty, the quantization scale, s, and p (which is fixed to 0.5
in this document). This produces a simple formula for determining the minimum
value of N:
    N ≥ 4·max(23·ln(10·d/delta), 2·sensitivity_infty/s)
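The bound above translates directly into code. This sketch assumes p = 0.5, as in this document; the function name `min_n_delta` is illustrative.

```python
import math

def min_n_delta(delta: float, d: int, sensitivity_infty: float, s: float) -> int:
    """Smallest integer N with N >= 4 * max(23*ln(10*d/delta), 2*sens_inf/s)."""
    bound = 4 * max(23 * math.log(10 * d / delta), 2 * sensitivity_infty / s)
    return math.ceil(bound)

# A one-dimensional count with sensitivity 1 and no quantization scaling.
print(min_n_delta(delta=1e-6, d=1, sensitivity_infty=1, s=1.0))
# A small quantization scale makes the sensitivity term dominate.
print(min_n_delta(delta=1e-6, d=1, sensitivity_infty=100, s=0.25))
```

For small sensitivity the logarithmic term dominates, so this constraint alone already requires more than a thousand trials at delta = 1e-6.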
Bounding N by epsilon_delta_constraint
The epsilon_delta_constraint is a function of epsilon, delta, s, d,
sensitivity_1, sensitivity_2, sensitivity_infty, and p (0.5). It is a
more complicated formula.
For the epsilon_delta_constraint, the cpSGD paper defines some intermediate
functions of the success probability, p. For p = 0.5, these become fixed
constants:
The epsilon_delta_constraint, as written in formula (7) of the cpSGD paper,
determines what epsilon is attained by the provided N and other
parameters:
For a fixed set of values for epsilon, delta, sensitivity, and s, the value of
N is the solution of a quadratic equation in sqrt(N).
To see this, first write equation (7) with the other variables gathered into
constants c₁ and c₂:
    epsilon = c₁/sqrt(N) + c₂/N
Multiplying through by N, the formula for epsilon can then be written as a
quadratic equation in sqrt(N):
    epsilon·N - c₁·sqrt(N) - c₂ = 0
Once the values of all the other parameters are fixed, this can be solved for
sqrt(N) with the quadratic formula, taking the positive root.
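As a hedged sketch of that solution step, the code below assumes the constraint has been reduced to the form epsilon = c₁/sqrt(N) + c₂/N, with c₁ and c₂ non-negative constants already computed from delta, d, s, and the sensitivities. That form, the function names, and the example constants are assumptions made for illustration; the constants themselves must come from the referenced analysis.

```python
import math

def epsilon_for(n: int, c1: float, c2: float) -> float:
    """Epsilon attained by n trials under the assumed two-constant form."""
    return c1 / math.sqrt(n) + c2 / n

def min_n_epsilon(epsilon: float, c1: float, c2: float) -> int:
    """Smallest n with c1/sqrt(n) + c2/n <= epsilon."""
    # Substituting x = sqrt(n) gives epsilon*x^2 - c1*x - c2 = 0.
    # Only the positive root is meaningful.
    root = (c1 + math.sqrt(c1 * c1 + 4 * epsilon * c2)) / (2 * epsilon)
    n = max(1, math.ceil(root * root))
    # Guard against floating-point rounding in the closed form.
    while epsilon_for(n, c1, c2) > epsilon:
        n += 1
    return n

n = min_n_epsilon(epsilon=1.0, c1=10.0, c2=5.0)  # illustrative constants
print(n, epsilon_for(n, 10.0, 5.0))
```

The final N for the mechanism would then be the larger of this value and the value required by the delta_constraint.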
Setting the Quantization Scale
Setting the quantization scale correctly can help get the best privacy/utility
trade-offs for the mechanism. An additional equation to note is the error of
the mechanism, which we would like to minimize subject to the privacy
constraints. Taking the error to be the expected squared L2 distance between
f′(D) and f(D), with independent noise in each of the d dimensions, this is:
    error = d·s²·N·p·(1-p)
The cpSGD paper discusses more about why 0.5 is the optimal choice for p. When
it comes to setting the quantization scale s, making it smaller will decrease
the error directly but also require a larger N.
It is generally the case that making s smaller will continually decrease the
error, but at some point there is necessarily a performance constraint from the
MPC cost of how large an N is practical.
One approach to setting the scale parameter would be to first determine an upper
bound allowed for N and then set s as small as possible within that
constraint. Another approach would be to look for a point at which decreasing
s and increasing N leads to diminishing returns in reducing the error of the
mechanism.
Noise Generation Algorithm
Once the optimal number of Bernoulli trials has been determined, there are two
phases to the algorithm:

Perform a distributed coin flipping protocol so that all helpers hold secret
shares of a bit that is 1 with probability p and 0 otherwise.

Sum up these secret shared samples into a sample from a Bin(N, p).
This document uses p = 0.5, so the coin flipping protocol can use a
uniformly distributed source of entropy.
Coin Flipping and Aggregation Protocols
The use of the binomial mechanism for p = 0.5 in a concrete MPC requires a
protocol for jointly computing a number of random bits. Different systems will
have different requirements. This section describes three basic protocols that
can be used to compute the binomial distribution.
Three Party Honest Majority
A three party honest majority system is appealing because there are very
efficient protocols for performing multiplication; see .
Two protocols are described:

A binary circuit allows the coin flip to be performed without any
communication cost using PRSS. Aggregation requires the use of an
addition circuit, which requires one binary multiplication per bit.

A circuit using prime fields allows the aggregation to be performed without
communication, but the coin flip protocol, which also uses PRSS, requires a
modulus conversion operation.
Overall, the binary circuit is more efficient in terms of communication costs,
but it might be easier to integrate the prime field circuit into a system that
uses prime fields.
Three Party Binary Field Protocol
A coin flip protocol in a three party honest majority system simply samples a
random share using PRSS. The result is a three-way, replicated sharing of a
random binary value.
Aggregating these values can be performed using a binary circuit in a tree. Two
bits, a and b, are added to form a two-bit binary value, {a∧b, a⊕b}.
This process is continued pairwise. The resulting pairs, {a1, a2} and {b1,
b2}, are also added pairwise to produce a three-bit value, {a1∧b1,
a1⊕b1⊕(a2∧b2), a2⊕b2}. The high bit can be computed as a1∧b1 alone because
each pair holds a value of at most 2, so a carry from the low bits implies that
a1 and b1 are both 0; adding wider values requires a full carry chain.
Each successive iteration involves one more bit and half as many values, until a
single value with ⌊log₂(N)⌋ + 1 bits is produced.
This aggregation process requires at most 4N binary multiplications.
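The tree aggregation can be sketched on plain (unshared) bits to check the circuit logic and to count AND gates, which correspond to the binary multiplications that cost communication in MPC. This sketch uses a generic ripple-carry adder with one AND per bit position and omits the first-level shortcuts described above; the function names are illustrative. In a real system each XOR and AND would operate on replicated shares.

```python
import secrets

def add_bits(a: list[int], b: list[int], counter: list[int]) -> list[int]:
    """Ripple-carry addition of two equal-width, little-endian bit lists."""
    out, carry = [], 0
    for x, y in zip(a, b):
        out.append(x ^ y ^ carry)
        # Majority of (x, y, carry) using a single AND gate.
        carry = carry ^ ((x ^ carry) & (y ^ carry))
        counter[0] += 1
    out.append(carry)  # the result is one bit wider than the inputs
    return out

def tree_sum(bits: list[int]) -> tuple[int, int]:
    """Sum bits pairwise in a tree; return (total, number of AND gates)."""
    counter = [0]
    values = [[b] for b in bits]
    while len(values) > 1:
        if len(values) % 2:
            values.append([0] * len(values[0]))  # pad odd levels with zero
        values = [add_bits(values[i], values[i + 1], counter)
                  for i in range(0, len(values), 2)]
    total = sum(bit << i for i, bit in enumerate(values[0]))
    return total, counter[0]

coins = [secrets.randbits(1) for _ in range(1024)]
total, ands = tree_sum(coins)
print(total, ands)
```

For the power-of-two and near-power-of-two sizes tried here, the AND count stays well under the 4N bound stated above.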
Three Party Large Prime Field Protocol
Addition of values in a prime field with a modulus greater than the number of
samples (N) can be performed trivially. However, producing a replicated secret
sharing across three parties using a single bit sample from PRSS results in a
value that is uniformly distributed between 0 and 2 inclusive.
A modulus conversion operation can be used to convert that into a sharing in the
prime field. This requires two multiplications, though some parts of those
multiplications can be avoided; see "Efficient Bit-Decomposition and
Modulus-Conversion Protocols with an Honest Majority" in the Informative
References.
Three bits are sampled, one by each pair of parties. These are turned into three
shared values, where two of the shares of each value are filled with zeros. The
exclusive OR of these three values is computed using two multiplications in the
form: x⊕y = x + y - 2xy. This produces a three-way replicated sharing of a
bit in the prime field.
Shares can then be aggregated through simple addition.
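The arithmetic form of exclusive OR can be checked on plain values. The modulus below is a hypothetical choice, and the values are not secret shared; in MPC each product would be one multiplication protocol invocation, giving two multiplications for the XOR of three bits.

```python
from itertools import product

PRIME = 2**61 - 1  # hypothetical prime modulus, much larger than N

def xor_field(x: int, y: int) -> int:
    """XOR of two bits embedded in the prime field: x + y - 2xy."""
    return (x + y - 2 * x * y) % PRIME

def xor3_field(x: int, y: int, z: int) -> int:
    """XOR of three bits using two field multiplications."""
    return xor_field(xor_field(x, y), z)

for x, y, z in product((0, 1), repeat=3):
    print(x, y, z, xor3_field(x, y, z))
```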
Two Party Protocols
Obtaining multiple random bits in a two party protocol might involve the use of
an oblivious transfer protocol. Ideally, these are obtained in a large prime
field so that addition is free.
Details of OT protocol TBD.
Performance Characteristics
A binomial function is relatively inexpensive to compute in MPC.
Cost Analysis
With large epsilon and delta values (that is, low privacy) the use of the
binomial mechanism can be very efficient. However, smaller values for epsilon
or delta can require significant numbers of Bernoulli trials.
The following table shows some typical values and the resulting number of
trials, along with approximate values for the quantization scaling factor (s)
and error.
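+=========+=======+======+======+=======+
| epsilon | delta | N    | s    | error |
+=========+=======+======+======+=======+
| 3       | 10e-6 | TODO | TODO | TODO  |
+---------+-------+------+------+-------+
| 1       | 10e-6 | TODO | TODO | TODO  |
+---------+-------+------+------+-------+
| 0.1     | 10e-6 | TODO | TODO | TODO  |
+---------+-------+------+------+-------+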
Comparison with Alternative Approaches
The binomial mechanism should be compared with two other approaches:

simply having each helper party add noise independently

amplification by shuffling where
local DP is added by clients and used to get a central DP guarantee
The binomial mechanism will always give better privacy/utility trade-offs
compared to independent noise. An MPC system has to ensure that t out of P
parties can reveal their shares without degrading the privacy of outputs.
Consequently, each party needs to add 1/(P-t) of the target amount of noise, so
that the total is P/(P-t) times the target amount, assuming that noise can be
simply added. For a three party honest majority system, P is 3 and t is 1,
producing 50% more noise than is ideal. For a two party system, the amount of
noise needs to be doubled.
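The overhead of independent noise can be computed directly from P and t. This small sketch uses exact fractions; the function name is illustrative, and it assumes, as the text does, that noise amounts simply add.

```python
from fractions import Fraction

def independent_noise_factor(parties: int, corrupt: int) -> Fraction:
    """Total noise relative to the target when each party adds 1/(P-t)."""
    per_party = Fraction(1, parties - corrupt)
    return parties * per_party

print(independent_noise_factor(3, 1))  # three party honest majority
print(independent_noise_factor(2, 1))  # two party
```

A factor of 3/2 corresponds to the 50% increase for three parties, and a factor of 2 to the doubling for two parties.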
Shuffling and any scheme that makes use of noised inputs results in noise that
increases in magnitude as the number of inputs increases, which degrades
utility. The binomial mechanism does not result in any additional noise.
Security Considerations
TODO
IANA Considerations
This document has no IANA considerations.
References
Normative References
High Performance Pseudorandom Secret Sharing (PRSS)
Efficient Protocols for Binary Fields in the 3-Party Honest Majority MPC Setting
Key words for use in RFCs to Indicate Requirement Levels
Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words
Informative References
cpSGD: Communication-efficient and differentially-private distributed SGD
Verifiable Distributed Aggregation Functions
Distributed Aggregation Protocol for Privacy Preserving Measurement
The Algorithmic Foundations of Differential Privacy
Now Publishers
Efficient Bit-Decomposition and Modulus-Conversion Protocols with an Honest Majority
Springer International Publishing
Distributed Differential Privacy via Shuffling
Springer International Publishing