Contraction of Locally Differentially Private
Mechanisms
Shahab Asoodeh and Huanyu Zhang
Abstract
We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between $PK$ and $QK$, the output distributions of an $\varepsilon$-LDP mechanism $K$, in terms of a divergence between the corresponding input distributions $P$ and $Q$. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(PK\|QK)$ in terms of $\chi^2(P\|Q)$ and $\varepsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(PK\|QK)$ in terms of the total variation distance $\mathsf{TV}(P,Q)$ and $\varepsilon$. We then utilize these bounds to establish locally private versions of the van Trees inequality and of Le Cam's, Assouad's, and the mutual information methods, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to tighter privacy analyses than the state of the art in several statistical problems, such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.
I. INTRODUCTION
Local differential privacy (LDP) has now become a standard definition for individual-level privacy in machine learning. Intuitively, a randomized mechanism (i.e., a channel) is said to be locally differentially private if its output does not vary significantly with arbitrary perturbation of the input. More precisely, a mechanism is $\varepsilon$-LDP if the privacy loss random variable, defined as the log-likelihood ratio of the output for any two different inputs, is smaller than $\varepsilon$.
Since its formal introduction [37, 47], LDP has been extensively incorporated into statistical problems, e.g., the locally private mean estimation problem [2–4, 10, 14, 15, 19, 26, 32–34, 40, 41, 43, 58, 63] and the locally private distribution estimation problem [2, 7, 16, 38, 39, 45, 60, 68]. The fundamental limits of such statistical problems under LDP are typically characterized using information-theoretic frameworks such as Le Cam's, Assouad's, and Fano's methods [69]. A critical building block for sharp privacy analysis in such methods turns out to be the contraction coefficient of LDP mechanisms. The contraction coefficient $\eta_f(K)$ of a mechanism $K$ under an $f$-divergence quantifies how much the data processing inequality can be strengthened: it is the smallest $\eta \le 1$ such that $D_f(PK\|QK) \le \eta\, D_f(P\|Q)$ for all distributions $P$ and $Q$, where $PK$ denotes the output distribution of $K$ when its input is sampled from $P$.
Studying statistical problems under local privacy through the lens of contraction coefficients was initiated by Duchi et al. [33, 36], in which sharp minimax risks for locally private mean estimation problems were characterized for sufficiently small $\varepsilon$. As their main technical result, they showed that any $\varepsilon$-LDP mechanism $K$ satisfies

$$D_{\mathsf{KL}}(PK\|QK) \le \min\{4, e^{2\varepsilon}\}(e^\varepsilon - 1)^2\,\mathsf{TV}^2(P, Q), \qquad (1)$$

where $D_{\mathsf{KL}}(\cdot\|\cdot)$ and $\mathsf{TV}(\cdot,\cdot)$ denote KL-divergence and total variation distance, respectively. In light of Pinsker's inequality $2\,\mathsf{TV}^2(P, Q) \le D_{\mathsf{KL}}(P\|Q)$, this result gives an upper bound on $\eta_{\mathsf{KL}}(K)$, the contraction coefficient of $K$ under KL-divergence.
S. Asoodeh is with the Department of Computing and Software, McMaster University, Hamilton, ON L8S 1C7, Canada. Email: asoodeh@mcmaster.ca. Much of this work was completed while S.A. was a visiting research scientist at Meta's Statistics and Privacy Team. H. Zhang is with Meta Platforms, Inc., New York, NY 10003, USA. Email: huanyuzhang@meta.com.
However, thanks to the data processing inequality, this bound becomes vacuous if the coefficient in (1) is strictly bigger than $1$ (i.e., if $\varepsilon$ is not sufficiently small). More recently, Duchi and Ruan [34, Proposition 8] showed a similar upper bound for the $\chi^2$-divergence:
$$\chi^2(PK\|QK) \le 4\big(e^{\varepsilon^2} - 1\big)\,\mathsf{TV}^2(P, Q). \qquad (2)$$
By Jensen's inequality, $4\,\mathsf{TV}^2(P, Q) \le \chi^2(P\|Q)$, and thus (2) implies an upper bound on $\eta_{\chi^2}(K)$, the contraction coefficient under the $\chi^2$-divergence. Analogously, this bound is non-trivial only for sufficiently small $\varepsilon$. Similar upper bounds on the contraction coefficients under total variation distance and the hockey-stick divergence were determined in [46] and [15], respectively. Results of this nature are recurrent themes in privacy analysis in statistics and machine learning; see [2, 4–6] for other examples of such results.
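To make the vacuity point concrete, the following minimal sketch (our illustration, not part of the paper) combines (1) with Pinsker's inequality and (2) with Jensen's inequality to obtain the implied contraction factors, and solves numerically for the $\varepsilon$ beyond which each factor exceeds $1$ and the bound says nothing more than the data processing inequality.

```python
import numpy as np
from scipy.optimize import brentq

# Factor implied by (1) and Pinsker's inequality 2 TV^2 <= D_KL:
# D_KL(PK||QK) <= c1(eps) * D_KL(P||Q).
def c1(eps):
    return min(4.0, np.exp(2 * eps)) * (np.exp(eps) - 1) ** 2 / 2.0

# Factor implied by (2) and Jensen's inequality 4 TV^2 <= chi^2:
# chi^2(PK||QK) <= c2(eps) * chi^2(P||Q).
def c2(eps):
    return np.exp(eps ** 2) - 1.0

# Each bound is vacuous once its factor exceeds 1.
print(brentq(lambda e: c1(e) - 1.0, 1e-6, 2.0))  # ~0.582: (1) vacuous beyond this
print(brentq(lambda e: c2(e) - 1.0, 1e-6, 2.0))  # ~0.833: (2) vacuous beyond this
```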
In this work, we develop a framework for characterizing tight upper bounds on $D_{\mathsf{KL}}(PK\|QK)$ and $\chi^2(PK\|QK)$ for any LDP mechanism. We achieve this goal via two different approaches: (i) indirectly, by bounding $\eta_{\mathsf{KL}}(K)$ and $\eta_{\chi^2}(K)$, and (ii) directly, by deriving inequalities of the form (1) and (2) that are considerably tighter for all $\varepsilon \ge 0$. In particular, our main contributions are:
1) We obtain a sharp upper bound on $\eta_{\chi^2}(K)$ for any $\varepsilon$-LDP mechanism $K$ in Theorem 1, and show that this bound holds for a large family of divergences, including KL-divergence and squared Hellinger distance.
2) We derive upper bounds on $D_{\mathsf{KL}}(PK\|QK)$ and $\chi^2(PK\|QK)$ in terms of $\mathsf{TV}(P, Q)$ and the privacy parameter $\varepsilon$ in Theorem 2. While the upper bounds in (1) and (2) scale as $O(e^{2\varepsilon})$ and $O(e^{\varepsilon^2})$, respectively, ours scales as $O(e^\varepsilon)$, thus significantly improving over those bounds for the practical range of $\varepsilon$ (that is, $\varepsilon \ge \frac{1}{2}$).
3) We use our main results to develop a systematic framework for quantifying the cost of local privacy in several statistical problems under the "sequentially interactive" setting. Our framework enables us to improve and generalize several existing results, and also to produce new results beyond the reach of existing techniques. In particular, we study the following problems:
a) Locally private Fisher information: We show that the Fisher information matrix $I_{Z^n}(\theta)$ of parameter $\theta$ given a privatized sequence $Z^n := (Z_1, \dots, Z_n)$ of $X^n \overset{\text{iid}}{\sim} P_\theta$ satisfies $I_{Z^n}(\theta) \preceq n\big(\frac{e^\varepsilon - 1}{e^\varepsilon + 1}\big)^2 I_X(\theta)$ (Lemma 1). This result then directly leads to a private version of the van Trees inequality (Corollary 1), a classical approach for lower bounding the minimax quadratic risk. In Appendix B, we also provide a private version of the Cramér–Rao bound, provided that there exist unbiased private estimators. It is worth noting that Barnes et al. [16] recently investigated locally private Fisher information under certain assumptions regarding the regularity of $P_\theta$. More specifically, they derived various upper bounds on $\mathrm{Tr}(I_Z(\theta))$ for $\varepsilon \ge 0$ when $\nabla \log f_\theta(X)$ is either sub-exponential or sub-Gaussian, or when $\mathbb{E}[(u^T \nabla \log f_\theta(X))^2]$ is bounded for any unit vector $u \in \mathbb{R}^d$, where $f_\theta$ is the density of $P_\theta$ with respect to the Lebesgue measure. In contrast, Lemma 1 establishes a similar upper bound for small $\varepsilon$ (i.e., $\varepsilon \in [0, 1]$) but without imposing such regularity conditions.
b) Locally private Le Cam's and Assouad's methods: Following [33], we establish locally private versions of Le Cam's and Assouad's methods [48, 69] that are demonstrably stronger than those presented in [33] (Theorems 3 and 4). We then use our private Le Cam's method to study the problem of entropy estimation under LDP, where the underlying distribution is known to be supported on $\{1, \dots, k\}$ (Corollary 2). As applications of our private Assouad's method, we study two problems. First, we derive a lower bound for the $\ell_h$ minimax risk in the locally private distribution estimation problem, which improves the constants of the state-of-the-art lower bounds [68] in the special cases $h = 1$ and $h = 2$, and leads to the same order analysis as [2] for general $h \ge 1$. We also provide an upper bound by generalizing the Hadamard response [7] to the $\ell_h$-norm with $h \ge 2$, which matches the lower bound under some mild conditions. Second, we study private non-parametric density estimation when the underlying density is assumed to be Hölder continuous and derive a lower bound for the $\ell_h$ minimax risk in Corollary 4. Unlike the best existing result [22], our lower bound holds for all $\varepsilon \ge 0$.
c) Locally private mutual information method: Recently, the mutual information method [66, Section 11] has been proposed as a more flexible information-theoretic technique for bounding the minimax risk. We invoke Theorem 1 to provide (for the first time) a locally private version of the mutual information bound in Theorem 5. To demonstrate the flexibility of this result, we consider the Gaussian location model, where the goal is to privately estimate $\theta \in \Theta$ from $X^n \overset{\text{iid}}{\sim} \mathcal{N}(\theta, \sigma^2 I_d)$. Most existing results (e.g., [16, 32–34]) assume the $\ell_2$-norm as the loss and the unit $\ell_\infty$-ball or unit $\ell_2$-ball as $\Theta$. However, our result presented in Corollary 5 holds for arbitrary loss functions and arbitrary sets $\Theta$ (e.g., the $\ell_h$-ball for any $h \ge 1$).
d) Locally private hypothesis testing: Given $n$ i.i.d. samples and two distributions $P$ and $Q$, we derive upper and lower bounds for $\mathsf{SC}^{P,Q}_\varepsilon$, the sample complexity of privately determining which distribution generated the samples. More precisely, we show in Lemma 2 that in the sequentially interactive (in fact, in the more general fully interactive) setting,

$$\mathsf{SC}^{P,Q}_\varepsilon \gtrsim \frac{e^\varepsilon}{(e^\varepsilon - 1)^2} \max\Big\{\frac{1}{\mathsf{TV}^2(P, Q)},\ \frac{e^\varepsilon}{H^2(P, Q)}\Big\} \quad \text{and} \quad \mathsf{SC}^{P,Q}_\varepsilon \lesssim \frac{e^{2\varepsilon}}{(e^\varepsilon - 1)^2} \cdot \frac{1}{\mathsf{TV}^2(P, Q)}$$

for any $\varepsilon \ge 0$, where $H^2(P, Q)$ is the squared Hellinger distance between $P$ and $Q$ (a numerical illustration of these bounds is sketched after this list). These bounds subsume and generalize the best existing result in [33], which indicates $\mathsf{SC}^{P,Q}_\varepsilon = \Theta\big(\frac{1}{\varepsilon^2 \mathsf{TV}^2(P, Q)}\big)$ for sufficiently small $\varepsilon$. Furthermore, they have recently been shown in [52, Theorem 1.6] to be optimal (up to a constant factor) for any $\varepsilon \ge 0$ if $P$ and $Q$ are binary. This, in fact, implies that (sequential or full) interaction does not help in the locally private hypothesis testing problem if $P$ and $Q$ are binary or if $\varepsilon \le 1$. Therefore, our results extend [44, Theorem 5.3], which indicates that the optimal mechanism is non-interactive for $\varepsilon \le 1$.
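As a rough numerical companion to Lemma 2 (our own sketch; constants are omitted exactly as in the lemma, so only orders of magnitude are meaningful), the bounds above can be evaluated for a concrete pair of distributions:

```python
import numpy as np

def tv(p, q):
    # Total variation distance between finite distributions.
    return 0.5 * np.abs(p - q).sum()

def hellinger_sq(p, q):
    # Squared Hellinger distance H^2(P, Q).
    return np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

def sc_bounds(p, q, eps):
    """Order-level lower/upper bounds on the eps-LDP sample complexity
    from Lemma 2, with all constants omitted."""
    t2, h2 = tv(p, q) ** 2, hellinger_sq(p, q)
    lower = np.exp(eps) / (np.exp(eps) - 1) ** 2 * max(1 / t2, np.exp(eps) / h2)
    upper = np.exp(2 * eps) / (np.exp(eps) - 1) ** 2 / t2
    return lower, upper

p, q = np.array([0.5, 0.5]), np.array([0.8, 0.2])
for eps in (0.1, 1.0, 5.0):
    lo, up = sc_bounds(p, q, eps)
    print(f"eps={eps}: {lo:.1f} <~ SC <~ {up:.1f}")
```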
| Problem | UB | Previous LB | LB (this work) |
| --- | --- | --- | --- |
| Entropy estimation | N.A. | N.A. | $\min\big\{1,\ \tfrac{1}{n}\big(\tfrac{e^\varepsilon+1}{e^\varepsilon-1}\big)^2\log^2 k\big\}$ (Corollary 2) |
| Distribution estimation, $\ell_h$-norm | $\tfrac{e^{\varepsilon(1-1/h)}(e^\varepsilon+d)^{1/h}}{\sqrt{n}\,(e^\varepsilon-1)}$ (Theorem 6) | — | $\min\big\{1,\ \tfrac{e^{\varepsilon/2}d^{1/h}}{\sqrt{n}(e^\varepsilon-1)},\ \big[\tfrac{e^{\varepsilon/2}}{\sqrt{n}(e^\varepsilon-1)}\big]^{1-1/h}\big\}$ (Corollary 3), [2] |
| Density estimation, $\ell_h$-norm, $\beta$-Hölder | N.A. | $(n\varepsilon^2)^{-\frac{2\beta}{2\beta+2}}$ for $\varepsilon \le 1$ [22] | $\big(n e^{-\varepsilon}(e^\varepsilon-1)^2\big)^{-\frac{2\beta}{2\beta+2}}$ (Corollary 4) |
| Gaussian location model, arbitrary loss | N.A. | N.A. | $\tfrac{d}{e^2(V_d\,\Gamma(1+d))^{1/d}}\min\big\{1,\ \sqrt{\tfrac{\sigma^2 d}{n}\cdot\tfrac{e^\varepsilon+1}{e^\varepsilon-1}}\big\}$ (Corollary 5) |
| Sample complexity of hypothesis testing | $\tfrac{e^{2\varepsilon}}{(e^\varepsilon-1)^2}\cdot\tfrac{1}{\mathsf{TV}^2(P,Q)}$ (Lemma 2) | $\tfrac{1}{\varepsilon^2\mathsf{TV}^2(P,Q)}$ for $\varepsilon \le 1$ [25] | $\tfrac{e^\varepsilon}{(e^\varepsilon-1)^2}\max\big\{\tfrac{1}{\mathsf{TV}^2(P,Q)},\ \tfrac{e^\varepsilon}{H^2(P,Q)}\big\}$ (Lemma 2) |

TABLE I. Summary of the minimax risks for $\varepsilon$-LDP estimation, where we have omitted constants for all the results. For distribution estimation with the $\ell_h$-norm, our upper bound, built on the Hadamard response mechanism discussed in Appendix D, is order optimal in $n$ and $d$ for the dense case unless $\varepsilon \gtrsim \log d$. For the Gaussian location model, we consider the problem of privately estimating $\theta \in \Theta$ from $X^n \overset{\text{iid}}{\sim} \mathcal{N}(\theta, \sigma^2 I_d)$. The result shown in this table assumes that $\Theta$ is the unit $\ell_2$-ball, and $V_d$ is the volume of the unit $\|\cdot\|$-ball (for an arbitrary norm). Corollary 5, however, concerns the general $\Theta$.
A. Additional Related Work
Local privacy is arguably one of the oldest forms of privacy in the statistics literature and dates back to Warner [64]. This definition resurfaced in [37] and was adopted in the context of differential privacy as its local version. The study of statistical efficiency under LDP was initiated in [33, 36] in the minimax setting and has since gained considerable attention. While the original bounds on the private minimax risk in [33, 36] were meaningful only in the high-privacy regime (i.e., small $\varepsilon$), order-optimal bounds were recently given for several estimation problems in [32] for the general privacy regime. Interestingly, their technique relies on the decay rate of mutual information over a Markov chain, which is known to be equivalent to the contraction coefficient under KL-divergence [13]. However, their technique is quite different from ours in that it does not involve computing the contraction coefficient of an LDP mechanism.
Among the locally private statistical problems studied in the literature, two examples have received considerably more attention, namely mean estimation and discrete distribution estimation. For the first problem, Duchi et al. [36] used (1) to develop asymptotically optimal procedures for estimating the mean in the high-privacy regime (i.e., $\varepsilon < 1$). For the low-privacy regime (i.e., $\varepsilon > 1$), a new algorithm was proposed in [19] that is optimal and matches the lower bound derived in [32] for interactive mechanisms. There has been further work on locally private mean estimation that studies the problem under additional constraints [2, 3, 10, 14–16, 39, 41, 43, 58, 63].
For the second problem, Duchi et al. [33] studied the (non-interactive) locally private distribution estimation problem under $\ell_1$ and $\ell_2$ loss functions and derived the first lower bound for the minimax risk, which was shown to be optimal [45] in the high-privacy regime. Follow-up works such as [2, 16, 38, 60, 68] characterized the optimal minimax rates for general $\varepsilon$. Recently, [2] derived a lower bound for the $\ell_h$ loss with $h \ge 1$.
The problem of locally private entropy estimation has received significantly less attention in the literature, despite the vast line of research on its non-private counterpart. The only related works in this area appear to be [21, 23], which studied estimating the Rényi entropy of order $\lambda$ and derived optimal rates only when $\lambda > 2$. Thus, the optimal private minimax rate appears to remain open. We remark that [8] explicitly considered the problem of entropy estimation, but in the setting of central differential privacy.
The closest works to ours are [15, 70], which extensively studied the contraction coefficient of LDP mechanisms under the hockey-stick divergence. More specifically, it was shown in [15] that $K$ is $\varepsilon$-LDP if and only if $E_{e^\varepsilon}(PK\|QK)$, the hockey-stick divergence between $PK$ and $QK$, is equal to zero for all distributions $P$ and $Q$, and thus if and only if the contraction coefficient of $K$ under the hockey-stick divergence is zero. By representing the $\chi^2$-divergence in terms of the hockey-stick divergence, this result leads to a conceptually similar, albeit weaker, result than Theorem 2.
In [9], Acharya et al. introduced an information-theoretic toolbox for establishing lower bounds in private estimation problems. However, they considered the threat model of central differential privacy, an entirely different model from the local differential privacy considered in this work.
B. Notation
We use upper-case letters (e.g., $X$) to denote random variables and calligraphic letters to represent their support sets (e.g., $\mathcal{X}$). We write $X^n$ to denote the $n$ random variables $X_1, \dots, X_n$. The set of all distributions on $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. A mechanism (or channel) $K: \mathcal{X} \to \mathcal{P}(\mathcal{Z})$ is specified by a collection of distributions $\{K(\cdot|x) \in \mathcal{P}(\mathcal{Z}) : x \in \mathcal{X}\}$. Given such a mechanism $K$ and $P \in \mathcal{P}(\mathcal{X})$, we denote by $PK$ the output distribution of $K$ when the input is distributed according to $P$, given by $PK(A) := \int P(\mathrm{d}x)\, K(A|x)$ for $A \subset \mathcal{Z}$. We use $\mathbb{E}_P[\cdot]$ to denote expectation with respect to $P$, and $[n]$ for an integer $n \ge 1$ to denote $\{1, \dots, n\}$.
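For finite alphabets, $PK$ is simply a vector–matrix product; the following minimal sketch (our illustration, with arbitrary example numbers) makes the notation concrete.

```python
import numpy as np

# A mechanism K on finite alphabets is a row-stochastic matrix
# with K[x, z] = K(z | x).
K = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

P = np.array([0.5, 0.3, 0.2])  # input distribution on X = [3]

PK = P @ K                     # output distribution: PK(z) = sum_x P(x) K(z|x)
print(PK, PK.sum())            # a valid distribution on Z (sums to 1)
```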
II. PRELIMINARIES AND DEFINITIONS
In this section, we give basic definitions of $f$-divergences, contraction coefficients, and LDP mechanisms.
$f$-Divergences and Contraction Coefficients. Given a convex function $f: (0, \infty) \to \mathbb{R}$ such that $f(1) = 0$, the $f$-divergence between two probability measures $P \ll Q$ is defined as [12, 31]

$$D_f(P\|Q) := \mathbb{E}_Q\Big[f\Big(\frac{\mathrm{d}P}{\mathrm{d}Q}\Big)\Big].$$

Examples of $f$-divergences needed in the subsequent sections include:

- KL-divergence: $D_{\mathsf{KL}}(P\|Q) := D_f(P\|Q)$ for $f(t) = t \log t$;
- total variation distance: $\mathsf{TV}(P, Q) := D_f(P\|Q)$ for $f(t) = \frac{1}{2}|t - 1|$;
- $\chi^2$-divergence: $\chi^2(P\|Q) := D_f(P\|Q)$ for $f(t) = t^2 - 1$;
- squared Hellinger distance: $H^2(P, Q) := D_f(P\|Q)$ for $f(t) = (1 - \sqrt{t})^2$; and
- hockey-stick divergence (a.k.a. $E_\gamma$-divergence [59]): $E_\gamma(P\|Q) := D_f(P\|Q)$ for $f(t) = (t - \gamma)_+$ for some $\gamma \ge 1$, where $(a)_+ := \max\{a, 0\}$.
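For finite alphabets, each of these divergences is a one-liner from the definition $D_f(P\|Q) = \mathbb{E}_Q[f(\mathrm{d}P/\mathrm{d}Q)]$. The sketch below (our illustration, assuming $Q > 0$ pointwise to sidestep support issues) implements all five.

```python
import numpy as np
from scipy.special import xlogy  # xlogy(t, t) = t*log(t), with value 0 at t = 0

def f_divergence(p, q, f):
    """D_f(P||Q) = E_Q[f(dP/dQ)] for finite distributions with q > 0."""
    t = p / q
    return float(np.sum(q * f(t)))

def kl(p, q):            # f(t) = t log t
    return f_divergence(p, q, lambda t: xlogy(t, t))

def tv(p, q):            # f(t) = |t - 1| / 2
    return f_divergence(p, q, lambda t: 0.5 * np.abs(t - 1))

def chi2(p, q):          # f(t) = t^2 - 1
    return f_divergence(p, q, lambda t: t ** 2 - 1)

def hellinger_sq(p, q):  # f(t) = (1 - sqrt(t))^2
    return f_divergence(p, q, lambda t: (1 - np.sqrt(t)) ** 2)

def hockey_stick(p, q, gamma):  # f(t) = (t - gamma)_+
    return f_divergence(p, q, lambda t: np.maximum(t - gamma, 0.0))

p, q = np.array([0.6, 0.4]), np.array([0.5, 0.5])
print(kl(p, q), tv(p, q), chi2(p, q), hellinger_sq(p, q), hockey_stick(p, q, 1.0))
# Sanity check: E_1(P||Q) coincides with TV(P, Q).
```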
All $f$-divergences are known to satisfy the data-processing inequality; that is, for any channel $K: \mathcal{X} \to \mathcal{P}(\mathcal{Z})$, we have $D_f(PK\|QK) \le D_f(P\|Q)$ for any pair of distributions $(P, Q)$. However, this inequality is typically strict. One way to strengthen it is to consider $\eta_f(K)$, the contraction coefficient of $K$ under the $f$-divergence [11], defined as

$$\eta_f(K) := \sup_{\substack{P, Q \in \mathcal{P}(\mathcal{X}):\\ D_f(P\|Q) \ne 0}} \frac{D_f(PK\|QK)}{D_f(P\|Q)}. \qquad (3)$$

With this definition at hand, we can write $D_f(PK\|QK) \le \eta_f(K)\, D_f(P\|Q)$, which is typically referred to as the strong data processing inequality. We will study in detail the contraction coefficients under KL-divergence, $\chi^2$-divergence, squared Hellinger distance, and total variation distance, denoted by $\eta_{\mathsf{KL}}(K)$, $\eta_{\chi^2}(K)$, $\eta_{H^2}(K)$, and $\eta_{\mathsf{TV}}(K)$, respectively, in the next section. We also need the following well-known fact about $\eta_{\mathsf{KL}}(K)$ [13]:

$$\eta_{\mathsf{KL}}(K) = \sup_{\substack{P_{XU}:\\ U - X - Z}} \frac{I(U; Z)}{I(U; X)}, \qquad (4)$$

where $K$ is the channel specifying $P_{Z|X}$, $I(A; B) := D_{\mathsf{KL}}(P_{AB}\|P_A P_B)$ is the mutual information between two random variables $A$ and $B$, and $U - X - Z$ denotes a Markov chain in that order. Another important property of $\eta_{\mathsf{KL}}$ required in the proofs is its tensorization, which is described in Appendix A.
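Of these coefficients, $\eta_{\mathsf{TV}}(K)$ is the easiest to compute for finite channels: by Dobrushin's classical result it equals $\max_{x, x'} \mathsf{TV}(K(\cdot|x), K(\cdot|x'))$, and it upper bounds $\eta_f(K)$ for every $f$-divergence. The sketch below (our illustration) computes it for a random channel and spot-checks the strong data processing inequality under total variation.

```python
import numpy as np

def tv(p, q):
    return 0.5 * np.abs(p - q).sum()

def eta_tv(K):
    """Dobrushin's coefficient: eta_TV(K) = max_{x, x'} TV(K(.|x), K(.|x'))."""
    k = K.shape[0]
    return max(tv(K[i], K[j]) for i in range(k) for j in range(i + 1, k))

rng = np.random.default_rng(0)
K = rng.dirichlet(np.ones(3), size=3)  # random 3x3 row-stochastic channel
eta = eta_tv(K)

# Spot-check the SDPI: TV(PK, QK) <= eta_TV(K) * TV(P, Q) on random pairs.
for _ in range(1000):
    P, Q = rng.dirichlet(np.ones(3)), rng.dirichlet(np.ones(3))
    assert tv(P @ K, Q @ K) <= eta * tv(P, Q) + 1e-12
print(f"eta_TV(K) = {eta:.4f}; SDPI verified on 1000 random pairs")
```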
Local Differential Privacy. A randomized mechanism $K: \mathcal{X} \to \mathcal{P}(\mathcal{Z})$ is said to be $\varepsilon$-locally differentially private ($\varepsilon$-LDP for short) for $\varepsilon \ge 0$ if [37, 47]

$$K(A|x) \le e^\varepsilon K(A|x'),$$

for all $A \subset \mathcal{Z}$ and $x, x' \in \mathcal{X}$. Let $\mathcal{Q}_\varepsilon$ be the collection of all $\varepsilon$-LDP mechanisms $K$. It can be shown that LDP mechanisms can be equivalently defined in terms of the hockey-stick divergence:

$$K \in \mathcal{Q}_\varepsilon \iff E_{e^\varepsilon}(K(\cdot|x)\|K(\cdot|x')) = 0, \quad \forall x, x' \in \mathcal{X}. \qquad (5)$$
Arguably, the best-known LDP mechanism is the binary randomized response mechanism, introduced by Warner [64]. For $\mathcal{X} = \{0, 1\}$, let the mechanism $K$ be defined as $K(\cdot|1) = \mathrm{Bernoulli}(\kappa)$ and $K(\cdot|0) = \mathrm{Bernoulli}(1 - \kappa)$. It can be easily verified that this mechanism is $\varepsilon$-LDP if $\kappa = \frac{e^\varepsilon}{1 + e^\varepsilon}$. In information-theoretic parlance, the binary randomized response mechanism is a binary symmetric channel with crossover probability $\frac{1}{1 + e^\varepsilon}$. A natural way to generalize this mechanism to non-binary alphabets is as follows.
Example 1 ($k$-ary randomized response). Let $\mathcal{X} = \mathcal{Z} = [k]$, and let the mechanism $K$ be defined as

$$K(z|x) = \begin{cases} \dfrac{e^\varepsilon}{e^\varepsilon + k - 1}, & \text{if } z = x,\\[4pt] \dfrac{1}{e^\varepsilon + k - 1}, & \text{otherwise.} \end{cases} \qquad (6)$$

It can be verified that $E_{e^\varepsilon}(K(\cdot|x)\|K(\cdot|x')) = 0$ for all $x, x' \in [k]$, implying this mechanism is $\varepsilon$-LDP.
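A minimal sketch (ours, not from the paper) builds (6) as a matrix and verifies $\varepsilon$-LDP both via the likelihood-ratio definition and via the hockey-stick criterion (5); taking $k = 2$ recovers Warner's binary randomized response.

```python
import numpy as np

def k_rr(k, eps):
    """k-ary randomized response (6) as a k x k row-stochastic matrix."""
    K = np.full((k, k), 1.0 / (np.exp(eps) + k - 1))
    np.fill_diagonal(K, np.exp(eps) / (np.exp(eps) + k - 1))
    return K

def is_ldp(K, eps, tol=1e-12):
    # eps-LDP iff K(z|x) <= e^eps K(z|x') for all z, x, x', i.e., the
    # largest log-likelihood ratio across input pairs is at most eps.
    ratios = np.log(K[:, None, :] / K[None, :, :])  # indexed [x, x', z]
    return ratios.max() <= eps + tol

def hockey_stick(p, q, gamma):
    # E_gamma(P||Q) = sum_z (p(z) - gamma * q(z))_+ for finite alphabets.
    return np.maximum(p - gamma * q, 0.0).sum()

eps, k = 1.0, 4
K = k_rr(k, eps)
print(is_ldp(K, eps))  # True
# Criterion (5): E_{e^eps}(K(.|x) || K(.|x')) = 0 for every pair x != x'.
print(all(hockey_stick(K[i], K[j], np.exp(eps)) < 1e-12
          for i in range(k) for j in range(k) if i != j))  # True
```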
Suppose there are $n$ users, each in possession of a sample $X_i$, $i \in [n] := \{1, \dots, n\}$. User $i$ applies a mechanism $K_i$ to generate $Z_i$, a privatized version of $X_i$. The collection of such mechanisms is said to be sequentially interactive if each $K_i$ may depend on the previously released outputs $Z_1, \dots, Z_{i-1}$.