Contraction of Locally Differentially Private
Mechanisms
Shahab Asoodeh and Huanyu Zhang
Abstract
We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between $PK$ and $QK$, the output distributions of an $\varepsilon$-LDP mechanism $K$, in terms of a divergence between the corresponding input distributions $P$ and $Q$. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(PK\|QK)$ in terms of $\chi^2(P\|Q)$ and $\varepsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(PK\|QK)$ in terms of the total variation distance $\mathsf{TV}(P,Q)$ and $\varepsilon$. We then utilize these bounds to establish locally private versions of the van Trees inequality and of Le Cam's, Assouad's, and the mutual information methods, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to tighter privacy analyses than the state of the art in several statistical problems, such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.
I. INTRODUCTION
Local differential privacy (LDP) has now become a standard definition for individual-level privacy in machine learning. Intuitively, a randomized mechanism (i.e., a channel) is said to be locally differentially private if its output does not vary significantly with arbitrary perturbation of the input. More precisely, a mechanism is $\varepsilon$-LDP if the privacy loss random variable, defined as the log-likelihood ratio of the output for any two different inputs, is smaller than $\varepsilon$.
Since its formal introduction [37, 47], LDP has been extensively incorporated into statistical problems, e.g., the locally private mean estimation problem [2–4, 10, 14, 15, 19, 26, 32–34, 40, 41, 43, 58, 63] and the locally private distribution estimation problem [2, 7, 16, 38, 39, 45, 60, 68]. The fundamental limits of such statistical problems under LDP are typically characterized using information-theoretic frameworks such as Le Cam's, Assouad's, and Fano's methods [69]. A critical building block for sharp privacy analysis in such methods turns out to be the contraction coefficient of LDP mechanisms. The contraction coefficient $\eta_f(K)$ of a mechanism $K$ under an $f$-divergence quantifies how much the data processing inequality can be strengthened: it is the smallest $\eta \le 1$ such that $D_f(PK\|QK) \le \eta\, D_f(P\|Q)$ for all distributions $P$ and $Q$, where $PK$ denotes the output distribution of $K$ when its input is sampled from $P$.
Studying statistical problems under local privacy through the lens of contraction coefficients was initiated by Duchi et al. [33, 36], in which sharp minimax risks for locally private mean estimation problems were characterized for sufficiently small $\varepsilon$. As their main technical result, they showed that any $\varepsilon$-LDP mechanism $K$ satisfies

$$D_{\mathsf{KL}}(PK\|QK) \le \min\{4, e^{2\varepsilon}\}(e^\varepsilon - 1)^2\,\mathsf{TV}^2(P, Q), \qquad (1)$$

where $D_{\mathsf{KL}}(\cdot\|\cdot)$ and $\mathsf{TV}(\cdot,\cdot)$ denote KL-divergence and total variation distance, respectively. In light of Pinsker's inequality $2\,\mathsf{TV}^2(P, Q) \le D_{\mathsf{KL}}(P\|Q)$, this result gives an upper bound on $\eta_{\mathsf{KL}}(K)$, the contraction coefficient of $K$ under KL-divergence.
S. Asoodeh is with the Department of Computing and Software, McMaster University, Hamilton, ON L8S 1C7, Canada. Email: asoodeh@mcmaster.ca. Much of this work was completed while S.A. was a visiting research scientist at Meta's Statistics and Privacy Team. H. Zhang is with Meta Platforms, Inc., New York, NY 10003, USA. Email: huanyuzhang@meta.com.
However, thanks to the data processing inequality, this bound becomes vacuous if the coefficient in (1) is strictly bigger than $1$ (i.e., if $\varepsilon$ is not sufficiently small). More recently, Duchi and Ruan [34, Proposition 8] showed a similar upper bound for the $\chi^2$-divergence:
$$\chi^2(PK\|QK) \le 4\big(e^{\varepsilon^2} - 1\big)\,\mathsf{TV}^2(P, Q). \qquad (2)$$
By Jensen's inequality, $4\,\mathsf{TV}^2(P, Q) \le \chi^2(P\|Q)$, and thus (2) implies an upper bound on $\eta_{\chi^2}(K)$, the contraction coefficient under the $\chi^2$-divergence. Analogously, this bound is non-trivial only for sufficiently small $\varepsilon$. Similar upper bounds on the contraction coefficients under total variation distance and the hockey-stick divergence were determined in [46] and [15], respectively. Results of this nature are recurrent themes in privacy analysis in statistics and machine learning; see [2, 4–6] for other examples of such results.
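To make the vacuity point concrete, the following minimal sketch (our illustration, not part of the paper) combines (1) with Pinsker's inequality and (2) with Jensen's inequality to obtain the implied contraction factors, and solves numerically for the $\varepsilon$ beyond which each factor exceeds $1$ and the bound says nothing more than the data processing inequality.

```python
import numpy as np
from scipy.optimize import brentq

# Factor implied by (1) and Pinsker's inequality 2 TV^2 <= D_KL:
# D_KL(PK||QK) <= c1(eps) * D_KL(P||Q).
def c1(eps):
    return min(4.0, np.exp(2 * eps)) * (np.exp(eps) - 1) ** 2 / 2.0

# Factor implied by (2) and Jensen's inequality 4 TV^2 <= chi^2:
# chi^2(PK||QK) <= c2(eps) * chi^2(P||Q).
def c2(eps):
    return np.exp(eps ** 2) - 1.0

# Each bound is vacuous once its factor exceeds 1.
print(brentq(lambda e: c1(e) - 1.0, 1e-6, 2.0))  # ~0.582: (1) vacuous beyond this
print(brentq(lambda e: c2(e) - 1.0, 1e-6, 2.0))  # ~0.833: (2) vacuous beyond this
```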
In this work, we develop a framework for characterizing tight upper bounds on $D_{\mathsf{KL}}(PK\|QK)$ and $\chi^2(PK\|QK)$ for any LDP mechanism. We achieve this goal via two different approaches: (i) indirectly, by bounding $\eta_{\mathsf{KL}}(K)$ and $\eta_{\chi^2}(K)$, and (ii) directly, by deriving inequalities of the form (1) and (2) that are considerably tighter for all $\varepsilon \ge 0$. In particular, our main contributions are:
1) We obtain a sharp upper bound on $\eta_{\chi^2}(K)$ for any $\varepsilon$-LDP mechanism $K$ in Theorem 1, and show that this bound holds for a large family of divergences, including KL-divergence and squared Hellinger distance.
2) We derive upper bounds on $D_{\mathsf{KL}}(PK\|QK)$ and $\chi^2(PK\|QK)$ in terms of $\mathsf{TV}(P, Q)$ and the privacy parameter $\varepsilon$ in Theorem 2. While the upper bounds in (1) and (2) scale as $O(e^{2\varepsilon})$ and $O(e^{\varepsilon^2})$, respectively, ours scales as $O(e^\varepsilon)$, thus significantly improving over those bounds for the practical range of $\varepsilon$ (that is, $\varepsilon \ge \frac{1}{2}$).
3) We use our main results to develop a systematic framework for quantifying the cost of local privacy in several statistical problems under the "sequentially interactive" setting. Our framework enables us to improve and generalize several existing results, and also to produce new results beyond the reach of existing techniques. In particular, we study the following problems:
a) Locally private Fisher information: We show that the Fisher information matrix $I_{Z^n}(\theta)$ of parameter $\theta$ given a privatized sequence $Z^n := (Z_1, \dots, Z_n)$ of $X^n \overset{\text{iid}}{\sim} P_\theta$ satisfies $I_{Z^n}(\theta) \preceq n\big(\frac{e^\varepsilon - 1}{e^\varepsilon + 1}\big)^2 I_X(\theta)$ (Lemma 1). This result then directly leads to a private version of the van Trees inequality (Corollary 1), a classical approach for lower bounding the minimax quadratic risk. In Appendix B, we also provide a private version of the Cramér–Rao bound, provided that there exist unbiased private estimators. It is worth noting that Barnes et al. [16] recently investigated locally private Fisher information under certain assumptions regarding the regularity of $P_\theta$. More specifically, they derived various upper bounds on $\mathrm{Tr}(I_Z(\theta))$ for $\varepsilon \ge 0$ when $\nabla \log f_\theta(X)$ is either sub-exponential or sub-Gaussian, or when $\mathbb{E}[(u^T \nabla \log f_\theta(X))^2]$ is bounded for any unit vector $u \in \mathbb{R}^d$, where $f_\theta$ is the density of $P_\theta$ with respect to the Lebesgue measure. In contrast, Lemma 1 establishes a similar upper bound for small $\varepsilon$ (i.e., $\varepsilon \in [0, 1]$) but without imposing such regularity conditions.
b) Locally private Le Cam's and Assouad's methods: Following [33], we establish locally private versions of Le Cam's and Assouad's methods [48, 69] that are demonstrably stronger than those presented in [33] (Theorems 3 and 4). We then use our private Le Cam's method to study the problem of entropy estimation under LDP, where the underlying distribution is known to be supported on $\{1, \dots, k\}$ (Corollary 2). As applications of our private Assouad's method, we study two problems. First, we derive a lower bound for the $\ell_h$ minimax risk in the locally private distribution estimation problem, which improves the constants of the state-of-the-art lower bounds [68] in the special cases $h = 1$ and $h = 2$, and leads to the same order analysis as [2] for general $h \ge 1$. We also provide an upper bound by generalizing the Hadamard response [7] to the $\ell_h$-norm with $h \ge 2$, which matches the lower bound under some mild conditions. Second, we study private non-parametric density estimation when the underlying density is assumed to be Hölder continuous and derive a lower bound for the $\ell_h$ minimax risk in Corollary 4. Unlike the best existing result [22], our lower bound holds for all $\varepsilon \ge 0$.
c) Locally private mutual information method: Recently, the mutual information method [66, Section 11] has been proposed as a more flexible information-theoretic technique for bounding the minimax risk. We invoke Theorem 1 to provide (for the first time) a locally private version of the mutual information bound in Theorem 5. To demonstrate the flexibility of this result, we consider the Gaussian location model, where the goal is to privately estimate $\theta \in \Theta$ from $X^n \overset{\text{iid}}{\sim} \mathcal{N}(\theta, \sigma^2 I_d)$. Most existing results (e.g., [16, 32–34]) assume the $\ell_2$-norm as the loss and the unit $\ell_\infty$-ball or unit $\ell_2$-ball as $\Theta$. However, our result presented in Corollary 5 holds for arbitrary loss functions and arbitrary sets $\Theta$ (e.g., the $\ell_h$-ball for any $h \ge 1$).
d) Locally private hypothesis testing: Given $n$ i.i.d. samples and two distributions $P$ and $Q$, we derive upper and lower bounds for $\mathsf{SC}^{P,Q}_\varepsilon$, the sample complexity of privately determining which distribution generated the samples. More precisely, we show in Lemma 2 that in the sequentially interactive (in fact, in the more general fully interactive) setting,

$$\mathsf{SC}^{P,Q}_\varepsilon \gtrsim \frac{e^\varepsilon}{(e^\varepsilon - 1)^2} \max\Big\{\frac{1}{\mathsf{TV}^2(P, Q)},\ \frac{e^\varepsilon}{H^2(P, Q)}\Big\} \quad \text{and} \quad \mathsf{SC}^{P,Q}_\varepsilon \lesssim \frac{e^{2\varepsilon}}{(e^\varepsilon - 1)^2} \cdot \frac{1}{\mathsf{TV}^2(P, Q)}$$

for any $\varepsilon \ge 0$, where $H^2(P, Q)$ is the squared Hellinger distance between $P$ and $Q$ (a numerical illustration of these bounds is sketched after this list). These bounds subsume and generalize the best existing result in [33], which indicates $\mathsf{SC}^{P,Q}_\varepsilon = \Theta\big(\frac{1}{\varepsilon^2 \mathsf{TV}^2(P, Q)}\big)$ for sufficiently small $\varepsilon$. Furthermore, they have recently been shown in [52, Theorem 1.6] to be optimal (up to a constant factor) for any $\varepsilon \ge 0$ if $P$ and $Q$ are binary. This, in fact, implies that (sequential or full) interaction does not help in the locally private hypothesis testing problem if $P$ and $Q$ are binary or if $\varepsilon \le 1$. Therefore, our results extend [44, Theorem 5.3], which indicates that the optimal mechanism is non-interactive for $\varepsilon \le 1$.
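As a rough numerical companion to Lemma 2 (our own sketch; constants are omitted exactly as in the lemma, so only orders of magnitude are meaningful), the bounds above can be evaluated for a concrete pair of distributions:

```python
import numpy as np

def tv(p, q):
    # Total variation distance between finite distributions.
    return 0.5 * np.abs(p - q).sum()

def hellinger_sq(p, q):
    # Squared Hellinger distance H^2(P, Q).
    return np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

def sc_bounds(p, q, eps):
    """Order-level lower/upper bounds on the eps-LDP sample complexity
    from Lemma 2, with all constants omitted."""
    t2, h2 = tv(p, q) ** 2, hellinger_sq(p, q)
    lower = np.exp(eps) / (np.exp(eps) - 1) ** 2 * max(1 / t2, np.exp(eps) / h2)
    upper = np.exp(2 * eps) / (np.exp(eps) - 1) ** 2 / t2
    return lower, upper

p, q = np.array([0.5, 0.5]), np.array([0.8, 0.2])
for eps in (0.1, 1.0, 5.0):
    lo, up = sc_bounds(p, q, eps)
    print(f"eps={eps}: {lo:.1f} <~ SC <~ {up:.1f}")
```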
| Problem | UB | Previous LB | LB (this work) |
| --- | --- | --- | --- |
| Entropy estimation | N.A. | N.A. | $\min\big\{1,\ \tfrac{1}{n}\big(\tfrac{e^\varepsilon+1}{e^\varepsilon-1}\big)^2\log^2 k\big\}$ (Corollary 2) |
| Distribution estimation, $\ell_h$-norm | $\tfrac{e^{\varepsilon(1-1/h)}(e^\varepsilon+d)^{1/h}}{\sqrt{n}\,(e^\varepsilon-1)}$ (Theorem 6) | — | $\min\big\{1,\ \tfrac{e^{\varepsilon/2}d^{1/h}}{\sqrt{n}(e^\varepsilon-1)},\ \big[\tfrac{e^{\varepsilon/2}}{\sqrt{n}(e^\varepsilon-1)}\big]^{1-1/h}\big\}$ (Corollary 3), [2] |
| Density estimation, $\ell_h$-norm, $\beta$-Hölder | N.A. | $(n\varepsilon^2)^{-\frac{2\beta}{2\beta+2}}$ for $\varepsilon \le 1$ [22] | $\big(n e^{-\varepsilon}(e^\varepsilon-1)^2\big)^{-\frac{2\beta}{2\beta+2}}$ (Corollary 4) |
| Gaussian location model, arbitrary loss | N.A. | N.A. | $\tfrac{d}{e^2(V_d\,\Gamma(1+d))^{1/d}}\min\big\{1,\ \sqrt{\tfrac{\sigma^2 d}{n}\cdot\tfrac{e^\varepsilon+1}{e^\varepsilon-1}}\big\}$ (Corollary 5) |
| Sample complexity of hypothesis testing | $\tfrac{e^{2\varepsilon}}{(e^\varepsilon-1)^2}\cdot\tfrac{1}{\mathsf{TV}^2(P,Q)}$ (Lemma 2) | $\tfrac{1}{\varepsilon^2\mathsf{TV}^2(P,Q)}$ for $\varepsilon \le 1$ [25] | $\tfrac{e^\varepsilon}{(e^\varepsilon-1)^2}\max\big\{\tfrac{1}{\mathsf{TV}^2(P,Q)},\ \tfrac{e^\varepsilon}{H^2(P,Q)}\big\}$ (Lemma 2) |

TABLE I. Summary of the minimax risks for $\varepsilon$-LDP estimation, where we have omitted constants for all the results. For distribution estimation with the $\ell_h$-norm, our upper bound, built on the Hadamard response mechanism discussed in Appendix D, is order optimal in $n$ and $d$ for the dense case unless $\varepsilon \gtrsim \log d$. For the Gaussian location model, we consider the problem of privately estimating $\theta \in \Theta$ from $X^n \overset{\text{iid}}{\sim} \mathcal{N}(\theta, \sigma^2 I_d)$. The result shown in this table assumes that $\Theta$ is the unit $\ell_2$-ball, and $V_d$ is the volume of the unit $\|\cdot\|$-ball (for an arbitrary norm). Corollary 5, however, concerns the general $\Theta$.
A. Additional Related Work
Local privacy is arguably one of the oldest forms of privacy in the statistics literature and dates back to Warner [64]. This definition resurfaced in [37] and was adopted in the context of differential privacy as its local version. The study of statistical efficiency under LDP was initiated in [33, 36] in the minimax setting and has since gained considerable attention. While the original bounds on the private minimax risk in [33, 36] were meaningful only in the high-privacy regime (i.e., small $\varepsilon$), order-optimal bounds were recently given for several estimation problems in [32] for the general privacy regime. Interestingly, their technique relies on the decay rate of mutual information over a Markov chain, which is known to be equivalent to the contraction coefficient under KL-divergence [13]. However, their technique is quite different from ours in that it does not involve computing the contraction coefficient of an LDP mechanism.
Among the locally private statistical problems studied in the literature, two examples have received considerably more attention, namely mean estimation and discrete distribution estimation. For the first problem, Duchi et al. [36] used (1) to develop asymptotically optimal procedures for estimating the mean in the high-privacy regime (i.e., $\varepsilon < 1$). For the low-privacy regime (i.e., $\varepsilon > 1$), a new algorithm was proposed in [19] that is optimal and matches the lower bound derived in [32] for interactive mechanisms. There has been further work on locally private mean estimation that studies the problem under additional constraints [2, 3, 10, 14–16, 39, 41, 43, 58, 63].
For the second problem, Duchi et al. [33] studied the (non-interactive) locally private distribution estimation problem under $\ell_1$ and $\ell_2$ loss functions and derived the first lower bound for the minimax risk, which was shown to be optimal [45] in the high-privacy regime. Follow-up works such as [2, 16, 38, 60, 68] characterized the optimal minimax rates for general $\varepsilon$. Recently, [2] derived a lower bound for the $\ell_h$ loss with $h \ge 1$.
The problem of locally private entropy estimation has received significantly less attention in the literature, despite the vast line of research on its non-private counterpart. The only related works in this area appear to be [21, 23], which studied estimating the Rényi entropy of order $\lambda$ and derived optimal rates only when $\lambda > 2$. Thus, the optimal private minimax rate appears to remain open. We remark that [8] explicitly considered the problem of entropy estimation, but in the setting of central differential privacy.
The closest works to ours are [15, 70], which extensively studied the contraction coefficient of LDP mechanisms under the hockey-stick divergence. More specifically, it was shown in [15] that $K$ is $\varepsilon$-LDP if and only if $E_{e^\varepsilon}(PK\|QK)$, the hockey-stick divergence between $PK$ and $QK$, is equal to zero for all distributions $P$ and $Q$, and thus if and only if the contraction coefficient of $K$ under the hockey-stick divergence is zero. By representing the $\chi^2$-divergence in terms of the hockey-stick divergence, this result leads to a conceptually similar, albeit weaker, result than Theorem 2.
In [9], Acharya et al. introduced an information-theoretic toolbox for establishing lower bounds in private estimation problems. However, they considered the threat model of central differential privacy, an entirely different model from the local differential privacy considered in this work.
B. Notation
We use upper-case letters (e.g., $X$) to denote random variables and calligraphic letters to represent their support sets (e.g., $\mathcal{X}$). We write $X^n$ to denote the $n$ random variables $X_1, \dots, X_n$. The set of all distributions on $\mathcal{X}$ is denoted by $\mathcal{P}(\mathcal{X})$. A mechanism (or channel) $K: \mathcal{X} \to \mathcal{P}(\mathcal{Z})$ is specified by a collection of distributions $\{K(\cdot|x) \in \mathcal{P}(\mathcal{Z}) : x \in \mathcal{X}\}$. Given such a mechanism $K$ and $P \in \mathcal{P}(\mathcal{X})$, we denote by $PK$ the output distribution of $K$ when the input is distributed according to $P$, given by $PK(A) := \int P(\mathrm{d}x)\, K(A|x)$ for $A \subset \mathcal{Z}$. We use $\mathbb{E}_P[\cdot]$ to denote expectation with respect to $P$, and $[n]$ for an integer $n \ge 1$ to denote $\{1, \dots, n\}$.
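For finite alphabets, $PK$ is simply a vector–matrix product; the following minimal sketch (our illustration, with arbitrary example numbers) makes the notation concrete.

```python
import numpy as np

# A mechanism K on finite alphabets is a row-stochastic matrix
# with K[x, z] = K(z | x).
K = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

P = np.array([0.5, 0.3, 0.2])  # input distribution on X = [3]

PK = P @ K                     # output distribution: PK(z) = sum_x P(x) K(z|x)
print(PK, PK.sum())            # a valid distribution on Z (sums to 1)
```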
II. PRELIMINARIES AND DEFINITIONS
In this section, we give basic definitions of $f$-divergences, contraction coefficients, and LDP mechanisms.
$f$-Divergences and Contraction Coefficients. Given a convex function $f: (0, \infty) \to \mathbb{R}$ such that $f(1) = 0$, the $f$-divergence between two probability measures $P \ll Q$ is defined as [12, 31]

$$D_f(P\|Q) := \mathbb{E}_Q\Big[f\Big(\frac{\mathrm{d}P}{\mathrm{d}Q}\Big)\Big].$$

Examples of $f$-divergences needed in the subsequent sections include:

- KL-divergence: $D_{\mathsf{KL}}(P\|Q) := D_f(P\|Q)$ for $f(t) = t \log t$;
- total variation distance: $\mathsf{TV}(P, Q) := D_f(P\|Q)$ for $f(t) = \frac{1}{2}|t - 1|$;
- $\chi^2$-divergence: $\chi^2(P\|Q) := D_f(P\|Q)$ for $f(t) = t^2 - 1$;
- squared Hellinger distance: $H^2(P, Q) := D_f(P\|Q)$ for $f(t) = (1 - \sqrt{t})^2$; and
- hockey-stick divergence (a.k.a. $E_\gamma$-divergence [59]): $E_\gamma(P\|Q) := D_f(P\|Q)$ for $f(t) = (t - \gamma)_+$ for some $\gamma \ge 1$, where $(a)_+ := \max\{a, 0\}$.
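For finite alphabets, each of these divergences is a one-liner from the definition $D_f(P\|Q) = \mathbb{E}_Q[f(\mathrm{d}P/\mathrm{d}Q)]$. The sketch below (our illustration, assuming $Q > 0$ pointwise to sidestep support issues) implements all five.

```python
import numpy as np
from scipy.special import xlogy  # xlogy(t, t) = t*log(t), with value 0 at t = 0

def f_divergence(p, q, f):
    """D_f(P||Q) = E_Q[f(dP/dQ)] for finite distributions with q > 0."""
    t = p / q
    return float(np.sum(q * f(t)))

def kl(p, q):            # f(t) = t log t
    return f_divergence(p, q, lambda t: xlogy(t, t))

def tv(p, q):            # f(t) = |t - 1| / 2
    return f_divergence(p, q, lambda t: 0.5 * np.abs(t - 1))

def chi2(p, q):          # f(t) = t^2 - 1
    return f_divergence(p, q, lambda t: t ** 2 - 1)

def hellinger_sq(p, q):  # f(t) = (1 - sqrt(t))^2
    return f_divergence(p, q, lambda t: (1 - np.sqrt(t)) ** 2)

def hockey_stick(p, q, gamma):  # f(t) = (t - gamma)_+
    return f_divergence(p, q, lambda t: np.maximum(t - gamma, 0.0))

p, q = np.array([0.6, 0.4]), np.array([0.5, 0.5])
print(kl(p, q), tv(p, q), chi2(p, q), hellinger_sq(p, q), hockey_stick(p, q, 1.0))
# Sanity check: E_1(P||Q) coincides with TV(P, Q).
```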
All $f$-divergences are known to satisfy the data-processing inequality; that is, for any channel $K: \mathcal{X} \to \mathcal{P}(\mathcal{Z})$, we have $D_f(PK\|QK) \le D_f(P\|Q)$ for any pair of distributions $(P, Q)$. However, this inequality is typically strict. One way to strengthen it is to consider $\eta_f(K)$, the contraction coefficient of $K$ under the $f$-divergence [11], defined as

$$\eta_f(K) := \sup_{\substack{P, Q \in \mathcal{P}(\mathcal{X}):\\ D_f(P\|Q) \ne 0}} \frac{D_f(PK\|QK)}{D_f(P\|Q)}. \qquad (3)$$

With this definition at hand, we can write $D_f(PK\|QK) \le \eta_f(K)\, D_f(P\|Q)$, which is typically referred to as the strong data processing inequality. We will study in detail the contraction coefficients under KL-divergence, $\chi^2$-divergence, squared Hellinger distance, and total variation distance, denoted by $\eta_{\mathsf{KL}}(K)$, $\eta_{\chi^2}(K)$, $\eta_{H^2}(K)$, and $\eta_{\mathsf{TV}}(K)$, respectively, in the next section. We also need the following well-known fact about $\eta_{\mathsf{KL}}(K)$ [13]:

$$\eta_{\mathsf{KL}}(K) = \sup_{\substack{P_{XU}:\\ U - X - Z}} \frac{I(U; Z)}{I(U; X)}, \qquad (4)$$

where $K$ is the channel specifying $P_{Z|X}$, $I(A; B) := D_{\mathsf{KL}}(P_{AB}\|P_A P_B)$ is the mutual information between two random variables $A$ and $B$, and $U - X - Z$ denotes a Markov chain in that order. Another important property of $\eta_{\mathsf{KL}}$ required in the proofs is its tensorization, which is described in Appendix A.
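Of these coefficients, $\eta_{\mathsf{TV}}(K)$ is the easiest to compute for finite channels: by Dobrushin's classical result it equals $\max_{x, x'} \mathsf{TV}(K(\cdot|x), K(\cdot|x'))$, and it upper bounds $\eta_f(K)$ for every $f$-divergence. The sketch below (our illustration) computes it for a random channel and spot-checks the strong data processing inequality under total variation.

```python
import numpy as np

def tv(p, q):
    return 0.5 * np.abs(p - q).sum()

def eta_tv(K):
    """Dobrushin's coefficient: eta_TV(K) = max_{x, x'} TV(K(.|x), K(.|x'))."""
    k = K.shape[0]
    return max(tv(K[i], K[j]) for i in range(k) for j in range(i + 1, k))

rng = np.random.default_rng(0)
K = rng.dirichlet(np.ones(3), size=3)  # random 3x3 row-stochastic channel
eta = eta_tv(K)

# Spot-check the SDPI: TV(PK, QK) <= eta_TV(K) * TV(P, Q) on random pairs.
for _ in range(1000):
    P, Q = rng.dirichlet(np.ones(3)), rng.dirichlet(np.ones(3))
    assert tv(P @ K, Q @ K) <= eta * tv(P, Q) + 1e-12
print(f"eta_TV(K) = {eta:.4f}; SDPI verified on 1000 random pairs")
```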
Local Differential Privacy. A randomized mechanism $K: \mathcal{X} \to \mathcal{P}(\mathcal{Z})$ is said to be $\varepsilon$-locally differentially private ($\varepsilon$-LDP for short) for $\varepsilon \ge 0$ if [37, 47]

$$K(A|x) \le e^\varepsilon K(A|x'),$$

for all $A \subset \mathcal{Z}$ and $x, x' \in \mathcal{X}$. Let $\mathcal{Q}_\varepsilon$ be the collection of all $\varepsilon$-LDP mechanisms $K$. It can be shown that LDP mechanisms can be equivalently defined in terms of the hockey-stick divergence:

$$K \in \mathcal{Q}_\varepsilon \iff E_{e^\varepsilon}(K(\cdot|x)\|K(\cdot|x')) = 0, \quad \forall x, x' \in \mathcal{X}. \qquad (5)$$
Arguably, the best-known LDP mechanism is the binary randomized response mechanism, introduced by Warner [64]. For $\mathcal{X} = \{0, 1\}$, let the mechanism $K$ be defined as $K(\cdot|1) = \mathrm{Bernoulli}(\kappa)$ and $K(\cdot|0) = \mathrm{Bernoulli}(1 - \kappa)$. It can be easily verified that this mechanism is $\varepsilon$-LDP if $\kappa = \frac{e^\varepsilon}{1 + e^\varepsilon}$. In information-theoretic parlance, the binary randomized response mechanism is a binary symmetric channel with crossover probability $\frac{1}{1 + e^\varepsilon}$. A natural way to generalize this mechanism to non-binary alphabets is as follows.
Example 1 ($k$-ary randomized response). Let $\mathcal{X} = \mathcal{Z} = [k]$, and let the mechanism $K$ be defined as

$$K(z|x) = \begin{cases} \dfrac{e^\varepsilon}{e^\varepsilon + k - 1}, & \text{if } z = x,\\[4pt] \dfrac{1}{e^\varepsilon + k - 1}, & \text{otherwise.} \end{cases} \qquad (6)$$

It can be verified that $E_{e^\varepsilon}(K(\cdot|x)\|K(\cdot|x')) = 0$ for all $x, x' \in [k]$, implying this mechanism is $\varepsilon$-LDP.
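A minimal sketch (ours, not from the paper) builds (6) as a matrix and verifies $\varepsilon$-LDP both via the likelihood-ratio definition and via the hockey-stick criterion (5); taking $k = 2$ recovers Warner's binary randomized response.

```python
import numpy as np

def k_rr(k, eps):
    """k-ary randomized response (6) as a k x k row-stochastic matrix."""
    K = np.full((k, k), 1.0 / (np.exp(eps) + k - 1))
    np.fill_diagonal(K, np.exp(eps) / (np.exp(eps) + k - 1))
    return K

def is_ldp(K, eps, tol=1e-12):
    # eps-LDP iff K(z|x) <= e^eps K(z|x') for all z, x, x', i.e., the
    # largest log-likelihood ratio across input pairs is at most eps.
    ratios = np.log(K[:, None, :] / K[None, :, :])  # indexed [x, x', z]
    return ratios.max() <= eps + tol

def hockey_stick(p, q, gamma):
    # E_gamma(P||Q) = sum_z (p(z) - gamma * q(z))_+ for finite alphabets.
    return np.maximum(p - gamma * q, 0.0).sum()

eps, k = 1.0, 4
K = k_rr(k, eps)
print(is_ldp(K, eps))  # True
# Criterion (5): E_{e^eps}(K(.|x) || K(.|x')) = 0 for every pair x != x'.
print(all(hockey_stick(K[i], K[j], np.exp(eps)) < 1e-12
          for i in range(k) for j in range(k) if i != j))  # True
```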
Suppose there are $n$ users, each in possession of a sample $X_i$, $i \in [n] := \{1, \dots, n\}$. User $i$ applies a mechanism $K_i$ to generate $Z_i$, a privatized version of $X_i$. The collection of such mechanisms is said to be sequentially interactive if each $K_i$ may depend on the previously released outputs $Z_1, \dots, Z_{i-1}$.