Learning-Augmented Private Algorithms for Multiple Quantile Release

Mikhail Khodak¹, Kareem Amin², Travis Dick², Sergei Vassilvitskii²
Abstract
When applying differential privacy to sensitive
data, we can often improve performance using
external information such as other sensitive data,
public data, or human priors. We propose to use
the learning-augmented algorithms (or algorithms
with predictions) framework—previously applied
largely to improve time complexity or competitive
ratios—as a powerful way of designing and
analyzing privacy-preserving methods that can
take advantage of such external information
to improve utility. This idea is instantiated
on the important task of multiple quantile
release, for which we derive error guarantees
that scale with a natural measure of prediction
quality while (almost) recovering state-of-the-art
prediction-independent guarantees. Our analysis
enjoys several advantages, including minimal
assumptions about the data, a natural way of
adding robustness, and the provision of useful
surrogate losses for two novel “meta” algorithms
that learn predictions from other (potentially
sensitive) data. We conclude with experiments
on challenging tasks demonstrating that learning
predictions across one or more instances can lead
to large error reductions while preserving privacy.
1. Introduction
The differentially private (DP) release of statistics such as the quantile $q$ of a private dataset $x \in \mathbb{R}^n$ is an inevitably error-prone task because we are by definition precluded from revealing exact information about the instance at hand (Dwork & Roth, 2014). However, DP instances rarely
occur in a vacuum: even in the simplest practical settings,
we usually know basic information such as the fact that all
individuals have a nonnegative age. Often, the dataset we
are considering is drawn from a similar population as a public dataset $z \in \mathbb{R}^N$ and should thus have similar quantiles, a case known as the public-private setting (Liu et al., 2021; Bie et al., 2022). Alternatively, in what we call sequential release, we aim to release the quantiles of each of a sequence of datasets $x_1, \dots, x_T$ one-by-one. These could be generated by a stationary or other process that allows information derived from prior releases to inform predictions of future releases. In all of these settings, we might hope to incorporate external information to reduce error, but approaches for doing so tend to be ad hoc and assumption-heavy.

¹Carnegie Mellon University; work done in part as an intern at Google Research - New York. ²Google Research - New York. Correspondence to: Mikhail Khodak <khodak@cmu.edu>.

Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
We propose that the framework of learning-augmented algorithms, a.k.a. algorithms with predictions (Mitzenmacher & Vassilvitskii, 2021), provides the right tools for deriving DP algorithms in this setting, and instantiate this idea for multiple quantile release (Gillenwater et al., 2021; Kaplan et al., 2022). Algorithms with predictions is an
expanding field of algorithm design that constructs methods
whose instance-dependent performance improves with
the accuracy of some prediction about the instance. The goal is to bound the cost $C_x(w)$ of running on instance $x$ given a prediction $w$ by some metric $U_x(w)$ of the quality of the prediction on that instance. Motivated by practical success (Liu et al., 2012; Kraska et al., 2018) and as a type of beyond-worst-case analysis (Roughgarden, 2020), such algorithms can target a wide variety of cost measures, e.g. competitive ratios in online algorithms (Anand et al., 2020; Bamas et al., 2020; Diakonikolas et al., 2021; Dütting et al., 2021; Indyk et al., 2022; Yu et al., 2022; Christianson et al., 2023; Jiang et al., 2020; Kumar et al., 2018; Lykouris & Vassilvitskii, 2021; Rohatgi, 2020), space complexity in streaming algorithms (Du et al., 2021), and time complexity in graph algorithms (Dinitz et al., 2021; Chen et al., 2022; Sakaue & Oki, 2022) and distributed systems (Lattanzi et al., 2020; Lindermayr & Megow, 2022; Scully et al., 2022). Departing from such work, we instead aim to design learning-augmented algorithms whose cost $C_x(w)$ captures the error of some statistic (in our case quantiles) computed privately on an instance $x$ given a prediction $w$. We are interested in bounding this cost in terms of the quality of the external information provided to our algorithm, $U_x(w)$.
arXiv:2210.11222v2 [cs.CR] 8 May 2023

While incorporating external information into DP is well-studied, c.f. public-private methods (Bie et al., 2022; Liu et al., 2021) and private posterior inference (Dimitrakakis et al., 2017; Geumlek et al., 2017; Seeman et al., 2020), by deriving and analyzing a learning-augmented algorithm for multiple quantiles we show numerous comparative advantages, including:
1. Minimal assumptions about the data, in our case even fewer than needed by the unaugmented baseline.
2. Existing tools for studying the robustness of algorithms to noisy predictions (Lykouris & Vassilvitskii, 2021).
3. Co-designing algorithms with predictions together with methods for learning those predictions from data (Khodak et al., 2022), which we show is crucial for both the public-private and sequential release settings.
As part of this analysis we derive a learning-augmented extension of the ApproximateQuantiles (AQ) method of Kaplan et al. (2022) that (nearly) matches its worst-case guarantees while being much better if a natural measure $U_x(w)$ of prediction quality is small. By studying $U_x$, we make the following contributions to multiple quantiles:
1. The first robust algorithm, even for one quantile, that avoids assuming the data is bounded on some interval, specifically by using a heavy-tailed prior.
2. A provable way of ensuring robustness to poor priors, without losing the consistency of good ones.
3. A novel connection between DP quantiles and censored regression that leads to (a) a public-private release algorithm and (b) a sequential release scheme, both with runtime and error guarantees.
Finally, we integrate these techniques to significantly
improve the accuracy of public-private and sequential
quantile release on several real and synthetic datasets.
2. Related work
There has been significant work on incorporating external
information to improve DP methods. A major line of work
is the public-private framework, where we have access
to public data that is related in some way to the private data (Liu et al., 2021; Amid et al., 2022; Li et al., 2022; Bie et al., 2022; Bassily et al., 2022). The use of public data
can be viewed as using a prediction, but such work starts by
making (often strong) distributional assumptions on the pub-
lic and private data; we instead derive instance-dependent
upper bounds with minimal assumptions that we then apply
to such public-private settings. Furthermore, our frame-
work allows us to ensure robustness to poor predictions
without distributional assumptions, and to derive learning
algorithms using training data that may itself be sensitive.
Another approach is to treat DP mechanisms (e.g. the exponential) as Bayesian posterior sampling (Dimitrakakis et al., 2017; Geumlek et al., 2017; Seeman et al., 2020).
Our work can be viewed as an adaptation where we give
explicit prior-dependent utility bounds. To our knowledge,
no such guarantees exist in the literature. Moreover, while
our focus is quantile estimation, the predictions-based
framework that we advocate is much broader, as many
DP methods—including for multiple quantiles—combine
multiple queries that must be considered jointly.
Our approach for augmenting DP with external information centers on the algorithms-with-predictions framework, where past work has focused on using predictions to improve metrics related to time, space, and communication complexity.
We make use of existing techniques from this literature, including robustness-consistency tradeoffs (Lykouris & Vassilvitskii, 2021) and the online learning of predictions (Khodak et al., 2022). Tuning DP algorithms has been an important topic in private machine learning, e.g. for hyperparameter tuning (Chaudhuri & Vinterbo, 2013) and federated learning (Andrew et al., 2021), but these have not to our knowledge considered incorporating per-instance predictions.
The specific task we focus on is DP quantiles, a well-studied problem (Gillenwater et al., 2021; Kaplan et al., 2022), but we are not aware of work adding outside information. We
also make the important contribution of an effective method
for removing data-boundedness assumptions. Our algorithm
builds upon the state-of-the-art work of Kaplan et al. (2022),
which is also our main source for empirical comparison.
3. Augmenting a private algorithm
The basic requirement for a learning-augmented algorithm is that the cost $C_x(w)$ of running it on an instance $x$ with prediction $w$ should be upper bounded (usually up to constant or logarithmic factors) by a metric $U_x(w)$ of the quality of the prediction on the instance. We denote this by $C_x \lesssim U_x$. In our work the cost $C_x(w)$ will be the error of a privately released statistic, as compared to some ground truth. We will use the following privacy notion:
Definition 3.1 (Dwork & Roth (2014)). Algorithm $\mathcal{A}$ is $(\varepsilon, \delta)$-differentially private if for all subsets $S$ of its range, $\Pr\{\mathcal{A}(x) \in S\} \le e^{\varepsilon} \Pr\{\mathcal{A}(\tilde{x}) \in S\} + \delta$ whenever $x, \tilde{x}$ are neighboring, i.e. they differ in at most one element.
Using $\varepsilon$-DP to denote $(\varepsilon, 0)$-DP, the broad goal of this work will be to reduce the error $C_x(w)$ of $\varepsilon$-DP multiple quantile release while fixing the privacy level $\varepsilon$.
3.1. Problem formulation
A good guarantee for a learning-augmented algorithm will have several important properties that formally separate its performance from naive upper bounds $U_x \gtrsim C_x$. The first, consistency, requires it to be a reasonable indicator of strong performance in the limit of perfect prediction:

Definition 3.2. A learning-augmented guarantee $C_x \lesssim U_x$ is $c_x$-consistent if $C_x(w) \le c_x$ whenever $U_x(w) = 0$.
Here $c_x$ is a prediction-independent quantity that should depend weakly or not at all on problem difficulty (in the case of quantiles, the minimum separation between data points). Consistency is often presented via a tradeoff with robustness (Lykouris & Vassilvitskii, 2021), which bounds how poorly the method can do when the prediction is bad, in a manner similar to a standard worst-case bound:

Definition 3.3. A learning-augmented guarantee $C_x \lesssim U_x$ is $r_x$-robust if it implies $C_x(w) \le r_x$ for all predictions $w$.
Unlike consistency, robustness usually depends strongly on the difficulty of the instance $x$, with the goal being to not do much worse than a prediction-free approach. Note that the latter is trivially robust but not (meaningfully) consistent, since it ignores the prediction; this makes clear the need for considering the two properties via a tradeoff between them.
As discussed further in Section 4.2, this existing language
for quantifying robustness is one of the advantages of
using the framework of learning-augmented algorithms for
incorporating external information into DP methods. We
report robustness-consistency trade-offs for our quantile
release algorithms in the same section.
A last desirable property of the prediction quality measure $U_x(w)$ is that it should be useful for making good predictions. One way to formalize this is to require $U_{x_t}$ to be learnable from multiple instances $x_t$. For example, we could ask for online learnability, i.e. the existence of an algorithm that makes predictions $w_t \in \mathcal{W}$ in some action space $\mathcal{W}$ given instances $x_1, \dots, x_{t-1}$ whose regret is sublinear in $T$:
Definition 3.4. The regret of actions $w_1, \dots, w_T \in \mathcal{W}$ on the sequence of functions $U_{x_1}, \dots, U_{x_T}$ is $\max_{w \in \mathcal{W}} \sum_{t=1}^T U_{x_t}(w_t) - U_{x_t}(w)$.
Sublinear regret implies average prediction quality as good as that of the optimal prediction in hindsight, up to an additive term that vanishes as $T \to \infty$. Since $U_{x_t}$ roughly upper-bounds the error $C_{x_t}$, this means that asymptotically the average error is governed by the average prediction quality $\min_{w \in \mathcal{W}} \frac{1}{T} \sum_{t=1}^T U_{x_t}(w)$ of the optimal $w \in \mathcal{W}$. A crucial observation here is that sublinear regret can often be obtained by making the function $U_x$ amenable to familiar gradient-based online convex optimization methods such as online gradient descent (Khodak et al., 2022). Doing so also enables instance-dependent linear prediction: setting $w_t$ using a learned function of some instance features $f_t$.
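The gradient-based route can be sketched generically. The toy below is our own illustration, not the paper's: quadratic stand-ins for the surrogates $U_{x_t}$, an interval constraint, and a $1/\sqrt{t}$ step size. It runs projected online gradient descent and compares its average loss to that of the best fixed prediction in hindsight:

```python
import math

def ogd(grad_fns, w0, radius, lr0):
    """Projected OGD: w_{t+1} = Proj(w_t - lr_t * grad U_{x_t}(w_t)),
    projecting onto [-radius, radius] with step size lr_t = lr0 / sqrt(t)."""
    w, iterates = w0, []
    for t, grad in enumerate(grad_fns, start=1):
        iterates.append(w)                      # play w_t, then observe U_{x_t}
        w = w - (lr0 / math.sqrt(t)) * grad(w)
        w = max(-radius, min(radius, w))
    return iterates

# Stand-in surrogates U_t(w) = (w - c_t)^2 with slowly drifting targets c_t.
targets = [0.9, 1.1, 1.0, 0.95, 1.05] * 20
grads = [(lambda w, c=c: 2 * (w - c)) for c in targets]
ws = ogd(grads, w0=0.0, radius=2.0, lr0=0.5)
avg_loss = sum((w - c) ** 2 for w, c in zip(ws, targets)) / len(ws)
best_fixed = min(sum((w - c) ** 2 for c in targets) / len(targets)
                 for w in [i / 100 for i in range(-200, 201)])
print(avg_loss - best_fixed)  # average regret; shrinks as T grows
```
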
We demonstrate the usefulness of both learning and robustness-consistency analysis in two applications where it is reasonable to have external information about the sensitive dataset(s). In the public-private setting, the prediction $w$ is obtained from a public dataset $x'$ that is assumed to be similar to $x$ but is not subject to privacy protection. In sequential release, we privately release information about each dataset in a sequence $x_1, \dots, x_T$; the release at time $t$ can depend on $x_t$ and on a prediction $w_t$, which can be derived (privately) from past observations. In Section 5 we show that sequential release can be posed directly as a private online learning problem, while the public-private setting can be approached via online-to-batch conversion (Cesa-Bianchi et al., 2004). Both are thus directly enabled by treating the prediction quality measures $U_{x_t}$ as surrogate objectives for the actual cost functions $C_x$ and applying standard optimization techniques (Khodak et al., 2022).
With these desiderata of algorithms-with-predictions guarantees in mind, we now move to deriving them for quantile release. The robustness and learnability of the resulting prediction quality measures $U_x$ are discussed in Section 4.
3.2. Warm-up: Releasing one quantile
Given a quantile $q \in (0,1)$ and a sorted dataset $x \in \mathbb{R}^n$ of $n$ distinct points, we want to release $o \in [x_{[\lfloor qn \rfloor]}, x_{[\lfloor qn \rfloor + 1]})$, i.e. such that the proportion of entries less than $o$ is $q$. As in prior work (Kaplan et al., 2022), the error of $o$ will be the number of points between it and the desired interval:

$$\mathrm{Gap}_q(x, o) = \Big| |\{i : x_{[i]} < o\}| - \lfloor qn \rfloor \Big| = \Big| \max_{x_{[i]} < o} i - \lfloor qn \rfloor \Big| \quad (1)$$
$\mathrm{Gap}_q(x, o)$ is constant on intervals $I_k = (x_{[k]}, x_{[k+1]}]$ in the partition of $\mathbb{R}$ by $x$ (let $I_0 = (-\infty, x_{[1]}]$ and $I_n = (x_{[n]}, \infty)$), so we also say that $\mathrm{Gap}_q(x, I_k)$ is the same as $\mathrm{Gap}_q(x, o)$ for some $o$ in the interior of $I_k$.
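As a concrete reading of Equation (1), here is a minimal sketch (our own helper, using 0-indexed Python lists in place of the paper's 1-indexed $x_{[i]}$):

```python
def gap(q, x, o):
    """Gap_q(x, o) from Equation (1): |#{i : x_i < o} - floor(q * n)|,
    the number of data points separating o from the target rank."""
    rank = sum(1 for v in x if v < o)
    return abs(rank - int(q * len(x)))

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # sorted, distinct
print(gap(0.4, x, 2.5))          # o in the optimal interval (2, 3] -> 0
print(gap(0.4, x, 4.5))          # two points past the target interval -> 2
```
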
For single quantile release we choose perhaps the most natural way of specifying a prediction for a DP algorithm: via the base measure $\mu : \mathbb{R} \mapsto \mathbb{R}_{\ge 0}$ of the exponential mechanism:

Theorem 3.1 (McSherry & Talwar (2007)). If the utility $u(x, o)$ of an outcome $o$ of a query over dataset $x$ has sensitivity $\max_{o, x \sim \tilde{x}} |u(x, o) - u(\tilde{x}, o)| \le \Delta$ then the exponential mechanism, which releases $o$ w.p. $\propto \exp(\frac{\varepsilon}{2\Delta} u(x, o)) \mu(o)$ for some base measure $\mu$, is $\varepsilon$-DP.
The utility function we use is $u_q = -\mathrm{Gap}_q$, so since this is constant on each interval $I_k$ the mechanism here is equivalent to sampling $k$ w.p. $\propto \exp(\varepsilon u_q(x, I_k)/2)\,\mu(I_k)$ and then sampling $o$ from $I_k$ w.p. $\propto \mu(o)$. While the idea of specifying a prior for EM is well-known, the key idea here is to obtain a prediction-dependent bound on the error that reveals a useful measure of the quality of the prediction. In particular, we can show (c.f. Lemma A.1) that running EM in this way yields $o$ that w.p. $\ge 1 - \beta$ satisfies

$$\mathrm{Gap}_q(x, o) \le \frac{2}{\varepsilon} \log \frac{1/\beta}{\Psi_x^{(q,\varepsilon)}(\mu)} \le \frac{2}{\varepsilon} \log \frac{1/\beta}{\Psi_x^{(q)}(\mu)} \quad (2)$$
where the quantity $\Psi_x^{(q,\varepsilon)}(\mu) = \int \exp(-\frac{\varepsilon}{2} \mathrm{Gap}_q(x, o)) \mu(o) \, do$ is the inner product between the prior and the EM score, while $\Psi_x^{(q)}(\mu) = \lim_{\varepsilon \to \infty} \Psi_x^{(q,\varepsilon)}(\mu) = \mu((x_{[\lfloor qn \rfloor]}, x_{[\lfloor qn \rfloor + 1]}])$ is the probability that the prior assigns to the optimal interval. This suggests two metrics of prediction quality: the negative log-inner-products $U_x^{(q,\varepsilon)}(\mu) = -\log \Psi_x^{(q,\varepsilon)}(\mu)$ and $U_x^{(q)}(\mu) = -\log \Psi_x^{(q)}(\mu)$. Both make intuitive sense: we expect predictions $\mu$ that assign a high probability to intervals that the EM score weighs heavily to perform well, and EM assigns the most weight to the optimal interval.
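For intuition, the interval-sampling view of the mechanism can be sketched as follows. This is our own illustrative implementation, with the prior supplied as a probability mass function over the $n+1$ intervals rather than as a density:

```python
import math, random

def em_quantile(x, q, eps, prior_mass):
    """Exponential mechanism over the n + 1 intervals induced by sorted data x.
    prior_mass[k] is the prior probability mu(I_k) of interval I_k; on I_k
    the utility u_q = -Gap_q equals -|k - floor(q * n)| and has sensitivity 1.
    Returns the sampled interval index k."""
    target = int(q * len(x))
    scores = [math.exp(-eps / 2 * abs(k - target)) * prior_mass[k]
              for k in range(len(x) + 1)]
    total = sum(scores)
    r, acc = random.random() * total, 0.0   # inverse-CDF sampling over intervals
    for k, s in enumerate(scores):
        acc += s
        if r <= acc:
            return k
    return len(x)

# Demo: median of Gaussian samples under a uniform prior over intervals.
x = sorted(random.gauss(0, 1) for _ in range(100))
uniform = [1 / (len(x) + 1)] * (len(x) + 1)
k = em_quantile(x, 0.5, eps=1.0, prior_mass=uniform)
print(abs(k - 50))  # Gap of the released interval
```

A prior concentrated on the optimal interval drives this Gap toward zero, matching the perfect-prediction bound of $\frac{2}{\varepsilon}\log\frac{1}{\beta}$ discussed below.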
There are also many ways that these metrics are useful. For one, in the case of perfect prediction, i.e. if $\mu$ assigns probability one to the optimal interval $I_{\lfloor qn \rfloor}$, then $\Psi_x^{(q,\varepsilon)}(\mu) = \Psi_x^{(q)}(\mu) = 1$, yielding an upper bound on the error of only $\frac{2}{\varepsilon} \log \frac{1}{\beta}$. Secondly, as we will see, both are also amenable for analyzing robustness (the mechanism's sensitivity to incorrect priors) and learning. A final and important quality is that the guarantees using these metrics hold under no extra assumptions. Between the two, the first metric provides a tighter bound on the utility loss while the second does not depend on $\varepsilon$, which may be desirable.
It is also fruitful to analyze the metrics for specific priors. When $x$ lies in a bounded interval $(a, b)$ and $\mu(o) = \frac{1_{o \in (a,b)}}{b-a}$ is the uniform measure, then $\Psi_x^{(q)}(\mu) \ge \frac{\psi_x}{b-a}$, where $\psi_x$ is the minimum distance between entries; thus we recover past bounds, e.g. Kaplan et al. (2022, Lemma A.1), that implicitly use this measure to guarantee $\mathrm{Gap}_q(x, o) \le \frac{2}{\varepsilon} \log \frac{b-a}{\beta \psi_x}$.
Here the support of the uniform distribution is correct by assumption, as the data is assumed bounded. However, analyzing $\Psi_x^{(q)}$ also yields a novel way of removing this assumption: if we suspect the data lies in $(a, b)$, we set $\mu$ to be the Cauchy prior with location $\frac{a+b}{2}$ and scale $\frac{b-a}{2}$. Even if we are wrong about the interval, there exists an $R > 0$ s.t. the data lies in the interval $(\frac{a+b}{2} \pm R)$, so using the Cauchy yields $\Psi_x^{(q)} \ge \frac{2(b-a)\psi_x/\pi}{(b-a)^2 + 4R^2}$ and thus the following guarantee:
Corollary 3.1 (of Lem. A.1). If the data lies in the interval $(\frac{a+b}{2} \pm R)$ and $\mu$ is the Cauchy measure with location $\frac{a+b}{2}$ and scale $\frac{b-a}{2}$ then the output of the exponential mechanism satisfies $\mathrm{Gap}_q(x, o) \le \frac{2}{\varepsilon} \log \frac{\pi((b-a)^2 + 4R^2)}{2(b-a)\beta\psi_x}$ w.p. $\ge 1 - \beta$.
If $R = \frac{b-a}{2}$, i.e. we get the interval right, then the bound is only an additive factor $\frac{2}{\varepsilon} \log \pi$ worse than before, but if we are wrong then performance degrades as $O(\log(1 + R^2))$, unlike the $O(R)$ error of the uniform prior. Note our use of a heavy-tailed distribution here: a sub-exponential density decays too quickly and leads to error $O(R)$ rather than $O(\log(1 + R^2))$. We can also adapt this technique if we know only a single-sided bound, e.g. if values must be positive, by using an appropriate half-Cauchy distribution.
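A quick numeric check of why the heavy tail matters (our own toy, using the Cauchy CDF directly): the mass a Cauchy prior places on a narrow interval at distance $R$ from its location shrinks only like $1/R^2$, so the negative log-mass grows like $\log(1 + R^2)$ rather than linearly in $R$:

```python
import math

def cauchy_interval_mass(lo, hi, loc, scale):
    """Probability a Cauchy(loc, scale) prior assigns to (lo, hi]."""
    cdf = lambda v: 0.5 + math.atan((v - loc) / scale) / math.pi
    return cdf(hi) - cdf(lo)

# Suspected range (a, b) = (-1, 1); the true optimal interval sits near R.
a, b = -1.0, 1.0
for R in [1.0, 10.0, 100.0]:
    psi = cauchy_interval_mass(R, R + 0.01, loc=(a + b) / 2, scale=(b - a) / 2)
    print(R, -math.log(psi))  # grows like log(1 + R^2), not like R
```

Moving the interval from $R = 10$ to $R = 100$ increases the negative log-mass by roughly $\log 100 \approx 4.6$, while a sub-exponential prior would pay an increase proportional to $R$ itself.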
3.3. Releasing multiple quantiles
To simultaneously estimate quantiles $q_1, \dots, q_m$ we adapt the ApproximateQuantiles method of Kaplan et al. (2022), which assigns each $q_i$ to a node in a binary tree and, starting from the root, uses EM with the uniform prior to estimate a quantile before sending the data below the outcome $o$ to its left child and the data above $o$ to its right child. Thus each entry is only involved in $\lceil \log_2 m \rceil$ exponential mechanisms, and so for data in $(a, b)$ the maximum $\mathrm{Gap}_{q_i}$ across quantiles is $O\left(\frac{\log^2 m}{\varepsilon} \log \frac{m(b-a)}{\beta \psi_x}\right)$, which is much better than the naive bound of a linear function of $m$.
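The recursion can be sketched as follows. This skeleton is ours, and non-private: a plug-in `estimate` callable stands in for the exponential-mechanism call, so it only illustrates the tree structure and the quantile rescaling after each split:

```python
def aq_tree(x, quantiles, estimate):
    """Skeleton of the ApproximateQuantiles recursion: estimate the middle
    quantile, split the data at the outcome, and recurse on each side with the
    remaining quantiles rescaled to refer to ranks within the sub-dataset.
    Returns the outcomes in quantile order."""
    if not quantiles:
        return []
    mid = len(quantiles) // 2
    q, rest_l, rest_r = quantiles[mid], quantiles[:mid], quantiles[mid + 1:]
    o = estimate(x, q)
    left = [v for v in x if v <= o]
    right = [v for v in x if v > o]
    res_l = aq_tree(left, [p / q for p in rest_l], estimate)
    res_r = aq_tree(right, [(p - q) / (1 - q) for p in rest_r], estimate)
    return res_l + [o] + res_r

# Non-private stand-in estimator, just to exercise the recursion.
exact = lambda x, q: sorted(x)[max(0, int(q * len(x)) - 1)]
data = list(range(1, 101))
print(aq_tree(data, [0.25, 0.5, 0.75], exact))  # [25, 50, 75]
```

Each data point participates in only one `estimate` call per tree depth, which is what caps its involvement at $\lceil \log_2 m \rceil$ exponential mechanisms.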
Given one prior $\mu_i$ for each $q_i$, a naive extension of (2) gets a similar $\mathrm{polylog}(m)$ bound (c.f. Lem. A.2); notably we extend the Cauchy-unboundedness result to multiple quantiles (c.f. Cor. A.1). However the upper bound is not a deterministic function of $\mu_i$, as it depends on restrictions of $x$ and $\mu_i$ to subsets $(o_j, o_k)$ of the domain induced by the outcomes of EM for quantiles $q_j$ and $q_k$ earlier in the tree. It thus does not encode a direct relationship between the prediction and instance data and is less amenable for learning.
We instead want guarantees depending on a more natural metric, e.g. one aggregating $\Psi_x^{(q_i, \varepsilon_i)}(\mu_i)$ from the previous section across pairs $(q_i, \mu_i)$. The core issue is that, due to the data splitting, the probability a prior $\mu_i$ assigns to data outside the interval $(o_j, o_k)$ induced by the outcomes of quantiles $q_j$ and $q_k$ earlier in the tree does not affect the distribution of $o_i$. One way to handle this is to assign this probability mass to the edges of $(o_j, o_k)$, rather than the more natural conditional approach of ApproximateQuantiles. We refer to this as "edge-based prior adaptation" and use it to bound $\mathrm{Gap}_{\max} = \max_i \mathrm{Gap}_{q_i}(x, o_i)$ via the harmonic mean $\Psi_x^{(\varepsilon)}$ of the inner products $\Psi_x^{(q_i, \varepsilon_i)}(\mu_i)$:
Theorem 3.2 (c.f. Thm. A.1). If $m = 2^k - 1$ for some $k$, quantiles $q_1, \dots, q_m$ are uniformly spaced, and for each we have a prior $\mu_i : \mathbb{R} \mapsto \mathbb{R}_{\ge 0}$, then running ApproximateQuantiles with edge-based prior adaptation (c.f. Algorithm 2) is $\varepsilon$-DP, and w.p. $\ge 1 - \beta$

$$\mathrm{Gap}_{\max} \le \frac{2}{\varepsilon} \varphi^{\log_2(m+1)} \lceil \log_2(m+1) \rceil \log \frac{m/\beta}{\Psi_x^{(\varepsilon)}} \quad \text{for} \quad \Psi_x^{(\varepsilon)} = \left( \sum_{i=1}^m \frac{1/m}{\Psi_x^{(q_i, \varepsilon_i)}(\mu_i)} \right)^{-1} \quad (3)$$

Here $\varepsilon_i = \frac{\varepsilon}{\lceil \log_2(m+1) \rceil}$ and $\varphi = \frac{1+\sqrt{5}}{2}$ is the golden ratio.
The golden ratio is due to a Fibonacci-type recurrence bounding the maximum $\mathrm{Gap}_{q_i}$ at each depth of the tree. $\Psi_x^{(\varepsilon)}$ depends only on $x$ and the predictions $\mu_i$, and it yields a nice error metric $U_x^{(\varepsilon)} = -\log \Psi_x^{(\varepsilon)} = \log \frac{1}{m} \sum_{i=1}^m e^{U_x^{(q_i, \varepsilon_i)}}$. However, the dependence of the error on $m$ is worse than that of ApproximateQuantiles, as $\varphi^{\log_2 m}$ is roughly $O(m^{0.7})$. The bound is still sublinear and thus better than the naive baseline of running EM $m$ times.
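The aggregate metric is easy to compute from the per-quantile inner products. A small sketch (our own naming) showing how a single badly-predicted quantile dominates the harmonic mean:

```python
import math

def u_eps(psis):
    """Aggregate per-quantile inner products Psi_i into the overall metric
    U = -log(harmonic mean of the Psi_i) = log((1/m) * sum_i 1/Psi_i)."""
    return math.log(sum(1 / p for p in psis) / len(psis))

# One tiny Psi_i (a badly-predicted quantile) blows up the whole metric:
print(u_eps([0.9, 0.9, 0.9, 0.9]))   # all good priors -> small U
print(u_eps([0.9, 0.9, 0.9, 1e-4]))  # one bad prior -> large U
```
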
Figure 1. Maximum gap as a function of $m$ for different variants of AQ when using the uniform prior, evaluated on 1000 samples from a standard Gaussian (left) and the Adult "age" dataset (right). The dashed and solid lines correspond to $\varepsilon = 1$ and $\varepsilon = 0.1$, respectively.

The $\tilde{O}(\varphi^{\log_2 m})$ dependence results from error compounding across depths of the tree, so we can try to reduce depth by going from a binary to a $K$-ary tree. This involves running EM $K - 1$ times at each node (and paying a factor of $K - 1$ more in privacy budget) to split the data into $K$ subsets; the resulting estimates may also be out of order. However, by showing that sorting them back into order does not increase the error and then controlling the maximum $\mathrm{Gap}_{q_i}$ at each depth via another recurrence relation, we prove the following:
Theorem 3.3 (c.f. Thm. A.2). For any $q_1, \dots, q_m$, using $K = \lceil \exp(\sqrt{\log 2 \log(m+1)}) \rceil$ and edge-based adaptation guarantees $\varepsilon$-DP and w.p. $\ge 1 - \beta$ has

$$\mathrm{Gap}_{\max} \le \frac{2\pi^2}{\varepsilon} \exp\left(2\sqrt{\log(2)\log(m+1)}\right) \log \frac{m/\beta}{\Psi_x^{(\varepsilon)}}.$$
The rate in $m$ is both sub-polynomial and super-poly-logarithmic ($o(m^\alpha)$ and $\omega(\log^\alpha m)$ for all $\alpha > 0$); while asymptotically worse than the prediction-free original result (Kaplan et al., 2022), for almost any practical value of $m$ (e.g. $m \in [3, 10^{12}]$) it does not exceed a small constant (e.g. nine) times $\log^3 m$. Thus if the error $-\log \Psi_x^{(\varepsilon)}$ of the prediction is small, i.e. the inner products between priors and EM scores are large on (harmonic) average, then we may do much better with this approach.
We compare $K$-ary AQ with edge-based adaptation to regular AQ on two datasets in Figure 1. The original is better at higher $\varepsilon$ but similar or worse at higher privacy. We also find that conditional adaptation is only better on discretized data that can have repetitions, a case where neither method provides guarantees. Overall, we find that our prior-dependent analysis covers a useful algorithm, but for consistency with past work and due to its better performance at high $\varepsilon$ we will focus on the original binary approach in experiments.
4. Utility of learning-augmented algorithms
In the previous section we derived a data-dependent function $U_x^{(\varepsilon)} = -\log \Psi_x^{(\varepsilon)}$ that upper bounds the error of quantile release using priors $\mu_1, \dots, \mu_m$. As in the single-quantile case, we can construct a looser, $\varepsilon$-independent upper bound

$$U_x = -\log \Psi_x = \log \frac{1}{m} \sum_{i=1}^m e^{U_x^{(q_i)}} \ge U_x^{(\varepsilon)} \quad (4)$$

using the harmonic mean $\Psi_x$ of the $\Psi_x^{(q_i)}$. We next summarize the usefulness of these upper bounds for understanding and applying DP methods with external information. Note that all three aspects below are crucial in our experiments.
4.1. Minimal assumptions and new insights
Our guarantees require no extra data assumptions: in fact, the first outcome of our analysis was removing a boundedness assumption. This contrasts with past public-private work (Liu et al., 2021; Bie et al., 2022), which makes distributional assumptions, and is why we can apply these results to two very distinct settings in Section 5.
4.2. Ensuring robustness
While we incorporate external information into DP algorithms because we hope to improve performance, if not done carefully it may lead to worse results. For example, a quantile prior concentrated away from the data may have error depending linearly on the distance to the optimal interval. Ideally an algorithm that uses a prediction will be robust, i.e. revert back to worst-case guarantees if the prediction is poor, without significantly sacrificing consistency, i.e. performing well if the prediction is good.
Using the formalization of these properties in Definitions 3.2 and 3.3, algorithms with predictions provides a convenient way to deploy them by parameterizing the robustness-consistency tradeoff, in which methods are designed to be $r_x(\lambda)$-robust and $c_x(\lambda)$-consistent for a user-specified parameter $\lambda \in [0, 1]$ (Bamas et al., 2020; Lykouris & Vassilvitskii, 2021). For quantiles, we can obtain an elegant parameterized tradeoff by interpolating prediction priors with a "robust" prior. In particular, since $\Psi_x^{(q,\varepsilon)}$ is linear we can pick $\rho$ to be a trusted prior such as the uniform or Cauchy and for any prediction $\mu$ use $\mu^{(\lambda)} = (1 - \lambda)\mu + \lambda\rho$ instead. Setting $\Psi_x^{(q,\varepsilon)}(\mu^{(\lambda)}) = (1 - \lambda)\Psi_x^{(q,\varepsilon)}(\mu) + \lambda\Psi_x^{(q,\varepsilon)}(\rho)$ in (2) yields:
Corollary 4.1 (of Lem. A.1; c.f. Cor. B.1). For quantile $q$, applying EM with prior $\mu^{(\lambda)} = (1 - \lambda)\mu + \lambda\rho$ is $\left(\frac{2}{\varepsilon} \log \frac{1/\beta}{\lambda \Psi_x^{(q,\varepsilon)}(\rho)}\right)$-robust and $\left(\frac{2}{\varepsilon} \log \frac{1/\beta}{1 - \lambda}\right)$-consistent.
Thus w.h.p. the error is simultaneously at most $\frac{2}{\varepsilon} \log \frac{1}{\lambda}$ worse than that of only using the robust prior $\rho$, and we only have error $\frac{2}{\varepsilon} \log \frac{1/\beta}{1 - \lambda}$ if the prediction $\mu$ is perfect, i.e. if it is only supported on the optimal interval. This is easy to extend to the multiple-quantile metric $-\log \Psi_x^{(\varepsilon)}$. In fact, we can even interpolate between the $\mathrm{polylog}(m)$ prediction-free guarantee of past work and our learning-augmented guarantee with the worse dependence on $m$; thus if the prediction is not good enough to overcome this worse rate we can still ensure that we do not do much worse than the original guarantee.
Corollary 4.2 (of Lem. A.2 & Thm. A.1; c.f. Cor. B.2). If we run binary AQ on data in the interval $(\frac{a+b}{2} \pm R)$ for unknown $R > 0$ and use the prior $\mu_i^{(\lambda)} = (1 - \lambda)\mu_i + \lambda\rho$ for each $q_i$, where $\rho$ is $\mathrm{Cauchy}(\frac{a+b}{2}, \frac{b-a}{2})$, then the algorithm is $\left(\frac{2}{\varepsilon} \lceil \log_2 m \rceil^2 \log \frac{\pi m ((b-a)^2 + 4R^2)}{2(b-a)\lambda\beta\psi_x}\right)$-robust and $\left(\frac{2}{\varepsilon} \varphi^{\log_2 m} \lceil \log_2 m \rceil \log \frac{m/\beta}{1 - \lambda}\right)$-consistent.