To achieve good privacy-utility trade-offs, it is important to accurately track the total privacy budget spent throughout the entire training. In the context of DP, repeated execution of the same (here: Gaussian) mechanism is referred to as composition. Basic (Dwork et al., 2006b) and various more refined, advanced composition theorems (Dwork et al., 2010; Dwork and Rothblum, 2016; Bun and Steinke, 2016) have been stated in the literature that aim at providing tight bounds for the overall privacy budget. However, these advances still resulted in relatively loose bounds, and thus large overall privacy budgets, over the course of highly iterative algorithms such as DP-SGD. Tight worst-case bounds for composition were derived by Kairouz et al. (2015); however, computing them was shown to be computationally infeasible in general (Murtagh and Vadhan, 2016).
For this reason, specific efforts have been made to find tighter bounds and accurate approximations for the overall privacy loss: a first example that provides substantially reduced upper bounds is the moments accountant (Abadi et al., 2016a), which is closely related to Rényi DP (Mironov, 2017), a generalization of DP based on the Rényi divergence. Gaussian DP and $f$-DP (Dong et al., 2019) provide an approximation of the total budget using the central limit theorem (CLT). Finally, Gopi et al. (2021) and Koskela et al. (2020), inspired by Sommer et al. (2019), are able to compute the exact budget numerically up to arbitrary precision by aggregating the privacy loss random variable with the fast Fourier transform.
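To make the accountant idea concrete, the sketch below composes the Rényi-DP budget of repeated Gaussian mechanisms and converts it to an $(\varepsilon, \delta)$-DP guarantee, optimizing over a grid of Rényi orders. This is a simplified illustration: it assumes sensitivity 1 and ignores the subsampling amplification that a real DP-SGD accountant must handle, and the function names are our own.

```python
import math

def gaussian_rdp_epsilon(sigma, steps, alpha):
    """RDP budget at order alpha for `steps` compositions of the
    Gaussian mechanism with sensitivity 1 and noise multiplier sigma.
    Composition in RDP is simply additive across steps."""
    return steps * alpha / (2 * sigma ** 2)

def rdp_to_dp(rdp_eps, alpha, delta):
    """Standard conversion from an RDP guarantee of order alpha to an
    (epsilon, delta)-DP guarantee (Mironov, 2017)."""
    return rdp_eps + math.log(1 / delta) / (alpha - 1)

def total_epsilon(sigma, steps, delta, alphas=range(2, 64)):
    """Tightest epsilon obtainable from the conversion over a grid of
    integer Rényi orders."""
    return min(rdp_to_dp(gaussian_rdp_epsilon(sigma, steps, a), a, delta)
               for a in alphas)
```

Because the Rényi budget is additive, the cost of running more steps (or using less noise) is immediately visible: `total_epsilon(1.0, 100, 1e-5)` is far larger than `total_epsilon(1.0, 1, 1e-5)`.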
3 Approach
We consider the following scenario to motivate our approach: an entity wants to implement NLP pipelines to gain insights from internal data, e.g., emails from customers. To seek advice and get support for modeling the data and building pipelines, the entity aims to share an excerpt of the internal data with a third party such as a consultant or a group of researchers. In order to do this without compromising the privacy of its customers, the aim is to synthesize a verifiably private “toy” dataset that reflects the properties of the original data without leaking private information. On such a toy dataset, a third party could research how to best solve the task at hand and train a model to perform inference on the actual internal data, without being able to access sensitive information about customers. Formally, we aim to achieve the following goal: we consider a dataset consisting of a training set $D_{\text{train}}$ and a test set $D_{\text{test}}$. Given $D_{\text{train}}$ or a subset of it, we want to train a generative model to synthesize a dataset $\widetilde{D}_{\text{train}}$ that does not leak information from the original $D_{\text{train}}$. Furthermore, the synthesized dataset should share statistical properties with the original one, so that a classification model trained on $\widetilde{D}_{\text{train}}$ performs as well on $D_{\text{test}}$ as if it had been trained on $D_{\text{train}}$.
To achieve this, we use the pretrained autoregressive transformer model (Vaswani et al., 2017) GPT-2 (Radford et al., 2019) and use natural language prompts to enable the conditional generation of text based on desired textual attributes provided in the prompt, such as its sentiment, domain, or genre. Furthermore, we introduce a new training objective that penalizes the generation of samples fitting another label, to reduce the risk of mislabeled samples in our synthetic dataset. Finally, we fine-tune our model using a differentially private optimizer to provide privacy guarantees for our training data and to prevent information leakage from our model when subsequently sampling our synthetic dataset.
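The differentially private optimizer mentioned above follows the DP-SGD recipe: clip each per-example gradient, sum, and add calibrated Gaussian noise before averaging. A minimal, framework-free sketch of that aggregation step is shown below; gradients are represented as plain Python lists for illustration, whereas real implementations operate on framework tensors.

```python
import math
import random

def dp_sgd_aggregate(per_example_grads, clip_norm, noise_multiplier,
                     rng=random.Random(0)):
    """Clip each per-example gradient to L2 norm `clip_norm`, sum the
    clipped gradients, add Gaussian noise with standard deviation
    `noise_multiplier * clip_norm`, and average over the batch."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down (never up) so the clipped norm is at most clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            total[i] += grad[i] * scale
    return [(t + rng.gauss(0.0, noise_multiplier * clip_norm)) / n
            for t in total]
```

Clipping bounds each example's influence on the update (its sensitivity), which is what lets the added Gaussian noise translate into the per-step privacy guarantee tracked by the accountant.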
3.1 Conditional text generation with natural
language prompts
As we want to control specific textual attributes of our synthetic data, we need to train our model in a manner that allows us to generate different types of texts corresponding to the desired attributes or labels present in our dataset. We consider a text sample to correspond to a set of $M$ attributes of interest, namely $A := \{a_1, a_2, \dots, a_M\}$, where each attribute $a_j$ can take on a set of categorical values $C_j$. In the case of product reviews, $a_1$ could be the sentiment of a review, which can take on the values $a_1 \in C_1 = \{\text{Positive}, \text{Negative}\}$, and $a_2$ could be the product category, so that $a_2 \in C_2 = \{\text{Books}, \text{Electronics}, \text{DVD}, \text{Kitchen}\}$. Our goal is to learn a model $p(x \mid a_1, \dots, a_M)$ in order to controllably synthesize text samples according to our desired attributes.
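One simple way to condition on such an attribute assignment is to prefix each training sample with a prompt spelling out its attribute values. The template below is a hypothetical illustration of this idea, not the paper's verbatim prompt format.

```python
def build_prompt(attributes):
    """Serialize an attribute assignment such as
    {"Sentiment": "Positive", "Category": "Books"} into a short
    natural-language prefix that conditions the generative model."""
    parts = ", ".join(f"{name}: {value}"
                      for name, value in attributes.items())
    return f"[{parts}] "

# During fine-tuning the model sees build_prompt(attrs) + review_text;
# at sampling time, the prompt alone steers generation toward the
# desired attribute values.
```

A single model trained this way covers every attribute combination, since the combination is encoded in the input rather than in separate sets of weights.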
A straightforward approach to realize this would be to train a separate generative model for each possible attribute value combination. This approach is, however, highly memory-intensive, as it requires us to store the weights of a number of models that grows exponentially with the number of categorical attributes. Following recent work