Differentially Private Language Models for Secure Data Sharing

Justus Mattern, RWTH Aachen, justus.mattern@rwth-aachen.de
Zhijing Jin, MPI & ETH Zürich, zjin@tue.mpg.de
Benjamin Weggenmann, SAP Security Research, benjamin.weggenmann@sap.com
Bernhard Schölkopf, MPI for Intelligent Systems, bs@tue.mpg.de
Mrinmaya Sachan, ETH Zürich, msachan@ethz.ch
Abstract
To protect the privacy of individuals whose data is being shared, it is of high importance to develop methods that allow researchers and companies to release textual data while providing formal privacy guarantees to its originators. In the field of NLP, substantial efforts have been directed at building mechanisms following the framework of local differential privacy, thereby anonymizing individual text samples before releasing them. In practice, these approaches often produce unsatisfactory output language due to the strong noise required for local differential privacy. In this paper, we approach the problem at hand using global differential privacy, particularly by training a generative language model in a differentially private manner and subsequently sampling data from it. Using natural language prompts and a new prompt-mismatch loss, we are able to create highly accurate and fluent textual datasets that take on specific desired attributes such as sentiment or topic and resemble the statistical properties of the training data. We perform thorough experiments indicating that our synthetic datasets do not leak information from our original data, are of high language quality, and are highly suitable for training models for further analysis on real-world data. Notably, we also demonstrate that training classifiers on private synthetic data outperforms directly training classifiers on real data with DP-SGD.¹
1 Introduction
Rapid advancements in the field of deep learning and natural language processing (NLP) have enabled companies, public institutions and researchers to extract information and gain knowledge from large-scale data generated by individuals.

* Equal supervision.
¹ Our code is available at https://github.com/justusmattern/private-datasets-with-llms.
[Figure 1 depicts the pipeline: GPT-2 prompt-based DP fine-tuning on the sensitive internal data, followed by GPT-2 prompt-based generation of shareable, anonymous external data, both using the prompt template "Write a [sentiment] review about a [product]: [review]"; the figure shows example review tables with sentiment and product-category labels.]
Figure 1: Main idea of our paper: to share potentially sensitive datasets with third parties, we train a language model (LM) on the sensitive data in a differentially private manner and subsequently prompt the LM to generate synthetic samples with privacy guarantees.
In many cases, it is desirable to share such data with third parties, for example when analyses are performed by external consultants or in order to provide high-quality benchmarks for the research community. This, however, entails a variety of privacy-related risks that cannot merely be solved by pseudonymization: a variety of deanonymization attacks enable the re-identification of individuals from tabular data such as movie ratings (Narayanan and Shmatikov, 2008), geolocation data (Lee et al., 2017) and, notably, also text (Koppel et al., 2009; Shrestha et al., 2017; Fabien et al., 2020). It is therefore highly desirable to develop anonymization mechanisms enabling secure data sharing, ideally with mathematical privacy guarantees as granted by differential privacy (DP) (Dwork and Roth, 2014).

Existing approaches anonymize every text sample individually by obtaining differentially private vector representations (Weggenmann and Kerschbaum, 2018; Fernandes et al., 2019) or by using sequence-to-sequence approaches
that rewrite a given sample to eliminate user-revealing information (Shetty et al., 2018; Feyisetan et al., 2019a, 2020a; Weggenmann et al., 2022), thereby following local differential privacy. As pointed out by Mattern et al. (2022), local DP requires a very high degree of noise, which often leads to incoherent language and only little semantic overlap with the original text. The strict requirements of local DP are, however, not necessary if we assume that an entity aiming to share data already has access to the full collection of user-written texts and only wants to release an anonymized version of it.
In this paper, inspired by recent advances demonstrating the feasibility of training large language models (LLMs) in a differentially private manner (Li et al., 2021), we propose a globally differentially private data release mechanism relying on the generation of a "twin" dataset of the original, sensitive user data from large language models. As depicted in Figure 1, we train GPT-2 (Radford et al., 2019) to generate texts of our original dataset based on prompts inferred from each sample's individual attributes such as sentiment or topic. For fine-tuning, we use a differentially private optimization algorithm in order to protect the content of our training data. Subsequently, we sample from the trained model to generate a large number of synthetic, anonymous texts, resulting in a verifiably private "twin" dataset. We carefully evaluate our proposed method using popular NLP datasets such as IMDb movie reviews and Amazon product reviews. Here, we find that even after learning with strong privacy guarantees such as ε = 3 or ε = 8 from only a very limited number of training samples, such as 25 or 50, our generated data is of high quality, and classifiers trained on it achieve accuracies only about 3% lower than those trained on the full original dataset containing thousands of samples. Notably, we also find that transformer-based classification models trained on private synthetic data outperform models trained on real data with differentially private optimization. Finally, we show that the differentially private fine-tuning procedure effectively minimizes the risk of data leakage from language models that was previously discovered by Carlini et al. (2021).
2 Background
2.1 Differential Privacy
Differential privacy (DP) is a formal notion of privacy that is currently considered the state of the art for quantifying and limiting information disclosure about individuals. It was introduced by Dwork et al. (2006a) under the name ε-indistinguishability, with the goal of giving semantic privacy by quantifying the risk to an individual that results from participation in data collection.

In the original, central model of DP, we consider adjacent datasets that differ by at most one record (i.e., one individual's data). A differentially private query on both databases should yield matching results with similar probabilities, i.e., answers that are probabilistically indistinguishable. This is achieved via random mechanisms that return noisy query results, thus masking the impact of each individual.
Definition 1. Let ε > 0 be a privacy parameter and 0 ≤ δ ≤ 1. A randomized mechanism M on X fulfills (ε, δ)-DP if for any pair of adjacent inputs x, x′ ∈ X and all sets of possible outputs Z ⊆ supp M,

    Pr[M(x) ∈ Z] ≤ e^ε · Pr[M(x′) ∈ Z] + δ.    (1)
In the local model (Duchi et al., 2013), noise is added locally at the data source, before the data is collected and stored in a central database. A basic example is randomized response (Warner, 1965), where each survey participant either provides a truthful or a random answer depending on the flip of an (unbiased) coin. The local model makes the strong assumption that any two inputs are considered adjacent, which often makes it difficult to achieve a satisfying privacy-utility trade-off.
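As an illustration (not part of the original paper), the following minimal sketch implements the randomized-response mechanism described above: the respondent answers truthfully if the first fair coin lands heads, and otherwise a second fair coin determines the reported answer, which yields ε = ln 3 for a binary attribute.

```python
import math
import random

def randomized_response(true_answer: bool) -> bool:
    """Warner's randomized response with two fair coin flips."""
    if random.random() < 0.5:          # first coin: heads -> answer truthfully
        return true_answer
    return random.random() < 0.5       # second coin: heads -> report True

# P(report True | truth True) = 3/4 and P(report True | truth False) = 1/4,
# so the probability ratio is at most 3 and the mechanism is (ln 3)-DP (delta = 0).
epsilon = math.log(3)
print(f"randomized response satisfies {epsilon:.3f}-DP")
```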
2.2 Differentially Private Optimization
An important application of DP is privacy-preserving machine learning, where the goal is to protect the privacy of the training data. Typically, neural networks are trained by optimizing a loss function using stochastic gradient descent (SGD) or a derived method such as Adam (Kingma and Ba, 2015), both of which iteratively compute gradients of the loss function over batches of samples from the training dataset. As shown by Song et al. (2013a), Bassily et al. (2014a) and Abadi et al. (2016a), it is possible to implement a differentially private version of SGD (DP-SGD) by clipping the gradients and applying the Gaussian mechanism (Dwork and Roth, 2014): the latter works by adding noise drawn from an isotropic Gaussian distribution N(0, σ²I), where the standard deviation σ is derived from the desired privacy parameters ε and δ.
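To make the mechanics concrete, here is a minimal NumPy sketch of one DP-SGD update; this is an illustration rather than the authors' implementation, and the toy linear model, clipping norm and noise multiplier are assumptions chosen for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(weights, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=0.5):
    """One DP-SGD step for a linear model with squared loss.

    Each per-example gradient is clipped to L2 norm `clip_norm`, the clipped
    gradients are summed, Gaussian noise with std noise_multiplier * clip_norm
    is added, and the result is averaged over the batch.
    """
    grad_sum = np.zeros_like(weights)
    for x_i, y_i in zip(X, y):
        g = (x_i @ weights - y_i) * x_i                     # per-example gradient
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)     # bound each contribution
        grad_sum += g
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    return weights - lr * (grad_sum + noise) / X.shape[0]

# toy usage on synthetic data
X = rng.normal(size=(32, 5))
y = X @ np.arange(5, dtype=float)
w = np.zeros(5)
for _ in range(200):
    w = dp_sgd_step(w, X, y)
print(np.round(w, 2))
```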
To achieve good privacy-utility trade-offs, it is important to accurately track the total privacy budget spent throughout the entire training. In the context of DP, the repeated execution of the same (here: Gaussian) mechanism is referred to as composition. Basic (Dwork et al., 2006b) and various more refined, advanced composition theorems (Dwork et al., 2010; Dwork and Rothblum, 2016; Bun and Steinke, 2016) have been stated in the literature that aim at providing tight bounds for the overall privacy budget. However, these advances still resulted in relatively loose bounds and thus large overall privacy budgets over the course of highly iterative algorithms such as DP-SGD. Tight worst-case bounds for composition were derived by Kairouz et al. (2015); however, computing them was shown to be computationally infeasible in general (Murtagh and Vadhan, 2016).
For this reason, specific efforts have been made to find tighter bounds and accurate approximations for the overall privacy loss. A first example that provides substantially reduced upper bounds is the moments accountant (Abadi et al., 2016a), which is closely related to Rényi DP (Mironov, 2017), a generalization of DP based on the Rényi divergence. Gaussian DP and f-DP (Dong et al., 2019) provide an approximation of the total budget using the central limit theorem (CLT). Finally, Gopi et al. (2021) and Koskela et al. (2020), inspired by Sommer et al. (2019), are able to compute the exact budget numerically up to arbitrary precision by aggregating the privacy loss random variable with the fast Fourier transform.
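The following sketch illustrates the flavor of Rényi-DP-style accounting in its simplest form; it is a deliberately simplified assumption that ignores the subsampling amplification exploited by real DP-SGD accountants, and the noise scale, step count and grid of orders are illustrative choices. The Gaussian mechanism with noise multiplier σ (and sensitivity 1) has RDP of order α equal to α/(2σ²), RDP composes additively over repeated steps, and the composed RDP is then converted to an (ε, δ) guarantee.

```python
import math

def gaussian_rdp(alpha: float, sigma: float) -> float:
    """Renyi DP of order alpha for the Gaussian mechanism with sensitivity 1."""
    return alpha / (2.0 * sigma ** 2)

def rdp_to_eps(rdp: float, alpha: float, delta: float) -> float:
    """Standard conversion from (alpha, rdp)-RDP to (eps, delta)-DP."""
    return rdp + math.log(1.0 / delta) / (alpha - 1.0)

sigma, steps, delta = 8.0, 100, 1e-5
# RDP composes additively over the 100 mechanism applications; we then take
# the smallest epsilon over a grid of orders alpha. Real DP-SGD accountants
# additionally exploit Poisson subsampling, which greatly reduces the per-step cost.
eps = min(
    rdp_to_eps(steps * gaussian_rdp(a, sigma), a, delta)
    for a in [1.5, 2, 3, 4, 5, 6, 8, 16, 32, 64]
)
print(f"total budget after {steps} steps: eps ≈ {eps:.2f} at delta = {delta}")
```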
3 Approach
We consider the following scenario to motivate our approach: an entity wants to implement NLP pipelines to gain insights from internal data, e.g., emails from customers. To seek advice and get support for modeling the data and building pipelines, the entity aims to share an excerpt of the internal data with a third party such as a consultant or a group of researchers. In order to do this without compromising the privacy of its customers, the aim is to synthesize a verifiably private "toy" dataset that reflects the properties of the original data without leaking private information. On such a toy dataset, a third party could research how to best solve the task at hand and train a model to perform inference on the actual internal data, without being able to access sensitive information about customers.

Formally, we aim to achieve the following goal: we consider a dataset consisting of a training set D_train and a test set D_test. Given D_train or a subset of it, we want to train a generative model to synthesize a dataset D̃_train that does not leak information from the original D_train. Furthermore, the synthesized dataset should share statistical properties with the original one, so that a classification model trained on D̃_train performs as well as if it had been trained on D_train when making predictions about D_test.
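The evaluation protocol implied by this goal can be sketched as follows; this is only an illustration with a bag-of-words classifier and tiny placeholder data (the paper itself uses transformer classifiers): train once on the synthetic twin set and once on the original data, and compare accuracy on the real test set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def train_and_test(train_texts, train_labels, test_texts, test_labels):
    """Train a simple classifier on one dataset and evaluate it on the real test set."""
    vec = TfidfVectorizer()
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(train_texts), train_labels)
    preds = clf.predict(vec.transform(test_texts))
    return accuracy_score(test_labels, preds)

# toy placeholder data standing in for the synthetic twin set, the original
# sensitive training set, and the held-out real test set
synthetic_texts, synthetic_labels = ["great phone", "broken charger"], [1, 0]
real_texts, real_labels = ["really useful device", "does not charge"], [1, 0]
test_texts, test_labels = ["works great", "stopped charging"], [1, 0]

# utility criterion: the first accuracy should be close to the second
print(train_and_test(synthetic_texts, synthetic_labels, test_texts, test_labels))
print(train_and_test(real_texts, real_labels, test_texts, test_labels))
```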
To achieve this, we use the pretrained autoregressive transformer model (Vaswani et al., 2017) GPT-2 (Radford et al., 2019) and use natural language prompts to enable the conditional generation of text based on desired textual attributes, such as its sentiment, domain or genre, provided in the prompt. Furthermore, we introduce a new training objective that penalizes the generation of samples fitting another label, in order to reduce the risk of incorrectly labeled samples in our synthetic dataset. Finally, we fine-tune our model using a differentially private optimizer to provide privacy guarantees for our training data and to prevent information leakage from our model when we subsequently sample our synthetic dataset.
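The sampling stage of this pipeline could look roughly as follows; this is an illustrative sketch rather than the authors' code, it assumes the Hugging Face transformers library, uses the base "gpt2" checkpoint as a stand-in for a model that has already been fine-tuned with a DP optimizer, and reuses the prompt template from Figure 1.

```python
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # placeholder for the DP-fine-tuned checkpoint

def sample_reviews(sentiment: str, product: str, n: int = 4) -> list[str]:
    """Prompt the (DP-trained) LM to generate synthetic labeled reviews."""
    prompt = f"Write a {sentiment} review about a {product}:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        do_sample=True,                 # stochastic sampling instead of greedy decoding
        top_p=0.9,
        max_new_tokens=60,
        num_return_sequences=n,
        pad_token_id=tokenizer.eos_token_id,
    )
    # strip the prompt so only the generated review text remains
    return [tokenizer.decode(o, skip_special_tokens=True)[len(prompt):].strip()
            for o in outputs]

print(sample_reviews("positive", "book"))
```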
3.1 Conditional text generation with natural language prompts
As we want to control specific textual attributes of our synthetic data, we need to train our model in a manner that allows us to generate different types of texts corresponding to the desired attributes or labels present in our dataset. We consider a text sample to correspond to a set of M attributes of interest, namely A := {a1, a2, . . . , aM}, where each attribute aj can take on a set of categorical values Cj. In the case of product reviews, a1 could be the sentiment of a review, which can take on the values a1 ∈ C1 = {Positive, Negative}, and a2 could be the product category, so that a2 ∈ C2 = {Books, Electronics, DVD, Kitchen}. Our goal is to learn a model p(x | a1, . . . , aM) in order to controllably synthesize text samples according to our desired attributes.
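For concreteness, here is a minimal sketch of how such attribute combinations could be serialized into prompted training sequences, following the template shown in Figure 1; the verbalizer dictionary mapping label values to prompt words is an illustrative assumption.

```python
from itertools import product

# attribute value sets C_1 (sentiment) and C_2 (product category)
SENTIMENTS = ["Positive", "Negative"]
CATEGORIES = ["Books", "Electronics", "DVD", "Kitchen"]

# illustrative verbalizers turning categorical values into prompt words
VERBALIZE = {"Positive": "positive", "Negative": "negative",
             "Books": "book", "Electronics": "electronics product",
             "DVD": "DVD", "Kitchen": "kitchen product"}

def training_text(sentiment: str, category: str, review: str) -> str:
    """Serialize one labeled sample into a prompted training sequence."""
    return (f"Write a {VERBALIZE[sentiment]} review about a "
            f"{VERBALIZE[category]}: {review}")

# a single LM can be conditioned on all |C_1| x |C_2| = 8 attribute combinations
for s, c in product(SENTIMENTS, CATEGORIES):
    print(training_text(s, c, "<review text>"))
```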
A straightforward approach to realize this would be to train a single generative model for each possible attribute value combination. This approach is, however, highly memory-intensive, as it requires us to store the weights of a large number of models that grows exponentially with the number of categorical attributes. Following recent work