To achieve good privacy-utility trade-offs, it is important to accurately track the total privacy budget spent throughout the entire training. In the context of DP, repeated execution of the same (here: Gaussian) mechanism is referred to as composition. Basic (Dwork et al., 2006b) and various more refined, advanced composition theorems (Dwork et al., 2010; Dwork and Rothblum, 2016; Bun and Steinke, 2016) have been stated in the literature that aim at providing tight bounds for the overall privacy budget. However, these advances still resulted in relatively loose bounds, and thus large overall privacy budgets, over the course of highly iterative algorithms such as DP-SGD. Tight worst-case bounds for composition were derived by Kairouz et al. (2015); however, computing them was shown to be computationally infeasible in general (Murtagh and Vadhan, 2016).
For this reason, specific efforts have been made to find tighter bounds and accurate approximations for the overall privacy loss: a first example that provides substantially reduced upper bounds is the moments accountant (Abadi et al., 2016a), which is closely related to Rényi DP (Mironov, 2017), a generalization of DP based on the Rényi divergence. Gaussian DP and $f$-DP (Dong et al., 2019) provide an approximation of the total budget using the central limit theorem (CLT). Finally, Gopi et al. (2021) and Koskela et al. (2020), inspired by Sommer et al. (2019), are able to compute the exact budget numerically up to arbitrary precision by aggregating the privacy loss random variable with the fast Fourier transform.
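To make the accountant idea concrete, the sketch below composes the Rényi-DP budget of repeated Gaussian mechanisms and converts it to an $(\varepsilon, \delta)$-DP guarantee, optimizing over a grid of Rényi orders. This is a simplified illustration: it assumes sensitivity 1 and ignores the subsampling amplification that a real DP-SGD accountant must handle, and the function names are our own.

```python
import math

def gaussian_rdp_epsilon(sigma, steps, alpha):
    """RDP budget at order alpha for `steps` compositions of the
    Gaussian mechanism with sensitivity 1 and noise multiplier sigma.
    Composition in RDP is simply additive across steps."""
    return steps * alpha / (2 * sigma ** 2)

def rdp_to_dp(rdp_eps, alpha, delta):
    """Standard conversion from an RDP guarantee of order alpha to an
    (epsilon, delta)-DP guarantee (Mironov, 2017)."""
    return rdp_eps + math.log(1 / delta) / (alpha - 1)

def total_epsilon(sigma, steps, delta, alphas=range(2, 64)):
    """Tightest epsilon obtainable from the conversion over a grid of
    integer Rényi orders."""
    return min(rdp_to_dp(gaussian_rdp_epsilon(sigma, steps, a), a, delta)
               for a in alphas)
```

Because the Rényi budget is additive, the cost of running more steps (or using less noise) is immediately visible: `total_epsilon(1.0, 100, 1e-5)` is far larger than `total_epsilon(1.0, 1, 1e-5)`.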
3 Approach
We consider the following scenario to motivate our approach: an entity wants to implement NLP pipelines to gain insights from internal data, e.g., emails from customers. To seek advice and get support for modeling the data and building pipelines, the entity aims to share an excerpt of the internal data with a third party such as a consultant or a group of researchers. In order to do this without compromising the privacy of its customers, the aim is to synthesize a verifiably private “toy” dataset that reflects the properties of the original data without leaking private information. On such a toy dataset, a third party could research how to best solve the task at hand and train a model to perform inference on the actual internal data, without being able to access sensitive information about customers. Formally, we aim to achieve the following goal: we consider a dataset consisting of a training set $D_{\text{train}}$ and a test set $D_{\text{test}}$. Given $D_{\text{train}}$ or a subset of it, we want to train a generative model to synthesize a dataset $\widetilde{D}_{\text{train}}$ that does not leak information from the original $D_{\text{train}}$. Furthermore, the synthesized dataset should share statistical properties with the original one, so that a classification model trained on $\widetilde{D}_{\text{train}}$ performs as well on $D_{\text{test}}$ as if it had been trained on $D_{\text{train}}$.
To achieve this, we use the pretrained autoregressive transformer model (Vaswani et al., 2017) GPT-2 (Radford et al., 2019) and use natural language prompts to enable the conditional generation of text based on desired textual attributes provided in the prompt, such as its sentiment, domain, or genre. Furthermore, we introduce a new training objective that penalizes the generation of samples fitting another label, to reduce the risk of mislabeled samples in our synthetic dataset. Finally, we fine-tune our model using a differentially private optimizer to provide privacy guarantees for our training data and to prevent information leakage from our model when subsequently sampling our synthetic dataset.
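The differentially private optimizer mentioned above follows the DP-SGD recipe: clip each per-example gradient, sum, and add calibrated Gaussian noise before averaging. A minimal, framework-free sketch of that aggregation step is shown below; gradients are represented as plain Python lists for illustration, whereas real implementations operate on framework tensors.

```python
import math
import random

def dp_sgd_aggregate(per_example_grads, clip_norm, noise_multiplier,
                     rng=random.Random(0)):
    """Clip each per-example gradient to L2 norm `clip_norm`, sum the
    clipped gradients, add Gaussian noise with standard deviation
    `noise_multiplier * clip_norm`, and average over the batch."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down (never up) so the clipped norm is at most clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            total[i] += grad[i] * scale
    return [(t + rng.gauss(0.0, noise_multiplier * clip_norm)) / n
            for t in total]
```

Clipping bounds each example's influence on the update (its sensitivity), which is what lets the added Gaussian noise translate into the per-step privacy guarantee tracked by the accountant.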
3.1 Conditional text generation with natural
language prompts
As we want to control specific textual attributes of our synthetic data, we need to train our model in a manner that allows us to generate different types of texts corresponding to the desired attributes or labels present in our dataset. We consider a text sample to correspond to a set of $M$ attributes of interest, namely $A := \{a_1, a_2, \dots, a_M\}$, where each attribute $a_j$ can take on a set of categorical values $C_j$. In the case of product reviews, $a_1$ could be the sentiment of a review, which can take on the values $a_1 \in C_1 = \{\text{Positive}, \text{Negative}\}$, and $a_2$ could be the product category, so that $a_2 \in C_2 = \{\text{Books}, \text{Electronics}, \text{DVD}, \text{Kitchen}\}$. Our goal is to learn a model $p(x \mid a_1, \dots, a_M)$ in order to controllably synthesize text samples according to our desired attributes.
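One simple way to condition on such an attribute assignment is to prefix each training sample with a prompt spelling out its attribute values. The template below is a hypothetical illustration of this idea, not the paper's verbatim prompt format.

```python
def build_prompt(attributes):
    """Serialize an attribute assignment such as
    {"Sentiment": "Positive", "Category": "Books"} into a short
    natural-language prefix that conditions the generative model."""
    parts = ", ".join(f"{name}: {value}"
                      for name, value in attributes.items())
    return f"[{parts}] "

# During fine-tuning the model sees build_prompt(attrs) + review_text;
# at sampling time, the prompt alone steers generation toward the
# desired attribute values.
```

A single model trained this way covers every attribute combination, since the combination is encoded in the input rather than in separate sets of weights.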
A straightforward approach to realize this would be to train a separate generative model for each possible attribute value combination. This approach is, however, highly memory-intensive, as it requires us to store the weights of a number of models that grows exponentially with the number of categorical attributes. Following recent work