Privately Fine-Tuning Large Language Models with
Differential Privacy
Rouzbeh Behnia
School of Information Systems
and Management
University of South Florida
Sarasota, USA
behnia@usf.edu
Mohammadreza (Reza) Ebrahimi
School of Information Systems
and Management
University of South Florida
Tampa, USA
ebrahimim@usf.edu
Jason Pacheco
Department of Computer Science
University of Arizona
Tucson, USA
pachecoj@cs.arizona.edu
Balaji Padmanabhan
School of Information Systems
and Management
University of South Florida
Tampa, USA
bp@usf.edu

Equally contributing authors (alphabetically ordered by the last name).
Abstract—Pre-trained Large Language Models (LLMs) are an integral part of modern AI that have led to breakthrough performance in complex AI tasks. Major AI companies with expensive infrastructures are able to develop and train these large models, with millions to billions of parameters, from scratch.
Third parties, researchers, and practitioners are increasingly
adopting these pre-trained models and fine-tuning them on their
private data to accomplish their downstream AI tasks. However,
it has been shown that an adversary can extract/reconstruct the
exact training samples from these LLMs, which can lead to
revealing personally identifiable information. The issue has raised
deep concerns about the privacy of LLMs. Differential privacy
(DP) provides a rigorous framework that allows adding noise in
the process of training or fine-tuning LLMs such that extracting
the training data becomes infeasible (i.e., with a cryptographically small success probability). While the theoretical privacy
guarantees offered in most extant studies assume learning models
from scratch through many training iterations in an asymptotic
setting, this assumption does not hold in fine-tuning scenarios in
which the number of training iterations is significantly smaller.
To address this gap, we present EW-Tune, a DP framework for fine-tuning LLMs based on the Edgeworth accountant with finite-sample privacy guarantees. Our results across four well-established natural language understanding (NLU) tasks show that while EW-Tune adds privacy guarantees to the LLM fine-tuning process, it directly contributes to decreasing the induced noise by up to 5.6% and improves the performance of state-of-the-art LLMs by up to 1.1% across all NLU tasks. We have open-sourced our implementation for wide adoption and public testing.
Index Terms—Differential privacy, large language models, fine-
tuning, Edgeworth accountant
I. INTRODUCTION
Large language models (LLMs) have become an integral
component of modern AI. Deep learning architectures with billions of parameters are often designed based on transformers, a building block popularized by Google's BERT [1]. LLMs
provide breakthrough performance in complex AI tasks such as dialogue systems [2] and text/automated story generation
[3]. Being equipped with the required hardware infrastructure, major AI companies such as OpenAI and Facebook provide new LLMs trained on public data from the Internet [4], [5]. Common examples include RoBERTa [4] and GPT [5]. RoBERTa's training dataset includes English Wikipedia and millions of online news articles crawled from the internet. Similarly, GPT was
trained on outbound links from Reddit.
AI researchers and practitioners often fine-tune these pre-trained models on their own private data to accomplish downstream tasks such as malware detection [6] and text-to-image generation [7]. However, it has recently been shown that these pre-trained models are vulnerable to privacy attacks [8]. This problem is mainly due to the model's tendency to memorize training samples without overfitting, also known as the "memorization issue" [9]. This issue could lead to three major types of privacy
attacks: membership inference, model inversion, and training
data extraction.
Membership inference [10]: determines whether a certain
user’s data was included in the training.
Model inversion [11]: approximates the reconstruction of the training data.
Training data extraction [8]: aims to exactly reveal the training samples, which makes it the most powerful attack with the most adverse consequences for the users.
While all three types of attacks can jeopardize the privacy
of the users whose information is in the training data, training
data extraction directly targets users’ personally identifiable
information and can endanger users' identities by revealing sensitive information such as their address, social security number, or phone number. The fine-tuned LLMs used by
third parties on their private data will face the same privacy
concerns. These concerns necessitate privacy-preserving approaches for fine-tuning LLMs. Such an approach would allow third parties to privately fine-tune LLMs on their private data without leaking information about their private training samples.
Differential Privacy (DP) is a promising approach to ensure training data privacy with theoretical guarantees [12]. DP
provides a mathematically rigorous framework with privacy
guarantees that enables Stochastic Gradient Descent (SGD),
the cornerstone of learning in LLMs, in a private setting.
In such a setting, SGD is applied as a randomized mechanism repeatedly, once per training iteration. To obtain theoretical guarantees, most privacy studies assume the number of SGD applications (known as compositions) is unlimited; this assumption leads to asymptotic privacy guarantees (i.e., infinite compositions of SGD in the limit). However, in LLM fine-tuning the number of SGD iterations is not only limited but also quite small (i.e., on the order of several thousand) [13].
In this study, through a DP lens, and thanks to the finite-sample guarantee achieved by the Edgeworth expansion [14], we propose a novel LLM fine-tuning framework, called EW-Tune, with finite-sample privacy guarantees. EW-Tune operates based on an effective DP accounting approach known as the Edgeworth accountant, proposed in [14]. The Edgeworth accountant computes the amount of noise that must be added to the gradients in SGD to guarantee a certain privacy budget (see Section II-B). EW-Tune also leverages the efficient reparametrization technique recently proposed in [15].
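To make concrete where the accountant's output enters training, the following is a minimal sketch (not the paper's implementation) of one differentially private SGD step in Python with PyTorch: each example's gradient is clipped to an L2 norm bound and Gaussian noise is added before the parameter update. The names model, loss_fn, and batch are illustrative, and the noise multiplier sigma is a placeholder; in a framework such as EW-Tune, a privacy accountant would determine this value for the desired privacy budget and number of compositions.

import torch

def dp_sgd_step(model, loss_fn, batch, lr=1e-3, clip_norm=1.0, sigma=0.8):
    # Accumulator for the sum of clipped per-example gradients.
    summed_grads = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in batch:  # iterate over individual examples (tensors)
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        grads = [p.grad.detach().clone() if p.grad is not None
                 else torch.zeros_like(p) for p in model.parameters()]
        # Clip the per-example gradient to an L2 norm of at most clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
        for acc, g in zip(summed_grads, grads):
            acc.add_(g * scale)
    with torch.no_grad():
        for p, g in zip(model.parameters(), summed_grads):
            # Gaussian noise scaled by the clipping bound and the noise multiplier.
            noise = torch.normal(0.0, sigma * clip_norm, size=g.shape)
            p.add_(-(lr / len(batch)) * (g + noise))

The key design point is that the noise is calibrated to the clipping bound, so each example's influence on the update is bounded regardless of its content.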
A. Our contribution
While EW-Tune is a general framework, we showcase its performance by focusing on its application to enhancing the privacy of LLMs during fine-tuning. Our contributions to private LLM fine-tuning are twofold:
(1) Our study serves as the first step towards fine-tuning LLMs in a differentially private setting when the number of compositions (i.e., the applications of differentially private SGD) is finite and limited to only several thousand (fewer than 4,000 in our experiments). Compared to existing methods that provide an asymptotic bound on the privacy budget, by utilizing the Edgeworth accountant, EW-Tune provides a non-asymptotic privacy bound via the Berry-Esseen bound derived from the Edgeworth approximation. When fine-tuning LLMs, given the finite number of compositions, EW-Tune adds less noise to SGD than the state-of-the-art for the same privacy budget, which directly improves the learning and the accuracy of the model.
(2) It is known that while fine-tuning via DP enhances the model's privacy, it can negatively affect the model's utility (i.e., performance) [12]. Our experiments show that EW-Tune significantly contributes to the state of the art by enhancing the privacy of LLMs while preserving their utility/accuracy, compared to multiple recent alternative methods, across several important downstream benchmark tasks including text classification, entailment detection, and question answering. Overall, EW-Tune decreases the noise induced in SGD by up to 5.6% and enhances the state-of-the-art model's accuracy by up to 1.1%.
II. BACKGROUND AND RELATED WORK
We review three areas of the literature: (1) LLMs, to identify the state of the art in language modeling and their fine-tuning; (2) differentially private deep learning, as the overarching framework to rigorously guarantee the privacy of fine-tuning LLMs; and (3) the Edgeworth accountant, as an emerging accounting method that provides finite-sample guarantees, which could be a useful tool for fine-tuning LLMs.
A. Large Language Models (LLMs)
Large language models are deep neural network archi-
tectures with billions of parameters [16]–[18]. They often
benefit from an encoder-decoder architecture that generates
high-quality representations from sequence data (text, image,
malware, genes, etc.). Most LLMs use specific types of
layers with self-attention mechanisms known as transformers
to dynamically assign weights to input elements based on
their surrounding context [16]. Transformers enable LLMs to
provide high-quality representations of the input sequence. At
a high level, LLMs can be categorized into two types: masked
and autoregressive.
Masked language models are trained to predict a masked
token based on its surroundings. Highly effective examples
of masked language models include BERT [1] and RoBERTa
[16]. In contrast, autoregressive language models learn to
predict the next token based on the previously generated ones,
which makes them suitable for text generation tasks [4], [19].
Due to their ability to produce high-quality representations
from input, masked language models are widely used in major
downstream AI tasks including text classification, question
answering, semantic entailment detection, and speech recog-
nition.
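As a brief illustration of masked-token prediction, the snippet below queries a pre-trained masked language model for the most likely tokens at a masked position. It assumes the Hugging Face transformers library, which the paper does not prescribe; it is included only as an example.

from transformers import pipeline

# Load a pre-trained RoBERTa model for the fill-mask task.
unmasker = pipeline("fill-mask", model="roberta-base")

# RoBERTa uses "<mask>" as its mask token; the model ranks candidate tokens
# for the masked position based on the surrounding context.
print(unmasker("The capital of France is <mask>."))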
Pre-trained LLMs are often fine-tuned on specific tasks and datasets, through which the weights of the original model are updated to better fit the domain-specific data and the task at hand.
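For concreteness, the following is a hedged sketch of such fine-tuning on a sentence-classification task, again assuming the Hugging Face transformers and datasets libraries; the dataset, model checkpoint, and hyperparameters are placeholders rather than the configuration used in this work.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Load a pre-trained masked language model with a fresh classification head.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# Placeholder downstream task: SST-2 sentiment classification from GLUE.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

# Standard (non-private) fine-tuning: all pre-trained weights are updated.
args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()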
B. Differentially Private Deep Learning
Differential privacy [20], formally defined in Definition 1, computes a privacy guarantee when the results of an algorithm run on private data are made public. When applied to machine
learning, a differentially private (DP) mechanism allows for
the public release of the model parameters while ensuring the
privacy of the original training data.
Definition 1: A randomized mechanism $M: \mathcal{X} \rightarrow \mathcal{Y}$ is $(\epsilon, \delta)$-DP if, for all adjacent datasets $X, X' \in \mathcal{X}$ differing in a single element only, and all $Y \subseteq \mathcal{Y}$, $P(M(X) \in Y) \leq e^{\epsilon} P(M(X') \in Y) + \delta$ holds.
In Definition 1, $(\epsilon, \delta)$ is often referred to as the privacy budget. $\epsilon$ defines the distance between the two sides of the inequality in Definition 1.
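As a simple illustration of such a mechanism (not the construction used by EW-Tune), the Gaussian mechanism releases the output of a query with noise scaled to the query's sensitivity. The sketch below uses the classical calibration $\sigma = \Delta \sqrt{2 \ln(1.25/\delta)} / \epsilon$, which satisfies $(\epsilon, \delta)$-DP for $\epsilon < 1$; the numbers in the usage example are placeholders.

import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    # Release `value` under (epsilon, delta)-DP, assuming `sensitivity` is the
    # L2 sensitivity of the query and epsilon < 1 (classical Gaussian mechanism).
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))

# Example: privately release the mean of 1,000 records bounded in [0, 1];
# changing one record shifts the mean by at most 1/1000.
private_mean = gaussian_mechanism(0.42, sensitivity=1e-3, epsilon=0.5, delta=1e-5)
print(private_mean)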