Privately Fine-Tuning Large Language Models with
Differential Privacy
Rouzbeh Behnia
School of Information Systems
and Management
University of South Florida
Sarasota, USA
behnia@usf.edu
Mohammadreza (Reza) Ebrahimi
School of Information Systems
and Management
University of South Florida
Tampa, USA
ebrahimim@usf.edu
Jason Pacheco
Department of Computer Science
University of Arizona
Tucson, USA
pachecoj@cs.arizona.edu
Balaji Padmanabhan
School of Information Systems
and Management
University of South Florida
Tampa, USA
bp@usf.edu

Equally contributing authors (alphabetically ordered by the last name).
Abstract—Pre-trained Large Language Models (LLMs) are an integral part of modern AI that have led to breakthrough performance in complex AI tasks. Major AI companies with expensive infrastructures are able to develop and train these large models, with millions to billions of parameters, from scratch.
Third parties, researchers, and practitioners are increasingly
adopting these pre-trained models and fine-tuning them on their
private data to accomplish their downstream AI tasks. However,
it has been shown that an adversary can extract/reconstruct the
exact training samples from these LLMs, which can lead to
revealing personally identifiable information. The issue has raised
deep concerns about the privacy of LLMs. Differential privacy
(DP) provides a rigorous framework that allows adding noise in
the process of training or fine-tuning LLMs such that extracting
the training data becomes infeasible (i.e., with a cryptographically small success probability). While the theoretical privacy
guarantees offered in most extant studies assume learning models
from scratch through many training iterations in an asymptotic
setting, this assumption does not hold in fine-tuning scenarios in
which the number of training iterations is significantly smaller.
To address this gap, we present EW-Tune, a DP framework for fine-tuning LLMs based on the Edgeworth accountant with finite-sample privacy guarantees. Our results across four well-established natural language understanding (NLU) tasks show that while EW-Tune adds privacy guarantees to the LLM fine-tuning process, it directly contributes to decreasing the induced noise by up to 5.6% and improves the performance of state-of-the-art LLMs by up to 1.1% across all NLU tasks. We have open-sourced our implementation for wide adoption and public testing.
Index Terms—Differential privacy, large language models, fine-
tuning, Edgeworth accountant
I. INTRODUCTION
Large language models (LLMs) have become an integral
component of modern AI. Deep learning architectures with billions of parameters are often designed based on transformers, a building block popularized by Google's BERT [1]. LLMs
provide breakthrough performance in complex AI tasks such as dialogue systems [2] and text/automated story generation
[3]. Being equipped with the required hardware infrastructure, major AI companies such as OpenAI and Facebook provide new LLMs trained on public data from the Internet [4], [5]. Common examples include RoBERTa [4] and GPT [5]. RoBERTa's training dataset includes English Wikipedia and millions of online news articles crawled from the internet. Similarly, GPT was
trained on outbound links from Reddit.
AI researchers and practitioners often fine-tune these pre-trained models on their own private data to accomplish downstream tasks such as malware detection [6] and text-to-image generation [7]. However, it has recently been shown that these pre-trained models are vulnerable to privacy attacks [8]. This problem is mainly due to the model's tendency to memorize training samples without overfitting, also known as the "memorization issue" [9]. This issue could lead to three major types of privacy
attacks: membership inference, model inversion, and training
data extraction.
Membership inference [10]: determines whether a certain
user’s data was included in the training.
Model inversion [11]: approximates the reconstruction of the training data.
Training data extraction [8]: aims to exactly reveal the training samples, which makes it the most powerful attack with the most adverse consequences for the users.
While all three types of attacks can jeopardize the privacy
of the users whose information is in the training data, training
data extraction directly targets users’ personally identifiable
information and can endanger users' identities by revealing sensitive information such as their address, social security number, or phone number. The fine-tuned LLMs used by
third parties on their private data will face the same privacy
concerns. These concerns necessitate privacy-preserving approaches for fine-tuning LLMs. Such an approach would allow third parties to privately fine-tune LLMs on their private data without leaking information about their private training samples.
Differential Privacy (DP) is a promising approach to ensure training data privacy with theoretical guarantees [12]. DP
provides a mathematically rigorous framework with privacy
guarantees that enables Stochastic Gradient Descent (SGD),
the cornerstone of learning in LLMs, in a private setting.
In such a setting, SGD is applied as a randomized mechanism repeatedly, once per training iteration. To obtain theoretical guarantees, most privacy studies assume the number of SGD applications (known as compositions) is unlimited; this assumption leads to asymptotic privacy guarantees (i.e., infinite compositions of SGD in the limit). However, in LLM fine-tuning the number of SGD iterations is not only limited but also quite small (i.e., on the order of several thousand) [13].
In this study, through a DP lens, and thanks to the finite-sample guarantee achieved by the Edgeworth expansion [14], we propose a novel LLM fine-tuning framework, called EW-Tune, with finite-sample privacy guarantees. EW-Tune operates based on an effective DP accounting approach known as the Edgeworth accountant, proposed in [14]. The Edgeworth accountant computes the amount of noise that must be added to the gradients in SGD to guarantee a certain privacy budget (see Section II-B). EW-Tune also leverages the efficient reparametrization technique recently proposed in [15].
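To make concrete where the accountant's output enters training, the following is a minimal sketch (not the paper's implementation) of one differentially private SGD step in Python with PyTorch: each example's gradient is clipped to an L2 norm bound and Gaussian noise is added before the parameter update. The names model, loss_fn, and batch are illustrative, and the noise multiplier sigma is a placeholder; in a framework such as EW-Tune, a privacy accountant would determine this value for the desired privacy budget and number of compositions.

import torch

def dp_sgd_step(model, loss_fn, batch, lr=1e-3, clip_norm=1.0, sigma=0.8):
    # Accumulator for the sum of clipped per-example gradients.
    summed_grads = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in batch:  # iterate over individual examples (tensors)
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        grads = [p.grad.detach().clone() if p.grad is not None
                 else torch.zeros_like(p) for p in model.parameters()]
        # Clip the per-example gradient to an L2 norm of at most clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
        for acc, g in zip(summed_grads, grads):
            acc.add_(g * scale)
    with torch.no_grad():
        for p, g in zip(model.parameters(), summed_grads):
            # Gaussian noise scaled by the clipping bound and the noise multiplier.
            noise = torch.normal(0.0, sigma * clip_norm, size=g.shape)
            p.add_(-(lr / len(batch)) * (g + noise))

The key design point is that the noise is calibrated to the clipping bound, so each example's influence on the update is bounded regardless of its content.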
A. Our contribution
While EW-Tune is a general framework, we showcase its performance by focusing on its application to enhancing the privacy of LLMs during fine-tuning. Our contributions to private LLM fine-tuning are twofold:
(1) Our study serves as the first step towards fine-tuning LLMs in a differentially private setting when the number of compositions (i.e., the applications of differentially private SGD) is finite and limited to only several thousand (fewer than 4,000 in our experiments). Compared to existing methods that provide an asymptotic bound on the privacy budget, by utilizing the Edgeworth accountant, EW-Tune provides a non-asymptotic privacy bound via the Berry-Esseen bound derived from the Edgeworth approximation. When fine-tuning LLMs, given the finite number of compositions, EW-Tune adds less noise to SGD than the state-of-the-art for the same privacy budget, which directly improves the learning and the accuracy of the model.
(2) It is known that while fine-tuning via DP enhances the model's privacy, it can negatively affect the model's utility (i.e., performance) [12]. Our experiments show that EW-Tune significantly contributes to the state of the art by enhancing the privacy of LLMs while preserving their utility/accuracy, compared to multiple recent alternative methods, across several important downstream benchmark tasks including text classification, entailment detection, and question answering. Overall, EW-Tune decreases the noise induced in SGD by up to 5.6% and enhances the state-of-the-art model's accuracy by up to 1.1%.
II. BACKGROUND AND RELATED WORK
We review three areas of the literature: (1) LLMs, to identify the state of the art in language modeling and their fine-tuning; (2) differentially private deep learning, as the overarching framework to rigorously guarantee the privacy of fine-tuning LLMs; and (3) the Edgeworth accountant, as an emerging accounting method that provides finite-sample guarantees, which could be a useful tool for fine-tuning LLMs.
A. Large Language Models (LLMs)
Large language models are deep neural network archi-
tectures with billions of parameters [16]–[18]. They often
benefit from an encoder-decoder architecture that generates
high-quality representations from sequence data (text, image,
malware, genes, etc.). Most LLMs use specific types of
layers with self-attention mechanisms known as transformers
to dynamically assign weights to input elements based on
their surrounding context [16]. Transformers enable LLMs to
provide high-quality representations of the input sequence. At
a high level, LLMs can be categorized into two types: masked
and autoregressive.
Masked language models are trained to predict a masked
token based on its surroundings. Highly effective examples
of masked language models include BERT [1] and RoBERTa
[16]. In contrast, autoregressive language models learn to
predict the next token based on the previously generated ones,
which makes them suitable for text generation tasks [4], [19].
Due to their ability to produce high-quality representations
from input, masked language models are widely used in major
downstream AI tasks including text classification, question
answering, semantic entailment detection, and speech recog-
nition.
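As a brief illustration of masked-token prediction, the snippet below queries a pre-trained masked language model for the most likely tokens at a masked position. It assumes the Hugging Face transformers library, which the paper does not prescribe; it is included only as an example.

from transformers import pipeline

# Load a pre-trained RoBERTa model for the fill-mask task.
unmasker = pipeline("fill-mask", model="roberta-base")

# RoBERTa uses "<mask>" as its mask token; the model ranks candidate tokens
# for the masked position based on the surrounding context.
print(unmasker("The capital of France is <mask>."))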
Pre-trained LLMs are often fine-tuned on specific tasks and datasets, through which the weights of the original model are updated to better fit the domain-specific data and the task at hand.
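For concreteness, the following is a hedged sketch of such fine-tuning on a sentence-classification task, again assuming the Hugging Face transformers and datasets libraries; the dataset, model checkpoint, and hyperparameters are placeholders rather than the configuration used in this work.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Load a pre-trained masked language model with a fresh classification head.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# Placeholder downstream task: SST-2 sentiment classification from GLUE.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

# Standard (non-private) fine-tuning: all pre-trained weights are updated.
args = TrainingArguments(output_dir="out", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"],
                  eval_dataset=dataset["validation"])
trainer.train()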
B. Differentially Private Deep Learning
Differential privacy [20], formally defined in Definition 1, computes a privacy guarantee when the results of an algorithm run on private data are made public. When applied to machine
learning, a differentially private (DP) mechanism allows for
the public release of the model parameters while ensuring the
privacy of the original training data.
Definition 1: A randomized mechanism $M: \mathcal{X} \rightarrow \mathcal{Y}$ is $(\epsilon, \delta)$-DP if, for all adjacent datasets $X, X' \in \mathcal{X}$ differing in a single element only, and all $Y \subseteq \mathcal{Y}$, $P(M(X) \in Y) \leq e^{\epsilon} P(M(X') \in Y) + \delta$ holds.
In Definition 1, $(\epsilon, \delta)$ is often referred to as the privacy budget. $\epsilon$ defines the distance between the two sides of the inequality in Definition 1.
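As a simple illustration of such a mechanism (not the construction used by EW-Tune), the Gaussian mechanism releases the output of a query with noise scaled to the query's sensitivity. The sketch below uses the classical calibration $\sigma = \Delta \sqrt{2 \ln(1.25/\delta)} / \epsilon$, which satisfies $(\epsilon, \delta)$-DP for $\epsilon < 1$; the numbers in the usage example are placeholders.

import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    # Release `value` under (epsilon, delta)-DP, assuming `sensitivity` is the
    # L2 sensitivity of the query and epsilon < 1 (classical Gaussian mechanism).
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))

# Example: privately release the mean of 1,000 records bounded in [0, 1];
# changing one record shifts the mean by at most 1/1000.
private_mean = gaussian_mechanism(0.42, sensitivity=1e-3, epsilon=0.5, delta=1e-5)
print(private_mean)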