FEDPC: FEDERATED LEARNING FOR LANGUAGE GENERATION
WITH PERSONAL AND CONTEXT PREFERENCE EMBEDDINGS
A PREPRINT
Andrew Silva
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA
andrew.silva@gatech.edu
Pradyumna Tambwekar
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA
ptambwekar3@gatech.edu
Matthew Gombolay
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA
matthew.gombolay@cc.gatech.edu
October 11, 2022
ABSTRACT
Federated learning is a training paradigm that learns from multiple distributed users without aggregat-
ing data on a centralized server. Such a paradigm promises the ability to deploy machine-learning
at-scale to a diverse population of end-users without first collecting a large, labeled dataset for all
possible tasks. As federated learning typically averages learning updates across a decentralized popu-
lation, there is a growing need for personalization of federated learning systems (i.e., conversational
agents must be able to personalize to a specific user’s preferences). In this work, we propose a new
direction for personalization research within federated learning, leveraging both personal embed-
dings and shared context embeddings. We also present an approach to predict these “preference”
embeddings, enabling personalization without backpropagation. Compared to state-of-the-art person-
alization baselines, our approach achieves a 50% improvement in test-time perplexity using 0.001% of
the memory required by baseline approaches, while achieving greater sample- and compute-efficiency.
Keywords: Natural Language Processing · Federated Learning · Personalization
1 Introduction
As conversational agents and dialog systems are deployed to real-world scenarios, these systems require data-efficient
personalization paradigms so that they can be effectively adapted on-device.
The benefits of on-device optimization are two-fold: (1) swift adaptation of model behavior based on human interactions
[Dudy et al., 2021], and (2) privacy protection by retaining all data related to the user on-device [Li et al., 2020a].
One of the prevailing paradigms for learning from and engaging with end-users is federated learning. Federated
learning is an inherently decentralized learning paradigm that assumes no access to a large labeled dataset and instead
leverages averaged parameter updates across all users of the system [McMahan et al., 2017]. Such averaged updates
invariably dilute individual preferences or deviations from the mean, resulting in a model that works well for the
average user while failing to appropriately capture under-represented preferences or sub-groups within the data. In this
work, we present a novel approach (FedPC) to personalizing federated learning with personal and context embeddings
(collectively called “preference embeddings”), adapting more efficiently and effectively than prior work with respect to
both data and compute on-device.
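To ground the averaging step that FedPC builds on, the following is a minimal PyTorch sketch of federated averaging with uniform client weighting; the function and variable names (federated_average, client_states) are illustrative assumptions, not code from McMahan et al. [2017] or from this paper.

```python
# Minimal sketch of federated averaging (uniform client weighting); names are illustrative.
import copy
import torch
import torch.nn as nn

def federated_average(global_model: nn.Module, client_states: list) -> nn.Module:
    """Average client state_dicts element-wise and load the result into the global model."""
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        stacked = torch.stack([state[key].float() for state in client_states], dim=0)
        avg_state[key] = stacked.mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model
```

In practice, clients are typically weighted by the size of their local datasets rather than uniformly; either way, the averaging is what dilutes individual preferences, motivating the personalization mechanisms discussed below.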
We leverage the insight that a client’s data distribution is informed by both individual preferences and additional
contextual information. For example, while each user may have their own individual style, there may be more general
population-wide trends that inform the style of personalized predictions (e.g., dialogue assistants helping patients with
cognitive disorders, whereby agents can personalize to individual patients and broader condition-wide trends). While
individual preferences may be unique to each client (e.g., a user’s taste or affect), we can more accurately personalize
to client preferences with the addition of context, as shared-context parameters carry beneficial stylistic information
Denotes equal contribution
Figure 1: Overview of our personalized federated learning setup, FedPC. Language models within client devices,
such as individual agents deployed to communicate with people at hospitals, homes, or construction sites, pull down
global model parameters and context embeddings. Local, on-device data is then paired with both personal and context
embeddings to produce personalized predictions with global model parameters.
across clients [Dudy et al., 2021, Jones, 1999]. Stylistic or situational context provides additional information to curate
relevant language outputs that can be shared across users.
In this work, we contribute a new approach to personalized federated learning that is both easier to learn and more
effective than prior work, and investigate the utility of personalization via individual preferences and contexts. While
prior language generation approaches have developed personal or persona-based generative systems [Wu et al., 2021,
Zhang et al., 2018] or context-based generative systems [Cheng et al., 2019, Lin et al., 2019a] individually, none have
combined them to personalize outputs in a low-data setting under stylized preferences. We show that our approach
is more sample-efficient than state-of-the-art baselines, while requiring less time to train. We additionally present an
inference-only version of our approach, personalizing without backpropagation for new users. Finally, we directly test
the potential for personalization with users who have been held-out from training (i.e., testing with new users). An
overview of our approach is given in Figure 1.
2 Related Work
Federated learning enables machine-learning at-scale to a diverse population of end-users without first collecting a large,
labeled dataset for all possible tasks. After the introduction of federated averaging [McMahan et al., 2017], focus has
shifted to different ways of personalizing to individual users. Prior personalization approaches for federated learning
have typically involved learning personal network heads and a shared global encoder (i.e., “split-learning” approaches
[Gupta and Raskar, 2018]), or learning a separate local model from a global initialization (i.e., a “meta-learning”
approach [Finn et al., 2017, Nichol et al., 2018]).
Learning Personal Model Heads
The most prevalent approach to personalization in federated learning is through
personalized model heads. Such approaches share gradient information to learn a global feature encoder, but retain
user-specific classification-head gradients on-device. Approaches such as FedRep [Collins et al., 2021] solely separate
out local and global gradients, while other methods such as PFedMe [Dinh et al., 2020] enforce constraints on model-
divergence (such as via FedProx [Li et al., 2020b]). Other approaches, such as FedMD [Li and Wang, 2019], enable
clients to adopt any desired architecture, sharing a common backbone but allowing for completely divergent model
heads [Arivazhagan et al., 2019, Kim et al., 2021, Rudovic et al., 2021, Paulik et al., 2021]. Finally, there has recently
been increased effort on identifying clusters of related users to share model heads, such as with K-Means clustering in
PFedKM [Tang et al., 2021] or through clustered personal embeddings in FedEmbed [Silva et al., 2022]. Notably, there
is no prior work which learns both personal and contextual model heads for personalization within federated learning.
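As a concrete illustration of the split between shared and personal parameters in this family of methods, the sketch below pairs a globally averaged encoder with a per-user head kept on-device; the architecture, module names, and sizes are assumptions for illustration, not the implementation of any cited method.

```python
# Minimal sketch of the personal-model-head pattern (FedRep-style): encoder updates are
# federated, while each user's head stays on-device. All names and sizes are illustrative.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):          # averaged across clients by the server
    def __init__(self, vocab_size=1000, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, tokens):           # tokens: (batch, seq) of token ids
        out, _ = self.rnn(self.embed(tokens))
        return out[:, -1]                # last hidden state as the shared feature

class PersonalHead(nn.Module):           # kept on the client, never sent to the server
    def __init__(self, hidden=128, num_outputs=10):
        super().__init__()
        self.fc = nn.Linear(hidden, num_outputs)

    def forward(self, features):
        return self.fc(features)

# Only the encoder's state_dict would be uploaded for averaging; each head stays local.
```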
Meta-Learning Global Models
An alternate approach to personalizing federated learning models is through the
adoption of meta-learning [Jiang et al., 2019, Fallah et al., 2020], for learning a global model prior to fine-tuning on
client-data. After cloning the global model (learned from all clients’ updates) as an initialization, local, client-side models are
permitted to diverge and fine-tune to a user’s individual preferences or data distribution [Fallah et al., 2020, Deng et al.,
2020, Hanzely and Richtárik, 2020, Hanzely et al., 2020, Lin et al., 2019b]. However, computing and applying gradients
for a full model often requires too much time, power, and memory. As such, expensive full-model gradients can often
only be computed and applied when a device is not actively in-use. As in the split-learning literature, there are no
meta-learning approaches for disentangling personal and contextual preferences within personalized federated learning.
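For contrast with the embedding-based mechanism developed in this paper, the sketch below shows the basic meta-learning recipe of cloning the global model and fine-tuning the full copy on a client's data; the function name, hyperparameters, and loss are illustrative assumptions.

```python
# Minimal sketch of meta-learning-style personalization: clone the global model and
# fine-tune the whole copy on a client's data. Names and hyperparameters are illustrative.
import copy
import itertools
import torch

def personalize_by_finetuning(global_model, client_loader, steps=5, lr=1e-3):
    local_model = copy.deepcopy(global_model)       # per-client copy of every parameter
    optimizer = torch.optim.SGD(local_model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _, (inputs, targets) in zip(range(steps), itertools.cycle(client_loader)):
        optimizer.zero_grad()
        loss = loss_fn(local_model(inputs), targets)
        loss.backward()                             # full-model gradients: the costly on-device step
        optimizer.step()
    return local_model
```

The deep copy and full backward pass are exactly the memory and compute costs that make this recipe impractical while a device is in active use.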
Learning with Personal Embeddings
Our work leverages the insight that personal preferences can be represented
using a personalized embedding, allowing the model to condition output predictions on personal preferences without
requiring completely re-trained classification heads or networks. Personal embeddings have been used in prior work
to capture an individual’s “style,” often in imitation learning settings [Tamar et al., 2018, Hsiao et al., 2019, Paleja
et al., 2020, Schrum et al., 2022]. Treating personal embeddings as neural network parameters that are updated on-
device, these approaches learn to embed preferences and condition network outputs on both input data and preference
embeddings. Most closely related to our work are FedNLG [Lu et al., 2021], which predicts “persona” parameters
for users, and the Global+ model in FedEmbed [Silva et al., 2022], which learns a personal embedding for each user.
However, FedNLG requires access to a user’s entire history of language and demographic data in order to produce a
“persona” for each user, informing the generation of a “persona” embedding, and Global+ incorporates supervised style
feedback. Prior embedding-based approaches solely learn personal embeddings, neglecting stylization through context.
In our work, we explore the utility of incorporating context in addition to personal preferences, and all preference
embeddings are updated solely via a self-supervised language-modeling loss.
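A minimal sketch of this general pattern follows, assuming a frozen shared model and a single learnable preference vector per user; all shapes, names, and the concatenation-based conditioning are illustrative assumptions rather than details from the cited works.

```python
# Minimal sketch of conditioning on a learned personal embedding: only the embedding is
# optimized on the client while the shared model is frozen. Shapes and names are assumptions.
import torch
import torch.nn as nn

hidden = 128
shared_model = nn.Sequential(                          # stands in for the globally trained network
    nn.Linear(hidden * 2, hidden), nn.ReLU(), nn.Linear(hidden, 10)
)
for p in shared_model.parameters():
    p.requires_grad_(False)                            # global weights stay fixed on-device

personal_embedding = nn.Parameter(torch.zeros(hidden))   # one preference vector per user
optimizer = torch.optim.Adam([personal_embedding], lr=1e-2)

def personalized_forward(features):                    # features: (batch, hidden)
    pref = personal_embedding.expand(features.size(0), -1)
    return shared_model(torch.cat([features, pref], dim=-1))
```

Because only the small embedding receives gradients, on-device updates are far cheaper than re-training a classification head or a full model.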
Personalization in Language
Personalization for language generation systems seeks to produce grounded systems
that can efficiently adapt to end-user needs [Yang and Flek, 2021, Dudy et al., 2021]. One such approach to personaliza-
tion is by learning a “persona” for each user and conditioning the language model on the embeddings or representation
for the persona via a memory network [Zhang et al., 2018, Wu et al., 2021, Lu et al., 2021]. “Personas” are generally
short sequences of 5-6 sentences which contain information about an individual such as “I have blonde hair” or “My
mom is a doctor.” Similar approaches leverage Bayesian inference methods to infer context [Majumder et al., 2020]
or persona [Kim et al., 2020], and then condition the language generation on the inferred context. However, such
approaches involve collecting and maintaining user-profiles on a central server which may violate user-confidentiality.
Alternate approaches seek to bypass this issue by enabling dynamic speaker modeling through context-based fine-tuning
rather than conditioning on profile information [Cheng et al., 2019, Li and Liang, 2021]. FedPC leverages a similar
design to dynamically learn personal and context embeddings from small, user-specific datasets, while
also preserving user-confidentiality via federated learning.
FedPC represents a new direction in personalized federated learning research, enabling personal and stylized language
generation with a fraction of the memory, data, and compute costs of prior approaches without requiring access to
pre-made personal profiles.
3 Approach
In this section, we present our novel approach to personalization in federated learning with FedPC. FedPC produces
personal and contextual preference embeddings either via backpropagation (i.e., learning preference embeddings), or by
inference (i.e., predicting preference embeddings). A visual overview of our federated learning architecture is in Figure
2, and a step-by-step walk-through of our training algorithm is given in Algorithm 1.
3.1 Personalization via Embeddings
Personalization in FedPC is achieved entirely through preference embeddings. Every input sample (e.g., an incomplete
sentence) is accompanied by both a personal preference embedding, representing the user, and a contextual preference
embedding, representing the context or style of the prediction. These two embeddings are combined via an element-wise
multiplication to produce a single preference embedding that accompanies the input sample. By leveraging both
personal and context embeddings, FedPC considers the individual user and the broader context of an utterance, enabling
personal, stylized prediction.
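The sketch below illustrates this combination step: the personal and context embeddings are combined element-wise into a single preference embedding that accompanies the input sample, as described above. How the combined embedding enters the language model (here, prepended as an extra input position) and the GRU stand-in for the actual language model are assumptions for illustration only.

```python
# Minimal sketch of preference conditioning: personal * context -> one preference embedding
# per sample, paired with the input tokens. Architecture details are illustrative assumptions.
import torch
import torch.nn as nn

class PreferenceConditionedLM(nn.Module):
    def __init__(self, vocab_size=50257, hidden=768):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, hidden)
        self.backbone = nn.GRU(hidden, hidden, batch_first=True)   # stand-in for a transformer LM
        self.lm_head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, personal_emb, context_emb):
        # Element-wise product yields a single preference embedding per sample.
        preference = personal_emb * context_emb                    # (batch, hidden)
        tok = self.token_embed(tokens)                             # (batch, seq, hidden)
        # Prepend the preference embedding as an extra position conditioning the sequence.
        inputs = torch.cat([preference.unsqueeze(1), tok], dim=1)
        out, _ = self.backbone(inputs)
        return self.lm_head(out[:, 1:])                            # logits for the original positions
```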