FEDPC: FEDERATED LEARNING FOR LANGUAGE GENERATION
WITH PERSONAL AND CONTEXT PREFERENCE EMBEDDINGS
A PREPRINT
Andrew Silva
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA
andrew.silva@gatech.edu
Pradyumna Tambwekar
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA
ptambwekar3@gatech.edu
Matthew Gombolay
School of Interactive Computing
Georgia Institute of Technology
Atlanta, GA
matthew.gombolay@cc.gatech.edu
October 11, 2022
ABSTRACT
Federated learning is a training paradigm that learns from multiple distributed users without aggregat-
ing data on a centralized server. Such a paradigm promises the ability to deploy machine-learning
at-scale to a diverse population of end-users without first collecting a large, labeled dataset for all
possible tasks. As federated learning typically averages learning updates across a decentralized popu-
lation, there is a growing need for personalization of federated learning systems (i.e., conversational
agents must be able to personalize to a specific user’s preferences). In this work, we propose a new
direction for personalization research within federated learning, leveraging both personal embed-
dings and shared context embeddings. We also present an approach to predict these “preference”
embeddings, enabling personalization without backpropagation. Compared to state-of-the-art person-
alization baselines, our approach achieves a 50% improvement in test-time perplexity using 0.001% of
the memory required by baseline approaches, while achieving greater sample- and compute-efficiency.
Keywords: Natural Language Processing · Federated Learning · Personalization
1 Introduction
As conversational agents and dialog systems are deployed to real-world scenarios, these systems require data-efficient
personalization paradigms so that they can be effectively adapted on-device.
The benefits of on-device optimization are two-fold: (1) swift adaptation of model behavior based on human interactions
[Dudy et al., 2021], and (2) privacy protection by retaining all data related to the user on-device [Li et al., 2020a].
One of the prevailing paradigms for learning from and engaging with end-users is federated learning. Federated
learning is an inherently decentralized learning paradigm that assumes no access to a large labeled dataset and instead
leverages averaged parameter updates across all users of the system [McMahan et al., 2017]. Such averaged updates
invariably dilute individual preferences or deviations from the mean, resulting in a model that works well for the
average user while failing to appropriately capture under-represented preferences or sub-groups within the data. In this
work, we present a novel approach (FedPC) to personalizing federated learning with personal and context embeddings
(collectively called “preference embeddings”), adapting more efficiently and effectively than prior work with respect to
both data and compute on-device.
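To ground the averaging step that FedPC builds on, the following is a minimal PyTorch sketch of federated averaging with uniform client weighting; the function and variable names (federated_average, client_states) are illustrative assumptions, not code from McMahan et al. [2017] or from this paper.

```python
# Minimal sketch of federated averaging (uniform client weighting); names are illustrative.
import copy
import torch
import torch.nn as nn

def federated_average(global_model: nn.Module, client_states: list) -> nn.Module:
    """Average client state_dicts element-wise and load the result into the global model."""
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        stacked = torch.stack([state[key].float() for state in client_states], dim=0)
        avg_state[key] = stacked.mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model
```

In practice, clients are typically weighted by the size of their local datasets rather than uniformly; either way, the averaging is what dilutes individual preferences, motivating the personalization mechanisms discussed below.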
We leverage the insight that a client’s data distribution is informed by both individual preferences and additional
contextual information. For example, while each user may have their own individual style, there may be more general
population-wide trends that inform the style of personalized predictions (e.g., dialogue assistants helping patients with
cognitive disorders, whereby agents can personalize to individual patients and broader condition-wide trends). While
individual preferences may be unique to each client (e.g., a user’s taste or affect), we can more accurately personalize
to client preferences with the addition of context, as shared-context parameters carry beneficial stylistic information
Denotes equal contribution
Figure 1: Overview of our personalized federated learning setup, FedPC. Language models within client devices,
such as individual agents deployed to communicate with people at hospitals, homes, or construction sites, pull down
global model parameters and context embeddings. Local, on-device data is then paired with both personal and context
embeddings to produce personalized predictions with global model parameters.
across clients [Dudy et al., 2021, Jones, 1999]. Stylistic or situational context provides additional information to curate
relevant language outputs that can be shared across users.
In this work, we contribute a new approach to personalized federated learning that is both easier to learn and more
effective than prior work, and investigate the utility of personalization via individual preferences and contexts. While
prior language generation approaches have developed personal or persona-based generative systems [Wu et al., 2021,
Zhang et al., 2018] or context-based generative systems [Cheng et al., 2019, Lin et al., 2019a] individually, none have
combined them to personalize outputs in a low-data setting under stylized preferences. We show that our approach
is more sample-efficient than state-of-the-art baselines, while requiring less time to train. We additionally present an
inference-only version of our approach, personalizing without backpropagation for new users. Finally, we directly test
the potential for personalization with users who have been held-out from training (i.e., testing with new users). An
overview of our approach is given in Figure 1.
2 Related Work
Federated learning enables machine-learning at-scale to a diverse population of end-users without first collecting a large,
labeled dataset for all possible tasks. After the introduction of federated averaging [McMahan et al., 2017], focus has
shifted to different ways of personalizing to individual users. Prior personalization approaches for federated learning
have typically involved learning personal network heads and a shared global encoder (i.e., “split-learning” approaches
[Gupta and Raskar, 2018]), or learning a separate local model from a global initialization (i.e., a “meta-learning”
approach [Finn et al., 2017, Nichol et al., 2018]).
Learning Personal Model Heads
The most prevalent approach to personalization in federated learning is through
personalized model heads. Such approaches share gradient information to learn a global feature encoder, but retain
user-specific classification-head gradients on-device. Approaches such as FedRep [Collins et al., 2021] solely separate
out local and global gradients, while other methods such as PFedMe [Dinh et al., 2020] enforce constraints on model-
divergence (such as via FedProx [Li et al., 2020b]). Other approaches, such as FedMD [Li and Wang, 2019], enable
clients to adopt any desired architecture, sharing a common backbone but allowing for completely divergent model
heads [Arivazhagan et al., 2019, Kim et al., 2021, Rudovic et al., 2021, Paulik et al., 2021]. Finally, there has recently
been increased effort on identifying clusters of related users to share model heads, such as with K-Means clustering in
PFedKM [Tang et al., 2021] or through clustered personal embeddings in FedEmbed [Silva et al., 2022]. Notably, there
is no prior work which learns both personal and contextual model heads for personalization within federated learning.
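As a concrete illustration of the split between shared and personal parameters in this family of methods, the sketch below pairs a globally averaged encoder with a per-user head kept on-device; the architecture, module names, and sizes are assumptions for illustration, not the implementation of any cited method.

```python
# Minimal sketch of the personal-model-head pattern (FedRep-style): encoder updates are
# federated, while each user's head stays on-device. All names and sizes are illustrative.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):          # averaged across clients by the server
    def __init__(self, vocab_size=1000, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)

    def forward(self, tokens):           # tokens: (batch, seq) of token ids
        out, _ = self.rnn(self.embed(tokens))
        return out[:, -1]                # last hidden state as the shared feature

class PersonalHead(nn.Module):           # kept on the client, never sent to the server
    def __init__(self, hidden=128, num_outputs=10):
        super().__init__()
        self.fc = nn.Linear(hidden, num_outputs)

    def forward(self, features):
        return self.fc(features)

# Only the encoder's state_dict would be uploaded for averaging; each head stays local.
```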
Meta-Learning Global Models
An alternate approach to personalizing federated learning models is through the
adoption of meta-learning [Jiang et al., 2019, Fallah et al., 2020], for learning a global model prior to fine-tuning on
client-data. After cloning the global model (learned from all clients’ updates) as an initialization, local, client-side models are
permitted to diverge and fine-tune to a user’s individual preferences or data distribution [Fallah et al., 2020, Deng et al.,
2020, Hanzely and Richtárik, 2020, Hanzely et al., 2020, Lin et al., 2019b]. However, computing and applying gradients
for a full model often requires too much time, power, and memory. As such, expensive full-model gradients can often
only be computed and applied when a device is not actively in-use. As in the split-learning literature, there are no
meta-learning approaches for disentangling personal and contextual preferences within personalized federated learning.
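For contrast with the embedding-based mechanism developed in this paper, the sketch below shows the basic meta-learning recipe of cloning the global model and fine-tuning the full copy on a client's data; the function name, hyperparameters, and loss are illustrative assumptions.

```python
# Minimal sketch of meta-learning-style personalization: clone the global model and
# fine-tune the whole copy on a client's data. Names and hyperparameters are illustrative.
import copy
import itertools
import torch

def personalize_by_finetuning(global_model, client_loader, steps=5, lr=1e-3):
    local_model = copy.deepcopy(global_model)       # per-client copy of every parameter
    optimizer = torch.optim.SGD(local_model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _, (inputs, targets) in zip(range(steps), itertools.cycle(client_loader)):
        optimizer.zero_grad()
        loss = loss_fn(local_model(inputs), targets)
        loss.backward()                             # full-model gradients: the costly on-device step
        optimizer.step()
    return local_model
```

The deep copy and full backward pass are exactly the memory and compute costs that make this recipe impractical while a device is in active use.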
Learning with Personal Embeddings
Our work leverages the insight that personal preferences can be represented
using a personalized embedding, allowing the model to condition output predictions on personal preferences without
requiring completely re-trained classification heads or networks. Personal embeddings have been used in prior work
to capture an individual’s “style,” often in imitation learning settings [Tamar et al., 2018, Hsiao et al., 2019, Paleja
et al., 2020, Schrum et al., 2022]. Treating personal embeddings as neural network parameters that are updated on-
device, these approaches learn to embed preferences and condition network outputs on both input data and preference
embeddings. Most closely related to our work are FedNLG [Lu et al., 2021], which predicts “persona” parameters
for users, and the Global+ model in FedEmbed [Silva et al., 2022], which learns a personal embedding for each user.
However, FedNLG requires access to a user’s entire history of language and demographic data in order to produce a
“persona” for each user, informing the generation of a “persona” embedding, and Global+ incorporates supervised style
feedback. Prior embedding-based approaches solely learn personal embeddings, neglecting stylization through context.
In our work, we explore the utility of incorporating context in addition to personal preferences, and all preference
embeddings are updated solely via a self-supervised language-modeling loss.
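A minimal sketch of this general pattern follows, assuming a frozen shared model and a single learnable preference vector per user; all shapes, names, and the concatenation-based conditioning are illustrative assumptions rather than details from the cited works.

```python
# Minimal sketch of conditioning on a learned personal embedding: only the embedding is
# optimized on the client while the shared model is frozen. Shapes and names are assumptions.
import torch
import torch.nn as nn

hidden = 128
shared_model = nn.Sequential(                          # stands in for the globally trained network
    nn.Linear(hidden * 2, hidden), nn.ReLU(), nn.Linear(hidden, 10)
)
for p in shared_model.parameters():
    p.requires_grad_(False)                            # global weights stay fixed on-device

personal_embedding = nn.Parameter(torch.zeros(hidden))   # one preference vector per user
optimizer = torch.optim.Adam([personal_embedding], lr=1e-2)

def personalized_forward(features):                    # features: (batch, hidden)
    pref = personal_embedding.expand(features.size(0), -1)
    return shared_model(torch.cat([features, pref], dim=-1))
```

Because only the small embedding receives gradients, on-device updates are far cheaper than re-training a classification head or a full model.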
Personalization in Language
Personalization for language generation systems seeks to produce grounded systems
that can efficiently adapt to end-user needs [Yang and Flek, 2021, Dudy et al., 2021]. One such approach to personaliza-
tion is by learning a “persona” for each user and conditioning the language model on the embeddings or representation
for the persona via a memory network [Zhang et al., 2018, Wu et al., 2021, Lu et al., 2021]. “Personas” are generally
short sequences of 5-6 sentences which contain information about an individual such as “I have blonde hair” or “My
mom is a doctor.” Similar approaches leverage Bayesian inference methods to infer context [Majumder et al., 2020]
or persona [Kim et al., 2020], and then condition the language generation on the inferred context. However, such
approaches involve collecting and maintaining user-profiles on a central server which may violate user-confidentiality.
Alternate approaches seek to bypass this issue by enabling dynamic speaker modeling through context-based fine-tuning
rather than conditioning on profile information [Cheng et al., 2019, Li and Liang, 2021]. FedPC leverages a similar
design to dynamically learn personal and context embeddings from small, user-specific datasets, while
also preserving user-confidentiality via federated learning.
FedPC represents a new direction in personalized federated learning research, enabling personal and stylized language
generation with a fraction of the memory, data, and compute costs of prior approaches without requiring access to
pre-made personal profiles.
3 Approach
In this section, we present our novel approach to personalization in federated learning with FedPC. FedPC produces
personal and contextual preference embeddings either via backpropagation (i.e., learning preference embeddings), or by
inference (i.e., predicting preference embeddings). A visual overview of our federated learning architecture is in Figure
2, and a step-by-step walk-through of our training algorithm is given in Algorithm 1.
3.1 Personalization via Embeddings
Personalization in FedPC is achieved entirely through preference embeddings. Every input sample (e.g., an incomplete
sentence) is accompanied by both a personal preference embedding, representing the user, and a contextual preference
embedding, representing the context or style of the prediction. These two embeddings are combined via an element-wise
multiplication to produce a single preference embedding that accompanies the input sample. By leveraging both
personal and context embeddings, FedPC considers the individual user and the broader context of an utterance, enabling
personal, stylized prediction.
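The sketch below illustrates this combination step: the personal and context embeddings are combined element-wise into a single preference embedding that accompanies the input sample, as described above. How the combined embedding enters the language model (here, prepended as an extra input position) and the GRU stand-in for the actual language model are assumptions for illustration only.

```python
# Minimal sketch of preference conditioning: personal * context -> one preference embedding
# per sample, paired with the input tokens. Architecture details are illustrative assumptions.
import torch
import torch.nn as nn

class PreferenceConditionedLM(nn.Module):
    def __init__(self, vocab_size=50257, hidden=768):
        super().__init__()
        self.token_embed = nn.Embedding(vocab_size, hidden)
        self.backbone = nn.GRU(hidden, hidden, batch_first=True)   # stand-in for a transformer LM
        self.lm_head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, personal_emb, context_emb):
        # Element-wise product yields a single preference embedding per sample.
        preference = personal_emb * context_emb                    # (batch, hidden)
        tok = self.token_embed(tokens)                             # (batch, seq, hidden)
        # Prepend the preference embedding as an extra position conditioning the sequence.
        inputs = torch.cat([preference.unsqueeze(1), tok], dim=1)
        out, _ = self.backbone(inputs)
        return self.lm_head(out[:, 1:])                            # logits for the original positions
```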