
been increased effort on identifying clusters of related users to share model heads, such as with K-Means clustering in
PFedKM [Tang et al., 2021] or through clustered personal embeddings in FedEmbed [Silva et al., 2022]. Notably, there
is no prior work that learns both personal and contextual model heads for personalization within federated learning.
Meta-Learning Global Models
An alternate approach to personalizing federated learning models is through the
adoption of meta-learning [Jiang et al., 2019, Fallah et al., 2020] to learn a global model prior to fine-tuning on
client data. After cloning the global model, which aggregates all clients’ updates, as an initialization, local client-side models are
permitted to diverge and fine-tune to a user’s individual preferences or data distribution [Fallah et al., 2020, Deng et al.,
2020, Hanzely and Richtárik, 2020, Hanzely et al., 2020, Lin et al., 2019b]. However, computing and applying gradients
for a full model often requires too much time, power, and memory. As such, expensive full-model gradients can often
only be computed and applied when a device is not actively in use. As in the split-learning literature, there are no
meta-learning approaches for disentangling personal and contextual preferences within personalized federated learning.
Learning with Personal Embeddings
Our work leverages the insight that personal preferences can be represented
using a personalized embedding, allowing the model to condition output predictions on personal preferences without
requiring completely re-trained classification heads or networks. Personal embeddings have been used in prior work
to capture an individual’s “style,” often in imitation learning settings [Tamar et al., 2018, Hsiao et al., 2019, Paleja
et al., 2020, Schrum et al., 2022]. Treating personal embeddings as neural network parameters that are updated on-
device, these approaches learn to embed preferences and condition network outputs on both input data and preference
embeddings. Most closely related to our work are FedNLG [Lu et al., 2021], which predicts “persona” parameters
for users, and the Global+ model in FedEmbed [Silva et al., 2022], which learns a personal embedding for each user.
However, FedNLG requires access to a user’s entire history of language and demographic data to produce a “persona”
for each user, which in turn informs the generation of a “persona” embedding, and Global+ incorporates supervised style
feedback. Prior embedding-based approaches learn only personal embeddings, neglecting stylization through context.
In our work, we explore the utility of incorporating context in addition to personal preferences, and all preference
embeddings are updated solely via a self-supervised language-modeling loss.
Personalization in Language
Personalization for language generation systems seeks to produce grounded systems
that can efficiently adapt to end-user needs [Yang and Flek, 2021, Dudy et al., 2021]. One such approach to personaliza-
tion is to learn a “persona” for each user and condition the language model on the embeddings or representation
for the persona via a memory network [Zhang et al., 2018, Wu et al., 2021, Lu et al., 2021]. “Personas” are generally
short sequences of 5-6 sentences that contain information about an individual, such as “I have blonde hair” or “My
mom is a doctor.” Similar approaches leverage Bayesian inference methods to infer context [Majumder et al., 2020]
or persona [Kim et al., 2020], and then condition the language generation on the inferred context. However, such
approaches involve collecting and maintaining user profiles on a central server, which may violate user confidentiality.
Alternate approaches seek to bypass this issue by enabling dynamic speaker modeling through context-based fine-tuning
rather than conditioning on profile information [Cheng et al., 2019, Li and Liang, 2021]. FedPC leverages a similar
design to dynamically learn personal and context embeddings from a given user’s small dataset, while also preserving
user confidentiality via federated learning.
FedPC represents a new direction in personalized federated learning research, enabling personal and stylized language
generation with a fraction of the memory, data, and compute costs of prior approaches without requiring access to
pre-made personal profiles.
3 Approach
In this section, we present our novel approach to personalization in federated learning with FedPC. FedPC produces
personal and contextual preference embeddings either via backpropagation (i.e., learning preference embeddings) or via
inference (i.e., predicting preference embeddings). A visual overview of our federated learning architecture is shown in Figure
2, and a step-by-step walk-through of our training algorithm is given in Algorithm 1.
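To make these two modes concrete, the following minimal PyTorch sketch contrasts (a) treating a preference embedding as a trainable parameter updated by backpropagation on-device and (b) predicting the embedding with a small encoder network. Module names, dimensions, and the stand-in loss are hypothetical placeholders for illustration; this is not the implementation of Algorithm 1.

import torch
import torch.nn as nn

EMB_DIM = 64  # illustrative embedding size, not taken from the paper

# Mode 1: learn the preference embedding directly via backpropagation on the client.
personal_emb = nn.Parameter(torch.zeros(1, EMB_DIM))
optimizer = torch.optim.Adam([personal_emb], lr=1e-2)  # only the embedding receives updates here
dummy_loss = (personal_emb ** 2).sum()  # stand-in for the self-supervised language-modeling loss
dummy_loss.backward()
optimizer.step()

# Mode 2: predict the preference embedding with an encoder instead of learning it.
class PreferenceEncoder(nn.Module):
    """Maps a user's recent token ids to a preference embedding (inference mode)."""
    def __init__(self, vocab_size: int, emb_dim: int = EMB_DIM):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, emb_dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.token_emb(token_ids).mean(dim=1)  # mean-pool over the sequence

encoder = PreferenceEncoder(vocab_size=32000)
predicted_emb = encoder(torch.randint(0, 32000, (1, 16)))  # shape: (1, EMB_DIM)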
3.1 Personalization via Embeddings
Personalization in FedPC is achieved entirely through preference embeddings. Every input sample (e.g., an incomplete
sentence) is accompanied by both a personal preference embedding, representing the user, and a contextual preference
embedding, representing the context or style of the prediction. These two embeddings are combined via an element-wise
multiplication to produce a single preference embedding that accompanies the input sample. By leveraging both
personal and context embeddings, FedPC considers the individual user and the broader context of an utterance, enabling
personal, stylized prediction.
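To illustrate this mechanism, the sketch below (PyTorch, with hypothetical module names and sizes) fuses the personal and context embeddings with an element-wise product and conditions a toy language model on the result. Prepending the fused embedding as a prefix to the token embeddings is an assumption made for this sketch, not necessarily the conditioning mechanism used by FedPC.

import torch
import torch.nn as nn

EMB_DIM = 64  # illustrative; preference and token embeddings share this size here

class ConditionedLM(nn.Module):
    """Toy language model conditioned on a fused preference embedding."""
    def __init__(self, vocab_size: int):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, EMB_DIM)
        self.rnn = nn.GRU(EMB_DIM, EMB_DIM, batch_first=True)
        self.head = nn.Linear(EMB_DIM, vocab_size)

    def forward(self, token_ids, personal_emb, context_emb):
        preference = personal_emb * context_emb  # element-wise product of the two preference embeddings
        tokens = self.token_emb(token_ids)       # (batch, seq, EMB_DIM)
        prefixed = torch.cat([preference.unsqueeze(1), tokens], dim=1)  # prefix-style conditioning (assumed)
        hidden, _ = self.rnn(prefixed)
        return self.head(hidden[:, 1:])          # logits for each input position

model = ConditionedLM(vocab_size=32000)
personal = torch.randn(1, EMB_DIM)   # personal preference embedding for the user
context = torch.randn(1, EMB_DIM)    # contextual preference embedding for the utterance
tokens = torch.randint(0, 32000, (1, 16))
logits = model(tokens, personal, context)  # shape: (1, 16, 32000)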