
PQLM - MULTILINGUAL DECENTRALIZED PORTABLE QUANTUM LANGUAGE MODEL
Shuyue Stella Li⋆1, Xiangyu Zhang⋆1, Shu Zhou3, Hongchao Shu1
Ruixing Liang1, Hexin Liu4, Leibny Paola Garcia1,2
1Center for Language and Speech Processing, Johns Hopkins University
2Human Language Technology Center of Excellence, Johns Hopkins University
3Department of Physics, Hong Kong University of Science and Technology
4School of Electrical and Electronic Engineering, Nanyang Technological University
ABSTRACT
With careful manipulation, malicious agents can reverse engineer private information encoded in pre-trained language models. Security concerns motivate the development of quantum pre-training. In this work, we propose a highly portable quantum language model (PQLM) that can easily transmit information to downstream tasks on classical machines. The framework consists of a cloud PQLM built with random Variational Quantum Classifiers (VQC) and local models for downstream applications. We demonstrate the ad hoc portability of the quantum model by extracting only the word embeddings and effectively applying them to downstream tasks on classical machines. Our PQLM exhibits comparable performance to its classical counterpart on both intrinsic evaluation (loss, perplexity) and extrinsic evaluation (multilingual sentiment analysis accuracy) metrics. We also perform ablation studies on the factors affecting PQLM performance to analyze model stability. Our work establishes a theoretical foundation for a portable quantum pre-trained language model that could be trained on private data and made available for public use with privacy protection guarantees.
Index Terms—Quantum Machine Learning, Language Modeling, Federated Learning, Model Portability
1. INTRODUCTION
A competitive language model can be extremely useful for downstream tasks such as machine translation and speech recognition despite the domain mismatch between pre-training and downstream tasks [1, 2]. Language models become more powerful with increased training data, but there is a trade-off between data privacy and utility [3]. Previous works on ethical AI have shown that pre-trained language models (PLMs) memorize training data in addition to learning about the language [4, 5], which opens up vulnerabilities for potential adversaries to recover sensitive training data from the model.
⋆Equal contribution in alphabetical order
Fig. 1: Decentralized Quantum Language Model Pipeline. A language model is trained on text data on NISQ servers; the word embeddings are transferred to the downstream models Mi.
While some argue that language models should only be trained on data explicitly produced for public use [6], private data are richer in certain domains compared to public
corpora, including dialogue systems, code-mixing languages,
and medical applications [7, 8]. Therefore, it is essential to
develop new methods to mitigate potential data security and
privacy problems while being able to take advantage of the
rich linguistic information encoded in private data.
Recently, there has been growing interest in leveraging random quantum circuits in neural models to address data privacy issues [9]. The entanglement of states arising from the random configuration of gates in the quantum circuits makes it possible to securely encode sensitive information contained in the training data [10, 9]. The combination of random quantum circuits and decentralized training ensures privacy [11]. Additionally, quantum computing has become the next logical step in the development of deep learning owing to its efficiency in manipulating large tensors [12].
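To make this idea concrete, the sketch below shows how a randomly configured variational quantum circuit can map a classical feature vector to a measurement-based embedding that a classical downstream model could consume. This is an illustrative sketch rather than the authors' implementation: the use of the PennyLane simulator, the qubit count, the layer depth, and the gate layout are all assumptions made for the example.

# Illustrative sketch (assumed setup, not the paper's code): a random
# variational circuit that turns a classical feature vector into an
# embedding given by Pauli-Z expectation values.
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4   # embedding dimension = number of measured qubits (illustrative)
n_layers = 2   # depth of the random variational block (illustrative)

dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def random_vqc_embedding(features, weights):
    # Encode classical features as rotation angles on the qubits.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    # Randomly placed rotations and entangling gates; this random gate
    # configuration is what obscures the encoded training information.
    qml.RandomLayers(weights, wires=range(n_qubits), seed=0)
    # The measured expectation values form the transferable embedding vector.
    return [qml.expval(qml.PauliZ(w)) for w in range(n_qubits)]

weights = np.random.uniform(0, 2 * np.pi, size=(n_layers, n_qubits))
token_features = np.random.uniform(0, np.pi, size=n_qubits)  # stand-in for a token's features
embedding = random_vqc_embedding(token_features, weights)    # length-4 classical vector

Because the output is an ordinary real-valued vector, it can be shipped to classical downstream models exactly as depicted in Fig. 1, without exposing the quantum circuit itself.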
The architecture of a large quantum computer is vastly different from that of a classical computer, as its physical and environmental constraints cannot be met [13]; as a result, a model trained on a large quantum computer is difficult for others to use directly on a classical computer. Ad hoc portability is defined as the model's ability to transmit the most essential information contained in the language model