PQLM - MULTILINGUAL DECENTRALIZED PORTABLE QUANTUM LANGUAGE MODEL
Shuyue Stella Li*1  Xiangyu Zhang*1  Shu Zhou3  Hongchao Shu1
Ruixing Liang1  Hexin Liu4  Leibny Paola Garcia1,2
1Center for Language and Speech Processing, Johns Hopkins University
2Human Language Technology Center of Excellence, Johns Hopkins University
3Department of Physics, Hong Kong University of Science and Technology
4School of Electrical and Electronic Engineering, Nanyang Technological University
ABSTRACT
With careful manipulation, malicious agents can reverse engineer private information encoded in pre-trained language models. Such security concerns motivate the development of quantum pre-training. In this work, we propose a highly portable quantum language model (PQLM) that can easily transmit information to downstream tasks on classical machines. The framework consists of a cloud PQLM built with random Variational Quantum Classifiers (VQCs) and local models for downstream applications. We demonstrate the ad hoc portability of the quantum model by extracting only the word embeddings and applying them effectively to downstream tasks on classical machines. Our PQLM exhibits performance comparable to its classical counterpart on both intrinsic evaluation (loss, perplexity) and extrinsic evaluation (multilingual sentiment analysis accuracy) metrics. We also perform ablation studies on the factors affecting PQLM performance to analyze model stability. Our work establishes a theoretical foundation for a portable quantum pre-trained language model that could be trained on private data and made available for public use with privacy protection guarantees.
Index Terms— Quantum Machine Learning, Language Modeling, Federated Learning, Model Portability
1. INTRODUCTION
A competitive language model can be extremely useful for downstream tasks such as machine translation and speech recognition, despite the domain mismatch between pre-training and downstream tasks [1, 2]. Language models become more powerful with increased training data, but there is a trade-off between data privacy and utility [3]. Previous work on ethical AI has shown that pre-trained language models (PLMs) memorize training data in addition to learning about the language [4, 5], which opens up vulnerabilities for potential adversaries to recover sensitive training data from the model.
*Equal contribution, in alphabetical order.

Fig. 1: Decentralized Quantum Language Model Pipeline. A language model is trained on text data on NISQ servers; the word embeddings are then transferred to the downstream models Mi.

While some argue that language models should only be trained on data explicitly produced for public use [6], private data are richer in certain domains than public corpora, including dialogue systems, code-mixed languages, and medical applications [7, 8]. It is therefore essential to develop new methods that mitigate potential data security and privacy problems while still taking advantage of the rich linguistic information encoded in private data.
Recently, there has been growing interest in leveraging random quantum circuits in neural models to solve data privacy issues [9]. The entanglement of states from the random configuration of gates in the quantum circuits makes it possible to securely encode sensitive information contained in training data [10, 9]. The combination of random quantum circuits and decentralized training ensures privacy [11]. Additionally, quantum computing has become the next logical step in the development of deep learning for its efficiency in manipulating large tensors [12].
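To make the role of the random circuits concrete, the following is a minimal sketch of a randomly structured Variational Quantum Classifier of the kind the cloud PQLM is built from, written with PennyLane. The qubit count, layer shape, embedding template, and seed are illustrative assumptions, not the paper's exact configuration.

# Minimal sketch of a random VQC (illustrative; not the paper's exact model).
import pennylane as qml
from pennylane import numpy as np

n_qubits = 4
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def vqc(features, weights):
    # Encode classical features into qubit rotation angles.
    qml.AngleEmbedding(features, wires=range(n_qubits))
    # Variational layers with randomly placed gates: the random circuit
    # configuration is what makes the encoding hard to reverse engineer.
    qml.RandomLayers(weights, wires=range(n_qubits), seed=42)
    # Measure one qubit; the expectation value serves as the output score.
    return qml.expval(qml.PauliZ(0))

weights = np.random.uniform(0, 2 * np.pi, size=(2, 8))  # 2 layers, 8 gates each
features = np.array([0.1, 0.5, -0.3, 0.9])
print(vqc(features, weights))  # scalar in [-1, 1], usable as a logit

The weights are trained with a classical optimizer in the usual hybrid quantum-classical loop; only the circuit evaluation runs on quantum hardware or a simulator.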
The architecture of a large quantum computer is vastly different from that of a classical computer, since the physical and environmental constraints of quantum hardware cannot be met by classical machines [13]. This means that a model trained on a large quantum computer is difficult for others to use directly on a classical computer. Ad hoc portability is defined as the model's ability to transmit the most essential information contained in the language model to downstream tasks on classical machines.
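In this setting, the portable artifact is the word-embedding matrix alone. The following is a minimal sketch of how such transferred embeddings could initialize a local classical model, here in PyTorch; the matrix sizes, the random stand-in for the exported embeddings, and the classifier head are illustrative assumptions, not the paper's downstream architecture.

# Sketch of ad hoc portability: only the embedding matrix moves to the
# classical side, where it seeds a local downstream model (illustrative).
import numpy as np
import torch
import torch.nn as nn

vocab_size, embed_dim = 5000, 8  # illustrative sizes
rng = np.random.default_rng(0)
# Stand-in for the embedding matrix exported from the cloud PQLM.
quantum_embeddings = rng.normal(size=(vocab_size, embed_dim))

class DownstreamSentimentModel(nn.Module):
    def __init__(self, embeddings, n_classes=2):
        super().__init__()
        weight = torch.tensor(embeddings, dtype=torch.float32)
        # Freeze the transferred embeddings; only the classical head trains.
        self.embed = nn.Embedding.from_pretrained(weight, freeze=True)
        self.head = nn.Sequential(
            nn.Linear(weight.shape[1], 64), nn.ReLU(), nn.Linear(64, n_classes)
        )

    def forward(self, token_ids):
        # Mean-pool token embeddings into a sentence vector, then classify.
        return self.head(self.embed(token_ids).mean(dim=1))

model = DownstreamSentimentModel(quantum_embeddings)
token_ids = torch.randint(0, vocab_size, (2, 10))  # 2 sentences, 10 tokens each
logits = model(token_ids)  # shape (2, n_classes)

Because only the embedding matrix crosses the quantum-classical boundary, the downstream machine never needs access to the quantum circuits or the private training data.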