Knowledge Prompts: Injecting World Knowledge into
Language Models through Soft Prompts
Cicero Nogueira dos Santos, Zhe Dong, Daniel Cer, John Nham,
Siamak Shakeri, Jianmo Ni, Yun-hsuan Sung
Google Research
{cicerons, zhedong, cer, jnham, siamaks, jianmon, yhsung}@google.com
Abstract

Soft prompts have recently been proposed as a tool for adapting large frozen language models (LMs) to new tasks. In this work, we repurpose soft prompts for the task of injecting world knowledge into LMs. We introduce a method to train soft prompts via self-supervised learning on data from knowledge bases. The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs. We perform qualitative and quantitative experiments and demonstrate that: (1) KPs can effectively model the structure of the training data; (2) KPs can be used to improve the performance of LMs on different knowledge intensive tasks.
1 Introduction

Very large neural language models (LMs) are known to perform well on knowledge intensive natural language understanding (NLU) tasks because they memorize a significant amount of world knowledge from their training data. The larger the LM, the more facts it can memorize at training time, and the better its results at inference time (Roberts et al., 2020). Despite their success, these models also have important drawbacks: their parametric memory has a fixed size and cannot grow (or shrink) over time without fully retraining the model; there is no control over which part of the memory stores which facts; facts that do not co-occur frequently in the training data are not well represented in the model; very large models are needed to memorize enough data to perform well on knowledge intensive tasks such as generative question answering; and, last but not least, the memorized knowledge becomes obsolete over time, requiring the model to be retrained to stay current.
In this work, we employ soft prompts to overcome some of these issues. Soft prompts (Lester et al., 2021; Li and Liang, 2021; Hambardzumyan et al., 2021) have recently been proposed as a tool for adapting large frozen LMs to new tasks. Here, we repurpose soft prompts for the task of injecting world knowledge into LMs. The goal is to train an external memory that is composed of a large set of soft prompts encoding world knowledge. We introduce a method to train knowledge driven soft prompts via self-supervised learning on data from knowledge bases. The resulting soft prompts, which we call knowledge prompts (KPs), function as an auxiliary memory of the LM that is activated when solving knowledge intensive tasks. Unlike regular applications of soft prompts, which concatenate a small fixed set of embeddings to every input, our approach learns a very large set of KPs that are sparsely activated depending on the input.
We focus on entity-centric KPs, which means that each prompt primarily encodes information about one entity from a knowledge base. We use Wikidata (Vrandečić and Krötzsch, 2014) triples as our training data and train KPs for the top 1.1M entities, ranked by number of triples. We present a qualitative analysis of KPs using t-SNE plots and k-nearest neighbor retrieval. For quantitative analysis, we show experimental results on three knowledge intensive tasks: question answering, fact checking and relation classification. For all datasets, the use of KPs improves the performance of the T5 baseline. Our experimental results demonstrate that KPs are an effective way to expand the memory of frozen LMs.
The main contributions of this work are the following:

• We propose a self-supervised approach to train knowledge driven soft prompts that can be used to inject world knowledge into LMs.

• We demonstrate that knowledge prompts can effectively model the structure of the training data and can also improve the performance of LMs on knowledge intensive tasks.

• This work sheds light on the usability of soft prompts for storing data rather than for storing instructions on how to solve specific tasks.

Figure 1: Training of knowledge prompts: given a serialized KB triple where one of the entities has been masked out, the frozen LM has to predict the masked entity given the input and the knowledge prompt of the non-masked entity (Michelle Obama in this example). The cross-entropy loss is computed and the error is back-propagated through the frozen LM in order to update the KP.
2 Methods
2.1 Soft Prompts
Different approaches have recently been proposed to train soft prompts (Lester et al., 2021; Li and Liang, 2021; Hambardzumyan et al., 2021). One of the most popular methods, and probably the simplest one, consists of the following steps (Lester et al., 2021):

(1) for a given task, prepend a fixed number of embeddings (the soft prompt) to the word embeddings of every input;

(2) during finetuning, update only the soft prompt while keeping all other parameters of the LM frozen.

Despite its simplicity, this approach has proven to be very effective when used with large language models.
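To make the recipe above concrete, the following is a minimal PyTorch sketch of prompt tuning (an illustration, not the exact setup used in this paper): frozen_lm is a placeholder module assumed to accept a sequence of input embeddings, and the prompt length, embedding dimension and initialization scale are arbitrary choices.

```python
import torch
import torch.nn as nn


class PromptTunedLM(nn.Module):
    """Wrap a frozen LM so that only the soft prompt is trainable (sketch)."""

    def __init__(self, frozen_lm: nn.Module, embed_dim: int, prompt_len: int = 20):
        super().__init__()
        self.frozen_lm = frozen_lm
        for p in self.frozen_lm.parameters():
            p.requires_grad = False  # step (2): keep all LM parameters frozen
        # learnable soft prompt: prompt_len embeddings of size embed_dim
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim) word embeddings of the input
        batch_size = token_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch_size, -1, -1)
        # step (1): prepend the soft prompt to the word embeddings of every input
        return self.frozen_lm(torch.cat([prompt, token_embeds], dim=1))
```

During finetuning, the optimizer is built only from the parameters that still require gradients, i.e. the soft prompt, so the LM itself is never updated.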
2.2 Soft Knowledge Prompts
We are interested in training soft knowledge prompts (KPs) that encode world knowledge and can work as an external memory for LMs. In this work, we focus on training entity-centric KPs, each of which stores the knowledge related to a specific entity from a knowledge base (KB). In other words, the KP of an entity encodes information from the KB triples that mention the entity either as a subject or as an object. We adopt KB triples from Wikidata (Vrandečić and Krötzsch, 2014), a simple and trustworthy source of world knowledge.
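Conceptually, this external memory can be pictured as a large embedding table with one row per entity, retrieved by a simple lookup. The sketch below makes simplifying assumptions of our own (a single vector per entity, a placeholder prompt dimension, and a hypothetical entity-to-index mapping) and is not the actual implementation.

```python
import torch
import torch.nn as nn

# Scale taken from the paper (~1.1M Wikidata entities); the prompt
# dimension is a placeholder and depends on the underlying LM.
NUM_ENTITIES = 1_100_000
KP_DIM = 512

# External memory: one soft knowledge prompt per entity, randomly initialized.
kp_memory = nn.Embedding(NUM_ENTITIES, KP_DIM)

# Hypothetical mapping from entity names (or Wikidata IDs) to table rows.
entity_to_row = {"Michelle Obama": 0, "Germany": 1, "Berlin": 2}


def lookup_kp(entity: str) -> torch.Tensor:
    """Retrieve the KP of an entity: a plain embedding lookup."""
    idx = torch.tensor([entity_to_row[entity]])
    return kp_memory(idx)  # shape: (1, KP_DIM)
```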
2.2.1 KP Training
We train KPs with a masked language modeling (MLM) objective (Devlin et al., 2019; Taylor, 1953), where the goal is to generate the object entity of a KB triple given the subject entity and the relation, and vice versa. As an example, the input/target pair "Germany capital <MASK>" / "Berlin" will be used to update the KP for Germany, while the pair "<MASK> capital Berlin" / "Germany" will be used to update the KP for Berlin.
The KPs are randomly initialized and are updated only when the corresponding entities appear (not masked) in the input. This makes the training of KPs sparse and parallelizable.
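The sketch below illustrates how such input/target pairs could be produced from a serialized triple and routed to the KP that should be updated; the function name and tuple layout are illustrative assumptions rather than the paper's data pipeline.

```python
from typing import List, Tuple


def triple_to_examples(subj: str, rel: str, obj: str) -> List[Tuple[str, str, str]]:
    """Turn one serialized KB triple into two MLM examples.

    Each returned tuple is (kp_entity, input_text, target_text); only the
    KP of kp_entity is updated when training on that example.
    """
    return [
        (subj, f"{subj} {rel} <MASK>", obj),  # mask the object, update the subject's KP
        (obj, f"<MASK> {rel} {obj}", subj),   # mask the subject, update the object's KP
    ]


# The example from the text: the triple (Germany, capital, Berlin).
for kp_entity, model_input, target in triple_to_examples("Germany", "capital", "Berlin"):
    print(f"update KP[{kp_entity}]: {model_input!r} -> {target!r}")
# update KP[Germany]: 'Germany capital <MASK>' -> 'Berlin'
# update KP[Berlin]: '<MASK> capital Berlin' -> 'Germany'
```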
Given an input triple with the object entity being masked, a training iteration has the following steps:

(1) retrieve the KP of the subject entity, which is a simple lookup operation;