
SIGIR ’23, July 23–27, 2023, Taipei, Taiwan Yunzhi Yao, et al.
Figure 1: Schema-aware reference as prompt. We construct a schema-instance hybrid reference store from which we retrieve related knowledge as a prompt for data-efficient learning with PLMs (e.g., BART [24]).
relational triple extraction. [18] formulates event extraction as a conditional generation problem with a manually designed prompt, which achieves high performance with only a small amount of training data.
Existing methods have notable limitations. Unlike general NLP tasks, knowledge graph construction requires structured prediction that adheres to a pre-defined schema. Raw text data for PLMs may not contain sufficient task-specific patterns, leading to a semantic gap between the input sequence and the schema. Constrained prompt templates struggle to fully utilize semantic knowledge and generate schema-conforming outputs. Moreover, prior prompt-based learning relies on the parametric paradigm, which cannot unleash the potential analogical capability of pre-trained language models [4]. Notably, such methods may fail to generalize well to complex examples and perform unstably with limited training data, since scarce or complex examples are not easily learned in parametric space during optimization. For example, texts mentioning the same event type can vary significantly in structure and expression. “A man was hacked to death by the criminal” and “The aircraft received fire from an enemy machine gun” both describe an Attack event, although they share almost no surface wording. With only few-shot training samples, the model may struggle to discriminate such complex patterns and extract the correct information.
To overcome the aforementioned limitations, we try to fully leverage the schema and the global information in the training data as references. Note that humans can use associative learning to recall relevant skills from memory to conquer complex tasks with little practice. Similarly, given the insufficient features of a single sentence in the low-resource setting, it is beneficial to leverage schema knowledge and similar annotated examples to enrich the semantics of individual instances and provide reference [49]. Motivated by this, as shown in Figure 1, we propose a novel approach of schema-aware Reference As Prompt (RAP), which dynamically leverages symbolic schema and knowledge inherited from examples as prompts to enhance PLMs for knowledge graph construction.
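To make the idea concrete, a minimal sketch of the reference-as-prompt formulation follows. It is not the authors' implementation: the separator token, reference format, and helper name are illustrative assumptions about how a retrieved reference could be prepended to the input before feeding a seq2seq PLM such as BART.

```python
# Hedged sketch (not the paper's code): prepend a retrieved
# schema/instance reference to the input sentence so the PLM
# conditions its structured generation on both.

def build_prompted_input(reference: str, sentence: str, sep: str = " </s> ") -> str:
    """Concatenate a retrieved reference with the input sentence.

    The separator token and reference format are assumptions for
    illustration; the actual prompt layout may differ.
    """
    return reference + sep + sentence

# Illustrative reference for an Attack event, assumed format.
ref = "Attack: trigger=hacked; roles=Attacker, Victim"
sent = "A man was hacked to death by the criminal"
print(build_prompted_input(ref, sent))
```

The prompted sequence can then be passed to any generative extraction model unchanged, which is what makes the reference plug-and-play.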
However, two problems remain: (1) Collecting reference knowledge: since the rich schema and the training instances are complementary to each other, they must be combined and mapped accordingly to construct the reference store. (2) Leveraging reference knowledge: integrating such reference knowledge into existing KG construction models in a plug-and-play manner is also challenging, since there are various types of models (e.g., generation-based and classification-based methods).
To address the problem of collecting reference knowledge, we propose a schema-aware reference store that enriches the schema with text instances. Specifically, we align instances from human-annotated and weakly supervised text with the structured schema; thus, symbolic knowledge and textual corpora lie in the same space for representation learning. We then construct a unified reference store containing knowledge derived from both the symbolic schema and the training instances. To address the problem of leveraging reference knowledge, we propose retrieval-based reference integration to select informative knowledge as prompts [54]. Since not all external knowledge is advantageous, we use a retrieval-based method to dynamically select, from the schema-aware reference store, the knowledge most relevant to the input sequence as prompts. In this way, each sample can obtain diverse and suitable knowledgeable prompts that provide rich symbolic guidance in low-resource settings.
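The retrieval step can be sketched as nearest-neighbor search over the reference store. The sketch below is a simplification under stated assumptions: a bag-of-words representation with cosine similarity stands in for whatever dense retriever the full system uses, and the store entries (schema label paired with an instance text) are invented for illustration.

```python
# Illustrative retrieval-based reference selection. Assumptions:
# bag-of-words cosine similarity replaces a learned retriever;
# store entries are (schema_label, instance_text) pairs.
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, store: list[tuple[str, str]], k: int = 2) -> list[str]:
    """Return the schema labels of the k store entries most similar to the query."""
    q = Counter(query.lower().split())
    scored = sorted(
        store,
        key=lambda entry: cosine(q, Counter(entry[1].lower().split())),
        reverse=True,
    )
    return [label for label, _ in scored[:k]]

# Toy reference store (invented examples).
store = [
    ("Attack", "the aircraft received fire from an enemy machine gun"),
    ("Transport", "the convict was moved by vehicle to another prison"),
    ("Meet", "leaders met in the capital for talks"),
]
print(retrieve("enemy soldiers opened fire with a machine gun", store, k=1))
```

The selected reference would then be serialized and prepended to the input sequence, so only knowledge relevant to the current sentence reaches the PLM.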
To demonstrate the effectiveness of our proposed RAP, we apply it to the knowledge graph construction tasks of relational triple extraction and event extraction. Note that our approach is model-agnostic and readily pluggable into previous approaches. We evaluate the model on two relation triple extraction datasets, NYT and WebNLG, and two event extraction datasets, ACE05-E and CASIE. Experimental results show that the RAP model performs better in low-resource settings.
2 PRELIMINARIES
In this paper, we apply our approach, RAP, to two representative
tasks of knowledge graph construction, namely: relation triple
extraction and event extraction.
2.1 Task Definition
Event Extraction. Event extraction is the process of automatically
extracting events from unstructured natural language texts, guided
by an event schema. To clarify the process, the following terms
are used: a trigger word is a word or phrase that most accurately
describes the event, and an event argument is an entity or attribute
involved in the event, such as the time or tool used. For example,
the sentence “A man was hacked to death by the criminal” describes
an Attack event triggered by the word ‘hacked’. This event includes
two argument roles: the Attacker (criminal) and the Victim (a man).
The model should be able to identify event triggers, their types,
arguments, and their corresponding roles.
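The expected output for the running example can be written as a small data structure. This is a hedged sketch: the field names below are illustrative, not the official format of any event extraction dataset.

```python
# Illustrative structured output for "A man was hacked to death by
# the criminal". Field names are assumptions for exposition only.
event = {
    "event_type": "Attack",
    "trigger": "hacked",
    "arguments": [
        {"role": "Attacker", "text": "the criminal"},
        {"role": "Victim", "text": "a man"},
    ],
}

# The model must recover the trigger, its event type, and each
# argument together with its role.
roles = {arg["role"] for arg in event["arguments"]}
print(event["event_type"], event["trigger"], sorted(roles))
```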
Relation Triple Extraction. Joint extraction of entity mentions and their relations, in the form of a triple (subject, relation, object), from unstructured texts is an important task in knowledge graph construction. Given the input sentences, the desired outputs are relational triples (e_head, r, e_tail), where e_head is the head entity, r is the relation, and e_tail is the tail entity. For instance, given