
help. As prompt-based pre-trained models possess the zero-shot ability to generate candidate entity types, we design two appropriate prompt templates and acquire seed rules automatically based on the consistency of the prompt models' outputs. (2) Once seed rules are obtained, we form a high-quality instance pool to train the downstream task, continuously add potential instances to the pool, and distill new logical rules from the pool in an iterative manner. (3) Based on the converged instance pool, we further fine-tune a new prompt-based model with more suitable prompt templates to obtain more reliable seed rules and yield a better downstream task model. Compared with previous methods, our system no longer relies on manual seed rules or dictionaries, but only needs several prompt templates.
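As a rough illustration of the consistency check in step (1), the sketch below queries a masked language model with two prompt templates and keeps a candidate seed rule only when both templates predict the same entity type. The templates, model choice, and agreement criterion are our own illustrative assumptions, not the paper's exact configuration.

```python
# A minimal sketch of consistency-based seed rule acquisition; the
# templates and the agreement criterion are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Two hypothetical prompt templates for the mention "PD".
templates = [
    "PD is a type of [MASK].",
    "The entity PD belongs to the [MASK] category.",
]

# Take the top-scoring filler token from each template.
preds = [fill_mask(t)[0]["token_str"] for t in templates]

# Keep the candidate rule only if the two templates agree.
if len(set(preds)) == 1:
    print(f"seed rule: TOKENSTRING == PD -> {preds[0]}")
```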
Experiments on three public named entity tagging benchmarks demonstrate the effectiveness of our proposed framework STREAM, with consistent improvements over several baseline models and results that far exceed the state-of-the-art (SOTA) systems. Besides, we perform a detailed ablation study to analyze the quality of the obtained seed rules, the convergence of our proposed iterative framework, and some specific cases of learned logical rules.
Accordingly, the major contributions of our work are summarized as follows:
(1) We introduce large pre-trained prompt-based models to resolve the dilemma that logical rule learning systems require seed rules as a starting point.
(2) We develop an effective and stable framework that distills logical rules in an iterative manner, combining prompt-based fine-tuning and rule distillation to achieve mutual enhancement.
(3) We conduct detailed experiments to illustrate the effectiveness and rationality of our framework: with several predefined prompt templates, our method surpasses previous rule learning systems based on manual rules.
2 Methodology
2.1 Overview
In this work, we adopt named entity tagging as the specific downstream task to enable comparison with previous work on learning logical rules (Li et al., 2021). The diagram of STREAM is visualized in Figure 3.
2.2 Logical Rules
In real scenarios, logical rules can appear in various forms. For convenience, we define logical rules in the unified form "if p then q" (i.e., p → q). In the named entity tagging task, "p" can be any logical expression and "q" is the corresponding entity category. For example, a logical rule may look like: "if the entity's lexical string is PD¹, then its corresponding entity label should be disease".
As demonstrated in previous work (Zhou and Su, 2002), we define five meta logical rules to tag named entities based on their lexical, contextual, and syntactic information. In addition, some combinations of simple logical rules are also considered.
2.2.1 Meta Logical Rules
Following the existing literature, our pre-defined meta rules are: (1) the TOKENSTRING rule matches an entity's lexical string; (2) the PRENGRAM rule matches an entity's preceding context tokens; (3) the POSTNGRAM rule matches an entity's succeeding context tokens; (4) the POSTAG rule matches an entity's part-of-speech tags; (5) the DEPENDENCYREL rule matches the dependency relations between the entity and its headword.
Figure 2: Dependency parsing example for the sentence "Thirty PD patients participated in the study".
Figure 2 shows an example sentence with its dependency structure. In this sentence, the word PD is a potential disease entity, and the following logical rules may exist:
TOKENSTRING == PD → disease
PRENGRAM == thirty → disease
POSTNGRAM == patients → disease
POSTAG == PROPN → disease
DEPENDENCYREL == (compound, patients) → disease
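To make the rule format concrete, here is a minimal sketch of how the five meta rule types and the rules above might be represented; the names (MetaRuleType, Rule) are our own, not the paper's.

```python
# A minimal sketch of the five meta rule types; names are our own.
from dataclasses import dataclass
from enum import Enum, auto

class MetaRuleType(Enum):
    TOKEN_STRING = auto()    # the entity's lexical string
    PRE_NGRAM = auto()       # preceding context tokens
    POST_NGRAM = auto()      # succeeding context tokens
    POS_TAG = auto()         # part-of-speech tag of the entity
    DEPENDENCY_REL = auto()  # (dependency label, headword) pair

@dataclass(frozen=True)
class Rule:
    rule_type: MetaRuleType
    pattern: object  # e.g. "PD", "patients", "PROPN", ("compound", "patients")
    label: str       # the entity category the rule assigns

# The candidate rules for the Figure 2 example.
rules = [
    Rule(MetaRuleType.TOKEN_STRING, "PD", "disease"),
    Rule(MetaRuleType.PRE_NGRAM, "thirty", "disease"),
    Rule(MetaRuleType.POST_NGRAM, "patients", "disease"),
    Rule(MetaRuleType.POS_TAG, "PROPN", "disease"),
    Rule(MetaRuleType.DEPENDENCY_REL, ("compound", "patients"), "disease"),
]
```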
In fact, the above simple rules may sometimes fail to work, so we introduce complex rules, which combine several simple rules into compound rules via the logical connectives and (∧), or (∨), and negation (¬). For example, only a mention that satisfies both the rule POSTNGRAM == patients and the rule POSTAG == PROPN can be a disease entity.
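A small, self-contained sketch of this conjunction, with simple rules expressed as boolean predicates over a mention; the dict layout of a mention is our own assumption.

```python
# A minimal sketch of a compound rule; the mention format is an assumption.
mention = {"token_string": "PD", "post_ngram": "patients", "pos_tag": "PROPN"}

# Simple rules as boolean predicates over a mention.
post_ngram_is_patients = lambda m: m["post_ngram"] == "patients"
pos_tag_is_propn = lambda m: m["pos_tag"] == "PROPN"

# Compound rule: POSTNGRAM == patients AND POSTAG == PROPN -> disease.
if post_ngram_is_patients(mention) and pos_tag_is_propn(mention):
    print("label: disease")
```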
2.2.2 Logical Rules Mining
After defining the form of the meta logical rules, we traverse the entire training set and recall all potential rules that satisfy the format of the meta rules, as sketched below.
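A minimal sketch of this traversal, under our own assumptions about the data format (the paper's actual procedure may differ); it recalls TOKENSTRING, PRENGRAM, and POSTNGRAM candidates and counts their support:

```python
# A minimal sketch of recalling candidate rules from a labeled training set;
# the data format and the restriction to three rule types are assumptions.
from collections import Counter

# Each instance: (tokens, entity span [start, end), entity label).
train = [
    (["Thirty", "PD", "patients", "participated"], (1, 2), "disease"),
]

candidates = Counter()
for tokens, (s, e), label in train:
    candidates[("TOKENSTRING", " ".join(tokens[s:e]), label)] += 1
    if s > 0:
        candidates[("PRENGRAM", tokens[s - 1].lower(), label)] += 1
    if e < len(tokens):
        candidates[("POSTNGRAM", tokens[e].lower(), label)] += 1

# Each key is a potential rule "p -> q"; the counts can later be used to
# score and filter candidates.
for (rtype, pattern, label), support in candidates.items():
    print(f"{rtype} == {pattern} -> {label}  (support={support})")
```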
¹ PD: Parkinson's disease.