fied framework on various entity-intensive QA and
generation tasks, in which we train an entity mem-
ory for efficient knowledge incorporation. First,
EDMem is pre-trained on Wikipedia documents,
where it learns entity embeddings in the memory
along with an encoder-decoder model. EDMem
learns to select relevant entities from the memory
via an entity linking objective, and learns to gener-
ate answers using entity knowledge via a language
modeling objective. Second, to precisely generate
entity names, we design three decoding methods
that utilize the entity linking ability of EDMem
in its generation process when we fine-tune it on
downstream tasks. These include (1) free-form:
left-to-right generation with entity identifiers; (2)
static entity linking: first select entities by entity
linking, build prefix trees for the selected entities,
and then perform constrained entity generation us-
ing the trees; (3) dynamic entity linking: select
entities on-the-fly for constrained entity generation.
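For concreteness, the following is a minimal, self-contained sketch of the prefix-tree (trie) constrained generation idea underlying methods (2) and (3): a trie is built over the token sequences of the selected entity names, and at each decoding step inside an entity span only the tokens reachable from the current prefix are allowed. The entity names, tokenization, and function names here are illustrative placeholders, not EDMem's actual implementation.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # token -> TrieNode
        self.is_end = False  # True if a full entity name ends at this node

def build_prefix_tree(entity_token_seqs):
    """Build a trie from tokenized entity names (lists of tokens)."""
    root = TrieNode()
    for seq in entity_token_seqs:
        node = root
        for tok in seq:
            node = node.children.setdefault(tok, TrieNode())
        node.is_end = True
    return root

def allowed_next_tokens(root, generated_prefix):
    """Return the tokens that may follow the entity-name prefix generated so far."""
    node = root
    for tok in generated_prefix:
        if tok not in node.children:
            return set()  # the prefix does not match any selected entity
        node = node.children[tok]
    return set(node.children)

# Usage: inside an entity span, the decoder's next-token distribution would be
# masked so that only these tokens receive non-zero probability.
trie = build_prefix_tree([["New", "York", "City"], ["New", "Zealand"]])
print(allowed_next_tokens(trie, ["New"]))  # {'York', 'Zealand'}
```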
We conduct experiments on two popular testbeds
of entity knowledge: open-domain QA and entity-
intensive generation. With the incorporation of en-
tity knowledge, EDMem outperforms non-memory
encoder-decoder models on both tasks, and it re-
tains the efficiency advantage of closed-book (i.e.,
non-retrieval) models. Compared to memory-based
auto-encoders, EDMem achieves both higher over-
all accuracy (+9%) and better entity precision
(+8%) on open-domain QA datasets, and it gener-
ates high-quality text from the memory-supported
decoder on generation datasets, whereas auto-encoders
fail to do so. To summarize, EDMem is the first
knowledge-augmented closed-book framework to
perform both tasks in a unified manner.
2 Related Work
Closed-Book Models
Closed-book models are
pre-trained models that store knowledge in their
own parameters. For example, COMET (Bosselut et al., 2019) fine-tuned GPT2 (Radford et al., 2018) to construct knowledge graphs by generating commonsense triples. Recently, fine-tuned BART (Lewis et al., 2020a) and T5 (Raffel et al., 2020) models have proved competitive on open-domain QA (Ye et al., 2020; Roberts et al., 2020). Closed-book models are thus able to memorize some entity knowledge after being pre-trained on massive data. However, studies have shown that closed-book models largely recall inputs and answers similar to those in their pre-training corpus (Wang et al., 2021), and their performance lags behind that of open-book models.
Open-Book Models
Open-book models first retrieve evidence documents from external corpora and read these documents to predict an answer (Chen et al., 2017). REALM (Guu et al., 2020) proposed a self-supervised approach to pre-train a retriever-reader model. DPR (Karpukhin et al., 2020) devised a contrastive objective to train a dense bi-encoder retriever on open-domain QA. Subsequent approaches combined DPR with a generative objective to build large, powerful models for open-domain QA and generation tasks (Lewis et al., 2020b; Izacard and Grave, 2021; Sachan et al., 2021; Yu et al., 2022a). However, open-book models have to process the raw text of all retrieved documents, which leads to extremely long inference time. Moreover, loading the document index and retrieving evidence documents for each example introduces additional overhead.
Entity Memory
EaE (Févry et al., 2020) was the first to pre-train an entity memory with an auto-encoder framework to perform entity prediction on open-domain QA. FILM (Verga et al., 2021) followed EaE and added a fact memory containing representations of Wikidata triples. To better encode relational knowledge, OPQL (Sun et al., 2021) learned latent relational representations for arbitrary entity pairs. Recent work has focused on learning a huge mention-level memory (~150M entries) with extensive pre-training (de Jong et al., 2022) or leveraging the entity memory in domain-adaptive training (Kang et al., 2022). These models are all based on an auto-encoder framework; thus, they can predict entity IDs but fail to generate any non-entity answers or sentences. A contemporaneous preprint trained a memory with an encoder-decoder model (Chen et al., 2022); however, it uses QA pairs as memory entries instead of entities, which limits its application to QA tasks. Moreover, its memory is much larger (60M entries) than ours (1M).
3 Proposed Framework
Suppose we have a pre-defined vocabulary of $N$ entities $\mathcal{E} = \{e_1, \dots, e_N\}$. A mention is the actual tokens in context which refer to an entity. The set of all mentions in the corpus is denoted as $\mathcal{M}$. Thus, there is a global alias table $T: \mathcal{E} \rightarrow 2^{\mathcal{M}}$, where each entity is mapped to all of its mentions.
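As a small illustration (with made-up entities and mention strings, not the actual vocabulary), the alias table $T$ can be viewed as a mapping from each entity to the set of surface forms that refer to it:

```python
# Toy alias table T: E -> 2^M; entities and mentions below are placeholders.
entities = ["United_States", "Barack_Obama"]  # the entity vocabulary E (|E| = N)

alias_table = {
    "United_States": {"United States", "USA", "the U.S."},
    "Barack_Obama": {"Barack Obama", "Obama", "President Obama"},
}

# Every entity in E has an entry listing all of its mentions in M.
assert all(e in alias_table for e in entities)
```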