
practically, whether CT may be a useful inductive bias for
neural coreference resolution systems.
In this paper, we attempt to answer these questions through a careful analysis of neural coreference models using various discourse metrics (referred to as centering metrics) and several statistical tests. Because CT, at its core, is a linguistic theory and not a computational one, we first provide a computational operationalization of CT that we can directly implement (§2). Our operationalization requires us to concretely specify the linguistic notions present in the original work (Grosz et al., 1995; Poesio et al., 2004), and it allows us to draw conclusions about how well neural coreference resolvers accord with CT.
In a series of systematic analyses (§5), we first show that neural coreference resolution models achieve relatively high scores under centering metrics, indicating that they do contain some information about discourse coherence, even though they are not trained with any CT signals. In addition, as shown in Fig. 2, there is a non-trivial relationship between CT and coreference, which we quantify via the mutual information between the performance of a coreference resolver and our various CT operationalizations (Chambers and Smyth, 1998; Gordon and Hendrick, 1998). However, the centering scores taper off as coreference models become more accurate (i.e., achieve higher CoNLL F1): the dependence between CT and coreference performance decreases once CoNLL F1 exceeds 50%. This interval, unfortunately, is where all modern coreference resolution models lie. This indicates that entity coherence information is no longer helpful in improving current neural coreference resolution systems.
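The dependence measure used above can be made concrete. The following is a minimal plug-in estimator of the mutual information between two paired score arrays (e.g., a model's CoNLL F1 and a centering metric), using equal-width binning; the function and the synthetic data below are our own illustrative sketch, not the paper's implementation or results.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys, bins=10):
    """Plug-in estimate of I(X; Y) in nats: discretize each variable
    into equal-width bins and compare the empirical joint distribution
    to the product of its marginals."""
    def binned(vals):
        lo, hi = min(vals), max(vals)
        width = (hi - lo) / bins or 1.0   # guard against zero range
        return [min(int((v - lo) / width), bins - 1) for v in vals]
    bx, by = binned(xs), binned(ys)
    n = len(xs)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in pxy.items())

# Synthetic illustration: a noisy score that tracks a hypothetical F1
# score shows clearly higher MI with it than an unrelated score does.
random.seed(0)
f1 = [random.uniform(0, 100) for _ in range(5000)]
tracking = [v / 100 + random.gauss(0, 0.1) for v in f1]
unrelated = [random.random() for _ in f1]
print(mutual_information(f1, tracking) > mutual_information(f1, unrelated))  # → True
```

Because the plug-in estimate is a KL divergence between the empirical joint and the product of its marginals, it is always nonnegative; the bin count trades off resolution against estimation bias.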
Next, we turn to the question: Where in their architecture do neural coreference systems capture this CT information? Our experiments on the well-known C2F coreference model with SpanBERT embeddings (Joshi et al., 2020) (§5.3) reveal that the contextualized SpanBERT embeddings contain much of the coherence information, which explains why incorporating elements of CT yields only minor improvements to a neural coreference system.
Finally, we explore the question: What information required for coreference resolution is not captured by CT? We show that CT does not capture factors such as recency bias and world knowledge (§6), which might be required for the task of coreference resolution. To explore the role of recency bias, we extend our CT formulation to account for this bias by controlling the salience of centers. We show that this reformulation of CT captures coreference information better than vanilla CT at the same centering score level. We end with a summary of takeaways from our work.
2 Coreference and Centering Theory
In this section, we overview the necessary background on coreference and centering theory in our own notation. We define a discourse $D = [U_1, \ldots, U_N]$ of length $N$ as a sequence of $N$ utterances, each denoted $U_n$. We take an utterance $U_n$ of length $M$ to be a string of tokens $t_1 \cdots t_M$, where each token $t_m$ is taken from a vocabulary $V$.¹ Let $\mathcal{M}(U_n) = \{m_1, m_2, \ldots\}$ be the set of mentions in the utterance $U_n$. A mention is a subsequence of the tokens that comprise $U_n = t_1 \cdots t_M$. Mentions could be pronouns, repeated noun phrases, and so forth, and are often called anaphoric devices in the discourse literature.
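To make the notation concrete, the following sketch (our own encoding, not code from the paper) represents a discourse as a list of token lists and a mention as a token span within one utterance:

```python
from dataclasses import dataclass

# A minimal encoding of the notation above: a discourse D is a list of
# utterances U_n, each a list of tokens, and a mention m in M(U_n) is a
# token subsequence of U_n, identified here by its span offsets.

@dataclass(frozen=True)
class Mention:
    utterance_idx: int   # which U_n the mention occurs in
    start: int           # token offset of the span start (inclusive)
    end: int             # token offset of the span end (exclusive)

def mentions_of(n, spans):
    """M(U_n): the mention set for utterance U_n, built from given spans."""
    return {Mention(n, s, e) for (s, e) in spans}

discourse = [["Mike", "saw", "his", "dog"], ["He", "fed", "it"]]
m_u1 = mentions_of(0, [(0, 1), (2, 4)])   # "Mike", "his dog"
m_u2 = mentions_of(1, [(0, 1), (2, 3)])   # "He", "it"
print(len(m_u1 | m_u2))  # → 4
```

The union of the per-utterance sets corresponds to $\mathcal{M}(D)$, the full mention set of the discourse.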
2.1 Coreference
Next, let $\mathcal{E}$ be the set of entities in the world. A coreference resolver $f : \mathcal{M}(D) \to \mathcal{E}$ implements a function from the set of mentions onto the set of entities (henceforth also referred to as the mention–entity mapping $f$).² In Table 1, $\llbracket \cdot \rrbracket$ denotes $f(\cdot)$ for illustration, i.e., a mention, e.g., Mike, is mapped to the entity $\llbracket \text{Mike} \rrbracket_i$. Here we reuse the notation $\mathcal{M}(\cdot)$, where $\mathcal{M}(D) \stackrel{\text{def}}{=} \bigcup_{U_n \in D} \mathcal{M}(U_n)$. Rule-based or feature-based coreference resolvers (Hobbs, 1978; Sidner, 1979; Brennan et al., 1987; Kong et al., 2009) resolve coreference by explicitly combining CT constraints or syntactic constraints. Current state-of-the-art coreference resolvers are end-to-end neural models (Lee et al., 2017; Joshi et al., 2020; Wu et al., 2020).
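With mentions given, the mapping $f$ can be sketched as a plain dictionary from mention spans to entity ids; inverting it recovers the familiar coreference clusters. The mentions and entity ids below are hypothetical illustrations, not data from the paper:

```python
# A sketch of the mention-entity mapping f: each (given) mention is
# assigned an entity id, so coreferent mentions share an id, which is
# equivalent to partitioning M(D) into entity clusters.

def to_clusters(f):
    """Invert f: M(D) -> E into entity clusters, i.e. f^{-1}(e) per entity e."""
    clusters = {}
    for mention, entity in f.items():
        clusters.setdefault(entity, set()).add(mention)
    return clusters

# Hypothetical resolution over a two-utterance discourse:
# "Mike" and "He" map to entity 0; "his dog" and "it" to entity 1.
f = {("U1", "Mike"): 0, ("U1", "his dog"): 1,
     ("U2", "He"): 0, ("U2", "it"): 1}
print(sorted(len(c) for c in to_clusters(f).values()))  # → [2, 2]
```

This view makes explicit that, once mentions are assumed given, $f$ reduces to the entity-linking step of coreference resolution.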
2.2 Centering Theory
Centering theory (CT) offers a theoretical expla-
nation of local discourse structure that models the
interaction of referential continuity and the salience
¹ This definition of an utterance could be understood as a textual unit as short as a clause or as long as multiple paragraphs; we have left it intentionally open-ended and will revisit this point in §4.
² In general, coreference resolution includes a mention detection step. In our analysis, we assume the mentions to be given. Thus, $f$ can essentially be thought of as an implementation of the entity-linking step in coreference resolution.