
oriented latent graph (namely, POLar). Unlike explicit syntactic structures, we make use of the two-parameter HardKuma distribution [Bastings et al., 2019] to automatically induce a latent graph according to the task's needs (cf. §4). In particular, we propose a predicate-centered Gaussian inducer for yielding the latent edges, by which words that are nearer and more informative to the predicate receive greater consideration. The POLar is then dynamically pruned, so that only the task-relevant structure is retained while irrelevant edges are dropped. The overall CSRL framework is differentiable and performs predictions end-to-end (cf. Fig. 2).
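To make the induction more concrete, below is a minimal PyTorch sketch of how a predicate-centered Gaussian inducer could parameterise HardKuma edge weights. The bilinear scorers, the sigma bandwidth, and the stretch interval are illustrative assumptions, not the paper's exact design.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PredicateCenteredGaussianInducer(nn.Module):
    """Sketch: score token-predicate edges, bias the scores by a Gaussian
    of the distance to the predicate, and sample soft adjacencies from a
    HardKuma distribution (assumed parameterisation)."""

    def __init__(self, hidden_dim, sigma=5.0):
        super().__init__()
        self.sigma = sigma  # locality bandwidth (assumption)
        self.score_a = nn.Bilinear(hidden_dim, hidden_dim, 1)
        self.score_b = nn.Bilinear(hidden_dim, hidden_dim, 1)

    def forward(self, h, prd_idx):
        # h: (seq_len, hidden_dim) token states; prd_idx: predicate position.
        seq_len = h.size(0)
        prd = h[prd_idx].expand(seq_len, -1)

        # Gaussian locality bias: words nearer the predicate weigh more.
        pos = torch.arange(seq_len, dtype=h.dtype, device=h.device)
        gauss = torch.exp(-((pos - float(prd_idx)) ** 2) / (2 * self.sigma ** 2))

        # Positive HardKuma parameters (a, b) for each token-predicate edge.
        a = F.softplus(self.score_a(h, prd).squeeze(-1)) * gauss + 1e-4
        b = F.softplus(self.score_b(h, prd).squeeze(-1)) + 1e-4

        # Reparameterised HardKuma sample [Bastings et al., 2019]: draw
        # Kuma(a, b) via its inverse CDF, stretch to (l, r), then rectify.
        u = torch.rand(seq_len, device=h.device).clamp(1e-4, 1 - 1e-4)
        k = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)  # Kuma inverse CDF
        l, r = -0.1, 1.1                                  # stretch interval
        return (k * (r - l) + l).clamp(0.0, 1.0)          # edge weights in [0, 1]

Because the rectification places point mass at exactly zero, sampled edges can vanish entirely, which is one natural way to realise the dynamic pruning described above (e.g., further discarding edges whose sampled weight falls below a threshold).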
The BERT [Devlin et al., 2019] pre-trained language model (PLM) is extensively employed in existing works for CSRL performance boosts [Xu et al., 2021; Wu et al., 2021a]. Nevertheless, directly leveraging BERT for CSRL can be problematic. On the one hand, an entire dialogue often consists of far more than two utterances, while raw BERT restricts the input to at most two sentence pieces, which consequently limits the PLM's utility. We therefore adopt DiaBERT [Liu and Lapata, 2019; Li et al., 2020], which is designed to support multiple utterance inputs and thus yields better dialogue-level representations. On the other hand, we note that in CSRL both speakers use personal pronouns from their own perspectives (i.e., 'I', 'you'), so directly feeding the concatenated multi-turn utterances into the PLM unfortunately hurts speaker-role consistency, i.e., the speaker coreference issue. We therefore introduce a coreference-consistency-enhanced DiaBERT (namely CoDiaBERT, cf. Fig. 3) that enhances the PLM's speaker-role sensitivity with a pronoun-based speaker prediction (PSP) strategy.
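To picture the PSP strategy, the following minimal sketch classifies each pronoun token into the speaker it refers to; the linear head, the two-speaker setting, and how pronoun positions are obtained are our assumptions for illustration.

import torch.nn as nn

class PronounSpeakerPrediction(nn.Module):
    """Sketch of a pronoun-based speaker prediction (PSP) objective
    (assumed form): predict, for every personal pronoun, which speaker
    it refers to, so the encoder stays speaker-role consistent."""

    def __init__(self, hidden_dim, num_speakers=2):
        super().__init__()
        self.classifier = nn.Linear(hidden_dim, num_speakers)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, h, pronoun_idx, speaker_labels):
        # h: (seq_len, hidden) CoDiaBERT token states.
        # pronoun_idx: (n,) positions of pronoun tokens ('I', 'you', ...).
        # speaker_labels: (n,) gold speaker id each pronoun refers to.
        logits = self.classifier(h[pronoun_idx])  # (n, num_speakers)
        return self.loss_fn(logits, speaker_labels)

During training, such an auxiliary loss would be added to the main CSRL tagging loss, encouraging the encoder to keep 'I'/'you' grounded to the correct speaker across concatenated turns.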
Our system significantly outperforms strong baselines by large margins on three CSRL benchmarks. In particular, it achieves over 4% F1 improvement in detecting cross-utterance arguments. Further analyses reveal the usefulness of the proposed latent graph and the dynamic pruning method, as well as the CoDiaBERT PLM. We also show that our model effectively alleviates the long-range dependency issue. Overall, we make the following contributions:
• We are the first to propose improving the CSRL task by incorporating a novel latent graph structure.
• We construct a predicate-oriented latent graph via a predicate-centered Gaussian inducer. The structure is dynamically pruned and refined to best meet the task's needs.
• We introduce a PLM for yielding better dialogue-level text representations, which supports multiple utterance sentences and is sensitive to speaker roles.
• Our framework achieves new state-of-the-art CSRL results on three benchmark datasets.
2 Related Work
The SRL task aims at uncovering the shallow semantic structure of text, i.e., 'who did what to whom, where, and when'. As a fundamental natural language processing (NLP)
task, SRL can facilitate a broad range of downstream ap-
plications [Shen and Lapata, 2007; Liu and Gildea, 2010;
Wang et al., 2015]. By leveraging modern neural models, standard SRL has secured strong task performance [Strubell et al., 2018; Li et al., 2019; Fei et al., 2021c].
Figure 2: The overall CSRL framework.
Recently, Xu et al. [2021] pioneered the task
of CSRL by extending regular SRL into the multi-turn dialogue scenario, providing benchmark datasets and a neural CSRL model. Later, a limited number of subsequent works explored this task [Wu et al., 2021b; Wu et al., 2021a], yet several important features of CSRL remain under-considered. In this work, we improve CSRL by fully uncovering these task characteristics.
This work is also closely related to the line of syntax-driven SRL [Marcheggiani and Titov, 2017; Fei et al., 2020c; Fei et al., 2020b]. For regular SRL, the external syntactic dependency structure is a frequently equipped feature for performance enhancement, as SRL shares much underlying structure with syntax [He et al., 2018; Fei et al., 2020a; Fei et al., 2021a]. However, directly benefiting from such convenient syntactic knowledge could be problematic for CSRL, due to the dialogue nature of the text as revealed earlier. We thus propose to construct a latent structure at the dialogue level, so as to facilitate the CSRL task with structural knowledge. In recent years, constructing latent graphs for downstream NLP tasks has received considerable research attention [Choi et al., 2018]. As an alternative to pre-defined syntactic dependency structures yielded by third-party parsers, latent structures induced from the task context can effectively reduce noise [Corro and Titov, 2019] and meanwhile enhance efficacy (i.e., creating task-relevant connections) [Chen et al., 2020]. In this work, we revisit the characteristics of CSRL and, based on the two-parameter HardKuma distribution [Bastings et al., 2019], investigate a predicate-oriented latent graph by proposing a predicate-centered Gaussian inducer.
3 CSRL Framework
Task modeling. Consider a conversation text $U=\{u_t\}_{t=1}^{T}$ ($T$ is the total number of utterances), with each utterance $u_t=\{w_0, w_1, \cdots\}$ a sequence of words ($w_0$ is the utterance speaker). In CSRL the predicate $prd$ is labeled as input at the current (latest) utterance $u_T$. Following Xu et al. [2021], we model the task as a sequence labeling problem with a BIO tagset: the CSRL system identifies and classifies the arguments of a predicate into semantic roles, such as A0, A1, AM-LOC, etc., where we denote the complete role set as $R$. Given $U$