2 Background and related work
Discourse frameworks
Questions Under Discussion is a general framework with vast theoretical research, especially in pragmatics, e.g., information structure (Roberts, 2012; Büring, 2003; Velleman and Beaver, 2016), presuppositions (Simons et al., 2010), and implicature (Hirschberg, 1985; Van Kuppevelt, 1996; Jasinskaja et al., 2017). Ginzburg et al. (1996) extended Stalnaker (1978)’s dynamic view of context to dialogue by integrating QUD with dialogue semantics, where the speakers are viewed as interactively posing and resolving queries. In QUD analysis of monologue, each sentence aims to answer a (mostly implicit) question triggered in prior context. Sometimes the questions form hierarchical relationships (stacks where larger questions have sub-questions, starting from the root question “What is the way things are?”) (Büring, 2003; Roberts, 2004; De Kuthy et al., 2018; Riester, 2019). However, because of the inherent subjectivity among naturally elicited QUD questions (Westera et al., 2020; Ko et al., 2020), we leave question relationships for future work.
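As a purely illustrative sketch of such a stack (the sub-questions below are invented, not drawn from the cited work), each question is progressively refined by the sub-questions that partially address it:

```python
# A QUD stack as nested questions: the maximally general root QUD
# dominates increasingly specific sub-questions (invented examples).
qud_stack = {
    "question": "What is the way things are?",
    "subquestions": [
        {
            "question": "What happened at the meeting?",
            "subquestions": [
                {"question": "Who attended?", "subquestions": []},
                {"question": "What was decided?", "subquestions": []},
            ],
        },
    ],
}
```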
QUD and coherence structures are closely related. Prior theoretical work looked into the mapping of QUDs to discourse relations (Jasinskaja et al., 2008; Onea, 2016) or the integration of the two (Van Kuppevelt, 1996). Hunter and Abrusán (2015) and Riester (2019) studied structural correspondences between QUD stacks and SDRT specifically. Westera et al. (2020) showed that QUD could be a useful tool to quantitatively study the predictability of discourse relations (Garvey and Caramazza, 1974; Kehler et al., 2008; Bott and Solstad, 2014). In Pyatkin et al. (2020), discourse relation taxonomies were also converted to templatic questions, though not in the QUD context.
Traditionally, discourse “dependency parsing” refers to parsing the RST structure (Hirao et al., 2013; Bhatia et al., 2015; Morey et al., 2018). Since QUD structures are marked by free-form questions, the key aspect of “parsing” a QUD structure is question generation, yielding a very different task and type of structure from RST parsing. As we show in this paper, the two are complementary rather than comparable. This work focuses on automating and evaluating a QUD parser; we leave it to future work to explore which types of structure are helpful in different downstream tasks.
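To make the contrast with RST parsing concrete, the following is a minimal sketch of the QUD parsing interface (our own illustration; the stub heuristics stand in for model calls and are not the paper’s method): for each sentence, the parser selects an anchor among the preceding sentences and generates the free-form question that links the two.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class QUDEdge:
    """One QUD dependency: `answer` answers a question arising from `anchor`."""
    anchor: int    # index of the anchor sentence in prior context
    answer: int    # index of the sentence that answers the question
    question: str  # free-form QUD linking anchor -> answer

def predict_anchor(sentences: List[str], answer: int) -> int:
    # Placeholder heuristic: anchor to the immediately preceding
    # sentence; a real parser would score all preceding sentences.
    return answer - 1

def generate_question(sentences: List[str], anchor: int, answer: int) -> str:
    # Placeholder: a real parser would condition a question-generation
    # model on the article, the anchor, and the answer sentence.
    return f"What does sentence {answer + 1} add to sentence {anchor + 1}?"

def parse_qud(sentences: List[str]) -> List[QUDEdge]:
    """For every sentence after the first, pick an anchor among the
    preceding sentences and generate the question it answers."""
    edges = []
    for i in range(1, len(sentences)):
        anchor = predict_anchor(sentences, i)
        question = generate_question(sentences, anchor, i)
        edges.append(QUDEdge(anchor, i, question))
    return edges
```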
The DCQA dataset
Corpora specific to QUD are scarce. Existing work includes a handful of interviews and 40 German driving reports annotated with question stacks (De Kuthy et al., 2018; Hesse et al., 2020), as well as Westera et al. (2020)’s 6 TED talks annotated following Kehler and Rohde (2017)’s expectation-driven model (eliciting questions without seeing upcoming context). Ko et al. (2020)’s larger INQUISITIVE question dataset is annotated in a similar manner, but INQUISITIVE only provides questions for the first 5 sentences of each article, and it does not annotate answers.
This work, in contrast, repurposes the much larger DCQA dataset (Ko et al., 2022), consisting of more than 22K questions crowdsourced across 606 news articles. DCQA was proposed as a way to more reliably and efficiently collect data for training QA systems to answer high-level questions, specifically the QUD questions in INQUISITIVE. Though not originally designed for QUD parsing, DCQA is suitable for our work because its annotation procedure follows the reactive model of processing that is standard in QUD analysis (Benz and Jasinskaja, 2017), where questions are elicited after observing the upcoming context. Concretely, for each sentence in the article, the annotator writes a QUD such that the sentence is its answer, and identifies the “anchor” sentence in the preceding context from which the question arose. Figure 1(a) shows the questions asked when each of sentences 2-6 is considered as the answer, along with their corresponding anchor sentences. As with other discourse parsers, ours is inevitably bound by its training data. However, DCQA’s crowdsourcable paradigm makes future training much easier to scale up and generalize.
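Because each question links exactly one answer sentence to exactly one earlier anchor sentence, a document’s DCQA annotations can be read directly as a dependency structure over sentences. A minimal sketch of this reading (the tuples below are invented placeholders, not actual DCQA annotations or its released schema):

```python
from collections import defaultdict

# Hypothetical annotations for a 5-sentence article:
# (anchor index, answer index, elicited question) -- invented examples.
annotations = [
    (0, 1, "Why did this happen?"),
    (0, 2, "What was the response?"),
    (2, 3, "Who responded?"),
    (2, 4, "What happens next?"),
]

# Each answer sentence has exactly one anchor among the preceding
# sentences, so the anchor -> answer edges form a tree rooted at
# the first sentence.
children = defaultdict(list)
for anchor, answer, question in annotations:
    children[anchor].append((answer, question))

print(children[0])  # the QUDs hanging off sentence 0
```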
3 Questions vs. coherence relations
We first illustrate how questions capture inter-sentential relationships, compared with those in coherence structures. We utilize the relation taxonomy in RST for convenience, as in Section 5.3 we also compare the structure of our QUD dependency trees with that of RST.
For each anchor-answer sentence pair across 7 DCQA documents, we asked two graduate students in Linguistics to select the most appropriate discourse relation between the two sentences, drawn from the RST relation taxonomy (Carlson and Marcu, 2001). Both students were first trained on the taxonomy using the RST annotation manual.