parser (Flanigan et al., 2014), a graph-based
parser (Cai and Lam, 2020b), a transition-based
parser (Zhou et al., 2021b), a Seq2Seq-based
parser (Bevilacqua et al., 2021), and an AMR-specific
pre-training parser (Bai et al., 2022a). The
test domains cover news, biomedical, novel, and
wiki questions. We conduct experiments under
the zero-shot setting, where a model is trained
on the source domain and evaluated on the target
domain without using any target-domain labeled
data. Our results show that 1) all models yield
substantially lower performance (by up to 45.5%) on
out-of-domain test sets, with the most dramatic
drops on named entities and wiki links; 2) the graph
pretraining-based parser is stronger in domain
transfer than the other parsers; 3) the transition-based
parser is more robust than the seq2seq-based
parser. We further analyze the impact of a
set of linguistic features, and the results suggest
that the performance degradation is positively
correlated with the distribution shifts of words and
AMR concepts. Compared with the distribution
divergences of the input features, those of the
output features pose a greater challenge to
cross-domain AMR parsing.
Based on our analysis, we investigate two
approaches to bridge the domain gap and improve
cross-domain AMR parsing. We first continually
pre-train a BART model on target-domain raw
text to reduce the distribution gap of words.
To further bridge the domain gap of output
features, we adopt a pre-trained AMR parser
to construct silver AMR graphs on the target
domain, which potentially reduces the divergence
of output features. Experimental results show
that the proposed methods consistently improve
parsing performance on out-of-domain test
sets. To our knowledge, this is the first systematic
study on cross-domain AMR parsing. Our code
and results will be available at
https://github.com/goodbai-nlp/AMR-DomainAdaptation.
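To make the first step concrete, the sketch below illustrates one possible way to continually pre-train BART on raw target-domain text with a denoising (reconstruction) objective. It is only a minimal illustration: the corpus variable `target_domain_sentences`, the crude word-level masking, and the hyperparameters are assumptions for exposition, not the exact pre-training recipe used in this work.

```python
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

# Hypothetical continual pre-training sketch: reconstruct the original
# sentence from a noised version, approximating BART's denoising objective.
tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
optim = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Placeholder corpus: raw, unlabeled sentences from the target domain.
target_domain_sentences = [
    "Raw sentences drawn from the target domain.",
    "They require no AMR annotation at all.",
]

def add_noise(text, mask_prob=0.15):
    # Crude word-level masking as a stand-in for BART's span infilling.
    words = text.split()
    return " ".join(tok.mask_token if torch.rand(1).item() < mask_prob else w
                    for w in words)

model.train()
for sent in target_domain_sentences:
    enc = tok(add_noise(sent), return_tensors="pt", truncation=True)
    labels = tok(sent, return_tensors="pt", truncation=True).input_ids
    loss = model(**enc, labels=labels).loss
    loss.backward()
    optim.step()
    optim.zero_grad()
```

Silver AMR graphs for the second step can analogously be obtained by running an off-the-shelf pre-trained parser over the same raw target-domain text; the resulting (noisy) graphs then serve as additional training data.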
2 Related Work
2.1 AMR Parsing
At a coarse-grained level, current AMR parsing
systems can be categorized into two main classes.
The first is the two-stage parsing approach, which first
identifies concepts and then predicts relations
based on the concept decisions. The two tasks are
modeled either in a pipeline (Flanigan et al.,
2014, 2016) or jointly (Lyu and Titov, 2018;
Zhang et al., 2019a). The other is one-stage
parsing, which generates a parse graph
incrementally. The one-stage parsing methods
can be further divided into three categories:
graph-based parsing, transition-based parsing, and
seq2seq-based parsing. Transition-based parsing
induces an AMR graph by predicting a sequence
of transition actions. Transition-based AMR
parsers either maintain a stack and a buffer (Wang
et al., 2015; Damonte et al., 2017; Ballesteros and
Al-Onaizan, 2017; Vilares and Gómez-Rodríguez,
2018; Liu et al., 2018; Naseem et al., 2019;
Fernandez Astudillo et al., 2020; Lee et al., 2020)
or make use of a pointer (Zhou et al., 2021a,b).
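As a purely illustrative aside, the following toy stack-and-buffer transition system shows the general mechanism such parsers rely on; the action sets, oracles, and graph-building rules of real AMR transition parsers are considerably richer, and every name below is hypothetical.

```python
from dataclasses import dataclass, field

# A toy stack-and-buffer transition system: a parse is a sequence of
# actions, each of which mutates the state and may add a graph edge.
@dataclass
class State:
    stack: list = field(default_factory=list)
    buffer: list = field(default_factory=list)
    edges: list = field(default_factory=list)   # (head, label, dependent)

def apply(state, action):
    kind = action[0]
    if kind == "SHIFT":                 # move the next buffer item onto the stack
        state.stack.append(state.buffer.pop(0))
    elif kind == "REDUCE":              # discard the stack top
        state.stack.pop()
    elif kind == "LEFT-ARC":            # edge from stack[-1] to stack[-2]
        state.edges.append((state.stack[-1], action[1], state.stack[-2]))
    elif kind == "RIGHT-ARC":           # edge from stack[-2] to stack[-1]
        state.edges.append((state.stack[-2], action[1], state.stack[-1]))
    return state

# "The boy wants to go." with pre-identified concepts in the buffer.
state = State(buffer=["boy", "want-01", "go-02"])
for a in [("SHIFT",), ("SHIFT",), ("LEFT-ARC", ":ARG0"),
          ("SHIFT",), ("RIGHT-ARC", ":ARG1")]:
    state = apply(state, a)
print(state.edges)
# [('want-01', ':ARG0', 'boy'), ('want-01', ':ARG1', 'go-02')]
```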
Graph-based parsing builds a semantic graph
incrementally. At each time step, a new node
and its connections to existing nodes are
jointly decided. The graph is induced either in a
top-down manner (Cai and Lam, 2019) or in a specific
traversal order (Zhang et al., 2019b; Cai and Lam,
2020a). Seq2seq-based parsing treats AMR parsing
as a sequence-to-sequence problem by linearizing
AMR graphs so that existing seq2seq models can be
readily utilized. Various seq2seq architectures have
been employed for AMR parsing, such as vanilla
seq2seq (Barzdins and Gosko, 2016; Konstas et al.,
2017), supervised attention (Peng et al., 2017),
character-based (van Noord and Bos, 2017), and
pre-trained Transformer (Bevilacqua et al., 2021;
Bai et al., 2022a).
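To illustrate the linearization these parsers rely on, the sketch below uses the open-source penman library (not any particular parser's preprocessing code) to turn a toy AMR graph into a single-line string that a seq2seq model could be trained to emit; actual systems typically add further steps such as variable removal or recategorization.

```python
import penman

# A toy AMR graph for "The boy wants to go."
amr_str = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b))
"""
graph = penman.decode(amr_str)

# A simple linearization: re-encode the graph as a single-line PENMAN
# string, which can serve directly as a seq2seq target sequence.
linearized = penman.encode(graph, indent=None)
print(linearized)
# (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))
```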
Despite great success, most previous work
on AMR parsing focuses on the in-domain
setting, where the training and test data share
the same domain. In contrast, we systematically
evaluate the model performance on 4 out-of-domain
datasets. To our knowledge, we are
the first to systematically study cross-domain
generalization for AMR parsing.
2.2 Related Tasks
We summarize recent research studying other
semantic formalisms as well as the cross-domain
generalization of named entity recognition (NER),
semantic role labeling (SRL) and constituency
parsing.
Semantic parsing on other formalisms. AMR is
strongly correlated with other semantic formalisms
such as semantic dependency parsing (SDP, Oepen
et al., 2016) and universal conceptual cognitive
annotation (UCCA, Abend and Rappoport,
2013; Hershcovich et al., 2017), and recent