Causal Analysis of Syntactic Agreement Neurons
in Multilingual Language Models
Aaron Mueller¹, Yu Xia², Tal Linzen²
¹Johns Hopkins University   ²New York University
amueller@jhu.edu, yx1675@nyu.edu, linzen@nyu.edu
Abstract
Structural probing work has found evidence for latent syntactic information in pre-trained language models. However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks. In this study, we causally probe multilingual language models (XGLM and multilingual BERT) as well as monolingual BERT-based models across various languages; we do this by performing counterfactual perturbations on neuron activations and observing the effect on models' subject-verb agreement probabilities. We observe where in the model and to what extent syntactic agreement is encoded in each language. We find significant neuron overlap across languages in autoregressive multilingual language models, but not masked language models. We also find two distinct layer-wise effect patterns and two distinct sets of neurons used for syntactic agreement, depending on whether the subject and verb are separated by other tokens. Finally, we find that behavioral analyses of language models are likely underestimating how sensitive masked language models are to syntactic information.
1 Introduction
Syntactic information is necessary for robust generalization in natural language processing tasks (for a case study using the natural language inference task, see McCoy et al. 2019). The success of pre-trained language models (LMs) such as RoBERTa (Liu et al., 2019) and GPT-3 (Brown et al., 2020) in many NLP tasks has prompted hypotheses that they accomplish their performance through structural representations induced during pre-training, rather than only lexical or positional representations (Manning et al., 2020); behavioral evidence for LMs' syntactic abilities has been found in masked LMs (MLMs; Warstadt et al., 2020; Warstadt and Bowman, 2020; Goldberg, 2019) and autoregressive LMs (ALMs; Hu et al., 2020). Evidence for structural representations has also been reported for multilingual pre-trained LMs (Goldberg, 2019; Mueller et al., 2020) and for sequence-to-sequence models (Mueller et al., 2022).
Despite efforts to understand the structural information encoded by pre-trained LMs (Hewitt and Manning, 2019; Chi et al., 2020; Elazar et al., 2021; Ravfogel et al., 2021; Finlayson et al., 2021; inter alia), it remains unclear how and where multilingual models encode this information. Most multilingual probing studies are correlational and use dependency parsing or labeling as a proxy task indicative of syntactic information (Chi et al., 2020; Stanczak et al., 2022). This is problematic: Models do not need structural or word order information to achieve high performance on dependency labeling (Sinha et al., 2021), and training a parametric probing classifier introduces many confounds (Hewitt and Liang, 2019; Antverg and Belinkov, 2022).
Causal probing, however, enables non-parametric analyses of models through counterfactual interventions on inputs or model representations. Causal probing studies have argued for the existence of specific syntactic agreement neurons and units in neural language models (Finlayson et al., 2021; Lakretz et al., 2019; De Cao et al., 2021), but these studies have focused on monolingual models, usually (though not always) in English. Causal methods allow us to make stronger arguments about where and how syntactic agreement is performed in pre-trained LMs, and we can apply them to answer questions about the language specificity and construction specificity of syntactic agreement neurons.
In this study, we extend causal mediation analysis (Pearl, 2001; Robins, 2003; Vig et al., 2020) to multilingual language models, including an autoregressive LM and a masked LM. We also analyze a series of monolingual MLMs across languages. We employ the syntactic interventions approach of Finlayson et al. (2021) on stimuli in languages typologically related to English, such that we can observe whether there exist syntax neurons that are shared across a set of languages that are all relatively high-resource and grammatically similar.
Our contributions include the following:

1. We causally probe for syntactic agreement neurons in an autoregressive language model, XGLM (Lin et al., 2021); a masked language model, multilingual BERT (Devlin et al., 2019); and a series of monolingual BERT-based models. We find two distinct layer-wise effect patterns, depending on whether the subject and verb are separated by other tokens.

2. We quantify the degree of neuron overlap across languages and syntactic structures, finding that many neurons are shared across structures and fewer are shared across languages.

3. We analyze the sparsity of syntactic agreement representations for individual structures and languages, and find that syntax neurons are more sparse in MLMs than ALMs, but also that the degree of sparsity is similar across models and structures.

Our data and code are publicly available at https://github.com/aaronmueller/multilingual-lm-intervention.
2 Related Work
Multilingual language modeling. Multilingual language models enable increased parameter efficiency per language, as well as cross-lingual transfer to lower-resource language varieties (Wu and Dredze, 2019). This makes both training and deployment more efficient when support for many languages is required. A common approach for training multilingual LMs is to concatenate training corpora for many languages into one corpus, often without language IDs (Conneau et al., 2020; Devlin et al., 2019).
These models present interesting opportunities for syntactic analysis: Do multilingual models maintain similar syntactic abilities despite a decreased number of parameters that can be dedicated to each language? Current evidence suggests slight interference effects, but also that identical models maintain much of their monolingual performance when trained on multilingual corpora (Mueller et al., 2020). Is syntactic agreement, in particular, encoded independently per language or shared across languages? Some studies suggest that syntax is encoded in similar ways across languages (Chi et al., 2020; Stanczak et al., 2022), though these rely on correlational methods based on dependency parsing, which introduce confounds and may not rely on syntactic information per se.
Syntactic probing. Various behavioral probing studies have analyzed the syntactic behavior of monolingual and multilingual LMs (Linzen et al., 2016; Marvin and Linzen, 2018; Ravfogel et al., 2019; Mueller et al., 2020; Hu et al., 2020). Results from behavioral analyses are generally easier to interpret and present clearer evidence for what models' preferences are given various contexts. However, these methods do not tell us where or how syntax is encoded.
A parallel line of work employs parametric probes. Here, a linear classifier or multi-layer perceptron probe is trained to map from a model's hidden representations to dependency attachments and/or labels (Hewitt and Manning, 2019) in order to locate syntax-sensitive regions of a model. This approach has been applied to multilingual models (Chi et al., 2020), and produced evidence for parallel dependency encodings across languages. However, if such probes are powerful, they may learn the target task themselves rather than tap into an ability of the underlying model (Hewitt and Liang, 2019), leading to uninterpretable results. Even when controlling for this, highly selective probes may not need access to syntactic information to achieve high structural probing performance (Sinha et al., 2021). There are further confounds when analyzing individual neurons using correlational methods; for example, probes may locate encoded information that is not actually used by the model (Antverg and Belinkov, 2022).
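For concreteness, the sketch below illustrates the kind of parametric probe this line of work trains: a linear classifier on frozen hidden states that predicts a dependency label for a token. The checkpoint, layer index, label inventory, and single-wordpiece indexing are illustrative assumptions, not the exact configurations of the studies cited above.

```python
# Illustrative sketch of a parametric (linear) probe trained on frozen hidden
# states to predict dependency labels. Checkpoint, layer, and label set are
# assumptions for illustration only.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name).eval()  # frozen encoder

NUM_DEP_LABELS = 37  # e.g., roughly the Universal Dependencies relation inventory
probe = nn.Linear(encoder.config.hidden_size, NUM_DEP_LABELS)  # the only trained parameters
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def probe_step(sentence: str, token_index: int, gold_label: int, layer: int = 8) -> float:
    """One training step: encode the sentence, take one wordpiece's hidden state
    from a fixed layer, and train the linear probe to predict its dependency label."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():  # the LM itself is never updated
        hidden = encoder(**inputs, output_hidden_states=True).hidden_states[layer]
    logits = probe(hidden[0, token_index])
    loss = loss_fn(logits.unsqueeze(0), torch.tensor([gold_label]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The confound discussed above is visible here: only the probe's accuracy is measured, so a sufficiently expressive probe can succeed by learning the labeling task itself rather than revealing what the frozen encoder uses.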
Causal probing has recently become more common for interpreting various phenomena in neural models of language. Lakretz et al. (2019) and Lakretz et al. (2021) search for syntax-sensitive units in English and Italian monolingual LSTMs by intervening directly on activations and evaluating syntactic agreement performance. Vig et al. (2020) propose causal mediation analysis for locating neurons and attention heads implicated in gender bias in pre-trained language models; this method involves intervening directly on the inputs or on individual neurons. Finlayson et al. (2021) extend this approach to implicate neurons in syntactic agreement. This study extends their data and method to multilingual stimuli and models.
Other causal probing work uses interventions on model representations, rather than inputs. This includes amnesic probing (Elazar et al., 2021), where part-of-speech and dependency information is deleted from a model using iterative nullspace projection (INLP; Ravfogel et al., 2020). Ravfogel et al. (2021) employ INLP to understand how relative clause boundaries are encoded in BERT.
3 Methods
3.1 Causal Metrics
We first define terms to represent the quantities we measure before and after the intervention. We are interested in the impact of an intervention $x$ on a model's preference $y_x$ for grammatical inflections over ungrammatical ones. We start with the original input, on which we apply the null intervention: This represents performing no change to the original input. Given prompt $u$ and verb $v$, we first calculate the following ratio:

$$y_{\text{null}}(u, v) = \frac{p(v_{\text{pl}} \mid u_{\text{sg}})}{p(v_{\text{sg}} \mid u_{\text{sg}})} \tag{1}$$

Here, $u_{\text{sg}}$ represents a prompt that would require a singular verb inflection $v_{\text{sg}}$ at the [MASK] for the sentence to be grammatical; for example, "The doctor near the cars [MASK] it". $v_{\text{sg}}$ is the third-person singular present inflection of verb $v$, and $v_{\text{pl}}$ is the plural present inflection; for example, $v_{\text{sg}} =$ "observes" and $v_{\text{pl}} =$ "observe". Note that this ratio has the incorrect inflection as the numerator; this entails that if the model computes agreement correctly, we will have $y < 1$.
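As a concrete illustration, the ratio in Equation 1 can be read off a masked LM's output distribution at the [MASK] position. The sketch below is a minimal example under stated assumptions: the bert-base-multilingual-cased checkpoint is an illustrative choice, and both inflections are assumed to be single tokens in the vocabulary; it is not the paper's exact evaluation code.

```python
# Minimal sketch: compute y_null(u, v) = p(v_pl | u_sg) / p(v_sg | u_sg) for a masked LM.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def y_ratio(prompt: str, v_sg: str, v_pl: str) -> float:
    """Return p(v_pl | prompt) / p(v_sg | prompt) at the [MASK] position.
    Assumes v_sg and v_pl are each a single token in the model's vocabulary."""
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    probs = torch.softmax(logits, dim=-1)
    id_sg = tokenizer.convert_tokens_to_ids(v_sg)
    id_pl = tokenizer.convert_tokens_to_ids(v_pl)
    return (probs[id_pl] / probs[id_sg]).item()

# Singular subject: a model that computes agreement correctly should give y_null < 1.
u_sg = "The doctor near the cars [MASK] it."
y_null = y_ratio(u_sg, v_sg="observes", v_pl="observe")
print(f"y_null = {y_null:.3f}")
```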
We now define the swap-number intervention, where the grammatical number of $u$ is flipped (resulting in "The doctors near the cars [MASK] it" for the previous example). This results in the following expression for $y$:

$$y_{\text{swap-number}}(u, v) = \frac{p(v_{\text{pl}} \mid u_{\text{pl}})}{p(v_{\text{sg}} \mid u_{\text{pl}})} \tag{2}$$

Now, the numerator is the correct inflection, so we expect $y > 1$.
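Continuing the sketch above (reusing the same hypothetical y_ratio helper), the swap-number intervention on the input simply flips the subject's number in the prompt and recomputes the same ratio:

```python
# Counterfactual prompt: the subject's number is flipped, everything else is unchanged.
u_pl = "The doctors near the cars [MASK] it."
y_swap = y_ratio(u_pl, v_sg="observes", v_pl="observe")
print(f"y_swap-number = {y_swap:.3f}")  # expected to be > 1 if agreement is computed correctly
```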
As we are interested in the contribution of individual model components to the model's overall preference for correct inflections, we focus on indirect effects, where we perform interventions on individual model components and observe the change in $y$. In particular, we measure the natural indirect effect (NIE), as follows.

Figure 1: Example of computing the natural indirect effect (NIE). We change a neuron's activation to what it would have been if we had intervened on the prompt, then measure the relative change in $y$.
We intervene on an individual neuron $z$. We change $z$'s original activation given $u$ and $v$ (denoted $z_{\text{null}}(u, v)$) to the activation it would have taken if we had performed the intervention on $u$ (denoted $z_{\text{swap-number}}(u, v)$). The rest of the neurons retain their original activations. "Natural" here refers to the fact that our intervention changes the activation $z$ to the value it would have in another natural setting $u'$, rather than setting it to some predefined constant (such as 0) that it may or may not obtain given natural inputs. We measure the relative change in $y$ after applying the intervention (see Figure 1 for a visual example):

$$\text{NIE}(\text{swap-number}, \text{null}; y, z) = \mathbb{E}_{u,v}\left[\frac{y_{\text{null}, z_{\text{swap-number}}(u,v)}(u, v) - y_{\text{null}}(u, v)}{y_{\text{null}}(u, v)}\right] = \mathbb{E}_{u,v}\left[\frac{y_{\text{null}, z_{\text{swap-number}}(u,v)}(u, v)}{y_{\text{null}}(u, v)} - 1\right] \tag{3}$$
If a neuron encodes useful information for syntactic agreement, we expect $y$ to increase after the intervention, making the numerator positive. Positive NIEs indicate that a neuron encodes preferences for correct verb inflections, and negative NIEs indicate that the neuron prefers incorrect inflections. The closer the NIE is to 0, the less of a contribution a neuron makes to syntactic agreement in either direction.
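A minimal sketch of this neuron-level intervention is given below, in the spirit of the causal mediation setup described above rather than as the paper's actual implementation. It assumes a HuggingFace BERT-style masked LM, identifies a "neuron" with one coordinate of a layer's output hidden state at the [MASK] position, and uses a forward hook to (i) cache that coordinate's activation on the counterfactual prompt and (ii) patch it into the run on the original prompt. The checkpoint, layer, and neuron indices are placeholders.

```python
# Sketch of the natural indirect effect (NIE) for one neuron, via forward hooks.
# A "neuron" here is one coordinate of a transformer layer's output at [MASK];
# checkpoint, layer, and neuron index are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def mask_pos(inputs):
    return (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

def y_ratio_with_patch(prompt, v_sg, v_pl, layer, neuron, patch_value=None):
    """Compute p(v_pl)/p(v_sg) at [MASK]; optionally overwrite one neuron's activation."""
    inputs = tokenizer(prompt, return_tensors="pt")
    pos = mask_pos(inputs)
    cache = {}

    def hook(module, inp, out):
        hidden = out[0] if isinstance(out, tuple) else out
        cache["act"] = hidden[0, pos, neuron].item()       # record this run's activation
        if patch_value is not None:                        # patch in the counterfactual value
            hidden = hidden.clone()
            hidden[0, pos, neuron] = patch_value
            return (hidden,) + out[1:] if isinstance(out, tuple) else hidden

    handle = model.bert.encoder.layer[layer].register_forward_hook(hook)
    with torch.no_grad():
        logits = model(**inputs).logits[0, pos]
    handle.remove()
    probs = torch.softmax(logits, dim=-1)
    y = (probs[tokenizer.convert_tokens_to_ids(v_pl)] /
         probs[tokenizer.convert_tokens_to_ids(v_sg)]).item()
    return y, cache["act"]

u_sg, u_pl = "The doctor near the cars [MASK] it.", "The doctors near the cars [MASK] it."
layer, neuron = 8, 1234                                    # placeholder indices

# 1) Run the counterfactual prompt to obtain z_swap-number(u, v) for this neuron.
_, z_swap = y_ratio_with_patch(u_pl, "observes", "observe", layer, neuron)
# 2) Null run on the original prompt.
y_null, _ = y_ratio_with_patch(u_sg, "observes", "observe", layer, neuron)
# 3) Null run again, but with the neuron patched to its counterfactual activation.
y_patched, _ = y_ratio_with_patch(u_sg, "observes", "observe", layer, neuron, patch_value=z_swap)

nie = y_patched / y_null - 1.0                             # Equation 3, for a single (u, v) pair
print(f"NIE for neuron {neuron} in layer {layer}: {nie:.4f}")
```

In the setup described in the paper, the NIE is an expectation over many $(u, v)$ pairs and is computed for every neuron and layer; the single-pair sketch above is only meant to make the patching step concrete.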
3.2 Models
Finlayson et al. (2021) analyzed a series of monolingual autoregressive language models (ALMs): GPT-2 (Radford et al., 2019), TransformerXL (Dai