Causal Analysis of Syntactic Agreement Neurons
in Multilingual Language Models
Aaron Mueller¹, Yu Xia², Tal Linzen²
¹Johns Hopkins University   ²New York University
amueller@jhu.edu, yx1675@nyu.edu, linzen@nyu.edu
Abstract
Structural probing work has found evidence for latent syntactic information in pre-trained language models. However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks. In this study, we causally probe multilingual language models (XGLM and multilingual BERT) as well as monolingual BERT-based models across various languages; we do this by performing counterfactual perturbations on neuron activations and observing the effect on models' subject-verb agreement probabilities. We observe where in the model and to what extent syntactic agreement is encoded in each language. We find significant neuron overlap across languages in autoregressive multilingual language models, but not masked language models. We also find two distinct layer-wise effect patterns and two distinct sets of neurons used for syntactic agreement, depending on whether the subject and verb are separated by other tokens. Finally, we find that behavioral analyses of language models are likely underestimating how sensitive masked language models are to syntactic information.
1 Introduction
Syntactic information is necessary for robust generalization in natural language processing tasks (for a case study using the natural language inference task, see McCoy et al. 2019). The success of pre-trained language models (LMs) such as RoBERTa (Liu et al., 2019) and GPT-3 (Brown et al., 2020) in many NLP tasks has prompted hypotheses that they accomplish their performance through structural representations induced during pre-training, rather than only lexical or positional representations (Manning et al., 2020); behavioral evidence for LMs' syntactic abilities has been found in masked LMs (MLMs; Warstadt et al., 2020; Warstadt and Bowman, 2020; Goldberg, 2019) and autoregressive LMs (ALMs; Hu et al., 2020). Evidence for structural representations has also been reported for multilingual pre-trained LMs (Goldberg, 2019; Mueller et al., 2020) and for sequence-to-sequence models (Mueller et al., 2022).
Despite efforts to understand the structural information encoded by pre-trained LMs (Hewitt and Manning, 2019; Chi et al., 2020; Elazar et al., 2021; Ravfogel et al., 2021; Finlayson et al., 2021; inter alia), it remains unclear how and where multilingual models encode this information. Most multilingual probing studies are correlational and use dependency parsing or labeling as a proxy task indicative of syntactic information (Chi et al., 2020; Stanczak et al., 2022). This is problematic: Models do not need structural or word order information to achieve high performance on dependency labeling (Sinha et al., 2021), and training a parametric probing classifier introduces many confounds (Hewitt and Liang, 2019; Antverg and Belinkov, 2022).
Causal probing, however, enables non-parametric analyses of models through counterfactual interventions on inputs or model representations. Causal probing studies have argued for the existence of specific syntactic agreement neurons and units in neural language models (Finlayson et al., 2021; Lakretz et al., 2019; De Cao et al., 2021), but these studies have focused on monolingual models, usually (though not always) in English. Causal methods allow us to make stronger arguments about where and how syntactic agreement is performed in pre-trained LMs, and we can apply them to answer questions about the language specificity and construction specificity of syntactic agreement neurons.
In this study, we extend causal mediation analysis (Pearl, 2001; Robins, 2003; Vig et al., 2020) to multilingual language models, including an autoregressive LM and a masked LM. We also analyze a series of monolingual MLMs across languages. We employ the syntactic interventions approach of Finlayson et al. (2021) on stimuli in languages typologically related to English, such that we can observe whether there exist syntax neurons that are shared across a set of languages that are all relatively high-resource and grammatically similar.
Our contributions include the following:

1. We causally probe for syntactic agreement neurons in an autoregressive language model, XGLM (Lin et al., 2021); a masked language model, multilingual BERT (Devlin et al., 2019); and a series of monolingual BERT-based models. We find two distinct layer-wise effect patterns, depending on whether the subject and verb are separated by other tokens.

2. We quantify the degree of neuron overlap across languages and syntactic structures, finding that many neurons are shared across structures and fewer are shared across languages.

3. We analyze the sparsity of syntactic agreement representations for individual structures and languages, and find that syntax neurons are more sparse in MLMs than ALMs, but also that the degree of sparsity is similar across models and structures.

Our data and code are publicly available at https://github.com/aaronmueller/multilingual-lm-intervention.
2 Related Work
Multilingual language modeling. Multilingual language models enable increased parameter efficiency per language, as well as cross-lingual transfer to lower-resource language varieties (Wu and Dredze, 2019). This makes both training and deployment more efficient when support for many languages is required. A common approach for training multilingual LMs is to concatenate training corpora for many languages into one corpus, often without language IDs (Conneau et al., 2020; Devlin et al., 2019).
These models present interesting opportunities for syntactic analysis: Do multilingual models maintain similar syntactic abilities despite a decreased number of parameters that can be dedicated to each language? Current evidence suggests slight interference effects, but also that identical models maintain much of their monolingual performance when trained on multilingual corpora (Mueller et al., 2020). Is syntactic agreement, in particular, encoded independently per language or shared across languages? Some studies suggest that syntax is encoded in similar ways across languages (Chi et al., 2020; Stanczak et al., 2022), though these rely on correlational methods based on dependency parsing, which introduce confounds and may not rely on syntactic information per se.
Syntactic probing. Various behavioral probing studies have analyzed the syntactic behavior of monolingual and multilingual LMs (Linzen et al., 2016; Marvin and Linzen, 2018; Ravfogel et al., 2019; Mueller et al., 2020; Hu et al., 2020). Results from behavioral analyses are generally easier to interpret and present clearer evidence for what models' preferences are given various contexts. However, these methods do not tell us where or how syntax is encoded.
A parallel line of work employs parametric probes. Here, a linear classifier or multi-layer perceptron probe is trained to map from a model's hidden representations to dependency attachments and/or labels (Hewitt and Manning, 2019) in order to locate syntax-sensitive regions of a model. This approach has been applied to multilingual models (Chi et al., 2020), and produced evidence for parallel dependency encodings across languages. However, if such probes are powerful, they may learn the target task themselves rather than tap into an ability of the underlying model (Hewitt and Liang, 2019), leading to uninterpretable results. Even when controlling for this, highly selective probes may not need access to syntactic information to achieve high structural probing performance (Sinha et al., 2021). There are further confounds when analyzing individual neurons using correlational methods; for example, probes may locate encoded information that is not actually used by the model (Antverg and Belinkov, 2022).
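For concreteness, the sketch below illustrates the kind of parametric probe this line of work trains: a linear classifier on frozen hidden states that predicts a dependency label for a token. The checkpoint, layer index, label inventory, and single-wordpiece indexing are illustrative assumptions, not the exact configurations of the studies cited above.

```python
# Illustrative sketch of a parametric (linear) probe trained on frozen hidden
# states to predict dependency labels. Checkpoint, layer, and label set are
# assumptions for illustration only.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name).eval()  # frozen encoder

NUM_DEP_LABELS = 37  # e.g., roughly the Universal Dependencies relation inventory
probe = nn.Linear(encoder.config.hidden_size, NUM_DEP_LABELS)  # the only trained parameters
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def probe_step(sentence: str, token_index: int, gold_label: int, layer: int = 8) -> float:
    """One training step: encode the sentence, take one wordpiece's hidden state
    from a fixed layer, and train the linear probe to predict its dependency label."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():  # the LM itself is never updated
        hidden = encoder(**inputs, output_hidden_states=True).hidden_states[layer]
    logits = probe(hidden[0, token_index])
    loss = loss_fn(logits.unsqueeze(0), torch.tensor([gold_label]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The confound discussed above is visible here: only the probe's accuracy is measured, so a sufficiently expressive probe can succeed by learning the labeling task itself rather than revealing what the frozen encoder uses.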
Causal probing has recently become more common for interpreting various phenomena in neural models of language. Lakretz et al. (2019) and Lakretz et al. (2021) search for syntax-sensitive units in English and Italian monolingual LSTMs by intervening directly on activations and evaluating syntactic agreement performance. Vig et al. (2020) propose causal mediation analysis for locating neurons and attention heads implicated in gender bias in pre-trained language models; this method involves intervening directly on the inputs or on individual neurons. Finlayson et al. (2021) extend this approach to implicate neurons in syntactic agreement. This study extends their data and method to multilingual stimuli and models.
Other causal probing work uses interventions on model representations, rather than inputs. This includes amnesic probing (Elazar et al., 2021), where part-of-speech and dependency information is deleted from a model using iterative nullspace projection (INLP; Ravfogel et al., 2020). Ravfogel et al. (2021) employ INLP to understand how relative clause boundaries are encoded in BERT.
3 Methods
3.1 Causal Metrics
We first define terms to represent the quantities we measure before and after the intervention. We are interested in the impact of an intervention $x$ on a model's preference $y_x$ for grammatical inflections over ungrammatical ones. We start with the original input, on which we apply the null intervention: This represents performing no change to the original input. Given prompt $u$ and verb $v$, we first calculate the following ratio:

$$y_{\text{null}}(u, v) = \frac{p(v_{\text{pl}} \mid u_{\text{sg}})}{p(v_{\text{sg}} \mid u_{\text{sg}})} \tag{1}$$

Here, $u_{\text{sg}}$ represents a prompt that would require a singular verb inflection $v_{\text{sg}}$ at the [MASK] for the sentence to be grammatical; for example, "The doctor near the cars [MASK] it". $v_{\text{sg}}$ is the third-person singular present inflection of verb $v$, and $v_{\text{pl}}$ is the plural present inflection; for example, $v_{\text{sg}} =$ "observes" and $v_{\text{pl}} =$ "observe". Note that this ratio has the incorrect inflection as the numerator; this entails that if the model computes agreement correctly, we will have $y < 1$.
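As a concrete illustration, the ratio in Equation 1 can be read off a masked LM's output distribution at the [MASK] position. The sketch below is a minimal example under stated assumptions: the bert-base-multilingual-cased checkpoint is an illustrative choice, and both inflections are assumed to be single tokens in the vocabulary; it is not the paper's exact evaluation code.

```python
# Minimal sketch: compute y_null(u, v) = p(v_pl | u_sg) / p(v_sg | u_sg) for a masked LM.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def y_ratio(prompt: str, v_sg: str, v_pl: str) -> float:
    """Return p(v_pl | prompt) / p(v_sg | prompt) at the [MASK] position.
    Assumes v_sg and v_pl are each a single token in the model's vocabulary."""
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    probs = torch.softmax(logits, dim=-1)
    id_sg = tokenizer.convert_tokens_to_ids(v_sg)
    id_pl = tokenizer.convert_tokens_to_ids(v_pl)
    return (probs[id_pl] / probs[id_sg]).item()

# Singular subject: a model that computes agreement correctly should give y_null < 1.
u_sg = "The doctor near the cars [MASK] it."
y_null = y_ratio(u_sg, v_sg="observes", v_pl="observe")
print(f"y_null = {y_null:.3f}")
```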
We now define the swap-number intervention, where the grammatical number of $u$ is flipped (resulting in "The doctors near the cars [MASK] it" for the previous example). This results in the following expression for $y$:

$$y_{\text{swap-number}}(u, v) = \frac{p(v_{\text{pl}} \mid u_{\text{pl}})}{p(v_{\text{sg}} \mid u_{\text{pl}})} \tag{2}$$

Now, the numerator is the correct inflection, so we expect $y > 1$.
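Continuing the sketch above (reusing the same hypothetical y_ratio helper), the swap-number intervention on the input simply flips the subject's number in the prompt and recomputes the same ratio:

```python
# Counterfactual prompt: the subject's number is flipped, everything else is unchanged.
u_pl = "The doctors near the cars [MASK] it."
y_swap = y_ratio(u_pl, v_sg="observes", v_pl="observe")
print(f"y_swap-number = {y_swap:.3f}")  # expected to be > 1 if agreement is computed correctly
```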
As we are interested in the contribution of individual model components to the model's overall preference for correct inflections, we focus on indirect effects, where we perform interventions on individual model components and observe the change in $y$. In particular, we measure the natural indirect effect (NIE), as follows.

Figure 1: Example of computing the natural indirect effect (NIE). We change a neuron's activation to what it would have been if we had intervened on the prompt, then measure the relative change in $y$.
We intervene on an individual neuron $z$. We change $z$'s original activation given $u$ and $v$ (denoted $z_{\text{null}}(u, v)$) to the activation it would have taken if we had performed the intervention on $u$ (denoted $z_{\text{swap-number}}(u, v)$). The rest of the neurons retain their original activations. "Natural" here refers to the fact that our intervention changes the activation $z$ to the value it would have in another natural setting $u'$, rather than setting it to some predefined constant (such as 0) that it may or may not obtain given natural inputs. We measure the relative change in $y$ after applying the intervention (see Figure 1 for a visual example):

$$\text{NIE}(\text{swap-number}, \text{null}; y, z) = \mathbb{E}_{u,v}\left[\frac{y_{\text{null}, z_{\text{swap-number}}(u,v)}(u, v) - y_{\text{null}}(u, v)}{y_{\text{null}}(u, v)}\right] = \mathbb{E}_{u,v}\left[\frac{y_{\text{null}, z_{\text{swap-number}}(u,v)}(u, v)}{y_{\text{null}}(u, v)} - 1\right] \tag{3}$$
If a neuron encodes useful information for syntactic agreement, we expect $y$ to increase after the intervention, making the numerator positive. Positive NIEs indicate that a neuron encodes preferences for correct verb inflections, and negative NIEs indicate that the neuron prefers incorrect inflections. The closer the NIE is to 0, the less of a contribution a neuron makes to syntactic agreement in either direction.
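A minimal sketch of this neuron-level intervention is given below, in the spirit of the causal mediation setup described above rather than as the paper's actual implementation. It assumes a HuggingFace BERT-style masked LM, identifies a "neuron" with one coordinate of a layer's output hidden state at the [MASK] position, and uses a forward hook to (i) cache that coordinate's activation on the counterfactual prompt and (ii) patch it into the run on the original prompt. The checkpoint, layer, and neuron indices are placeholders.

```python
# Sketch of the natural indirect effect (NIE) for one neuron, via forward hooks.
# A "neuron" here is one coordinate of a transformer layer's output at [MASK];
# checkpoint, layer, and neuron index are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def mask_pos(inputs):
    return (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()

def y_ratio_with_patch(prompt, v_sg, v_pl, layer, neuron, patch_value=None):
    """Compute p(v_pl)/p(v_sg) at [MASK]; optionally overwrite one neuron's activation."""
    inputs = tokenizer(prompt, return_tensors="pt")
    pos = mask_pos(inputs)
    cache = {}

    def hook(module, inp, out):
        hidden = out[0] if isinstance(out, tuple) else out
        cache["act"] = hidden[0, pos, neuron].item()       # record this run's activation
        if patch_value is not None:                        # patch in the counterfactual value
            hidden = hidden.clone()
            hidden[0, pos, neuron] = patch_value
            return (hidden,) + out[1:] if isinstance(out, tuple) else hidden

    handle = model.bert.encoder.layer[layer].register_forward_hook(hook)
    with torch.no_grad():
        logits = model(**inputs).logits[0, pos]
    handle.remove()
    probs = torch.softmax(logits, dim=-1)
    y = (probs[tokenizer.convert_tokens_to_ids(v_pl)] /
         probs[tokenizer.convert_tokens_to_ids(v_sg)]).item()
    return y, cache["act"]

u_sg, u_pl = "The doctor near the cars [MASK] it.", "The doctors near the cars [MASK] it."
layer, neuron = 8, 1234                                    # placeholder indices

# 1) Run the counterfactual prompt to obtain z_swap-number(u, v) for this neuron.
_, z_swap = y_ratio_with_patch(u_pl, "observes", "observe", layer, neuron)
# 2) Null run on the original prompt.
y_null, _ = y_ratio_with_patch(u_sg, "observes", "observe", layer, neuron)
# 3) Null run again, but with the neuron patched to its counterfactual activation.
y_patched, _ = y_ratio_with_patch(u_sg, "observes", "observe", layer, neuron, patch_value=z_swap)

nie = y_patched / y_null - 1.0                             # Equation 3, for a single (u, v) pair
print(f"NIE for neuron {neuron} in layer {layer}: {nie:.4f}")
```

In the setup described in the paper, the NIE is an expectation over many $(u, v)$ pairs and is computed for every neuron and layer; the single-pair sketch above is only meant to make the patching step concrete.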
3.2 Models
Finlayson et al. (2021) analyzed a series of monolingual autoregressive language models (ALMs): GPT-2 (Radford et al., 2019), TransformerXL (Dai