
sic edits to the input text, either manually (Gardner et al., 2020; Kaushik et al., 2019) or automatically (Yang et al., 2021; Wang and Culotta, 2021; Wu et al., 2021), such that the target label changes. These minimal edits are made via substitution, insertion, or deletion of tokens in the original sentence, resulting in simplistic generations that are often unrealistic and lack linguistic diversity (for instance, the minimal edit counterfactual in Figure 1 contains the phrase “loose collection of intelligible analogies”, a somewhat unnatural construction for a positive movie review). As a result, counterfactuals via minimal edits often fail to provide adequate inductive biases to promote robustness (Khashabi et al., 2020; Huang et al., 2020; Joshi and He, 2022).
In this paper, we investigate the potential of more
realistic and creative counterfactuals, which go be-
yond simple token-level edits, towards improving
robust generalization. While allowing larger edits
reduces proximity to the original sentence, we be-
lieve that this is a worthwhile trade-off for more
realistic and creative counterfactuals, which offer
greater flexibility in sentiment steering, increasing
the likelihood that the counterfactual possesses the
desired label. We propose a novel approach that
can generate diverse counterfactuals via concept-
controlled text generation, illustrated in Figure 1.
In particular, our approach combines the benefits of domain-adaptive pretraining (Gururangan et al., 2020) for soft steering of the target label (Liu et al., 2021), with those of NeuroLogic decoding (Lu et al., 2021), an unsupervised, inference-time algorithm that generates fluent text while strictly satisfying complex lexical constraints. As constraints, we use tokens that evoke salient concepts derived from ConceptNet (Speer et al., 2017). Our resulting generations, called NeuroCounterfactuals (NeuroCFs, for short), provide loose counterfactuals to the original, while demonstrating nuanced linguistic alterations to change the target label (§2).
Compared to minimal-edit counterfactuals, our counterfactuals are more natural and linguistically diverse, resulting in syntactic, semantic, and pragmatic changes which alter the label while preserving relevance to the original concepts (Table 1). In experiments with training data augmentation for sentiment classification, our approach achieves better performance than competitive baselines using minimal-edit counterfactuals (§3). In some settings, our performance even matches that of baselines using human-annotated counterfactuals, while avoiding the cost of human annotation. While NeuroCFs are designed to be loose counterfactuals, our detailed analyses show that it is still important to augment training data with examples possessing a moderately high degree of similarity to the original examples (§4). When the ultimate goal is improving robust generalization, we show that going beyond minimal-edit counterfactuals can result in richer data augmentation. Our code and data are available at https://github.com/IntelLabs/NeuroCounterfactuals.
2 NeuroCounterfactuals
We describe our methodology for automatically generating loose counterfactuals, NeuroCFs, for sentiment classification. The key idea underlying our approach is to retain concepts that ensure content similarity to the original text, while steering the sentiment to the opposite polarity. Our method, illustrated in Figure 1, combines a concept-constrained decoding strategy with a sentiment-steered language model. First, we detail our approach for extracting salient concepts from a document (§2.1). Next, we discuss language model adaptation to produce sentiment-steered LMs (§2.2). Finally, we provide an overview of the NeuroLogic decoding algorithm for controlled text generation, and how it can be adapted to the task of generating sentiment counterfactuals (§2.3).
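To preview how these pieces fit together, the sketch below approximates the pipeline with off-the-shelf stand-ins: Hugging Face's constrained beam search (force_words_ids) in place of NeuroLogic decoding, and an unadapted GPT-2 in place of the sentiment-steered LM of §2.2. Both substitutions, and the example concept list, are illustrative assumptions rather than our actual implementation.

```python
# Illustrative approximation only: Hugging Face constrained beam search
# (force_words_ids) stands in for NeuroLogic decoding, and an unadapted
# GPT-2 stands in for the sentiment-steered LM of Section 2.2.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Concept constraints extracted from the original review (Section 2.1).
concepts = ["movie", "analogies", "plot devices"]
force_words_ids = [
    tokenizer(" " + c, add_special_tokens=False).input_ids for c in concepts
]

prompt = tokenizer("This movie", return_tensors="pt")
outputs = model.generate(
    **prompt,
    force_words_ids=force_words_ids,  # every concept must appear
    num_beams=8,                      # constrained search requires beams
    max_new_tokens=40,
    no_repeat_ngram_size=3,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```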
2.1 Extracting Salient Concepts
Our first step is the extraction of concepts from the original document which, when used as constraints during decoding (§2.3), can reconstruct its content. Specifically, we aim to identify a set of constraints which require the counterfactual to be similar in content to the original sentence while still allowing the generation to be steered towards the opposite polarity. Using extracted concepts as constraints achieves this because the concepts consist of content-bearing noun phrases rather than sentiment-bearing adjectives. For example, in the original sentence from Figure 1, we seek to constrain our generated counterfactual to contain concept-oriented phrases, such as “movie”, “analogy”, and “plot devices”, without explicitly requiring the presence of other tokens which may indicate the sentiment (e.g., “unintelligible”, “ill-conceived”).
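As a minimal sketch of this noun-phrase bias, the example below uses spaCy's noun chunker to propose concept candidates from the Figure 1 sentence; the library choice and the lemma-based filtering are illustrative assumptions, not necessarily our exact extraction procedure.

```python
# Minimal sketch: propose concept candidates from content-bearing noun
# phrases. spaCy's noun chunker and the POS/lemma filtering here are
# illustrative assumptions, not necessarily the exact procedure used.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_concept_candidates(text: str) -> list[str]:
    doc = nlp(text)
    candidates = []
    for chunk in doc.noun_chunks:
        # Keeping only the lemma of each chunk's head noun drops the
        # sentiment-bearing adjective modifiers (e.g., "unintelligible").
        if chunk.root.pos_ in {"NOUN", "PROPN"}:
            candidates.append(chunk.root.lemma_.lower())
    return candidates

print(extract_concept_candidates(
    "The movie is a loose collection of unintelligible analogies "
    "and ill-conceived plot devices."
))
# Expected candidates along the lines of:
# ['movie', 'collection', 'analogy', 'device']
```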
We achieve this mapping by linking tokens and phrases in the document to nodes in the ConceptNet graph.
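As one possible realization of this linking step, the sketch below checks candidate phrases against ConceptNet's public HTTP API at api.conceptnet.io; whether linking is performed via this API or an offline dump, and the edge-presence test used here, are illustrative assumptions.

```python
# Illustrative sketch of phrase-to-ConceptNet linking via the public
# HTTP API. The API choice and the "has edges" test are assumptions,
# not necessarily the mechanism used in the actual pipeline.
import requests

def link_to_conceptnet(phrase: str) -> str | None:
    """Return the ConceptNet node URI for `phrase`, or None if unlinked."""
    node = "/c/en/" + phrase.lower().replace(" ", "_")
    resp = requests.get("https://api.conceptnet.io" + node, timeout=10)
    resp.raise_for_status()
    # A node that returns no edges is effectively absent from the graph.
    return node if resp.json().get("edges") else None

for candidate in ["movie", "analogy", "plot device"]:
    print(candidate, "->", link_to_conceptnet(candidate))
```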