SAT: Improving Semi-Supervised Text Classification with Simple Instance-Adaptive Self-Training

Hui Chen    Wei Han    Soujanya Poria
Singapore University of Technology and Design
{hui_chen, wei_han}@mymail.sutd.edu.sg
sporia@sutd.edu.sg
Abstract

Self-training methods have been explored in recent years and have exhibited great performance in improving semi-supervised learning. This work presents a Simple instance-Adaptive self-Training method (SAT) for semi-supervised text classification. SAT first generates two augmented views for each unlabeled example and then trains a meta-learner to automatically identify the relative strength of augmentations based on the similarity between the original view and the augmented views. The weakly-augmented view is fed to the model to produce a pseudo-label, and the strongly-augmented view is used to train the model to predict the same pseudo-label. We conducted extensive experiments and analyses on three text classification datasets and found that, with varying sizes of labeled training data, SAT consistently shows competitive performance compared to existing semi-supervised learning methods. Our code can be found at https://github.com/declare-lab/SAT.git.
1 Introduction

Pretrained language models have achieved extremely good performance in a wide range of natural language understanding tasks (Devlin et al., 2019). However, such performance often depends strongly on large-scale, high-quality supervision. Since labeled linguistic data requires large amounts of time, money, and expertise to obtain, improving models' performance in few-shot scenarios (i.e., where there are only a few training examples per class) has become a challenging research topic.

Semi-supervised learning in NLP has received increasing attention for improving performance in few-shot scenarios, where both labeled and unlabeled data are utilized (Berthelot et al., 2019b; Sohn et al., 2020; Li et al., 2021). Recently, several self-training methods have been explored to extract task-specific information from unlabeled data. UDA (Xie et al., 2020) applied data augmentations to unlabeled data and proposed an unsupervised consistency loss that minimizes the divergence between different unlabeled augmented views. To give self-training more supervision, MixText (Chen et al., 2020a; Berthelot et al., 2019b) employed Mixup (Zhang et al., 2018; Chen et al., 2022) to learn an intermediate representation of labeled and unlabeled data. Both UDA and MixText utilized consistency regularization and confirmed that such regularization exhibits outstanding performance in semi-supervised learning. To simplify the consistency regularization process, FixMatch (Sohn et al., 2020) classified two unlabeled augmented views into a weak view and a strong view, and then minimized the divergence between the probability distribution of the strong view and the pseudo-label of the weak view. However, in NLP it is hard to distinguish the relative strength of augmented text by observation, and randomly assigning an augmentation strength limits the performance of FixMatch on text.

To tackle this problem in FixMatch, our paper introduces an instance-adaptive self-training method, SAT, in which we propose two criteria, one based on a classifier and the other on a scorer, to automatically identify the relative strength of augmentations on text. Our main contributions are:

• First, we apply popular data augmentation techniques to generate different views of unlabeled data and design two novel criteria to calculate the similarity between the original view and the augmented view of unlabeled data in FixMatch, boosting its performance on text.

• We then conduct empirical experiments and analyses on three few-shot text classification datasets. Experimental results confirm the efficacy of our SAT method.
2 Method

2.1 Problem Setting

In this work, we learn a model to map an input $x \in \mathcal{X}$ onto a label $y \in \mathcal{Y}$ in text classification tasks. In semi-supervised learning, we use both labeled and unlabeled examples during training. Let $\mathcal{X} = \{(x_b, y_b) : b \in (1, \ldots, B)\}$ be a batch of $B$ labeled examples, where $x_b$ are the training examples and $y_b$ are their labels. Let $\mathcal{U} = \{u_b : b \in (1, \ldots, \mu B)\}$ be a batch of $\mu B$ unlabeled examples, where $\mu$ is a hyperparameter that determines the relative sizes of $\mathcal{X}$ and $\mathcal{U}$.
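For instance, with $B = 8$ and $\mu = 3$, each training step pairs 8 labeled examples with $\mu B = 24$ unlabeled examples. The short sketch below only illustrates this batch ratio with dummy tensors and hypothetical loader names; it is not taken from the released code.

# Illustration of the labeled/unlabeled batch ratio mu: each step pairs
# B labeled examples with mu * B unlabeled examples (dummy data, PyTorch).
import torch
from torch.utils.data import DataLoader, TensorDataset

B, mu = 8, 3
labeled = TensorDataset(torch.randn(800, 16), torch.randint(0, 4, (800,)))
unlabeled = TensorDataset(torch.randn(8000, 16))

labeled_loader = DataLoader(labeled, batch_size=B, shuffle=True)
unlabeled_loader = DataLoader(unlabeled, batch_size=mu * B, shuffle=True)

(xb, yb), (ub,) = next(iter(labeled_loader)), next(iter(unlabeled_loader))
print(xb.shape, ub.shape)  # torch.Size([8, 16]) torch.Size([24, 16])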
2.2 SAT

The entire process of SAT is illustrated in Algorithm 1. Similar to common semi-supervised learning methods, our approach consists of a supervised part and an unsupervised part. The supervised part minimizes the cross-entropy loss between the labeled data and their targets. The unsupervised part first generates two augmented views of each unlabeled example, then applies an augmentation choice network to determine their relative augmentation strength, and finally calculates a consistency loss between the probability distribution of the strongly-augmented view and the pseudo-label of the weakly-augmented view. Since the relative augmentation strength in SAT has no direct correlation with the augmentation techniques themselves, our semi-supervised learning process is more adaptive to the training data than FixMatch.
The augmentation choice network is trained on the labeled data, and we design two criteria to train it: (1) one based on a classifier and (2) the other based on a scorer. Lines 2 to 7 of Algorithm 1 show how we train the augmentation choice network. For each labeled example, we first calculate the similarity between the original data and each of its augmented variants, and then rank the augmented samples according to the similarity scores. In our classifier-based criterion, we employ a cross-entropy loss to measure the distance, while in our scorer-based criterion, we calculate the cosine similarity. Afterward, we define the variant with the higher similarity score as the weakly-augmented sample and use it to train the augmentation choice network. For our classifier-based method, we apply a cross-entropy loss as the training objective. For our scorer-based method, we use a contrastive loss (Chen et al., 2020b) to update the network. Finally, the trained augmentation choice network is used to automatically identify the augmentation strength of unlabeled data. A minimal sketch of these two criteria is given below.
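As a concrete illustration, the sketch below shows one plausible implementation of the two criteria, under the assumption that the classifier-based criterion scores an augmented view by how well a classifier still predicts the gold label on it, and the scorer-based criterion by the cosine similarity between representations of the original and augmented views. The module names and toy features are placeholders, not the released SAT code.

# Sketch of the two ranking criteria for the augmented views of a labeled example.
# Assumption: the classifier criterion rewards views on which the gold label is
# still predicted well (lower cross-entropy = weaker augmentation); the scorer
# criterion rewards high cosine similarity to the original representation.
import torch
import torch.nn.functional as F

def classifier_criterion(classifier, aug_feats, label):
    # Negated cross-entropy, so a higher score means "more similar to the original".
    return -F.cross_entropy(classifier(aug_feats), label.unsqueeze(0))

def scorer_criterion(encoder, orig_feats, aug_feats):
    # Cosine similarity between representations of the original and the augmented view.
    return F.cosine_similarity(encoder(orig_feats), encoder(aug_feats), dim=-1)

def rank_views(score_view1, score_view2):
    # The view with the higher similarity score is treated as the weak augmentation.
    return ("view1=weak", "view2=strong") if score_view1 >= score_view2 else ("view1=strong", "view2=weak")

# Toy usage with dummy 16-dimensional features and 3 classes.
classifier, encoder = torch.nn.Linear(16, 3), torch.nn.Identity()
orig = torch.randn(1, 16)
aug1 = orig + 0.01 * torch.randn(1, 16)   # mild perturbation: should rank as weak
aug2 = torch.randn(1, 16)                 # unrelated features: should rank as strong
label = torch.tensor(1)

s1 = scorer_criterion(encoder, orig, aug1).item()
s2 = scorer_criterion(encoder, orig, aug2).item()
print(rank_views(s1, s2))                 # expected: view1 ranked as the weak view
c1 = classifier_criterion(classifier, aug1, label).item()
c2 = classifier_criterion(classifier, aug2, label).item()
print(rank_views(c1, c2))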
Algorithm 1: SAT: Simple Instance-Adaptive Self-Training

Input: $D_{\text{train}} = \{\mathcal{X}, \mathcal{U}\}$ where $\mathcal{X} = \{(x_b, y_b) : b \in (1, \ldots, B)\}$ and $\mathcal{U} = \{u_b : b \in (1, \ldots, \mu B)\}$; augmentation methods $\alpha_1$, $\alpha_2$; main network $f(\cdot\,; \theta)$ with parameters $\theta$ and its probability distribution $p$; augmentation choice network $G(\cdot\,; \theta_G)$ with parameters $\theta_G$; criteria $\mathcal{C}$, $\Gamma$; cross-entropy loss $H$; unlabeled loss weight $\lambda_u$; confidence threshold $\tau$; learning rates $\beta$, $\eta$
Output: Updated network weights $\theta$

// Calculate supervised loss
1:  $l_s = \frac{1}{B} \sum_{b=1}^{B} H(y_b, p(y \mid x_b))$
2:  for $(x_b, y_b) \in \mathcal{X}$ do
3:      $i_1^b, i_2^b = \mathcal{C}(\alpha_1(x_b), x_b, y_b), \mathcal{C}(\alpha_2(x_b), x_b, y_b)$
4:      $i_w^b, i_s^b = \text{Descending}(i_1^b, i_2^b)$
5:  end
// Update the augmentation choice network
6:  $l_{\text{aug\_choice}} = \frac{1}{B} \sum_{b=1}^{B} \Gamma(x_b, \alpha_1(x_b), \alpha_2(x_b), i_w^b)$
7:  $\theta_G = \theta_G - \beta \nabla l_{\text{aug\_choice}}$
8:  for each $u_b \in \mathcal{U}$ do
9:      $\hat{i}_w^b, \hat{i}_s^b = G(u_b, \alpha_1(u_b), \alpha_2(u_b); \theta_G)$
10: end
// Calculate unsupervised loss
11: $l_u = \frac{1}{\mu B} \sum_{b=1}^{\mu B} \mathbb{1}\{\max(p(y \mid \alpha_{\hat{i}_w^b}(u_b))) > \tau\} \, H(\operatorname{argmax}(p(y \mid \alpha_{\hat{i}_w^b}(u_b))), p(y \mid \alpha_{\hat{i}_s^b}(u_b)))$
// Total loss: add up supervised loss and unsupervised loss
12: $l_{\text{total}} = l_s + \lambda_u l_u$
// Update the main network
13: $\theta = \theta - \eta \nabla l_{\text{total}}$
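To make the control flow of Algorithm 1 concrete, the following is a minimal, self-contained PyTorch sketch of one SAT training step. The text encoder, the two augmentation functions, and the augmentation choice network $G$ are replaced by stand-ins (random feature vectors and a random weak/strong assignment), so the snippet only illustrates the loss structure, not the released implementation.

# Minimal sketch of one SAT training step (cf. Algorithm 1), with dummy features.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes, feat_dim = 4, 32
B, mu = 8, 3                          # labeled batch size and unlabeled ratio (Sec. 2.1)
lambda_u, tau, eta = 1.0, 0.95, 1e-3  # unlabeled loss weight, confidence threshold, lr

model = torch.nn.Linear(feat_dim, num_classes)  # stand-in for the main network f(.; theta)
opt = torch.optim.SGD(model.parameters(), lr=eta)

# Dummy labeled batch (x, y) and two augmented views of the unlabeled batch.
x, y = torch.randn(B, feat_dim), torch.randint(0, num_classes, (B,))
u_view1, u_view2 = torch.randn(mu * B, feat_dim), torch.randn(mu * B, feat_dim)

# Stand-in for the augmentation choice network G: which view is "weak" per example.
weak_is_view1 = torch.rand(mu * B) > 0.5

# Supervised loss (Alg. 1, line 1).
loss_s = F.cross_entropy(model(x), y)

# Unsupervised consistency loss (Alg. 1, lines 8-11).
weak = torch.where(weak_is_view1.unsqueeze(1), u_view1, u_view2)
strong = torch.where(weak_is_view1.unsqueeze(1), u_view2, u_view1)
with torch.no_grad():                            # pseudo-labels come from the weak view
    probs_w = F.softmax(model(weak), dim=-1)
    conf, pseudo = probs_w.max(dim=-1)
    mask = (conf > tau).float()                  # keep only confident pseudo-labels
loss_u = (F.cross_entropy(model(strong), pseudo, reduction="none") * mask).mean()

# Total loss and main-network update (Alg. 1, lines 12-13).
loss = loss_s + lambda_u * loss_u
opt.zero_grad()
loss.backward()
opt.step()
print(f"supervised={loss_s.item():.3f}  unsupervised={loss_u.item():.3f}")

In the full method, the weak/strong assignment above would come from the trained augmentation choice network (line 9 of Algorithm 1) rather than from a coin flip.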
3 Experimental Setup

We conducted empirical experiments to compare our approach with several existing semi-supervised learning methods on a variety of text classification benchmark datasets.