SAT: Improving Semi-Supervised Text Classification with Simple
Instance-Adaptive Self-Training
Hui Chen Wei Han Soujanya Poria
Singapore University of Technology and Design
{hui_chen, wei_han}@mymail.sutd.edu.sg
sporia@sutd.edu.sg
Abstract
Self-training methods have been explored in
recent years and have exhibited great perfor-
mance in improving semi-supervised learn-
ing. This work presents a Simple instance-
Adaptive self-Training method (SAT) for semi-
supervised text classification. SAT first generates two augmented views for each unlabeled example and then trains a meta-learner to automatically identify the relative strength of augmentations based on the similarity between the
original view and the augmented views. The
weakly-augmented view is fed to the model
to produce a pseudo-label and the strongly-
augmented view is used to train the model to
predict the same pseudo-label. We conducted
extensive experiments and analyses on three
text classification datasets and found that with
varying sizes of labeled training data, SAT
consistently shows competitive performance
compared to existing semi-supervised learning
methods. Our code can be found at https://github.com/declare-lab/SAT.git.
1 Introduction
Pretrained language models have achieved ex-
tremely good performance in a wide range of nat-
ural language understanding tasks (Devlin et al.,
2019). However, such performance often has a
strong dependence on large-scale high-quality su-
pervision. Since labeled linguistic data needs large
amounts of time, money, and expertise to obtain,
improving models’ performance in few-shot sce-
narios (i.e., there are only a few training examples
per class) has become a challenging research topic.
Semi-supervised learning in NLP has received
increasing attention in improving performance in
few-shot scenarios, where both labeled data and
unlabeled data are utilized (Berthelot et al., 2019b; Sohn et al., 2020; Li et al., 2021). Recently, several self-training methods have been explored to
obtain task-specific information in unlabeled data.
UDA (Xie et al., 2020) applied data augmentations
to unlabeled data and proposed an unsupervised
consistency loss that minimizes the divergence be-
tween different unlabeled augmented views. To
give self-training more supervision, MixText (Chen
et al., 2020a; Berthelot et al., 2019b) employed
Mixup (Zhang et al., 2018; Chen et al., 2022) to
learn an intermediate representation of labeled and
unlabeled data. Both UDA and MixText utilized
consistency regularization and confirmed that such
regularization exhibits outstanding performance in
semi-supervised learning. To simplify the consis-
tency regularization process, FixMatch (Sohn et al.,
2020) classified two unlabeled augmented views
into a weak view and a strong view, and then mini-
mized the divergence between the probability dis-
tribution of the strong view and the pseudo label
of the weak view. However, in NLP, it is hard to
distinguish the relative strength of augmented text
by observation, and randomly assigning an aug-
mentation strength will limit the performance of
FixMatch on text.
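The FixMatch objective on unlabeled data described above can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the probability arrays stand in for model outputs on the two augmented views, and the 0.95 confidence threshold is an assumed hyperparameter.

```python
import numpy as np

def fixmatch_loss(weak_probs, strong_probs, threshold=0.95):
    """FixMatch-style consistency loss on a batch of unlabeled examples.

    weak_probs:   (N, C) model probabilities for the weakly-augmented views
    strong_probs: (N, C) model probabilities for the strongly-augmented views

    Hard pseudo-labels come from the weak view; only confident predictions
    (max prob >= threshold) contribute, and the strong view is trained to
    predict those pseudo-labels via cross-entropy.
    """
    pseudo = weak_probs.argmax(axis=1)                # hard pseudo-labels
    confident = weak_probs.max(axis=1) >= threshold   # confidence mask
    if not confident.any():
        return 0.0
    # cross-entropy of the strong view against the weak view's pseudo-labels
    ce = -np.log(strong_probs[np.arange(len(pseudo)), pseudo] + 1e-12)
    return float((ce * confident).sum() / confident.sum())

# Only the first (confident) example contributes to the loss:
weak = np.array([[0.97, 0.02, 0.01],    # confident -> kept
                 [0.50, 0.30, 0.20]])   # unconfident -> masked out
strong = np.array([[0.80, 0.15, 0.05],
                   [0.10, 0.80, 0.10]])
loss = fixmatch_loss(weak, strong)
```

The confidence mask is what makes the method robust early in training, when most pseudo-labels are still unreliable.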
To tackle this problem in FixMatch, our paper in-
troduces an instance-adaptive self-training method
SAT, where we propose two criteria based on a
classifier and a scorer to automatically identify the
relative strength of augmentations on text. Our
main contributions are:
• First, we apply popular data augmentation
techniques to generate different views of un-
labeled data and design two novel criteria to
calculate the similarity between the original
view and the augmented view of unlabeled
data in FixMatch, boosting its performance
on text.
• We then conduct empirical experiments and
analyses on three few-shot text classification
datasets. Experimental results confirm the
efficacy of our SAT method.
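The instance-adaptive strength assignment in the first contribution can be sketched as below. The paper trains a meta-learner (a classifier and a scorer) for this; as a minimal stand-in, this sketch scores similarity with cosine distance over sentence embeddings, where the embedding vectors themselves are assumed inputs.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def assign_strength(orig_emb, aug1_emb, aug2_emb):
    """Label the augmented view more similar to the original as 'weak'
    and the other as 'strong', per-instance rather than per-augmentation.

    Returns the (strength_of_aug1, strength_of_aug2) labels.
    """
    s1 = cosine(orig_emb, aug1_emb)
    s2 = cosine(orig_emb, aug2_emb)
    return ("weak", "strong") if s1 >= s2 else ("strong", "weak")

# A view close to the original is treated as the weak augmentation:
labels = assign_strength(np.array([1.0, 0.0]),
                         np.array([0.9, 0.1]),
                         np.array([0.1, 0.9]))
```

The point of deciding per instance is that the same augmentation (e.g., back-translation) can distort one sentence heavily and another barely at all, so a fixed weak/strong assignment, as in FixMatch, is unreliable for text.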
arXiv:2210.12653v1 [cs.CL] 23 Oct 2022