FAST: Improving Controllability for Text Generation with Feedback Aware Self-Training
Junyi Chai, Reid Pryzant, Victor Ye Dong, Konstantin Golobokov, Chenguang Zhu, Yi Liu
Microsoft Corporation
{juchai, reidpryzant, victordong, kogolobo, chezhu, lewisliu}@microsoft.com
Abstract
Controllable text generation systems often leverage control codes to direct various properties of the output, such as style and length. Inspired by recent work on causal inference for NLP, this paper reveals a previously overlooked flaw in these control code-based conditional text generation algorithms. Spurious correlations in the training data can lead models to incorrectly rely on parts of the input other than the control code for attribute selection, significantly undermining downstream generation quality and controllability. We demonstrate the severity of this issue with a series of case studies and then propose two simple techniques to reduce these correlations in training sets. The first technique resamples the data according to an example's propensity towards each linguistic attribute (IPS). The second produces multiple counterfactual versions of each example and then uses an additional feedback mechanism to remove noisy examples (feedback aware self-training, FAST). We evaluate on three tasks (news headline, meta review, and search ads generation) and demonstrate that FAST can significantly improve the controllability and language quality of generated outputs when compared to state-of-the-art controllable text generation approaches.
1 Introduction
In neural text generation, there is a growing in-
terest in controlling the presence of particular lin-
guistic attributes in the output text, for example
sentiment, length, politeness, and topic (Sennrich
et al.,2016;Kikuchi et al.,2016;Ficler and Gold-
berg,2017;Shen et al.,2022). This is typically
accomplished via control codes: categorical vari-
ables that represent the desired output property and
are pre-pended to the model inputs during training
and testing (Keskar et al.,2019).
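As a minimal sketch of this conditioning scheme (the token format below is an illustrative assumption, not the exact scheme of Keskar et al. (2019)), the desired attribute is serialized as a token and prepended to the source text:

def build_model_input(control_code: str, source_text: str) -> str:
    # Prepend the attribute token, e.g. "<short> Tech giant unveils ..."
    return f"<{control_code}> {source_text}"

# At inference time, swapping the code (e.g. "short" -> "long") is meant
# to be the only lever that changes the output attribute.
example = build_model_input("short", "Tech giant unveils new chip ...")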
This paper builds on recent work in text-based causal inference (Feder et al., 2021; Veitch et al., 2021; Pryzant et al., 2021) to reveal a previously overlooked flaw in control code-based text generation systems: spurious correlations in the data can cause models to incorrectly rely on parts of the input other than the control code for attribute selection, undermining downstream generation performance.
For example, consider a system that generates news headlines while conditioning on article text and a control code for headline length (e.g. long for desktop, short for mobile), as in Murao et al. (2019). We show in §4.1 that among publicly available news datasets, correlations exist between the contents of an article and the length of that article's title. Longer articles, or articles about technical topics, may be associated with longer titles. This leads NLP models to struggle to generate short titles from “long title”-looking articles.
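One hypothetical way to surface such a confound is to measure how strongly article length predicts headline length in a corpus; the field names below are illustrative, not taken from the paper's datasets.

import numpy as np

# Hypothetical diagnostic: correlate article length with headline length.
# A strongly positive coefficient suggests the control code would not be
# the only signal for headline length in this data.
def length_correlation(dataset):
    article_len = np.array([len(d["article"].split()) for d in dataset])
    title_len = np.array([len(d["title"].split()) for d in dataset])
    return np.corrcoef(article_len, title_len)[0, 1]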
We show how this phenomenon can introduce confounding statistical relationships in the data, leading to assumption violations and significantly degrading model quality. We then propose two simple data augmentation techniques for mitigating the issue. Both algorithms operate by breaking these spurious correlations and isolating the statistical relationship between control codes and linguistic attributes. In the first approach, we resample the training set according to an inverse propensity score (IPS; Robins et al., 1994), boosting the presence of rare context-attribute combinations in the data. In the second approach (FAST), we train a preliminary model, use counterfactual data augmentation to generate outputs for all possible attributes of each example, then retrain on the counterfactually balanced dataset, as illustrated in Figure 1.
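The sketch below summarizes both strategies under stated assumptions: propensity, generate, and classify are hypothetical stand-ins for a propensity model P(attribute | context), the preliminary generator, and the feedback attribute classifier, and the data layout is illustrative rather than the paper's exact implementation.

import random

def ips_resample(examples, propensity):
    # IPS: draw examples with probability proportional to the inverse of
    # P(attribute | context), boosting rare context-attribute combinations.
    weights = [1.0 / max(propensity(x, a), 1e-3) for x, a in examples]
    return random.choices(examples, weights=weights, k=len(examples))

def fast_augment(examples, attributes, generate, classify):
    # FAST: a preliminary model produces a counterfactual output for every
    # attribute code, and a feedback classifier discards generations that
    # fail to realize the requested attribute (the noisy examples).
    augmented = []
    for x, _ in examples:
        for a in attributes:
            y = generate(a, x)
            if classify(y) == a:  # keep only faithful counterfactuals
                augmented.append(((a, x), y))
    return augmented  # retrain the final model on this balanced set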
We conduct experiments in three conditional text generation scenarios: generating news headlines from article contents (controlling the headline lengths), generating the next sentence from preceding sentences (controlling the intent), and generating search ad copy from landing pages (control-