
In this paper, we propose a progressive zero-shot dataset generation framework (Figure 1b), called PROGEN. In a nutshell, PROGEN learns a model for a downstream task by alternating between two phases: using PLMs to create labeled examples guided by feedback from the current task-specific model, and training a task-specific model on the generated labeled examples. To compute reliable feedback signals, we employ the influence function (IF; Koh and Liang, 2017) to quantify the contribution of each training point to the loss. Since zero-shot learning assumes no human-annotated data, we integrate a noise-resistant objective into the calculation of the IF so that it can tolerate the noise in the synthetic dataset. To incorporate the feedback into PLMs, we rank the training samples by their influence scores and formulate the most influential ones as in-context examples (Brown et al., 2020) to steer subsequent generation.
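To make the alternation concrete, the following is a minimal sketch of the loop; `generate_with_prompt`, `train_task_model`, and `influence_scores` are hypothetical helper names standing in for the components described in §3, not the authors' implementation.

```python
import random

# Minimal sketch of the PROGEN loop. The three helpers below are
# hypothetical placeholders for the components described in Section 3.
def progen(plm, labels, rounds=3, per_round=1000, k_feedback=8):
    dataset, in_context = [], []
    model = None
    for _ in range(rounds):
        # Phase 1: the PLM creates labeled examples, steered by the most
        # influential samples from the previous round as in-context examples.
        for _ in range(per_round):
            y = random.choice(labels)                     # uniform class label
            x = generate_with_prompt(plm, y, in_context)  # x ~ P(. | T(y), feedback)
            dataset.append((x, y))

        # Phase 2: train the task-specific model on the synthetic data.
        model = train_task_model(dataset)

        # Feedback: score every sample with the (noise-resistant) influence
        # function and keep the top-k as the next round's in-context examples.
        scores = influence_scores(model, dataset)
        ranked = sorted(zip(scores, dataset), key=lambda t: t[0], reverse=True)
        in_context = [pair for _, pair in ranked[:k_feedback]]
    return model, dataset
```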
Overall, PROGEN has the following advantages: 1) the quality estimation phase requires no human annotations, and thus works in a purely zero-shot learning setting; 2) unlike most controllable generation methods, which tune or require access to the PLM's parameters (Keskar et al., 2019; Dathathri et al., 2020; Liu et al., 2021, inter alia), the in-context feedback phase does not modify any parameters of the PLM and incurs minimal disturbance to its generation procedure. Our main contributions are threefold:
• We propose a progressive framework for zero-shot dataset generation that produces higher-quality datasets (§3);
• We propose a noise-resistant influence function to estimate the quality of each sample without any human annotations (§3.1; the classical influence function it builds on is recalled after this list), and a learning-free controllable generation method via in-context feedback (§3.2);
• Across multiple text classification datasets, we show that our framework outperforms various prompt-based methods, and that it matches the zero-shot performance of methods without in-context feedback using only 1% of the synthetic dataset size (§4).
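For reference, the classical influence function (Koh and Liang, 2017) that our noise-resistant variant builds on measures how upweighting a training point $z$ changes the loss at a test point $z_{\text{test}}$, given a model with parameters $\hat{\theta}$ trained on $n$ points:

$$\mathcal{I}(z, z_{\text{test}}) = -\nabla_\theta L(z_{\text{test}}, \hat{\theta})^\top H_{\hat{\theta}}^{-1} \nabla_\theta L(z, \hat{\theta}), \quad H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat{\theta}).$$

§3.1 replaces the plain loss $L$ with a noise-resistant objective so that these scores remain reliable on a noisy synthetic dataset.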
Our code can be found at https://github.com/HKUNLP/ProGen.
2 Background
In this section, we briefly review the baseline approaches to zero-shot dataset generation and how the synthesized dataset can be used for zero-shot learning on downstream tasks.
Zero-shot Dataset Generation
Taking text classification as an example, vanilla zero-shot dataset generation methods (Meng et al., 2022; Ye et al., 2022) aim to generate a synthetic dataset $\mathcal{D} = \{(x, y)\}$ with the help of a PLM $\mathcal{P}$. They first sample a class label $y$ from a uniform distribution:

$$y \sim \mathcal{U}(y_1, y_2, \ldots, y_k), \tag{1}$$

where $k$ is the number of classes. They then wrap $y$ into a label-descriptive prompt $\mathcal{T}(y)$ to steer the generation of $x$:

$$x \sim \mathcal{P}(\cdot \mid \mathcal{T}(y)). \tag{2}$$

Since the parameters of $\mathcal{P}$ are frozen and deterministic decoding would produce the same $x$ for each $y$, different sampling algorithms (e.g., top-k sampling (Fan et al., 2018) and nucleus sampling (Holtzman et al., 2020)) can be adopted to increase the diversity of the generated dataset. The synthetic dataset $\mathcal{D}$ is constructed by pairing each generated $x$ with its $y$.
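As a concrete illustration, the following is a minimal sketch of this procedure with Hugging Face `transformers`; the model, prompt template, and sampling hyperparameters are illustrative assumptions, not the exact setup of the cited works.

```python
import random
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
plm = AutoModelForCausalLM.from_pretrained("gpt2-large")  # frozen P

labels = ["negative", "positive"]              # k = 2 classes (illustrative)

def T(y):                                      # label-descriptive prompt T(y)
    return f'The movie review in {y} sentiment is: "'

dataset = []
for _ in range(100):
    y = random.choice(labels)                  # Eq. (1): y ~ U(y_1, ..., y_k)
    inputs = tokenizer(T(y), return_tensors="pt")
    out = plm.generate(                        # Eq. (2): x ~ P(. | T(y))
        **inputs,
        do_sample=True, top_k=50, top_p=0.9,   # top-k / nucleus for diversity
        max_new_tokens=40,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated continuation as x.
    x = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                         skip_special_tokens=True)
    dataset.append((x, y))                     # pair generated x with y
```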
Dataset-generation-based Zero-shot Learning
The vast linguistic (Jawahar et al., 2019; Goldberg, 2019; Tenney et al., 2019) and factual (Petroni et al., 2019; Jiang et al., 2020b) knowledge encoded in PLMs' parameters is key to the success of conventional prompt-based zero-shot learning (PROMPTING) (Brown et al., 2020). However, PROMPTING fails to fully exploit the capacity of PLMs and relies heavily on gigantic PLMs at inference time. This motivates another line of work (Meng et al., 2022; Ye et al., 2022) that explores a more flexible and efficient way of conducting zero-shot learning based on dataset generation.
Given a synthetic dataset generated as above, a task-specific model is trained; this model can adopt any task-specific inductive bias and has an order of magnitude fewer parameters than the PLM. The performance of the final task-specific model is largely determined by the quality of the synthetic dataset, and a low-quality dataset degrades the final zero-shot performance. This motivates us to explore methods that improve the dataset quality.
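As a minimal illustration of this pipeline, the sketch below fits a deliberately small classifier on the synthetic pairs; the architecture, the `encode` tokenizer, and the label mapping are illustrative assumptions, not the models used in the cited works.

```python
import torch
import torch.nn as nn

# Sketch: training a compact task-specific model on the synthetic dataset.
class BagOfEmbeddings(nn.Module):
    """Tiny classifier: mean-pool word embeddings, then a linear head."""
    def __init__(self, vocab_size, num_classes, dim=128):
        super().__init__()
        self.emb = nn.EmbeddingBag(vocab_size, dim)   # default mode: mean
        self.head = nn.Linear(dim, num_classes)

    def forward(self, token_ids, offsets):
        return self.head(self.emb(token_ids, offsets))

label_ids = {"negative": 0, "positive": 1}            # illustrative mapping
model = BagOfEmbeddings(vocab_size=30000, num_classes=len(label_ids))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for x, y in dataset:                                  # pairs from Eqs. (1)-(2)
    ids = torch.tensor(encode(x))                     # hypothetical tokenizer
    logits = model(ids, offsets=torch.tensor([0]))    # single-example "batch"
    loss = loss_fn(logits, torch.tensor([label_ids[y]]))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Any similarly compact architecture with a task-appropriate inductive bias fits this slot; the point is that inference no longer requires the PLM.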