
the user’s goal. For a new dialogue $C_i$, we first generate its user goal $G_i$ based on the ontology. The user goal and belief state are a set of domain-slot-value triplets: (domain, slot_name, slot_value). For example, when a user wants to book a 4-star hotel for 2 nights and a cheap restaurant that serves Chinese food, the user goal will be {(hotel, stars, 4), (hotel, book stay, 2), (restaurant, pricerange, cheap), (restaurant, food, chinese)}. We investigate several ways to generate the user goal, i.e., to determine the domains, slots, and slot values to be selected, which are discussed below.
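Concretely, such a user goal can be held as a plain set of (domain, slot, value) triplets. The following minimal Python sketch (illustrative only, not the paper’s code; whether slots are domain-qualified is an assumption here) shows the hotel/restaurant example above and how the domain and slot sets used later for goal similarity can be read off it:

```python
# A user goal as a set of (domain, slot, value) triplets.
user_goal = {
    ("hotel", "stars", "4"),
    ("hotel", "book stay", "2"),
    ("restaurant", "pricerange", "cheap"),
    ("restaurant", "food", "chinese"),
}

# Domains and (domain-qualified) slots mentioned in the goal,
# i.e., D(G) and S(G) in the similarity measure below.
domains = {d for d, _, _ in user_goal}      # {"hotel", "restaurant"}
slots = {(d, s) for d, s, _ in user_goal}   # e.g., ("hotel", "stars")
```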
Example Selection. Given the target user goal $G_t$, we select a few seed dialogues as in-context examples, from which GPT-3 can learn to generate the target dialogue $C_i$. To achieve that, the selected dialogue examples should contain as much of the ontology information needed in the target dialogue (i.e., the mentioned slots) as possible, so that GPT-3 can mimic the “in-domain” generation. To measure how much two dialogue goals $G_i$ and $G_j$ overlap, we calculate their similarity as:
$$
w_{ij} = \frac{|D(G_i) \cap D(G_j)|}{|D(G_i) \cup D(G_j)|} \cdot \frac{|S(G_i) \cap S(G_j)|}{|S(G_i) \cup S(G_j)|}, \qquad (1)
$$
where $D(G_i)$ and $S(G_i)$ denote the set of domains and the set of slots in the user goal $G_i$, respectively. The first part is the Jaccard similarity (Niwattanakul et al., 2013) of the domain sets, while the second part is that of the slot sets. The probability of a dialogue $C_j$ from the seed dataset $D_s$ being sampled as an in-context example for the target dialogue $C_i$ is:
$$
p_j = \frac{e^{w_{ij}/\tau}}{\sum_{C_k \in D_s} e^{w_{ik}/\tau}}, \qquad (2)
$$
where $\tau$ is the temperature. A higher temperature will introduce more randomness and diversity in example selection.
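To make Equations (1) and (2) concrete, the following is a minimal Python sketch of the goal-similarity computation and the temperature-scaled sampling distribution. The function names and the default temperature value are illustrative assumptions, not values from the paper:

```python
import math

def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a ∩ b| / |a ∪ b| (0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def goal_similarity(goal_i: set, goal_j: set) -> float:
    """Equation (1): product of the domain-set and slot-set Jaccard similarities.
    Goals are sets of (domain, slot, value) triplets."""
    domains_i, domains_j = {d for d, _, _ in goal_i}, {d for d, _, _ in goal_j}
    slots_i, slots_j = {(d, s) for d, s, _ in goal_i}, {(d, s) for d, s, _ in goal_j}
    return jaccard(domains_i, domains_j) * jaccard(slots_i, slots_j)

def sampling_probs(target_goal: set, seed_goals: list, tau: float = 0.2) -> list:
    """Equation (2): softmax over goal similarities with temperature tau."""
    weights = [math.exp(goal_similarity(target_goal, g) / tau) for g in seed_goals]
    z = sum(weights)
    return [w / z for w in weights]
```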
We investigate several ways to generate user
goals and select in-context examples:
• Random Sampling: we randomly select domains, slots, and slot values to form a user goal and sample in-context examples as described in Equation 2. In this way, we can generate any unseen user goal and thus the corresponding dialogues. However, as the number of seed dialogues is limited, it is hard to guarantee that the sampled dialogue examples can cover all the information required for generating the target dialogue.
Instruction: You are going to book a hotel, and your requirements for the hotel are ([hotel] type is hotel , pricerange is cheap , parking is yes , stay is 2 , day is tuesday , people is 6). Make sure you get the booking information once booked.
Conversation:
User ([hotel] type is hotel , pricerange is cheap): i am looking for a place to to stay that has cheap price range it should be in a type of hotel .
Assistant ([hotel] [request] area): okay , do you have a specific area you want to stay in ?
User ([hotel] parking is yes , pricerange is cheap): no , i just need to make sure it is cheap . oh , and i need parking .
Assistant ([hotel] [inform] price choice parking type [offerbook]): i found [value_choice] [value_price] [value_type] for you that include -s parking . do you like me to book it ?
User ([hotel] stay is 3 , day is tuesday , people is 6): yes , please . 6 people 3 nights starting on tuesday .
Assistant ([hotel] [nobook] day [request] stay day): i am sorry but i was n’t able to book that for you for [value_day] . is there another day you would like to stay or perhaps a shorter stay ?
User ([hotel] stay is 2): how about only 2 nights .
Assistant ([hotel] [offerbooked] reference [general] [reqmore]): booking was successful . reference number is : [value_reference] . anything else i can do for you ?
User ([general]): no , that will be all . goodbye .
Assistant ([general] [bye]): thank you for using our services .

Figure 3: Illustration of an in-context example from the MultiWOZ dataset. The user goal, belief states, and dialog acts are in red. User utterances are in blue, while system responses are in green. Best viewed in color.
• Value Substitution: we only substitute the slot values of the seed dialogues’ user goals to form a new user goal. This method can ensure that all the required slots are mentioned in the in-context examples. However, GPT-3 will tend to replicate the in-context examples, and thus little diversity can be introduced.
• Combination: we first select a few dialogues from the seed dataset and then combine their user goals to create a new goal. As the new user goal might involve too many domains and slots, we randomly drop some slots. This method can ensure that all the slots mentioned in the target user goal are covered in the examples, and it encourages GPT-3 to generate diverse data.
We experimentally found that the Combination method yields the best performance. More details, comparisons, and discussion of the different goal generation methods can be found in Appendix A.2.
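For illustration, the Combination strategy can be sketched as follows; the number of combined seed goals and the slot-drop probability are assumed hyperparameters for this sketch, not values reported in the paper:

```python
import random

def combine_goals(seed_goals: list, num_seeds: int = 2, drop_prob: float = 0.3) -> set:
    """Sketch of the Combination strategy: merge the user goals of a few
    seed dialogues, then randomly drop triplets so that the new goal does
    not involve too many domains and slots."""
    chosen = random.sample(seed_goals, k=num_seeds)
    merged = set().union(*chosen)
    return {triplet for triplet in merged if random.random() > drop_prob}
```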
Demonstration. To better demonstrate to GPT-3 the desired pattern of the generated data for a dialogue, we design the format of the example dialogues as shown in Figure 3. The user goal and belief state are converted from a sequence of triplets to natural language via a template. For example, the user goal {(hotel, stars, 4), (hotel, book stay, 2), (restaurant, pricerange, cheap), (restaurant, food, chinese)} will be converted to [hotel] star is 4 ,