
et al., 2021; Li et al., 2020b) and propose a novel definition generation method based on a contrastive objective designed for this task. Conceptually, definition generation transforms the encoding of the target word into its textual interpretation. To this end, the encoding and the decoding of the target word can be regarded as two views of representations of the same semantics. Our idea is then to leverage both representations in the definition generation model and encourage them to align with each other so as to capture fine-grained semantics. Specifically, we treat the target word representation and the definition representation as a positive pair and feed them into a contrastive learning objective. This kind of contrastive loss is naturally complementary to the language generation loss and can be seamlessly incorporated into existing pre-trained encoder-decoder models.
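As an illustrative sketch (the symbols below are ours and not necessarily the exact formulation used in our model), such an objective can be instantiated as an InfoNCE-style loss with in-batch negatives:
$$
\mathcal{L}_{\mathrm{ctr}} = -\frac{1}{|B|}\sum_{(w,d)\in B}\log\frac{\exp\big(\mathrm{sim}(\mathbf{h}_w,\mathbf{h}_d)/\tau\big)}{\sum_{d'\in B}\exp\big(\mathrm{sim}(\mathbf{h}_w,\mathbf{h}_{d'})/\tau\big)},
$$
where $\mathbf{h}_w$ denotes the encoder-side representation of the target word, $\mathbf{h}_d$ the decoder-side representation of its definition, $\mathrm{sim}(\cdot,\cdot)$ a similarity function (e.g., cosine similarity), $\tau$ a temperature, and $B$ the training batch whose other definitions serve as negatives. Under this view, the overall training loss would combine the generation loss with the contrastive term, e.g., $\mathcal{L}=\mathcal{L}_{\mathrm{gen}}+\lambda\,\mathcal{L}_{\mathrm{ctr}}$ for some weighting hyperparameter $\lambda$.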
To validate the effectiveness of our proposal, we conduct a series of experiments on three publicly available datasets. Both automatic and manual evaluation results suggest that our method generates more specific definitions and effectively alleviates the under-specific problem in the task of definition generation. In general, our contributions can be summarized as follows:
• We tackle the under-specific problem for pre-trained definition generation models by developing a novel fine-grained contrastive learning objective.
• We validate the effectiveness of the proposed method by comparing it with several SOTA models on three popular datasets using both automatic and manual judgments.1
• We analyze the details of our method through ablation studies and demonstrate its effect in addressing the under-specific problem based on case studies.
2 Related Work
2.1 Definition Generation
The task of Definition Generation was first proposed by Noraset et al. (2017), who use word embeddings to generate corresponding definitions and utilize definition generation as an auxiliary task for reverse dictionary and word embedding training.
1 Our code can be found at https://github.com/rattlesnakey/Definition-Gneration-Contrastive
Some later works explore more application scenarios and model architectures for definition generation. Ni and Wang (2017) propose a dual-encoder model to generate the proper definition of a given word under a specific context, and use it to explain emerging words on the Internet. Gadetsky et al. (2018) use both local and global information of the words in their model for word disambiguation. Following them, Ishiwatari et al. (2019) design gate mechanisms to fuse multi-source information about the word and context. Furthermore, some works attempt to utilize other information about the target word. Washio et al. (2019) model the relation between defined and defining words using word pair embeddings (Joshi et al., 2018). Different from former works that use distributed representations of target words, Yang et al. (2019) introduce the target words' concepts in HowNet (Dong and Dong, 2003) as fine-grained knowledge for Chinese definition modeling. There are also works that adopt more refined methods to model the target words. Both Li et al. (2020a) and Reid et al. (2020) decompose the meaning of the target word into a group of latent variables and rely on variational inference for estimation.
Recently, pre-trained encoder-decoder models have been applied to definition generation and achieved great success. Bevilacqua et al. (2020) use special tokens to mark the target word in the context and feed them into a BART model (Lewis et al., 2019). Huang et al. (2021) fine-tune a T5 model and re-rank all of its candidate outputs to obtain definitions with proper specificity. Kong et al. (2022) design a MASS-based multi-task framework to generate simple definitions in an unsupervised manner. Despite their promising performance on definition generation, the under-specific problem has been less investigated. Although Huang et al. (2021) design a scoring mechanism that measures definitions' specificity, we argue that the fundamental cause of the under-specific problem lies in the lack of fine-grained semantic learning in pre-trained encoder-decoder models, which we address with contrastive learning in this work.
2.2 Contrastive Learning in Semantic
Representation
Contrastive learning has been widely used to enhance semantic representations for various NLP tasks. For example, Gao et al. (2021) use a dropout trick to derive positive samples in the embedding