groups may facilitate learning and disentanglement (RQ3). As a result, this work focuses on natural language definitions, a textual resource characterised by a principled structure in terms of semantic roles, as demonstrated by previous work on extracting structural and semantic patterns from this kind of data (Silva et al., 2016, 2018).
Seeking to address the highlighted issues and answer the research questions, we make the following contributions, also depicted in Figure 1.
1) We design a supervised framework for enhancing disentanglement in language representations by conditioning on the information provided by semantic role labels (SRL) in natural language definitions. We present two mechanisms for injecting SRL biases into latent variables: first, reconstructing both words and their corresponding SRLs in a VAE; second, employing SRL information as input variables to a Conditional VAE (Zhao et al., 2017) (a rough sketch of both mechanisms follows this list).
2) We propose a framework for evaluating the disentanglement properties of the encodings on non-synthetic textual datasets. Our evaluation framework employs semantic role label groupings as generative factors, enabling the measurement of several contemporary quantitative metrics. The results show that the proposed bias injection mechanisms are able to increase the degree of disentanglement (separability) of the representations.
3) We demonstrate that models trained with our disentanglement framework outperform contemporary baselines on the downstream task of definition modeling (Noraset et al., 2017).
2 Disentangling framework
In this section, we first describe the framework designed for improving disentanglement in natural language definitions with semantic role labels. We then present three models, shown in Figure 2, based on the Variational Autoencoder (VAE) architecture (Bowman et al., 2016) for achieving disentanglement.
2.1 Disentangling definitions
Definition semantic roles
Our framework is
based on natural language definitions, which are a particular type of linguistic expression characterised by high abstraction and specific phrasal properties. Previous work in NLP for dictionary definitions (Silva et al., 2018) has shown that there are categories that can be consistently found in most definitions. In fact, Silva et al. (2018) define precise Semantic Role Labels (SRL) for phrases representing definitions, under the name of Definition Semantic Roles (DSR).
The example from Silva et al. (2018) classifies the semantic roles within "english poets who lived in the lake district" as follows: "poets" is the noun category (Supertype), "english" is a quality of the term (Differentia Quality), "who lived" is an event the subject is involved in (Differentia Event), and "in the lake district" is the location of the event (Event Location). The full set of DSRs proposed by Silva et al. (2018) is reported in Table 9 in Appendix A.
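For illustration, such an annotation can be viewed as a sequence of (span, role) pairs; the sketch below simply mirrors the example above and is a hypothetical data structure, not the actual annotation format of the dataset:

```python
# Hypothetical DSR annotation for the example definition; role names
# follow Silva et al. (2018), the data structure itself is illustrative.
dsr_annotation = [
    ("english", "DIFFERENTIA_QUALITY"),          # quality of the term
    ("poets", "SUPERTYPE"),                      # noun category
    ("who lived", "DIFFERENTIA_EVENT"),          # event involving the subject
    ("in the lake district", "EVENT_LOCATION"),  # location of the event
]
```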
Disentangling using SRL
Our goal is to enhance disentanglement in natural language representations by injecting categorical structure into latent variables. This goal is well aligned with the findings of Locatello et al. (2019), who claim that a higher degree of disentanglement may benefit from supervision and inductive biases. Our hypothesis is that we can leverage such semantic information to learn representations with a higher degree of disentanglement. While in this work we use dictionary definitions as the target empirical setting, we conjecture that these conclusions extend to broader definitional sentence types. The core intuition behind the approach is that, given the network architecture formulation, the supervision signal should increase the likelihood that points cluster in regions of the latent space corresponding, or related, to the discrete supervision labels.
2.2 Definition VAEs
Unsupervised VAE
The first training framework that we consider is the traditional variational autoencoder (VAE) for sentences (Bowman et al., 2016), which operates in an unsupervised fashion, as in Figure 2a. The unsupervised VAE employs a multivariate Gaussian prior distribution $p(z)$ and generates a sentence $x$ with a decoder network $p_\theta(x|z)$. The joint distribution for the decoder is defined as $p(z)\,p_\theta(x|z)$, which, for a sequence of tokens $x$ of length $T$, results in $p_\theta(x|z) = \prod_{i=1}^{T} p_\theta(x_i \mid x_{<i}, z)$. The VAE objective consists in maximizing the expected log-likelihood $\mathbb{E}_{p(x)}[\log p_\theta(x)]$. Since this expectation is computationally intractable, a variational distribution $q_\phi(z|x)$ is employed to approximate the true posterior $p_\theta(z|x)$.
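For concreteness, below is a minimal sketch of this formulation in PyTorch; the GRU modules, dimensions, and the way $z$ initialises the decoder are illustrative assumptions, not the exact architecture used in this paper:

```python
import torch
import torch.nn as nn

class SentenceVAE(nn.Module):
    """Minimal sketch of an unsupervised sentence VAE (Bowman et al., 2016)."""

    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, latent_dim)      # mean of q_phi(z|x)
        self.to_logvar = nn.Linear(hid_dim, latent_dim)  # log-var of q_phi(z|x)
        self.z_to_h = nn.Linear(latent_dim, hid_dim)     # map z to decoder state
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, x):
        # q_phi(z|x): encode the token ids x into a Gaussian posterior.
        _, h = self.encoder(self.embed(x))
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        # Reparameterisation: z = mu + sigma * eps, with eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        # p_theta(x|z) = prod_i p_theta(x_i | x_{<i}, z): teacher-forced
        # autoregressive decoding, with z initialising the hidden state.
        h0 = torch.tanh(self.z_to_h(z)).unsqueeze(0)
        dec_out, _ = self.decoder(self.embed(x[:, :-1]), h0)
        return self.out(dec_out), mu, logvar  # next-token logits + posterior
```

The logits parameterise each factor $p_\theta(x_i \mid x_{<i}, z)$, while mu and logvar define the approximate posterior whose divergence from the prior appears in the bound introduced next.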
As a result, an evidence lower bound $\mathcal{L}_{\mathrm{VAE}}$