independent conditioning on $w'$, the decomposition of the third term in Eq. (2) holds. The relationship between $y$ and $w'$ will be further explained in Section 4.1.2. Given $q_\phi(w, z|x, y) = q_\phi(w, z|x) = q_\phi(w|x) \cdot q_\phi(z|x)$, we rewrite the joint probability in Eq. (2) in the form of Bayesian variational inference as the first term of the learning objective:
$$\mathcal{L}_1 = -\mathbb{E}_{q_\phi(w,z|x)}[\log p_\theta(x|w,z)] - \mathbb{E}_{q_\phi(w|x)}[\log p_\gamma(y|w)] + D_{KL}\big(q_\phi(w,z|x)\,\|\,p(w,z)\big). \tag{3}$$
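As a concrete illustration, the following is a minimal PyTorch sketch of Eq. (3), assuming a diagonal-Gaussian posterior over the concatenated $(w, z)$ with parameters (`mu`, `logvar`), a standard-normal prior $p(w, z)$, and Gaussian likelihoods for both expectation terms (so they reduce to MSE up to constants); `x_recon` and `y_pred` are hypothetical decoder and property-predictor outputs:

```python
import torch
import torch.nn.functional as F

def loss_l1(x, x_recon, y, y_pred, mu, logvar):
    # Reconstruction term: -E_q[log p(x|w,z)], Gaussian likelihood up to a constant
    recon = F.mse_loss(x_recon, x, reduction="sum")
    # Property term: -E_q[log p(y|w)], again a Gaussian-likelihood surrogate
    prop = F.mse_loss(y_pred, y, reduction="sum")
    # Analytic KL(q(w,z|x) || N(0, I)) for a diagonal-Gaussian posterior
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + prop + kl
```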
Meanwhile, since the objective function in Eq. (3) does not enforce our assumptions that $z$ is independent of $w$ and $y$, and that the values in $w$ are mutually independent, we decompose the KL divergence in Eq. (3) and penalize the terms:
$$\mathcal{L}_2 = \rho_1 \cdot D_{KL}\big(q(z, w)\,\|\,q(z)q(w)\big) + \rho_2 \cdot D_{KL}\Big(q(w)\,\Big\|\,\prod_i q(w_i)\Big), \tag{4}$$
where $\rho_1$ and $\rho_2$ are coefficient hyper-parameters that weight the two penalty terms. Details of the proof and derivation of the overall objective can be found in Appendix A.
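Both terms in Eq. (4) are total-correlation-style divergences and are intractable to evaluate exactly. One possible estimator (a β-TC-VAE-style minibatch-weighted approximation, not prescribed by our derivation) is sketched below for the second term, $D_{KL}(q(w)\,\|\,\prod_i q(w_i))$; the first term can be estimated analogously by treating the $z$ and $w$ blocks as two groups:

```python
import math
import torch

def log_gauss(w, mu, logvar):
    # Elementwise log N(w; mu, diag(exp(logvar)))
    return -0.5 * (math.log(2 * math.pi) + logvar + (w - mu) ** 2 / logvar.exp())

def total_correlation(w, mu, logvar):
    # Minibatch-weighted estimate of KL(q(w) || prod_i q(w_i)).
    # w: (n, d) posterior samples; mu, logvar: (n, d) posterior parameters.
    n = w.size(0)
    # log q(w_i | x_j) for every sample/posterior pair: shape (n, n, d)
    log_qw_pairs = log_gauss(w.unsqueeze(1), mu.unsqueeze(0), logvar.unsqueeze(0))
    log_qw = torch.logsumexp(log_qw_pairs.sum(-1), dim=1) - math.log(n)
    log_qw_marginals = (torch.logsumexp(log_qw_pairs, dim=1) - math.log(n)).sum(-1)
    return (log_qw - log_qw_marginals).mean()
```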
4.1.2 Relating the properties and latent variables
To model the dependence between the correlated properties and the associated latent variables $p(y|w)$ in Eq. (3), as well as to capture the correlation among properties, we propose to directly learn the specific relationship between the disentangled latent variables in $w$ and the properties $y$. The correlations among $y$ are also captured. Specifically, we design a mask pooling layer realized by a mask matrix $M \in \{0,1\}^{l \times m}$, where $l$ is the dimension of the latent vector $w$ and $m$ is the number of properties. $M$ captures how $w$ relates to $y$: $M_{i,j} = 1$ denotes that $w_i$ relates to the $j$-th property $y_j$; otherwise there is no relation. In this way, two properties that relate to the same variable in $w$ can be regarded as correlated. The binary elements in $M$ are trained with the Gumbel-Softmax function. In implementation, the $L_1$ norm of the mask matrix is also added to the objective to encourage the sparsity of $M$.
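A minimal sketch of such a mask pooling layer, assuming PyTorch's standard Gumbel-Softmax with hard (straight-through) sampling; parameterizing each entry of $M$ with a pair of "off"/"on" logits is one concrete choice, not the only possible one:

```python
import torch
import torch.nn.functional as F

class MaskPooling(torch.nn.Module):
    # Learnable binary mask M in {0,1}^{l x m} with an L1 sparsity penalty.
    def __init__(self, l, m, tau=1.0):
        super().__init__()
        # Two logits per entry of M: index 0 = "off", index 1 = "on"
        self.logits = torch.nn.Parameter(torch.zeros(l, m, 2))
        self.tau = tau

    def forward(self):
        # hard=True yields discrete {0,1} samples with straight-through gradients
        sample = F.gumbel_softmax(self.logits, tau=self.tau, hard=True)
        M = sample[..., 1]          # (l, m) binary mask
        sparsity = M.sum()          # L1 norm of M (entries are non-negative)
        return M, sparsity
```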
Next, given the learned mask matrix $M$, we model the mapping from $w$ to $y$. For the properties $y$, we can calculate the corresponding $w'$ that aggregates the values in $w$ contributing to each property as $wJ^T \odot M$, each column of which corresponds to the related latent variables in $w$ to be aggregated to predict the corresponding $y$. For each property $y_j$ in $y$, we aggregate all the information from its related latent variable set in $w$ into the next-level latent variable $w'_j$ (i.e., the $j$-th variable of $w'$) via an aggregation function $h$:
$$w' = h(wJ^T \odot M; \beta), \tag{5}$$
where $J$ is a vector with all values equal to one, $\odot$ represents element-wise multiplication, and $\beta$ is the parameter of $h$. Then the property $y$ can be predicted using $w'$ as:
$$y = f(w'; \gamma), \tag{6}$$
where $f$ is the set of prediction functions with $w' = h(wJ^T \odot M; \beta)$ as the input, and $\gamma$ is the set of parameters, which will be further explained in the next section. Thus, we have built a one-to-one mapping between $w'$ and $y$. In addition, the correlation of $y_i$ and $y_j$ can be recovered if $M_{\cdot i}^T \cdot M_{\cdot j} \neq 0$.
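A sketch of Eqs. (5)-(6), assuming a simple sum as the aggregation function $h$ (the learnable $h$ with parameters $\beta$ is one generalization of this); broadcasting implements $wJ^T \odot M$ without materializing $J$:

```python
import torch

def aggregate(w, M):
    # w: (batch, l) latent vectors; M: (l, m) binary mask matrix.
    # w.unsqueeze(-1) * M broadcasts to (batch, l, m), i.e. (w J^T) ⊙ M:
    # column j keeps only the latent variables related to property y_j.
    masked = w.unsqueeze(-1) * M
    # Sum over the l latent dimensions as a minimal choice of h -> w' in (batch, m)
    return masked.sum(dim=1)

# Each bridging variable w'_j then feeds the per-property predictor,
# y = f(w'; gamma), realized by the invertible map of Section 4.1.3.
```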
4.1.3 Invertible constraint for multiple-property control
As stated in the problem formulation, our proposed model aims to generate a data point $x$ that satisfies the given property value requirements. The most straightforward way to do this is to separately model both directions of the mutual dependence between each $y_i$ and its relevant latent variable set $w'_i$. However, this can incur double errors in the two-way mapping, since there exists a complex correlation among the properties in $y$ and there are many cases where $M_{\cdot i}^T \cdot M_{\cdot j} \neq 0$. To address this, we propose an invertible function that mathematically ensures the exact recovery of the bridging variables $w'$ given a group of desired properties $y$, based on the following deduction.
As in Eq. (6), the set of correlated properties $y = \{y_1, y_2, \ldots, y_m\}$ are correlated with the set of latent variables $w' = \{w'_1, w'_2, \ldots, w'_m\}$ in a one-to-one mapping fashion. Thus, we assume that $y$ can be sampled from a multivariate Gaussian given $w'$ as follows:
$$p(y|w') = \mathcal{N}\big(y\,|\,f(w'; \gamma), \Sigma\big); \quad y = (y_1, y_2, \ldots, y_m),\ w' = \{w'_1, w'_2, \ldots, w'_m\},\ \Sigma \in \mathbb{R}^{m \times m},$$
$$\text{s.t.}\ f(w'; \gamma)[j] = \bar{f}(w'; \gamma)[j] + w'_j, \quad \mathrm{Lip}\big(\bar{f}(w'; \gamma)[j]\big) < 1 \ \text{if}\ \|W_k\|_2 < 1, \quad j = 1, \ldots, m, \tag{7}$$
where $\mathrm{Lip}$ denotes the Lipschitz constant. Namely, to precisely control the properties $y$, we learn a set of invertible functions $f(w'; \gamma)$, indicated in Eq. (6), to model $p_\gamma(y|w')$; $\gamma$ is the set of parameters in Eq. (6). The constraint enforces $f(w'; \gamma)[j]$ to be an invertible function to achieve mutual dependence between $y_j$ and $w'_j$ [2]. As a result, we have the third term of the objective function:
$$\mathcal{L}_3 = -\mathbb{E}_{w' \sim p(w')}\big[\mathcal{N}(y\,|\,f(w'; \gamma), \Sigma)\big] + \big\|\mathrm{Lip}\big(\bar{f}(w'; \gamma)[j]\big) - 1\big\|^2. \tag{8}$$
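One standard way to realize the constraint in Eq. (7) is an invertible-residual construction in the spirit of [2]: spectral normalization bounds each $\|W_k\|_2$, a scaling coefficient below one makes $\bar{f}$ contractive, and the inverse $w' = f^{-1}(y)$ is then recovered by fixed-point iteration. The sketch below is illustrative; the hidden width, activation, and iteration count are assumptions:

```python
import torch

class InvertibleProperty(torch.nn.Module):
    # y = coeff * f_bar(w') + w', with Lip(coeff * f_bar) < 1.
    def __init__(self, m, hidden=64, coeff=0.97):
        super().__init__()
        sn = torch.nn.utils.spectral_norm
        # Spectral norm keeps each ||W_k||_2 near 1; ELU is 1-Lipschitz,
        # so scaling by coeff < 1 makes the residual branch a contraction.
        self.f_bar = torch.nn.Sequential(
            sn(torch.nn.Linear(m, hidden)),
            torch.nn.ELU(),
            sn(torch.nn.Linear(hidden, m)),
        )
        self.coeff = coeff

    def forward(self, w_prime):
        return self.coeff * self.f_bar(w_prime) + w_prime

    def inverse(self, y, iters=50):
        # Banach fixed-point iteration w' <- y - coeff * f_bar(w'); it converges
        # because the residual branch is contractive.
        w_prime = y.clone()
        for _ in range(iters):
            w_prime = y - self.coeff * self.f_bar(w_prime)
        return w_prime
```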