Modular Flows: Differential Molecular Generation
Yogesh Verma, Samuel Kaski, Markus Heinonen
Aalto University
{yogesh.verma, samuel.kaski, markus.heinonen}@aalto.fi
Vikas Garg
YaiYai Ltd and Aalto University
vgarg@csail.mit.edu; vikas@yaiyai.fi
Abstract
Generating new molecules is fundamental to advancing critical applications such
as drug discovery and material synthesis. Flows can generate molecules effectively
by inverting the encoding process; however, existing flow models either require
artifactual dequantization or specific node/edge orderings, lack desiderata such
as permutation invariance, or induce discrepancy between the encoding and the
decoding steps that necessitates post hoc validity correction. We circumvent these
issues with novel continuous normalizing E(3)-equivariant flows, based on a system
of node ODEs coupled as a graph PDE, that repeatedly reconcile locally toward
globally aligned densities. Our models can be cast as message passing temporal
networks, and result in superlative performance on the tasks of density estimation
and molecular generation. In particular, our generated samples achieve state of the
art on both the standard QM9 and ZINC250K benchmarks.
1 Introduction
Figure 1: A toy illustration of ModFlow in action with a two-node graph. The two local flows, $z_1$ and $z_2$, co-evolve toward a more complex joint density, both driven by the same differential $f$.
Generative models have rapidly become ubiquitous in
machine learning with advances from image synthesis
(Ramesh et al., 2022) to protein design (Ingraham et al.,
2019). Molecular generation (Stokes et al., 2020) has
also received significant attention owing to its promise
for discovering new drugs and materials. Searching for
valid molecules in prohibitively large discrete spaces is,
however, challenging: estimates for drug-like structures
range between $10^{23}$ and $10^{60}$, but only a tiny fraction, on the order of $10^{8}$, has been synthesized (Polishchuk et al., 2013; Merz et al., 2020). Thus, learning representations that exploit appropriate molecular inductive
biases (e.g., spatial correlations) becomes crucial.
Earlier models focused on generating sequences based
on the SMILES notation (Weininger, 1988) used in chemistry to describe molecular structures as strings. However, they were supplanted by generative models that capture valuable spatial information
such as bond strengths and dihedral angles, e.g., by
embedding molecular graphs via graph neural networks (GNNs) (Scarselli et al., 2009; Garg et al., 2020). Such models primarily include variants
of Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Normalizing
Flows (Dinh et al., 2014, 2016). Besides known issues with their training, GANs (Goodfellow et al.,
2014; Maziarka et al., 2020) suffer from the well-documented problem of mode collapse, thereby
generating molecules that lack diversity. VAEs (Kingma and Welling, 2013; Lim et al., 2018; Jin
et al., 2018), on the other hand, are susceptible to a distributional shift between the training data and
the generated samples. Moreover, optimizing for likelihood via a surrogate lower bound is likely
insufficient to capture the complex dependencies inherent in the molecules.
Flows are especially appealing since, in principle, they enable estimating (and sampling from)
complex data distributions using a sequence of invertible transformations on samples from a more
tractable continuous distribution. Molecules are discrete, so many flow models (Madhawa et al.,
2019; Honda et al., 2019; Shi et al., 2020) add noise during encoding and later apply a dequantization
procedure. However, dequantization begets distortion and issues related to convergence (Luo et al.,
2021). Moreover, many methods segregate the generation of atoms from bonds, so the decoded
structure is often not a valid molecule and requires post hoc correction to ensure validity (Zang
and Wang, 2020), effecting a discrepancy between the encoding and decoding distributions.
Permutation dependence is another undesirable artifact of these methods. Some alternatives have
been explored to avoid dequantization; e.g., Lippe and Gavves (2021) encode molecules in a continuous latent space via variational inference and jointly optimize a flow model for generation.
Discrete graph flows (Luo et al., 2021) also circumvent the many pitfalls of dequantization by
resorting to discrete latent variables, and performing validity checks during the generative process.
However, discrete flows follow an autoregressive procedure that requires a specific ordering of nodes
and edges during training. In general, one-shot methods can generate much faster than discrete flows.
We offer a different flow-based perspective tailored to molecules. Specifically, we suggest coupled
continuous normalizing E(3)-equivariant flows that bestow generative capabilities from neural partial
differential equation (PDE) models on graphs. Graph PDEs have been known to enable designing
new embedding methods such as variants of GNNs (Chamberlain et al., 2021), extending GNNs
to continuous layers as Neural ODEs (Poli et al., 2019), and accommodating spatial information
(Iakovlev et al., 2020). We instead seek to bring to the fore their efficacy and elegance as tools to
help generate complex objects, such as molecules, viewed as outcomes resulting from an interplay of
co-adapting latent trajectories (i.e., underlying dynamics). Concretely, a flow is associated with each
node of the graph, and these flows are conjoined as a joint ODE system conditioned on neighboring
nodes. While these flows originate independently as samples from simple distributions, they adjust
progressively toward more complex joint distributions as they repeatedly interact with the neighboring
flows. We view molecules as samples generated from the globally aligned distributions obtained after
many such local feedback iterations. We call the proposed method Modular Flows (ModFlows) to
underscore that each node can be regarded as a module that coordinates with other modules. Table 1
summarizes the capabilities of ModFlow compared to some previous generative works.
Contributions. We propose to learn continuous-time, flow-based generative models, grounded on graph PDEs, for generating molecules without resorting to any validity correction. In particular:
• we propose ModFlow, a novel generative model based on coupled continuous normalizing E(3)-equivariant flows. ModFlow encapsulates essential inductive bias using PDEs, and defines multiple flows that interact locally toward a globally consistent joint density;
Table 1: A comparison of generative modeling approaches for molecules.

Method     One-shot  Modular  Invertible  Continuous-time  Reference
JT-VAE     ✓         ✓        ✗           ✗                Jin et al. (2018)
MRNN       ✗         ✗        ✗           ✗                Popova et al. (2019)
GraphAF    ✗         ✗        ✓           ✗                Shi et al. (2020)
GraphDF    ✗         ✗        ✓           ✗                Luo et al. (2021)
MoFlow     ✓         ✗        ✓           ✗                Zang and Wang (2020)
GraphNVP   ✓         ✗        ✓           ✗                Madhawa et al. (2019)
ModFlow    ✓         ✓        ✓           ✓                this work
Figure 2: A demonstration of modular flow generation. The initial Gaussian distributions $\mathcal{N}(0, I)$ evolve into complex densities $z(T)$ under $f$ and are subsequently translated into probabilities and labels.
• we encode permutation, translation, rotation, and reflection equivariance with E(3)-equivariant GNNs adapted to molecular generation, and can leverage 3D geometric information;
• ModFlow is end-to-end trainable, non-autoregressive, and obviates the need for any external validity checks or correction;
• empirically, ModFlow achieves state-of-the-art performance on both the standard QM9 (Ramakrishnan et al., 2014) and ZINC250K (Irwin et al., 2012) benchmarks.
2 Related works
Generative models.
Earlier attempts at molecule generation (Kusner et al., 2017; Dai et al., 2018) represented molecules as SMILES strings (Weininger, 1988) and developed sequence
generation models. A challenge for these approaches is to learn complicated grammar rules that can
generate syntactically valid sequences of molecules. Recently, representing molecules as graphs has
inspired new deep generative models for molecular generation (Segler et al., 2018; Samanta et al.,
2018; Neil et al., 2018), ranging from VAEs (Jin et al., 2018; Kajino, 2019) to flows (Madhawa et al.,
2019; Luo et al., 2021; Shi et al., 2020). The core idea is to learn to encode molecular graphs into
a latent space, and subsequently decode samples from this latent space to generate new molecules
(Atwood and Towsley, 2016; Xhonneux et al., 2020; You et al., 2018).
Graph partial differential equations.
Graph PDEs constitute an emerging area that studies PDEs on structured data encoded as graphs. For instance, one can define a PDE on a graph to track the evolution
of signals defined over the graph nodes under some dynamics. Graph PDEs have enabled, among
others, design of new graph neural networks; see, e.g., works such as GNODE (Poli et al., 2019),
NeuralPDE (Iakovlev et al., 2020), Neural operator (Li et al., 2020), GRAND (Chamberlain et al.,
2021), and PDE-GCN (Eliasof et al., 2021). Different from all these works, we focus on using PDEs
for generative modeling of molecules (as graph-structured objects). Interestingly, the ModFlow model proposed
in this work may be viewed as a new equivariant temporal graph network (Rossi et al., 2020; Souza
et al., 2022).
Validity oracles.
A key challenge for molecular generative models is to generate valid molecules according to various criteria for molecular validity or feasibility. It is common practice
to call on external chemical software as rejection oracles to reduce or exclude invalid molecules, or
do validity checks as part of autoregressive generation (Luo et al., 2021; Shi et al., 2020; Popova
et al., 2019). An important open question has been whether generative models can learn to achieve
high generative validity intrinsically, i.e., without being aided by oracles or resorting to additional
checks. ModFlow takes a major step toward that goal.
3 Modular Flows
We focus on unsupervised learning of an underlying graph density $p(G)$ using a dataset of observed molecular graphs $\mathcal{D} = \{G_n\}_{n=1}^{N}$. We learn a generative flow model $p_\theta(G)$ specified by flow parameters $\theta$, and use it to sample novel high-probability molecules.
3.1 Molecular Representation
Graph representation. We represent each molecular graph $G = (V, E)$ of size $M$ as a tuple of vertices $V = (v_1, \ldots, v_M)$ and edges $E \subseteq V \times V$. Each vertex takes a value from an alphabet of atoms, $v \in \mathcal{A} = \{\mathrm{C}, \mathrm{H}, \mathrm{N}, \mathrm{O}, \mathrm{P}, \mathrm{S}, \ldots\}$, while each edge $e \in \mathcal{B} = \{1, 2, 3\}$ abstracts a bond type (i.e., single, double, or triple). We assume that, conditioned on the edges, the graph likelihood factorizes as a product of categorical distributions over vertices given their latent representations:

$$p(G) := p(V \mid E, \{z\}) = \prod_{i=1}^{M} \mathrm{Cat}\big(v_i \mid \sigma(z_i)\big), \qquad (1)$$

where $z_i = (z_{iC}, z_{iH}, \ldots) \in \mathbb{R}^{|\mathcal{A}|}$ is a set of atom scores for node $i$ such that $z_{ik} \in \mathbb{R}$ pertains to type $k \in \mathcal{A}$, and $\sigma$ is the softmax function

$$\sigma(z_i)_k = \frac{\exp(z_{ik})}{\sum_{k'} \exp(z_{ik'})}, \qquad (2)$$

which turns the real-valued scores $z_i$ into normalized probabilities. ModFlow also supports 3D molecular graphs that contain atomic coordinates and angles as additional information.
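To make Eqs. (1)-(2) concrete, the log-likelihood of a labelled graph is simply a sum of per-node log-softmax terms. Below is a minimal NumPy sketch; the function name and array layout are illustrative, not taken from the paper's code.

```python
import numpy as np

def graph_log_likelihood(scores, labels):
    """Log-likelihood of vertex labels under Eqs. (1)-(2).

    scores: (M, |A|) array of real-valued atom scores z_i.
    labels: (M,) integer array of atom-type indices v_i.
    """
    # Numerically stable softmax over atom types (Eq. 2).
    shifted = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    # The product of categoricals in Eq. (1) becomes a sum in log space.
    return np.log(probs[np.arange(len(labels)), labels]).sum()
```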
Tree representations. We can obtain an alternative representation for molecules: we can decompose each molecule into a tree-like structure by contracting certain vertices into a single node (denoted as a cluster) such that the molecular graph becomes acyclic. Following Jin et al. (2018), we restrict these clusters to ring substructures present in the molecular data, in addition to the atom alphabet. Thus, we obtain an extended alphabet $\mathcal{A}_{\mathrm{tree}} = \mathcal{A} \cup \{C_1, C_2, \ldots\}$, where each cluster label $C_r$ corresponds to some ring substructure in the label vocabulary $\chi$. We then reduce the vocabulary to the 30 most commonly occurring substructures of $\mathcal{A}_{\mathrm{tree}}$. For further details, see Appendix A.2.
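The vocabulary construction is detailed in Appendix A.2; as a hedged illustration of the idea, ring substructures could be collected and ranked with RDKit roughly as follows. Here ring_vocabulary is a hypothetical helper, and the authors' exact decomposition may differ.

```python
from collections import Counter
from rdkit import Chem

def ring_vocabulary(smiles_list, top_k=30):
    """Rank ring substructures by frequency across a molecular dataset."""
    counts = Counter()
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # skip unparseable SMILES
        for ring in mol.GetRingInfo().AtomRings():
            # Canonical SMILES of the ring fragment serves as the cluster label C_r.
            counts[Chem.MolFragmentToSmiles(mol, atomsToUse=list(ring))] += 1
    return [fragment for fragment, _ in counts.most_common(top_k)]
```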
3.2 Differential modular flows
Normalizing flows (Kobyzev et al., 2021) provide a general recipe for constructing flexible probability
distributions, used in density estimation (Cramer et al., 2021; Huang et al., 2018) and generative
modeling (Zhen et al., 2020; Zang and Wang, 2020). We propose to model the atom scores $z_i(t)$ as a Continuous-time Normalizing Flow (CNF) (Grathwohl et al., 2018) over time $t \in \mathbb{R}_+$. We assume the initial scores at time $t = 0$ follow an uninformative Gaussian base distribution $z_i(0) \sim \mathcal{N}(0, I)$ for each node $i$. Node scores evolve in parallel over time according to the differential equation

$$\dot{z}_i(t) := \frac{\partial z_i(t)}{\partial t} = f_\theta\big(t, z_i(t), z_{\mathcal{N}_i}(t), x_i, x_{\mathcal{N}_i}\big), \quad i \in \{1, \ldots, M\}, \qquad (3)$$

where $\mathcal{N}_i = \{j : (i, j) \in E\}$ is the set of neighbors of node $i$ and $z_{\mathcal{N}_i}(t) = \{z_j(t) : j \in \mathcal{N}_i\}$ the scores of the neighbors at time $t$; $x_i$ and $x_{\mathcal{N}_i}$ denote, respectively, the positional (2D/3D) information of $i$ and its neighbors; and $\theta$ denotes the parameters of the flow function $f$ to be learned. Stacking together all node differentials, we obtain a modular system of coupled ODEs:

$$\dot{z}(t) = \begin{pmatrix} \dot{z}_1(t) \\ \vdots \\ \dot{z}_M(t) \end{pmatrix} = \begin{pmatrix} f_\theta\big(t, z_1(t), z_{\mathcal{N}_1}(t), x_1, x_{\mathcal{N}_1}\big) \\ \vdots \\ f_\theta\big(t, z_M(t), z_{\mathcal{N}_M}(t), x_M, x_{\mathcal{N}_M}\big) \end{pmatrix}, \qquad (4)$$

$$z(T) = z(0) + \int_0^T \dot{z}(t)\, dt. \qquad (5)$$

This coupled system of ODEs may be viewed as a graph PDE (Iakovlev et al., 2020; Chamberlain et al., 2021), where the evolution of each node depends only on its neighbors.
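As a minimal sketch of the forward pass in Eqs. (3)-(5), fixed-step Euler integration of the coupled system looks as follows; f_theta stands for any neighborhood-respecting drift network (e.g., a message-passing GNN), and an actual implementation would additionally track the change-of-variables log-density term and typically use an adaptive ODE solver rather than fixed-step Euler.

```python
import torch

def integrate_flow(z0, x, edges, f_theta, T=1.0, steps=100):
    """Forward-Euler integration of the coupled node ODEs (Eqs. 3-5).

    z0      : (M, |A|) initial scores sampled from N(0, I).
    x       : (M, d) positional (2D/3D) information per node.
    edges   : (2, E) tensor of edge indices defining the neighborhoods N_i.
    f_theta : callable (t, z, x, edges) -> dz/dt of shape (M, |A|), which
              should mix each node's state only with its neighbors' states.
    """
    dt = T / steps
    z = z0
    for step in range(steps):
        t = step * dt
        z = z + dt * f_theta(t, z, x, edges)  # one Euler step of Eq. (5)
    return z  # z(T); the softmax of Eq. (2) then yields atom probabilities
```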