
4. Related Work
Evolutionary Algorithms (EA) Traditionally, evolutionary algorithms such as NSGA-II have been widely used for various multi-objective optimization problems (Ehrgott, 2005; Konak et al., 2006; Blank & Deb, 2020). More recently, Miret et al. (2022) incorporated graph neural networks into evolutionary algorithms, enabling them to tackle large combinatorial spaces. Unlike MOGFNs, evolutionary algorithms must solve each MOO instance from scratch, rather than amortizing computation during training so that solutions can be generated quickly at run-time. Evolutionary algorithms can, however, be augmented with MOGFNs for generating mutations to improve efficiency, as in Section 3.2.
Multi-Objective Reinforcement Learning MOO problems have also received significant interest in the RL literature (Hayes et al., 2022). Traditional approaches broadly consist of learning sets of Pareto-dominant policies (Roijers et al., 2013; Van Moffaert & Nowé, 2014; Reymond et al., 2022). Recent work has focused on extending deep RL algorithms to multi-objective settings, e.g., Envelope-MOQ (Yang et al., 2019), MO-MPO (Abdolmaleki et al., 2020; 2021), and MOReinforce (Lin et al., 2021). A general shortcoming of RL-based approaches is that their objective focuses on discovering a single mode of the reward function, so they rarely generate diverse candidates, an issue that persists in the multi-objective setting. In contrast, MOGFNs sample candidates with probability proportional to the reward, which implicitly results in diverse candidates.
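To make this contrast concrete, the following is a minimal sketch (in Python, on a toy discrete candidate set with a hypothetical scalar reward R whose values are purely illustrative and not taken from the paper) of how sampling candidates proportionally to the reward spreads probability mass over all high-reward modes, whereas a reward-maximizing policy collapses onto a single one.

import numpy as np

# Toy candidate set with two well-separated high-reward modes and a
# hypothetical reward function R (illustrative values, not from the paper).
candidates = np.arange(10)
R = np.array([0.1, 0.2, 5.0, 5.1, 0.1, 0.1, 4.9, 5.2, 0.2, 0.1])

# Reward-maximizing selection (the typical RL objective): a single mode.
greedy_choice = candidates[np.argmax(R)]

# GFlowNet-style sampling: candidates drawn with probability R(x) / Z,
# so both high-reward modes are visited, which yields diverse candidates.
rng = np.random.default_rng(0)
samples = rng.choice(candidates, size=1000, p=R / R.sum())
high_reward_modes_hit = np.unique(samples[R[samples] > 4.0])
print(greedy_choice, high_reward_modes_hit)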
Multi-Objective Bayesian Optimization (MOBO) Bayesian optimization (BO) has been used for MOO when the objectives are expensive to evaluate and sample efficiency is a key consideration. MOBO approaches learn a surrogate model of the true objective functions, which is used to define an acquisition function such as expected hypervolume improvement (Emmerich et al., 2011; Daulton et al., 2020; 2021) or max-value entropy search (Belakaria et al., 2019); scalarization-based approaches have also been proposed (Paria et al., 2020; Zhang & Golovin, 2020). Abdolshah et al. (2019) and Lin et al. (2022) study the MOBO problem in settings with preferences over the different objectives. Stanton et al. (2022) proposed LaMBO, which uses language models in conjunction with BO for multi-objective sequence design problems. While recent work (Konakovic Lukovic et al., 2020; Maus et al., 2022) studies the generation of diverse candidates in the context of MOBO, it is limited to local optimization near Pareto-optimal candidates in low-dimensional continuous problems. The key drawbacks of MOBO approaches are thus that they typically do not account for diversity in the generated candidates and that they mainly target continuous, low-dimensional state spaces. As we discuss in Section 3.2, MOBO approaches can be augmented with GFlowNets for diverse candidate generation in discrete spaces.
Other Approaches Zhao et al. (2022) introduced LaMOO, which tackles MOO by iteratively splitting the candidate space into smaller regions, whereas Daulton et al. (2022) introduced MORBO, which performs BO in parallel on multiple local regions of the candidate space. Both of these methods, however, are limited to continuous candidate spaces.
5. Empirical Results
In this section, we present our empirical findings across a wide range of tasks, from sequence design to molecule generation. Through our experiments, we aim to answer the following questions:
Q1 Can MOGFNs model the preference-conditional reward distribution?
Q2 Can MOGFNs sample Pareto-optimal candidates?
Q3 Are candidates sampled by MOGFNs diverse?
Q4 Do MOGFNs scale to high-dimensional problems relevant in practice?
We obtain positive experimental evidence for Q1-Q4.
Metrics: We rely on standard MOO metrics such as the Hypervolume (HV) and R2 indicators, as well as the Generational Distance+ (GD+). To measure diversity we use the Top-K Diversity and Top-K Reward metrics of Bengio et al. (2021a). We detail all metrics in Appendix C. For all our empirical evaluations we follow the same protocol. First, we sample a set of preferences which are fixed for all the methods. For each preference we sample 128 candidates, from which we pick the top 10, compute their scalarized reward and diversity, and report the averages over preferences. We then use these samples to compute the HV and R2 indicators. We pick the best hyperparameters for all methods based on the HV and report the mean and standard deviation over 3 seeds for all quantities.
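For concreteness, a minimal sketch of this evaluation loop is given below (in Python). The sampler sample_candidates, the objective functions in objectives, and the pairwise distance dist are hypothetical placeholders for the task-specific components, and the weighted-sum scalarization shown is only one possible scalarization choice.

import itertools
import numpy as np

def evaluate(sample_candidates, objectives, dist, preferences, n=128, k=10):
    # For each fixed preference w: sample n candidates, keep the top k by
    # scalarized reward, and record the Top-K reward and Top-K diversity.
    top_k_rewards, top_k_divs, fronts = [], [], []
    for w in preferences:                      # preference weights, summing to 1
        xs = sample_candidates(w, n)           # n candidates from the conditional sampler
        scores = np.array([[f(x) for f in objectives] for x in xs])
        scalarized = scores @ w                # weighted-sum scalarization (one option)
        top = np.argsort(scalarized)[-k:]      # indices of the top-k candidates
        top_k_rewards.append(scalarized[top].mean())
        top_k_divs.append(np.mean([dist(xs[i], xs[j])
                                   for i, j in itertools.combinations(top, 2)]))
        fronts.append(scores[top])             # kept to compute HV and R2 afterwards
    # Averages over preferences; the HV and R2 indicators are then computed
    # on the pooled objective vectors returned in the third output.
    return np.mean(top_k_rewards), np.mean(top_k_divs), np.concatenate(fronts)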
Baselines: We consider the closely related MOReinforce (Lin et al., 2021) as a baseline. We also study its variants MOSoftQL and MOA2C, which use Soft Q-Learning (Haarnoja et al., 2017) and A2C (Mnih et al., 2016), respectively, in place of REINFORCE. We additionally compare against Envelope-MOQ (Yang et al., 2019), another popular multi-objective reinforcement learning method. For fragment-based molecule generation we consider an additional baseline, MARS (Xie et al., 2021), a relevant MCMC
approach for this task. Notably, we do not consider baselines such as LaMOO (Zhao et al., 2022) and MORBO (Daulton et al., 2022), since, as discussed in Section 4, they are limited to continuous candidate spaces.