Multi-Objective GFlowNets

Moksh Jain 1,2, Sharath Chandra Raparthy 1,2,*, Alex Hernandez-Garcia 1,2, Jarrid Rector-Brooks 1,2, Yoshua Bengio 1,2,3, Santiago Miret 4, Emmanuel Bengio 5
Abstract
We study the problem of generating diverse candidates in the context of Multi-Objective Optimization. In many applications of machine learning such as drug discovery and material design, the goal is to generate candidates which simultaneously optimize a set of potentially conflicting objectives. Moreover, these objectives are often imperfect evaluations of some underlying property of interest, making it important to generate diverse candidates so as to have multiple options for expensive downstream evaluations. We propose Multi-Objective GFlowNets (MOGFNs), a novel method for generating diverse Pareto optimal solutions, based on GFlowNets. We introduce two variants of MOGFNs: MOGFN-PC, which models a family of independent sub-problems defined by a scalarization function with reward-conditional GFlowNets, and MOGFN-AL, which solves a sequence of sub-problems defined by an acquisition function in an active learning loop. Our experiments on a wide variety of synthetic and benchmark tasks demonstrate the advantages of the proposed methods in terms of Pareto performance and, importantly, improved candidate diversity, which is the main contribution of this work.
1. Introduction
Decision making in practical applications usually involves reasoning about multiple, often conflicting, objectives (Keeney et al., 1993). Consider the example of in-silico drug discovery, where the goal is to generate novel drug-like molecules that effectively inhibit a target, are easy to synthesize, and possess a safety profile for human use (Dara et al., 2021). These objectives often exhibit mutual incompatibility, as molecules that are effective against a target may also have detrimental effects on humans, making it infeasible to find a single molecule that maximizes all the objectives simultaneously. Instead, the goal in these Multi-Objective Optimization (MOO; Ehrgott, 2005; Miettinen, 2012) problems is to identify candidate molecules that are Pareto optimal, covering the best possible trade-offs between the objectives.

*Work done during an internship at Recursion. 1 Université de Montréal, 2 Mila - Quebec AI Institute, 3 CIFAR Fellow & IVADO, 4 Intel Labs, 5 Recursion. Correspondence to: Moksh Jain <mokshjn00@gmail.com>. Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
A less appreciated aspect of multi-objective problems is that the objectives to optimize are usually underspecified proxies which only approximate the true design objectives. For instance, the binding affinity of a molecule to a target is an imperfect approximation of the molecule's inhibitory effect against the target in the human body. In such scenarios it is important to not only cover the Pareto front but also to generate sets of diverse candidates for each Pareto optimal solution, in order to increase the likelihood of success of the generated candidates in expensive downstream evaluations, such as in-vivo tests and clinical trials (Jain et al., 2022).

The benefits of generating diverse candidates are twofold. First, by diversifying the set of candidates we obtain an advantage similar to that of Bayesian ensembles: we reduce the risk of failure that might occur due to the imperfect generalization of learned proxy models. Diverse candidates should lie in different regions of the input-space manifold where the objective of interest might be large (considering the uncertainty in the output of the proxy model). Second, experimental measurements such as in-vitro assays may not reflect the ultimate objectives of interest, such as efficacy in the human body. Multiple candidates may have the same assay score but different downstream efficacy, so diversity in candidates increases the odds of success. Existing approaches for MOO overlook this aspect of diversity and instead focus primarily on generating Pareto optimal solutions.
Generative Flow Networks (GFlowNets; Bengio et al., 2021a;b) are a recently proposed family of probabilistic models which tackle the problem of diverse candidate generation. Contrary to the reward-maximization view of prevalent Reinforcement Learning (RL) and Bayesian optimization (BO) approaches, GFlowNets sample candidates with probability proportional to their reward. Sampling candidates, as opposed to greedily generating them, implicitly encourages diversity in the generated candidates. GFlowNets
have shown promising results in single-objective problems such as molecule generation (Bengio et al., 2021a) and biological sequence design (Jain et al., 2022).
In this paper, we propose Multi-Objective GFlowNets (MOGFNs), which leverage the strengths of GFlowNets and existing MOO approaches to enable the generation of diverse Pareto optimal candidates. We consider two variants of MOGFNs: (a) Preference-Conditional GFlowNets (MOGFN-PC), which leverage the decomposition of MOO into single-objective sub-problems through scalarization, and (b) MOGFN-AL, which leverages the transformation of MOO into a sequence of single-objective sub-problems within the framework of multi-objective Bayesian optimization. Our contributions are as follows:
C1. We introduce a novel framework of MOGFNs to tackle the practically significant and previously unexplored problem of diverse candidate generation in MOO.

C2. Through experiments on challenging molecule generation and sequence generation tasks, we demonstrate that MOGFN-PC generates diverse Pareto-optimal candidates. This is the first successful application and empirical validation of reward-conditional GFlowNets (Bengio et al., 2021b).

C3. In a challenging active learning task for designing fluorescent proteins, we show that MOGFN-AL results in significant improvements to sample efficiency as well as to the diversity of generated candidates.

C4. We perform a thorough analysis of the key components of MOGFNs to provide insights into design choices that affect performance.
2. Background
2.1. Multi-Objective Optimization
Multi-objective Optimization (MOO) involves finding a set of feasible candidates $x \in \mathcal{X}$ which simultaneously maximize $d$ objectives $R(x) = [R_1(x), \dots, R_d(x)]$:

$$\max_{x \in \mathcal{X}} R(x). \qquad (1)$$

When these objectives are conflicting, there is no single $x$ which simultaneously maximizes all objectives. Consequently, MOO adopts the concept of Pareto optimality, which describes a set of solutions that provide optimal trade-offs among the objectives.
Given $x_1, x_2 \in \mathcal{X}$, $x_1$ is said to dominate $x_2$, written $x_1 \succ x_2$, iff $R_i(x_1) \geq R_i(x_2)\ \forall i \in \{1, \dots, d\}$ and $\exists k \in \{1, \dots, d\}$ such that $R_k(x_1) > R_k(x_2)$. A candidate $x^*$ is Pareto-optimal if there exists no other solution $x' \in \mathcal{X}$ which dominates $x^*$. In other words, for a Pareto-optimal candidate it is impossible to improve one objective without sacrificing another. The Pareto set is the set of all Pareto-optimal candidates in $\mathcal{X}$, and the Pareto front is defined as the image of the Pareto set in objective space.
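To make the dominance relation concrete, the following sketch (a hypothetical NumPy helper, not part of the method described in the paper) checks dominance between two objective vectors and extracts the non-dominated subset of a batch of candidates, assuming all objectives are maximized as in Equation (1).

```python
import numpy as np

def dominates(r1: np.ndarray, r2: np.ndarray) -> bool:
    """r1 dominates r2 iff r1 >= r2 in every objective and r1 > r2 in at least one."""
    return bool(np.all(r1 >= r2) and np.any(r1 > r2))

def non_dominated(scores: np.ndarray) -> np.ndarray:
    """Boolean mask of the non-dominated rows of an (n, d) matrix of objective values."""
    n = scores.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and dominates(scores[j], scores[i]):
                mask[i] = False
                break
    return mask

# Toy example with two objectives: [0.4, 0.4] is dominated by [0.5, 0.5].
scores = np.array([[0.9, 0.1], [0.5, 0.5], [0.4, 0.4], [0.1, 0.9]])
print(non_dominated(scores))  # [ True  True False  True]
```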
Diversity: Since the objectives are not guaranteed to be injective, any point on the Pareto front can be the image of several candidates in the Pareto set; that is, there is diversity in candidate space. Capturing all the diverse candidates corresponding to a point on the Pareto front is critical for applications such as in-silico drug discovery, where the objectives $R_i$ (e.g. binding affinity to a target protein) are mere proxies for more expensive downstream measurements (e.g., effectiveness in clinical trials on humans). This notion of diversity of candidates is typically not captured by existing approaches for MOO.
2.1.1. APPROACHES FOR TACKLING MOO
While there exist many approaches for tackling MOO problems (Ehrgott, 2005; Miettinen, 2012; Pardalos et al., 2017), in this work we consider two distinct approaches that decompose the MOO problem into a family of single-objective sub-problems. These approaches, described below, are well-suited for the GFlowNet formulations we introduce in Section 3.
Scalarization: In scalarization, a set of weights (preferences) $\omega_i$ are assigned to the objectives $R_i$, where $\omega_i \geq 0$ and $\sum_{i=1}^d \omega_i = 1$. The MOO problem can then be decomposed into single-objective sub-problems of the form $\max_{x \in \mathcal{X}} R(x|\omega)$, where $R(x|\omega)$ is called a scalarization function, which combines the $d$ objectives into a scalar. Solutions to these sub-problems capture all Pareto-optimal solutions to the original MOO problem depending on the choice of $R(x|\omega)$ and characteristics of the Pareto front. Weighted Sum scalarization, $R(x|\omega) = \sum_{i=1}^d \omega_i R_i(x)$, for instance, captures all Pareto-optimal candidates for problems with a convex Pareto front (Ehrgott, 2005). On the other hand, Weighted Tchebycheff, $R(x|\omega) = \max_{1 \leq i \leq d} \omega_i |R_i(x) - z_i^*|$, where $z_i^*$ denotes an ideal value for objective $R_i$, captures all Pareto-optimal solutions even for problems with a non-convex Pareto front (Choo & Atkins, 1983; Pardalos et al., 2017). As such, scalarization transforms the multi-objective optimization problem into a family of independent single-objective sub-problems.
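As a minimal illustration of these two scalarizations (a sketch with hypothetical inputs, not tied to any particular task in the paper):

```python
import numpy as np

def weighted_sum(R: np.ndarray, w: np.ndarray) -> float:
    """R(x|w) = sum_i w_i R_i(x); captures the full Pareto set only for convex fronts."""
    return float(np.dot(w, R))

def weighted_tchebycheff(R: np.ndarray, w: np.ndarray, z_star: np.ndarray) -> float:
    """R(x|w) = max_i w_i |R_i(x) - z*_i|, with z* an ideal point for each objective."""
    return float(np.max(w * np.abs(R - z_star)))

w = np.array([0.3, 0.7])        # preferences: w_i >= 0, sum_i w_i = 1
R_x = np.array([0.8, 0.2])      # objective values of a candidate x
z_star = np.array([1.0, 1.0])   # ideal values z*_i
print(weighted_sum(R_x, w), weighted_tchebycheff(R_x, w, z_star))
```

Note that in the Tchebycheff formulation the weighted distance to the ideal point is conventionally minimized; the sketch simply evaluates the scalar as written above.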
Multi-Objective Bayesian Optimization: In many applications, the objectives $R_i$ can be expensive to evaluate, making sample efficiency essential. Multi-Objective Bayesian Optimization (MOBO) (Shah & Ghahramani, 2016; Garnett, 2022) builds upon BO to tackle these scenarios. MOBO relies on a probabilistic model $\hat{f}$ which approximates the objectives $R$ (the oracles). $\hat{f}$ is typically a multi-task Gaussian Process (Shah & Ghahramani, 2016).
Notably, as the model is Bayesian, it captures the epistemic uncertainty in the predictions due to the limited data available for training, which can be used as a signal for prioritizing potentially useful candidates. The optimization is performed over $M$ rounds, where each round $i$ consists of fitting the surrogate model $\hat{f}$ on the data $\mathcal{D}_i$ accumulated from previous rounds, and using this model to generate a batch of $b$ candidates $\{x_1, \dots, x_b\}$ to be evaluated with the oracles $R$, resulting in $\mathcal{B}_i = \{(x_1, y_1), \dots, (x_b, y_b)\}$. The evaluated batch $\mathcal{B}_i$ is then incorporated into the data for the next round, $\mathcal{D}_{i+1} = \mathcal{D}_i \cup \mathcal{B}_i$. The batch of candidates in each round is generated by maximizing an acquisition function $a$ which combines the predictions from the surrogate model along with its epistemic uncertainty into a single scalar utility score. The acquisition function quantifies the utility of a candidate given the candidates evaluated so far. Effectively, MOBO decomposes the MOO problem into a sequence of single-objective optimization problems of the form $\max_{\{x_1, \dots, x_b\} \in 2^{\mathcal{X}}} a(\{x_1, \dots, x_b\}; \hat{f})$.
2.2. GFlowNets
GFlowNets (Bengio et al., 2021a;b) are a family of probabilistic models that learn a stochastic policy to generate compositional objects $x \in \mathcal{X}$, such as a graph describing a candidate molecule, through a sequence of steps, with probability proportional to their reward $R(x)$. If $R : \mathcal{X} \mapsto \mathbb{R}^+$ has multiple modes, then i.i.d. samples from $\pi \propto R$ give a good coverage of the modes of $R$, resulting in a diverse set of candidate solutions. The sequential construction of $x \in \mathcal{X}$ can be described as a trajectory $\tau \in \mathcal{T}$ in a weighted directed acyclic graph (DAG) $\mathcal{G} = (\mathcal{S}, \mathcal{E})$¹, starting from an empty object $s_0$ and following actions $a \in \mathcal{A}$ as building blocks. For example, a molecular graph may be sequentially constructed by adding and connecting new nodes or edges to the graph. Let $s \in \mathcal{S}$, or state, denote a partially constructed object. Transitions between states $s \xrightarrow{a} s' \in \mathcal{E}$ indicate that action $a$ at state $s$ leads to state $s'$. Sequences of such transitions form constructive trajectories.
The GFlowNet forward policy $P_F(-|s)$ is a distribution over the children of state $s$. An object $x$ can be generated by starting at $s_0$ and sampling a sequence of actions iteratively from $P_F$. Similarly, the backward policy $P_B(-|s)$ is a distribution over the parents of state $s$ and can generate backward trajectories starting at any terminal $x$; e.g., iteratively sampling from $P_B$ starting at $x$ shows a way $x$ could have been constructed. Let $\pi(x)$ be the marginal likelihood of sampling trajectories terminating in $x$ following $P_F$, and let the partition function be $Z = \sum_{x \in \mathcal{X}} R(x)$. The learning problem solved by GFlowNets is to learn a forward policy $P_F$ such that the marginal likelihood $\pi(x)$ of sampling any object is proportional to its reward $R(x)$. In this paper we adopt the trajectory balance (TB; Malkin et al., 2022) learning objective. The trajectory balance objective learns $P_F(-|s;\theta)$, $P_B(-|s;\theta)$, and $Z_\theta$, parameterized by $\theta$, which approximate the forward and backward policies and the partition function such that $\pi(x) \approx \frac{R(x)}{Z}, \forall x \in \mathcal{X}$. We refer the reader to Bengio et al. (2021b) and Malkin et al. (2022) for a more thorough introduction to GFlowNets.

¹If the object is constructed in a canonical order (say a string constructed from left to right), $\mathcal{G}$ is a tree.
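For reference, a minimal sketch of the trajectory balance loss in PyTorch, assuming the per-trajectory log-probability sums and the learned log-partition parameter are already available (the names below are hypothetical, not from the paper's code):

```python
import torch

def trajectory_balance_loss(log_Z, log_pf_traj, log_pb_traj, log_reward):
    """TB loss for a batch of trajectories:
    (log Z + sum_t log P_F(s_{t+1}|s_t) - log R(x) - sum_t log P_B(s_t|s_{t+1}))^2, averaged."""
    return ((log_Z + log_pf_traj - log_reward - log_pb_traj) ** 2).mean()

# Hypothetical per-trajectory quantities for a batch of 4 sampled trajectories.
log_pf_traj = torch.randn(4)   # sum of log P_F(.|s; theta) along each trajectory
log_pb_traj = torch.randn(4)   # sum of log P_B(.|s; theta) along each trajectory
log_reward = torch.randn(4)    # log R(x) of each terminal object
log_Z = torch.zeros(1, requires_grad=True)  # learned log-partition-function parameter
trajectory_balance_loss(log_Z, log_pf_traj, log_pb_traj, log_reward).backward()
```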
3. Multi-Objective GFlowNets
In this section, we introduce Multi-Objective GFlowNets (MOGFNs) to tackle the problem of diverse Pareto optimal candidate generation. The two MOGFN variants we discuss respectively exploit the decomposition of MOO problems into a family of independent single-objective sub-problems or into a sequence of single-objective sub-problems.
3.1. Preference-Conditional GFlowNets
As discussed in Section 2.1, given an appropriate scalarization function, candidates optimizing each sub-problem $\max_{x \in \mathcal{X}} R(x|\omega)$ correspond to a single point on the Pareto front. As the objectives are often imperfect proxies for some underlying property of interest, we aim to generate diverse candidates for each sub-problem. One naive way to achieve this is to solve each independent sub-problem with a separate GFlowNet. However, this approach not only poses significant computational challenges in terms of training a large number of GFlowNets, but also fails to take advantage of the shared structure present between the sub-problems.
Reward-conditional GFlowNets (Bengio et al., 2021b) are a generalization of GFlowNets that learn a single conditional policy to simultaneously model a family of distributions associated with a corresponding family of reward functions. Let $\mathcal{C}$ denote a set of values $c$, with each $c \in \mathcal{C}$ inducing a unique reward function $R(x|c)$. We can define a family of weighted DAGs $\{\mathcal{G}_c = (\mathcal{S}_c, \mathcal{E}),\, c \in \mathcal{C}\}$ which describe the construction of $x \in \mathcal{X}$, with the conditioning information $c$ available at all states in $\mathcal{S}_c$. Having $c$ as input allows the policy to learn the shared structure across different values of $c$. We denote $P_F(-|s, c)$ and $P_B(-|s, c)$ as the conditional forward and backward policies, $Z(c) = \sum_{x \in \mathcal{X}} R(x|c)$ as the conditional partition function, and $\pi(x|c)$ as the marginal likelihood (given $c$) of sampling trajectories $\tau$ from $P_F$ terminating in $x$. The learning objective in reward-conditional GFlowNets is thus estimating $P_F(-|s, c)$ such that $\pi(x|c) \propto R(x|c)$. Exploiting the shared structure enables a single conditional policy (e.g., a neural net taking $c$ and $s$ as input and outputting action probabilities) to model the entire family of reward functions simultaneously. Moreover, the policy can generalize to values of $c$ not seen during training.
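As a concrete (hypothetical) instantiation, a conditional policy can simply concatenate an encoding of the state with an encoding of $c$ and output logits over actions; a minimal PyTorch sketch with fixed-size encodings is shown below. In practice, logits of actions that are invalid at a given state would be masked before normalization.

```python
import torch
import torch.nn as nn

class ConditionalPolicy(nn.Module):
    """P_F(- | s, c): an MLP over the concatenation of a state encoding and a condition encoding."""
    def __init__(self, state_dim: int, cond_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state_enc: torch.Tensor, cond_enc: torch.Tensor) -> torch.Tensor:
        logits = self.net(torch.cat([state_enc, cond_enc], dim=-1))
        return torch.log_softmax(logits, dim=-1)  # log P_F(a | s, c) over the available actions

# Hypothetical usage: a batch of 4 (state, condition) pairs with 32- and 8-dimensional encodings.
policy = ConditionalPolicy(state_dim=32, cond_dim=8, n_actions=10)
log_probs = policy(torch.randn(4, 32), torch.randn(4, 8))
```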
The MOO sub-problems possess a similar shared structure, induced by the preferences. Thus, we can leverage a single reward-conditional GFlowNet, instead of a set of independent GFlowNets, to model the sub-problems simultaneously. Formally, Preference-Conditional GFlowNets (MOGFN-PC) are reward-conditional GFlowNets with the preferences $\omega$ over the set of objectives $\{R_1(x), \dots, R_d(x)\}$ as the conditioning variable, i.e., a $d$-dimensional weight vector on the simplex. MOGFN-PC models the family of reward functions defined by the scalarization function $R(x|\omega)$. MOGFN-PC is a general approach and can accommodate any scalarization function, be it one of the existing functions discussed in Section 2.1 or a novel scalarization function designed for a particular task. To illustrate this flexibility, we introduce the Weighted-log-sum (WL) scalarization function, $R(x|\omega) = \prod_{i=1}^d R_i(x)^{\omega_i}$. We hypothesize that this weighted sum in log space can help in scenarios where all objectives are to be optimized simultaneously, and where the scalar reward from Weighted Sum can be dominated by a single objective. The scalarization function is a key component of MOGFN-PC, and we study the empirical impact of various scalarization functions in Section 6.
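A one-line sketch of the WL scalarization (assuming strictly positive objective values so the logarithm is defined; the inputs are hypothetical):

```python
import numpy as np

def weighted_log_sum(R: np.ndarray, w: np.ndarray) -> float:
    """R(x|w) = prod_i R_i(x)^{w_i}, i.e. a weighted sum in log space: exp(sum_i w_i log R_i(x))."""
    return float(np.exp(np.dot(w, np.log(R))))

# Low unless *all* objectives are reasonably high: 0.9^0.5 * 0.1^0.5 = 0.3.
print(weighted_log_sum(np.array([0.9, 0.1]), np.array([0.5, 0.5])))
```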
Training MOGFN-PC  The procedure to train MOGFN-PC, or any reward-conditional GFlowNet, closely follows that of a standard GFlowNet and is described in Algorithm 1. The objective is to learn the parameters $\theta$ of the forward and backward conditional policies $P_F(-|s, \omega; \theta)$ and $P_B(-|s, \omega; \theta)$, and the log-partition function $\log Z_\theta(\omega)$, such that $\pi(x|\omega) \propto R(x|\omega)$. To this end, we consider an extension of the trajectory balance objective:

$$\mathcal{L}(\tau, \omega; \theta) = \left( \log \frac{Z_\theta(\omega) \prod_{s \rightarrow s' \in \tau} P_F(s'|s, \omega; \theta)}{R(x|\omega) \prod_{s \rightarrow s' \in \tau} P_B(s|s', \omega; \theta)} \right)^2. \qquad (2)$$
One important component is the distribution $p(\omega)$ used to sample preferences during training. $p(\omega)$ influences the regions of the Pareto front that are captured by MOGFN-PC. In our experiments, we use a Dirichlet$(\alpha)$ distribution to sample preferences $\omega$, which are encoded with a thermometer encoding (Buckman et al., 2018) when input to the policy in some of the tasks. Following prior work, we apply an exponent $\beta$ to the reward $R(x|\omega)$, i.e. $\pi(x|\omega) \propto R(x|\omega)^\beta$. This incentivizes the policy to focus on the modes of $R(x|\omega)$, which is critical for the generation of high-reward, diverse candidates. By changing $\beta$ we can trade off diversity for higher rewards. We study the impact of these choices empirically in Section 6.
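Putting these pieces together, the sketch below outlines one MOGFN-PC training step under these choices; `sample_trajectory`, `policy`, `log_Z_net`, and the objective callables are hypothetical stand-ins, and the weighted-sum scalarization with reward exponent $\beta$ is used purely for illustration (see Algorithm 1 for the full procedure).

```python
import numpy as np
import torch

def mogfn_pc_step(policy, log_Z_net, sample_trajectory, objectives, alpha, beta, optimizer):
    """One preference-conditional trajectory balance update (cf. Equation (2)), for a single trajectory."""
    # Sample a preference vector from Dirichlet(alpha); in some tasks it would additionally
    # be thermometer-encoded before being fed to the policy.
    w = torch.tensor(np.random.dirichlet(alpha), dtype=torch.float32)

    # Roll out a trajectory conditioned on w. `sample_trajectory` is assumed to return the
    # terminal object x and the summed log P_F and log P_B along the trajectory.
    x, log_pf_traj, log_pb_traj = sample_trajectory(policy, w)

    # Scalarize the d objectives (weighted sum, for illustration) and apply the exponent beta,
    # so that pi(x|w) targets R(x|w)^beta. Each R_i is assumed to return a scalar tensor.
    R = torch.stack([R_i(x) for R_i in objectives])
    log_reward = beta * torch.log(torch.dot(w, R).clamp_min(1e-8))

    # Squared log-ratio of Equation (2); log_Z_net(w) approximates log Z_theta(w).
    loss = (log_Z_net(w) + log_pf_traj - log_reward - log_pb_traj) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```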
MOGFN-PC and MOReinforce  MOGFN-PC is closely related to MOReinforce (Lin et al., 2021) in that both learn a preference-conditional policy to sample Pareto-optimal candidates. The key difference is the learning objective: MOReinforce uses a multi-objective variant of REINFORCE (Williams, 1992), whereas MOGFN-PC uses a preference-conditional GFlowNet objective (Equation (2)). MOReinforce, given a preference $\omega$, will converge to generating a single candidate that maximizes $R(x|\omega)$. MOGFN-PC, on the other hand, samples proportionally to $R(x|\omega)$, resulting in the generation of diverse candidates from the Pareto set according to the preferences specified by $\omega$. This ability to generate diverse Pareto optimal candidates is a key feature of MOGFN-PC, whose advantage is demonstrated empirically in Section 5.
3.2. Multi-Objective Active Learning with GFlowNets
In many applications, the objective functions of interest are computationally expensive to evaluate. Consider the drug discovery scenario, where evaluating objectives such as the binding energy of a candidate molecule to a target, even in imperfect simulations, can take several hours. Sample efficiency, in terms of the number of evaluations of the objective functions, is therefore critical.

We introduce MOGFN-AL, which leverages GFlowNets to generate candidates in each round of a multi-objective Bayesian optimization loop. MOGFN-AL tackles MOO through a sequence of single-objective sub-problems defined by an acquisition function $a$. MOGFN-AL is a multi-objective extension of GFlowNet-AL (Jain et al., 2022). Here, we apply MOGFN-AL to biological sequence design, summarized in Algorithm 2 (Appendix A), building upon the framework proposed by Stanton et al. (2022). This problem was previously studied by Seff et al. (2019) and has connections to denoising autoencoders (Bengio et al., 2013).
In existing work applying GFlowNets to biological sequence design, the GFlowNet policy generates the sequences token-by-token (Jain et al., 2022). While this offers greater flexibility to explore the space of sequences, it can be prohibitively slow when the sequences are long. In contrast, we use GFlowNets to propose candidates at each round $i$ by generating mutations for existing candidates $x \in \hat{\mathcal{P}}_i$, where $\hat{\mathcal{P}}_i$ is the set of non-dominated candidates in $\mathcal{D}_i$. Given a sequence $x$, the GFlowNet, through a sequence of stochastic steps, generates a set of mutations $m = \{(l_i, v_i)\}_{i=1}^T$, where $l \in \{1, \dots, |x|\}$ is the location to be replaced, $v \in \mathcal{A}$ is the token that replaces $x[l]$, and $T$ is the number of mutations. Let $x_m$ be the sequence resulting from applying the mutations $m$ to the sequence $x$. The reward for a set of sampled mutations for $x$ is the value of the acquisition function on $x_m$: $R(m, x) = a(x_m | \hat{f})$. This mutation-based approach scales better to tasks with longer sequences while still affording ample exploration in sequence space for generating diverse candidates. We demonstrate this empirically in Section 5.3.
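The sketch below illustrates how a sampled mutation set is applied and scored under this formulation; the acquisition and surrogate are hypothetical stand-ins, sequences are plain strings over an alphabet, and positions are 0-indexed for convenience.

```python
ALPHABET = list("ACDEFGHIKLMNPQRSTVWY")  # e.g. the 20 amino acids

def apply_mutations(x: str, mutations) -> str:
    """Apply m = {(l_i, v_i)}: replace the token at position l with token v (0-indexed here)."""
    seq = list(x)
    for l, v in mutations:
        seq[l] = v
    return "".join(seq)

def mutation_reward(x: str, mutations, acquisition, surrogate) -> float:
    """R(m, x) = a(x_m | f_hat): acquisition value of the mutated sequence."""
    x_m = apply_mutations(x, mutations)
    return acquisition(x_m, surrogate)

# Toy usage with stand-in acquisition/surrogate functions.
x = "MKTAYIAKQR"
m = [(2, "G"), (7, "W")]  # T = 2 mutations: (position, replacement token)
toy_surrogate = lambda token: ALPHABET.index(token) / len(ALPHABET)
toy_acquisition = lambda seq, f_hat: sum(f_hat(t) for t in seq)
print(mutation_reward(x, m, toy_acquisition, toy_surrogate))
```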
4. Related Work
Evolutionary Algorithms (EA)  Traditionally, evolutionary algorithms such as NSGA-II have been widely used in various multi-objective optimization problems (Ehrgott, 2005; Konak et al., 2006; Blank & Deb, 2020). More recently, Miret et al. (2022) incorporated graph neural networks into evolutionary algorithms, enabling them to tackle large combinatorial spaces. Unlike MOGFNs, evolutionary algorithms have to solve each instance of a MOO problem from scratch, rather than amortizing computation during training in order to quickly generate solutions at run-time. Evolutionary algorithms, however, can be augmented with MOGFNs for generating mutations to improve efficiency, as in Section 3.2.
Multi-Objective Reinforcement Learning  MOO problems have also received significant interest in the RL literature (Hayes et al., 2022). Traditional approaches broadly consist of learning sets of Pareto-dominant policies (Roijers et al., 2013; Van Moffaert & Nowé, 2014; Reymond et al., 2022). Recent work has focused on extending deep RL algorithms to multi-objective settings, e.g., with Envelope-MOQ (Yang et al., 2019), MO-MPO (Abdolmaleki et al., 2020; 2021), and MOReinforce (Lin et al., 2021). A general shortcoming of RL-based approaches is that their objective focuses on discovering a single mode of the reward function, so they rarely generate diverse candidates, an issue that also persists in the multi-objective setting. In contrast, MOGFNs sample candidates proportionally to the reward, implicitly resulting in diverse candidates.
Multi-Objective Bayesian Optimization (MOBO)  Bayesian optimization (BO) has been used in the context of MOO when the objectives are expensive to evaluate and sample efficiency is a key consideration. MOBO approaches consist of learning a surrogate model of the true objective functions, which is used to define an acquisition function such as expected hypervolume improvement (Emmerich et al., 2011; Daulton et al., 2020; 2021) and max-value entropy search (Belakaria et al., 2019), as well as scalarization-based approaches (Paria et al., 2020; Zhang & Golovin, 2020). Abdolshah et al. (2019) and Lin et al. (2022) study the MOBO problem in the setting with preferences over the different objectives. Stanton et al. (2022) proposed LaMBO, which uses language models in conjunction with BO for multi-objective sequence design problems. While recent work (Konakovic Lukovic et al., 2020; Maus et al., 2022) studies the problem of generating diverse candidates in the context of MOBO, it is limited to local optimization near Pareto-optimal candidates in low-dimensional continuous problems. As such, the key drawbacks of MOBO approaches are that they typically do not consider the need for diversity in generated candidates and that they mainly consider continuous, low-dimensional state spaces. As we discuss in Section 3.2, MOBO approaches can be augmented with GFlowNets for diverse candidate generation in discrete spaces.
Other Approaches  Zhao et al. (2022) introduced LaMOO, which tackles the MOO problem by iteratively splitting the candidate space into smaller regions, whereas Daulton et al. (2022) introduce MORBO, which performs BO in parallel on multiple local regions of the candidate space. Both these methods, however, are limited to continuous candidate spaces.
5. Empirical Results
In this section, we present our empirical findings across a wide range of tasks ranging from sequence design to molecule generation. Through our experiments, we aim to answer the following questions:

Q1 Can MOGFNs model the preference-conditional reward distribution?
Q2 Can MOGFNs sample Pareto-optimal candidates?
Q3 Are candidates sampled by MOGFNs diverse?
Q4 Do MOGFNs scale to high-dimensional problems relevant in practice?

We obtain positive experimental evidence for Q1-Q4.
Metrics: We rely on standard MOO metrics such as the Hypervolume (HV) and $R_2$ indicators, as well as the Generational Distance+ (GD+). To measure diversity we use the Top-K Diversity and Top-K Reward metrics of Bengio et al. (2021a). We detail all metrics in Appendix C. For all our empirical evaluations we follow the same protocol. First, we sample a set of preferences which are fixed for all the methods. For each preference we sample 128 candidates, from which we pick the top 10, compute their scalarized reward and diversity, and report the averages over preferences. We then use these samples to compute the HV and $R_2$ indicators. We pick the best hyperparameters for all methods based on the HV and report the mean and standard deviation over 3 seeds for all quantities.
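For concreteness, the following sketch shows one way to compute Top-K reward and Top-K diversity for a set of sampled candidates (the precise definitions used here are given in Appendix C; the pairwise distance is a hypothetical task-dependent argument, e.g. an edit distance for sequences):

```python
import numpy as np

def top_k_metrics(candidates, rewards, k, dist):
    """Top-K reward: mean reward of the k highest-reward candidates.
    Top-K diversity: mean pairwise distance among those k candidates."""
    order = np.argsort(rewards)[::-1][:k]
    top = [candidates[i] for i in order]
    top_k_reward = float(np.mean([rewards[i] for i in order]))
    pair_dists = [dist(a, b) for i, a in enumerate(top) for b in top[i + 1:]]
    top_k_diversity = float(np.mean(pair_dists)) if pair_dists else 0.0
    return top_k_reward, top_k_diversity

# Toy usage: Hamming distance between equal-length strings as the candidate distance.
hamming = lambda a, b: sum(ca != cb for ca, cb in zip(a, b))
cands = ["AAAA", "AAAB", "ABBB", "BBBB"]
rewards = np.array([0.9, 0.8, 0.6, 0.5])
print(top_k_metrics(cands, rewards, k=3, dist=hamming))
```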
Baselines: We consider the closely related MOReinforce (Lin et al., 2021) as a baseline. We also study its variants MOSoftQL and MOA2C, which use Soft Q-Learning (Haarnoja et al., 2017) and A2C (Mnih et al., 2016) in place of REINFORCE. We additionally compare against Envelope-MOQ (Yang et al., 2019), another popular multi-objective reinforcement learning method. For fragment-based molecule generation we consider an additional baseline, MARS (Xie et al., 2021), a relevant MCMC approach for this task. Notably, we do not consider baselines like LaMOO (Zhao et al., 2022) and MORBO (Daulton et al., 2022)