Graph neural networks to learn joint representations of
disjoint molecular graphs
Chen Shao1,2, Chen Zhou1, and Pascal Friederich1,3,*
1Institute of Theoretical Informatics, Karlsruhe Institute of Technology, Am
Fasanengarten 5, 76131 Karlsruhe, Germany
2Present address: Institute for Applied Informatics and Formal Description
Systems, Karlsruhe Institute of Technology, Kaiserstr. 89, 76133 Karlsruhe,
Germany
3Institute of Nanotechnology, Karlsruhe Institute of Technology,
Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
*Contact: pascal.friederich@kit.edu
Abstract
Graph neural networks are widely used to learn global representations of graphs, which
are then used for regression or classification tasks. Typically, the graphs in such data sets are
connected, i.e. each training sample consists of a single internally connected graph associated
with a global label. However, there is a wide variety of yet unconsidered but application-
relevant tasks, where labels are assigned to sets of disjoint graphs, which requires the generation
of global representations of disjoint graphs. In this paper, we present a new data set of
chemical reactions that illustrates this task. Each sample consists of a pair of disjoint
molecular graphs and a joint label representing a scalar measure associated with the chemical
reaction of the molecules. We show initial results of graph neural networks that are able
to solve the task within a combinatorial subset of the data set, but do not generalize well to
the full data set and unseen (sub)graphs.
1 Introduction
In the last few years, graph neural networks (GNNs) have attracted growing attention in the chemical
sciences, where they play an important role in solving challenges in molecular property prediction
and design. Currently, GNNs are widely used for regression and classification tasks, e.g. to predict
molecular solubility or toxicity. However, most tasks currently considered for GNNs are limited
to single input molecular graphs, i.e. they use internally connected input molecular graphs to
learn node and edge representations and convert them to global graph representations, from which the
global label is then predicted. Yet there is a wide range of real-world tasks where global labels
are assigned to sets of input graphs, rather than single input graphs. Examples of such tasks are
the prediction of solubility (not for a single solvent but for arbitrary combinations of solvents and
solutes); reactivity prediction, where two or more molecular graphs react and the task is to identify
the reaction center or a global property such as the reaction energy or reaction barrier; the
prediction of catalytic activity, e.g. of catalytic surfaces and given reactants; and many more.
Common to all these tasks is that the joint label depends not on one but on multiple disjoint input graphs.
arXiv:2210.09517v2 [cs.CE] 30 Oct 2022
Figure 1: New task for graph learning: Learning joint representations to predict global labels from
sets of disjoint input graphs.
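The task in Figure 1 can be made concrete with a minimal sketch. It is not the architecture evaluated in this paper: the toy graph encoding, the function names, and the choice of summation for both pooling steps are assumptions made purely for illustration. Messages are passed only along edges within each graph, each graph is pooled to one vector, and the per-graph vectors are combined with a symmetric operation so that the joint representation does not depend on the order in which the disjoint graphs are listed.

```python
from typing import List, Tuple

# Hypothetical toy encoding: a graph is (node feature vectors, undirected edges).
Graph = Tuple[List[List[float]], List[Tuple[int, int]]]


def message_pass(graph: Graph) -> List[List[float]]:
    """One round of sum aggregation over the edges of a single graph."""
    feats, edges = graph
    out = [list(f) for f in feats]  # start from each node's own features
    for u, v in edges:
        for k in range(len(feats[u])):
            out[u][k] += feats[v][k]
            out[v][k] += feats[u][k]
    return out


def graph_readout(graph: Graph) -> List[float]:
    """Sum-pool the updated node states into one vector per graph."""
    states = message_pass(graph)
    return [sum(s[k] for s in states) for k in range(len(states[0]))]


def joint_representation(graphs: List[Graph]) -> List[float]:
    """Sum the per-graph vectors; the result ignores the order of the graphs."""
    vecs = [graph_readout(g) for g in graphs]
    return [sum(v[k] for v in vecs) for k in range(len(vecs[0]))]


# Two disjoint toy "molecules" with 2-dimensional node features.
mol_a: Graph = ([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]], [(0, 1), (1, 2)])
mol_b: Graph = ([[0.0, 1.0], [0.0, 1.0]], [(0, 1)])

joint_ab = joint_representation([mol_a, mol_b])
joint_ba = joint_representation([mol_b, mol_a])
```

Here `joint_ab` and `joint_ba` are identical, because no information flows between the two graphs before the final symmetric sum. A plain sum is only the simplest possible combiner; it carries no learned interaction between the graphs, and a trained model would replace it with a learnable function.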
Many GNNs for tasks in chemistry depend not only on the connectivity of the nodes in the
input graph but also on their geometric arrangement. The labels in the task we are presenting
here are typically invariant to the relative geometric arrangement of the different molecular graphs.
However, the internal geometry of each of the input graphs might still add useful information about
the final label, even though the geometry follows from the connectivity and the node features, and
thus might as well be learned, given enough data. The development of GNN architectures for the
task of learning global representations of disjoint graphs is therefore a highly relevant research
area, and we hope that this work stimulates further development in that direction.
2 Related Works
The prediction of molecular properties is a cornerstone in chemistry, e.g. in drug discovery, where
accurate predictions are needed to identify drug candidates in an efficient and computationally
inexpensive way. Molecular graphs allow us to learn informative representations of molecules, by
learning from the chemical structure of a molecule directly and enhancing that information with
physics-informed features of atoms and bonds, potentially including the 3D geometry information
of the molecular structure. The basic principle of GNNs is as follows: atoms that are connected
by bonds are close in the graph, which means that they have the greatest influence on each other.
Through graph convolutions or message passing, such pairwise influence decays with the distance
between the atoms. This enables GNNs to learn informative atom representations, which can then
be combined into global vector representations of entire molecules.
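The distance dependence described above can be illustrated with a small sketch. The chain of five atoms, the scalar node values, and the mean-aggregation update are all assumptions chosen for illustration and do not correspond to any specific GNN from the literature. A signal placed on one atom spreads to its neighbours one bond per round, and in this toy setting it arrives with decreasing weight the further it travels.

```python
from typing import Dict, List


def propagate(adj: Dict[int, List[int]], h: List[float], steps: int) -> List[float]:
    """Repeatedly replace each node value by the mean over itself and its neighbours."""
    for _ in range(steps):
        h = [
            (h[i] + sum(h[j] for j in adj[i])) / (1 + len(adj[i]))
            for i in range(len(h))
        ]
    return h


# Hypothetical chain of five atoms: 0 - 1 - 2 - 3 - 4.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

# Only atom 0 carries a signal at the start.
signal = [1.0, 0.0, 0.0, 0.0, 0.0]

after_two = propagate(chain, signal, steps=2)
```

After two rounds, the signal from atom 0 has reached atoms 1 and 2 with decreasing magnitude, while atoms 3 and 4, which are more than two bonds away, have received nothing yet. The number of message-passing rounds therefore bounds how far information can travel within a graph.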
GNNs have been highly successful at processing molecules and predicting molecular properties, which
has become one of their main applications [1]. Seminal work by Duvenaud et al. showed how
GNNs can be seen as a generalized and learnable alternative to the then-prevalent fingerprint
representations of molecules [2]. Gilmer et al. proposed a more general framework, which they
called message passing neural networks (MPNNs), and showed that MPNNs can accurately predict
quantum mechanical properties calculated by density functional theory (DFT), which enabled the
wider and more successful application of GNNs to quantum chemistry [3]. Nowadays, many GNN
architectures are available, developed in the hope of replacing expensive quantum mechanical
calculations with fast data-driven predictions [4-7].
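For orientation, the message passing framework of Gilmer et al. is commonly summarized by the three equations below. They are written in the usual notation of that framework rather than taken from this paper: $h_v^t$ is the hidden state of node $v$ after $t$ steps, $e_{vw}$ the feature of the edge between $v$ and $w$, $N(v)$ the set of neighbours of $v$, $M_t$ and $U_t$ the learned message and update functions, and $R$ a permutation-invariant readout applied after $T$ steps.

```latex
\begin{align}
m_v^{t+1} &= \sum_{w \in N(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right), \\
h_v^{t+1} &= U_t\left(h_v^t, m_v^{t+1}\right), \\
\hat{y} &= R\left(\{ h_v^T \mid v \in G \}\right).
\end{align}
```

Because the sum in the first equation runs only over bonded neighbours, nodes belonging to different connected components never exchange messages; any coupling between disjoint graphs has to come from the readout $R$ or from additional architectural elements.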
Each molecule and its associated molecular properties can be uniquely determined by its 3D
representation. After seminal and very promising work in that direction [3], Schütt et al. [8]
leveraged continuous-filter convolutions to learn local atomic environments in an architecture
consisting of atom-wise blocks and interaction blocks. This idea is further optimized in DimeNet and