
Finally, we replace the true relevance scoring function in Eq. (1) with the following smooth surrogate:
$$s(G_q, G_c) = \sum_{r \in [R]} w_r \sum_{i,j} \left[\min\left(H_q(r),\, P(r)\, H_c(r)\right)\right]_{i,j} \qquad (5)$$
Here $\{w_r \ge 0 : r \in [R]\}$ are trainable parameters, balancing the quality of signals over all message rounds $r$. Note that the $R$ message rounds execute on the query and corpus graphs separately; the interaction between corresponding rounds happens only at the very end.
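To make Eq. (5) concrete, here is a minimal PyTorch sketch of the surrogate score, assuming the per-round node embeddings $H_q(r), H_c(r)$ arrive as $N \times d$ tensors and the soft permutations $P(r)$ as $N \times N$ tensors; all names are illustrative rather than the authors' code:

import torch

def surrogate_score(Hq_list, Hc_list, P_list, w):
    """Eq. (5): sum_r w_r * sum_{i,j} [min(Hq(r), P(r) Hc(r))]_{i,j}.

    Hq_list, Hc_list : length-R lists of (N, d) node-embedding tensors
    P_list           : length-R list of (N, N) soft permutation matrices
    w                : (R,) tensor of trainable non-negative weights
    """
    score = 0.0
    for Hq, Hc, P, wr in zip(Hq_list, Hc_list, P_list, w):
        aligned = P @ Hc                            # align corpus nodes to query order
        score = score + wr * torch.minimum(Hq, aligned).sum()
    return score

Non-negativity of $w_r$ can be enforced, for instance, by parameterizing $w_r = \mathrm{softplus}(\tilde{w}_r)$.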
2.3 Design of LMCCS
In the case of MCES, our key modification was to replace the adjacency matrices with node embeddings and design a differentiable network to generate a soft permutation matrix. In the case of MCCS, we must also replace the non-differentiable step of finding the size of the largest connected component of the common subgraph with a neural surrogate, for which we design a novel gossip protocol.
Differentiable gossip protocol to compute the largest connected component.
Given any graph $G = (V, E)$ with adjacency matrix $B$, we can find its largest connected component (LCC) using a gossip protocol. At iteration $t = 0$, we assign each node $u$ a message vector $x_u(0) \in \mathbb{R}^N$, which is the one-hot encoding of the node $u$, i.e., $x_u(0)[v] = 1$ for $v = u$ and $0$ otherwise. In iteration $t > 0$, we update the message vectors as $x_u(t+1) = \sum_{v \in \mathrm{nbr}(u) \cup \{u\}} x_v(t)$, where $\mathrm{nbr}(u)$ is the set of neighbors of $u$. Initially, each $x_u$ is a sparse vector with exactly one non-zero entry. As the number of iterations increases, $u$ gradually collects messages from the distant nodes reachable from $u$, which increases the number of non-zero entries of $x_u$. For a sufficiently large number of iterations $T$ (the diameter of $G$), we attain $x_u(T)[v] \neq 0$ whenever $u$ and $v$ lie in the same connected component, and $x_u(T)[v] = 0$ otherwise. Once this stage is reached, one can easily compute the number of nodes in the largest connected component of $G$ as $\max_u \|x_u(T)\|_0$, i.e., the maximum number of non-zero entries in a message vector. Algorithm 1 shows the gossip protocol, with the adjacency matrix $B$ as input and the size of the largest connected component as output.

Algorithm 1 GOSSIP($B$)
1: $X(0) = I$   # identity matrix
2: for $t = 0, \ldots, T-1$ do
3:   $X(t+1) \leftarrow X(t)(B + I)$
4: Return $\max_{u \in V} \|X(T)[\bullet, u]\|_0$
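As a sanity check, Algorithm 1 translates directly into a few lines of NumPy. The sketch below is ours, not the authors' code; the function and variable names are illustrative:

import numpy as np

def gossip(B: np.ndarray, T: int) -> int:
    """Algorithm 1: size of the largest connected component of G.

    B : (N, N) 0/1 adjacency matrix.
    T : number of iterations; any T >= diameter(G) works, and
        T = N - 1 always suffices.
    """
    N = B.shape[0]
    X = np.eye(N)                  # X(0) = I: one-hot message per node
    for _ in range(T):             # X(t+1) = X(t) (B + I)
        X = X @ (B + np.eye(N))
    # Column u of X(T) is non-zero exactly at the nodes reachable from u,
    # so the LCC size is the maximum number of non-zeros over columns.
    return int((X != 0).sum(axis=0).max())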
Exact MCCS computation using the gossip protocol.
Given the query and corpus graphs $G_q$ and $G_c$ with their adjacency matrices $A_q$ and $A_c$, we can rewrite Eq. (2) in the equivalent form
$$\max_{P \in \mathcal{P}} \; \mathrm{GOSSIP}\left(\min\left(A_q, P A_c P^\top\right)\right). \qquad (6)$$
Recall that $\mathcal{P}$ is the set of permutation matrices of size $N \times N$.
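For intuition only, Eq. (6) can be evaluated by brute force on very small graphs by enumerating all $N!$ permutations and reusing the gossip routine above. This enumeration is exponential and is not part of the proposed method; it merely makes the objective concrete:

from itertools import permutations
import numpy as np

def exact_mccs(Aq: np.ndarray, Ac: np.ndarray) -> int:
    """Eq. (6): max over permutations P of GOSSIP(min(Aq, P Ac P^T))."""
    N = Aq.shape[0]
    best = 0
    for perm in permutations(range(N)):
        P = np.eye(N)[list(perm)]                  # hard permutation matrix
        common = np.minimum(Aq, P @ Ac @ P.T)      # candidate common subgraph
        best = max(best, gossip(common, T=N - 1))  # gossip() from the sketch above
    return best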
Neural surrogate.
One can use a GS network to obtain a permutation matrix $P$, which, in principle, can support backpropagation through $\min(A_q, P A_c P^\top)$. However, as mentioned in Section 2.1 (items 1 and 2), $A_\bullet$ is 0/1, and $P$ often saturates, losing the gradient signal needed for training. In response, we design a neural surrogate for the true MCCS size given in Eq. (2), using three steps.
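For context, a GS-style network typically produces the soft permutation by iterative Sinkhorn normalization of a matrix of node-alignment scores. The sketch below shows one common formulation; the temperature $\tau$, the iteration count, and all names are our assumptions rather than the paper's specification. It also illustrates the saturation problem: as $\tau \to 0$, the output approaches a hard 0/1 permutation and the gradient signal vanishes.

import torch

def sinkhorn(log_alpha: torch.Tensor, n_iters: int = 20, tau: float = 0.1) -> torch.Tensor:
    """Soft permutation via Sinkhorn normalization in log space.

    log_alpha : (N, N) unnormalized node-alignment scores.
    Returns a (nearly) doubly-stochastic matrix, approaching a hard
    permutation as tau -> 0.
    """
    log_P = log_alpha / tau
    for _ in range(n_iters):
        log_P = log_P - torch.logsumexp(log_P, dim=1, keepdim=True)  # normalize rows
        log_P = log_P - torch.logsumexp(log_P, dim=0, keepdim=True)  # normalize columns
    return log_P.exp()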
Step 1: Computation of edge embeddings.
To tackle the above challenge, we introduce a parallel backpropagation path, which allows the GNNs to learn which edges are more important than others. To this end, we first use the GNNs to compute edge embedding vectors $m_e(r) \in \mathbb{R}^{d_E}$ for edges $e \in E$, in addition to node embeddings, at each propagation layer $r \in [R]$. Then, we gather the edge embeddings from the final layer $R$ into the matrices $M_q(R), M_c(R) \in \mathbb{R}^{|E| \times d_E}$ for the query and corpus graphs, respectively. Next, we feed each separately into a neural network $L_\alpha$ with trainable parameters $\alpha$, which predicts the importance of each edge from its embedding. Thus, we obtain two matrices $S_q \in \mathbb{R}^{N \times N}$ and $S_c \in \mathbb{R}^{N \times N}$, which contain the edge scores at the corresponding node pairs. Formally, we have:
$$H_q(R), M_q(R) = \mathrm{GNN}_\theta(G_q), \qquad H_c(R), M_c(R) = \mathrm{GNN}_\theta(G_c) \qquad (7)$$
$$S_q = L_\alpha(M_q(R)), \qquad S_c = L_\alpha(M_c(R)) \qquad (8)$$
In our implementation, $L_\alpha$ consists of one linear layer, a ReLU layer, and another linear layer.
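A minimal PyTorch sketch of $L_\alpha$ and the scatter of edge scores into an $N \times N$ matrix follows; the hidden width, the edge-list format, and all names are our assumptions:

import torch
import torch.nn as nn

class EdgeScorer(nn.Module):
    """L_alpha: linear -> ReLU -> linear, mapping each edge embedding to a score."""

    def __init__(self, d_edge: int, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_edge, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, M: torch.Tensor, edges: torch.Tensor, N: int) -> torch.Tensor:
        """M: (|E|, d_edge) final-layer edge embeddings; edges: (|E|, 2) endpoints.
        Returns S: (N, N) holding each edge's score at its node pair."""
        scores = self.net(M).squeeze(-1)        # (|E|,)
        S = M.new_zeros(N, N)
        S[edges[:, 0], edges[:, 1]] = scores    # scatter into node-pair slots
        S[edges[:, 1], edges[:, 0]] = scores    # symmetrize for undirected edges
        return S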
Step 2: Continuous approximation of MCCS.
In MCES, we approximated the MCES score in Eq. (1) directly, using the neural coverage function defined in Eq. (5) in terms of the node embeddings. Note that, here, we did not attempt to develop any continuous approximation of $\min(A_q, P A_c P^\top)$, the adjacency matrix of the candidate common