NEURO PRIM ANATTENTION -BASED MODEL FOR SOLVING NP- HARD SPANNING TREE PROBLEMS Yuchen Shi

2025-05-02 0 0 2.94MB 17 页 10玖币

侵权投诉

NEUROPRIM: ANATTENTION-BASED MODEL FOR SOLVING

NP-HARD SPANNING TREE PROBLEMS

Yuchen Shi

Department of Mathematical Sciences

University of Chinese Academy of Sciences

Beijing 100049, China

shiyuchen20@mails.ucas.ac.cn

Congying Han

Department of Mathematical Sciences

University of Chinese Academy of Sciences

Beijing 100049, China

hancy@ucas.ac.cn

Tiande Guo∗

Department of Mathematical Sciences

University of Chinese Academy of Sciences

Beijing 100049, China

tdguo@ucas.ac.cn

June 13, 2023

ABSTRACT

Spanning tree problems with specialized constraints can be difﬁcult to solve in real-world scenarios,

often requiring intricate algorithmic design and exponential time. Recently, there has been growing

interest in end-to-end deep neural networks for solving routing problems. However, such methods

typically produce sequences of vertices, which makes it difﬁcult to apply them to general combina-

torial optimization problems where the solution set consists of edges, as in various spanning tree

problems. In this paper, we propose NeuroPrim, a novel framework for solving various spanning

tree problems by deﬁning a Markov Decision Process (MDP) for general combinatorial optimization

problems on graphs. Our approach reduces the action and state space using Prim’s algorithm and

trains the resulting model using REINFORCE. We apply our framework to three difﬁcult problems on

Euclidean space: the Degree-constrained Minimum Spanning Tree (DCMST) problem, the Minimum

Routing Cost Spanning Tree (MRCST) problem, and the Steiner Tree Problem in graphs (STP).

Experimental results on literature instances demonstrate that our model outperforms strong heuristics

and achieves small optimality gaps of up to 250 vertices. Additionally, we ﬁnd that our model

has strong generalization ability, with no signiﬁcant degradation observed on problem instances as

large as 1000. Our results suggest that our framework can be effective for solving a wide range of

combinatorial optimization problems beyond spanning tree problems.

Keywords degree-constrained minimum spanning tree problem

minimum routing cost spanning tree problem

Steiner

tree problem in graphs ·Prim’s algorithm ·reinforcement learning

1 Introduction

Network design problems are ubiquitous and arise in various practical applications. For instance, while designing a road

network, one needs to ensure that the number of intersections does not exceed a certain limit, giving rise to a Degree-

constrained Minimum Spanning Tree (DCMST) problem. Similarly, designing a communication network requires

minimizing the average cost of communication between two users, which can be formulated as a Minimum Routing

Cost Spanning Tree (MRCST) problem. In addition, minimizing the cost of opening facilities and communication

network layouts while maintaining interconnectivity of facilities is critical for telecommunication network infrastructure.

∗Corresponding author

arXiv:2210.12453v2 [cs.LG] 10 Jun 2023

Accepted for publication in SCIENCE CHINA Mathematics

This problem can be modeled as a Steiner Tree Problem in graphs (STP). Although these problems all involve ﬁnding

spanning trees in graphs, they have different constraints and objective functions, requiring heuristic algorithms tailored

to their speciﬁc problem characteristics, without a uniform solution framework. Moreover, unlike the Minimum

Spanning Tree (MST) problem, which can be solved in polynomial time by exact algorithms such as Prim’s algorithm

[

], ﬁnding optimal solutions for these problems requires computation time exponential in problem size, making it

impractical for real-world applications.

In contrast to conventional algorithms that are designed based on expert knowledge, end-to-end models leveraging

deep neural networks have achieved notable success in tackling a wide range of combinatorial optimization problems

[

]. However, most methods treat the problem as permutating a sequence of vertices, e.g., allocating resources in

the knapsack problem [

], ﬁnding the shortest travel route in routing problems [

], approximating

minimum vertex cover and maximum cut on graphs [

], learning elimination order in tree decomposition [

locating facilities in the p-median problem [

], and determining Steiner points in STP [

]. For problems that require

decision-making on edge sets such as MST, due to the need to consider the connectivity and acyclicity of the solution,

there hardly exists methods that deal with the edge generation tasks directly. [

] proposed to ﬁrst transform edges

into vertices by constructing line graphs, then use an end-to-end approach to tackle various graph problems, including

MST. This method, however, is not suitable for dense graphs, as their line graphs are quite large. [

] presented the use

of Graph Neural Networks and Reinforcement Learning (RL) to solve STP, whose decoder, after outputting a vertex,

selects the edge with the smallest weight connected to that vertex in the partial solution. However, this method cannot

be applied to problems where the objective function is not the sum of the weights of all edges.

This paper proposes NeuroPrim, a novel framework that utilizes the successful Transformer model to solve three

challenging spanning tree problems in Euclidean space, namely the DCMST, MRCST and STP problems. The algorithm

is based on a Markov Decision Process (MDP) for general combinatorial optimization problems on graphs, which

signiﬁcantly reduces the action and state spaces by maintaining the connectivity of partially feasible solutions, similar

to Prim’s algorithm. The framework is parameterized and trained with problem-related masks. Our goal is not to

outperform existing solvers; instead, we demonstrate that our approach can produce approximate solutions to various

sizes and variants of spanning tree problems quickly, with minimal modiﬁcations required after training. Our approach’s

effectiveness may provide insights into developing algorithms for problems with complex constraints and few heuristics.

Our work makes several signiﬁcant contributions, including:

•

We present a novel Markov Decision Process (MDP) framework for general combinatorial optimization

problems on graphs. This framework allows all constructive algorithms for solving these problems to be

viewed as greedy rollouts of some policy solved by the model.

•

We propose NeuroPrim, a uniﬁed framework that leverages the MDP to solve problems that require decisions

on both vertex and edge sets. Our framework is designed to be easily adaptable, requiring only minimal

modiﬁcations to masks and rewards to solve different problems.

•

To the best of our knowledge, we are the ﬁrst to use a sequence-to-sequence model for solving complex

spanning tree problems. We demonstrate the effectiveness of our approach by applying it to three challenging

variants of the spanning tree problem on Euclidean space, achieving near-optimal solutions in a short time and

outperforming strong heuristics.

2 Preliminaries

In this section, we describe the problems investigated and the techniques employed. Firstly, we present three vari-

ants of the spanning tree problem under investigation. Subsequently, we introduce the networks utilized for policy

parameterization.

2.1 Spanning tree problems

Given an undirected weighted graph

(V(G), E(G), w)

with

v(G)

vertices,

e(G)

edges and

w:E→R

specifying

the weight of each edge, a spanning tree of

is a subgraph that is a tree which includes all vertices of

. Among all

combinatorial optimization problems with spanning trees, the Minimum Spanning Tree (MST) problem is one of the

most typical and well-known one. The problem seeks to ﬁnd a spanning tree with least total edge weights and can be

solved in time

O(e(G) log v(G))

by Bor˚uvka’s algorithm [

], Prim’s algorithm [

] and Kruskal’s algorithm [

However, with additional constraints or modiﬁcations to the objective function, it becomes more difﬁcult to solve these

problems to optimality. Three variants of the spanning tree problem are considered in this paper. For each variant, an

example in Euclidean space is shown in Figure 1.

Accepted for publication in SCIENCE CHINA Mathematics

v3v4

√2

√5/2

√17/2

2.5

√13/2

(a) Euclidean graph

v3v4

(b) DCMST (d= 2)

v3v4

(d) Steiner tree

Figure 1: Examples of optimal solutions to different spanning tree problems. All vertices to be spanned (i.e., all vertices

in DCMST and MRCST problem and terminals in STP) are colored in blue. The black lines in (a) indicates a feasible

edge in graph, and in (b), (c), (d) an edge in the optimal solution. (a) A complete Euclidean graph with ﬁve vertices.

(b) MST and also DCMST (

d= 2

) of the graph with a total weight of

3 + √5/2

. (c) MRCST with overall pairwise

distance of

8 + 4√2 + 2√5

. (d) A Steiner tree connecting terminals

{v2, v4, v5}

with total weight of

2 + √5/2

, where

{v1}is the Steiner point.

2.1.1 Degree-constrained Minimum Spanning Tree problem

Given an integer

d≥2

, the degree-constrained minimum spanning tree (DCMST) problem (also known as the minimum

bounded degree spanning tree problem) involves ﬁnding a MST of maximum degree at most

. The problem was

ﬁrst formulated by [

]. The NP-hardness can be shown by a reduction to the Hamilton Path Problem when

d= 2

[

]. Several exact [

], heuristic [

], metaheuristic [

] algorithms and approximation

schemes [

] have been proposed to tackle this problem. In Figure 1b, the path

v2v3v4v1v5

is the MST with

maximum degree 2.

2.1.2 Minimum Routing Cost Spanning Tree problem

The minimum routing cost spanning tree (MRCST) problem (also known as the optimum distance spanning tree

problem) aims to minimizing the sum of pairwise distances between vertices in the tree, i.e., ﬁnding a spanning tree of

that minimizes

P(x,y)∈V(G)×V(G),x<y dH(x, y)

, where

dH(x, y)

is the total weights of the unique path between

vertices

x, y ∈V(G)

in the tree

. This problem is ﬁrst stated by [

] and then its NP-hardness is raised by [

]. Exact

[

], heuristic [

], metaheuristic [

] algorithms and approximation schemes

[

] have been developed for this problem. In Figure 1c, the star graph with internal vertex

and leaves

and

is the MRCST, and its routing cost is 4 times its total weight. (Note that after deleting any edge, the graph

becomes two connected components with orders

and

, so that the contribution of each edge to the routing cost is

1·4=4.)

2.1.3 Steiner Tree Problem in graphs

Given a subset of vertices

V′⊆V(G)

called terminals, the Steiner tree problem in graphs (STP) asks for a tree in

that spans

V′

with minimum weights, i.e., ﬁnding a set

S⊆V(G)\V′

that minimize the cost of the MST of

G[V′∪S]

where

G[V′∪S]

is the subgraph induced by

V′∪S

and

is called Steiner points. The problem is NP-hard [

] and

has two polynomially solvable variants: shortest path problem (

|V′|= 2

) and MST problem (

|V′|=v(G)

). Exact

[

], heuristic [

], metaheuristic [

] algorithms and approximation

schemes [

] have been devised to address this problem. The most recent breakthrough lies in the 11th DIMACS

(the Center for Discrete Mathematics and Theoretical Computer Science) Implementation Challenge and the PACE

2018 Parameterized Algorithms and Computational Experiments Challenge [

]. In Figure 1d, the star graph with

internal vertex v1and leaves v2,v4and v5is the Steiner tree connecting {v2, v4, v5}.

2.2 Transformer network

The Transformer [

] is a neural network architecture that was introduced in 2017 by Vaswani et al. It was designed to

address some of the limitations of traditional recurrent neural networks (RNNs) and convolutional neural networks

(CNNs) in processing sequential data. The Transformer has become a popular choice for various natural language

processing (NLP) tasks, such as language translation, language modeling, and text classiﬁcation.

Accepted for publication in SCIENCE CHINA Mathematics

The Transformer model architecture, as illustrated in Figure 2a, consists of two main components: the encoder and the

decoder. The encoder takes the input sequence and produces a set of hidden states that represent the input sequence.

The decoder takes the output of the encoder and produces the output sequence. Both the encoder and decoder consist of

multiple layers of multi-head attention (MHA) and feed-forward networks (FFN). The MHA mechanism is the key

component that allows the Transformer to model long-range dependencies in the input sequence.

Self-attention is a mechanism that allows the model to attend to different parts of the input sequence to compute a

representation for each part. The self-attention mechanism employs query and key vectors to compute the attention

weights, which determine the contribution of each element in the input sequence to the ﬁnal output. The attention

weights are computed by taking the dot product of the query and key vectors, dividing the result by the square root

of the dimensionality of the key vectors, and normalizing the scores using the softmax function as depicted in Figure

2c. MHA extends self-attention by performing attention over multiple projections of the input sequence. In MHA,

the input sequence is projected into multiple subspaces, and attention is performed separately on each subspace. This

allows the model to attend to different aspects of the input sequence simultaneously, providing a richer representation

of the input sequence. Speciﬁcally, the MHA block with residual connection (RC) and batch normalization (BN)

can be deﬁned as a parameterized function class

MHA : Rn×d×Rm×d×Rm×d× {0,1}n×m→Rn×d

given by

MHA(Q, K, V, M ) = BN(Q+O), where

h=1 softmax M⊙(QW q

h+1nbq)(KW k

h+1mbk)T/√d(V W v

h+1mbv)Wo

h+1nbo,(1)

is the matrix concatenation operator,

is the embedding dimension of the model,

is the number of parallel attention

heads satisfying that

is divisible by

⊙

represents the element-wise product,

1n= [1 ··· 1]T∈Rn×1

, and

h, W k

h, W v

h∈Rd×(d/H)

bq, bk, bv∈R1×d

Wo∈Rd×d

bo∈R1×d

and

h∈ {1, . . . , H}

are trainable parameters.

The MHA block is shown in Figure 2b.

The FFN with RC and BN can be similarly deﬁned as FFN : Rn×d→Rn×d, giving

FFN(X) = BN (X+ (ReLU(XW1+1nb1)W2+1nb2)) ,(2)

where

is the dimension of hidden layer, and

W1∈Rd×df

b1∈R1×df

W2∈Rdf×d

and

b2∈R1×d

are trainable

parameters.

Input

Embedding

Output

Embedding

Add&Norm

Multi-Head

Attention

Add & Norm

Multi-Head

Attention

Add & Norm

Masked

Multi-Head

Attention

Add&Norm

Feed

Forward

Add & Norm

Feed

Forward

Linear

Softmax

Inputs Outputs

(shifted right)

Output

Probabilities

N×

Positional

Encoding

Positional

Encoding

(a) Transformer model

V K Q

Concat

Linear

Linear Linear Linear

Scaled Dot-Product

Attention

Linear Linear Linear

Scaled Dot-Product

Attention

Linear Linear Linear

Scaled Dot-Product

Attention

(b) Multi-head attention

QK V

MatMul

Scale

Mask

Softmax

MatMul

attention

Figure 2: The Transformer model architecture.

3 Method

Our approach begins by deﬁning an MDP for general combinatorial optimization problems on graphs. We then present

the NeuroPrim algorithm and its constituent components for solving three variants of the spanning tree problem (i.e.,

文档加载中……请稍候！
如果长时间未打开，您也可以点击刷新试试。

下载文档到电脑，查找使用更方便

10 玖币 0人已下载

立即下载

摘要：

NEUROPRIM:ANATTENTION-BASEDMODELFORSOLVINGNP-HARDSPANNINGTREEPROBLEMSYuchenShiDepartmentofMathematicalSciencesUniversityofChineseAcademyofSciencesBeijing100049,Chinashiyuchen20@mails.ucas.ac.cnCongyingHanDepartmentofMathematicalSciencesUniversityofChineseAcademyofSciencesBeijing100049,Chinahancy@uca...

展开>> 收起<<

NEURO PRIM ANATTENTION -BASED MODEL FOR SOLVING NP- HARD SPANNING TREE PROBLEMS Yuchen Shi.pdf

共17页,预览4页

还剩页未读，继续阅读

声明：本站为文档C2C交易模式，即用户上传的文档直接被用户下载，本站只是中间服务平台，本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私，请立即通知玖贝云文库，我们立即给予删除！

NEURO PRIM ANATTENTION -BASED MODEL FOR SOLVING NP- HARD SPANNING TREE PROBLEMS Yuchen Shi

相关推荐

开通VIP享超值会员特权

作者详情

相关内容

热门标签

举报选择: