NEUROPRIM: AN ATTENTION-BASED MODEL FOR SOLVING NP-HARD SPANNING TREE PROBLEMS
Yuchen Shi
Department of Mathematical Sciences
University of Chinese Academy of Sciences
Beijing 100049, China
shiyuchen20@mails.ucas.ac.cn
Congying Han
Department of Mathematical Sciences
University of Chinese Academy of Sciences
Beijing 100049, China
hancy@ucas.ac.cn
Tiande Guo
Department of Mathematical Sciences
University of Chinese Academy of Sciences
Beijing 100049, China
tdguo@ucas.ac.cn
June 13, 2023
ABSTRACT
Spanning tree problems with specialized constraints can be difficult to solve in real-world scenarios,
often requiring intricate algorithmic design and exponential time. Recently, there has been growing
interest in end-to-end deep neural networks for solving routing problems. However, such methods
typically produce sequences of vertices, which makes it difficult to apply them to general combina-
torial optimization problems where the solution set consists of edges, as in various spanning tree
problems. In this paper, we propose NeuroPrim, a novel framework for solving various spanning
tree problems by defining a Markov Decision Process (MDP) for general combinatorial optimization
problems on graphs. Our approach reduces the action and state space using Prim’s algorithm and
trains the resulting model using REINFORCE. We apply our framework to three difficult problems in Euclidean space: the Degree-constrained Minimum Spanning Tree (DCMST) problem, the Minimum Routing Cost Spanning Tree (MRCST) problem, and the Steiner Tree Problem in graphs (STP). Experimental results on literature instances demonstrate that our model outperforms strong heuristics and achieves small optimality gaps on instances with up to 250 vertices. Additionally, we find that our model has strong generalization ability, with no significant degradation observed on problem instances with as many as 1000 vertices. Our results suggest that our framework can be effective for solving a wide range of combinatorial optimization problems beyond spanning tree problems.
Keywords: degree-constrained minimum spanning tree problem · minimum routing cost spanning tree problem · Steiner tree problem in graphs · Prim's algorithm · reinforcement learning
1 Introduction
Network design problems are ubiquitous and arise in various practical applications. For instance, when designing a road network, one needs to ensure that the number of roads meeting at any intersection does not exceed a certain limit, giving rise to a Degree-constrained Minimum Spanning Tree (DCMST) problem. Similarly, designing a communication network requires minimizing the average cost of communication between any two users, which can be formulated as a Minimum Routing Cost Spanning Tree (MRCST) problem. In addition, minimizing the cost of opening facilities and laying out the communication network while maintaining interconnectivity of the facilities is critical for telecommunication network infrastructure.
Corresponding author
arXiv:2210.12453v2 [cs.LG] 10 Jun 2023. Accepted for publication in SCIENCE CHINA Mathematics.
This problem can be modeled as a Steiner Tree Problem in graphs (STP). Although these problems all involve finding
spanning trees in graphs, they have different constraints and objective functions, requiring heuristic algorithms tailored
to their specific problem characteristics, without a uniform solution framework. Moreover, unlike the Minimum Spanning Tree (MST) problem, which can be solved in polynomial time by exact algorithms such as Prim's algorithm [56], finding optimal solutions for these problems requires computation time exponential in the problem size, making exact approaches impractical for real-world applications.
In contrast to conventional algorithms that are designed based on expert knowledge, end-to-end models leveraging deep neural networks have achieved notable success in tackling a wide range of combinatorial optimization problems [35, 36]. However, most methods treat the problem as permuting a sequence of vertices, e.g., allocating resources in the knapsack problem [24], finding the shortest travel route in routing problems [6, 46, 53, 75, 77, 78], approximating minimum vertex cover and maximum cut on graphs [21], learning elimination orders in tree decomposition [43], locating facilities in the p-median problem [76], and determining Steiner points in the STP [2]. For problems that require decision-making on edge sets, such as the MST problem, there are hardly any methods that handle the edge generation task directly, since the connectivity and acyclicity of the solution must be taken into account. [26] proposed to first transform edges into vertices by constructing line graphs, and then use an end-to-end approach to tackle various graph problems, including MST. This method, however, is not suitable for dense graphs, as their line graphs are quite large. [27] presented the use of Graph Neural Networks and Reinforcement Learning (RL) to solve the STP, whose decoder, after outputting a vertex, selects the smallest-weight edge connecting that vertex to the partial solution. However, this method cannot be applied to problems where the objective function is not the sum of the weights of all edges.
This paper proposes NeuroPrim, a novel framework that utilizes the successful Transformer model to solve three
challenging spanning tree problems in Euclidean space, namely the DCMST, MRCST and STP problems. The algorithm
is based on a Markov Decision Process (MDP) for general combinatorial optimization problems on graphs, which
significantly reduces the action and state spaces by maintaining the connectivity of partially feasible solutions, similar
to Prim’s algorithm. The framework is parameterized and trained with problem-related masks. Our goal is not to
outperform existing solvers; instead, we demonstrate that our approach can produce approximate solutions to various
sizes and variants of spanning tree problems quickly, with minimal modifications required after training. Our approach’s
effectiveness may provide insights into developing algorithms for problems with complex constraints and few heuristics.
Our work makes several significant contributions, including:

• We present a novel Markov Decision Process (MDP) framework for general combinatorial optimization problems on graphs. This framework allows all constructive algorithms for solving these problems to be viewed as greedy rollouts of some policy under the MDP.

• We propose NeuroPrim, a unified framework that leverages the MDP to solve problems that require decisions on both vertex and edge sets. Our framework is designed to be easily adaptable, requiring only minimal modifications to masks and rewards to solve different problems.

• To the best of our knowledge, we are the first to use a sequence-to-sequence model for solving complex spanning tree problems. We demonstrate the effectiveness of our approach by applying it to three challenging variants of the spanning tree problem in Euclidean space, achieving near-optimal solutions in a short time and outperforming strong heuristics.
2 Preliminaries
In this section, we describe the problems investigated and the techniques employed. Firstly, we present three vari-
ants of the spanning tree problem under investigation. Subsequently, we introduce the networks utilized for policy
parameterization.
2.1 Spanning tree problems
Given an undirected weighted graph $G = (V(G), E(G), w)$ with $v(G)$ vertices, $e(G)$ edges and $w : E(G) \to \mathbb{R}$ specifying the weight of each edge, a spanning tree of $G$ is a subgraph that is a tree and includes all vertices of $G$. Among all combinatorial optimization problems on spanning trees, the Minimum Spanning Tree (MST) problem is one of the most typical and well known. The problem seeks a spanning tree with the least total edge weight and can be solved in time $O(e(G) \log v(G))$ by Borůvka's algorithm [10], Prim's algorithm [56] and Kruskal's algorithm [47].
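For reference, here is a minimal Python sketch of Prim's algorithm; the adjacency-list representation and names are our own illustration, not code from the paper:

```python
import heapq

def prim_mst(n, adj):
    """Prim's algorithm on a connected undirected graph.

    n   -- number of vertices, labeled 0..n-1
    adj -- adjacency lists: adj[u] is a list of (weight, v) pairs
    Returns the list of tree edges and their total weight.
    """
    visited = [False] * n
    edges, total = [], 0.0
    heap = [(0.0, 0, -1)]  # (edge weight, vertex to add, its parent)
    while heap:
        w, u, parent = heapq.heappop(heap)
        if visited[u]:
            continue
        visited[u] = True
        if parent >= 0:
            edges.append((parent, u))
            total += w
        for wt, v in adj[u]:
            if not visited[v]:
                heapq.heappush(heap, (wt, v, u))
    return edges, total
```

With a binary heap this runs in $O(e(G) \log v(G))$, matching the bound quoted above; the MDP of Section 3 keeps the same grow-a-connected-tree structure but replaces the greedy edge choice with a learned policy.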
However, with additional constraints or modifications to the objective function, it becomes more difficult to solve these
problems to optimality. Three variants of the spanning tree problem are considered in this paper. For each variant, an
example in Euclidean space is shown in Figure 1.
[Figure 1: (a) Euclidean graph; (b) DCMST (d = 2); (c) MRCST; (d) Steiner tree.]
Figure 1: Examples of optimal solutions to different spanning tree problems. All vertices to be spanned (i.e., all vertices in the DCMST and MRCST problems, and the terminals in the STP) are colored in blue. The black lines in (a) indicate feasible edges in the graph, and in (b), (c), (d) edges in the optimal solution. (a) A complete Euclidean graph with five vertices. (b) The MST, which is also a DCMST ($d = 2$) of the graph, with a total weight of $3 + \sqrt{5}/2$. (c) The MRCST, with an overall pairwise distance of $8 + 4\sqrt{2} + 2\sqrt{5}$. (d) A Steiner tree connecting the terminals $\{v_2, v_4, v_5\}$ with a total weight of $2 + \sqrt{5}/2$, where $v_1$ is the Steiner point.
2.1.1 Degree-constrained Minimum Spanning Tree problem
Given an integer $d \geq 2$, the degree-constrained minimum spanning tree (DCMST) problem (also known as the minimum bounded degree spanning tree problem) involves finding an MST whose maximum degree is at most $d$. The problem was first formulated by [52]. Its NP-hardness can be shown by a reduction from the Hamiltonian Path Problem, which corresponds to the case $d = 2$ [33]. Several exact [3, 7, 16, 20, 30], heuristic [8] and metaheuristic [13, 14, 23, 25, 58, 67] algorithms and approximation schemes [32, 34, 66] have been proposed to tackle this problem. In Figure 1b, the path $v_2 v_3 v_4 v_1 v_5$ is the MST with maximum degree 2.
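To illustrate how the degree bound interacts with a Prim-style construction, consider the following greedy sketch; it is our own hypothetical heuristic, not one of the cited algorithms. It grows a connected tree but only considers crossing edges whose already-inserted endpoint still has residual degree, so it need not return an optimal tree and can even fail to span the graph:

```python
import heapq

def degree_bounded_prim(n, adj, d):
    """Prim-like greedy keeping every vertex degree at most d.

    A heuristic sketch only: it may return a partial tree if the
    degree bound prunes every remaining crossing edge.
    adj[u] is a list of (weight, v) pairs, as in prim_mst above.
    """
    degree = [0] * n
    visited = [False] * n
    visited[0] = True
    edges = []
    heap = [(w, 0, v) for w, v in adj[0]]
    heapq.heapify(heap)
    while heap and len(edges) < n - 1:
        w, u, v = heapq.heappop(heap)
        if visited[v] or degree[u] >= d:
            continue  # v is already spanned, or u is saturated
        visited[v] = True
        degree[u] += 1
        degree[v] += 1
        edges.append((u, v))
        for wt, x in adj[v]:
            if not visited[x]:
                heapq.heappush(heap, (wt, v, x))
    return edges
```

NeuroPrim enforces the same constraint not by a fixed greedy rule but through its problem-related masks during decoding (Section 3).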
2.1.2 Minimum Routing Cost Spanning Tree problem
The minimum routing cost spanning tree (MRCST) problem (also known as the optimum distance spanning tree problem) aims to minimize the sum of pairwise distances between vertices in the tree, i.e., to find a spanning tree $H$ of $G$ that minimizes $\sum_{(x,y) \in V(G) \times V(G),\, x < y} d_H(x, y)$, where $d_H(x, y)$ is the total weight of the unique path between vertices $x, y \in V(G)$ in the tree $H$. The problem was first stated by [63], and its NP-hardness was established by [39]. Exact [1, 19, 31, 72, 81], heuristic [17, 51, 62, 64, 71] and metaheuristic [61, 65, 70] algorithms and approximation schemes [79, 80] have been developed for this problem. In Figure 1c, the star graph with internal vertex $v_1$ and leaves $v_2$, $v_3$, $v_4$ and $v_5$ is the MRCST, and its routing cost is 4 times its total weight. (Note that deleting any edge splits the tree into two connected components of orders 1 and 4, so each edge lies on $1 \cdot 4 = 4$ pairwise paths and contributes 4 times its weight to the routing cost.)
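The parenthetical above generalizes: deleting an edge $e$ from a spanning tree on $n$ vertices leaves components of orders $n_1$ and $n - n_1$, so $e$ lies on exactly $n_1(n - n_1)$ pairwise paths and contributes $n_1(n - n_1)\,w(e)$ to the routing cost. A short Python sketch computing the routing cost this way (the function and variable names are ours):

```python
def routing_cost(n, tree_edges, weight):
    """Routing cost of a spanning tree via edge contributions.

    tree_edges -- list of (u, v) pairs forming a tree on vertices 0..n-1
    weight     -- dict mapping (u, v) to the edge weight
    An edge with n1 vertices on one side contributes n1 * (n - n1) * w.
    """
    adj = {u: [] for u in range(n)}
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)

    def subtree_size(child, parent):
        # Count vertices on the 'child' side of the edge (parent, child).
        size, stack = 0, [(child, parent)]
        while stack:
            x, p = stack.pop()
            size += 1
            for y in adj[x]:
                if y != p:
                    stack.append((y, x))
        return size

    cost = 0.0
    for u, v in tree_edges:
        n1 = subtree_size(v, u)
        cost += n1 * (n - n1) * weight[u, v]
    return cost
```

For the star in Figure 1c, every edge has $n_1 = 1$, giving the factor $1 \cdot 4 = 4$ used above.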
2.1.3 Steiner Tree Problem in graphs
Given a subset of vertices $V' \subseteq V(G)$ called terminals, the Steiner tree problem in graphs (STP) asks for a tree in $G$ that spans $V'$ with minimum weight, i.e., it seeks a set $S \subseteq V(G) \setminus V'$ that minimizes the cost of the MST of $G[V' \cup S]$, where $G[V' \cup S]$ is the subgraph induced by $V' \cup S$ and the vertices in $S$ are called Steiner points. The problem is NP-hard [42] and has two polynomially solvable variants: the shortest path problem ($|V'| = 2$) and the MST problem ($|V'| = v(G)$). Exact [22, 45, 50, 54], heuristic [28, 60, 69, 73] and metaheuristic [5, 11, 29, 38, 41, 55, 57, 59] algorithms and approximation schemes [4, 15, 18] have been devised to address this problem. The most recent breakthroughs stem from the 11th DIMACS (the Center for Discrete Mathematics and Theoretical Computer Science) Implementation Challenge and the PACE 2018 Parameterized Algorithms and Computational Experiments Challenge [9]. In Figure 1d, the star graph with internal vertex $v_1$ and leaves $v_2$, $v_4$ and $v_5$ is the Steiner tree connecting $\{v_2, v_4, v_5\}$.
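For intuition about how such heuristics work, the classical distance-network construction (in the spirit of the heuristics cited above) achieves a factor-2 approximation: compute shortest paths between terminals, take an MST of the resulting metric closure, and expand each closure edge back into its underlying path. The sketch below is our own simplified Python rendering; a complete implementation would additionally take an MST of the returned edge set and prune non-terminal leaves.

```python
import heapq
import itertools

def dijkstra(n, adj, src):
    """Single-source shortest paths; returns (dist, parent) arrays."""
    dist = [float("inf")] * n
    parent = [-1] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist[u]:
            continue
        for w, v in adj[u]:
            if du + w < dist[v]:
                dist[v] = du + w
                parent[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, parent

def steiner_2approx(n, adj, terminals):
    """Distance-network heuristic for the STP (2-approximation sketch)."""
    sp = {t: dijkstra(n, adj, t) for t in terminals}
    # Kruskal's algorithm on the metric closure over the terminals,
    # with a tiny union-find to detect cycles.
    root = list(range(len(terminals)))
    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i
    closure = sorted(
        (sp[terminals[i]][0][terminals[j]], i, j)
        for i, j in itertools.combinations(range(len(terminals)), 2)
    )
    edges = set()
    for _, i, j in closure:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue
        root[ri] = rj
        # Expand the closure edge into its shortest path in G.
        parent = sp[terminals[i]][1]
        v = terminals[j]
        while parent[v] != -1:
            edges.add((min(v, parent[v]), max(v, parent[v])))
            v = parent[v]
    return edges
```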
2.2 Transformer network
The Transformer [
74
] is a neural network architecture that was introduced in 2017 by Vaswani et al. It was designed to
address some of the limitations of traditional recurrent neural networks (RNNs) and convolutional neural networks
(CNNs) in processing sequential data. The Transformer has become a popular choice for various natural language
processing (NLP) tasks, such as language translation, language modeling, and text classification.
The Transformer model architecture, as illustrated in Figure 2a, consists of two main components: the encoder and the
decoder. The encoder takes the input sequence and produces a set of hidden states that represent the input sequence.
The decoder takes the output of the encoder and produces the output sequence. Both the encoder and decoder consist of
multiple layers of multi-head attention (MHA) and feed-forward networks (FFN). The MHA mechanism is the key
component that allows the Transformer to model long-range dependencies in the input sequence.
Self-attention is a mechanism that allows the model to attend to different parts of the input sequence to compute a
representation for each part. The self-attention mechanism employs query and key vectors to compute the attention
weights, which determine the contribution of each element in the input sequence to the final output. The attention
weights are computed by taking the dot product of the query and key vectors, dividing the result by the square root
of the dimensionality of the key vectors, and normalizing the scores using the softmax function as depicted in Figure
2c. MHA extends self-attention by performing attention over multiple projections of the input sequence. In MHA,
the input sequence is projected into multiple subspaces, and attention is performed separately on each subspace. This
allows the model to attend to different aspects of the input sequence simultaneously, providing a richer representation
of the input sequence. Specifically, the MHA block with residual connection (RC) and batch normalization (BN) can be defined as a parameterized function class $\mathrm{MHA} : \mathbb{R}^{n \times d} \times \mathbb{R}^{m \times d} \times \mathbb{R}^{m \times d} \times \{0,1\}^{n \times m} \to \mathbb{R}^{n \times d}$ given by $\mathrm{MHA}(Q, K, V, M) = \mathrm{BN}(Q + O)$, where

$$O = \Big( \big\Vert_{h=1}^{H} \operatorname{softmax}\!\big( M \odot (Q W_h^q + \mathbf{1}_n b^q)(K W_h^k + \mathbf{1}_m b^k)^{\mathsf{T}} / \sqrt{d} \,\big) (V W_h^v + \mathbf{1}_m b^v) \Big) W^o + \mathbf{1}_n b^o, \qquad (1)$$
$\Vert$ is the matrix concatenation operator, $d$ is the embedding dimension of the model, $H$ is the number of parallel attention heads satisfying that $d$ is divisible by $H$, $\odot$ represents the element-wise product, $\mathbf{1}_n = [1 \cdots 1]^{\mathsf{T}} \in \mathbb{R}^{n \times 1}$, and $W_h^q, W_h^k, W_h^v \in \mathbb{R}^{d \times (d/H)}$, $b^q, b^k, b^v \in \mathbb{R}^{1 \times (d/H)}$, $W^o \in \mathbb{R}^{d \times d}$ and $b^o \in \mathbb{R}^{1 \times d}$ for $h \in \{1, \ldots, H\}$ are trainable parameters. The MHA block is shown in Figure 2b.
The FFN with RC and BN can be similarly defined as $\mathrm{FFN} : \mathbb{R}^{n \times d} \to \mathbb{R}^{n \times d}$, given by

$$\mathrm{FFN}(X) = \mathrm{BN}\big(X + \mathrm{ReLU}(X W_1 + \mathbf{1}_n b_1) W_2 + \mathbf{1}_n b_2\big), \qquad (2)$$

where $d_f$ is the dimension of the hidden layer, and $W_1 \in \mathbb{R}^{d \times d_f}$, $b_1 \in \mathbb{R}^{1 \times d_f}$, $W_2 \in \mathbb{R}^{d_f \times d}$ and $b_2 \in \mathbb{R}^{1 \times d}$ are trainable parameters.
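To make equations (1) and (2) concrete, here is a NumPy sketch of a single forward pass. It is an illustration under our own simplifications: parameters are passed in explicitly, and BN is replaced by a plain per-feature standardization rather than a trainable batch-norm layer.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def norm(x, eps=1e-5):
    # Stand-in for the BN in equations (1)-(2): per-feature standardization.
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)

def mha(Q, K, V, M, p):
    """Forward pass of the MHA block in equation (1).

    Q: (n, d); K, V: (m, d); M: (n, m) mask with entries in {0, 1}.
    p["Wq"], p["Wk"], p["Wv"]: (H, d, d/H); p["bq"], p["bk"], p["bv"]: (d/H,);
    p["Wo"]: (d, d); p["bo"]: (d,).
    """
    d = Q.shape[1]
    heads = []
    for Wq, Wk, Wv in zip(p["Wq"], p["Wk"], p["Wv"]):
        q = Q @ Wq + p["bq"]                 # (n, d/H)
        k = K @ Wk + p["bk"]                 # (m, d/H)
        v = V @ Wv + p["bv"]                 # (m, d/H)
        scores = M * (q @ k.T) / np.sqrt(d)  # mask applied as in (1)
        heads.append(softmax(scores) @ v)    # (n, d/H)
    O = np.concatenate(heads, axis=1) @ p["Wo"] + p["bo"]
    return norm(Q + O)

def ffn(X, W1, b1, W2, b2):
    """Forward pass of the FFN block in equation (2)."""
    return norm(X + np.maximum(X @ W1 + b1, 0.0) @ W2 + b2)
```

Note that equation (1) masks multiplicatively, zeroing excluded scores before the softmax; practical implementations usually set masked scores to $-\infty$ instead, so that excluded positions receive exactly zero attention weight.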
[Figure 2: The Transformer model architecture. (a) Transformer model; (b) multi-head attention; (c) scaled dot-product attention.]
3 Method
Our approach begins by defining an MDP for general combinatorial optimization problems on graphs. We then present
the NeuroPrim algorithm and its constituent components for solving three variants of the spanning tree problem (i.e.,