ON REPRESENTING MIXED-INTEGER LINEAR PROGRAMS BY
GRAPH NEURAL NETWORKS
ZIANG CHEN, JIALIN LIU, XINSHANG WANG, JIANFENG LU, AND WOTAO YIN
Abstract. While Mixed-integer linear programming (MILP) is NP-hard in general, practical MILP has received a roughly 100-fold speedup in the past twenty years. Still, many classes of MILPs quickly become unsolvable as their sizes increase, motivating researchers to seek new acceleration techniques for MILPs. With deep learning, they have obtained strong empirical results, and many of these results were obtained by applying graph neural networks (GNNs) to make decisions in various stages of MILP solution processes. This work discovers a fundamental limitation: there exist feasible and infeasible MILPs that all GNNs will, however, treat equally, indicating GNNs' lacking power to express general MILPs. Then, we show that, by restricting the MILPs to unfoldable ones or by adding random features, there exist GNNs that can reliably predict MILP feasibility, optimal objective values, and optimal solutions up to prescribed precision. We conducted small-scale numerical experiments to validate our theoretical findings.
1. Introduction
Mixed-integer linear programming (MILP) is a class of optimization problems that minimize a linear objective function subject to linear constraints, where some or all variables must take integer values. MILP has a wide range of applications, such as transportation [39], control [34], and scheduling [16]. Branch and Bound (B&B) [27], an algorithm widely adopted in modern solvers that exactly solves general MILPs to global optimality, unfortunately has exponential time complexity in the worst case. To make MILP more practical, researchers have to analyze the features of each instance of interest based on their domain knowledge, and use such features to adaptively warm-start B&B or design the heuristics in B&B.
To automate this laborious process, researchers have turned their attention to machine learning (ML) techniques in recent years [7]. The literature has reported some encouraging findings that a properly chosen ML model is able to learn useful knowledge of MILP from data and generalize well to similar but unseen instances. For example, one can learn fast approximations of Strong Branching, an effective but time-consuming branching strategy usually used in B&B [2, 24, 28, 46]. One may also learn cutting strategies [9, 23, 42], node selection/pruning strategies [22, 45], or decomposition strategies [41] with ML models. The role of ML models
in those approaches can be summarized as approximating useful mappings or parameterizing key strategies in MILP solvers; these mappings/strategies usually take an MILP instance as input and output its key properties.

Date: May 29, 2023.
A major part of the work of Z. Chen was completed during his internship at Alibaba US DAMO Academy.
Corresponding author: Jialin Liu, jialin.liu@alibaba-inc.com.
The graph neural network (GNN), due to its nice properties such as permutation invariance, is considered a suitable model to represent such mappings/strategies for MILP. More specifically, permuting the variables or constraints of an MILP does not essentially change the problem itself; reliable ML models such as GNNs should satisfy this invariance, otherwise the model may overfit to the variable/constraint order in the training data. [17] proposed that an MILP can be encoded into a bipartite graph on which one can use a GNN to approximate Strong Branching. [13] proposed to represent an MILP with a tripartite graph. Since then, GNNs have been adopted to represent mappings/strategies for MILP, for example, approximating Strong Branching [20, 21, 31, 40], approximating optimal solutions [25, 31], parameterizing cutting strategies [32], and parameterizing branching strategies [33, 38].
However, theoretical foundations in this direction still remain unclear. A key problem is the ability of GNNs to approximate important mappings related to MILP. We ask the following questions:
(Q1) Is GNN able to predict whether an MILP is feasible?
(Q2) Is GNN able to approximate the optimal objective value of an MILP?
(Q3) Is GNN able to approximate an optimal solution of an MILP?
To answer questions (Q1 - Q3), one needs theories of the separation power and representation power of GNNs. The separation power of a GNN is measured by whether it can distinguish two non-isomorphic graphs. Given two graphs $G_1, G_2$, we say a mapping $F$ (e.g., a GNN) has strong separation power if $F(G_1) \neq F(G_2)$ as long as $G_1$ and $G_2$ are not isomorphic. In our setting, since MILPs are represented by graphs, the separation power actually indicates the ability of a GNN to distinguish two different MILP instances. The representation power of GNNs refers to how well they can approximate mappings with permutation-invariance properties. In our setting, we study whether GNNs can map an MILP to its feasibility, optimal objective value, and an optimal solution. The separation power and representation power of GNNs are closely related to the Weisfeiler-Lehman (WL) test [43], a classical algorithm for identifying whether two graphs are isomorphic or not. It has been shown that GNNs have the same separation power as the WL test [44], and, based on this result, GNNs can universally approximate continuous graph-input mappings with separation power no stronger than the WL test [5, 18].
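To make the WL test concrete, the following Python snippet is a minimal sketch of 1-dimensional WL color refinement on a labeled graph; the function name, the hash-based color encoding, and the two toy graphs are our own illustrative choices, not taken from [43, 44].

```python
from collections import Counter

def wl_color_histogram(adj, labels, num_rounds=3):
    """1-dimensional WL color refinement (illustrative sketch).

    adj: dict vertex -> list of neighbors; labels: dict vertex -> initial label.
    Two graphs with different color histograms are certainly non-isomorphic;
    equal histograms are inconclusive (the test may fail to separate them).
    """
    colors = dict(labels)
    for _ in range(num_rounds):
        # Refine: each vertex hashes its own color together with the
        # sorted multiset of its neighbors' colors.
        colors = {v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

# A 6-cycle and two disjoint triangles: both are 2-regular with 6 vertices,
# so the WL test cannot tell them apart, although they are not isomorphic.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
uniform = {v: 0 for v in range(6)}
print(wl_color_histogram(cycle6, uniform) == wl_color_histogram(two_triangles, uniform))  # True
```

The second comparison (a 6-cycle versus two disjoint triangles) is a standard example where the WL test, and hence any GNN with the same separation power, fails to distinguish two non-isomorphic graphs.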
Our contributions. With the above tools in hand, one still cannot directly answer questions (Q1 - Q3), since the relationship between characteristics of general MILPs and properties of graphs is not yet clear. Although there are some works studying the representation power of GNNs on graph-related optimization problems [30, 36] and linear programming [12], representing general MILPs with GNNs has, to the best of our knowledge, not been theoretically studied. Our contributions are listed below:
(Limitation of GNNs for MILP) We show with an example that GNNs do not have strong enough separation power to distinguish every pair of different MILP instances. There exist two MILPs such that one of them is feasible while the other one is not, but, unfortunately, all GNNs treat them equally, failing to detect the essential difference between them. In fact, there are infinitely many pairs of MILP instances that can puzzle GNNs in this way.
(Foldable and unfoldable MILPs) We provide a precise mathematical description of what type of MILPs makes GNNs fail. These hard MILP instances are named foldable MILPs. We prove that, for unfoldable MILPs, GNNs have strong enough separation and representation power to approximate the feasibility, the optimal objective value, and an optimal solution.
(MILP with random features) To handle those foldable MILPs, we propose to append random
features to the MILP-induced graphs. We prove that, with the random feature technique,
the answers to questions (Q1 - Q3) are affirmative.
The answers to (Q1 - Q3) serve as foundations for a more practical question: are GNNs able to predict branching strategies or primal heuristics for MILP? Although the answers to (Q1) and (Q2) do not directly answer that practical question, they illustrate the possibility that GNNs can capture some key information of an MILP instance and have the capacity to
suggest an adaptive branching strategy for each instance. To obtain a GNN-based strategy,
practitioners should consider more factors: feature spaces, action spaces, training methods,
generalization performance, etc., and some recent empirical studies show encouraging results
on learning such a strategy [17, 20, 21, 25, 31, 33, 38, 40]. The answer to (Q3) directly shows the
possibility of learning primal heuristics. With proper model design and training methods, one
could obtain competitive GNN-based primal heuristics [13, 31].
The rest of this paper is organized as follows. We state some preliminaries in Section 2.
The limitation of GNN for MILP is presented in Section 3 and we provide the descriptions
of foldable and unfoldable MILPs in Section 4. The random feature technique is introduced
in Section 5. Section 6 contains some numerical results and the whole paper is concluded in
Section 7.
2. Preliminaries
In this section, we introduce some preliminaries that will be used throughout this paper.
Our notations, definitions, and examples follow [12]. Consider a general MILP problem defined as
$$\min_{x \in \mathbb{R}^n} \; c^\top x, \quad \text{s.t.} \; Ax \circ b, \; l \le x \le u, \; x_j \in \mathbb{Z}, \; \forall\, j \in I, \tag{2.1}$$
where $A \in \mathbb{R}^{m \times n}$, $c \in \mathbb{R}^n$, $b \in \mathbb{R}^m$, $l \in (\mathbb{R} \cup \{-\infty\})^n$, $u \in (\mathbb{R} \cup \{+\infty\})^n$, and $\circ \in \{\le, =, \ge\}^m$. The index set $I \subseteq \{1, 2, \dots, n\}$ collects those indices $j$ for which $x_j$ is constrained to be an integer. The feasible set is defined as $X_{\mathrm{feasible}} := \{x \in \mathbb{R}^n \mid Ax \circ b,\; l \le x \le u,\; x_j \in \mathbb{Z},\; \forall\, j \in I\}$, and we say an MILP is infeasible if $X_{\mathrm{feasible}} = \emptyset$ and feasible otherwise. For feasible MILPs, $\inf\{c^\top x : x \in X_{\mathrm{feasible}}\}$ is named the optimal objective value. If there exists $x^* \in X_{\mathrm{feasible}}$ with $c^\top x^* \le c^\top x$ for all $x \in X_{\mathrm{feasible}}$, then we say that $x^*$ is an optimal solution. It is possible that the objective value is arbitrarily good, i.e., for any $R > 0$, $c^\top \hat{x} < -R$ holds for some $\hat{x} \in X_{\mathrm{feasible}}$. In this case, we say the MILP is unbounded or its optimal objective value is $-\infty$.
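As a concrete, hedged illustration of the data $(A, b, c, l, u, \circ, I)$ in (2.1), the snippet below builds and solves a toy instance with SciPy's mixed-integer solver `scipy.optimize.milp` (available in SciPy 1.9+); the particular instance, numbers, and relation choices are ours, not from the paper.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Toy instance of (2.1): min c^T x  s.t.  A x "circ" b,  l <= x <= u,  x_2 in Z.
c = np.array([-1.0, -2.0])
A = np.array([[1.0,  1.0],
              [1.0, -1.0]])
# Each row's relation "circ_i" is encoded by lower/upper bounds on (A x)_i:
# "<= b_i" becomes (-inf, b_i), ">= b_i" becomes (b_i, +inf), "=" becomes (b_i, b_i).
constraints = LinearConstraint(A, lb=[-np.inf, -2.0], ub=[4.0, np.inf])
bounds = Bounds(lb=[0.0, 0.0], ub=[3.0, 3.0])   # the vectors l and u
integrality = np.array([0, 1])                   # tau_j = 1 exactly when j is in I

res = milp(c=c, constraints=constraints, integrality=integrality, bounds=bounds)
print(res.status, res.x, res.fun)  # status 0: optimal; expected x = (1, 3), value -7
```

The graph encoding introduced next stores exactly this data as vertex features and edge weights.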
2.1. MILP as a weighted bipartite graph with vertex features. Inspired by [17] and following [12], we represent an MILP by a weighted bipartite graph. The vertex set of such a graph is $V \cup W$, where $V = \{v_1, v_2, \dots, v_m\}$ with $v_i$ representing the $i$-th constraint, and $W = \{w_1, w_2, \dots, w_n\}$ with $w_j$ representing the $j$-th variable. To fully represent all information in (2.1), we associate each vertex with features: the vertex $v_i \in V$ is equipped with a feature vector $h^V_i = (b_i, \circ_i)$ chosen from $\mathcal{H}^V = \mathbb{R} \times \{\le, =, \ge\}$. The vertex $w_j \in W$ is equipped with a feature vector $h^W_j = (c_j, l_j, u_j, \tau_j)$, where $\tau_j = 1$ if $j \in I$ and $\tau_j = 0$ otherwise. The feature $h^W_j$ lies in the space $\mathcal{H}^W = \mathbb{R} \times (\mathbb{R} \cup \{-\infty\}) \times (\mathbb{R} \cup \{+\infty\}) \times \{0, 1\}$. The edge $E_{i,j} \in \mathbb{R}$ connects $v_i \in V$ and $w_j \in W$, and its value is defined by $E_{i,j} = A_{i,j}$. Note that there is no edge connecting vertices in the same vertex group ($V$ or $W$), and $E_{i,j} = 0$ if there is no connection between $v_i$ and $w_j$. Thus, the whole graph is denoted by $G = (V \cup W, E)$, and we denote by $\mathcal{G}_{m,n}$ the collection of all such weighted bipartite graphs whose two vertex groups have size $m$ and $n$, respectively. Finally, we define $\mathcal{H}^V_m := (\mathcal{H}^V)^m$ and $\mathcal{H}^W_n := (\mathcal{H}^W)^n$, and stack all the vertex features together as $H = (h^V_1, h^V_2, \dots, h^V_m, h^W_1, h^W_2, \dots, h^W_n) \in \mathcal{H}^V_m \times \mathcal{H}^W_n$. Then a weighted bipartite graph with vertex features $(G, H) \in \mathcal{G}_{m,n} \times \mathcal{H}^V_m \times \mathcal{H}^W_n$ contains all information in the MILP problem (2.1), and we name such a graph an MILP-induced graph or MILP-graph.¹ If $I = \emptyset$, the feature $\tau_j$ can be dropped and the graphs reduce to the LP-graphs in [12]. We provide an example of an MILP-graph in Figure 1.
Figure 1. An example of an MILP-graph. The MILP is $\min_{x \in \mathbb{R}^2} x_1 + x_2$, s.t. $x_1 + 3x_2 \circ_1 1$, $x_1 + x_2 \circ_2 1$, $x_1 \le 3$, $x_2 \le 5$, $x_2 \in \mathbb{Z}$, where $\circ_1, \circ_2 \in \{\le, =, \ge\}$ are the relations of the two constraints. The corresponding graph has constraint vertices $v_1, v_2$ with features $h^V_1 = (1, \circ_1)$ and $h^V_2 = (1, \circ_2)$, variable vertices $w_1, w_2$ with features $h^W_1 = (1, -\infty, 3, 0)$ and $h^W_2 = (1, -\infty, 5, 1)$, and edge weights $E_{1,1} = 1$, $E_{1,2} = 3$, $E_{2,1} = 1$, $E_{2,2} = 1$.
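The following short Python sketch assembles the graph data $(E, \{h^V_i\}, \{h^W_j\})$ from the raw MILP data of (2.1); the dictionary layout, the function name `milp_to_graph`, and the zero-based indexing of $I$ are our own illustrative choices, not a format prescribed by the paper.

```python
import numpy as np

def milp_to_graph(A, b, circ, c, l, u, I):
    """Encode the MILP data of (2.1) as an MILP-graph (illustrative sketch).

    Constraint vertex v_i carries h^V_i = (b_i, circ_i); variable vertex w_j
    carries h^W_j = (c_j, l_j, u_j, tau_j) with tau_j = 1 iff j is in I; the
    weighted biadjacency matrix is E = A (E_ij = 0 means "no edge").
    """
    m, n = A.shape
    h_V = [(b[i], circ[i]) for i in range(m)]
    h_W = [(c[j], l[j], u[j], 1.0 if j in I else 0.0) for j in range(n)]
    return {"E": A.copy(), "h_V": h_V, "h_W": h_W}

# The instance of Figure 1. The relation strings ">=" and "<=" below are
# placeholders standing in for circ_1 and circ_2, only so the example runs.
A = np.array([[1.0, 3.0], [1.0, 1.0]])
graph = milp_to_graph(A, b=[1.0, 1.0], circ=[">=", "<="],
                      c=[1.0, 1.0], l=[-np.inf, -np.inf], u=[3.0, 5.0], I={1})
print(graph["h_W"])  # [(1.0, -inf, 3.0, 0.0), (1.0, -inf, 5.0, 1.0)]
```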
2.2. Graph neural networks with message passing for MILP-graphs. To represent properties of the whole graph, one needs to build a GNN that maps $(G, H)$ to a real number: $\mathcal{G}_{m,n} \times \mathcal{H}^V_m \times \mathcal{H}^W_n \to \mathbb{R}$; to represent properties of each vertex in $W$ (or represent properties of each variable), one needs to build a GNN that maps $(G, H)$ to a vector: $\mathcal{G}_{m,n} \times \mathcal{H}^V_m \times \mathcal{H}^W_n \to \mathbb{R}^n$. The GNNs used in this paper can be constructed with the following three steps:
(1) Initial mapping at level $l = 0$. Let $p^0$ and $q^0$ be learnable embedding mappings; then the embedded features are $s^0_i = p^0(h^V_i)$ and $t^0_j = q^0(h^W_j)$ for $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$.
(2) Message-passing layers for $l = 1, 2, \dots, L$. Given learnable mappings $f^l, g^l, p^l, q^l$, one can update the vertex features by the following formulas for all $i = 1, 2, \dots, m$ and $j = 1, 2, \dots, n$:
$$s^l_i = p^l\Big(s^{l-1}_i,\ \sum_{j=1}^{n} E_{i,j}\, f^l(t^{l-1}_j)\Big), \qquad t^l_j = q^l\Big(t^{l-1}_j,\ \sum_{i=1}^{m} E_{i,j}\, g^l(s^{l-1}_i)\Big).$$
(3a) Last layer for graph-level output. Define a learnable mapping $r^G$; the output is a scalar $y_G = r^G(\bar{s}^L, \bar{t}^L)$, where $\bar{s}^L = \sum_{i=1}^{m} s^L_i$ and $\bar{t}^L = \sum_{j=1}^{n} t^L_j$.
(3b) Last layer for node-level output. Define a learnable mapping $r^W$; the output is a vector $y \in \mathbb{R}^n$ with entries $y_j = r^W(\bar{s}^L, \bar{t}^L, t^L_j)$ for $j = 1, 2, \dots, n$.

¹ In [17] and other related empirical studies, the feature spaces $\mathcal{H}^V, \mathcal{H}^W$ are more complicated than those defined in this paper. We only keep the most basic features here for simplicity of analysis.
Throughout this paper, we require all learnable mappings to be continuous, following the same settings as in [5, 12]. In practice, one may parameterize those mappings with multi-layer perceptrons (MLPs). Define $\mathcal{F}_{\mathrm{GNN}}$ as the set of all GNN mappings obtained by connecting layers (1), (2), and (3a). $\mathcal{F}^W_{\mathrm{GNN}}$ is defined similarly by replacing (3a) with (3b).
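To make the tensor shapes and the role of each mapping concrete, below is a minimal NumPy sketch of one forward pass through steps (1)-(3b), under our own simplifying assumptions: the learnable mappings $p^l, q^l, f^l, g^l, r^G, r^W$ are replaced by small random affine maps with ReLU, and the symbolic feature entries (the relation $\circ_i$ and infinite bounds) are encoded as plain numbers. It sketches the computation pattern only, not the trained architectures used in the cited empirical work.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding width (our choice for illustration)

def mlp(in_dim, out_dim):
    """A tiny random affine map + ReLU standing in for a learnable MLP."""
    W, bias = rng.normal(size=(in_dim, out_dim)), rng.normal(size=out_dim)
    return lambda z: np.maximum(z @ W + bias, 0.0)

def gnn_forward(E, h_V, h_W, num_layers=2, graph_level=True):
    """One pass through steps (1), (2), and (3a)/(3b) described above."""
    m, n = E.shape
    p0, q0 = mlp(h_V.shape[1], d), mlp(h_W.shape[1], d)
    s, t = p0(h_V), q0(h_W)                       # (1) initial embeddings
    for _ in range(num_layers):                   # (2) message-passing layers
        f, g = mlp(d, d), mlp(d, d)
        p, q = mlp(2 * d, d), mlp(2 * d, d)
        s_new = p(np.concatenate([s, E @ f(t)], axis=1))    # constraint-side update
        t_new = q(np.concatenate([t, E.T @ g(s)], axis=1))  # variable-side update
        s, t = s_new, t_new
    s_bar, t_bar = s.sum(axis=0), t.sum(axis=0)   # graph-level summaries
    if graph_level:                               # (3a) scalar output y_G
        r_G = mlp(2 * d, 1)
        return r_G(np.concatenate([s_bar, t_bar])[None, :])[0, 0]
    r_W = mlp(3 * d, 1)                           # (3b) one output per variable
    pooled = np.tile(np.concatenate([s_bar, t_bar]), (n, 1))
    return r_W(np.concatenate([pooled, t], axis=1))[:, 0]

# Numeric features of the Figure 1 graph; encoding ">=" as +1, "<=" as -1 and
# -inf as -1e6 is an assumption of this sketch, not part of the paper's setup.
E = np.array([[1.0, 3.0], [1.0, 1.0]])
h_V = np.array([[1.0, 1.0], [1.0, -1.0]])
h_W = np.array([[1.0, -1e6, 3.0, 0.0], [1.0, -1e6, 5.0, 1.0]])
print(gnn_forward(E, h_V, h_W))                   # a scalar prediction y_G
```

Calling `gnn_forward(E, h_V, h_W, graph_level=False)` instead returns a vector with one entry per variable, i.e., the node-level output of (3b).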
2.3. Mappings to represent MILP characteristics. Now we introduce the mappings that we aim to approximate by GNNs. With these definitions, we will revisit questions (Q1 - Q3) and describe them in a mathematically precise way.
Feasibility mapping. We first define the following mapping that indicates the feasibility of an MILP:
$$\Phi_{\mathrm{feas}} : \mathcal{G}_{m,n} \times \mathcal{H}^V_m \times \mathcal{H}^W_n \to \{0, 1\},$$
where $\Phi_{\mathrm{feas}}(G, H) = 1$ if $(G, H)$ corresponds to a feasible MILP and $\Phi_{\mathrm{feas}}(G, H) = 0$ otherwise.
Optimal objective value mapping. We then define the following mapping that maps an MILP to its optimal objective value:
$$\Phi_{\mathrm{obj}} : \mathcal{G}_{m,n} \times \mathcal{H}^V_m \times \mathcal{H}^W_n \to \mathbb{R} \cup \{\infty, -\infty\},$$
where $\Phi_{\mathrm{obj}}(G, H) = \infty$ implies infeasibility and $\Phi_{\mathrm{obj}}(G, H) = -\infty$ implies unboundedness. Note that the optimal objective value of an MILP may be an infimum that can never be achieved. An example is $\min_{x \in \mathbb{Z}^2} x_1 + \pi x_2$, s.t. $x_1 + \pi x_2 \ge \sqrt{2}$. The optimal objective value is $\sqrt{2}$, since for any $\epsilon > 0$ there exists a feasible $x$ with $x_1 + \pi x_2 < \sqrt{2} + \epsilon$; however, there is no $x \in \mathbb{Z}^2$ such that $x_1 + \pi x_2 = \sqrt{2}$. Thus, the preimage $\Phi_{\mathrm{obj}}^{-1}(\mathbb{R})$ cannot precisely describe all MILP instances admitting an optimal solution; it describes the MILP problems with a finite optimal objective value.
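A quick numerical illustration of this non-attainment (a sketch with our own search ranges; it demonstrates the behavior rather than proving it):

```python
import math

# For each integer x2, the smallest feasible integer x1 is ceil(sqrt(2) - pi*x2),
# so the best objective within |x2| <= bound can be scanned in O(bound) time.
target = math.sqrt(2)
for bound in (10, 100, 1000, 10000):
    best = min(math.ceil(target - math.pi * x2) + math.pi * x2
               for x2 in range(-bound, bound + 1))
    print(f"|x_2| <= {bound:5d}:  min objective = {best:.12f}   gap = {best - target:.3e}")
# The gap shrinks toward zero as the range grows but never reaches zero exactly.
```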
Optimal solution mapping. Giving a well-defined optimal solution mapping is much more complicated² since, as discussed above, the optimal objective value may never be achieved in some cases. To handle this issue, we only consider the case where every component of $l$ and $u$ is finite, which implies that an optimal solution exists as long as the MILP problem is feasible. More specifically, the vertex feature space is limited to $\widetilde{\mathcal{H}}^W_n = (\mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \{0, 1\})^n \subset \mathcal{H}^W_n$, and we consider MILP problems taken from $\mathcal{D}_{\mathrm{solu}} = \big(\mathcal{G}_{m,n} \times \mathcal{H}^V_m \times \widetilde{\mathcal{H}}^W_n\big) \cap \Phi_{\mathrm{feas}}^{-1}(1)$. Note that $\Phi_{\mathrm{feas}}^{-1}(1)$ describes the set of all feasible MILPs. Consequently, any MILP instance in $\mathcal{D}_{\mathrm{solu}}$ admits at least one optimal solution. We can further define the following mapping, which maps an MILP to exactly one of its optimal solutions:
$$\Phi_{\mathrm{solu}} : \mathcal{D}_{\mathrm{solu}} \setminus \mathcal{D}_{\mathrm{foldable}} \to \mathbb{R}^n,$$

² If we remove the integer constraints ($I = \emptyset$) and let the MILP reduce to a linear program (LP), the solution mapping is easier to define. In this case, as long as the optimal objective value is finite, there must exist an optimal solution, and the optimal solution with the smallest 2-norm is unique [12]. Therefore, a mapping $\Phi_{\mathrm{solu}}$, which maps an LP to its optimal solution with the smallest 2-norm, is well defined on $\Phi_{\mathrm{obj}}^{-1}(\mathbb{R})$.