Neural Networks for Local Search and Crossover in Vehicle Routing A Possible Overkill Italo Santana1 Andrea Lodi2 Thibaut Vidal13

2025-05-02 0 0 1.18MB 15 页 10玖币
侵权投诉
Neural Networks for Local Search and Crossover
in Vehicle Routing: A Possible Overkill?
´
Italo Santana1, Andrea Lodi2, Thibaut Vidal1,3
1Department of Computer Science, Pontifical Catholic University of Rio de Janeiro
2Jacobs Technion-Cornell Institute, Cornell Tech and Technion - IIT
3CIRRELT & SCALE-AI Chair in Data-Driven Supply Chains, Department of Mathematical and Industrial Engineering,
Polytechnique Montr´
eal
isantana@inf.puc-rio.br, andrea.lodi@cornell.edu, thibaut.vidal@polymtl.ca
Abstract
Extensive research has been conducted, over recent years, on
various ways of enhancing heuristic search for combinatorial
optimization problems with machine learning algorithms. In
this study, we investigate the use of predictions from graph
neural networks (GNNs) in the form of heatmaps to improve
the Hybrid Genetic Search (HGS), a state-of-the-art algorithm
for the Capacitated Vehicle Routing Problem (CVRP). The
crossover and local-search components of HGS are instru-
mental in finding improved solutions, yet these components
essentially rely on simple greedy or random choices. It seems
intuitive to attempt to incorporate additional knowledge at
these levels. Throughout a vast experimental campaign on
more than 10,000 problem instances, we show that exploiting
more sophisticated strategies using measures of node relat-
edness (heatmaps, or simply distance) within these algorith-
mic components can significantly enhance performance. How-
ever, contrary to initial expectations, we also observed that
heatmaps did not present significant advantages over simpler
distance measures for these purposes. Therefore, we faced a
common —though rarely documented— situation of overkill:
GNNs can indeed improve performance on an important op-
timization task, but an ablation analysis demonstrated that
simpler alternatives perform equally well.
1 Introduction
Vehicle routing problems (VRP) represent one of the most
studied classes of NP-hard problems due to their practical dif-
ficulty and ubiquity in real-life applications such as food dis-
tribution, parcel delivery, or waste collection, among others
(Toth and Vigo 2014; Vidal, Laporte, and Matl 2020). Prob-
lems in this class generally seek to plan efficient itineraries
for a fleet of vehicles to service a geographically-dispersed
set of customers. The capacitated VRP (CVRP) is the most
canonical variant among all existing routing problems. Its
objective is to minimize the total distance traveled by the ve-
hicles to service the customers, subject to a single constraint
representing the vehicle capacities: i.e., the sum of customers’
demands over a route should not exceed the vehicle capacity.
Over the years, there have been dramatic improvements
in the heuristic and exact (i.e., provably optimal) solution
of VRPs. To date, the best performing exact algorithms
rely on branch-cut-and-price strategies, with tailored cutting-
plane algorithms and sophisticated column-generation rou-
tines (Costa, Contardo, and Desaulniers 2019; Pessoa et al.
2020). With these methods, it is now possible to solve most
existing instances with 200 or 300 customers. However, the
time for an exact solution remains highly volatile at this scale,
and most larger instances remain unsolved. Consequently, ex-
tensive research has been conducted on metaheuristics for
this problem to find high-quality solutions in a shorter and
more controlled time (Vidal et al. 2013a).
As it stands now, metaheuristics can consistently locate
high-quality solutions for CVRPs with up to 1,000 customers
in a matter of minutes (Christiaens and Vanden Berghe 2020;
Vidal 2022). Of all existing methods, the Hybrid Genetic
Search (HGS) algorithm developed in Vidal et al. (2012,
2014) and (Vidal 2022) is known to achieve the best-known
solution quality consistently on most problems and instances
of interest. Notably, during the 12th DIMACS implemen-
tation challenge on the CVRP organized in 2022 (Archetti
et al. 2022), it was used as the base algorithm for four out of
the five best methods. Very-large problem instances counting
dozens of thousands of customers can also be solved using
tailored data structures and decomposition strategies (Ac-
corsi and Vigo 2021; Santini et al. 2021). Considering the
latest generation of metaheuristics such as HGS, it is clear
that two main operators —local search and crossover— are
instrumental in finding improving solutions.
Local Search (LS)
consists in systematically exploring
a neighborhood obtained by small changes over a cur-
rent solution to identify improvements. This process is
iterated until attaining a local minimum. Classical neigh-
borhoods for the CVRP involve exchanges or relocations
of client visits and edge reconnections. They typically
include
O(n2)
possible neighbors, where
n
represents
the number of customers. Due to its iterative nature, LS
typically takes the largest share of the computational time.
Several techniques have been developed to reduce compu-
tational complexity. In particular, Toth and Vigo (2003)
observed that the search could be limited to relocations
and exchanges of customers that are geographically re-
lated. The resulting strategy, called granular search, limits
classical neighborhoods to
On)
moves, where
Γ
is a
user-defined parameter. However, although very simple in
design, a straightforward distance-based relatedness crite-
rion may hinder the search process, especially if optimal
solutions require a few long edges.
In contrast,
Crossover
operators focus on diversifying
arXiv:2210.12075v1 [cs.NE] 9 Sep 2022
the search. They consist of recombining two existing (par-
ent) solutions into a new (offspring) solution that inher-
its promising characteristics from both. For the CVRP,
crossover operators are not primarily designed for solu-
tion improvement, but instead used to create promising
starting points for subsequent LS. Various crossovers have
been used in previous works (Nagata 2007; Vidal 2022).
As shown in Section 2, the Ordered Crossover (OX) is
widely used, and consists in juxtaposing a fragment of the
first solution with the remaining client visits ordered as
in the second solution. By doing so, it implicitly creates a
re-connection point, which is typically random.
Note that in both LS and Crossover, there is interest in using
relatedness information between client vertices to (i) speed up
the LS or (ii) identify a subset of more promising crossover
operations. It is also noteworthy that the relatedness infor-
mation used until now for the LS (and possibly used for
the crossover) is a broader concept that goes beyond simple
distance criteria, and which could be possibly learned.
In recent years, graph neural networks (GNNs) have
emerged as a tool to apply machine learning techniques to
combinatorial optimization problems posed over graphs. To
the best of our knowledge, the first attempt in this context
was proposed by Hopfield and Tank (1985) for the TSP. Un-
derpinned by enhancements in hardware and artificial intelli-
gence research over the last years, the development of deep
NNs made them relevant to a wide range of difficult combina-
torial optimization problems, such as SAT, Minimum Vertex
Cover, and Maximum Cut (Dai et al. 2017; Yolcu and P
´
oczos
2019). When applied to solve CVRPs, these networks are
usually combined with reinforcement learning (RL – Nazari
et al. 2018; Chen and Tian 2019) or typically used for node
classification or edge prediction (Kool et al. 2022; Xin et al.
2021). Despite extensive research, GNNs for directly solv-
ing CVRPs remain limited to small problem instances with
up to 100 customers and generally do not compare favor-
ably with classic optimization methods (exact or heuristic)
in terms of solution quality. This is possibly due to the fact
that good solutions for combinatorial optimization problems
result from tacit structural knowledge about the problem
(learnable solution structure) along with a significant amount
of trial-and-error to build the best possible solution fitting
almost perfectly the constraints at hand. After all, an optimal
solution is a very specific outlier. Whereas better knowledge
of solution structures can be learned to guide the search,
avoiding some (explicit or implicit) enumeration of solutions
without compromising solution quality is generally challeng-
ing. Consequently, using pure learning algorithms without
any other form of solution enumeration is likely to be unsuc-
cessful.
Given these observations, a promising path toward bet-
ter solution methods for VRPs concern the hybridization of
learning-based and traditional solution methods. In this work,
in particular, we aim to learn and use relatedness information
in the LS and crossover operators of HGS to improve this
state-of-the-art method substantially. We capitalize upon the
work of Kool et al. (2022), which trained a GNN to predict
occurrence probabilities of edges in high-quality solutions
(i.e., heatmap), and used this information to sparsify the un-
derlying graph and accelerate related solution procedures.
Instead, we leverage the heatmaps as a source of relatedness
information to define neighborhood restrictions in the LS and
possible re-connection points in the crossover. Throughout
an extensive experimental campaign, we evaluate how HGS’
performance varies with these surgical changes, for better
or worse, on more than 10000 different instances contain-
ing from 100 to 1000 customers. Therefore, we make the
following specific contributions.
1.
We introduce a framework for defining and exploiting
relatedness information between pairs of customers in the
context of the CVRP.
2.
We show that relatedness measures can be exploited to
steer the LS towards the most promising moves. Addition-
ally, we use relatedness information to extend the classical
OX crossover, trading some of its inherent randomness for
better choices of re-connection points between the parents.
Our approaches are generic and applicable to any type of
relatedness measure. We will specifically consider related-
ness from two sources: geographical relatedness as given
by the distance between two customers, and learnable
relatedness (i.e., heatmap) obtained from a GNN.
3.
We suggest a practical technique to exploit the output of a
single GNN (heatmaps for fixed-size graphs) for problem
instances of varying sizes. To achieve this, we decom-
pose the original instance into a sequence of fixed-size
subproblems and aggregate the resulting heatmap infor-
mation. This approach has the benefit of only requiring a
single trained model of moderate size.
4.
Finally, we conduct an extensive computational campaign
to measure the enhancements achieved with the proposed
techniques and analyze the impact of each change. We
observe that incorporating relatedness information within
the crossover and LS operators largely benefit the search,
such that learning-based approaches seem to be successful
at first sight. However, after a closer analysis, we also
observe that these improvements are mostly insensitive to
the source of the relatedness information (geographical
or learned). Therefore, the problem-specific knowledge
and strategies that we integrated contributed more to the
algorithm’s performance than GNN-based algorithms for
defining relatedness.
2 Methodology
The CVRP is defined over a complete graph
G= (V, E)
,
where the set of vertices
V={0,1...,n}
contains a vertex
0
representing the depot, and the remaining vertices represent
customers. Each customer
i∈ {1, . . . , n}
is characterized
by a demand
di
. Edges
(i, j)
model direct travel between
vertices
i
and
j
for a distance
dij
. A solution to this problem
is a set of routes originating and ending at the depot and
visiting customers, such that (i) the total demand over each
route does not exceed a vehicle-capacity limit
Q
, (ii) each
customer is visited exactly once, and (iii) the total travel
distance is minimized.
We additionally assume that we can calculate a relatedness
metric
φ(i, j)
for each edge
(i, j)
. This definition is general:
in the simplest setting, relatedness could be the inverse of
distance, i.e.,
φ(i, j) = 1/dij
. In a more informed setting,
we can instead consider defining
φ(i, j)
as the output of a
graph neural network (GNN) as seen in Kool et al. (2022),
predicting the probability of occurrence of an edge in a high-
quality solution. Probabilities of this kind are typically called
heatmaps. In the remainder of the paper, we will refer to
φD(i, j)
for distance-relatedness, and
φN(i, j)
for GNN-based
relatedness. This information will now be used to refine the
two most important HGS operators.
2.1 Hybrid Genetic Search
The Hybrid Genetic Search (Vidal 2022) relies on simple so-
lution generation and improvement steps. A complete pseudo-
code is provided in the appendix. The method starts by ini-
tializing a population of size
µ
with random solutions that
are improved by local search. After this initialization phase,
HGS iteratively generates new solutions by selecting two
random solutions in the population, recombining them using
an ordered crossover (OX), and applying local search for
improvement. To promote exploration, solutions that exceed
capacity limits are not directly rejected but instead penal-
ized according to their amount of infeasibility. The penalty
weights are adapted during the search to achieve a target
percentage of feasible solutions, and infeasible solutions are
maintained in a separate subpopulation. Whenever a solution
is infeasible after local search, an extra REPAIR step is ap-
plied, which simply consists of a classic local search with a
temporarily (10×) higher penalty coefficient.
During the overall search process, the number of solutions
in the feasible and infeasible populations is monitored. When-
ever any population exceeds
µ+λ
solutions, a survivors’ se-
lection phase is triggered to retain only the best
µ
individuals,
according to a ranking metric based on solution value and
contribution to the population diversity. Finally, the algorithm
restarts each time
nIT
consecutive solution generations have
been done without improvement of the best solution, and
it terminates upon a time limit
TMAX
by returning the best
solution found over all the restarts.
2.2 Local Search using Relatedness Measures
Local Search (LS) is a conceptually simple and efficient
method to solve combinatorial optimization problems of the
form
minxXc(x)
, where
X
is the space of all solutions
and
c
is the objective function. A neighborhood is defined
as a mapping
N:X2X
associating with any solu-
tion
x
a set of neighbors
N(x)X
. For the CVRP,
N(x)
is usually defined relative to a set of operations (i.e., moves)
that can modify the current solution
x
. A move
τ
is a small
modification that can be applied on
x
to obtain a neighbor
τ(x) N (x)
. HGS uses four main types of moves and some
of their immediate extensions (Vidal 2022):
RELOCATE: Moves a visit to customer
i
immediately after
a visit to a different customer jor the depot;
• SWAP: Exchanges the visits of customers iand j;
• 2-OPT: Reverts a customer-visit sequence (i, . . . , j);
2-OPT*: Exchanges customers
i
and
j
and their succeed-
ing visits.
The moves are evaluated in a random order of the indices
i
and
j
, and any improvement is directly applied. This pro-
cess is repeated until a local minimum is reached, i.e., a
Twenty customers most related to customer
63
ac-
cording to φDand φN:
Depot
63
63
63
𝜙!
𝜙"
Customer 63 in an optimal solution:
Depot
63
Figure 1: Sets of related customers according to
φD
and
φN
,
on instance X-n247-k50
situation where no improving move exists for all the con-
sidered neighborhoods. Without further pruning, all these
neighborhoods contain
O(n2)
solutions. Using incremental
calculations (keeping track of partial load and distance over
the routes), it is possible to conduct a complete evaluation
of all neighborhoods in
O(n2)
time. Moreover, the number
of complete neighborhood searches (i.e., loops) needed to
converge is rarely greater than 10 in practice.
A quadratic complexity for the LS operator is adequate for
small problems, but this can become a significant bottleneck
otherwise due to its frequent use. Considering this, Toth and
Vigo (2003) introduced a “granular search” mechanism that
consists in limiting the moves to customer pairs
(i, j)
that are
geographically close, i.e., such that
j
belongs to a set
Φ(i)
formed of the
Γ
closest customers of
i
. Consequently, the
total number of moves and the complexity of each LS loop
reduce down to
O(nΓ)
time. Indeed, it rarely makes sense to
relocate or exchange customer visits that are far away from
each other. Moreover, this strategy ensures that each move
creates at least one short edge (Schneider, Schwahn, and Vigo
2017; Vidal et al. 2013b; Accorsi and Vigo 2021).
Since its inception, granular search has been adapted to
many VRP variants. Especially, to handle customer con-
straints on service-time windows, (Vidal et al. 2013b) ex-
tended the concept to filter node pairs (i, j)based on a com-
pound metric that includes distance, unavoidable waiting
times, and unavoidable time-window violations arising from
摘要:

NeuralNetworksforLocalSearchandCrossoverinVehicleRouting:APossibleOverkill?´ItaloSantana1,AndreaLodi2,ThibautVidal1,31DepartmentofComputerScience,PonticalCatholicUniversityofRiodeJaneiro2JacobsTechnion-CornellInstitute,CornellTechandTechnion-IIT3CIRRELT&SCALE-AIChairinData-DrivenSupplyChains,Depart...

展开>> 收起<<
Neural Networks for Local Search and Crossover in Vehicle Routing A Possible Overkill Italo Santana1 Andrea Lodi2 Thibaut Vidal13.pdf

共15页,预览3页

还剩页未读, 继续阅读

声明:本站为文档C2C交易模式,即用户上传的文档直接被用户下载,本站只是中间服务平台,本站所有文档下载所得的收益归上传人(含作者)所有。玖贝云文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。若文档所含内容侵犯了您的版权或隐私,请立即通知玖贝云文库,我们立即给予删除!
分类:图书资源 价格:10玖币 属性:15 页 大小:1.18MB 格式:PDF 时间:2025-05-02

开通VIP享超值会员特权

  • 多端同步记录
  • 高速下载文档
  • 免费文档工具
  • 分享文档赚钱
  • 每日登录抽奖
  • 优质衍生服务
/ 15
客服
关注