the search. They consist of recombining two existing (par-
ent) solutions into a new (offspring) solution that inher-
its promising characteristics from both. For the CVRP,
crossover operators are not primarily designed for solu-
tion improvement, but instead used to create promising
starting points for subsequent LS. Various crossovers have
been used in previous works (Nagata 2007; Vidal 2022).
As shown in Section 2, the Ordered Crossover (OX) is
widely used, and consists in juxtaposing a fragment of the
first solution with the remaining client visits ordered as
in the second solution. By doing so, it implicitly creates a
re-connection point, which is typically random.
Note that in both LS and Crossover, there is interest in using
relatedness information between client vertices to (i) speed up
the LS or (ii) identify a subset of more promising crossover
operations. It is also noteworthy that the relatedness infor-
mation used until now for the LS (and possibly used for
the crossover) is a broader concept that goes beyond simple
distance criteria, and which could be possibly learned.
In recent years, graph neural networks (GNNs) have
emerged as a tool to apply machine learning techniques to
combinatorial optimization problems posed over graphs. To
the best of our knowledge, the first attempt in this context
was proposed by Hopfield and Tank (1985) for the TSP. Un-
derpinned by enhancements in hardware and artificial intelli-
gence research over the last years, the development of deep
NNs made them relevant to a wide range of difficult combina-
torial optimization problems, such as SAT, Minimum Vertex
Cover, and Maximum Cut (Dai et al. 2017; Yolcu and P
´
oczos
2019). When applied to solve CVRPs, these networks are
usually combined with reinforcement learning (RL – Nazari
et al. 2018; Chen and Tian 2019) or typically used for node
classification or edge prediction (Kool et al. 2022; Xin et al.
2021). Despite extensive research, GNNs for directly solv-
ing CVRPs remain limited to small problem instances with
up to 100 customers and generally do not compare favor-
ably with classic optimization methods (exact or heuristic)
in terms of solution quality. This is possibly due to the fact
that good solutions for combinatorial optimization problems
result from tacit structural knowledge about the problem
(learnable solution structure) along with a significant amount
of trial-and-error to build the best possible solution fitting
almost perfectly the constraints at hand. After all, an optimal
solution is a very specific outlier. Whereas better knowledge
of solution structures can be learned to guide the search,
avoiding some (explicit or implicit) enumeration of solutions
without compromising solution quality is generally challeng-
ing. Consequently, using pure learning algorithms without
any other form of solution enumeration is likely to be unsuc-
cessful.
Given these observations, a promising path toward bet-
ter solution methods for VRPs concern the hybridization of
learning-based and traditional solution methods. In this work,
in particular, we aim to learn and use relatedness information
in the LS and crossover operators of HGS to improve this
state-of-the-art method substantially. We capitalize upon the
work of Kool et al. (2022), which trained a GNN to predict
occurrence probabilities of edges in high-quality solutions
(i.e., heatmap), and used this information to sparsify the un-
derlying graph and accelerate related solution procedures.
Instead, we leverage the heatmaps as a source of relatedness
information to define neighborhood restrictions in the LS and
possible re-connection points in the crossover. Throughout
an extensive experimental campaign, we evaluate how HGS’
performance varies with these surgical changes, for better
or worse, on more than 10000 different instances contain-
ing from 100 to 1000 customers. Therefore, we make the
following specific contributions.
1.
We introduce a framework for defining and exploiting
relatedness information between pairs of customers in the
context of the CVRP.
2.
We show that relatedness measures can be exploited to
steer the LS towards the most promising moves. Addition-
ally, we use relatedness information to extend the classical
OX crossover, trading some of its inherent randomness for
better choices of re-connection points between the parents.
Our approaches are generic and applicable to any type of
relatedness measure. We will specifically consider related-
ness from two sources: geographical relatedness as given
by the distance between two customers, and learnable
relatedness (i.e., heatmap) obtained from a GNN.
3.
We suggest a practical technique to exploit the output of a
single GNN (heatmaps for fixed-size graphs) for problem
instances of varying sizes. To achieve this, we decom-
pose the original instance into a sequence of fixed-size
subproblems and aggregate the resulting heatmap infor-
mation. This approach has the benefit of only requiring a
single trained model of moderate size.
4.
Finally, we conduct an extensive computational campaign
to measure the enhancements achieved with the proposed
techniques and analyze the impact of each change. We
observe that incorporating relatedness information within
the crossover and LS operators largely benefit the search,
such that learning-based approaches seem to be successful
at first sight. However, after a closer analysis, we also
observe that these improvements are mostly insensitive to
the source of the relatedness information (geographical
or learned). Therefore, the problem-specific knowledge
and strategies that we integrated contributed more to the
algorithm’s performance than GNN-based algorithms for
defining relatedness.
2 Methodology
The CVRP is defined over a complete graph
G= (V, E)
,
where the set of vertices
V={0,1...,n}
contains a vertex
0
representing the depot, and the remaining vertices represent
customers. Each customer
i∈ {1, . . . , n}
is characterized
by a demand
di
. Edges
(i, j)
model direct travel between
vertices
i
and
j
for a distance
dij
. A solution to this problem
is a set of routes originating and ending at the depot and
visiting customers, such that (i) the total demand over each
route does not exceed a vehicle-capacity limit
Q
, (ii) each
customer is visited exactly once, and (iii) the total travel
distance is minimized.
We additionally assume that we can calculate a relatedness
metric
φ(i, j)
for each edge
(i, j)
. This definition is general:
in the simplest setting, relatedness could be the inverse of
distance, i.e.,
φ(i, j) = 1/dij
. In a more informed setting,