
In this paper, we consider a third approach to (1) known as $K$-adaptability. There, at most $K$ possible wait-and-see decisions $y_1,\dots,y_K$ are allowed to be constructed, and the decision maker must select one of those. The values of the possible $y_k$'s become the first-stage variables, and the problem boils down to
$$
\min_{x\in\mathcal{X},\,\boldsymbol{y}\in\mathcal{Y}^K}\;\max_{z\in\mathcal{Z}}\;\min_{k\in\mathcal{K}}\;\left\{\, c(z)^\top x + d(z)^\top y_k \;:\; T(z)x + W(z)y_k \le h(z) \,\right\}, \tag{2}
$$
where $\mathcal{K}=\{1,\dots,K\}$ and $\mathcal{Y}^K=\times_{k=1}^{K}\mathcal{Y}$. Although the solution space of (2) is finite-dimensional, the problem remains NP-hard. For certain cases, (2) can be equivalently rewritten as a mixed-integer linear programming (MILP) model [Hanasusanto et al., 2015].
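As an illustration of the min-max-min structure of (2), the following sketch evaluates the objective for a fixed first-stage decision $x$ and fixed candidate recourses $y_1,\dots,y_K$ over a finite sample of scenarios. The callables `c, d, T, W, h` and the feasibility tolerance are illustrative assumptions of the sketch, not part of the formulation.

```python
import numpy as np

def k_adapt_objective(x, ys, scenarios, c, d, T, W, h, tol=1e-9):
    """Evaluate (2) for fixed x and recourses ys = [y_1, ..., y_K]
    over a finite scenario sample approximating Z."""
    worst = -np.inf
    for z in scenarios:                      # outer max over z in Z
        best_k = np.inf                      # inner min over k in K
        for y in ys:
            if np.all(T(z) @ x + W(z) @ y <= h(z) + tol):  # recourse feasible for z?
                best_k = min(best_k, c(z) @ x + d(z) @ y)
        worst = max(worst, best_k)           # stays +inf if no recourse is feasible
    return worst
```

If some scenario admits no feasible recourse, the sketch returns $+\infty$, mirroring the infeasibility of the pair $(x,\boldsymbol{y})$ for problem (2).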
The above formulation requires that, for given $x\in\mathcal{X}$ and $z\in\mathcal{Z}$, there is at least one decision $y_k$, $k\in\mathcal{K}$, satisfying $T(z)x+W(z)y_k\le h(z)$, and among those one (or more) minimizing the objective. Looking at (2) from the point of view of $y_k$, we can say that for each $y_k$ we can identify a subset $\mathcal{Z}_k$ of $\mathcal{Z}$ for which the given $y_k$ is optimal among the $K$ selected recourses. The union of the sets $\mathcal{Z}_k$, $k\in\mathcal{K}$, is equal to $\mathcal{Z}$, although they need not be mutually disjoint (a mutually disjoint partition of $\mathcal{Z}$ can, however, be constructed). Consequently, solving (2) implicitly involves (i) clustering $\mathcal{Z}$, and (ii) optimizing the per-cluster decision so that the objective function corresponding to the most difficult cluster is minimized. Such simultaneous clustering and per-cluster optimization also occurs, for example, in retail: a line of $K$ products is to be designed to attract the largest possible group of customers; the customers are clustered into $K$ groups, and the nature of the products is guided by the cluster characteristics.
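This induced clustering can be made concrete with a short sketch that, reusing the illustrative callables from above, assigns each scenario to its best feasible recourse (ties broken towards the smallest index $k$, which yields a disjoint partition):

```python
def induced_clusters(x, ys, scenarios, c, d, T, W, h, tol=1e-9):
    """Build the subsets Z_k: scenario z joins the cluster of the
    recourse y_k that is feasible for z and has the lowest objective."""
    clusters = {k: [] for k in range(len(ys))}
    for z in scenarios:
        costs = [c(z) @ x + d(z) @ y
                 if np.all(T(z) @ x + W(z) @ y <= h(z) + tol) else np.inf
                 for y in ys]
        clusters[int(np.argmin(costs))].append(z)  # ties -> smallest k
    return clusters
```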
In this manuscript, we focus on the general MILP $K$-adaptability case, for which the only existing solution approach is the $K$-adaptability branch-and-bound ($K$-B&B) algorithm of Subramanyam et al. [2020]. Other methods to solve the $K$-adaptability problem have been proposed by Hanasusanto et al. [2015], which deals with binary decisions, and Ghahtarani et al. [2023], which assumes integer first-stage decisions. The $K$-B&B algorithm, as opposed to the top-down partitioning of $\mathcal{Z}$ of Bertsimas and Dunning [2016] or Postek and den Hertog [2016], proceeds by gradually building up discrete subsets $\bar{\mathcal{Z}}_k$ of scenarios. In most practical cases, a solution to (2) in which $y_1,\dots,y_K$ are feasible for large $\bar{\mathcal{Z}}_1,\dots,\bar{\mathcal{Z}}_K$ is also feasible for the original problem. The difficulty, however, lies in knowing which scenarios should be grouped together. In other words, a decision needs to be made on which scenarios of $\mathcal{Z}$ should be responded to with the same decision. How well this question is answered determines the (sub)optimality of $y_1,\dots,y_K$. In Subramanyam et al. [2020], a search tree is used to determine the best collection (see Section 2 for details). However, this approach suffers from exponential growth.
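To fix ideas ahead of Section 2, the following skeleton sketches such a tree search: a node is a $K$-tuple of (hashable) scenario subsets, and branching assigns the current worst-case scenario to each subset in turn. The functions `solve_master` and `worst_case` stand in for the subproblems of Subramanyam et al. [2020]; their interface here is an assumption of this sketch.

```python
from collections import deque

def k_bnb_skeleton(K, solve_master, worst_case, tol=1e-6):
    """Sketch of K-adaptability branch-and-bound: scenarios are gradually
    assigned to K subsets; each node stores one such assignment."""
    root = tuple(frozenset() for _ in range(K))
    best_val, best_sol = float("inf"), None
    stack = deque([root])                       # depth-first exploration
    while stack:
        node = stack.pop()
        x, ys, val = solve_master(node)         # optimize with y_k feasible on subset k
        if val >= best_val:
            continue                            # prune: cannot improve incumbent
        z_star, violation = worst_case(x, ys)   # most violated scenario in Z
        if violation <= tol:                    # robustly feasible: new incumbent
            best_val, best_sol = val, (x, ys)
            continue
        for k in range(K):                      # K children: z* joins subset k
            stack.append(tuple(s | {z_star} if i == k else s
                               for i, s in enumerate(node)))
    return best_val, best_sol
```

The exponential growth mentioned above is visible here: every unresolved node spawns $K$ children, so the order in which the stack is explored matters greatly.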
We introduce a method for learning the best strategy to explore this tree. In particular, we learn which nodes to evaluate next in depth-first search dives, so as to obtain good solutions faster. These predictions are made using a supervised machine learning (ML) model. As the input dimension does not vary with the instance size or the value of $K$, as will be explained in later sections, the ML model does not require sophisticated architectures. Standard ML models, such as feed-forward neural networks, random forests, or support vector machines, can be used for this work.
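For instance, an off-the-shelf regressor suffices once every node is described by a fixed-length feature vector; the dimensions and data below are placeholders for illustration only, not the features of Section 3.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_train = rng.random((1000, 20))   # placeholder: fixed-length node features
y_train = rng.random(1000)         # placeholder: node scores used as labels

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
node_scores = model.predict(X_train[:5])  # predicted scores rank nodes in a dive
```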
Due to the supervised nature of our approach, an oracle to imitate is required. In the design of this oracle, we are partly inspired by Monte Carlo tree search (MCTS) [Browne et al., 2012], which is often used for exploring large trees. Namely, the training data is obtained by exploring $K$-B&B trees via an adaptation of MCTS (see Section 3.4). The scores given to the nodes in this MCTS-like exploration are stored and used as labels in our training data.
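As background, a classical node-selection rule in MCTS is the UCT score [Browne et al., 2012], which balances exploitation and exploration. The sketch below shows the textbook rule, not the specific adaptation of Section 3.4.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Textbook UCT: average reward plus an exploration bonus that
    shrinks as a child node is visited more often."""
    if visits == 0:
        return float("inf")        # unvisited children are tried first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)
```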
In the field of solving MILPs, learning node selection to speed up the exploration of the B&B tree has been done, e.g., by He et al. [2014], where a node selection policy is designed by imitating an oracle constructed from the optimal solutions of various MILP data sets. More recently, Khalil et al. [2022a] used a graph neural network to learn node selection. Our method distinguishes itself from these approaches in that we specifically exploit the nature of our problem. Namely, in the design of the node selection strategy, we use the actual meaning of selecting a node: adding a scenario to a subset. Therefore, we try to learn which characteristics (or features) are shared by scenarios that should belong to the same subset.
For an overview of ML for learning branching policies in B&B, see Bengio et al. [2020]. There has also been a vast amount of research on applying MCTS directly to solving combinatorial problems. In Sabharwal et al. [2012], a special case of MCTS called Upper Confidence bounds for Trees (UCT) is used to design a node selection strategy for exploring B&B trees (for MIPs). In Khalil et al. [2022b], MCTS is used to find the best backdoor (i.e., a subset of variables used for branching) for solving MIPs. Loth et al. [2013] have used MCTS to enhance constraint programming solvers, which naturally use a search tree for solving combinatorial problems. For an elaborate overview of modifications and applications of MCTS, we refer to Świechowski et al. [2022].
The remainder of the paper is structured as follows. In Section 2 we describe the inner workings of the $K$-adaptability branch-and-bound algorithm to set the stage for our contribution. In Section 3 we outline our ML methodology along with the data generation procedure. Section 4 discusses the results of a numerical study, and Section 5 concludes with remarks on future work.