encodes the constraints in the model's weights and learns them from data. Nandwani et al. [2022, 2021] and Palm et al. [2018] train a Recurrent Relational Network for learning to solve puzzles such as Sudoku, graph coloring, and Futoshiki; Dong et al. [2019] propose Neural Logic Machines (NLMs) that learn lifted first-order rules and experiment with blocks world, reasoning on family trees, etc.; Ranjan et al. [2022] use a Siamese GNN architecture to learn the combinatorial problems of graph and subgraph distance computation; Bajpai et al. [2018], Garg et al. [2019, 2020], and Sharma et al. [2022] use GNNs to train probabilistic planners for combinatorial domains with large, discrete state and action spaces; Selsam et al. [2019] and Amizadeh et al. [2019a,b] train a GNN to solve any CSP whose constraints are explicitly given in a standard form such as CNF, DNF, or Boolean circuits. All of these are clearly different from our work, since we are interested in explicitly learning the constraints of the optimization problem.
Inverse Optimization [Chan et al., 2021] aims to learn the constraints (cost) of a linear program (LP) from observed optimal decisions. Recently, Tan et al. [2019, 2020] use the notion of Parameterized Linear Programs (PLPs), in which both the cost and the constraints of an LP are parameterized by unknown weights. Gould et al. [2019] show how to differentiate through a continuous constrained optimization problem using the notion of a 'declarative node', which relies on the implicit function theorem [Mingari Scarpello and Ritelli, 2002]. Along similar lines, other works [Amos and Kolter, 2017, Agrawal et al., 2019] define methods for differentiating through other continuous problems such as convex quadratic programs (QPs) or cone programs. While all these techniques are concerned with learning the constraints (cost) of optimization problems defined over continuous variables, our focus is on learning the constraints (cost) of an ILP, which involves optimization over discrete variables and can be a significantly harder problem.
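To make the implicit-function-theorem machinery behind such declarative nodes concrete, below is a minimal sketch (our own illustration, not code from any of the cited papers): for an unconstrained inner problem y*(x) = argmin_y f(x, y), the theorem gives dy*/dx = -(d²f/dy²)⁻¹ (d²f/dydx), evaluated at the optimum via automatic differentiation. The toy objective f is an assumption chosen only to be smooth and strictly convex in y.

```python
import torch

# Toy smooth objective f(x, y); y*(x) = argmin_y f(x, y) acts as a
# "declarative node". f is strictly convex in y, so the argmin is unique.
A = torch.randn(3, 2, dtype=torch.double)

def f(x, y):
    r = y - A @ x
    return 0.5 * r.dot(r) + 0.25 * y.dot(y) ** 2

def solve_inner(x, steps=500, lr=0.1):
    # Inner minimization over y; any solver works, gradient descent for brevity.
    y = torch.zeros(3, dtype=torch.double)
    for _ in range(steps):
        y = y.detach().requires_grad_(True)
        g, = torch.autograd.grad(f(x, y), y)
        y = y - lr * g
    return y.detach()

def jacobian_via_ift(x, y_star):
    # Implicit function theorem at the optimum: dy*/dx = -H_yy^{-1} H_yx,
    # where H_yy = d^2 f / dy dy and H_yx = d^2 f / dy dx.
    x = x.detach().requires_grad_(True)
    y = y_star.detach().requires_grad_(True)
    g, = torch.autograd.grad(f(x, y), y, create_graph=True)
    H_yy = torch.stack([torch.autograd.grad(g[i], y, retain_graph=True)[0]
                        for i in range(g.numel())])
    H_yx = torch.stack([torch.autograd.grad(g[i], x, retain_graph=True)[0]
                        for i in range(g.numel())])
    return -torch.linalg.solve(H_yy, H_yx)  # shape: (dim y, dim x)

x = torch.randn(2, dtype=torch.double)
J = jacobian_via_ift(x, solve_inner(x))  # sensitivity of the argmin w.r.t. x
```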
In the space of learning the cost of an ILP, several approaches have been proposed recently. Pogancic et al. [2020] give the gradient w.r.t. a linear cost function that is optimized over a given discrete space. Rolínek et al. [2020a,b] exploit this for the task of deep graph matching, and for differentiating through rank-based metrics such as Average Precision/Recall, respectively. Berthet et al. [2020] replace the black-box discrete solver with a smooth, perturbed version during training and exploit 'perturb-and-MAP' [Papandreou and Yuille, 2011] to compute the gradients. Along similar lines, Niepert et al. [2021] also exploit perturb-and-MAP but propose a different noise model for the perturbation, and also extend to the case where the output of the solver feeds into a downstream neural model. These methods assume the constraints to be given, and learn only the cost of an ILP.
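For intuition on the perturb-and-MAP style estimators used in this line of work, here is a minimal numpy sketch (ours; the toy solver and all names are hypothetical): the solver output y*(c) is piecewise constant in the cost c, so its gradient is zero almost everywhere, but the Gaussian-smoothed solver E[y*(c + eps*Z)] is differentiable, and both it and its Jacobian can be estimated by Monte Carlo.

```python
import numpy as np

def solver(c):
    # Stand-in for a black-box combinatorial solver: argmax of c^T y over the
    # vertices of the unit hypercube (an elementwise 0/1 choice).
    return (c > 0).astype(float)

def perturbed_solution(c, eps=0.5, n_samples=1000, rng=None):
    # Monte-Carlo estimate of the smoothed solver y_eps(c) = E[y*(c + eps*Z)]
    # and of its Jacobian d y_eps / d c = E[y*(c + eps*Z) Z^T] / eps,
    # using the standard Gaussian smoothing identity.
    rng = rng if rng is not None else np.random.default_rng(0)
    Z = rng.standard_normal((n_samples, c.size))
    Y = np.stack([solver(c + eps * z) for z in Z])
    y_smooth = Y.mean(axis=0)
    jac = Y.T @ Z / (eps * n_samples)
    return y_smooth, jac

c = np.array([0.3, -0.2, 1.0])
y, J = perturbed_solution(c)
# During training, J^T @ upstream_grad replaces the zero/undefined gradient
# of the piecewise-constant solver output w.r.t. the cost c.
```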
Other methods use relaxations of specific algorithms for the task of learning the constraints and cost of an ILP. Ferber et al. [2020] backpropagate through the KKT conditions of the LP relaxation created iteratively by the cutting-plane method [Gomory, 2010] while solving an MILP; Wilder et al. [2019] add a quadratic penalty term to the continuous relaxation and use the implicit function theorem to backpropagate through the KKT conditions, as done in Amos and Kolter [2017] for smooth programs; Mandi and Guns [2020] instead add a log-barrier term to the LP relaxation and differentiate through its homogeneous self-dual formulation, linking it to the iterative steps of the interior point method. Wang et al. [2019] use a low-rank SDP relaxation of MAXSAT and define how the gradients flow during backpropagation. These approaches are arguably limited in their applicability, since they rely on specific algorithmic relaxations.
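As an illustration of the quadratic-penalty variant, here is a hedged sketch in the spirit of Wilder et al. [2019] (ours, simplified to equality constraints only): adding gamma*||x||^2 makes the relaxed problem strongly convex, so its KKT conditions become a linear system, and backpropagating through a differentiable linear solve recovers the implicit-function-theorem gradients.

```python
import torch

def penalized_lp_relaxation(c, A, b, gamma=0.1):
    # Relax the ILP to an LP over Ax = b (box/inequality constraints omitted
    # for brevity), add gamma*||x||^2 to make it strongly convex, and solve
    # the resulting KKT system:
    #   [2*gamma*I  A^T] [x ]   [-c]
    #   [    A       0 ] [nu] = [ b]
    # torch.linalg.solve is differentiable, so gradients w.r.t. c (or A, b)
    # flow through the argmin automatically.
    m, n = A.shape
    K = torch.zeros(n + m, n + m)
    K[:n, :n] = 2.0 * gamma * torch.eye(n)
    K[:n, n:] = A.T
    K[n:, :n] = A
    rhs = torch.cat([-c, b])
    return torch.linalg.solve(K, rhs)[:n]

A = torch.tensor([[1.0, 1.0, 1.0]])  # a single constraint: x1 + x2 + x3 = 1
b = torch.tensor([1.0])
c = torch.tensor([0.3, -0.2, 1.0], requires_grad=True)
x_star = penalized_lp_relaxation(c, A, b)
loss = ((x_star - torch.tensor([0.0, 1.0, 0.0])) ** 2).sum()
loss.backward()                      # d loss / d c, without any ILP solver
```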
Instead of working with a specific algorithm, some recent works differentiate through a black-box combinatorial solver for the task of learning constraints [Paulus et al., 2021]. The solver is called at every learning step to compute the gradient. This becomes a bottleneck during training: since the constraints are being learned on the go, the problem can become ill-behaved, resulting in larger solver times and severely limiting the scalability of such approaches. In contrast, we would like to perform this task in a solver-free manner. Pan et al. [2020] propose learning constraints for the task of structured prediction. They represent the linear constraints compactly using a specific two-layer Rectifier Network [Pan and Srikumar, 2016]. A significant limitation is that their method can only be used for learning constraints, and not costs. Further, their approach does not perform well experimentally
on our benchmark domains. Meng and Chang [2021] propose a non-learning-based approach for mining constraints: for each training example (c, y), a new constraint is added which essentially implies that no other target can have a better cost than the given target y for the corresponding cost c. We are specifically interested in learning not only the constraints, but also the cost of an ILP from data.
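Under our reading of this constraint-mining idea, a minimal sketch looks as follows (assuming a minimization objective; the data and all names are hypothetical):

```python
import numpy as np

def mine_constraints(dataset):
    # For each observed pair (c, y), record the constraint  c^T y' >= c^T y:
    # no feasible target y' may beat the observed optimum y under its own
    # cost vector c (minimization convention).
    return [(c, c @ y) for c, y in dataset]

def satisfies_mined(y_candidate, mined):
    # A candidate target is kept iff it violates none of the mined inequalities.
    return all(c @ y_candidate >= bound - 1e-9 for c, bound in mined)

# Hypothetical toy data: two cost vectors with their observed optimal targets.
dataset = [(np.array([1.0, 2.0]), np.array([1.0, 0.0])),
           (np.array([3.0, 1.0]), np.array([0.0, 1.0]))]
mined = mine_constraints(dataset)
print(satisfies_mined(np.array([1.0, 1.0]), mined))  # True: beats neither bound
```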
Convex Polytope Machines (CPM) [Kantchelian et al., 2014] learn a non-linear binary classifier in the form of a polytope from a dataset of positive and negative examples. In contrast, our goal is to learn the cost as well as the constraints of an ILP, where both the constraints and the cost could be parameterized by another neural network. Moreover, we do not have access to any negative samples.
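For intuition, a minimal sketch of such a polytope classifier follows (the inside/outside convention and all names are our own illustration, not necessarily CPM's exact formulation): a point is labeled by whether it violates any face of a learned intersection of halfspaces, which yields a non-linear decision boundary from purely linear pieces.

```python
import numpy as np

def polytope_predict(X, W, b):
    # One class is modeled as the interior of the polytope {x : Wx + b <= 0},
    # an intersection of halfspaces. A point is labeled positive iff it
    # violates at least one face, i.e. iff max_k (w_k . x + b_k) > 0.
    # The max over faces is what makes the decision boundary non-linear.
    scores = X @ W.T + b                     # (n_points, n_faces)
    return (scores.max(axis=1) > 0).astype(int)

# Toy polytope: the square [-1, 1]^2 written as four halfspaces.
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([-1.0, -1.0, -1.0, -1.0])
X = np.array([[0.0, 0.0],    # inside the polytope  -> 0
              [2.0, 0.0]])   # outside the polytope -> 1
print(polytope_predict(X, W, b))  # [0 1]
```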