
queries as she wants (including none). However, the more queries she answers, the better the ML
model will typically approximate her true preferences, which benefits her in the last phase,
where MLCM computes the final allocation based on all ML models (if a student has answered no
queries, only her GUI reports are used in the final allocation computation).3
In Section 4, we describe all components of MLCM. First, we give a detailed description of MLCM
(Section 4.1). Then we formally introduce our query generation algorithm, which we call online
binary insertion sort (OBIS) (Section 4.2). We prove that the amount of information that can be
inferred for a student grows superlinearly in the number of OBIS-generated CQs she answers. We
provide a simple example showcasing how MLCM with OBIS queries helps alleviate a student's
reporting mistakes (Section 4.3). Finally, we extend the theoretical guarantees of CM to MLCM
(Section 4.4). Importantly, we explain why MLCM is also strategyproof in the large.4
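To build intuition for why OBIS-generated comparison queries (CQs) are so informative, consider the
following minimal sketch; it is an illustration under simplifying assumptions, not the authors'
implementation. The function name insert_bundle and the ask oracle are hypothetical, and the real
OBIS operates on the ML model's predicted ranking rather than a plain list. Inserting a bundle into
a sorted ranking by binary search asks only logarithmically many CQs, yet each completed insertion
pins down, by transitivity, the new bundle's relation to every already-ranked bundle.

from typing import Callable, List, Tuple

def insert_bundle(ranking: List[str], new: str,
                  ask: Callable[[str, str], bool]) -> List[Tuple[str, str]]:
    # Insert `new` into `ranking` (best first) via binary search,
    # returning the comparison queries that were asked along the way.
    queries = []
    lo, hi = 0, len(ranking)
    while lo < hi:
        mid = (lo + hi) // 2
        queries.append((new, ranking[mid]))
        if ask(new, ranking[mid]):  # student prefers `new` to ranking[mid]
            hi = mid
        else:
            lo = mid + 1
    ranking.insert(lo, new)
    return queries

Ranking n bundles this way asks O(n log n) CQs in total but determines all of the roughly n^2/2
pairwise relations among them, which is one way to see how the inferred information can grow
superlinearly in the number of answered queries.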
In Section 5, we instantiate the ML model used by MLCM with monotone-value neural networks
(MVNNs) (Weissteiner et al. 2022a), an architecture specifically designed for combinatorial
assignment problems. We introduce a training method that combines the cardinal input from the
CM language and the ordinal feedback from the CQs in accordance with Bayesian theory.
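The following sketch illustrates, under assumptions of our own, the two ingredients this section
combines: a network that is monotone non-decreasing in the 0/1 course bundle, and a loss with a
cardinal term (fitting GUI-reported values) plus an ordinal term (penalizing violated CQ answers).
The class and function names are hypothetical; monotonicity is enforced here by clamping weights
to be non-negative together with a bounded ReLU activation, and the margin loss merely stands in
for the paper's Bayesian treatment of ordinal feedback.

import torch
import torch.nn as nn

class BoundedReLU(nn.Module):
    # Activation clipped to [0, t]; together with non-negative weights,
    # this keeps the network monotone non-decreasing in its inputs.
    def __init__(self, t: float = 1.0):
        super().__init__()
        self.t = t

    def forward(self, x):
        return torch.clamp(x, min=0.0, max=self.t)

class MonotoneNet(nn.Module):
    # Value model monotone non-decreasing in the 0/1 course bundle.
    def __init__(self, n_courses: int, hidden: int = 32):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(n_courses, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, 1),
        ])
        self.act = BoundedReLU()

    def forward(self, bundle):
        h = bundle
        for i, layer in enumerate(self.layers):
            layer.weight.data.clamp_(min=0.0)  # in practice, clamp after each optimizer step
            h = layer(h)
            if i < len(self.layers) - 1:
                h = self.act(h)
        return h.squeeze(-1)

def combined_loss(model, cardinal, ordinal, margin=0.1):
    # cardinal: (bundles, reported_values); ordinal: (preferred, other).
    x, y = cardinal
    loss = torch.mean((model(x) - y) ** 2)  # fit the cardinal GUI reports
    a, b = ordinal
    # Penalize predictions that contradict answered comparison queries.
    loss = loss + torch.mean(torch.relu(margin - (model(a) - model(b))))
    return loss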
In Section 6, we introduce a new course allocation simulation framework to evaluate MLCM. Its
first component is a realistic student preference generator, designed so that each student’s complete
preferences can be encoded in a succinct Mixed Integer Program (MIP). This allows computing
a benchmark allocation given the students’ true preferences. Its second component models the
students’ reporting mistakes when interacting with the CM language. We calibrate the framework’s
parameters based on real-world data from the lab experiment of Budish and Kessler (2022).5
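As a toy illustration of why a succinct MIP encoding is useful, the sketch below computes one
student's best feasible schedule with a single solve, using the PuLP library. Everything here is
invented for illustration (course names, values, and the single complementarity bonus); the
paper's generator encodes much richer preferences, but the principle (encode preferences with
linear constraints, then maximize) is the same.

from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary, value

values = {"c1": 50, "c2": 40, "c3": 30}  # hypothetical per-course base values
pair, bonus = ("c1", "c2"), 20           # hypothetical complementarity bonus
max_courses = 2

prob = LpProblem("best_schedule", LpMaximize)
x = {c: LpVariable(c, cat=LpBinary) for c in values}
z = LpVariable("z", cat=LpBinary)        # 1 iff both courses in `pair` are taken

# Linearize z = x["c1"] * x["c2"].
prob += z <= x[pair[0]]
prob += z <= x[pair[1]]
prob += z >= x[pair[0]] + x[pair[1]] - 1

prob += lpSum(values[c] * x[c] for c in values) + bonus * z  # objective
prob += lpSum(x.values()) <= max_courses                     # schedule size cap
prob.solve()
print({c: int(value(x[c])) for c in values})  # expect {"c1": 1, "c2": 1, "c3": 0}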
In Section 7, we empirically evaluate the performance of MLCM. We find that MLCM significantly
outperforms CM in terms of both average and minimum student utility, even with only one CQ
(Section 7.2). We show that our results are robust to changes in the students' reporting
mistakes (Section 7.3), that the expected benefit of a student unilaterally opting into MLCM is
large (Section 7.4), and that MLCM also outperforms CM when students have simple additive
preferences (Section 7.5). To explain the performance improvements of MLCM, we show that OBIS
generates CQs that are surprising and informative for the ML models (Section 7.6). Finally, we
3. An alternative approach is to only use the trained ML models to update the student's GUI reports,
and to then run the original CM mechanism on the updated reports. In Section 7.8, we show that this
approach also works and leads to essentially the same efficiency, but is approximately 10 times
slower in terms of run-time.
4. A mechanism is strategyproof in the large if, for sufficiently many agents and any full-support
i.i.d. distribution of the other agents' reports, reporting truthfully is approximately interim
optimal (Azevedo and Budish 2019).
5. The source code for our course allocation simulator is already publicly available (we omit the
hyperlink to preserve anonymity). We have submitted the source code for the simulator, for MLCM, and
for all of our experiments as part of the supplementary material, and we will make it publicly
available upon acceptance of this paper.