
queries as she wants (including none). However, the more queries she answers, the better the ML
model will typically approximate her true preferences, which benefits her in the last phase,
where MLCM computes the final allocation based on all ML models (if a student has answered no
queries, only her GUI reports are used in the final allocation computation).3
In Section 4, we describe all components of MLCM. First, we give a detailed description of MLCM
(Section 4.1). Then we formally introduce our query generation algorithm, which we call online
binary insertion sort (OBIS) (Section 4.2). We prove that the amount of information that can be
inferred for a student grows superlinearly in the number of OBIS-generated CQs she answers. We
provide a simple example showcasing how MLCM with OBIS queries helps alleviate a student's
reporting mistakes (Section 4.3). Finally, we extend the theoretical guarantees of CM to MLCM
(Section 4.4). Importantly, we explain why MLCM is also strategyproof in the large.4
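To build intuition for why OBIS-generated comparison queries (CQs) are so informative, consider the
following minimal sketch; it is an illustration under simplifying assumptions, not the authors'
implementation. The function name insert_bundle and the ask oracle are hypothetical, and the real
OBIS operates on the ML model's predicted ranking rather than a plain list. Inserting a bundle into
a sorted ranking by binary search asks only logarithmically many CQs, yet each completed insertion
pins down, by transitivity, the new bundle's relation to every already-ranked bundle.

from typing import Callable, List, Tuple

def insert_bundle(ranking: List[str], new: str,
                  ask: Callable[[str, str], bool]) -> List[Tuple[str, str]]:
    # Insert `new` into `ranking` (best first) via binary search,
    # returning the comparison queries that were asked along the way.
    queries = []
    lo, hi = 0, len(ranking)
    while lo < hi:
        mid = (lo + hi) // 2
        queries.append((new, ranking[mid]))
        if ask(new, ranking[mid]):  # student prefers `new` to ranking[mid]
            hi = mid
        else:
            lo = mid + 1
    ranking.insert(lo, new)
    return queries

Ranking n bundles this way asks O(n log n) CQs in total but determines all of the roughly n^2/2
pairwise relations among them, which is one way to see how the inferred information can grow
superlinearly in the number of answered queries.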
In Section 5, we instantiate the ML model used by MLCM with monotone-value neural networks
(MVNNs) (Weissteiner et al. 2022a), an architecture specifically designed for combinatorial
assignment problems. We introduce a training method that combines the cardinal input from the
CM language and the ordinal feedback from the CQs in accordance with Bayesian theory.
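The following sketch illustrates, under assumptions of our own, the two ingredients this section
combines: a network that is monotone non-decreasing in the 0/1 course bundle, and a loss with a
cardinal term (fitting GUI-reported values) plus an ordinal term (penalizing violated CQ answers).
The class and function names are hypothetical; monotonicity is enforced here by clamping weights
to be non-negative together with a bounded ReLU activation, and the margin loss merely stands in
for the paper's Bayesian treatment of ordinal feedback.

import torch
import torch.nn as nn

class BoundedReLU(nn.Module):
    # Activation clipped to [0, t]; together with non-negative weights,
    # this keeps the network monotone non-decreasing in its inputs.
    def __init__(self, t: float = 1.0):
        super().__init__()
        self.t = t

    def forward(self, x):
        return torch.clamp(x, min=0.0, max=self.t)

class MonotoneNet(nn.Module):
    # Value model monotone non-decreasing in the 0/1 course bundle.
    def __init__(self, n_courses: int, hidden: int = 32):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Linear(n_courses, hidden),
            nn.Linear(hidden, hidden),
            nn.Linear(hidden, 1),
        ])
        self.act = BoundedReLU()

    def forward(self, bundle):
        h = bundle
        for i, layer in enumerate(self.layers):
            layer.weight.data.clamp_(min=0.0)  # in practice, clamp after each optimizer step
            h = layer(h)
            if i < len(self.layers) - 1:
                h = self.act(h)
        return h.squeeze(-1)

def combined_loss(model, cardinal, ordinal, margin=0.1):
    # cardinal: (bundles, reported_values); ordinal: (preferred, other).
    x, y = cardinal
    loss = torch.mean((model(x) - y) ** 2)  # fit the cardinal GUI reports
    a, b = ordinal
    # Penalize predictions that contradict answered comparison queries.
    loss = loss + torch.mean(torch.relu(margin - (model(a) - model(b))))
    return loss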
In Section 6, we introduce a new course allocation simulation framework to evaluate MLCM. Its
first component is a realistic student preference generator, designed so that each student’s complete
preferences can be encoded in a succinct Mixed Integer Program (MIP). This allows computing
a benchmark allocation given the students’ true preferences. Its second component models the
students’ reporting mistakes when interacting with the CM language. We calibrate the framework’s
parameters based on real-world data from the lab experiment of Budish and Kessler (2022).5
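As a toy illustration of why a succinct MIP encoding is useful, the sketch below computes one
student's best feasible schedule with a single solve, using the PuLP library. Everything here is
invented for illustration (course names, values, and the single complementarity bonus); the
paper's generator encodes much richer preferences, but the principle (encode preferences with
linear constraints, then maximize) is the same.

from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary, value

values = {"c1": 50, "c2": 40, "c3": 30}  # hypothetical per-course base values
pair, bonus = ("c1", "c2"), 20           # hypothetical complementarity bonus
max_courses = 2

prob = LpProblem("best_schedule", LpMaximize)
x = {c: LpVariable(c, cat=LpBinary) for c in values}
z = LpVariable("z", cat=LpBinary)        # 1 iff both courses in `pair` are taken

# Linearize z = x["c1"] * x["c2"].
prob += z <= x[pair[0]]
prob += z <= x[pair[1]]
prob += z >= x[pair[0]] + x[pair[1]] - 1

prob += lpSum(values[c] * x[c] for c in values) + bonus * z  # objective
prob += lpSum(x.values()) <= max_courses                     # schedule size cap
prob.solve()
print({c: int(value(x[c])) for c in values})  # expect {"c1": 1, "c2": 1, "c3": 0}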
In Section 7, we empirically evaluate the performance of MLCM. We find that MLCM significantly
outperforms CM in terms of both average and minimum student utility, even with only one CQ
(Section 7.2). We show that our results are robust to changes in the students' reporting
mistakes (Section 7.3), that the expected benefit of a student unilaterally opting into MLCM is
large (Section 7.4), and that MLCM also outperforms CM when students have simple additive
preferences (Section 7.5). To explain the performance improvements of MLCM, we show that OBIS
generates CQs that are surprising and informative for the ML models (Section 7.6). Finally, we
3. An alternative approach is to only use the trained ML models to update the student's GUI reports,
and to then run the original CM mechanism on the updated reports. In Section 7.8, we show that this
approach also works and leads to essentially the same efficiency, but is approximately 10 times
slower in terms of run-time.
4. A mechanism is strategyproof in the large if, for sufficiently many agents and any full-support
i.i.d. distribution of the other agents' reports, reporting truthfully is approximately interim
optimal (Azevedo and Budish 2019).
5. The source code for our course allocation simulator is already publicly available (we omit the
hyperlink to preserve anonymity). We have submitted the source code for the simulator, for MLCM, and
for all of our experiments as part of the supplementary material, and we will make it publicly
available upon acceptance of this paper.