and $f(\theta_i, \theta_j) = 0$ as a failure. The algorithm observes the success of each team, and may use
this to select the matchings in subsequent rounds; however, the algorithm cannot directly
observe agents' types. For any matching $M$, we define its score as $S(M) := \sum_{(i,j) \in M} f(\theta_i, \theta_j)$
— in the special case of Boolean functions, this is the number of successful teams.
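As a concrete illustration, the score is a simple sum over teams. The following minimal Python sketch (ours, not from the paper; the synergy function `f` and type vector `theta` are hypothetical inputs) computes $S(M)$ for a matching given as a list of pairs.

```python
def score(M, theta, f):
    """S(M): sum of f(theta_i, theta_j) over the teams (i, j) in matching M."""
    return sum(f(theta[i], theta[j]) for (i, j) in M)

# With a Boolean f such as AND, the score counts successful teams:
# here agents 0, 1, 3 are 1-agents and agent 2 is a 0-agent.
assert score([(0, 1), (2, 3)], [1, 1, 0, 1], lambda a, b: a & b) == 1
```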
A convenient way to view the Boolean setting is as constructing an edge-labeled exploration
graph $G(V, E_1, E_2, \ldots)$, where nodes in $V$ are agents, and the edge set $E_t := \bigcup_{t' \le t} M_{t'}$
represents all pairings played up to round $t$. Upon being played for the first time, an edge is
assigned a label in $\{0,1\}$ corresponding to the success value of its team. Known 0-agents and
known 1-agents are those whose types can be inferred from the edge labels. The remaining
agents are unknown. The unresolved subgraph is the induced subgraph on the unknown
agents.
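To make the bookkeeping concrete, here is a minimal sketch of an exploration-graph data structure. It is our own illustration, specialized (purely as an assumption) to the AND synergy function, under which a 1-labeled edge reveals both endpoints as 1-agents, and a 0-labeled edge incident to a known 1-agent reveals its other endpoint as a 0-agent; the inference rules differ for the other synergy functions.

```python
class ExplorationGraph:
    """Sketch of the exploration graph, specialized to f(a, b) = a AND b."""

    def __init__(self, n):
        self.n = n
        self.labels = {}   # frozenset({i, j}) -> observed label in {0, 1}
        self.known = {}    # agent -> inferred type, for known 0-/1-agents

    def observe(self, i, j, label):
        """Record the label of edge {i, j} and propagate what it reveals."""
        self.labels[frozenset((i, j))] = label
        if label == 1:     # under AND, a success reveals both endpoints as 1-agents
            self.known[i] = self.known[j] = 1
        self._propagate()

    def _propagate(self):
        """Under AND, a 0-edge incident to a known 1-agent pins its other
        endpoint to type 0; iterate until no further inference fires."""
        changed = True
        while changed:
            changed = False
            for edge, label in self.labels.items():
                if label != 0:
                    continue
                i, j = tuple(edge)
                for a, b in ((i, j), (j, i)):
                    if self.known.get(a) == 1 and b not in self.known:
                        self.known[b] = 0
                        changed = True

    def unknown_agents(self):
        """Agents whose types cannot yet be inferred; they induce the
        unresolved subgraph."""
        return [v for v in range(self.n) if v not in self.known]
```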
2.2 Adversarial Types and Regret
The principal makes decisions facing an adaptive adversary, who knows $k$ (unlike the principal,
who only knows $n$) and, in each round, is free to assign agent types after seeing the matching
chosen by the principal, as long as (1) the assignment is consistent with prior observations
(i.e., with the exploration graph), and (2) the number of 1-agents is $k$. Note that this is
the strongest notion of an adversary we can consider in this setting; in particular, since the
adversary is fully adaptive and knows the matching before making decisions, randomizing does
not help, and so it is without loss of generality to consider only deterministic algorithms.
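The adversary's feasible set under conditions (1) and (2) can be made explicit. The brute-force sketch below is our illustration (practical only for small $n$; `labels` is a hypothetical map from played edges to observed values): it enumerates every type assignment with exactly $k$ 1-agents that agrees with the exploration graph, and the adversary may answer with any of them.

```python
from itertools import combinations

def consistent_assignments(n, k, labels, f):
    """Yield every type vector with exactly k 1-agents that reproduces
    all observed edge labels (conditions (1) and (2) above)."""
    for ones in combinations(range(n), k):
        one_set = set(ones)
        theta = [1 if v in one_set else 0 for v in range(n)]
        if all(f(theta[i], theta[j]) == lab for (i, j), lab in labels.items()):
            yield theta
```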
We evaluate the performance of algorithms in terms of additive regret against such an ad-
versary. Formally, let $M^*$ be any matching maximizing $S(M)$ — note that for any Boolean
team success function, $S(M^*)$ is a fixed function of $n$ and $k$. In round $t$, an algorithm incurs
regret $r_t := S(M^*) - S(M_t)$, and its total regret is the sum of its per-round regret over an
a priori infinite time horizon. Note, however, that after a finite number of rounds, a naïve
algorithm that enumerates all matchings can determine, and henceforth play, $M^*$; thus, the
optimal regret is always finite. Moreover, the “effective” horizon (i.e., the time until the
algorithm learns $M^*$) of our algorithms is small.
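The finiteness argument can be phrased algorithmically. The sketch below is our illustration (`play` is a hypothetical oracle that plays one round and returns the observed score $S(M)$): it plays each of the $(n-1)!!$ perfect matchings once and then commits to the best one. Since labels never change once observed, each matching's score is pinned down when it is played, so committing to the best observed matching is sound even against the adaptive adversary.

```python
def perfect_matchings(agents):
    """Recursively enumerate all perfect matchings of an even-length list."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for idx, partner in enumerate(rest):
        others = rest[:idx] + rest[idx + 1:]
        for sub in perfect_matchings(others):
            yield [(first, partner)] + sub

def naive_algorithm(n, play):
    """Play each perfect matching once, then commit to the best one seen."""
    best_score, best_M = float("-inf"), None
    for M in perfect_matchings(list(range(n))):
        s = play(M)          # one round: play M, observe its score S(M)
        if s > best_score:
            best_score, best_M = s, M
    while True:              # henceforth play M*, incurring zero regret
        play(best_M)
```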
2.3 Symmetric Synergy Functions and Atomic Primitives
In subsequent sections, we consider the problem of minimizing regret under four Boolean
synergy functions $f: \{0,1\}^2 \to \{0,1\}$: EQ, XOR, OR, and AND. Interestingly, the algorithms
for these four settings suffice to handle any symmetric synergy function $f: \{0,1\}^2 \to \mathbb{R}$. We
argue this below for synergy functions that take at most two values; we handle the case of
synergy functions $f$ taking three different values at the end of Section 3.1.
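For concreteness, the benchmark $S(M^*)$ from Section 2.2 admits simple closed forms under each of these four functions. The sketch below records them; the formulas are our own derivation via straightforward pairing arguments (assuming $n$ is even), not quoted from the text.

```python
def opt_score(g, n, k):
    """S(M*) as a function of (n, k) for each atomic Boolean synergy function."""
    if g == "AND":  # only 1-1 teams succeed, so pair the 1-agents together
        return k // 2
    if g == "OR":   # a team fails only if both agents are 0-agents
        return min(k, n // 2)
    if g == "XOR":  # only mixed 0-1 teams succeed
        return min(k, n - k)
    if g == "EQ":   # same-type teams succeed, so pair like with like
        return k // 2 + (n - k) // 2
    raise ValueError(f"unknown synergy function: {g}")
```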
Lemma 1. Fix some $\ell \le u$, let $f: \{0,1\}^2 \to \{\ell, u\}$ be any symmetric synergy function, and
let $r_f(n, k)$ denote the optimal regret with $n$ agents, of which $k$ have type 1.
Then, $r_f(n, k) = (u - \ell) \cdot r_g(n, k)$ for one of $g \in \{\mathrm{EQ}, \mathrm{XOR}, \mathrm{AND}, \mathrm{OR}\}$.
Proof. First, note that without loss of generality, we may assume that $f(0,0) \le f(1,1)$.
Otherwise, we can swap the labels of the agent types without altering the problem. Note
that this immediately allows us to reduce team formation under the Boolean NAND and
NOR functions to the same problem under OR and AND, respectively. Next, note that if