
that satisfy it are either constant or decreasing (meaning
that they cannot increase under any alternative). This hap-
pens, for example, when testing exchangeability or testing
log-concavity; see Section 5.5 for references and details.
Luckily, generalizing the above game protocol resusci-
tates the approach. There appear to be two different types
of generalized games: (a) one can restrict the amount of
information available to the Skeptic by introducing a third
player (an “Intermediary”) who throws away some infor-
mation revealed by Reality (mathematically, Skeptic op-
erates in a shrunk filtration), (b) one can instead make
the Skeptic play many games in parallel, each against a
different subset of P, with the Skeptic’s net wealth be-
ing their worst wealth across all the parallel games. In
the first case, Skeptic’s wealth may remain a nonnega-
tive (super)martingale, but in the second case, their wealth
is an e-process (under the null, their wealth is upper-
bounded by a different supermartingale in each game,
and thus has expectation at most one at any stopping time). While
these solutions may seem almost magical at first glance,
they both yield fruit for the same problem mentioned
above of testing exchangeability: approach (a) is used
in Vovk, Gammerman and Shafer (2022, Part III) and ap-
proach (b) in Ramdas et al. (2022). The latter work, along
with Ruf et al. (2022), together show the centrality of e-
processes in game-theoretic statistics: e-processes exist
for many P for which nonnegative (super)martingales do
not.
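Approach (b) can be illustrated with a small simulation. The sketch below is a toy setup chosen here for illustration, not taken from the works cited above: the null is a finite set of Bernoulli parameters, each parallel game is played against one of them, and the Skeptic's wealth in each game is the likelihood ratio against a fixed alternative `alt_q`. All function names and parameter values are hypothetical.

```python
import itertools
import math


def worst_case_wealth(xs, null_ps, alt_q):
    """Running minimum-over-games wealth for binary observations xs.

    Game j is played against the single null Bernoulli(null_ps[j]). In that
    game the wealth is the likelihood ratio of Bernoulli(alt_q) to the null,
    which is a nonnegative martingale under that null. The net wealth is the
    smallest wealth across all games.
    """
    log_wealth = [0.0] * len(null_ps)  # every game starts with wealth 1
    path = []
    for x in xs:
        for j, p in enumerate(null_ps):
            num = alt_q if x == 1 else 1.0 - alt_q
            den = p if x == 1 else 1.0 - p
            log_wealth[j] += math.log(num / den)
        path.append(math.exp(min(log_wealth)))
    return path


def expected_final_wealth(n, true_p, null_ps, alt_q):
    """Exact expectation of the time-n net wealth when data are Bernoulli(true_p)."""
    total = 0.0
    for xs in itertools.product([0, 1], repeat=n):
        prob = math.prod(true_p if x == 1 else 1.0 - true_p for x in xs)
        total += prob * worst_case_wealth(xs, null_ps, alt_q)[-1]
    return total


if __name__ == "__main__":
    null_ps = [0.3, 0.4, 0.5]  # hypothetical finite null class
    for p in null_ps:
        print("null p =", p, "expected wealth:",
              expected_final_wealth(6, p, null_ps, alt_q=0.7))
    print("alternative, expected wealth:",
          expected_final_wealth(6, 0.7, null_ps, alt_q=0.7))
```

Because the minimum never exceeds the wealth in the game played against the true null, its expectation under that null is at most one; `expected_final_wealth` checks this by exact enumeration at a fixed time.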
When P and Q have a common reference measure,
meaning that likelihood-ratio based methods are still in
play, two key ideas stand out: universal inference
(Wasserman, Ramdas and Balakrishnan, 2020) and the reverse
information projection (Grünwald, De Heide and Koolen,
2023). The former always yields an e-process, whereas the
latter always yields an e-value, which can be multiplied
across batches of data to yield a supermartingale.
Sometimes the latter also directly yields an e-process (and
when it does, it dominates universal inference).
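For concreteness, one common way of writing the two constructions is sketched below. The notation is an assumption introduced here rather than taken from the text above: $p$ and $q$ are densities with respect to the common reference measure, $\hat q_{i-1}$ is any alternative density chosen using only $X_1,\dots,X_{i-1}$, and $p^{*}$ is the reverse information projection of $q$ onto the convex hull of the null class, assuming the minimum is attained.

```latex
% Universal inference (sequential plug-in form): an e-process for \mathcal{P}
E_t^{\mathrm{UI}}
  \;=\; \frac{\prod_{i=1}^{t} \hat q_{i-1}(X_i)}
             {\sup_{p \in \mathcal{P}} \prod_{i=1}^{t} p(X_i)} .

% Reverse information projection: an e-value for one batch of data X
E^{\mathrm{RIPr}} \;=\; \frac{q(X)}{p^{*}(X)},
\qquad
p^{*} \in \arg\min_{p \in \mathrm{conv}(\mathcal{P})} \mathrm{KL}(q \,\|\, p) .
```

For any fixed null density $p_0$, the denominator of $E_t^{\mathrm{UI}}$ is at least $\prod_{i \le t} p_0(X_i)$, so $E_t^{\mathrm{UI}}$ is upper-bounded by the likelihood-ratio martingale $\prod_{i \le t} \hat q_{i-1}(X_i)/p_0(X_i)$, which is why it is an e-process.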
When P and Q do not have any common reference measure,
and thus likelihood-ratio based methods may not make any
sense at the outset, the design of nonnegative
(super)martingales or e-processes occupies center stage.
Sometimes, the nonparametric definition of P directly
yields a natural game, like when testing if a
“sub-Gaussian” mean is positive (Darling and Robbins, 1967).
Other times, one must design new games in possibly
shrunk filtrations, which may not be obvious at the outset,
like in two-sample testing (Shekhar and Ramdas, 2021).
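As a concrete instance of such a natural game, suppose that under the null each observation is sigma-sub-Gaussian with conditional mean at most zero. The sketch below bets with a fixed size `lam >= 0`, a simplification assumed here and not claimed to be the construction in the cited work. The function names, `lam`, and the simulated Gaussian data are illustrative choices.

```python
import math
import random


def subgaussian_wealth(xs, lam, sigma=1.0):
    """Wealth path exp(lam * S_t - t * lam^2 * sigma^2 / 2) for lam >= 0.

    If each x is sigma-sub-Gaussian with conditional mean <= 0, every factor
    exp(lam * x - lam^2 * sigma^2 / 2) has conditional expectation <= 1, so
    the path is a nonnegative supermartingale starting from 1.
    """
    log_w = 0.0
    path = []
    for x in xs:
        log_w += lam * x - 0.5 * (lam * sigma) ** 2
        path.append(math.exp(log_w))
    return path


def first_rejection_time(path, alpha):
    """First time the wealth reaches 1/alpha, or None if it never does."""
    for t, w in enumerate(path, start=1):
        if w >= 1.0 / alpha:
            return t
    return None


if __name__ == "__main__":
    random.seed(0)
    null_data = [random.gauss(0.0, 1.0) for _ in range(1000)]
    alt_data = [random.gauss(0.5, 1.0) for _ in range(1000)]
    for name, data in [("null", null_data), ("alternative", alt_data)]:
        path = subgaussian_wealth(data, lam=0.5)
        print(name, "first rejection time:", first_rejection_time(path, 0.05))
```

By Ville's inequality, the probability that this wealth ever reaches `1 / alpha` under the null is at most `alpha`, so rejecting at the first crossing is valid at any stopping time.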
The entire discussion above was centered on testing by
betting, because this typically forms the technical heart
of other problems that are not cast explicitly as testing.
For example, appropriate duality concepts and inversions
allow us to translate many of these results into those for
estimation of appropriate functionals using confidence se-
quences (Section 5). Both e-processes and confidence se-
quences can in turn be extended to other problems like
change detection (Section 5.7), model selection, etc.
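The inversion from tests to confidence sequences can be sketched as follows: run one betting game per candidate parameter value and keep the candidates whose game has not yet reached wealth 1/alpha. The example below is a hedged illustration for a sigma-sub-Gaussian mean, using a finite grid of candidates and a fixed two-sided bet size `lam`. The grid, `lam`, and all function names are assumptions made here, not constructions from the text.

```python
import math
import random


def two_sided_wealth(running_sum, t, m, lam, sigma=1.0):
    """Wealth at time t when betting against the null 'mean == m'.

    Averages an upward and a downward bet of size lam. Each is a nonnegative
    supermartingale under the null for sigma-sub-Gaussian data, so the
    average is one too.
    """
    centered = running_sum - t * m
    penalty = 0.5 * t * (lam * sigma) ** 2
    up = math.exp(lam * centered - penalty)
    down = math.exp(-lam * centered - penalty)
    return 0.5 * (up + down)


def confidence_sequence(xs, grid, alpha, lam, sigma=1.0):
    """For each time t, the sorted grid points not rejected at any time <= t."""
    alive = set(grid)
    running_sum = 0.0
    sets = []
    for t, x in enumerate(xs, start=1):
        running_sum += x
        alive = {m for m in alive
                 if two_sided_wealth(running_sum, t, m, lam, sigma) < 1.0 / alpha}
        sets.append(sorted(alive))
    return sets


if __name__ == "__main__":
    random.seed(1)
    data = [random.gauss(1.0, 1.0) for _ in range(200)]
    grid = [i / 10 for i in range(-20, 41)]  # candidate means from -2.0 to 4.0
    sets = confidence_sequence(data, grid, alpha=0.05, lam=0.5)
    print("surviving candidates:", len(sets[-1]),
          "lowest:", sets[-1][:1], "highest:", sets[-1][-1:])
```

A fixed `lam` keeps the sketch short, but the resulting set does not shrink to a point as more data arrive; mixing over several bet sizes is the usual remedy.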
In fact, our investigations reveal a curious phenomenon:
at the heart of many (and plausibly, all) nonparametric
testing and estimation problems is a “hidden” game (often
not unique: the same P and Q may be associated with
different filtrations and betting strategies that are
e-processes under P and make money under Q). Further,
explicitly bringing out such games (and betting well in
them) can yield powerful new methodology as well as
new theoretical insights (Howard et al., 2020).
A full understanding of when and why this happens
is open, but we provide one hint here. Likelihood ratios
have been at the center of statistics for nearly a century.
Nonnegative (super)martingales and e-processes are sim-
ply nonparametric, composite generalizations of likeli-
hood ratios, and these have been found to exist in dozens
of problems where one cannot even begin to talk about
likelihood-ratio based methods. Thus, these tools give
us a way to work implicitly with likelihood ratios, even
when there appears to be no explicit way to do so. Given
the power (and sometimes optimality) of likelihood-ratio
based tests in parametric settings, we perhaps get a hint of
the power of our game-theoretic approaches in composite
(often nonparametric) settings.
The rest of this paper will formally define the key con-
cepts, and provide technical details of the aforementioned
methods and phenomena in different problem settings.
2. CENTRAL CONCEPTS
In the sequel, we leave measurability assumptions and
other measure-theoretic details implicit so far as possible.
2.1 E-values
An e-variable for $\mathcal{P}$ is a nonnegative random variable $E$
such that $\mathbf{E}_P[E] \le 1$ for all $P \in \mathcal{P}$. Its realized
value, after observing the data, is an e-value.² Often we call $E$
itself an e-value, blurring the distinction between the random
variable and its realized value. (The term “p-value” is also
often used for both random variables and their values.)
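The defining inequality can be checked directly on a finite sample space. The sketch below uses a hypothetical three-point sample space with a null class of two distributions and one alternative, all numbers chosen here for illustration. It compares the likelihood ratio against a single member of the class with the pointwise minimum of the likelihood ratios against every member.

```python
def expectation(e_values, pmf):
    """Expected value of the function e_values under the distribution pmf."""
    return sum(e * p for e, p in zip(e_values, pmf))


def is_e_variable(e_values, null_pmfs, tol=1e-12):
    """True if e_values is nonnegative with expectation <= 1 under every null."""
    nonnegative = all(e >= 0 for e in e_values)
    return nonnegative and all(
        expectation(e_values, pmf) <= 1.0 + tol for pmf in null_pmfs
    )


# Hypothetical null class and alternative on the sample space {0, 1, 2}.
null_pmfs = [[0.5, 0.3, 0.2], [0.4, 0.4, 0.2]]
alt_pmf = [0.2, 0.3, 0.5]

# Likelihood ratio against the first null only: expectation exactly 1 under
# that null, but larger than 1 under the second null.
lr_first = [q / p for q, p in zip(alt_pmf, null_pmfs[0])]

# Pointwise minimum of the likelihood ratios against each null: never larger
# than either likelihood ratio, so its expectation is <= 1 under both nulls.
lr_min = [min(q / pmf[x] for pmf in null_pmfs) for x, q in enumerate(alt_pmf)]

if __name__ == "__main__":
    print("single likelihood ratio:", is_e_variable(lr_first, null_pmfs))
    print("minimum of likelihood ratios:", is_e_variable(lr_min, null_pmfs))
```

The comparison shows why a composite null needs more care than a simple one: a candidate must satisfy the expectation bound under every member of the class, not just one.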
When $\mathbf{E}_P[E] = 1$, we call the e-value $E$ a unit bet
against $P$. This name evokes a story in which expected
values are prices of payoffs: the Forecaster predicts that
$X \sim P$, and in order to bet against them, a Skeptic could
buy one unit of $E$ for the price of 1, which delivers the
Skeptic a payoff of $E(X)$. Mathematically, a unit bet against $P$
is simply³ a likelihood ratio $dQ/dP$ for some alternative
$Q$. This is elementary when we use probability densities:
² Observe that we use boldface $\mathbf{E}$ for expectation and normal $E$ for
e-values. The “e” in e-value stands both for “evidence” (because it
quantifies statistical evidence against the null) and for “expectation”
(because its central property is its expectation).
³ In some sense, statisticians have always been using e-values (and
test martingales), because likelihood ratios are the most important
example of e-values (and test martingales). But this direct analog only
holds when testing a single distribution $P$. The power and utility of
e-values, test (super)martingales and e-processes are truly realized only
when dealing with a composite (and sometimes nonparametric) $\mathcal{P}$.