Making Decisions under Outcome Performativity
Michael P. Kim
University of California, Berkeley
mpkim@berkeley.edu
Juan C. Perdomo
University of California, Berkeley
jcperdomo@berkeley.edu
January 10, 2023
Abstract

Decision-makers often act in response to data-driven predictions, with the goal of achieving favorable outcomes. In such settings, predictions don't passively forecast the future; instead, predictions actively shape the distribution of outcomes they are meant to predict. This performative prediction setting [PZMH20] raises new challenges for learning "optimal" decision rules. In particular, existing solution concepts do not address the apparent tension between the goals of forecasting outcomes accurately and steering individuals to achieve desirable outcomes.

To contend with this concern, we introduce a new optimality concept—performative omniprediction—adapted from the supervised (non-performative) learning setting [GKR+22]. A performative omnipredictor is a single predictor that simultaneously encodes the optimal decision rule with respect to many possibly-competing objectives. Our main result demonstrates that efficient performative omnipredictors exist, under a natural restriction of performative prediction, which we call outcome performativity. On a technical level, our results follow by carefully generalizing the notion of outcome indistinguishability [DKR+21, GHK+23] to the outcome performative setting. From an appropriate notion of Performative OI, we recover many consequences known to hold in the supervised setting, such as omniprediction and universal adaptability [KKG+22].
Authors listed alphabetically.
arXiv:2210.01745v2 [cs.LG] 7 Jan 2023
1 Introduction
Data-driven predictions inform policy decisions that directly impact individuals. Proponents
argue that by understanding patterns from the past, decisions can be optimized to improve
future outcomes, to the benefit of individuals and institutions [KLMO15]. In the US educational
system, for instance, early warning systems (EWS) have become a key tool used by states to
combat low graduation rates [BB19, US16]. The rationale for using such systems is clear. Given
a predictor that, for each student, estimates the likelihood of graduation, school districts can
identify high-risk students at a young age, directing resources to improve individuals’ outcomes,
and in turn, the districts’ graduation rates. Despite compelling arguments, reliably predicting
life outcomes remains a largely-unsolved problem in machine learning.
A key challenge in utilizing predictions to inform decisions is that, often, predictions influence the outcomes they're meant to forecast. In the education example above, districts consider predictions of graduation with the intention of effecting graduation outcomes. In this situation—where predictions determine interventions, which influence outcomes—accuracy can be a paradoxical notion. If a predictor correctly identifies high-risk individuals as likely to suffer negative outcomes, after successful interventions, the individuals' outcomes will be positive and the initial predictions will appear inaccurate. To apply data-driven tools effectively, decision-makers must resolve an apparent tension between the objectives of forecasting individuals' outcomes reliably and steering individuals to achieve better outcomes.
Recent work of [PZMH20] introduced performative prediction to contend with the fact that
predictions not only forecast, but also shape the world. Informally, a prediction problem is
performative if the act of prediction influences the distribution on individual-outcome pairs.
From early warning systems, to online content recommendations, to public health advisories:
across many contexts, individuals respond to predictions in a manner that changes the likelihood
of possible outcomes (successful graduation, increased click rate, or decreased disease caseload).
In their original work on the subject, [PZMH20] frame the goal of performative prediction through loss minimization. In this framing, the ultimate goal is to learn a performatively optimal decision rule. A decision rule $h_{\mathrm{po}}$ is performatively optimal if it achieves the minimal expected loss (within some class of decision rules $\mathcal{H}$) over the distribution that it induces,
$$h_{\mathrm{po}} \in \arg\min_{h \in \mathcal{H}} \; \mathbb{E}_{(x,y) \sim \mathcal{D}(h)}\left[\ell(x, h(x), y)\right]. \tag{1}$$
Here, $\mathcal{D}(h)$ is the distribution over $(x, y)$ pairs observed in response to deploying $h$.
For generality's sake, performative prediction makes minimal restrictions on how the distribution may respond to a chosen decision rule. In particular, the choice to deploy a hypothesis $h$ may change the joint distribution $(x, y) \sim \mathcal{D}(h)$ over individual-outcome pairs essentially arbitrarily.¹
This generality enables us to write a broad range of prediction problems—including
supervised learning [SB14], strategic classification [HMPW16], and causal inference [MMH20]—
as special cases of performative prediction. In all, [PZMH20] establishes a powerful framework
for reasoning about settings where the distribution of examples responds to the predictions.
While powerful, the framework has two noticeable limitations. First, achieving performative optimality is hard. Without any assumptions on the distributional response $\mathcal{D}(\cdot)$, achieving performative optimality requires exhaustive search over the hypothesis class $\mathcal{H}$. Furthermore, even under strong structural assumptions on the distributional response and choice of loss $\ell$, it
¹[PZMH20] assume only a Lipschitzness condition, where similar hypotheses $h$ and $h'$ give rise to similar distributions $\mathcal{D}(h)$ and $\mathcal{D}(h')$, measured in Wasserstein (earth mover's) distance.
is known that convex optimization does not suffice to achieve optimality [PZMH20, MPZ21]. Stated another way: the generality of performative prediction does not come for free. To date, all existing methods for performative optimality require strong specification assumptions on the outcome distribution and distributional response.
The second limitation arises from formulating performative prediction as a loss minimization problem: the loss $\ell$ is fixed, once and for all. In performative prediction, different losses can encode drastically different objectives: losses are used not only to promote accuracy of predictions, but also to encourage favorable outcome distributions. Consider a loss designed for accurate forecasting, e.g., the squared error $(\hat{y} - y)^2$. In this case, the optimal decision rule will prioritize accuracy without regard for the "quality" of the outcome distribution. On the other hand, consider a loss designed to steer towards positive outcomes, $1 - y$. Here, there is no notion of accuracy (the loss ignores the prediction $\hat{y}$); instead, the objective is to nudge the distribution of outcomes towards $y = 1$.
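As a toy illustration of how these two objectives can diverge, the following sketch (all probabilities hypothetical) evaluates both losses against the same outcome model and finds that they prefer opposite decisions:

```python
# Sketch: under the same outcome model, a forecasting loss and a steering loss
# can prefer different decisions. All numbers below are illustrative.

# Hypothetical outcome model: Pr[y = 1 | decision yhat]. Here a favorable
# decision (yhat = 1) raises the chance of a positive outcome.
p = {0: 0.2, 1: 0.6}
decisions = [0, 1]

def expected_loss(loss, y_hat):
    # Expectation over the Bernoulli outcome induced by the decision itself.
    q = p[y_hat]
    return q * loss(y_hat, 1) + (1 - q) * loss(y_hat, 0)

squared_error = lambda y_hat, y: (y_hat - y) ** 2   # forecasting objective
steer_to_one  = lambda y_hat, y: 1 - y              # steering objective

best_forecast = min(decisions, key=lambda yh: expected_loss(squared_error, yh))
best_steering = min(decisions, key=lambda yh: expected_loss(steer_to_one, yh))
print(best_forecast, best_steering)  # → 0 1
```

The forecasting loss picks the decision that is easiest to predict accurately, while the steering loss picks the decision that makes $y = 1$ most likely, even though both optimize against the same outcome model.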
Encoding the decision-making objective through a single loss function forces the learner to choose the "correct" objective at train time. Downstream decision-makers, however, may reasonably want to explore different objectives according to their own sense of "optimality". In the existing formulations of performative prediction, exploring different losses requires re-training from scratch. In this work, we investigate an alternative formulation that enables decision-makers to efficiently explore optimal decision rules under many different objectives.
1.1 Decision-Making under Outcome Performativity
To begin, we introduce a special case of the performative prediction setting, which we call outcome performativity. Outcome performativity focuses on the effects of local decisions on individuals' outcomes, rather than the effect of broader policy on the distribution of individuals. For instance, our example of graduation prediction is modeled well by outcome performativity. For a given student, the EWS prediction they receive affects their future graduation outcome, but does not influence their demographic features or historical test scores. In other words, we narrow our attention to the performative effects of decisions $h(x)$ on the conditional distribution over outcomes $y$, rather than the effects of the decision rule $h$ on the distribution as a whole, $\mathcal{D}(h)$. This reframing of performativity still captures many important decision-making problems, but gives us additional structure to address some of the limitations of the original formulation.
On a technical level, outcome performativity imagines a data generating process over triples $(x, \hat{y}, y)$, where $x \sim \mathcal{D}$ is sampled from a static distribution over inputs, then a prediction or decision $\hat{y} \in \widehat{\mathcal{Y}}$ is selected (possibly as a function of $x$), and finally the true outcome $y \in \mathcal{Y}$ is sampled conditioned on $x$ and $\hat{y}$. We focus on binary outcomes $\mathcal{Y} = \{0, 1\}$.² In this setting, the outcome performativity assumption posits the existence of an underlying probability function,
$$p : \mathcal{X} \times \widehat{\mathcal{Y}} \to [0, 1],$$
where for a given individual $x \in \mathcal{X}$ and decision $\hat{y} \in \widehat{\mathcal{Y}}$, the true outcome $y$ is sampled as a Bernoulli with parameter $p(x, \hat{y})$. We refer to the true outcome distribution $p$ as Nature.
By asserting a fixed "ground truth" probability function, the outcome performativity framework does not allow for arbitrary distributional responses and limits the generality of the approach. For instance, outcome performativity does not capture strategic classification. But importantly, by refining the model of performativity, there is hope that we may sidestep the hardness results for learning optimal performative predictors.

²In general, outcome performativity could be defined for larger outcome domains. Handling such domains is possible, but technical. We restrict our attention to binary outcomes to focus on the novel conceptual issues.

Figure 1: Causal graphical representation of the outcome performativity data generating process (nodes $x$, $\hat{y}$, $y$).
Performative Omniprediction. We begin by observing that under outcome performativity, the true probability function $p$ suggests an optimal decision rule $f_\ell : \mathcal{X} \to \widehat{\mathcal{Y}}$ for any loss $\ell$. In our setting, $p$ governs the outcome distribution, so given an input $x \in \mathcal{X}$, the optimal decision $f_\ell(x)$ is determined by a simple, univariate optimization procedure over a discrete set $\widehat{\mathcal{Y}}$:
$$f_\ell(x) \in \arg\min_{\hat{y} \in \widehat{\mathcal{Y}}} \; \mathbb{E}_{y \sim p(x, \hat{y})}\left[\ell(x, \hat{y}, y)\right]. \tag{2}$$
Note that the decision rule $f_\ell(x)$ minimizes the loss pointwise for each $x \in \mathcal{X}$. Consequently, averaging over any static feature distribution $\mathcal{D}$, the decision rule $f_\ell$ is performatively optimal for any hypothesis class $\mathcal{H}$, loss $\ell$, and marginal distribution $\mathcal{D}$:
$$\mathbb{E}_{\substack{x \sim \mathcal{D} \\ y \sim p(x, f_\ell(x))}}\left[\ell(x, f_\ell(x), y)\right] \;\le\; \min_{h \in \mathcal{H}} \; \mathbb{E}_{\substack{x \sim \mathcal{D} \\ y \sim p(x, h(x))}}\left[\ell(x, h(x), y)\right].$$
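A minimal sketch of the post-processing in Eq. (2), under the assumption that $\widehat{\mathcal{Y}}$ is a small discrete set and $p$ is given as a lookup table (both hypothetical):

```python
# Sketch: the pointwise-optimal decision rule of Eq. (2), computed by
# enumerating the discrete decision set. P and the loss are illustrative.

Y_HAT = [0, 1]
# Hypothetical Nature: P[(x, yhat)] = Pr[y = 1 | x, yhat].
P = {("a", 0): 0.2, ("a", 1): 0.6,
     ("b", 0): 0.7, ("b", 1): 0.4}

def expected_loss(loss, x, y_hat):
    # Expectation over y ~ Bernoulli(p(x, yhat)): note the outcome
    # distribution already reflects the candidate decision.
    q = P[(x, y_hat)]
    return q * loss(x, y_hat, 1) + (1 - q) * loss(x, y_hat, 0)

def f_opt(loss, x):
    # Eq. (2): univariate minimization over the discrete set Y_HAT.
    return min(Y_HAT, key=lambda yh: expected_loss(loss, x, yh))

squared_error = lambda x, y_hat, y: (y_hat - y) ** 2
print(f_opt(squared_error, "a"), f_opt(squared_error, "b"))  # → 0 1
```

Because the minimization is per-input and per-loss, the cost of extracting a decision is linear in $|\widehat{\mathcal{Y}}|$, regardless of how complex $p$ itself is.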
While the existence of $p$ implies the existence of optimal decision rules under outcome performativity, we make no assumptions about the learnability of $p$. In general, the function $p$ may be arbitrarily complex, so learning (or even representing!) $p$ may be infeasible, both computationally and statistically. Still, the above analysis reveals the power of modeling the probability function $p : \mathcal{X} \times \widehat{\mathcal{Y}} \to [0, 1]$. The optimal probability function $p$ encodes the optimal decision rule $f_\ell$ for every loss function $\ell$. This perspective raises a concrete technical question: short of learning $p$, can we learn a probability function $\tilde{p} : \mathcal{X} \times \widehat{\mathcal{Y}} \to [0, 1]$ that suggests an optimal decision rule, via simple post-processing, for many different objectives?
Recent work of [GKR+22] studied the analogous question in the context of supervised learning (without performativity), formalizing a solution concept which they call omniprediction. Intuitively, an omnipredictor is a single probability function $\tilde{p}$ that suggests an optimal decision rule for many different loss functions $\mathcal{L}$. The work of [GKR+22] and follow-up work of [GHK+23] demonstrate—rather surprisingly—that omniprediction in supervised learning is broadly a feasible concept. For a variety of choices of loss classes $\mathcal{L}$ (e.g., Lipschitz losses or convex losses), it is possible to learn an efficient predictor $\tilde{p}$ that gives optimal decisions for any loss $\ell \in \mathcal{L}$.
In this work, we generalize omniprediction to the outcome performative setting. As a solution concept, performative omniprediction directly addresses the limiting assumption in performative prediction that the loss $\ell$ is known and fixed. Given a performative omnipredictor, a decision-maker can explore the consequences of optimizing for different losses, balancing the desire for forecasting and steering as they see fit. Technically, given a predictor $\tilde{p}$, we define $\tilde{f}_\ell : \mathcal{X} \to \widehat{\mathcal{Y}}$ to be the optimal decision rule that acts as if outcomes are governed by $\tilde{p}$:
$$\tilde{f}_\ell(x) \in \arg\min_{\hat{y} \in \widehat{\mathcal{Y}}} \; \mathbb{E}_{\tilde{y} \sim \tilde{p}(x, \hat{y})}\left[\ell(x, \hat{y}, \tilde{y})\right].$$
We emphasize that, for any loss $\ell$, the decision rule $\tilde{f}_\ell(x)$ is an efficient post-processing of the predictions given by $\tilde{p}(x, \hat{y})$ for $\hat{y} \in \widehat{\mathcal{Y}}$. A performative omnipredictor is a model of Nature $\tilde{p} : \mathcal{X} \times \widehat{\mathcal{Y}} \to [0, 1]$ that induces a corresponding decision rule $\tilde{f}_\ell$ that is performatively optimal over a collection of losses $\ell \in \mathcal{L}$.
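To illustrate the "single predictor, many losses" idea, the sketch below post-processes one hypothetical table $\tilde{p}$ into a separate decision per loss via the same argmin routine; the loss names and numbers are illustrative, not from the paper:

```python
# Sketch: one model-of-Nature table p_tilde yields a decision rule for each
# loss via identical argmin post-processing. All values are illustrative.

Y_HAT = [0, 1]
P_TILDE = {("a", 0): 0.2, ("a", 1): 0.6}   # p_tilde(x, yhat) = Pr[y = 1]

def post_process(loss, x):
    # Expected loss under p_tilde, then argmin over the discrete decisions.
    def exp_loss(yh):
        q = P_TILDE[(x, yh)]
        return q * loss(x, yh, 1) + (1 - q) * loss(x, yh, 0)
    return min(Y_HAT, key=exp_loss)

losses = {
    "forecast": lambda x, yh, y: (yh - y) ** 2,  # reward accurate prediction
    "steer_1":  lambda x, yh, y: 1 - y,          # reward outcome y = 1
    "steer_0":  lambda x, yh, y: y,              # reward outcome y = 0
}
decisions = {name: post_process(l, "a") for name, l in losses.items()}
print(decisions)  # → {'forecast': 0, 'steer_1': 1, 'steer_0': 0}
```

The predictor is trained once; each downstream objective, even mutually contradictory steering objectives, is served by the same cheap post-processing step.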
Definition (Performative Omnipredictor). For a collection of loss functions $\mathcal{L}$, hypothesis class $\mathcal{H}$, and $\varepsilon > 0$, a predictor $\tilde{p} : \mathcal{X} \times \widehat{\mathcal{Y}} \to [0, 1]$ is an $(\mathcal{L}, \mathcal{H}, \varepsilon)$-performative omnipredictor for an input distribution $\mathcal{D}$ if for every $\ell \in \mathcal{L}$, the decision rule $\tilde{f}_\ell$ is $\varepsilon$-performatively optimal over $\mathcal{H}$:
$$\mathbb{E}_{\substack{x \sim \mathcal{D} \\ y \sim p(x, \tilde{f}_\ell(x))}}\left[\ell(x, \tilde{f}_\ell(x), y)\right] \;\le\; \min_{h \in \mathcal{H}} \; \mathbb{E}_{\substack{x \sim \mathcal{D} \\ y \sim p(x, h(x))}}\left[\ell(x, h(x), y)\right] + \varepsilon. \tag{3}$$
While an intriguing prospect, omniprediction is particularly ambitious in the performative world. Whereas most supervised learning losses have the same moral goal (to accurately forecast the outcome), losses in the performative world can encode entirely contradictory objectives. For instance, we can define a pair of losses $\ell_0$ and $\ell_1$ that reward decisions that steer outcomes to be $0$ and $1$, respectively. A performative omnipredictor must contend with these contradictions, providing optimal decision rules under performative effects.
Concretely, under outcome performativity, there is a certain circularity in naively determining the optimal decision $\tilde{f}(x)$ from a prediction $\tilde{p}(x, \hat{y})$. Choosing an "optimal" decision $\tilde{f}(x)$ causes a shift in the distribution on the outcome $y \sim p(x, \tilde{f}(x))$, which may imply a different "optimal" decision, which seems to lead to a continuing cycle of dependency. In this way, any performative omnipredictor $\tilde{p}$ must encode the optimal decision rule $\tilde{f}_\ell$ for each $\ell \in \mathcal{L}$, anticipating the shift induced by the choice of $\tilde{f}_\ell$. In this work, we ask whether—despite this key challenge—efficient performative omnipredictors exist, and if so, can we learn them?
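The cycle of dependency can be made concrete: a decision-maker who naively best-responds to the outcome distribution induced by the current decision, instead of solving the argmin of Eq. (2) directly, can oscillate forever. The self-negating outcome model below is a hypothetical worst case:

```python
# Sketch of the circularity: naively re-optimizing against the distribution
# induced by the *current* decision can cycle forever. The outcome model is
# a hypothetical self-negating one, chosen to make the cycle visible.

Y_HAT = [0, 1]
# If the decision is 0, outcome 1 is likely; if the decision is 1, outcome 0
# is likely -- so "accurately forecasting" the induced outcome undoes itself.
P = {0: 0.9, 1: 0.1}  # Pr[y = 1 | yhat]
squared_error = lambda yh, y: (yh - y) ** 2

def best_response(y_hat):
    # Freeze the outcome distribution induced by the current decision, then
    # pick the accuracy-optimal decision against that frozen distribution.
    q = P[y_hat]
    return min(Y_HAT,
               key=lambda yh: q * squared_error(yh, 1)
                              + (1 - q) * squared_error(yh, 0))

trajectory = [0]
for _ in range(5):
    trajectory.append(best_response(trajectory[-1]))
print(trajectory)  # → [0, 1, 0, 1, 0, 1]
```

By contrast, the argmin in Eq. (2) evaluates each candidate decision against the distribution that decision itself would induce, so it anticipates the shift rather than chasing it.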
1.2 Our Contributions
Our first contributions are conceptual, introducing the outcome performativity setting and the notion of performative omnipredictors. As an abstraction, outcome performativity strikes a balance with enough generality to model many real-world phenomena and enough structure to give effective solutions. For settings where the distributional response occurs predominantly as outcome performativity, the framework is well-scoped to contend with the challenges of performative prediction. In particular, performative omnipredictors provide an effective solution concept to address the tension between different objectives under performativity.

With these conceptual contributions in place, we turn to the feasibility of omniprediction under outcome performativity. While outcome performativity introduces a number of new challenges, we show how to apply many techniques established for omniprediction in the supervised learning setting to recover analogous guarantees under performativity. On a technical level, we follow the Loss Outcome Indistinguishability approach of [GHK+23], demonstrating how—with the right conceptual framing—arguments for the existence of supervised omnipredictors can be translated into guarantees for the performative setting.