
1 Introduction
Data-driven predictions inform policy decisions that directly impact individuals. Proponents
argue that by understanding patterns from the past, decisions can be optimized to improve
future outcomes, to the benefit of individuals and institutions [KLMO15]. In the US educational
system, for instance, early warning systems (EWS) have become a key tool used by states to
combat low graduation rates [BB19,US 16]. The rationale for using such systems is clear. Given
a predictor that, for each student, estimates the likelihood of graduation, school districts can
identify high-risk students at a young age, directing resources to improve individuals’ outcomes,
and in turn, the districts’ graduation rates. Despite compelling arguments, reliably predicting
life outcomes remains a largely unsolved problem in machine learning.
A key challenge in utilizing predictions to inform decisions is that, often, predictions
influence the outcomes they’re meant to forecast. In the education example above, districts
consider predictions of graduation with the intention of affecting graduation outcomes. In this
situation—where predictions determine interventions, which influence outcomes—accuracy can
be a paradoxical notion. If a predictor correctly identifies high-risk individuals as likely to suffer
negative outcomes, after successful interventions, the individuals’ outcomes will be positive and
the initial predictions will appear inaccurate. To apply data-driven tools effectively, decision-
makers must resolve an apparent tension between the objectives of forecasting individuals’
outcomes reliably and steering individuals to achieve better outcomes.
Recent work of [PZMH20] introduced performative prediction to contend with the fact that
predictions not only forecast, but also shape the world. Informally, a prediction problem is
performative if the act of prediction influences the distribution on individual-outcome pairs.
From early warning systems, to online content recommendations, to public health advisories:
across many contexts, individuals respond to predictions in a manner that changes the likelihood
of possible outcomes (successful graduation, increased click rate, or decreased disease caseload).
In their original work on the subject, [PZMH20] frame the goal of performative prediction
through loss minimization. In this framing, the ultimate goal is to learn a performatively optimal
decision rule. A decision rule
hpo
is performatively optimal if it achieves the minimal expected
loss (within some class of decision rules H) over the distribution that it induces,
hpo ∈argmin
h∈H
E
(x,y)∼D(h)[`(x,h(x),y)].(1)
Here, $\mathcal{D}(h)$ is the distribution over $(x,y)$ pairs observed in response to deploying $h$.
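As a minimal illustration of objective (1), consider the following sketch. It uses a toy, made-up distribution map $\mathcal{D}(h)$ (not from the paper) in which deploying a higher constant risk score lowers the chance of a negative outcome, mimicking interventions triggered by predictions; the performative risk of each hypothesis is estimated by Monte Carlo, and the optimum is found by brute-force search over a finite class. All names and the response model here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_from_D(h, n=10_000):
    """Toy distributional response D(h): deploying score h shifts the
    outcome probability (interventions prompted by high predicted risk
    improve outcomes). Purely illustrative, not the paper's model."""
    x = rng.uniform(0, 1, size=n)           # individual features
    p_y = np.clip(x - 0.5 * h, 0.0, 1.0)    # deployed prediction shifts outcomes
    y = rng.binomial(1, p_y)                # binary outcomes under D(h)
    return x, y

def performative_risk(h):
    """Monte Carlo estimate of E_{(x,y)~D(h)}[loss(x, h(x), y)]
    for the constant predictor h(x) = h, with squared loss."""
    x, y = sample_from_D(h)
    return np.mean((h - y) ** 2)

# Exhaustive search over a finite hypothesis class H, as in Eq. (1).
H = np.linspace(0, 1, 21)
risks = [performative_risk(h) for h in H]
h_po = H[int(np.argmin(risks))]
print(f"performatively optimal h: {h_po:.2f}")
```

Note that the minimizer trades off fitting the induced outcomes against the way the deployed score itself moves those outcomes, which is exactly what distinguishes (1) from ordinary risk minimization over a fixed distribution.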
For generality’s sake, performative prediction makes minimal restrictions on how the distribution may respond to a chosen decision rule. In particular, the choice to deploy a hypothesis $h$ may change the joint distribution $(x,y) \sim \mathcal{D}(h)$ over individual-outcome pairs essentially arbitrarily.¹
This generality enables us to write a broad range of prediction problems—including
supervised learning [SB14], strategic classification [HMPW16], and causal inference [MMH20]—
as special cases of performative prediction. In all, [PZMH20] establishes a powerful framework
for reasoning about settings where the distribution of examples responds to the predictions.
While powerful, the framework has two noticeable limitations. First, achieving performative optimality is hard. Without any assumptions on the distributional response $\mathcal{D}(\cdot)$, achieving performative optimality requires exhaustive search over the hypothesis class $\mathcal{H}$. Furthermore, even under strong structural assumptions on the distributional response and choice of loss $\ell$, it
¹[PZMH20] assume only a Lipschitzness condition, where similar hypotheses $h$ and $h'$ give rise to similar distributions $\mathcal{D}(h)$ and $\mathcal{D}(h')$, measured in Wasserstein (earth mover’s) distance.