
methods. By dramatically reducing discriminatory bias, our method outperforms the state-of-the-art methods while maintaining reasonably short interval lengths.
Related works. Existing approaches for building a fair mean regression broadly fall into three classes: pre-processing, in-processing, and post-processing. Pre-processing methods transform the data to remove unwanted bias [6, 34, 50]; in-processing methods build fairness constraints into the training step [2, 5, 28, 32]; post-processing methods modify the trained predictor [12, 13, 31]. Since few previous works have addressed quantile fairness or fair prediction intervals, the most closely related work is Yang et al. [46], where a different fairness measure was used. Agarwal et al. [2] mentioned that their reduction-based approach can be adapted to quantile regression, and Williamson and Menon [44] introduced a novel conditional-value-at-risk fairness measure aiming to control the largest subgroup risk. As for interval fairness, the approach of Romano et al. [37] achieves equalized coverage among groups, but without fairness on the interval endpoints. Methodologically, integrating algorithmic fairness with the Wasserstein-distance-based barycenter problem has been studied in [3, 12, 13, 21, 26]; both in-processing [2, 26] and post-processing [12, 13] methods were proposed to solve classification and mean regression problems. As a post-processing method, our work is distinct from the above-mentioned methods in that it constructs DP-fairness for each population quantile and generates a fair prediction interval accordingly.
Notations. We denote by [K] the set {1, . . . , K} for an arbitrary integer K, and by |S| the cardinality of a finite set S. E and P denote expectation and probability, and 1{·} is the indicator function. Let {Z_n}_{n=1}^∞ be a sequence of random variables and {k_n}_{n=1}^∞ a sequence of positive numbers; we say that Z_n = O_p(k_n) if lim_{T→∞} lim sup_{n→∞} P(|Z_n| > T k_n) = 0, i.e., Z_n/k_n = O_p(1). To denote equality in distribution of two random variables A and B, we write A =_d B.
2 Problem statement
Consider the regression problem where a “sensitive characteristic” S is available, which by U.S. law [21, 37] can be enumerated as sex, race, age, disability, etc. We observe the triplets (X_1, S_1, Y_1), . . . , (X_n, S_n, Y_n); denote (X_i, S_i, Y_i) by Z_i, i = 1, . . . , n, where each Z_i is a random variable in R^p × [K] × R. The aim is to predict the unknown value of Y_{n+1} at a test point (X_{n+1}, S_{n+1}). Let P be the joint distribution of Z; we assume that all the samples {Z_i}_{i=1}^{n+1} are drawn exchangeably, with i.i.d. sampling as a special case.
Our goal is to construct a marginal distribution-free prediction band C(X_{n+1}, S_{n+1}) ⊆ R that is likely to cover the unknown response Y_{n+1} with finite-sample (non-asymptotic) validity. Formally, given a desired miscoverage rate α, the prediction interval satisfies

P{Y_{n+1} ∈ C(X_{n+1}, S_{n+1})} ≥ 1 − α,    (1)

for any joint distribution P and any sample size n, while the left and right endpoints of C(X_{n+1}, S_{n+1}) satisfy the fairness constraint of Demographic Parity with respect to the sensitive variable S.
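Guarantee (1) is the standard marginal-coverage property of conformal prediction for exchangeable data. As a generic illustration only (not the method proposed in this paper), the following minimal split-conformal sketch attains it with absolute-residual conformity scores; the toy data, the least-squares predictor, and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: feature X, binary sensitive attribute S, response Y.
n = 2000
X = rng.normal(size=n)
S = rng.integers(0, 2, size=n)
Y = 2.0 * X + 0.5 * S + rng.normal(size=n)

# Split: first half trains the predictor, second half calibrates the band.
train, cal = np.arange(0, n // 2), np.arange(n // 2, n)

# Any fitted regressor works here; we use least squares on (X, S, 1).
A = np.column_stack([X, S, np.ones(n)])
beta, *_ = np.linalg.lstsq(A[train], Y[train], rcond=None)

# Conformity scores on the calibration set: absolute residuals.
scores = np.abs(Y[cal] - A[cal] @ beta)

# Finite-sample-valid quantile index: ceil((n_cal + 1)(1 - alpha)).
alpha = 0.1
n_cal = len(cal)
k = int(np.ceil((n_cal + 1) * (1 - alpha)))
qhat = np.sort(scores)[min(k, n_cal) - 1]

# Prediction band C(x, s) = [f(x, s) - qhat, f(x, s) + qhat] at a new point.
x_new, s_new = 0.3, 1
pred = np.array([x_new, s_new, 1.0]) @ beta
lo, hi = pred - qhat, pred + qhat
```

Under exchangeability of the n + 1 samples, the resulting interval covers Y_{n+1} with probability at least 1 − α, matching (1); the construction itself imposes no fairness constraint on lo and hi.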
Demographic Parity. We introduce the quantitative definition of DP in fair regression and connect DP-fairness with a quantile regressor q_α. The result that q_α can be projected to its fair counterpart using optimal transport will be invoked later.
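To make the optimal-transport projection concrete, here is a sketch of the standard one-dimensional Wasserstein-barycenter construction used in the fair-regression literature cited above: a group-s prediction is mapped through its group CDF and then through the weighted mixture of group quantile functions. The synthetic group distributions and all names below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for the group-conditional prediction distributions nu_{q|s}.
preds = {0: rng.normal(0.0, 1.0, 800), 1: rng.normal(1.0, 1.5, 1200)}
weights = {s: len(v) / 2000 for s, v in preds.items()}  # group frequencies p_s

def cdf(sample, t):
    """Empirical CDF F(t): fraction of the sample that is <= t."""
    return np.searchsorted(np.sort(sample), t, side="right") / len(sample)

def quantile(sample, u):
    """Generalized inverse Q(u) = inf{y : F(y) >= u} of the empirical CDF."""
    srt = np.sort(sample)
    k = max(int(np.ceil(u * len(srt))), 1)
    return srt[k - 1]

def fair_projection(q_value, s):
    """Transport a group-s prediction to the Wasserstein barycenter:
    q_fair = sum_k p_k * Q_k(F_s(q_value))."""
    u = cdf(preds[s], q_value)
    return sum(weights[k] * quantile(preds[k], u) for k in preds)
```

After projection, predictions from both groups share a single distribution (the barycenter), so equal-rank predictions from different groups receive the same fair value.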
Fix a quantile level α; it may refer to α_lo or α_hi, indicating the lower and upper quantile estimates for the prediction band C(X_{n+1}, S_{n+1}). Let q_α : R^p × [K] → R represent an arbitrary conditional quantile predictor. Denote by ν_{q_α|s} the distribution of (q_α(X, S) | S = s); the Cumulative Distribution Function (CDF) of ν_{q_α|s} is given by

F_{ν_{q_α|s}}(t) = P(q_α(X, S) ≤ t | S = s).    (2)
The quantile function Q_{ν_{q_α|s}} = F^{-1}_{ν_{q_α|s}} : [0, 1] → R, namely the generalized inverse of F_{ν_{q_α|s}}, can thus be defined for all levels t ∈ (0, 1] as

Q_{ν_{q_α|s}}(t) = inf{y ∈ R : F_{ν_{q_α|s}}(y) ≥ t},  with  Q_{ν_{q_α|s}}(0) = Q_{ν_{q_α|s}}(0+).    (3)

To simplify the notations, we will write F_{q_α|s} and Q_{q_α|s} instead of F_{ν_{q_α|s}} and Q_{ν_{q_α|s}}, respectively, for any prediction rule q_α.
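Definitions (2) and (3) have direct empirical counterparts on a finite sample. The sketch below (variable names assumed for illustration) computes the group-conditional empirical CDF and its generalized inverse, restricted to the subsample with S = s.

```python
import numpy as np

def empirical_cdf(values):
    """Return the empirical CDF F as a callable: F(t) = #{v <= t} / n, as in (2)."""
    srt = np.sort(values)
    return lambda t: np.searchsorted(srt, t, side="right") / len(srt)

def generalized_inverse(values, t):
    """Q(t) = inf{y : F(y) >= t}, the generalized inverse of (3), with Q(0) = Q(0+)."""
    srt = np.sort(values)
    n = len(srt)
    if t <= 0:
        return srt[0]  # convention Q(0) = Q(0+): smallest support point
    k = int(np.ceil(t * n))  # smallest index k with F(srt[k-1]) >= t
    return srt[min(k, n) - 1]

# Group-conditional use: evaluate Q_{q_alpha|s} on the subsample with S == s.
rng = np.random.default_rng(1)
q_values = rng.normal(size=1000)       # stand-in for q_alpha(X_i, S_i)
S = rng.integers(0, 2, size=1000)      # sensitive attribute
median_group0 = generalized_inverse(q_values[S == 0], 0.5)
```

The `side="right"` convention makes the CDF right-continuous, which is what makes the infimum in (3) attained at the ceil(t·n)-th order statistic.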