Conformalized Fairness via Quantile Regression
Meichen Liu1, Lei Ding1, Dengdeng Yu2, Wulong Liu3, Linglong Kong1, Bei Jiang1

1Department of Mathematical and Statistical Sciences, University of Alberta
2Department of Mathematics, University of Texas at Arlington
3Huawei Noah’s Ark Lab Canada
{meichen1,lding1,lkong,bei1}@ualberta.ca
{dengdeng.yu}@uta.edu
{liuwulong}@huawei.com
Abstract
Algorithmic fairness has received increased attention in socially sensitive domains.
While a rich literature on mean fairness has been established, research on quantile
fairness remains sparse despite its importance. To address this need, we propose
a novel framework to learn a real-valued quantile function under the fairness
requirement of Demographic Parity with respect to sensitive attributes, such as
race or gender, and thereby derive a reliable fair prediction
interval. Using optimal transport and functional synchronization techniques, we
establish theoretical guarantees of distribution-free coverage and exact fairness for
the induced prediction interval constructed by fair quantiles. A hands-on pipeline
is provided to incorporate flexible quantile regressions with an efficient fairness
adjustment post-processing algorithm. We demonstrate the superior empirical
performance of this approach on several benchmark datasets. Our results show
the model’s ability to uncover the mechanism underlying the fairness-accuracy
trade-off in a wide range of societal and medical applications.
1 Introduction
We are increasingly leaning on machine learning systems to tackle human problems. A primary
objective is to develop intelligent algorithms that can automatically produce accurate decisions which
also enjoy equitable properties, as unintended social bias has been identified as a rising concern in
various fields [15, 20, 16].
As a means of providing quantitative measures of fairness, a number of metrics have been proposed.
These metrics can be categorized into three broad categories: group fairness [4], individual fairness
[28], and causality-based fairness [34]. In contrast to causality-based fairness that requires domain
knowledge to develop a fair causal structure and individual fairness that seeks equality only between
similar individuals, group fairness does not require any prior knowledge and seeks equality for groups
as a whole [7]. Among the metrics defined for group fairness such as equalized odds [10, 32] and
predictive rate parity [11], demographic parity (DP) is generic since it does not allow prediction
results in aggregate to depend on sensitive attributes [2, 24, 13, 46]. In particular, an algorithm is said
to satisfy DP if its prediction is independent of any given sensitive attribute [2].
There have been a number of studies on algorithmic fairness concerning DP [2, 12, 13, 24, 35, 46].
In the context of regression analysis, much attention has been paid to conditional mean inference
[2, 12, 13, 35], while few works are concerned with conditional quantiles [44, 46]. As real-world data often
exhibit heterogeneity, contain extreme outliers, or do not meet distributional assumptions
such as Gaussianity, a fairness discussion on conditional quantiles may be more rational and essential,
since they are able to provide a more complete understanding of the dependence structure between
response and explanatory variables [46], as well as better accommodate asymmetry and extreme
tail behavior [45]. It should also be noted that bias or unfairness that arises in mean regression may
also be propagated through quantile regression and must therefore be dealt with separately;
a graphic demonstration can be found in Figure 1. More intuitively, we may take an example
from a Spanish labor market study [19, 20]. The study found that in Spain, in line with other
countries, the mean wage gap between men and women is quite substantial: on average, women
earn around 70 percent of what men earn. Wage gaps, however, are not uniform across pay scales:
they are greater at higher quantiles than at lower quantiles. As biases and disparities at different
quantiles tend to be overshadowed by the mean behavior of the entire population, we propose a
novel framework to seek fair predictions at different quantiles. It uses optimal transport techniques
[3, 13], transforming bias-affected distributions into an only-fair Wasserstein-2 barycenter through
a kernel-based functional synchronization method [9, 52], in order to provide fair quantile estimators.

Co-Corresponding Authors
Preprint. Under review.
arXiv:2210.02015v2 [stat.ML] 14 Oct 2022
Figure 1: An illustration of quantile fairness: for a skewed and heteroscedastic quantile estimation
$\{q_{\alpha,i}\}_{i=1}^{N}$ affected by the sensitive attribute $S \in \{0, 1\}$, for example, the higher quantile of the
salary distribution, the optimal fair quantile prediction $Q_{g_\alpha}(t)$, $t \in (0, 1)$, is derived through a convex
combination of the conditional quantile functions $Q_{q_\alpha|S=0}$ and $Q_{q_\alpha|S=1}$.
Since quantile fairness poses a number of theoretical challenges, no previous literature has been
able to provide any inference results, such as prediction intervals, concerning quantile fairness. It
is imperative to keep in mind that fairness is only one of two legs of the primary goal of modern
machine learning algorithms, the other being accuracy. Building a reliable prediction with valid
confidence is a significant challenge encountered by many machine learning algorithms [51].
Towards this end, we propose the conformalized fair quantile prediction (CFQP), inspired by the
works of Romano et al. [37, 38]. Our analysis demonstrates, both mathematically and experimentally,
that CFQP provides finite-sample, distribution-free validity, DP fairness for different quantiles, and
precise control of the miscoverage rate, regardless of the underlying quantile algorithm.
Contributions and Outlines. In this paper, we propose a new quantile-based method with valid
inference that enhances both accuracy and fairness while maintaining a balance between the two. It is
a novel framework that allows exact control of prediction miscoverage while simultaneously ensuring
quantile fairness. The main contributions are summarized as follows:
i. We transform the problem of searching for quantiles under DP fairness into the construction
of multi-marginal Wasserstein-2 barycenters via optimal transport theory [3, 13, 21].
We incorporate a novel kernel smoothing step into the preceding method, which is particularly
advantageous for subgroups whose sample sizes are too small to obtain reliable quantile function
estimates.
ii. In Section 4, we propose a conformalized fair quantile prediction (CFQP) interval
inspired by the works of Romano et al. [37, 38]. It is proved mathematically to achieve
distribution-free validity, demographic parity at different quantiles, and exact control of
miscoverage rates, regardless of the quantile algorithm used. The theoretical validity of the prediction
interval constructed by CFQP and the exact DP of the fair quantile estimators are given in Section 5
and the supplement.
iii. The experimental results presented in Section 6 include a numerical comparison of the proposed
CFQP and fair quantile estimation with both state-of-the-art conformal and fairness-oriented
methods. By reducing the discriminatory bias dramatically, our method outperforms the
state-of-the-art methods while maintaining reasonably short interval lengths.
Related works. Existing approaches for building a fair mean regression broadly fall into three
classes: pre-processing, in-processing and post-processing. In particular, pre-processing methods
focus on transforming the data to remove any unwanted bias [6, 34, 50]; in-processing methods aim
to build fairness constraints into the training step [2, 5, 28, 32]; post-processing methods aim
to modify the trained predictor [12, 13, 31]. Few previous works have focused on quantile
fairness and fair prediction intervals; the most closely related is Yang et al. [46], where a different fairness
measure was used. Agarwal et al. [2] mentioned that their reduction-based approach can be
adapted to quantile regression, and Williamson and Menon [44] brought forward a novel conditional
value at risk fairness measure aiming to control the largest subgroup risk. As for interval fairness,
the approach by Romano et al. [37] achieved equalized coverage among groups without
fairness on interval endpoints. Methodologically, integrating algorithmic fairness with the
Wasserstein-distance-based barycenter problem has been studied in [3, 12, 13, 21, 26]. Both in-processing [2, 26]
and post-processing [12, 13] methods were proposed to solve classification and mean regression
problems. As a post-processing method, our work differs from the above-mentioned methods in that
it enforces DP fairness for each population quantile and generates a fair prediction interval
accordingly.
Notations. We denote by $[K]$ the set $\{1, \ldots, K\}$ for an arbitrary integer $K$. $|S|$ represents the cardinality
of a finite set $S$. $\mathbb{E}$ and $\mathbb{P}$ represent the expectation and probability, and $\mathbb{1}\{\cdot\}$ is the indicator function.
Let $\{Z_n\}_{n=1}^{\infty}$ be a sequence of random variables, and $\{k_n\}_{n=1}^{\infty}$ be a sequence of positive numbers.
We say that $Z_n = O_p(k_n)$ if $\lim_{T \to \infty} \limsup_{n \to \infty} \mathbb{P}(|Z_n| > T k_n) = 0$, in which case $Z_n / k_n = O_p(1)$. To
denote the equality in distribution of two random variables $A$ and $B$, we write $A \overset{d}{=} B$.
2 Problem statement
Consider the regression problem where a “sensitive characteristic” $S$ is available, which by
U.S. law [21, 37] can be enumerated as sex, race, age, disability, etc. We observe the triplets
$(X_1, S_1, Y_1), \ldots, (X_n, S_n, Y_n)$ and denote $(X_i, S_i, Y_i)$ by $Z_i$, $i = 1, \ldots, n$, where $Z_i$ is a random variable
in $\mathbb{R}^p \times [K] \times \mathbb{R}$. The aim is to predict the unknown value of $Y_{n+1}$ at a test point $(X_{n+1}, S_{n+1})$. Let
$P$ be the joint distribution of $Z$; we assume that all the samples $\{Z_i\}_{i=1}^{n+1}$ are drawn exchangeably,
of which i.i.d. sampling is a special case.
Our goal is to construct a marginal distribution-free prediction band $C(X_{n+1}, S_{n+1}) \subseteq \mathbb{R}$ that is
likely to cover the unknown response $Y_{n+1}$ with finite-sample (nonasymptotic) validity. Formally,
given a desired miscoverage rate $\alpha$, the prediction interval satisfies
$$\mathbb{P}\{Y_{n+1} \in C(X_{n+1}, S_{n+1})\} \ge 1 - \alpha \tag{1}$$
for any joint distribution $P$ and any sample size $n$, while the left and right endpoints of $C(X_{n+1}, S_{n+1})$
satisfy the fairness constraint of Demographic Parity concerning the sensitive variable $S$.
Demographic Parity. We introduce the quantitative definition of DP in fair regression and connect
DP fairness with a quantile regressor $q_\alpha$. The result that $q_\alpha$ can be projected to its fair counterpart
using optimal transport will be invoked later.
Fix a quantile level $\alpha$ (it may refer to $\alpha_{lo}$ or $\alpha_{hi}$, indicating the lower and upper quantile
estimates for the prediction band $C(X_{n+1}, S_{n+1})$), and let $q_\alpha : \mathbb{R}^p \times [K] \to \mathbb{R}$ represent an arbitrary
conditional quantile predictor. Denote by $\nu_{q_\alpha|s}$ the distribution of $(q_\alpha(X, S) \mid S = s)$. The cumulative
distribution function (CDF) of $\nu_{q_\alpha|s}$ is given by
$$F_{\nu_{q_\alpha|s}}(t) = \mathbb{P}(q_\alpha(X, S) \le t \mid S = s). \tag{2}$$
The quantile function $Q_{\nu_{q_\alpha|s}} = F^{-1}_{\nu_{q_\alpha|s}} : [0, 1] \to \mathbb{R}$, namely the generalized inverse of $F_{\nu_{q_\alpha|s}}$, can
thus be defined for all levels $t \in (0, 1]$ as
$$Q_{\nu_{q_\alpha|s}}(t) = \inf\{y \in \mathbb{R} : F_{\nu_{q_\alpha|s}}(y) \ge t\}, \quad \text{with } Q_{\nu_{q_\alpha|s}}(0) = Q_{\nu_{q_\alpha|s}}(0^{+}). \tag{3}$$
To simplify the notation, we will write $F_{q_\alpha|s}$ and $Q_{q_\alpha|s}$ instead of $F_{\nu_{q_\alpha|s}}$ and $Q_{\nu_{q_\alpha|s}}$, respectively,
for any prediction rule $q_\alpha$.
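As a minimal numerical sketch of Eqs. (2) and (3), the empirical analogues of the CDF and of its generalized inverse can be computed from a finite sample of predictions within a single group. The function names `empirical_cdf` and `generalized_inverse`, and the use of NumPy, are illustrative assumptions and are not part of the proposed method.

```python
import numpy as np

def empirical_cdf(samples, t):
    """Empirical analogue of Eq. (2): fraction of samples that are <= t."""
    samples = np.sort(np.asarray(samples, dtype=float))
    return np.searchsorted(samples, t, side="right") / samples.size

def generalized_inverse(samples, level):
    """Empirical analogue of Eq. (3): inf{y : F(y) >= level}.

    For level = 0 the smallest sample is returned, matching Q(0) = Q(0+).
    """
    samples = np.sort(np.asarray(samples, dtype=float))
    n = samples.size
    # Smallest 1-based order-statistic index k with k / n >= level;
    # the small tolerance guards against floating-point round-off.
    k = max(int(np.ceil(level * n - 1e-12)), 1)
    return samples[k - 1]

# Hypothetical predictions q_alpha(X, S) observed within one group s.
group_predictions = [3.0, 1.0, 2.0, 2.0]
cdf_at_two = empirical_cdf(group_predictions, 2.0)
median_prediction = generalized_inverse(group_predictions, 0.5)
```

Because the empirical CDF is a step function, the generalized inverse always returns one of the observed values, which is why the infimum in Eq. (3) is needed instead of an ordinary inverse.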
In the following, we introduce the definition of Demographic Parity (DP), which is most commonly
used in the context of fairness research [2, 12, 13, 24, 32].
Definition 1 (Demographic Parity). An arbitrary prediction $g : \mathbb{R}^p \times [K] \to \mathbb{R}$ satisfies demographic
parity under a distribution $P$ over $(X, S, Y)$ if $g(X, S)$ is statistically independent of the sensitive
attribute $S$. Formally, for every $s, s' \in [K]$,
$$\sup_{t \in \mathbb{R}} \left| \mathbb{P}(g(X, S) \le t \mid S = s) - \mathbb{P}(g(X, S) \le t \mid S = s') \right| = 0.$$
Demographic Parity (DP) requires the predictions to be independent of the sensitive attribute, and it
demands that the Kolmogorov-Smirnov distance [29] (the difference between CDFs measured in the
$\ell_\infty$ norm) between $\nu_{g|s}$ and $\nu_{g|s'}$ vanish for all categories $s, s'$.
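The Kolmogorov-Smirnov form of Definition 1 suggests a simple empirical diagnostic: compute the largest pairwise KS distance between the group-wise distributions of the predictions. The sketch below is one possible way to do this; the names `ks_distance` and `dp_gap` are hypothetical, and an empirical gap of zero on a finite sample is only an indication of DP, not a proof of it.

```python
import numpy as np
from itertools import combinations

def ks_distance(a, b):
    """Sup-norm distance between the empirical CDFs of samples a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # Both CDFs are right-continuous step functions, so the supremum
    # is attained at one of the observed sample points.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def dp_gap(predictions, groups):
    """Largest pairwise KS distance between group-wise predictions."""
    predictions = np.asarray(predictions, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    return max(
        (ks_distance(predictions[groups == s], predictions[groups == t])
         for s, t in combinations(labels, 2)),
        default=0.0,
    )

# Hypothetical predictions for two groups with disjoint ranges.
gap = dp_gap([1.0, 2.0, 3.0, 11.0, 12.0, 13.0], [0, 0, 0, 1, 1, 1])
```

Taking the maximum over all pairs of groups mirrors the requirement in Definition 1 that the distance vanish for every $s, s'$.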
3 Quantile Regression and Conformal Prediction
In this section, we recall the conformalized quantile regression (CQR) approach for finite-sample,
distribution-free prediction interval inference. Quantile regression was proposed by Koenker and Bassett [27]
to estimate the $\alpha$-th quantile of the conditional distribution of $Y$ given $\tilde{X} := (X, S)$ for some quantile
level $\alpha \in (0, 1)$. Since then, it has become pervasive, with various applications such as providing
prediction intervals, detecting outliers, or characterizing the entire distribution [30, 22]. Denote the
conditional cumulative distribution function of $Y$ given $\tilde{X}$ by $F(y \mid \tilde{X} = \tilde{x}) := \mathbb{P}\{Y \le y \mid \tilde{X} = \tilde{x}\}$.
The $\alpha$-th conditional quantile prediction is defined as
$$q_\alpha(\tilde{x}) := \inf\{y \in \mathbb{R} : F(y \mid \tilde{X} = \tilde{x}) \ge \alpha\}.$$
Quantile regression can be cast as an optimization problem [30, 49, 33, 42, 48] by minimizing the expected
check loss function $\mathbb{E}(\rho_\alpha) = \mathbb{E}[\rho_\alpha(y, q) \mid \tilde{X} = \tilde{x}]$, where
$$\rho_\alpha(y, q_\alpha(\tilde{x})) = \begin{cases} \alpha \, |y - q_\alpha(\tilde{x})| & \text{if } y \ge q_\alpha(\tilde{x}), \\ (1 - \alpha) \, |y - q_\alpha(\tilde{x})| & \text{if } y < q_\alpha(\tilde{x}). \end{cases} \tag{4}$$
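To make the check loss in Eq. (4) concrete, the following sketch evaluates it with NumPy and searches over constant predictions, illustrating that the minimizer of the average check loss is an empirical quantile. The helper names `check_loss` and `best_constant_quantile` are assumptions for illustration; a real quantile regression would minimize the same loss over a family of functions of $\tilde{x}$ rather than over constants.

```python
import numpy as np

def check_loss(y, q, alpha):
    """Check (pinball) loss of Eq. (4), evaluated elementwise."""
    diff = np.asarray(y, dtype=float) - q
    # alpha * |diff| when y >= q, (1 - alpha) * |diff| when y < q.
    return np.where(diff >= 0, alpha * diff, (alpha - 1.0) * diff)

def best_constant_quantile(y, alpha):
    """Observed value that minimizes the average check loss."""
    y = np.asarray(y, dtype=float)
    risks = [check_loss(y, q, alpha).mean() for q in y]
    return y[int(np.argmin(risks))]

# Hypothetical responses 1, 2, ..., 10 and quantile level 0.85.
responses = np.arange(1, 11)
q_hat = best_constant_quantile(responses, alpha=0.85)
```

The asymmetric weights $\alpha$ and $1 - \alpha$ are what push the minimizer away from the median: with $\alpha = 0.85$, underestimates are penalized more heavily than overestimates, so the minimizer moves into the upper tail.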
Quantile regression offers a principled way of judging the reliability of predictions by building
a prediction interval for the new observation $(\tilde{X}_{n+1}, Y_{n+1})$. In contrast to asymptotic approaches, Romano
et al. [37, 38] brought forward conformalized quantile regression (CQR), which combines the
merits of robust quantile regression with conformal prediction, so that the finite-sample validity in Eq. (1)
is guaranteed. Inspired by the split conformal method, split CQR likewise starts by splitting
the data into a proper training set and a calibration set, indexed by $\mathcal{I}_1$ and $\mathcal{I}_2$, respectively. Given
any quantile regression algorithm $\mathcal{Q}$, we then fit two conditional quantile functions $\hat{q}_{\alpha_{lo}}$ and $\hat{q}_{\alpha_{hi}}$
on the proper training set:
$$\{\hat{q}_{\alpha_{lo}}, \hat{q}_{\alpha_{hi}}\} \leftarrow \mathcal{Q}\left(\{(\tilde{X}_i, Y_i) : i \in \mathcal{I}_1\}\right).$$
The conformity scores are calculated to quantify the error made by the plug-in prediction interval
$\hat{C}(\tilde{x}) = [\hat{q}_{\alpha_{lo}}(\tilde{x}), \hat{q}_{\alpha_{hi}}(\tilde{x})]$.
We evaluate the scores on the calibration set as
$$E_k := \max\left\{\hat{q}_{\alpha_{lo}}(\tilde{X}_k) - Y_k, \; Y_k - \hat{q}_{\alpha_{hi}}(\tilde{X}_k)\right\}$$
for each $k \in \mathcal{I}_2$, where both undercoverage and overcoverage of the interval are taken into
consideration [38]. Given a new input $\tilde{X}_{n+1}$, we construct the prediction interval for $Y_{n+1}$ as
$$C(\tilde{X}_{n+1}) = \left[\hat{q}_{\alpha_{lo}}(\tilde{X}_{n+1}) - Q_{1-\alpha}(E, \mathcal{I}_2), \; \hat{q}_{\alpha_{hi}}(\tilde{X}_{n+1}) + Q_{1-\alpha}(E, \mathcal{I}_2)\right],$$
where $Q_{1-\alpha}(E, \mathcal{I}_2)$, the $(1 - \alpha)(1 + 1/|\mathcal{I}_2|)$-th empirical quantile of $\{E_k : k \in \mathcal{I}_2\}$, conformalizes the
plug-in prediction interval. Note that the constructed interval $C(\tilde{X}_{n+1})$ could be highly influenced
by the sensitive variable $S$.
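The split CQR recipe recalled above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the names `conformal_quantile` and `split_cqr_interval` are hypothetical, the fitted quantile functions are passed in as plain callables, and the demo uses constant empirical quantiles as a deliberately crude stand-in for the quantile regression algorithm $\mathcal{Q}$.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """(1 - alpha)(1 + 1/n)-th empirical quantile of the conformity scores."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    level = (1.0 - alpha) * (1.0 + 1.0 / n)
    if level > 1.0:
        return np.inf  # calibration set too small for this alpha
    k = max(int(np.ceil(level * n - 1e-12)), 1)  # 1-based order statistic
    return scores[k - 1]

def split_cqr_interval(q_lo, q_hi, x_cal, y_cal, x_new, alpha):
    """Conformalize the plug-in interval [q_lo(x), q_hi(x)]."""
    y_cal = np.asarray(y_cal, dtype=float)
    # Conformity scores E_k: positive when y_k falls outside the interval.
    scores = np.maximum(q_lo(x_cal) - y_cal, y_cal - q_hi(x_cal))
    margin = conformal_quantile(scores, alpha)
    return q_lo(x_new) - margin, q_hi(x_new) + margin

# Demo on synthetic heteroscedastic data (all values are made up).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=3000)
y = x + (0.5 + x) * rng.normal(size=3000)
train, cal, test = np.split(rng.permutation(3000), [1000, 2000])

# Stand-in for the algorithm Q: constant quantiles from the training split.
lo, hi = np.quantile(y[train], [0.05, 0.95])
q_lo = lambda v: np.full(np.shape(v), lo)
q_hi = lambda v: np.full(np.shape(v), hi)

left, right = split_cqr_interval(q_lo, q_hi, x[cal], y[cal], x[test], alpha=0.1)
coverage = np.mean((y[test] >= left) & (y[test] <= right))
print(f"empirical coverage on the test split: {coverage:.3f}")
```

Passing the quantile functions as callables keeps the conformal step independent of how they were fitted, which reflects the statement that the guarantee holds for any quantile regression algorithm. Nothing in this step uses the sensitive attribute, so the resulting endpoints can still depend on $S$.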
4 Conformal fair quantile prediction (CFQP)
We formally describe our proposed conformal fair quantile prediction (CFQP) framework for constructing
prediction intervals under the DP fairness constraint in this section. A kernel-smoothed quantile function
is introduced during the functional synchronization, which can improve the estimation when some
subgroups are too small to give reliable sample quantile function estimates.
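Definition 2 (Wasserstein-2 distance). Let $\mu$ and $\nu$ be two univariate probability measures with finite
second moments. The squared Wasserstein-2 distance between $\mu$ and $\nu$ is defined as
$$W_2^2(\mu, \nu) = \inf\left\{ \int_{\mathbb{R} \times \mathbb{R}} |x - y|^2 \, d\gamma(x, y), \; \gamma \in \Gamma_{\mu,\nu} \right\}$$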
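For univariate measures, and reading $\Gamma_{\mu,\nu}$ as the set of couplings with marginals $\mu$ and $\nu$ (the usual convention, assumed here), the infimum in Definition 2 has the standard closed form $\int_0^1 (Q_\mu(t) - Q_\nu(t))^2 \, dt$ in terms of the two quantile functions. For two empirical measures with the same number of atoms, this reduces to the mean squared difference of the sorted samples. The sketch below uses that special case; the function name `wasserstein2_squared` is hypothetical.

```python
import numpy as np

def wasserstein2_squared(a, b):
    """Squared W2 distance between two equal-size empirical measures on R.

    In one dimension the optimal coupling pairs the i-th smallest point
    of a with the i-th smallest point of b.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.size != b.size:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean((a - b) ** 2))

# Shifting every point by 1 gives a squared distance of exactly 1.
shifted = wasserstein2_squared([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Samples of unequal size would require evaluating both quantile functions on a common grid of levels, which this sketch deliberately leaves out.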