FASTER-CE: Fast, Sparse, Transparent, and Robust
Counterfactual Explanations
Shubham Sharma
University of Texas at Austin
shubham_sharma@utexas.edu
Alan H. Gee
Amira Learning
alan.gee@amiralearning.com
Jette Henderson
CognitiveScale
jhenderson@cognitivescale.com
Joydeep Ghosh
University of Texas at Austin
jghosh@utexas.edu
Abstract
Counterfactual explanations have substantially increased in popularity in the past
few years as a useful human-centric way of understanding individual black-box
model predictions. While several properties desired of high-quality counterfactuals
have been identified in the literature, three crucial concerns, namely the speed of explanation generation, robustness/sensitivity, and the succinctness of explanations (sparsity), have been relatively unexplored. In this paper, we present FASTER-CE: a novel
set of algorithms to generate fast, sparse, and robust counterfactual explanations.
The key idea is to efficiently find promising search directions for counterfactuals in
a latent space that is specified via an autoencoder. These directions are determined
based on gradients with respect to each of the original input features, as well as the target, as estimated in the latent space. The ability to quickly examine
combinations of the most promising gradient directions as well as to incorporate
additional user-defined constraints allows us to generate multiple counterfactual
explanations that are sparse, realistic, and robust to input manipulations. Through
experiments on three datasets of varied complexities, we show that FASTER-CE
is not only much faster than other state-of-the-art methods for generating multiple explanations but also significantly superior when considering a larger set of desirable (and often conflicting) properties. Specifically, we present results across
multiple performance metrics: sparsity, proximity, validity, speed of generation,
and the robustness of explanations, to highlight the capabilities of the FASTER-CE
family.
1 Introduction
As machine learning models get deployed in high-stakes applications such as finance and healthcare,
providing human-centric explanations for the decisions made by such models is becoming increasingly
important. Not surprisingly, a wide variety of explanation types with associated algorithms have
been proposed and deployed recently [6], including feature attribution methods, prototypes, and
counterfactuals. Among these, counterfactual explanations have found increased traction as a useful,
human-centric way of providing individual-level explanations for the decision of a black-box model.
Originally introduced in [24], counterfactual explanations attempt to answer the question: "What
would have to change in the input to change the outcome of the model?". Such an explanation can be
used as a means of actionable recourse for an end-user subjected to a model decision. The original
definition [24] focused on finding a valid counterfactual as close to the input as possible, where "valid" means that the class label given by the model is different for the counterfactual.
Figure 1: The FASTER-CE method: An autoencoder is trained on the input dataset. The latent space is then extensively sampled to generate random latent vectors that correspond to random decoder outputs; for every such sample, the value of every feature of the decoder output and the black-box prediction for that output are recorded. For every feature in the dataset, and for the black-box prediction, a linear model is trained with the latent vectors as inputs and the feature (or the black-box prediction) as the output. After finding these linear hyperplanes, user-defined constraints are incorporated, and a combination of QR factorization with respect to the normal vectors of the hyperplanes and finding intersections between hyperplanes gives a direction along which to search for candidate explanations. Multiple candidates are found for the test point to be explained, checked for validity, and post-processed, and a set of final explanations is returned.
However, it was soon apparent that useful counterfactuals exhibit several other desirable characteristics as well [23, 15, 19]. The sparsity of an explanation, i.e., the number of attributes/features that change to
obtain the counterfactual, is one such property. For an end-user, changing just one attribute may
be easier and more understandable compared to changing multiple features in order to receive a
different outcome. However, including a sparsity constraint in existing methods that provide black-
box counterfactual explanations as an objective is fairly non-trivial, since many of these methods
rely on gradient descent [24, 15, 3] and an L0 penalty is non-differentiable, so surrogates may be
needed. Another key concern is the robustness of counterfactual explanations. Recently, it has been
shown that counterfactual explanations are vulnerable to input perturbations as well as adversarial
attacks [20]. The realism and actionability of an explanation are also desirable. Counterfactuals that
are "in-distribution" are preferred, so are those where the indicated change is feasible. For example, a
counterfactual stating that one needs to decrease her age to get a more preferred outcome indicates an infeasible change and hence is not an actionable means of recourse for that end-user.
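To make the sparsity difficulty concrete, here is a minimal sketch of a Wachter-style gradient-descent objective in which the non-differentiable L0 count of changed features is replaced by a differentiable L1 surrogate. The scalar-output model, the target encoding, and the weights lam_dist and lam_sparse are illustrative assumptions, not details of any specific method:

```python
import torch

def counterfactual_loss(c, x, model, target, lam_dist=1.0, lam_sparse=0.5):
    """Illustrative Wachter-style objective. The L0 count of changed
    features is non-differentiable, so an L1 term stands in for it."""
    validity = (model(c) - target) ** 2    # push the prediction to the target
    proximity = torch.norm(c - x, p=2)     # stay close to the original input
    sparsity = torch.norm(c - x, p=1)      # differentiable surrogate for L0
    return validity + lam_dist * proximity + lam_sparse * sparsity

# Sketch of the search loop:
# c = x.clone().requires_grad_(True)
# opt = torch.optim.Adam([c], lr=1e-2)
# for _ in range(500):
#     opt.zero_grad()
#     counterfactual_loss(c, x, model, target=1.0).backward()
#     opt.step()
```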
However, all these desirable properties can be at odds with each other. For example, the nearest
point with a different outcome may be in an infeasible change direction. Moreover, given multiple
counterfactuals for a given input, the most actionable choice for recourse can be subjective and thus
should be an end-user’s choice. We have also come across applications such as automated approval
for medical procedures, where many decisions are being made each hour and explanations are needed
for each one. In such cases, the speed of explanation generation becomes crucial because of the (near)
real-time business requirements, even though it may cause one to miss some counterfactuals that
could have been more actionable.
In this paper, we propose FASTER-CE: FAst, Sparse, TransparEnt, and Robust Counterfactual
Explanations. The FASTER-CE method is shown in Figure 1. First, an autoencoder is trained on the
input dataset. Then, the latent space of this autoencoder is sampled to generate a set of latent vectors $\{\tilde{z}\}$ and corresponding decoder outputs $\{x'\}$. Next, we train a set of linear models mapping the latent vectors to each individual input feature, now considered as the target, and one more linear model that uses the black-box predictions $y'$ on the decoder outputs as the target. Then, using QR
factorization to incorporate user-defined constraints and finding the intersection of hyperplanes
defined by the linear models, we identify promising direction(s) of counterfactual search and use line
search to generate a set of candidate counterfactual explanations for a given test data point $x_{test}$. The
validity of these explanations is checked, post-processing is carried out if need be to induce sparsity,
and the set of valid counterfactual explanations is returned.
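A minimal sketch of this one-time training phase follows. The autoencoder interface, the standard-normal latent sampling, and the use of scikit-learn's LinearRegression are assumptions made for illustration; the paper's exact sampling scheme and linear-model training may differ:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fit_latent_hyperplanes(decoder, predict, latent_dim, n_samples=10_000, seed=0):
    """One-time step: sample the latent space, decode, and fit one linear
    model per input feature plus one for the black-box prediction."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n_samples, latent_dim))  # random latent vectors {z~}
    X_dec = decoder(Z)                            # decoder outputs {x'}
    y_hat = predict(X_dec)                        # black-box predictions y'

    feature_models = [LinearRegression().fit(Z, X_dec[:, j])
                      for j in range(X_dec.shape[1])]
    prediction_model = LinearRegression().fit(Z, y_hat)
    return feature_models, prediction_model
```

The normal vector of each fitted hyperplane is then available as the linear model's coefficient vector, which is what the direction-finding step operates on.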
There are multiple advantages of using this process to generate counterfactual explanations. A
reasonably accurate autoencoder provides a lower dimensional representation of the data in which the
search for counterfactuals is quicker, more realistic, and more robust. Moreover, training the autoencoder and the linear models is a one-time process. Finding counterfactual explanations for any future input is then simply a matter of projecting its latent representation onto the intersection of hyperplanes
and moving along specific directions. This is significantly faster than performing gradient descent
[15], using an evolutionary algorithm [19], or the many existing methods that require an optimization
procedure for every test point. Through QR factorization, we are able to generate the sparsest
explanations, i.e., where only one feature changes (when feasible). We can also generate multiple
explanations efficiently with multiple features changing, if required. Finally, the use of linear models
makes it easier to specify a margin around the boundary in which an explanation should not exist,
thereby allowing us to produce explanations that offer more robust recourse, as defined in [20].
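As a sketch of how QR factorization and line search could realize the direction-finding step described above (the exact construction in the paper may differ; the frozen_normals argument and the step-size grid are illustrative assumptions):

```python
import numpy as np

def qr_search_direction(frozen_normals, prediction_normal):
    """Sketch: orthogonalize the prediction hyperplane's normal against the
    normals of features the user wants to hold fixed, so that moving along
    the result changes the prediction while (approximately) preserving the
    frozen features.

    frozen_normals: (k, d) array of hyperplane normals for fixed features.
    prediction_normal: (d,) normal of the black-box prediction hyperplane.
    """
    A = np.column_stack([frozen_normals.T, prediction_normal])
    Q, _ = np.linalg.qr(A)                 # Gram-Schmidt via QR
    direction = Q[:, -1]                   # orthogonal to all frozen normals
    if direction @ prediction_normal < 0:  # orient toward increasing score
        direction = -direction
    return direction

def line_search(z, direction, decoder, predict, y_orig, max_step=5.0, n_steps=50):
    """Move along `direction` in latent space until the decoded point flips
    the black-box prediction; returns the first valid candidate or None."""
    for alpha in np.linspace(0.0, max_step, n_steps):
        candidate = decoder((z + alpha * direction)[None, :])[0]
        if predict(candidate[None, :])[0] != y_orig:
            return candidate
    return None
```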
We propose three different variants of the FASTER-CE algorithm to generate explanations reflecting
different tradeoffs involving proximity, sparsity, and realism. We theoretically motivate how our
method produces more robust explanations. Experiments on three different datasets demonstrate
the effectiveness of our approach. We are able to generate multiple explanations with different
objectives much faster than existing methods. We also discuss the trade-offs between these objectives
experimentally. To the best of our knowledge, this is the first work providing a set of algorithms to
generate multiple explanations along the key desirable axes: robustness, sparsity, realism, validity,
proximity, and speed.
2 Related Work
Counterfactual explanations were originally defined and motivated as a useful means of explaining
artificial intelligence models in [24]. Gradient descent in the input space is used in [24] to generate
explanations. Subsequently, numerous other methods [15, 19, 2, 5, 17, 10, 14, 11, 12, 16, 18] have been proposed to generate these explanations. A recent survey [22] provides details on many such algorithms. However, several challenges remain in the generation of these explanations [21, 4, 1].
For brevity, we only mention the methods most relevant to our work in this section.
CERTIFAI [19] provides multiple desirable objectives, including proximity, validity, sparsity, and
realism of explanations in the generation of counterfactuals. The method uses a genetic algorithm to
generate candidate counterfactuals. The use of an evolutionary algorithm allows one to more easily
incorporate a variety of objectives or domain constraints compared to gradient-based optimization methods, but it can be quite slow, especially for large, high-dimensional data. DiCE is notable for
incorporating a diversity constraint to provide multiple counterfactual explanations, but is also slow,
and its use of gradient descent results in explanations that might not be robust [20]. Latent-CF [3] and Sharpshooter [5] are two recent approaches that utilize a latent space (like FASTER-CE) to find
counterfactual explanations. Both methods demonstrate the usefulness of representation learning.
However, the gradient-descent-based Latent-CF is prone to producing explanations that are not
robust. Sharpshooter is a fast algorithm, but does not take into account the sparsity or robustness of
explanations. Finally, Balakrishnan et al. [2] use latent space perturbations to analyze bias in face recognition algorithms. Their method uses such perturbations to generate images for bias analysis, not to produce counterfactual explanations.
3 Theory
Given an input $\vec{x}$, a classifier $\vec{f}$, and a distance metric $d$, a counterfactual explanation $\vec{c}$ can be found by solving the optimization problem:

$$\min_{\vec{c}} \; d(\vec{x}, \vec{c}) \quad \text{s.t.} \quad \vec{f}(\vec{x}) \neq \vec{f}(\vec{c}) \tag{1}$$
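Read literally, Eq. (1) describes a model-agnostic search: among all points whose predicted label differs from that of the input, return the one closest under $d$. A brute-force sketch over a finite candidate pool, with Euclidean distance as an illustrative choice of $d$:

```python
import numpy as np

def nearest_counterfactual(x, candidates, predict):
    """Brute-force reading of Eq. (1): among candidates c with
    f(c) != f(x), return the one minimizing d(x, c)."""
    y_x = predict(x[None, :])[0]
    valid = candidates[predict(candidates) != y_x]  # enforce f(x) != f(c)
    if len(valid) == 0:
        return None                                 # no valid counterfactual found
    dists = np.linalg.norm(valid - x, axis=1)       # d(x, c), Euclidean here
    return valid[np.argmin(dists)]
```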