models that use the black-box predictions $y'$ on the decoder outputs as targets. Furthermore, by using QR factorization to incorporate user-defined constraints and by finding the intersection of the hyperplanes defined by the linear models, we identify promising direction(s) for the counterfactual search, along which a line search generates a set of candidate counterfactual explanations for a given test point $x_{\text{test}}$. The validity of these explanations is then checked, post-processing is applied where needed to induce sparsity, and the set of valid counterfactual explanations is returned.
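To make the pipeline concrete, the following is a minimal sketch of the surrogate-fitting and line-search steps. Here `decoder` and `black_box` are hypothetical stand-ins for the trained autoencoder decoder and the classifier being explained, and the step sizes are purely illustrative; the sketch omits the QR-based handling of constraints and multiple hyperplanes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_latent_surrogate(Z_train, decoder, black_box):
    """Fit a linear model in latent space against black-box labels.

    The targets are the black-box predictions y' on the *decoded*
    latent points, mirroring the targets described above.
    """
    y_prime = black_box.predict(decoder(Z_train))
    return LogisticRegression().fit(Z_train, y_prime)

def line_search_counterfactual(z_test, surrogate, decoder, black_box,
                               step=0.05, max_steps=200):
    """Walk from z_test along the surrogate's normal until the
    black-box prediction flips, then return the decoded candidate."""
    w = surrogate.coef_[0]
    direction = w / np.linalg.norm(w)      # unit normal to the hyperplane
    y_orig = black_box.predict(decoder(z_test[None]))[0]
    for t in range(1, max_steps + 1):
        for sign in (+1.0, -1.0):          # try both sides of z_test
            z_cand = z_test + sign * t * step * direction
            x_cand = decoder(z_cand[None])
            if black_box.predict(x_cand)[0] != y_orig:  # validity check
                return x_cand[0]
    return None  # no valid counterfactual found along this direction
```

In the full method, the single surrogate direction used here would be replaced by direction(s) derived from the QR factorization of the constraint and hyperplane normals.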
There are multiple advantages to this process for generating counterfactual explanations. A reasonably accurate autoencoder provides a lower-dimensional representation of the data in which the search for counterfactuals is faster, more realistic, and more robust. Moreover, training the autoencoder and the linear models is a one-time cost: finding counterfactual explanations for any future input then amounts to projecting its latent representation onto the intersection of hyperplanes and moving along specific directions. This is significantly faster than performing gradient descent [15], using an evolutionary algorithm [19], or applying any of the many existing methods that require a separate optimization procedure for every test point. Through QR factorization, we are able to generate the sparsest explanations, i.e., ones in which only one feature changes (when feasible). We can also efficiently generate multiple explanations in which several features change, if required. Finally, the use of linear models makes it easy to specify a margin around the decision boundary within which no explanation may lie, thereby allowing us to produce explanations that offer more robust recourse, as defined in [20].
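As an illustration of this margin constraint, the sketch below (with hypothetical names; `W` and `b` collect the surrogate hyperplanes' normals and offsets) accepts a candidate latent point only if its distance to every hyperplane is at least a margin `m`:

```python
import numpy as np

def satisfies_margin(z_cand, W, b, margin):
    """Reject candidates lying within `margin` of any surrogate hyperplane.

    W: (k, d) array of hyperplane normals, b: (k,) offsets, so each
    boundary is {z : W[i] @ z + b[i] = 0}. A candidate is kept only if
    its distance to every hyperplane is at least `margin`.
    """
    distances = np.abs(W @ z_cand + b) / np.linalg.norm(W, axis=1)
    return bool(np.all(distances >= margin))
```

Because the surrogates are linear, this check is closed-form and costs one matrix-vector product per candidate, which is part of what keeps the search fast.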
We propose three different variants of the FASTER-CE algorithm to generate explanations reflecting different trade-offs among proximity, sparsity, and realism. We theoretically motivate why our method produces more robust explanations. Experiments on three different datasets demonstrate the effectiveness of our approach: we are able to generate multiple explanations with different objectives much faster than existing methods, and we empirically examine the trade-offs between these objectives. To the best of our knowledge, this is the first work providing a set of algorithms to generate multiple explanations along the key desirable axes of robustness, sparsity, realism, validity, proximity, and speed.
2 Related Work
Counterfactual explanations were originally defined and motivated as a useful means of explaining artificial intelligence models in [24], which uses gradient descent in the input space to generate explanations. Subsequently, numerous other methods [15, 19, 2, 5, 17, 10, 14, 11, 12, 16, 18] have been proposed to generate such explanations; a recent survey [22] provides details on many of these algorithms. However, several challenges remain in the generation of these explanations [21, 4, 1]. For brevity, we only discuss the methods most relevant to our work in this section.
CERTIFAI [19] incorporates multiple desirable objectives, including proximity, validity, sparsity, and realism, into the generation of counterfactuals, using a genetic algorithm to produce candidate counterfactuals. Such an evolutionary algorithm makes it easier to incorporate a variety of objectives or domain constraints than gradient-based optimization methods do, but it can be quite slow, especially for large, high-dimensional data. DiCE is notable for incorporating a diversity constraint so as to provide multiple counterfactual explanations, but it is also slow, and its reliance on gradient descent can yield explanations that are not robust [20]. Latent-CF [3] and Sharpshooter [5] are two recent approaches that, like FASTER-CE, utilize a latent space to find counterfactual explanations; both demonstrate the usefulness of representation learning. However, the gradient-descent-based Latent-CF is vulnerable to producing non-robust explanations, while Sharpshooter is fast but does not take the sparsity or robustness of explanations into account. Finally, Balakrishnan et al. [2] use latent-space perturbations to analyze bias in face recognition algorithms; their perturbations are used to generate images for bias analysis rather than counterfactual explanations.
3 Theory
Given an input $\vec{x}$, a classifier $\vec{f}$, and a distance metric $d$, a counterfactual explanation $\vec{c}$ can be found by solving the optimization problem:
$$\min_{\vec{c}} \; d(\vec{x}, \vec{c}) \quad \text{s.t.} \quad \vec{f}(\vec{x}) \neq \vec{f}(\vec{c}) \tag{1}$$
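Read literally, Eq. (1) asks for the admissible point closest to $\vec{x}$ whose prediction differs from that of $\vec{x}$. The toy sketch below makes that reading concrete; it is not the FASTER-CE procedure, and `predict` and `candidates` are hypothetical stand-ins for the classifier and a pool of candidate points.

```python
import numpy as np

def naive_counterfactual(x, predict, candidates, d=None):
    """Brute-force illustration of Eq. (1): among candidate points whose
    prediction differs from that of x, return the one closest to x."""
    d = d or (lambda a, b: np.linalg.norm(a - b))  # default metric: L2
    y_orig = predict(x)
    valid = [c for c in candidates if predict(c) != y_orig]  # f(x) != f(c)
    return min(valid, key=lambda c: d(x, c)) if valid else None
```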