models that use the black-box predictions $y'$ on the decoder outputs as targets. Furthermore, by using QR factorization to incorporate user-defined constraints and by finding the intersection of the hyperplanes defined by the linear models, we identify promising direction(s) for the counterfactual search, along which a line search generates a set of candidate counterfactual explanations for a given test point $x_{\text{test}}$. The validity of these explanations is then checked, post-processing is applied where needed to induce sparsity, and the set of valid counterfactual explanations is returned.
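To make the pipeline concrete, the following is a minimal sketch of the surrogate-fitting and line-search steps. Here `decoder` and `black_box` are hypothetical stand-ins for the trained autoencoder decoder and the classifier being explained, and the step sizes are purely illustrative; the sketch omits the QR-based handling of constraints and multiple hyperplanes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_latent_surrogate(Z_train, decoder, black_box):
    """Fit a linear model in latent space against black-box labels.

    The targets are the black-box predictions y' on the *decoded*
    latent points, mirroring the targets described above.
    """
    y_prime = black_box.predict(decoder(Z_train))
    return LogisticRegression().fit(Z_train, y_prime)

def line_search_counterfactual(z_test, surrogate, decoder, black_box,
                               step=0.05, max_steps=200):
    """Walk from z_test along the surrogate's normal until the
    black-box prediction flips, then return the decoded candidate."""
    w = surrogate.coef_[0]
    direction = w / np.linalg.norm(w)      # unit normal to the hyperplane
    y_orig = black_box.predict(decoder(z_test[None]))[0]
    for t in range(1, max_steps + 1):
        for sign in (+1.0, -1.0):          # try both sides of z_test
            z_cand = z_test + sign * t * step * direction
            x_cand = decoder(z_cand[None])
            if black_box.predict(x_cand)[0] != y_orig:  # validity check
                return x_cand[0]
    return None  # no valid counterfactual found along this direction
```

In the full method, the single surrogate direction used here would be replaced by direction(s) derived from the QR factorization of the constraint and hyperplane normals.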
There are multiple advantages to this process for generating counterfactual explanations. A reasonably accurate autoencoder provides a lower-dimensional representation of the data in which the search for counterfactuals is faster, more realistic, and more robust. Moreover, training the autoencoder and the linear models is a one-time cost: finding counterfactual explanations for any future input then amounts to projecting its latent representation onto the intersection of hyperplanes and moving along specific directions. This is significantly faster than performing gradient descent [15], using an evolutionary algorithm [19], or applying any of the many existing methods that require a separate optimization procedure for every test point. Through QR factorization, we are able to generate the sparsest explanations, i.e., ones in which only one feature changes (when feasible). We can also efficiently generate multiple explanations in which several features change, if required. Finally, the use of linear models makes it easy to specify a margin around the decision boundary within which no explanation may lie, thereby allowing us to produce explanations that offer more robust recourse, as defined in [20].
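As an illustration of this margin constraint, the sketch below (with hypothetical names; `W` and `b` collect the surrogate hyperplanes' normals and offsets) accepts a candidate latent point only if its distance to every hyperplane is at least a margin `m`:

```python
import numpy as np

def satisfies_margin(z_cand, W, b, margin):
    """Reject candidates lying within `margin` of any surrogate hyperplane.

    W: (k, d) array of hyperplane normals, b: (k,) offsets, so each
    boundary is {z : W[i] @ z + b[i] = 0}. A candidate is kept only if
    its distance to every hyperplane is at least `margin`.
    """
    distances = np.abs(W @ z_cand + b) / np.linalg.norm(W, axis=1)
    return bool(np.all(distances >= margin))
```

Because the surrogates are linear, this check is closed-form and costs one matrix-vector product per candidate, which is part of what keeps the search fast.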
We propose three different variants of the FASTER-CE algorithm to generate explanations reflecting different trade-offs among proximity, sparsity, and realism. We theoretically motivate why our method produces more robust explanations. Experiments on three different datasets demonstrate the effectiveness of our approach: we are able to generate multiple explanations with different objectives much faster than existing methods, and we empirically examine the trade-offs between these objectives. To the best of our knowledge, this is the first work providing a set of algorithms to generate multiple explanations along the key desirable axes of robustness, sparsity, realism, validity, proximity, and speed.
2 Related Work
Counterfactual explanations were originally defined and motivated as a useful means of explaining artificial intelligence models in [24], which uses gradient descent in the input space to generate explanations. Subsequently, numerous other methods [15, 19, 2, 5, 17, 10, 14, 11, 12, 16, 18] have been proposed to generate such explanations; a recent survey [22] provides details on many of these algorithms. However, several challenges remain in the generation of these explanations [21, 4, 1]. For brevity, we only discuss the methods most relevant to our work in this section.
CERTIFAI [19] incorporates multiple desirable objectives, including proximity, validity, sparsity, and realism, into the generation of counterfactuals, using a genetic algorithm to produce candidate counterfactuals. Such an evolutionary algorithm makes it easier to incorporate a variety of objectives or domain constraints than gradient-based optimization methods do, but it can be quite slow, especially for large, high-dimensional data. DiCE is notable for incorporating a diversity constraint so as to provide multiple counterfactual explanations, but it is also slow, and its reliance on gradient descent can yield explanations that are not robust [20]. Latent-CF [3] and Sharpshooter [5] are two recent approaches that, like FASTER-CE, utilize a latent space to find counterfactual explanations; both demonstrate the usefulness of representation learning. However, the gradient-descent-based Latent-CF is vulnerable to producing non-robust explanations, while Sharpshooter is fast but does not take the sparsity or robustness of explanations into account. Finally, Balakrishnan et al. [2] use latent-space perturbations to analyze bias in face recognition algorithms; their perturbations are used to generate images for bias analysis rather than counterfactual explanations.
3 Theory
Given an input $\vec{x}$, a classifier $\vec{f}$, and a distance metric $d$, a counterfactual explanation $\vec{c}$ can be found by solving the optimization problem:
$$\min_{\vec{c}} \; d(\vec{x}, \vec{c}) \quad \text{s.t.} \quad \vec{f}(\vec{x}) \neq \vec{f}(\vec{c}) \tag{1}$$
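Read literally, Eq. (1) asks for the admissible point closest to $\vec{x}$ whose prediction differs from that of $\vec{x}$. The toy sketch below makes that reading concrete; it is not the FASTER-CE procedure, and `predict` and `candidates` are hypothetical stand-ins for the classifier and a pool of candidate points.

```python
import numpy as np

def naive_counterfactual(x, predict, candidates, d=None):
    """Brute-force illustration of Eq. (1): among candidate points whose
    prediction differs from that of x, return the one closest to x."""
    d = d or (lambda a, b: np.linalg.norm(a - b))  # default metric: L2
    y_orig = predict(x)
    valid = [c for c in candidates if predict(c) != y_orig]  # f(x) != f(c)
    return min(valid, key=lambda c: d(x, c)) if valid else None
```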