
also suggests grouping XAI solutions according to the type of explanation produced. They list: feature summary,
model internals, data point, surrogate intrinsically interpretable model, rule sets, explanations in natural language, and
question-answering. Later, Liao et al. [2020] suggest that XAI explanations answer specific questions about the data, its
processing, and the resulting ML outputs. They map existing XAI solutions to these questions and create an XAI question
bank that supports the design of user-centered XAI applications. Overton [2011] defines an explanation as the combination
of an explanans, the answer to a question, and an explanandum, what is to be explained. These two elements provide a
user-friendly characterization of explanations and thus allow users to specify which explanation best suits their needs.
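To make this characterization concrete, an explanation can be read as a simple question-answer pair. The following minimal sketch is our own illustration; the class and field names are not taken from Overton [2011]:

```python
from dataclasses import dataclass

@dataclass
class Explanation:
    """Overton-style two-part explanation (names are illustrative)."""
    explanandum: str  # what is to be explained, phrased as a question
    explanans: str    # the answer to that question

# A feature-summary explanation for a hypothetical credit-scoring model:
why_rejected = Explanation(
    explanandum="Why was this loan application rejected?",
    explanans="The feature 'income' contributed most negatively to the score.",
)
```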
The diversity of existing XAI solutions makes it hard to find one adapted to a given need. Moreover, as the XAI
field grows, more and more XAI solutions proposed in the literature produce similar kinds of explanations. Hence,
it has become necessary to compare XAI solutions objectively by assessing the effectiveness of their explanations.
In this direction, the recent literature has focused on quantitative XAI evaluations Nauta et al. [2022].
2.2 Evaluation of XAI solutions
Doshi-Velez and Kim [2017] distinguish three strategies of evaluation: application-grounded evaluation, human-
grounded evaluation, and functionality-grounded evaluation, which does not involve human intervention. Application-
grounded evaluation tests the effectiveness of explanations in a real-world application with domain experts, while
human-grounded evaluations are carried out with lay users. Although explanations are ultimately intended for humans,
functionality-grounded evaluations are valuable because of their objectivity: this type of evaluation is inexpensive, fast,
and can lead to a formal comparison of explanation methods Zhou et al. [2021].
Since the notion of a "good explanation" is not trivial, quality properties have been proposed by Robnik-Šikonja
and Bohanec [2018]. These are human-defined criteria that attest to the quality of explanations. Functionality-grounded
evaluation metrics are constructed to calculate scores that measure how well a given property is met.
Nauta et al. [2022] focus on functionality-grounded evaluation and propose the Co-12 Explanation Quality
Properties to unify the diverse properties proposed in the literature. They review most existing XAI evaluation metrics
and associate each of them with these properties. Examples of the properties studied in this paper are as follows:
Continuity describes how continuous and generalizable the explanation function is, Correctness describes how faithful
the explanation is w.r.t. the black box, Compactness describes the size of the explanation, and Completeness describes
how much of the black-box behavior is described by the explanation.
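To illustrate how such metrics operate, the sketch below scores two of these properties for a feature-attribution explanation. Correctness is approximated by a deletion-style check (perturbing the features the explanation marks as important should change the prediction), and compactness by the number of features the explanation actually uses. The function names, the zero-baseline perturbation, and the assumption that `model` is a callable returning a scalar for one instance are ours, not a prescribed implementation:

```python
import numpy as np

def correctness_score(model, x, attribution, k=3):
    """Deletion-style proxy for correctness: if the explanation is faithful,
    perturbing its k most important features should change the output a lot."""
    top_k = np.argsort(np.abs(attribution))[::-1][:k]
    x_perturbed = x.copy()
    x_perturbed[top_k] = 0.0  # zero baseline; mean imputation is an alternative
    return abs(model(x) - model(x_perturbed))

def compactness_score(attribution, eps=1e-6):
    """Compactness: the fewer features an explanation uses, the higher the score."""
    n_used = np.sum(np.abs(attribution) > eps)
    return 1.0 / (1.0 + n_used)
```

Higher scores are better for both functions here; the metric suites surveyed by Nauta et al. [2022] define more refined variants of these checks.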
In practice, XAI evaluation metrics produce scores for the properties of interest, making it possible to compare XAI
solutions and choose among them. However, data scientists still have to find the desired XAI solutions and their
corresponding XAI evaluation metrics. This issue could be addressed with strategies that have been studied for
context-aware recommender systems.
2.3 Context-aware recommender systems
Recommender systems filter information to present the most relevant elements to a user. To the best of our knowledge,
there is no recommender system for XAI solutions. To recommend adapted XAI solutions, one should consider the
whole context of the data scientist. According to Adomavicius et al. [2011], context-aware recommender systems offer
more relevant recommendations by adapting them to the user's situation. They also state that context may be integrated
during three phases: contextual prefiltering, which selects a subset of candidate items before the recommendation;
contextual modeling, which uses the context within the recommendation process itself; and contextual postfiltering,
which adjusts the recommendations afterward (see the sketch below).
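As a minimal sketch of the first and last phases, assuming hypothetical dictionary-based candidates and context, prefiltering and postfiltering could look as follows (contextual modeling would instead inject the context into the scoring function itself):

```python
def prefilter(candidates, context):
    """Contextual prefiltering: discard candidates that do not match the
    user's situation before any scoring takes place."""
    return [c for c in candidates if c["task"] == context["task"]]

def postfilter(ranked, context):
    """Contextual postfiltering: adjust the ranked recommendations afterward,
    e.g., drop items the user's environment cannot run."""
    return [c for c in ranked if c["runtime"] in context["runtimes"]]

def recommend(candidates, context, score):
    # Scoring happens between the two contextual filtering phases.
    ranked = sorted(prefilter(candidates, context), key=score, reverse=True)
    return postfilter(ranked, context)
```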
These three phases require formally defining the elements of the context, which is one of our objectives for the
framework we propose in this paper. While recommending an adapted XAI solution is a first interesting step, the
data scientist ultimately wants a reliable explanation, i.e., an explanation that satisfies the properties of interest.
To achieve this, a possible approach is to use the previously detailed XAI evaluation metrics to
optimize hyperparameters of adapted XAI solutions. For this kind of approach, many strategies have been proposed in
the AutoML domain.
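A possible instantiation of this idea is a simple random search over an explainer's hyperparameters, guided by an XAI evaluation metric such as the correctness score sketched in Section 2.2. The `make_explainer` factory and the dictionary-based search space below are our assumptions:

```python
import random

def tune_explainer(make_explainer, search_space, metric, model, X, n_trials=50):
    """Random search: sample hyperparameter configurations and keep the one
    whose explanations score best on the chosen evaluation metric."""
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: random.choice(values) for name, values in search_space.items()}
        explainer = make_explainer(**config)  # builds the XAI solution
        # Average the metric over a sample of instances to be explained.
        score = sum(metric(model, x, explainer(x)) for x in X) / len(X)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

Here `search_space` maps each hyperparameter name to its candidate values, e.g., {"n_samples": [100, 500, 1000]} for a perturbation-based explainer; more elaborate search strategies come from the AutoML literature discussed next.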
2.4 AutoML
Designing ML algorithms is an iterative task of testing and modifying both the architecture and the hyperparameters of
the algorithm. This repetitive task requires a lot of time. For this reason, part of the research effort has focused on
automating the design of ML algorithms, a field known as AutoML He et al. [2021]. AutoML frameworks look for the
best-performing ML pipeline to solve a task on a given dataset. According to He et al. [2021], AutoML consists of
several processes: data preparation, feature engineering, model generation, and model evaluation. They divide the model
generation process into two steps: search space and optimization methods. The first step defines the design principles of