WHY SHOULD I CHOOSE YOU?
AUTOXAI: A FRAMEWORK FOR SELECTING AND TUNING EXPLAINABLE AI SOLUTIONS
A PREPRINT
Robin Cugny
IRIT
Université Toulouse 2
Toulouse, France
robin.cugny@irit.fr
Julien Aligon
IRIT
Université Toulouse 1
Toulouse, France
julien.aligon@irit.fr
Max Chevalier
IRIT
Université Toulouse 3
Toulouse, France
max.chevalier@irit.fr
Geoffrey Roman Jimenez
SolutionData Group
Toulouse, France
groman-jimenez@solutiondatagroup.fr
Olivier Teste
IRIT
Université Toulouse 2
Toulouse, France
olivier.teste@irit.fr
October 11, 2022
ABSTRACT
In recent years, a large number of XAI (eXplainable Artificial Intelligence) solutions have been
proposed to explain existing ML (Machine Learning) models or to create interpretable ML models.
Evaluation measures have recently been proposed and it is now possible to compare these XAI
solutions. However, selecting the most relevant XAI solution among all this diversity is still a
tedious task, especially when meeting specific needs and constraints. In this paper, we propose
AutoXAI, a framework that recommends the best XAI solution and its hyperparameters according to
specific XAI evaluation metrics while considering the user’s context (dataset, ML model, XAI needs
and constraints). It adapts approaches from context-aware recommender systems and strategies of
optimization and evaluation from AutoML (Automated Machine Learning). We apply AutoXAI to
two use cases, and show that it recommends XAI solutions adapted to the user’s needs with the best
hyperparameters matching the user’s constraints.
Keywords
Explainable machine learning, Evaluation of explainability, Quality of explanation, Evaluation metrics,
AutoML, Recommender system, Information system
Acknowledgements
This work was supported by ANRT (CIFRE [2020/0870]) in collaboration with SolutionData Group and IRIT.
1 Introduction
Machine Learning (ML) models are now widely used in industry. However, their lack of understandability delays their adoption in high-stakes domains such as the medical field Markus et al. [2021], digital security Brown et al. [2018], the judicial field Tan et al. [2018], or autonomous driving Omeiza et al. [2021]. In such contexts, decision-makers should understand ML models and their results to detect biases Tan et al. [2018] or meaningless relationships Ribeiro et al. [2016]. During the last decade, the eXplainable Artificial Intelligence (XAI) field has proposed a wide variety of solutions to facilitate the understanding of ML models Adadi and Berrada [2018], Carvalho et al. [2019], Molnar [2020], Barredo Arrieta et al. [2020], Doshi-Velez and Kim [2017], Lipton [2018], Gilpin et al. [2018]. In view of the growing number of XAI proposals Barredo Arrieta et al. [2020], evaluating the quality of explanations has become necessary to choose
an appropriate XAI solution as well as its hyperparameters. It is worth noting that the evaluation of explanations can
either be done subjectively by humans or objectively with metrics Nguyen and Martínez [2020], Zhou et al. [2021],
Nauta et al. [2022]. However, data scientists who want to include an XAI solution have the following issues:
- They must check which XAI solutions are compatible with the data type and the ML model.
- The XAI solutions should explain specifically what the data scientists want to understand, and it should be explained in an appropriate format.
- They should evaluate the effectiveness of the explanations produced by the selected XAI solutions.
- The context requires that the explanations match specific quality criteria (called explanations' properties), which imposes the use of appropriate evaluation metrics.
- They have to find the best hyperparameters for each of the selected XAI solutions to keep the best of them.
These are tedious and time-consuming tasks. Theoretical guides have been proposed by Vermeire et al. [2021], Palacio
et al. [2021] but, to the best of our knowledge, automating the complete XAI recommendation approach has never been
formalized and implemented before.
In this paper, we propose to automate these tasks in a framework to assist data scientists in choosing the best XAI
solutions according to their context (dataset, ML model, XAI needs and constraints).
Suggesting an adapted XAI solution requires defining the elements of the data scientists’ contexts and using them to
filter the compatible XAI solutions. This task is challenging because there are as many formalizations as there are
authors in the XAI field and very few works have attempted to unify XAI elements with a formalization Lundberg and
Lee [2017], Palacio et al. [2021], Nauta et al. [2022]. As we want to evaluate the XAI solutions, we must automatically
find the XAI evaluation metrics that are compatible with the XAI solutions and are meaningful to the context. Moreover,
it is necessary to find a way to validate multiple complementary properties by optimizing corresponding XAI evaluation
metrics while considering the data scientists' preferences. Indeed, the importance of the properties is subjective and depends on
the context. In addition, ranking XAI solutions requires finding the best hyperparameters using XAI evaluation metrics.
As it is computationally expensive, we draw inspiration from time-saving strategies for model evaluation in AutoML
He et al. [2021].
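As a minimal illustration of how such preferences could be taken into account (an assumption on our part, not the formal aggregation used by the framework), property scores could be combined into a single objective through a weighted sum whose weights reflect the data scientist's preferences:

```python
# Hypothetical sketch: aggregate several XAI property scores into one
# objective using user-defined preference weights (weights sum to 1).
def aggregate(scores: dict, weights: dict) -> float:
    """scores and weights are keyed by property name."""
    return sum(weights[p] * scores[p] for p in weights)

# Example: correctness matters most, compactness least.
print(aggregate({"correctness": 0.9, "continuity": 0.7, "compactness": 0.4},
                {"correctness": 0.5, "continuity": 0.3, "compactness": 0.2}))  # ~0.74
```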
The contributions of this paper are as follows:
- AutoXAI, a framework that recommends XAI solutions to match the data scientists' context and optimizes their hyperparameters with respect to XAI evaluation metrics Zhou et al. [2021], Nauta et al. [2022].
- A more generic formalization of the data scientists' context for XAI.
- A new evaluation metric to assess the completeness of example-based explanations.
- New time-saving strategies adapted to XAI evaluation.
We illustrate AutoXAI’s recommendations through two use cases with different users’ constraints and needs as well as
different datasets and models. These studies let us uncover interactions between hyperparameters and properties of
explanations, as well as interactions between the properties themselves.
The rest of the paper is organized as follows. Related work is described in Section 2. Formal definitions are illustrated
by an example of context in Section 3. The core of our framework is detailed in Sections 4 and 5. Experiments of
Section 6 show that our framework is well adapted to propose an explanation matching the user’s context and that
time-saving strategies considerably reduce the computation time. Finally, we conclude the paper and give possible
perspectives in Section 7.
2 Related work
Four research topics interact in this paper: XAI solutions, which are assessed with XAI evaluation metrics, while context-aware recommender systems and AutoML provide the means to propose the most adapted solution.
2.1 XAI solutions
In this paper, we define an XAI solution as any algorithm that produces an explanation related to an ML problem. This
includes methods that explain black-box models but also naturally interpretable models. As mentioned in Section 1,
many XAI solutions now exist and different taxonomies have been proposed such as Barredo Arrieta et al. [2020],
Carvalho et al. [2019], Adadi and Berrada [2018], Gilpin et al. [2018], Nauta et al. [2022]. Carvalho et al. [2019]
also suggests grouping XAI solutions according to the type of explanation produced. They list: feature summary,
model internals, data point, surrogate intrinsically interpretable model, rule sets, explanations in natural language, and
question-answering. Later, Liao et al. [2020] suggests that XAI explanations answer specific questions about data, its
processing and results in ML. They map existing XAI solutions to questions and create an XAI question bank that
supports the design of user-centered XAI applications. Overton [2011] defines an explanation as an explanans, the answer to the question, and an explanandum, what is to be explained. These two elements provide a user-friendly characterization of explanations and thus allow users to specify which explanation is best adapted to their needs.
The diversity of existing XAI solutions makes it hard to find an XAI solution adapted to one’s needs. Moreover, as
the XAI field is growing, more and more XAI solutions proposed in the literature are producing similar kinds of
explanations. Hence, it has become necessary to objectively compare XAI solutions by assessing the effectiveness of
their explanations. In this direction, the recent literature has focused on quantitative XAI evaluations Nauta et al. [2022].
2.2 Evaluation of XAI solutions
Doshi-Velez and Kim [2017] distinguish three strategies of evaluation: application-grounded evaluation, human-grounded evaluation, and functionality-grounded evaluation, which does not involve human intervention. Application-grounded evaluation tests the effectiveness of explanations in a real-world application with domain experts, while human-grounded evaluations are carried out with lay humans. Although explanations are ultimately intended for humans, functionality-grounded evaluations are attractive because of their objectivity: this type of evaluation is inexpensive, fast, and can lead to a formal comparison of explanation methods Zhou et al. [2021].
Since the notion of "good explanations" is not trivial, some quality properties have been proposed by Robnik-Šikonja and Bohanec [2018]. These are human-defined criteria that attest to the quality of explanations. Functionality-grounded evaluation metrics are constructed to compute scores that measure how well a property is met.
Nauta et al. [2022] focuses on functionality-grounded evaluation and proposes the Co-12 Explanation Quality Properties to unify the diverse properties proposed in the literature. They review most existing XAI evaluation metrics and associate each of them with these properties. The properties from their list that are studied in this paper are as follows: Continuity describes how continuous and generalizable the explanation function is; Correctness describes how faithful the explanation is w.r.t. the black box; Compactness describes the size of the explanation; and Completeness describes how much of the black-box behavior is covered by the explanation.
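To make this concrete, the sketch below (ours, not an implementation from the paper) shows how a functionality-grounded metric for the continuity property could be computed: explanations of slightly perturbed inputs are compared with the explanation of the original input. The `explain` callable, the Gaussian perturbation, and the cosine-similarity scoring are illustrative assumptions.

```python
# Sketch of a continuity metric: explanations of perturbed inputs should
# stay close to the explanation of the original input.
import numpy as np

def continuity_score(explain, X, noise_scale=0.01, n_repeats=5, seed=0):
    """explain: maps a 1-D sample to a 1-D attribution vector.
    Returns the mean cosine similarity in [-1, 1]; higher = more continuous."""
    rng = np.random.default_rng(seed)
    sims = []
    for x in X:
        e = explain(x)
        for _ in range(n_repeats):
            e_pert = explain(x + rng.normal(scale=noise_scale, size=x.shape))
            denom = np.linalg.norm(e) * np.linalg.norm(e_pert) + 1e-12
            sims.append(np.dot(e, e_pert) / denom)
    return float(np.mean(sims))

# Toy usage with a linear "explainer" whose attributions are x * weights.
weights = np.array([0.5, -1.0, 2.0])
X_demo = np.random.default_rng(1).normal(size=(10, 3))
print(continuity_score(lambda x: x * weights, X_demo))
```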
In practice, XAI evaluation metrics produce scores for properties of interest, making it possible to compare and choose
an XAI solution. However, the data scientists still have to find the desired XAI solutions and their corresponding XAI
evaluation metrics. This issue could be addressed with strategies that have been studied in context-aware recommender
systems.
2.3 Context-aware recommender systems
Recommender systems filter information to present the most relevant elements to a user. To the best of our knowledge,
there is no recommender system for XAI solutions. To recommend adapted XAI solutions, one should consider the
whole context of the data scientist. According to Adomavicius et al. [2011], context-aware recommender systems offer
more relevant recommendations by adapting them to the user’s situation. They also state that context may be integrated
during three phases: contextual prefiltering which selects a subset of possible candidates before the recommendation,
contextual modeling which uses context in the recommendation process, and contextual postfiltering which adjusts
the recommendation afterward. These three phases require formally defining the elements of the context, which is
one of our objectives for the framework we propose in this paper. While recommending an adapted XAI solution is an interesting first step, the data scientist ultimately wants a reliable explanation, i.e., an explanation that satisfies the properties of interest. To achieve this, a possible approach is to use the previously detailed XAI evaluation metrics to optimize the hyperparameters of the adapted XAI solutions. For this kind of approach, many strategies have been proposed in
the AutoML domain.
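As a toy illustration of contextual prefiltering in this setting (a sketch under our own assumptions, not the paper's implementation), candidate XAI solutions could carry compatibility metadata that is matched against the data scientist's context before any evaluation takes place. The catalogue entries and field names below are hypothetical.

```python
# Toy contextual prefiltering of XAI solutions against a user context.
CATALOGUE = [
    {"name": "LIME", "data_types": {"tabular", "text", "image"},
     "model_access": "black-box", "explanation": "feature summary"},
    {"name": "SHAP", "data_types": {"tabular", "image"},
     "model_access": "black-box", "explanation": "feature summary"},
    {"name": "ProtoDash", "data_types": {"tabular"},
     "model_access": "black-box", "explanation": "data point"},
]

def prefilter(catalogue, context):
    """Keep only the XAI solutions compatible with the user's context."""
    return [s for s in catalogue
            if context["data_type"] in s["data_types"]
            and s["model_access"] == context["model_access"]
            and s["explanation"] == context["wanted_explanation"]]

context = {"data_type": "tabular", "model_access": "black-box",
           "wanted_explanation": "feature summary"}
print([s["name"] for s in prefilter(CATALOGUE, context)])  # ['LIME', 'SHAP']
```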
2.4 AutoML
Designing ML algorithms is an iterative task of testing and modifying both the architecture and the hyperparameters of the algorithm. It is a repetitive task that requires a lot of time. For this reason, part of the research effort has focused on automating the design of ML algorithms, namely AutoML He et al. [2021]. AutoML frameworks look for the best-performing ML pipeline to solve a task on a given dataset. According to He et al. [2021], AutoML consists of several processes: data preparation, feature engineering, model generation, and model evaluation. They divide the model generation process into two steps: the search space and the optimization methods. The first step defines the design principles of the
models that are tested, and the second defines how to obtain the best scores in the model evaluation process. The main strategy of interest here is HyperParameter Optimization (HPO), which consists in finding the best hyperparameters according to a loss function. As model performance cannot be differentiated with respect to the hyperparameters, this is a non-differentiable optimization problem; therefore, HPO methods do not rely on the model's internals to propose a solution. Moreover, since training a model until convergence is a costly operation, model evaluation is a very time-consuming step. Thus, several strategies have been proposed to accelerate the evaluation of models. He et al. [2021] lists four types of strategies: low fidelity, weight sharing, surrogate, and early stopping. The low fidelity strategy consists in reducing the number of observations to reduce the number of epochs, or the size of the observations to reduce the number of parameters to optimize. Weight sharing reuses learned parameters between models. The surrogate strategy replaces a computationally expensive model with an approximation to estimate the performance of a neural architecture and guide the architecture search. Early stopping accelerates model evaluation by stopping iterations when performance is predicted to be lower than the current best score.
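Transposed to XAI, such a search could look like the following sketch (ours, not the paper's algorithm): a simple random search over an XAI solution's hyperparameters, scored by an XAI evaluation metric computed on a data subsample, in the spirit of the low fidelity strategy. The search space, the `evaluate` callable, and the subsample size are illustrative assumptions.

```python
# Random search over XAI hyperparameters with a "low fidelity" evaluation.
import random

def random_search(evaluate, search_space, X, n_trials=20, subsample=100, seed=0):
    """evaluate(hyperparams, X_subset) -> score to maximize."""
    rng = random.Random(seed)
    X_small = X[:subsample]  # low fidelity: evaluate on fewer observations
    best_score, best_hp = float("-inf"), None
    for _ in range(n_trials):
        hp = {name: rng.choice(values) for name, values in search_space.items()}
        score = evaluate(hp, X_small)
        if score > best_score:
            best_score, best_hp = score, hp
    return best_hp, best_score

# Hypothetical search space for a perturbation-based explainer, with a dummy
# metric that simply favors small kernel widths.
space = {"n_samples": [100, 500, 1000], "kernel_width": [0.25, 0.5, 1.0]}
best_hp, best = random_search(lambda hp, X: -hp["kernel_width"], space, list(range(500)))
print(best_hp, best)
```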
2.5 Approaches for choosing an XAI solution
As mentioned in Section 1, to select an XAI solution, the data scientist can currently rely on XAI libraries, benchmarks, and theoretical guides. Currently available XAI libraries such as DeepExplain Ancona et al. [2018], AI Explainability 360 Arya et al. [2019], and Alibi Klaise et al. [2021] gather state-of-the-art XAI solutions. However, they neither integrate automatic evaluation of explanations nor recommend XAI solutions according to data scientists' needs and constraints.
Comparatives and benchmarks Yeh et al. [2019], Hooker et al. [2019], Alvarez-Melis and Jaakkola [2018] compare the efficiency of XAI solutions using XAI evaluation metrics. They are often published jointly with the proposal of an XAI solution or of an XAI evaluation metric on which they focus. However, the reported results depend on the dataset and the ML model, which may not be the ones the data scientist uses, and the conclusions may therefore differ. Moreover, the hyperparameters of the XAI solutions are not optimized to maximize the various properties needed by the data scientist. This last point is problematic as some XAI properties, such as correctness and compactness, are not independent Nauta et al. [2022].
Finally, Vermeire et al. [2021] highlights that users should be guided in choosing XAI solutions and proposes a methodology for this issue, while Palacio et al. [2021] proposes a theoretical framework to facilitate the comparison between XAI solutions.
To summarize, as there is a high diversity of XAI solutions, it is a complex and tedious task for data scientists to
find XAI solutions that fit their needs. Yet, there is no recommendation system to automate this task. Moreover, data
scientists look for the best XAI solution as they want a reliable solution. XAI evaluation metrics allow for objective
comparison, but XAI libraries do not implement them and comparatives are not adapted to the user's context. Finally, data scientists have to find hyperparameters to maximize the desired properties. However, this task should be done for
multiple XAI solutions and multiple properties according to the data scientists’ preferences. This paper aims to address
these issues in the following sections.
3 Example and formalization
3.1 Illustrative example
Let us first consider Alice, a data scientist in a medical laboratory, and Bob, a physician colleague. Bob uses an ML black-box model as a decision support tool and asks whether it is possible to obtain an explanation for the predictions of the model in order to check some rare cases. Alice has access to the model, as well as the data that were used to build it, and now wants to implement an XAI solution to produce explanations.
Here, the needs of Bob, the physician, are the following: the explanations must focus on predictions (since the question is why they are obtained) and the XAI solution must explain a trained model without modifying it. Moreover, Bob wants to know how the data collected for a patient (the features) influence the result of the model.
Regarding the constraints of the context, the high-stakes decisions impose the use of an accurate model and the most faithful explanations possible (correctness property). Moreover, the explanations should not be completely changed by small perturbations, as blood measurements might be noisy; stable explanations are therefore mandatory (continuity property). Finally, since Bob will be the main user of these explanations, concise explanations should be encouraged, as physicians should not waste time on unimportant features (compactness property).
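Before the formal definitions of the next subsection, the following sketch gives a hypothetical, simplified encoding of Bob's context as it could be handed to a framework such as AutoXAI; the field names, values, and weights are illustrative assumptions, not the paper's formalization.

```python
# Hypothetical, simplified encoding of Bob's context (illustrative only).
bob_context = {
    "dataset": "blood_measurements.csv",         # tabular patient data
    "ml_model": "trained_black_box_classifier",  # existing model, must not be modified
    "needs": {
        "explanandum": "individual predictions",  # what is to be explained
        "explanans": "feature influence",         # how it should be explained
        "post_hoc": True,                         # explain the trained model as-is
    },
    # Relative importance of the desired properties of the explanations.
    "constraints": {"correctness": 0.5, "continuity": 0.3, "compactness": 0.2},
}
```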