Explanations Based on Item Response Theory (eXirt):
A Model-Specific Method to Explain Tree-Ensemble
Model in Trust Perspective
José de Sousa Ribeiro Filho a,b,c,∗, Lucas Felipe Ferraro Cardoso a,b, Raíssa Lorena Silva da Silva d,e, Nikolas Jorge Santiago Carneiro b, Vitor Cirilo Araujo Santos b, Ronnie Cley de Oliveira Alves a,b

a Federal University of Pará (UFPA), Postgraduate Program in Computer Science (PPGCC), Belém, 66075-10, Pará, Brazil
b Vale Institute of Technology (ITV), Belém, 66055-090, Pará, Brazil
c Federal Institute of Education, Science and Technology of Pará (IFPA), Ananindeua, 67125-000, Pará, Brazil
d University of Montpellier, Montpellier, 34090, Hérault, France
e La Ligue Contre le Cancer, Montpellier, 34000, Hérault, France

∗Corresponding author. Tel.: +55 (91) 98185-3166
Email addresses: jose.ribeiro@ifpa.edu.br (José de Sousa Ribeiro Filho), lucas.cardoso@icen.ufpa.br (Lucas Felipe Ferraro Cardoso), raissa.silva@inserm.fr (Raíssa Lorena Silva da Silva), nikolas.carneiro@itv.org (Nikolas Jorge Santiago Carneiro), vitor.cirilo.santos@itv.org (Vitor Cirilo Araujo Santos), ronnie.alves@itv.org (Ronnie Cley de Oliveira Alves)

Preprint submitted to Expert Systems with Applications (Accepted), July 4, 2024. arXiv:2210.09933v3 [cs.LG], 3 Jul 2024.

Abstract

Solutions based on tree-ensemble models represent a considerable alternative for real-world prediction problems, but these models are considered black boxes, which hinders their applicability in sensitive contexts (such as health and safety). Explainable Artificial Intelligence (XAI) aims to develop techniques that generate explanations of black box models, since these models are normally not self-explanatory. Methods such as Ciu, Dalex, Eli5, Lofo, Shap, and Skater emerged with the proposal of explaining black box models through global rankings of feature relevance which, based on different methodologies, generate global explanations that indicate how the model's inputs explain its predictions. This research presents an innovative XAI method, called eXirt, capable of explaining tree-ensemble models based on Item Response Theory (IRT).
In this context, 41 datasets, 4 tree-ensemble algorithms (Light Gradient Boosting, CatBoost, Random Forest, and Gradient Boosting), and 7 XAI methods (including eXirt) were used to generate explanations. In the first set of analyses, the 164 global feature relevance ranks generated by eXirt were compared with 984 ranks produced by the other XAI methods in the literature, verifying that the new method generated explanations different from those of the existing methods. In a second analysis, local and global explanations exclusive to eXirt were presented that help in understanding model trust, since these explanations make it possible to observe particularities of the model regarding difficulty (whether the model had difficulty predicting the test dataset), discrimination (whether the model understands the test dataset as discriminative), and guessing (whether the model got the test dataset right by chance). Thus, it was verified that eXirt is able to generate global explanations of tree-ensemble models, as well as local and global explanations of models through IRT, showing how this consolidated theory can be used in machine learning to obtain explainable and reliable models.
Keywords:
Global Explanation, Item Response Theory, Explainable Artificial Intelligence, Black Box, Model-Specific, eXirt
1. Introduction
Technology has been evolving, and artificial intelligence is now a reality in society's daily life. Machine learning algorithms solve many real-world problems, making human life more automated and intelligent (Shalev-Shwartz and Ben-David, 2014; Ghahramani, 2015).
Machine learning models based on tree-structured bagging and boosting algorithms are known to provide high performance and high generalization capabilities, and are thus widely used in intelligent systems embedded in real-world problems (Maclin and Opitz, 1997; Haffar et al., 2022).
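As a concrete, purely illustrative example of such ensembles, the sketch below instantiates one bagging learner (Random Forest) and three boosting learners (Gradient Boosting, Light Gradient Boosting, and CatBoost) with library defaults on synthetic data; it does not reproduce the configurations used in this study.

```python
# Illustrative only: one bagging ensemble and three boosting ensembles of trees.
# Hyperparameters are library defaults, not the ones used in this study.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from lightgbm import LGBMClassifier
from catboost import CatBoostClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "Random Forest (bagging)": RandomForestClassifier(random_state=0),
    "Gradient Boosting (boosting)": GradientBoostingClassifier(random_state=0),
    "Light Gradient Boosting (boosting)": LGBMClassifier(random_state=0),
    "CatBoost (boosting)": CatBoostClassifier(verbose=0, random_state=0),
}

for name, model in models.items():
    model.fit(X, y)
    print(name, "training accuracy:", round(model.score(X, y), 3))
```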
Even though they are widely used in problems of very different natures, tree-ensemble algorithms do not have as many XAI methods capable of explaining their predictions as, for example, neural networks do (Ibrahim and Shafiq, 2023; Samek et al., 2021). Tree-ensemble algorithms are not considered transparent¹; their predictions are not self-explanatory, thus being considered black box algorithms² and, therefore, less used in problems related to sensitive contexts, such as health and safety (Shojaei et al., 2023; Ghosh et al., 2023; Ribeiro et al., 2022).

¹Transparent algorithms: algorithms that naturally generate explanations of how a particular output was produced. Examples include Decision Tree, Logistic Regression, and K-Nearest Neighbors.

²Black box algorithms: machine learning algorithms in which the steps of classification or regression decisions are hidden from the user.
High-performance models usually imply low transparency (Arrieta et al., 2020), and with their increasing use in sensitive contexts there is a growing need to develop methods or tools that provide local explanations (feature relevance explanations generated around each data instance) and global explanations (when it is possible to understand the model's logic over all instances in a global way), as a means to make predictions more easily interpretable and more trustworthy for humans (Guidotti et al., 2018; Gunning and Aha, 2019; Lundberg et al., 2020a; Ribeiro et al., 2016; Wang et al., 2021).
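To make the local/global distinction concrete, the sketch below computes a naive perturbation-based feature relevance for a single instance (local) and averages it over many instances to obtain a ranking (global). This is a minimal didactic illustration of the two granularities, not the mechanism of any of the XAI methods discussed in this paper; all names in it are ours.

```python
# Minimal illustration of local vs. global explanations via feature perturbation.
# Not any published XAI method; purely a didactic sketch.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

def local_relevance(x, n_repeats=30):
    """Mean change in P(class 1) when each feature of one instance is resampled."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    scores = np.zeros(x.size)
    for j in range(x.size):
        perturbed = np.tile(x, (n_repeats, 1))
        perturbed[:, j] = rng.choice(X[:, j], size=n_repeats)  # resample feature j
        scores[j] = np.abs(model.predict_proba(perturbed)[:, 1] - base).mean()
    return scores

# Local explanation: feature relevance around one specific prediction.
print("local relevance of instance 0:", local_relevance(X[0]).round(3))

# Global explanation: aggregate the local scores into one ranking for the model.
global_scores = np.mean([local_relevance(x) for x in X[:50]], axis=0)
print("global relevance ranking (best first):", np.argsort(global_scores)[::-1])
```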
In this regard, methods such as Ciu (Främling, 2020), Dalex (Biecek and Burzykowski, 2021), Eli5 (Korobov and Lopuhin, 2021), Lofo (Roseline and Geetha, 2021), Shap (Lundberg et al., 2020a), and Skater (Oracle, 2021a) have emerged to promote the creation of model-agnostic and model-specific explanations. Note that a model-agnostic XAI method does not depend on the type of model to be explained (Arrieta et al., 2020), while a model-specific XAI method applies to a specific type of machine learning model (Khan, 2022).
The main advantage of model-agnostic methods is their independence from the type of model to be explained. In contrast, the main advantage of the model-specific approach is the possibility of developing explanations tailored to certain types of algorithms or even to certain problems (Khan, 2022; Molnar, 2020).
It should be noted that each of the methods mentioned above explains models using different techniques and methodologies, but one fact they have in common is that they all generate global rankings of feature relevance as part of a model's explanation, and their results can therefore be compared quantitatively (Ribeiro et al., 2021).
The terminologies Feature Relevance Ranking and Feature Importance Ranking are widely used as synonyms in the computing community, but they have different definitions herein, as shown in (Arrieta et al., 2020). Feature rankings are ordered structures in which each feature of the dataset used by the model appears in a position indicated by a score. The main difference is that, in a relevance ranking, the score is calculated from the model's output, whereas the importance ranking of features is calculated using the correct label to be predicted (Arrieta et al., 2020; Molnar, 2020).
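The distinction can be illustrated with a small sketch of our own (not taken from the cited works): the same permutation scheme yields an importance ranking when scored against the true labels and a relevance ranking when scored against the model's own predictions.

```python
# Illustrative sketch of the relevance vs. importance distinction.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=400, n_features=5, random_state=1)
model = GradientBoostingClassifier(random_state=1).fit(X, y)

# Importance ranking: permutation importance measured against the true labels y.
imp = permutation_importance(model, X, y, n_repeats=10, random_state=1)
importance_rank = np.argsort(imp.importances_mean)[::-1]

# Relevance ranking: the same permutation scheme, but scored against the
# model's own predictions instead of the ground truth.
y_model = model.predict(X)
rel = permutation_importance(model, X, y_model, n_repeats=10, random_state=1)
relevance_rank = np.argsort(rel.importances_mean)[::-1]

print("importance ranking:", importance_rank)
print("relevance ranking :", relevance_rank)
```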
Global feature relevance rankings represent a significant part of this study because they allow general analyses of how a given model generalizes a specific problem, along with analyses of how a given methodology explains a specific model, without the need for a preliminary understanding of the context in which the model is embedded (Ribeiro et al., 2021).
Despite being limited, global feature relevance ranks carry general explanations about the analyzed model, and for this reason they were selected as the basic explanation structure for comparing the results of different XAI methods quantitatively, without requiring a human expert's knowledge of the context of each analyzed problem (Molnar, 2020). This matters because, in XAI, there is no baseline definition of a good or bad model explanation (Linardatos et al., 2021).
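As one possible form of such a quantitative comparison (used here only as an illustration with made-up values, not as the comparison protocol adopted in this paper), two global relevance rank lists can be correlated, for example with Kendall's tau:

```python
# Hypothetical example: quantifying agreement between two global relevance ranks.
from scipy.stats import kendalltau

# Position i holds the rank assigned to feature i by each method (made-up values).
rank_exirt = [1, 3, 2, 5, 4]
rank_shap = [1, 2, 3, 5, 4]

tau, p_value = kendalltau(rank_exirt, rank_shap)
print(f"Kendall tau = {tau:.2f}  (1.0 = identical ordering, -1.0 = reversed)")
```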
As shown in a previous study (Ribeiro et al., 2021), explanations originating from different XAI methods may present specific similarities or significant differences, depending on the properties of the model to be explained and the particularities of each XAI method used. Given this fact, when there are several explanations for a set of models, the question naturally arises: "Which model and explanation should I trust?".
In addition to global explanations, there are local explanations that are
created at the model dataset instance level, allowing a greater level of un-
derstanding of how a model performs predictions (Arrieta et al., 2020).
Explanation-by-example is a type of model explanation technique focused on significant example instances from a dataset, which, through specific techniques, produce explanations that help a human interpret model predictions (Molnar, 2020). It is worth noting that this technique is a viable way to create explanations that provide insights into how much a human can trust a model prediction, or even the model as a whole (Ribeiro et al., 2016; Cardoso et al., 2022).
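As a rough, generic illustration of explanation-by-example (a simple nearest-neighbour variant of our own, not the IRT-based approach of Cardoso et al., 2022), a prediction can be justified by retrieving the training instances most similar to the query that the model places in the same class:

```python
# Hedged, minimal illustration of explanation-by-example: justify a prediction
# by showing the most similar training instances that receive the same label.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=4, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)
model = RandomForestClassifier(random_state=2).fit(X_train, y_train)

x = X_test[0]
pred = model.predict(x.reshape(1, -1))[0]

# Training instances the model assigns to the same class, ordered by distance to x.
same_class = X_train[model.predict(X_train) == pred]
examples = same_class[np.argsort(np.linalg.norm(same_class - x, axis=1))[:3]]
print(f"Predicted class {pred}; three closest supporting examples:\n{examples}")
```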
Focusing on tree-ensemble algorithms, this research identifies the need and the opportunity to create a model-specific method capable of generating global and local explanations aimed at greater reliability of the model (Chatzimparmpas et al., 2020; Ribeiro et al., 2016). This calls for a way of evaluating models that differs from the existing XAI methods.
Item Response Theory is a widespread theory, generally used to evaluate candidates in selection processes. The theory uses the properties "discrimination", "difficulty", and "guessing" to evaluate latent characteristics, which cannot be observed directly, of the candidates' responses in a selection process, establishing the relationship between the probability of a correct answer and the candidate's ability (Andrade et al., 2000).
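In its common three-parameter logistic (3PL) form, the probability of a correct response is P(theta) = c + (1 - c) / (1 + exp(-a(theta - b))), where a is the discrimination, b the difficulty, c the guessing parameter, and theta the respondent's latent ability. The sketch below evaluates this curve for an arbitrary, made-up item; the parameter values are illustrative only.

```python
# The three-parameter logistic (3PL) model from Item Response Theory, as a
# minimal sketch. a = discrimination, b = difficulty, c = guessing; theta is
# the respondent's latent ability. Parameter values below are illustrative.
import numpy as np

def p_correct(theta, a, b, c):
    """Probability that a respondent with ability `theta` answers the item correctly."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# A discriminative, moderately difficult item with some room for lucky guesses.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct(theta, a=1.5, b=0.5, c=0.2), 3))
```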
This research proposes a new method for explaining tree-ensemble models based on Item Response Theory, called eXirt. Seeking to validate this method, global feature relevance ranks were generated for models created from 4 different algorithms (Light Gradient Boosting, CatBoost, Random Forest, and Gradient Boosting) and 41 different datasets (binary classification), and were compared to the results of 6 XAI methods already known in the literature, aiming to show similarities and differences across several problem contexts. Then, analyses of local explanations uniquely generated by the eXirt method are also presented, which provide insights on how to trust the analyzed models.
This research is the continuation of previous studies: "Explanation-by-Example Based on Item Response Theory" (Cardoso et al., 2022), "Does Dataset Complexity Matters for Model Explainers?" (Ribeiro et al., 2021), and "Decoding Machine Learning Benchmarks" (Cardoso et al., 2020), which have already been duly published.
The main contributions that this research makes to studies in Explainable Artificial Intelligence are as follows:

• An innovative XAI method, called eXirt, which is based on Item Response Theory, an interesting theory still under-explored in machine learning;

• Innovative explanations of tree-ensemble models generated by the eXirt method, capable of producing global feature relevance ranks based on IRT, along with local information on model discrimination, difficulty, and guessing, enabling unique insights into model reliability;

• Comparisons of the feature relevance ranks generated by the eXirt method with the results generated by the Ciu, Dalex, Eli5, Lofo, Shap, and Skater methods.