Painting the black box white: experimental findings
from applying XAI to an ECG reading setting
Federico Cabitza1,2,*, Matteo Cameli3, Andrea Campagner1, Chiara Natali1 and Luca Ronzio4
1Department of Computer Science, Systems and Communication, University of Milano-Bicocca, Milan, Italy
2IRCCS Istituto Ortopedico Galeazzi, Milan, Italy
3Department of Medicine, Surgery and Neuroscience, University of Siena, Siena, Italy
4Department of Medicine and Surgery, University of Milano-Bicocca, Milan, Italy
Abstract
The shift from symbolic AI systems to black-box, sub-symbolic, and statistical ones has motivated a
rapid increase in the interest toward explainable AI (XAI), i.e. approaches to make black-box AI systems
explainable to human decision makers with the aim of making these systems more acceptable and more
usable tools and supports. However, we make the point that, rather than always making black boxes
transparent, these approaches are at risk of painting the black boxes white, thus failing to provide a level
of transparency that would increase the system’s usability and comprehensibility; or, even, at risk of
generating new errors, in what we termed the white-box paradox. To address these usability-related issues,
in this work we focus on the cognitive dimension of users’ perception of explanations and XAI systems.
To this aim, we designed and conducted a questionnaire-based experiment by which we involved 44
cardiology residents and specialists in an AI-supported ECG reading task. In doing so, we investigated
dierent research questions concerning the relationship between users’ characteristics (e.g. expertise)
and their perception of AI and XAI systems, including their trust, the perceived explanations’ quality
and their tendency to defer the decision process to automation (i.e. technology dominance), as well as
the mutual relationships among these dierent dimensions. Our ndings provide a contribution to the
evaluation of AI-based support systems from a Human-AI interaction-oriented perspective and lay the
ground for further investigation of XAI and its eects on decision making and user experience.
Keywords
Explainable AI, Decision Support Systems, ECG, Artificial Intelligence, XAI
1. Introduction
We are witnessing a continuous and indeed accelerating move from decision support systems
that are based on explicit rules conceived by domain experts (so called expert systems or
knowledge-based systems) to systems whose behaviors can be traced back to an innumerable
number of rules that have been automatically learnt on the basis of correlative and statistical
analyses of large quantities of data: this is the shift from symbolic AI systems to sub-symbolic
ones, which has made the black-box nature of these latter systems an object of a lively and
widespread debate in both technological and philosophical contexts [1]. The main assumption
motivating this debate is that making sub-symbolic systems explainable to human decision
makers makes them better and more acceptable tools and supports.
This assumption is widely accepted [2, 3, 4], although there are a few scattered voices against
it (see e.g. [5, 6, 7, 8]): for instance, explanations were found to increase complacency towards
the machine advice [9], to increase automation bias [10, 11], as well as to groundlessly increase
confidence in one’s own decision [12, 13]. Understanding, or participating in, this debate, which
characterizes the scientific community that recognizes itself in the expression “explainable AI”
and in the acronym “XAI”, is difficult because of the seemingly disarming heterogeneity of the
definitions of explanation, and of the variety of characteristics that are associated with “good
explanations” or with the systems that generate them [14].
In what follows, we adopt the simplifying approach recently proposed in [14], where an explanation
is defined as the meta-output (that is, an output that describes, enriches or complements another,
main output) of an XAI system. In this perspective, good explanations are those that make the XAI
system more usable, and therefore a more useful support. The reference to usability suggests that
we can assess explanations (and explainability) on different levels, by addressing complementary
questions, such as: do explanations make the socio-technical, decision-making setting more effective,
in that they help decision makers commit fewer errors? Do they make it more efficient, by making
decisions easier and faster, or simply by requiring fewer resources? And, last but not least, do they
make users more satisfied with the advice received, possibly because they have understood it better,
and this has made them more confident about their final say?
While some studies [15] have already considered the psychometric dimension of user satisfaction
(see, e.g., the concept of causability [16], related to the role of explanations in making advice more
understandable from a causal point of view), here we would like to focus on effectiveness (i.e.,
accuracy) and on other cognitive dimensions (besides understandability), both in regard to the
support (e.g., trust and utility) and to the explanations received. In fact, explanations can be either
clear or ambiguous (cf. comprehensibility); either tautological and placebic [17] or instructive
(cf. informativeness); either pertinent or off-topic (cf. pertinence); and, as obvious as it may seem,
either correct or wrong, as any AI output can be. Therefore, otherwise good explanations (that is,
persuasive, reassuring, comprehensible, etc.) could even mislead their target users: this is the
so-called white-box paradox, which we have already begun investigating in previous empirical
studies [18, 19]. Thus, investigating whether and to what extent users find explanations “good”
(in the next section we will make this term operationally clear) can be related to investigating the
possible determinants of machine influence (also called dominance), automation bias and other
negative effects of the output of decision support systems on decision performance and practices.
2. Methods
To investigate how human decision makers perceive explanations, we designed and conducted
a questionnaire-based experiment in which we involved 44 cardiology residents and specialists
of varying expertise and competence (namely, 25 and 19, respectively) from the School of Medicine
of the University Hospital of Siena (Italy) in an AI-supported ECG reading task, not connected to
their daily care. The readers were invited to classify and annotate 20 ECG cases, previously selected
by a cardiologist from a random set of cases extracted from the ECG Wave-Maven repository¹ on the
basis of their complexity (as recorded in the above repository), so as to have a balanced dataset in
terms of case type and difficulty. The study participants had to provide their diagnoses both with
and without the support of a simulated AI system, according to an asynchronous Wizard-of-Oz
protocol [20]: the support of the AI system included both a proposed diagnosis and a textual
explanation backing the former. The experiment was performed by means of a web-based
questionnaire set up through the LimeSurvey platform (version 3.23), to which the readers had
been individually invited by personal email.
The ECG readers were randomly divided into two groups, equivalent in terms of expertise, which
were supposed to interact with the AI system differently (see Fig. 1); in doing so, we could
comparatively evaluate potential differences between a human-first and an AI-first configuration.
In both groups, the first question of the questionnaire asked the readers to self-assess their
trust in AI-based diagnostic support systems for ECG reading. The same question was also
repeated at the end of the questionnaire, to evaluate potential differences in trust caused by the
interaction with the AI system.
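As a minimal illustrative sketch of such a stratified assignment (the reader identifiers, function name and seed below are hypothetical and not part of the study materials), a random split balanced by expertise could look as follows:

import random

def assign_arms(residents, specialists, seed=2022):
    """Split readers into two arms (human-first vs. AI-first), stratifying
    by expertise so that each arm receives a similar mix of residents and
    specialists."""
    rng = random.Random(seed)
    arms = {"human-first": [], "AI-first": []}
    for stratum in (residents, specialists):
        pool = list(stratum)
        rng.shuffle(pool)
        half = len(pool) // 2
        arms["human-first"].extend(pool[:half])
        arms["AI-first"].extend(pool[half:])
    return arms

# Hypothetical reader identifiers: 25 residents and 19 specialists.
residents = [f"R{i:02d}" for i in range(1, 26)]
specialists = [f"S{i:02d}" for i in range(1, 20)]
arms = assign_arms(residents, specialists)
print({arm: len(readers) for arm, readers in arms.items()})

Shuffling within each expertise stratum before splitting spreads residents and specialists roughly evenly across the two arms.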
Figure 1: A BPMN representation of the study design. The information collected is represented as data
objects, coming from collection tasks whose names denote the main actor. After the initial collection of
the perceived “trust in AI”, the subprocess is repeated for each ECG case, during which the HD1, AI, HD2,
XAI and FHD items are collected, together with the comprehensibility, appropriateness and utility ratings.
Finally, a post-test “trust in AI” is collected again.
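For concreteness, the items gathered in each repetition of the subprocess of Fig. 1 can be pictured as a per-case record like the following minimal sketch (the Python class and field names are ours and only mirror the labels in the figure, not the actual questionnaire items):

from dataclasses import dataclass
from typing import Optional

@dataclass
class CaseRecord:
    """Items collected for a single ECG case (field names are hypothetical
    and only mirror the labels used in Fig. 1)."""
    reader_id: str
    case_id: str
    hd1: Optional[str]      # initial human diagnosis, before any AI advice
    ai: str                 # diagnosis proposed by the (simulated) AI
    hd2: Optional[str]      # human diagnosis after seeing the AI advice
    xai: str                # textual explanation backing the AI advice
    fhd: str                # final human diagnosis, after the explanation
    comprehensibility: int  # perceived comprehensibility of the explanation
    appropriateness: int    # perceived appropriateness of the explanation
    utility: int            # perceived utility of the explanation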
For each ECG case, the readers in the human-first group were first shown the ECG trace together
with a brief case description, and then had to provide an initial diagnosis (in free-text format).
After this diagnosis had been recorded, these respondents were shown the diagnosis proposed by
the AI; after having considered this latter advice, the respondents could revise their initial diagnosis;
they were then shown the textual explanation (motivating the AI advice) and asked to provide their
final diagnosis in light of this additional information. In contrast, the participants enrolled in the
AI-first group were shown the AI-proposed diagnosis together with the ECG trace and case
description; only afterwards were they asked to provide
their own diagnosis in light of this advice only. Finally, these readers too were shown the textual
explanation and asked to provide their final diagnosis.
¹ https://ecg.bidmc.harvard.edu/maven/mavenmain.asp
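The two interaction protocols can be summarized schematically as follows (a hypothetical Python sketch of the question flow; the helper functions and dictionary keys are ours and do not reflect the actual LimeSurvey implementation):

def show(*items):
    """Display study material to the reader (stub, for illustration only)."""
    print(*items)

def ask(prompt):
    """Collect a free-text answer from the reader (stub, for illustration only)."""
    return input(prompt + ": ")

def human_first_flow(case, ai):
    """Human-first arm: the reader commits to a diagnosis before any AI advice."""
    show(case["ecg_trace"], case["description"])
    hd1 = ask("Initial diagnosis")
    show(ai["diagnosis"])        # AI advice is revealed only now
    hd2 = ask("Revised diagnosis")
    show(ai["explanation"])      # textual explanation backing the advice
    fhd = ask("Final diagnosis")
    return hd1, hd2, fhd

def ai_first_flow(case, ai):
    """AI-first arm: the AI advice is shown together with the case itself."""
    show(case["ecg_trace"], case["description"], ai["diagnosis"])
    hd2 = ask("Diagnosis in light of the AI advice")
    show(ai["explanation"])
    fhd = ask("Final diagnosis")
    return hd2, fhd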