
Figure 1: Our human-in-the-loop through explainability framework. A human possessing expert knowledge trains a
classifier. Our LIME-based explainer provides insights to the human based on the current model. The human applies
expert knowledge to interpret the explanation. The human expert modifies the model appropriately, and the process can
be repeated. In our case, we modify the network by adding new features based on expert medical knowledge regarding
burns.
Computer-vision-based techniques in current medical ML models lack human expertise. Therefore, a human-in-the-loop system built on Explainable Artificial Intelligence is proposed here. Human-in-the-loop (HITL) is an Artificial Intelligence (AI) paradigm that assumes the presence of human experts who can guide the learning or operation of an otherwise-autonomous system. Lundberg et al. [2018] developed and tested a system to prevent hypoxaemia during surgery by providing anaesthesiologists with interpretable hypoxaemia risks and contributing factors. Later, Sayres et al. [2019] proposed and evaluated a system to assist diabetic retinopathy grading by ophthalmologists using a deep learning model and integrated-gradients explanations.
An Explainable Artificial Intelligence (XAI) is an intelligent system which can be explained and understood by a human
Gohel et al. [2021]. For this, we utilize LIME (Local Interpretable Model-agnostic Explanations) Ribeiro et al. [2016],
a recent method that is able to explain the predictions of any classifier model in an interpretable manner. LIME operates
by roughly segmenting an image into feature regions, then assigning saliency scores to each region. Higher scoring
zones are more important in arriving at the classification result of the studied model. The algorithm first creates random perturbations of the image to be explained. Then, the classification model is run on those perturbed samples. Distances between
those samples and the original image can be calculated, which are then converted to weights by being mapped between
zero and one using a kernel function. Finally, a simple linear model is fitted around the initial prediction to generate
explanations. This explanation provided by LIME is the result of the following minimization:
\xi(x) = \arg\min_{g \in G} \; \mathcal{L}(f, g, \pi_x) + \Omega(g) \qquad (1)
Let $x$ and $f$ be the image and classifier to be examined, and let $G$ be the class of interpretable models, such as decision trees and linear models. The complexity of the model (e.g., the depth of a decision tree or the number of non-zero weights in a linear model) should be as small as possible to maintain explainability; this complexity is denoted $\Omega(g)$. Explanations by LIME are found by fitting explainable models, minimizing the sum of the local faithfulness loss $\mathcal{L}$ and the complexity score. Perturbation-based sampling is used to approximate this local faithfulness loss; to this end, a proximity measure $\pi_x(z)$ calculates the distance between $x$ and another image $z$. The objective is to minimize the faithfulness loss while keeping the complexity low enough for the surrogate to remain interpretable. This minimization makes no assumptions about $f$, so the resulting explanation is model-agnostic.
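To make this objective concrete, the following is a minimal sketch (not the authors' implementation) of how equation (1) can be approximated for an RGB image: quickshift produces the segment regions, perturbed samples are generated by switching segments off, the black-box classifier $f$ is queried on them, an exponential kernel over the perturbation distance plays the role of $\pi_x$, and a ridge regression acts as the interpretable surrogate $g$, with its regularization standing in for $\Omega(g)$. The names explain_instance and predict_fn are illustrative placeholders.

import numpy as np
from skimage.segmentation import quickshift
from sklearn.linear_model import Ridge

def explain_instance(image, predict_fn, target_class, num_samples=1000,
                     kernel_width=0.25, seed=0):
    """Fit a kernel-weighted linear surrogate g around image x (cf. Eq. 1)."""
    rng = np.random.default_rng(seed)

    # 1. Roughly segment the image into superpixel feature regions.
    segments = quickshift(image, kernel_size=4, max_dist=200, ratio=0.2)
    seg_ids = np.unique(segments)
    n_segments = seg_ids.size

    # 2. Perturb: randomly switch segments off (replaced by the mean color).
    masks = rng.integers(0, 2, size=(num_samples, n_segments))
    masks[0, :] = 1                      # sample 0 is the unperturbed image
    mean_color = image.mean(axis=(0, 1))
    perturbed = []
    for m in masks:
        img = image.copy()
        img[~np.isin(segments, seg_ids[m == 1])] = mean_color
        perturbed.append(img)

    # 3. Query the black-box classifier f on the perturbed samples.
    probs = predict_fn(np.stack(perturbed))[:, target_class]

    # 4. Proximity pi_x(z): exponential kernel on the distance between each
    #    binary mask and the all-ones mask of the original image.
    distances = np.linalg.norm(1 - masks, axis=1) / np.sqrt(n_segments)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 5. Interpretable surrogate g: ridge regression, whose L2 penalty plays
    #    the role of the complexity term Omega(g).
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, probs, sample_weight=weights)

    # Higher coefficients mark segments that push the prediction towards
    # target_class; these are the high-saliency zones shown to the expert.
    return segments, surrogate.coef_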
Explainable intelligence is useful when combined with HITL systems because it provides understandable, qualitative information about the relationship between an instance's components and the model's prediction. An expert can therefore make an informed decision about whether the model is reliable, and can make the necessary changes if it is not, eventually reaching a confident result. This is especially important in the medical field because of the severe ethical implications of a wrong diagnosis.
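In practice, such an explanation can be generated with the lime package's image explainer. The snippet below is only a sketch, not the paper's pipeline: burn_image and classifier_fn are dummy placeholders standing in for the burn-wound image and the trained network, which must map a batch of images to class probabilities.

import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

# Placeholders for the paper's data and trained network: a random RGB image
# and a dummy classifier returning probabilities for three burn-depth classes.
burn_image = np.random.randint(0, 255, size=(224, 224, 3), dtype=np.uint8)

def classifier_fn(images):
    scores = np.random.rand(len(images), 3)
    return scores / scores.sum(axis=1, keepdims=True)

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    burn_image, classifier_fn, top_labels=3, hide_color=0, num_samples=1000
)

# Overlay the most influential superpixels for the top predicted class,
# which the clinician can then inspect against domain knowledge.
img, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True,
    num_features=5, hide_rest=False,
)
overlay = mark_boundaries(img / 255.0, mask)   # img is uint8 here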
By deploying our explainable human-in-the-loop method, we were able to confirm the importance of one family of features which can enhance a convolutional burn prediction classifier: statistical texture. From the Gray Level Co-occurrence Matrix (GLCM), a method that represents the second-order statistical information of gray levels