Utilizing Explainable AI for improving the
Performance of Neural Networks
Huawei Sun1,3, Lorenzo Servadei1,3, Hao Feng1,3, Michael Stephan1,2, Robert Wille3, Avik Santra1
1Infineon Technologies AG, Neubiberg, Germany
2Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany
3Technical University of Munich, Munich, Germany
E-mail: {huawei.sun, lorenzo.servadei, avik.santra}@infineon.com
{hao.feng, robert.wille}@tum.de
{michael.stephan}@fau.de
Abstract—Nowadays, deep neural networks are widely used in
a variety of fields that have a direct impact on society. Although
those models typically show outstanding performance, they have
long been used as black boxes. To address this, Explainable Artificial Intelligence (XAI) has developed as a field that aims to improve the transparency of such models and increase their trustworthiness. We propose a retraining pipeline
that consistently improves the model predictions starting from
XAI and utilizing state-of-the-art techniques. To do that, we use
the XAI results, namely SHapley Additive exPlanations (SHAP)
values, to give specific training weights to the data samples. This
leads to an improved training of the model and, consequently,
better performance. In order to benchmark our method, we
evaluate it on both real-life and public datasets. First, we apply the method to a radar-based people counting scenario. Afterward, we test it on CIFAR-10, a public Computer Vision
dataset. Experiments using the SHAP-based retraining approach achieve 4% higher accuracy than standard equal-weight retraining on the people counting task. Moreover, on CIFAR-10, our SHAP-based weighting strategy yields 3% higher accuracy than the training procedure with equally weighted samples.
Index Terms—Radar Sensors, Explainable AI, Deep Learning,
SHapley Additive exPlanations
I. INTRODUCTION
Various application areas have been positively affected by
the recent advances of Artificial Intelligence (AI) and Machine
Learning (ML). Among them, fields such as autonomous
driving [1], [2], health tech [3], [4] and robotics [5], [6]
heavily rely on ML algorithms processing data from a set of different sensors. These approaches are typically based on computationally intensive Deep Learning (DL) strategies, which involve training millions, or even billions, of parameters to perform a specific task. Although the resulting models show high performance, a major problem arises: As a neural network gets deeper, it also becomes more complex and thus harder to interpret. To this end,
a neural network is often considered a black box: Even if the model predicts correctly for a given input, it is difficult to explain what causes the correct prediction. This, in turn, reduces the trustworthiness of the outcome.
In order to improve a DL system, it is necessary to under-
stand its weaknesses and shortcomings [7]. Approaching this,
XAI focuses on improving the transparency of ML technologies and increasing trust in them. When a model's predictions are incorrect, explanatory algorithms can aid in tracing the underlying reasons and phenomena. XAI has been researched
for several years, and lots of work has been done in fields
such as Computer Vision (CV) [8], [9], [10] and Natural
Language Processing (NLP) [11], [12]. These algorithms
mainly generate attention maps, which help to highlight the critical areas or words when classifying images or translating language. Nevertheless, DL is nowadays widely applied in
less conventional application fields: For example, radar-based
solutions for tasks such as counting people [13], identifying
gestures [14], and tracking [15], as shown in [16]. Although the advancements mentioned above success-
fully solve radar-based problems, explaining DL models for
radar signals is still a challenging topic. Additionally, most
XAI algorithms analyze the predictions from a well-trained
model, thus focusing only on the explanatory part. To this end,
a few research contributions move forward by utilizing the
results from XAI for secondary tasks. Layer-wise Relevance
Propagation (LRP), for example, is used for an adaptive learning rate during training in [17], and in [18] the authors prune Deep Neural Networks (DNNs) and quantize their weights mainly using Deep Learning Important FeaTures (DeepLIFT). However, to
the best of our knowledge, XAI has not yet been used to
process the dataset and improve the network performance. In
this paper, we first adapt our method to a real-life use case:
Radar-based people counting. Afterward, we show promising
results on the CIFAR-10 dataset [19] to further underline the approach's generality.
Radar-based people counting is a significant application, offering stronger privacy preservation and independence from weather conditions compared to camera-based people counting. However, such algorithms often underperform state-of-the-art computer vision methods, and radar data is often limited by low resolution and room dependency [20], [21]. To this end,
many solutions have been implemented which use DL for this
task in different scenarios [13], [22]. Although performant,
those solutions do not consider how the network obtains the
prediction and which features are essential to explain the
outcome of the task.
This paper introduces a retraining pipeline, which adopts
SHAP values to improve the model performance. The
proposed approach shows convincing outcomes in both public
and real-life datasets. In the radar-based people counting
scenario, on the one hand, we explain the neural network's predictions on radar signals. This is the first time in the literature that XAI algorithms are applied to neural networks with radar-based input. On the other hand, we propose a retraining
pipeline that adds sample weights generated from the SHAP
[23] results. This further improves the performance of the
people counting network. To show this, we execute several
experiments highlighting our method’s benefit. We first apply
our weighting strategy to our radar-based people counting task.
Compared with the equal weighting method, our SHAP-based method yields a 4% increase in accuracy. When we apply the method to CIFAR-10, our approach outperforms default equal-weighting retraining by 3% in accuracy.
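To make the idea concrete, the sketch below shows how SHAP-derived, per-sample weights could be fed into a retraining run via Keras' sample_weight mechanism. The weighting rule shap_to_weight and the precomputed shap_maps array are purely illustrative assumptions, not the paper's weighting strategy; the actual strategies are described in Section III.

```python
import numpy as np
from tensorflow import keras

def shap_to_weight(shap_maps):
    """Hypothetical mapping from per-sample SHAP attributions to a scalar
    training weight: samples with weak overall attribution get more weight.
    This rule is an illustrative assumption, not the paper's formula."""
    strength = np.abs(shap_maps).sum(axis=tuple(range(1, shap_maps.ndim)))
    w = 1.0 / (strength + 1e-8)
    return w / w.mean()              # normalise so the average weight stays 1

# Toy data: 1000 samples of 32x32x3 "images", 10 classes.
x = np.random.rand(1000, 32, 32, 3).astype("float32")
y = np.random.randint(0, 10, size=1000)
# Per-sample SHAP attributions, assumed precomputed on the initially
# trained model by a SHAP explainer (same shape as the inputs here).
shap_maps = np.random.randn(1000, 32, 32, 3)

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(32, 32, 3)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Retraining step: Keras scales each sample's loss by its weight.
model.fit(x, y, sample_weight=shap_to_weight(shap_maps), epochs=1, batch_size=32)
```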
II. BACKGROUND AND MOTIVATION
This section first describes the status of explainable AI
and its application fields. Afterward, we introduce the funda-
mentals of mmWave Frequency-Modulated Continuous-Wave
(FMCW) radar.
A. Explainable Artificial Intelligence
Several class activation mapping (CAM) based XAI algo-
rithms such as CAM [8] and gradient class activation mapping
(Grad-CAM) [9] are used, in the literature, to explain well-trained CNN architectures in image classification tasks. They
typically work by generating saliency maps through a linear
combination of activation maps that highlight the model’s
attention area. Although those methods are widespread,
there is still a lack of research on using XAI algorithms to
explain models with radar signals as input. In fact, on the one hand, features in radar signals are more difficult for humans to understand than those in images. On the other hand, unlike image RGB channels, radar signals typically carry distinct types of information, such as Macro- and Micro-Doppler maps, which cannot be stacked and treated as one single image. This
contrasts with the usual practice in CV, where methods such as
CAM-based approaches only generate a single saliency map
for features of each input sample. Therefore, a major question is how to find an efficient XAI method that adapts to the heterogeneous information contained in the input. Recently, SHAP [23], an additive feature attribution approach, has been used in domains such as text classification [24] and the analysis of time-series data [25]; it can explain multiple types of information in the input at the same time. This exactly fits the problem at hand.
Additive feature attribution method: An additive feature attribution method follows:

$g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i$  (1)
where $z' \in \{0, 1\}^M$ is a binary vector whose entries indicate the existence of the corresponding input features and $M$ is the number of simplified input features. $\phi_i$ indicates the importance of the $i$-th feature and $\phi_0$ is the baseline explanation. In order to explain a complex model $f$, such as a neural network, we can use a more straightforward explanation model $g$ instead. For a given input $x$, $f(x)$ denotes the prediction output. A simplified input $x'$ can be restored to the input $x$ through a mapping function $x = h_x(x')$. When the binary vector $z' \approx x'$, additive feature attribution methods ensure $g(z') \approx f_c(h_x(x'))$, where $c$ is the target class of the given input $x$.
SHAP values: SHAP values are the feature attributions of the explanation model, which obey Eq. 1 and are formulated as follows:

$\phi_i = \sum_{z' \subseteq x'} \frac{(M - |z'|)! \, (|z'| - 1)!}{M!} \left[ f_c(h_x(z')) - f_c(h_x(z' \setminus i)) \right]$  (2)

where $|z'|$ denotes the number of non-zero entries in $z'$ and $z' \subseteq x'$ represents all $z'$ vectors whose non-zero entries are a subset of the non-zero entries in $x'$. Additionally, $z' \setminus i$ denotes setting $z'_i$ to zero. In this way, taking image-based input as an example, for every single model input, SHAP generates one SHAP map with the same shape as the input sample, where the pixel value denotes the attribution of the corresponding pixel of the input.
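As an illustration of Eq. (2), the following sketch enumerates all coalitions to compute exact SHAP values for a toy model with three simplified features. The linear toy model, the zero baseline used for missing features, and the mapping $h_x$ are assumptions chosen for the example; practical explainers such as SHAP approximate this exponential sum rather than enumerating it.

```python
import itertools
import math
import numpy as np

def shap_values_exact(f_c, h_x, M):
    """Exact SHAP values per Eq. (2): enumerate every coalition z' that
    contains feature i, weight the marginal contribution of i, and sum.
    Cost grows as 2^M, so this is only feasible for toy examples."""
    phi = np.zeros(M)
    for i in range(M):
        others = [j for j in range(M) if j != i]
        for r in range(len(others) + 1):
            for subset in itertools.combinations(others, r):
                z = np.zeros(M, dtype=int)
                z[list(subset)] = 1
                z[i] = 1                                  # coalition z' includes i
                z_minus_i = z.copy()
                z_minus_i[i] = 0                          # z' \ i
                size = int(z.sum())                       # |z'|
                weight = (math.factorial(M - size)
                          * math.factorial(size - 1)) / math.factorial(M)
                phi[i] += weight * (f_c(h_x(z)) - f_c(h_x(z_minus_i)))
    return phi

# Toy example: a linear "model" over three input features.
x = np.array([2.0, -1.0, 0.5])
baseline = np.zeros_like(x)                      # absent features are zeroed out
h_x = lambda z: np.where(z == 1, x, baseline)    # mapping x = h_x(z')
f_c = lambda inp: 3 * inp[0] + 1 * inp[1] - 2 * inp[2]

phi = shap_values_exact(f_c, h_x, M=3)
phi_0 = f_c(h_x(np.zeros(3, dtype=int)))         # baseline explanation phi_0
print(phi)                                       # [ 6. -1. -1.]
print(phi_0 + phi.sum(), f_c(x))                 # both 4.0: local accuracy, Eq. (1)
```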
B. Introduction of mmWave FMCW Radar Sensor
FMCW radar sensors typically operate by transmitting a sequence of frequency-modulated chirp signals with a short ramp time and delays between them. A chirp sequence with
a fixed number of chirps is usually defined as one frame
and repeated with the frame repetition time [13]. The chirp
signals reflected by the targets are received, mixed with the
transmitted signal, and filtered, to generate the Intermedi-
ate Frequency (IF) signal, which is then digitized by the
Analog to Digital Converter (ADC) for task-dependent pre-
processing. For example, range and Doppler information can
be acquired by taking the Fast Fourier Transform (FFT) along
the respective axes of the data in one frame. Using multiple
receiving or transmitting antennas also allows the estimation
of the Direction Of Arrival (DOA) from the DOA-dependent
time delay of the received signal across the antennas. This
makes FMCW radar one of the most commonly used radar types, since it avoids complicated pre-processing and saves energy at the same time.
Therefore, it is an ideal sensor for many simple tasks, such
as people counting and activity classification, without compromising users' data privacy.
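As a minimal illustration of this pre-processing, the sketch below computes a range-Doppler map from one frame of ADC data with two FFTs, one along fast time (range) and one along slow time (Doppler). The frame shape, Hann windowing, and DC removal are illustrative assumptions, not the exact pipeline used in the paper.

```python
import numpy as np

def range_doppler_map(frame):
    """Toy range-Doppler processing for one FMCW frame.
    `frame` holds the raw ADC samples with shape (num_chirps, num_samples):
    chirps along slow time (rows), samples along fast time (columns)."""
    num_chirps, num_samples = frame.shape
    # Remove the per-chirp mean to suppress the DC / zero-range component.
    frame = frame - frame.mean(axis=1, keepdims=True)
    # Range FFT along fast time (Hann window); keep the positive-range half.
    range_fft = np.fft.fft(frame * np.hanning(num_samples), axis=1)
    range_fft = range_fft[:, : num_samples // 2]
    # Doppler FFT along slow time; shift so zero velocity sits in the centre.
    doppler_fft = np.fft.fft(range_fft * np.hanning(num_chirps)[:, None], axis=0)
    return np.abs(np.fft.fftshift(doppler_fft, axes=0))

# Example: a synthetic frame of 64 chirps x 128 ADC samples.
rng = np.random.default_rng(0)
frame = rng.standard_normal((64, 128))
print(range_doppler_map(frame).shape)   # (64, 64): Doppler bins x range bins
```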
III. APPROACH
In this section, we first describe the radar data preprocessing
method. Then, we illustrate the radar data augmentation meth-
ods. We also introduce a stabilized architecture for learning
robust embedding vectors and label predictions. Finally, we focus on our retraining procedure with different weighting
methods calculated from SHAP values and probability vector
predictions.