A Detailed Study of Interpretability of Deep Neural
Network based Top Taggers
Ayush Khot1, Mark S. Neubauer1, and Avik Roy1
1Department of Physics & National Center for Supercomputing Applications
(NCSA), University of Illinois at Urbana-Champaign, Urbana, IL 61801
E-mail: avroy@illinois.edu
June 2023
Abstract. Recent developments in the methods of explainable AI (XAI) allow
researchers to explore the inner workings of deep neural networks (DNNs), revealing
crucial information about input-output relationships and realizing how data connects
with machine learning models. In this paper we explore interpretability of DNN models
designed to identify jets coming from top quark decay in high energy proton-proton
collisions at the Large Hadron Collider (LHC). We review a subset of existing top tagger
models and explore different quantitative methods to identify which features play the
most important roles in identifying the top jets. We also investigate how and why
feature importance varies across different XAI metrics, how correlations among features
impact their explainability, and how latent space representations encode information
as well as correlate with physically meaningful quantities. Our studies uncover some
major pitfalls of existing XAI methods and illustrate how they can be overcome to
obtain consistent and meaningful interpretation of these models. We additionally
illustrate the activity of hidden layers as Neural Activation Pattern (NAP) diagrams
and demonstrate how they can be used to understand how DNNs relay information
across the layers and how this understanding can help to make such models significantly
simpler by allowing effective model reoptimization and hyperparameter tuning. These
studies not only facilitate a methodological approach to interpreting models but also
unveil new insights about what these models learn. Incorporating these observations
into augmented model design, we propose the Particle Flow Interaction Network
(PFIN) model and demonstrate how interpretability-inspired model augmentation can
improve top tagging performance.
Keywords: Explainable AI, interpretable machine learning, jet classification with deep
learning
corresponding author
arXiv:2210.04371v4 [hep-ex] 5 Jul 2023
1. Introduction
Machine learning (ML) models are ubiquitous in experimental High Energy Physics
(HEP). With an ever-increasing volume of data coupled with complex detector phenomenology, these models help extract meaningful information from large datasets. Over time, machine learning models have grown in complexity, and simpler regression and classification models have been replaced by intricate deep
neural networks. Owing to their intractably large number of trainable parameters and
arbitrarily complex non-linear nature, deep neural networks (DNNs) have often been
treated as black boxes. It has always been challenging to understand how different input
features contribute to the network’s computational process and how the inter-connected
neural pathways convey information. In recent years, advances in explainable Artificial
Intelligence (XAI) [1] have made it possible to build intelligible relationships between an AI model's inputs, architecture, and predictions [2, 3, 4]. While some methods remain model agnostic, a substantial subset of these methods has been developed to interpret computer vision models, where intuitive reasoning can be extracted from human-annotated datasets to validate XAI techniques. However, for other data structures such as large tabular data or relational data constructs like graphs, the use of XAI methods is still quite novel [5, 6]. In recent times, XAI has been successful
in learning the underlying physics of a number of problems in high energy detectors [7],
including parton showers at the Large Hadron Collider (LHC) [8] and jet reconstruction
using particle flow algorithms [9].
One of the major applications of ML in the field of HEP is classification of jets,
which is referred to as jet tagging. Jets represent hadronic showers observed as conical sprays of particles originating from quarks and gluons produced in high energy
collisions at a collider experiment like the LHC. Identifying jets that originate from
decay products of a particle such as the top quark (t) and being able to separate them
from other jet categories, such as jets originating from the quantum chromodynamics
(QCD) background, is an important challenge in many physics analyses. Traditional top
tagging algorithms based on kinematic features of jets and clustering of jet constituents
(see Refs. [10, 11, 12, 13] for example) have been used in particle phenomenology research
as well as by the ATLAS and CMS experiments and their predecessors. In Run 1 physics
analyses, these top tagging algorithms along with low-complexity statistical models like
decision trees took center stage in top tagging [14, 15, 16]. However,
owing to their superior performance, models based on DNNs started becoming popular
in Run 2 at a higher center-of-mass energy of 13 TeV [17, 18].
For top quarks produced with large momenta, the decay products can be packed close to one another and reconstructed as a single jet. For such boosted jets, top tagging can be particularly challenging and requires a better analysis of jet substructure: a collection of constituents and their derivative properties that can offer better discrimination between jet classes. DNNs have proven useful to exploit
the jet substructure properties in performing jet classification. A wide variety of deep
learning models have been developed to optimize top tagging [19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 33]. A comprehensive review and comparison of many of these
models is given in Ref. [34]. While some of these models have exploited DNNs' capacity to approximate arbitrary non-linear functions [35] and their huge success with problems in the field of computer vision, other models have been inspired by underlying physics information like jet clustering history [22], physical symmetries [23], and physics-inspired feature engineering [27]. These efforts have inspired novel model architectures
and feature engineering by creating or augmenting input feature spaces with physically
meaningful quantities [27, 36, 37].
The rich history of physics-inspired model development makes the problem of top
tagging an excellent playground to better understand the modern XAI tools themselves.
This allows us to traverse a rare two-way bridge in exploring the relationship between data and models: our physics knowledge will allow us to better understand the inner workings of modern XAI tools and perfect them, while those improved tools will allow us to take a deeper look at the models, paving the way for analyzing and reoptimizing them. As pointed out in Ref. [38], such insights into the explainability of DNN-based models are important to validate them and to make them reliable and reusable. Additionally, the broader scope of uncertainty quantification for ML models relies on developing robust explanations [39], and in the field of HEP, problems like top tagging will require a dedicated understanding of how robust and interpretable these models are [40].
Yet another remarkable application of interpretability is to understand how the
model conveys information and, in doing so, which parts of a DNN most actively engage
in forward propagation of information. Such studies could be useful to understand
and reoptimize model complexity. Given DNNs have shown remarkable success in jet
and event classification, recent work has placed emphasis on developing DNN-enabled
FPGAs for trigger-level applications at the LHC [41, 42, 43]. As resource consumption
and latency of FPGAs directly depend on the size of the network to be implemented,
it is considerably easier to embed simpler networks on these devices. Hence, methods that allow interpreting a network's response patterns and provide critical insights about model optimization without compromising performance can greatly benefit these budding fields of ML applications, especially for online event selection and jet tagging
at current and future high energy colliders.
Application of state-of-the-art explainability techniques for interpreting jet tagger
models is receiving more attention recently [37, 44, 45, 46] and has been demonstrated
to be successful in identifying feature importance for models like the Interaction
Network [47]. In this paper, we study the interpretability of a subset of existing ML-
based top tagging models. The models we have chosen use multi-layer perceptrons
(MLPs) as underlying neural architecture. Choosing simpler neural architecture allows
us to elucidate the applicability and limitations of existing XAI methods and develop
new tools to examine them without convoluting these efforts with the complexity of
larger models or unorthodox data structures. To compare our results for different
models as well as with existing benchmarks in published literature, we use the dataset
developed by the authors of Ref. [23] and later used in the top tagger model review in
Ref. [34]. The models explored in this paper along with the dataset have been reviewed
in section 2. The model hyperparameters explained in this section will constitute the
baseline model in each category. Variants of each model are studied to better understand
their interpretability where the underlying architecture remains the same but model
hyperparameters, input features, or data preprocessing might be changed. Section 3
reviews modern XAI methods that we will use in investigating the explainability of top
tagger models. In section 4, we analyze the results of applying XAI methods on different
top tagger models. Section 6 summarizes our findings and illustrates new dimensions
to explore in the conjunction of XAI and HEP.
2. Review of Top Tagging Dataset and Models
The dataset used in this paper has been used for model benchmarking studies in Ref. [34] and is publicly available at Ref. [48]. This dataset consists of 1 million top (signal) jets and 1 million QCD (background) jets generated with Pythia8 [49] with its default tune at a center-of-mass energy of 14 TeV for proton-proton collisions. The detector simulation was performed with Delphes [50] and jets were reconstructed using the anti-kT algorithm [51] with a jet radius of R = 0.8 using FastJet [52]. Only jets with transverse momenta within the range of 550 to 650 GeV are considered. For each jet, the dataset contains the four-momenta of up to 200 constituents, with zero-padded entries for missing constituents. The dataset is divided into training, validation,
and testing sets with a 6:2:2 split. Some characteristic jet features from a random
subsample of the training data are shown in Figure 1.
Figure 1. Distribution of (a) number of constituent particles, (b) jet transverse momentum (pT,J), and (c) jet mass (mJ) for background (QCD) and signal (top) jets.
In this paper we consider three different NN-based models for top tagging. Since each tagger distinguishes between two jet classes, minimizing the standard binary cross-entropy (BCE) loss is used as the training objective for all models. The training is done using the Adam optimizer with minibatches. All networks showed comparable performance with different batch sizes. The architecture, hyperparameters, and data preprocessing for each of the baseline models are summarized below:
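Before the per-model summaries, the shared BCE training objective can be sketched as a minimal NumPy function (an illustration of the loss definition, not the authors' training code):

```python
import numpy as np

def bce_loss(p, y, eps=1e-7):
    """Binary cross-entropy between predicted signal probabilities p and labels y."""
    p = np.clip(p, eps, 1 - eps)  # guard against log(0)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
```

During training, this loss is minimized over minibatches with the Adam optimizer.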
TopoDNN [19, 17]: The simplest top tagging model we consider is a fully connected
multi-layer perceptron (MLP) trained with transverse momentum (pT), azimuthal
angle (ϕ), and pseudorapidity (η) of the 30 most energetic particles. Usually referred
to as TopoDNN, this model represents a quintessential MLP network. Although
TopoDNN is outperformed by many other ML models for top tagging, its simple
architecture allows us to explore different XAI metrics, their limitations, and the best
practices to overcome them. Since MLPs are still widely used in HEP for a wide variety
of applications, our studies of modern XAI for this model will also illustrate the best
practices to interpret the input-output relations for such models.
TopoDNN is trained on preprocessed data where (i) the jet is rotated in the η-ϕ plane so that its most energetic constituent is aligned with the central coordinate (0, 0), (ii) the second most energetic constituent falls along the negative ϕ axis, and (iii) all momenta are scaled by an arbitrarily chosen factor of 1/1700. Transformations (i) and (ii) take advantage of the underlying Lorentz invariance of collider physics, while (iii) converts the momenta into unitless quantities and scales them down to a numerical range comparable to that of the η and ϕ values. The baseline model is constructed with 4 hidden layers of 300, 102, 12, and 6 nodes respectively, each using the ReLU activation function. The output layer consists of a single node whose value is passed through a sigmoid function to represent the probability of the jet being classified as a signal jet.
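The baseline architecture can be sketched as a plain NumPy forward pass. The weights here are randomly initialized for illustration only; the training framework and initialization scheme are not specified in this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# 30 constituents x (pT, phi, eta) = 90 inputs; hidden layers of 300, 102, 12, 6 nodes
sizes = [90, 300, 102, 12, 6, 1]
weights = [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def topodnn_forward(x):
    """Forward pass: ReLU hidden layers, sigmoid output = signal probability."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)  # ReLU
    logits = x @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid
```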
Multi-body N-subjettiness (MBNS) [20, 21]: Top-tagging with N-subjettiness
variables uses an MLP as the underlying trainable architecture. However, the input
to the network is different from the usual kinematic variables. It uses the multi-body
N-subjettiness variables [53], defined as
$$\tau_n^{(\beta)} = \frac{1}{p_{T,J}} \sum_i p_{T,i}\, \min\left\{ R_{1i}^{\beta},\, R_{2i}^{\beta},\, \ldots,\, R_{ni}^{\beta} \right\} \qquad (1)$$
where $p_{T,J}$ and $p_{T,i}$ represent the transverse momenta of the jet and its $i$-th constituent, and $R_{ki}$ is the distance between the $k$-th jet axis and the $i$-th particle constituent.
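Equation (1) can be sketched directly in NumPy, assuming the n candidate jet axes have already been determined (in the paper they come from a clustering algorithm); the function name and argument layout are illustrative.

```python
import numpy as np

def tau_n(pt, eta, phi, axes, pt_jet, beta=1.0):
    """N-subjettiness tau_n^(beta) of Eq. (1) for n given jet axes.

    pt, eta, phi: per-constituent arrays; axes: (n, 2) array of (eta, phi)
    per axis; pt_jet: jet transverse momentum used for normalisation.
    """
    # Distance R_ki between every axis k and every constituent i
    deta = eta[None, :] - axes[:, 0:1]
    dphi = np.mod(phi[None, :] - axes[:, 1:2] + np.pi, 2 * np.pi) - np.pi
    dR = np.hypot(deta, dphi)      # shape (n, num_constituents)
    min_dR = dR.min(axis=0)        # nearest axis per constituent
    return float(np.sum(pt * min_dR**beta) / pt_jet)
```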
The $n$ jet axes chosen for calculating $\tau_n^{(\beta)}$ are obtained using the $k_t$ algorithm [54]
with E-scheme recombination [55]. Figure 2 shows the distribution of some of the τ
variables for QCD and top jets. The input to the MBNS tagger is the set of subjettiness variables
$$\left\{ \tau_1^{(0.5)}, \tau_1^{(1)}, \tau_1^{(2)}, \tau_2^{(0.5)}, \tau_2^{(1)}, \tau_2^{(2)}, \ldots, \tau_{N-2}^{(0.5)}, \tau_{N-2}^{(1)}, \tau_{N-2}^{(2)}, \tau_{N-1}^{(1)}, \tau_{N-1}^{(2)} \right\} \cup \left\{ p_{T,J}, m_J \right\} \qquad (2)$$
where, besides the subjettiness variables, the jet $p_T$ and jet mass ($m_J$) variables are used as inputs to provide a kinematic scale for the jet event. However, the latter inputs are scaled by a factor of 1/1000 to mitigate the several orders of magnitude gap between