Global Counterfactual Explainer for Graph Neural Networks

Zexi Huang*, University of California, Santa Barbara, CA, USA (zexi_huang@cs.ucsb.edu)
Mert Kosan*, University of California, Santa Barbara, CA, USA (mertkosan@cs.ucsb.edu)
Sourav Medya, University of Illinois, Chicago, IL, USA (medya@uic.edu)
Sayan Ranu, Indian Institute of Technology, Delhi, India (sayanranu@cse.iitd.ac.in)
Ambuj Singh, University of California, Santa Barbara, CA, USA (ambuj@cs.ucsb.edu)
ABSTRACT

Graph neural networks (GNNs) find applications in various domains such as computational biology, natural language processing, and computer security. Owing to their popularity, there is an increasing need to explain GNN predictions since GNNs are black-box machine learning models. One way to address this is counterfactual reasoning, where the objective is to change the GNN prediction by minimal changes in the input graph. Existing methods for counterfactual explanation of GNNs are limited to instance-specific local reasoning. This approach has two major limitations: it cannot offer global recourse policies, and it overloads human cognitive ability with too much information. In this work, we study the global explainability of GNNs through global counterfactual reasoning. Specifically, we want to find a small set of representative counterfactual graphs that explains all input graphs. Towards this goal, we propose GCFExplainer, a novel algorithm powered by vertex-reinforced random walks on an edit map of graphs with a greedy summary. Extensive experiments on real graph datasets show that the global explanation from GCFExplainer provides important high-level insights into the model behavior and achieves a 46.9% gain in recourse coverage and a 9.5% reduction in recourse cost compared to the state-of-the-art local counterfactual explainers.
CCS CONCEPTS

• Computing methodologies → Causal reasoning and diagnostics; • Theory of computation → Graph algorithms analysis.
KEYWORDS
Counterfactual explanation; Graph neural networks
ACM Reference Format:
Zexi Huang, Mert Kosan, Sourav Medya, Sayan Ranu, and Ambuj Singh. 2023.
Global Counterfactual Explainer for Graph Neural Networks. In Proceedings
of the Sixteenth ACM International Conference on Web Search and Data Mining
(WSDM ’23), February 27-March 3, 2023, Singapore, Singapore. ACM, New
York, NY, USA, 9 pages. https://doi.org/10.1145/3539597.3570376
*Both authors contributed equally to this research.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
WSDM ’23, February 27-March 3, 2023, Singapore, Singapore
©2023 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9407-9/23/02.
https://doi.org/10.1145/3539597.3570376
1 INTRODUCTION
Graph Neural Networks (GNNs) [9, 15, 27, 38, 42, 43] are being used in many domains such as drug discovery [12], chip design [25], combinatorial optimization [21], physical simulations [3, 37], and event prediction [8, 17, 22]. Taking the graph(s) as input, GNNs are trained to perform various downstream tasks that form the core of many real-world applications. For example, graph classification has been applied to predict whether a drug would exhibit the desired chemical activity [12]. Similarly, node prediction is used to predict the functionality of proteins in protein-protein interaction networks [5] and categorize users into roles on social networks [45].
Despite the impressive success of GNNs on predictive tasks,
GNNs are black-box machine learning models. It is non-trivial to
explain or reason why a particular prediction is made by a GNN.
Explainability of a prediction model is important to understand its
shortcomings and identify areas for improvement. In addition, the
ability to explain a model is critical towards making it trustworthy.
Owing to this limitation of GNNs, there have been significant efforts in recent times towards explanation approaches.
Existing work on explaining GNN predictions can be categorized mainly in two directions: 1) factual reasoning [20, 40, 46, 47], and 2) counterfactual reasoning [1, 2, 19, 36]. Generally speaking, the methods in the first category aim to find an important subgraph that correlates most with the underlying GNN prediction. In contrast, the methods with counterfactual reasoning attempt to identify the smallest amount of perturbation on the input graph that changes the GNN's prediction, for example, removal/addition of edges or nodes. Compared to factual reasoning, counterfactual explainers have the additional advantage of providing the means for recourse [39].
For example, in the applications of drug discovery [12, 44], mutagenicity is an adverse property of a molecule that hampers its potential to become a marketable drug [13]. In Figure 1, formaldehyde is classified by a GNN to be mutagenic. Factual explainers can attribute the subgraph containing the carbon-hydrogen bond to the cause of mutagenicity, while counterfactual explainers provide an effective way (i.e., a recourse) to turn formaldehyde into formic acid, which is non-mutagenic, by replacing a hydrogen atom with a hydroxyl.

Figure 1: Formaldehyde (a) is classified by a GNN to be an undesired mutagenic molecule, with its important subgraph found by factual reasoning highlighted in red. Formic acid (b) is its non-mutagenic counterfactual example, obtained by removing one edge and adding one node and two edges.
In this work, we focus on counterfactual explanations. Our work is based on the observation that existing counterfactual explainers [20, 40, 46, 47] for graphs take a local perspective, generating counterfactual examples for individual input graphs. However, this approach has two key limitations:
Lack of global insights: It is desirable to offer insights that generalize across a multitude of data graphs. For example, instead of providing formic acid as a counterfactual example to formaldehyde, we can summarize global recourse rules such as: given any molecule with a carbonyl group (carbon-oxygen double bond), it needs a hydroxyl to be non-mutagenic. This focus on global counterfactual explanation promises to provide higher-level insights that are complementary to those obtained from local counterfactual explanations.
Information overload: The primary motivation behind counterfactual analysis is to provide human-intelligible explanations. With this objective, consider real-world graph datasets that routinely contain thousands to millions of graphs. Owing to instance-specific counterfactual explanations, the number of counterfactual graphs grows linearly with the graph dataset size. Consequently, the sheer volume of counterfactual graphs overloads human cognitive ability to process this information. Hence, the initial motivation of providing human-intelligible insights is lost if one does not obtain a holistic view of the counterfactual graphs.
Contributions: In this paper, we study the problem of model-agnostic, global counterfactual explanations of GNNs for graph classification. More specifically, given a graph dataset, our goal is to counterfactually explain the largest number of input graphs with a small number of counterfactuals. As we will demonstrate later in our experiments, this formulation naturally forces us to remove redundancy from instance-specific counterfactual explanations and hence has higher information density. Algorithmically, the proposed problem introduces new challenges. We theoretically establish that the proposed problem is NP-hard. Furthermore, the space of all possible counterfactual graphs itself is exponential. Our work overcomes these challenges and makes the following contributions:
Novel formulation: We formulate the novel problem of global counterfactual reasoning/explanation of GNNs for graph classification. In contrast to existing works on counterfactual reasoning that only generate instance-specific examples, we provide an explanation of the global behavior of the model.
Algorithm design: While the problem is NP-hard, we propose GCFExplainer, which organizes the exponential search space as an edit map. We then perform vertex-reinforced random walks on it to generate diverse, representative counterfactual candidates, which are greedily summarized as the global explanation (see the sketch after this list).
Experiments: We conduct extensive experiments on real-world datasets to validate the effectiveness of the proposed method. Results show that GCFExplainer not only provides important high-level insights on the model behavior but also outperforms state-of-the-art baselines related to counterfactual reasoning in various recourse quality metrics.
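To make the greedy summarization step concrete, the sketch below shows one way such a summary could be built: repeatedly pick the candidate counterfactual that covers the most not-yet-covered input graphs. This is a minimal illustration, not the authors' implementation; the names greedy_summary and covers are hypothetical, and covers stands for any coverage predicate, e.g., the normalized GED between a candidate and an input graph being within a budget.

```python
# Minimal sketch of greedy coverage-based summarization (hypothetical
# names, not the paper's implementation). `covers(c, g)` is any
# user-supplied predicate, e.g., normalized GED between c and g <= theta.

def greedy_summary(candidates, input_graphs, covers, k):
    """Pick up to k candidates that greedily maximize coverage."""
    uncovered = set(range(len(input_graphs)))
    summary = []
    while uncovered and len(summary) < k:
        # Choose the candidate covering the most still-uncovered inputs.
        best, best_gain = None, set()
        for c in candidates:
            gain = {i for i in uncovered if covers(c, input_graphs[i])}
            if len(gain) > len(best_gain):
                best, best_gain = c, gain
        if not best_gain:
            break  # no remaining candidate covers anything new
        summary.append(best)
        uncovered -= best_gain
    return summary
```

Since maximizing coverage with a fixed summary size is a monotone submodular objective, this greedy scheme carries the standard (1 − 1/e) approximation guarantee for maximum coverage, which makes greedy summarization a natural fit despite the NP-hardness of the exact problem.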
2 GLOBAL COUNTERFACTUAL EXPLANATIONS

This section introduces the global counterfactual explanation (GCE) problem for graph classification. We start with the background on local counterfactual reasoning. Then, we propose a representation of the global recourse rule that provides a high-level counterfactual understanding of the classifier behavior. Finally, we introduce quality measures for recourse rules and formally define the GCE problem.
2.1 Local Counterfactual

Consider a graph $G = (V, E)$, where $V$ and $E$ are the sets of (labelled) nodes and edges respectively. A (binary) graph classifier (e.g., a GNN) $\phi$ classifies $G$ into either the undesired class ($\phi(G) = 0$) or the desired one ($\phi(G) = 1$). An explanation of $\phi$ seeks to answer how these predictions are made. Those based on factual reasoning analyze what properties $G$ possesses to be classified in the current class, while those based on counterfactual reasoning find what properties $G$ needs to be assigned to the opposite class.

Existing counterfactual explanation methods take a local perspective. Specifically, for each input graph $G$, they find a counterfactual (graph) $C$ that is somewhat similar to $G$ but is assigned to a different class. Without loss of generality, let $G$ belong to the undesired class, i.e., $\phi(G) = 0$; then the counterfactual $C$ satisfies $\phi(C) = 1$. The similarity between $C$ and $G$ is quantified by a predefined distance metric $d$, for example, the number of added/removed edges [2, 19].
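To fix ideas, here is a deliberately naive sketch of what such a local search could look like when $d$ counts edge edits. It assumes a black-box classifier phi mapping a networkx graph to {0, 1}; actual local explainers [2, 19] search far more efficiently, so this brute-force version is for illustration only.

```python
import itertools
import networkx as nx

def local_counterfactual(G, phi, max_edits=3):
    """Breadth-first search over single-edge edits for a graph C with
    phi(C) == 1, starting from an input graph G with phi(G) == 0.
    `phi` is a placeholder for any black-box graph classifier; this
    brute-force sketch grows exponentially and is illustrative only."""
    frontier = [G]
    for _ in range(max_edits):
        next_frontier = []
        for H in frontier:
            # One-edit neighbors: toggle each possible edge of H.
            for u, v in itertools.combinations(H.nodes, 2):
                C = H.copy()
                if C.has_edge(u, v):
                    C.remove_edge(u, v)
                else:
                    C.add_edge(u, v)
                if phi(C) == 1:
                    return C  # counterfactual found
                next_frontier.append(C)
        frontier = next_frontier
    return None  # no counterfactual within the edit budget
```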
In our work, we consider the graph edit distance (GED) [33], a more general distance measure, as the distance function to account for other types of changes. Specifically, $\mathrm{GED}(G_1, G_2)$ counts the minimum number of "edits" to convert $G_1$ to $G_2$. An "edit" can be the addition or removal of edges and nodes, or a change of node labels (see Figure 2). Moreover, to account for graphs of different sizes, we normalize the GED by the sizes of the graphs: $\overline{\mathrm{GED}}(G_1, G_2) = \mathrm{GED}(G_1, G_2) / (|V_1| + |V_2| + |E_1| + |E_2|)$. Nonetheless, our method can be applied with other graph distance metrics, such as those based on graph kernels (e.g., RW [4], NSPDK [6], WL [35]).
Figure 2: Edits between graphs: node/edge addition, node/edge removal, and node label change.
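As a rough illustration, the normalized GED above can be computed for small graphs with networkx's exact graph_edit_distance solver. This is a sketch under the assumption that node labels live in a 'label' attribute; exact GED is NP-hard and scales poorly, so practical systems would substitute approximate solvers.

```python
import networkx as nx

def normalized_ged(G1, G2):
    """Normalized graph edit distance as defined above: the raw GED
    divided by the total number of nodes and edges in both graphs.
    Assumes node labels are stored under the 'label' attribute."""
    ged = nx.graph_edit_distance(
        G1, G2,
        node_match=lambda a, b: a.get("label") == b.get("label"),
    )
    return ged / (G1.number_of_nodes() + G2.number_of_nodes()
                  + G1.number_of_edges() + G2.number_of_edges())

# Toy example in the spirit of Figure 1: formaldehyde vs. formic acid
G1 = nx.Graph()
G1.add_nodes_from([(0, {"label": "C"}), (1, {"label": "O"}),
                   (2, {"label": "H"}), (3, {"label": "H"})])
G1.add_edges_from([(0, 1), (0, 2), (0, 3)])
G2 = nx.Graph()
G2.add_nodes_from([(0, {"label": "C"}), (1, {"label": "O"}),
                   (2, {"label": "O"}), (3, {"label": "H"}), (4, {"label": "H"})])
G2.add_edges_from([(0, 1), (0, 2), (0, 3), (2, 4)])
print(normalized_ged(G1, G2))
```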
The distance function measures the quality of the counterfactual found by the explanation model. Ideally, the counterfactual $C$ should be very close to the input graph $G$ while belonging to a different class. Formally, we define the counterfactuals that are within a certain distance $\theta$ from the input graph as close counterfactuals.

Definition 1 (Close Counterfactual). Given the GNN classifier $\phi$, distance parameter $\theta$, and an input graph $G$ with undesired prediction $\phi(G) = 0$, a graph $C$ is a close counterfactual of $G$ if $\phi(C) = 1$ and $\overline{\mathrm{GED}}(G, C) \le \theta$.
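A direct transcription of Definition 1 as a predicate might look like the sketch below, reusing the hypothetical normalized_ged helper from the earlier snippet and a black-box classifier phi.

```python
def is_close_counterfactual(G, C, phi, theta):
    """Definition 1 as a predicate: C is a close counterfactual of an
    undesired-class input G if the classifier assigns C the desired
    class and C lies within normalized GED theta of G. Relies on the
    normalized_ged sketch defined above."""
    return phi(C) == 1 and normalized_ged(G, C) <= theta
```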