The Devil is in the Conflict: Disentangled
Information Graph Neural Networks For Fraud
Detection
Zhixun Li1,†, Dingshuo Chen2,3,†, Qiang Liu2,3, Shu Wu2,3,*
1School of Computer Science and Technology, Beijing Institute of Technology
2Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition,
Institute of Automation, Chinese Academy of Sciences
3School of Artificial Intelligence, University of Chinese Academy of Sciences
lizhixun@bit.edu.cn, dingshuo.chen@cripac.ia.ac.cn, {qiang.liu, shu.wu}@nlpr.ia.ac.cn
Abstract—Graph-based fraud detection has received
considerable attention. Owing to the great success of Graph
Neural Networks (GNNs), approaches that adopt GNNs for
fraud detection have been gaining momentum. However, most
existing methods are based on the strong inductive bias of
homophily, which assumes that neighboring nodes tend to
have the same labels or similar features. In real scenarios, fraudsters
often engage in camouflage behaviors to evade detection
systems. As a result, the homophily assumption no longer holds,
which is known as the inconsistency problem. In this paper, we
argue that the performance degradation is mainly attributed to
the inconsistency between topology and attribute. To address
this problem, we propose to disentangle the fraud network
into two views, corresponding to topology and attribute,
respectively. We then propose a simple and effective method
that uses an attention mechanism to adaptively fuse the two views,
capturing data-specific preferences. In addition, we further
improve it by introducing mutual information constraints for
topology and attribute. To this end, we propose a Disentangled
Information Graph Neural Network (DIGNN) model, which
utilizes variational bounds to find an approximate solution to our
proposed optimization objective function. Extensive experiments
demonstrate that our model can significantly outperform state-
of-the-art baselines on real-world fraud detection datasets.
Index Terms—Graph Neural Networks, Fraud Detection, In-
formation Theory
I. INTRODUCTION
Graph-based fraud detection is a crucial task with a
tremendous impact on various applications, such as opinion
fraud detection [1], fake news detection [2], [3], review spam
detection [4], and financial fraud detection [5], [6]. In these scenarios,
because graphs can effectively model the correlations among entities,
interactive activities on a platform can be characterized as a
graph, where users or objects are treated as nodes and
transactions or relations between them are treated as edges.
Numerous techniques have been proposed to detect
fraudsters. Recently, driven by the powerful representation
capability of graph structure and advances in Graph Neural
Networks (GNNs) [7]–[9], many approaches try to harness
GNNs for fraud detection on either homogeneous or heterogeneous
graphs. The main idea is to leverage GNNs to learn
expressive node representations with the goal of distinguishing
abnormal nodes from normal ones in the latent embedding
space. Message-Passing GNNs (MP-GNNs), which aggregate
neighbor node features and achieve local smoothing by stacking
layers, have become mainstream in recent years. Although
MP-GNNs obtain satisfactory performance in most
cases, the strong inductive bias of homophily limits their
representational ability on heterophilic graphs. Prior work [10]
points out that many GNNs can be seen as low-pass filters,
so they generalize poorly to high-frequency graph signals.
In fraud detection, fraudsters often imitate
normal users to camouflage themselves, and hence they
interact with normal users more frequently. For instance,
normal users account for 81% of the fraudsters’ neighbor
nodes in the YelpChi dataset (Figure 1). In other words, fraudsters’
features are inconsistent with their behaviors (interactions,
i.e., topological structure). Because MP-GNNs do
not work well on heterophilic graphs, they fail to handle the
inconsistency phenomenon in graph-based fraud detection, and
fraudsters can fool the detection system.
†The first two authors contributed equally to this work.
*Corresponding author.
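To make the heterophily argument concrete, the following minimal sketch measures, on a small hand-made graph, the share of fraudsters' neighbors that are normal, and then applies one round of mean aggregation to show how a fraudster's feature is smoothed toward its normal neighbors. The toy graph, the scalar features, and the function names `fraud_neighbor_normal_ratio` and `mean_aggregate` are assumptions made for illustration only; they are not taken from YelpChi or from any particular GNN implementation.

```python
from collections import defaultdict


def fraud_neighbor_normal_ratio(edges, labels):
    """Share of fraudsters' neighbor slots that are filled by normal nodes.

    edges  : list of undirected (u, v) pairs
    labels : dict node -> 1 (fraudster) or 0 (normal)
    """
    normal, total = 0, 0
    for u, v in edges:
        # An undirected edge contributes one neighbor slot to each endpoint.
        for src, dst in ((u, v), (v, u)):
            if labels[src] == 1:
                total += 1
                if labels[dst] == 0:
                    normal += 1
    return normal / total if total else 0.0


def mean_aggregate(edges, features):
    """One step of mean aggregation over each node and its neighbors."""
    neighbors = defaultdict(list)
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    smoothed = {}
    for node, x in features.items():
        values = [x] + [features[n] for n in neighbors[node]]
        smoothed[node] = sum(values) / len(values)
    return smoothed


# Hypothetical toy graph: nodes 0 and 4 are fraudsters, the rest are normal.
labels = {0: 1, 1: 0, 2: 0, 3: 0, 4: 1, 5: 0}
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
# Scalar feature that separates the classes perfectly before aggregation.
features = {node: float(label) for node, label in labels.items()}

ratio = fraud_neighbor_normal_ratio(edges, labels)
smoothed = mean_aggregate(edges, features)
print(f"normal share among fraudsters' neighbors: {ratio:.2f}")
print(f"fraudster 0 feature after aggregation: {smoothed[0]:.2f}")
```

In this toy example the fraudster with the most normal neighbors loses most of its distinguishing signal after a single aggregation step, which is the low-pass behavior described above.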
Recently, a few works have noticed this problem. They
employ aggregation weights to reduce the adverse impact
of dissimilar neighbors, or set similarity-aware thresholds to
select and re-link similar nodes. For instance, GraphConsis
[11] computes a consistency score between connected node pairs
and uses it as the sampling probability. PC-GNN [6] combines label
information and latent embeddings into a distance function to
measure similarity. Although such methods can alleviate the
inconsistency problem to some extent, they discard a lot of
information when filtering out dissimilar neighbors, and thus
may lead to sub-optimal performance.
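The information loss caused by neighbor filtering can be illustrated with a generic similarity threshold. The sketch below is not the scoring rule of GraphConsis or PC-GNN, which are not reproduced here; it assumes plain cosine similarity on hypothetical two-dimensional features and a hypothetical threshold of 0.5, and simply reports which neighbors would be discarded.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_neighbors(center, neighbors, feats, threshold):
    """Split neighbors into those kept and those dropped by the threshold."""
    kept, dropped = [], []
    for n in neighbors:
        if cosine(feats[center], feats[n]) >= threshold:
            kept.append(n)
        else:
            dropped.append(n)
    return kept, dropped


# Hypothetical features: node 1 resembles the center node 0, nodes 2 and 3 do not.
feats = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.0, 1.0],
    3: [0.1, 0.9],
}
kept, dropped = filter_neighbors(0, [1, 2, 3], feats, threshold=0.5)
print("kept:", kept)
print("dropped:", dropped)
```

Here two of the three neighbors are removed before aggregation, so whatever signal their features and connections carry is no longer available to the model.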
In this paper, we analyze the inconsistency problem in
graph-based fraud detection, which has been obstructing
a full understanding of this field. First, we clarify that
the inconsistency problem is the bottleneck of graph fraud
detection. According to [12], the underlying optimization
process of GNNs is equivalent to minimizing the topology
arXiv:2210.12384v1 [cs.LG] 22 Oct 2022