The Devil is in the Conflict: Disentangled Information Graph Neural Networks For Fraud Detection

Zhixun Li1,†, Dingshuo Chen2,3,†, Qiang Liu2,3, Shu Wu2,3,*

1School of Computer Science and Technology, Beijing Institute of Technology

2Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences

3School of Artificial Intelligence, University of Chinese Academy of Sciences

lizhixun@bit.edu.cn, dingshuo.chen@cripac.ia.ac.cn, {qiang.liu, shu.wu}@nlpr.ia.ac.cn

Abstract—Graph-based fraud detection has received considerable attention. Owing to the great success of Graph Neural Networks (GNNs), many approaches that adopt GNNs for fraud detection have been gaining momentum. However, most existing methods rely on the strong inductive bias of homophily, which assumes that neighboring nodes tend to have the same labels or similar features. In real scenarios, fraudsters often engage in camouflage behaviors to evade detection systems. The homophily assumption therefore no longer holds, which is known as the inconsistency problem. In this paper, we argue that the performance degradation is mainly attributed to the inconsistency between topology and attribute. To address this problem, we propose to disentangle the fraud network into two views, corresponding to topology and attribute respectively. We then propose a simple and effective method that uses the attention mechanism to adaptively fuse the two views, which captures data-specific preference. In addition, we further improve it by introducing mutual information constraints for topology and attribute. To this end, we propose a Disentangled Information Graph Neural Network (DIGNN) model, which utilizes variational bounds to find an approximate solution to our proposed optimization objective. Extensive experiments demonstrate that our model significantly outperforms state-of-the-art baselines on real-world fraud detection datasets.

Index Terms—Graph Neural Networks, Fraud Detection, Information Theory

I. INTRODUCTION

Graph-based fraud detection is a crucial task with tremendous impact in various applications, such as opinion fraud detection [1], fake news detection [2], [3], review spam detection [4] and financial fraud detection [5], [6]. Because graphs can effectively model the correlations among entities, the interactions on a platform in these scenarios can be characterized as a graph, where users or objects are treated as nodes, and transactions or relations between them are treated as edges.

†The first two authors contributed equally to this work.

*Corresponding author.

Numerous techniques have been proposed to detect fraudsters. Recently, driven by the powerful representation capability of graph structure and advances in Graph Neural Networks (GNNs) [7]–[9], many approaches try to harness GNNs for fraud detection on either homogeneous or heterogeneous graphs. The main idea is to leverage GNNs to learn expressive node representations, with the goal of distinguishing abnormal nodes from normal ones in the latent embedding space. Message-Passing GNNs (MP-GNNs), which aggregate neighbor node features and achieve local smoothing by stacking layers, have become mainstream in recent years. Although MP-GNNs obtain satisfactory performance in most cases, the strong inductive bias of homophily limits their representational ability on heterophilic graphs. Some works [10] point out that many GNNs can be seen as low-pass filters, so their generalization ability on high-frequency graph signals is poor. In the fraud detection task, fraudsters often imitate normal users in order to camouflage themselves, hence they interact with normal users more frequently. For instance, normal users account for 81% of the fraudsters' neighbor nodes in the YelpChi dataset (Figure 1). In other words, fraudsters' features are inconsistent with their behaviors (interactions, i.e., topological structure). Because MP-GNNs do not work well on heterophilic graphs, they fail to tackle the inconsistency phenomenon in graph-based fraud detection, and fraudsters can fool the detection system.
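The neighbor statistic quoted above (the share of a fraudster's neighbors that are normal users) can be computed directly from an edge list and node labels. The following is a minimal sketch on a hypothetical toy graph; the function name `neighbor_label_share`, the label encoding (0 for benign, 1 for fraudster) and the data are illustrative assumptions, not taken from the paper or from YelpChi.

```python
from collections import defaultdict


def neighbor_label_share(edges, labels):
    """For each node class, return the share of neighbor slots held by each class.

    edges  : iterable of undirected (u, v) pairs
    labels : dict mapping node id -> 0 (benign) or 1 (fraudster); encoding is an assumption
    """
    counts = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        # An undirected edge contributes one neighbor slot to each endpoint.
        counts[labels[u]][labels[v]] += 1
        counts[labels[v]][labels[u]] += 1
    shares = {}
    for cls, row in counts.items():
        total = sum(row.values())
        shares[cls] = {nbr: n / total for nbr, n in row.items()}
    return shares


# Hypothetical toy graph: fraudster 0 links mostly to benign users (camouflage).
labels = {0: 1, 1: 0, 2: 0, 3: 0, 4: 0, 5: 1}
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 2), (3, 4)]

shares = neighbor_label_share(edges, labels)
print("neighbor classes seen from fraudsters:", shares[1])
print("neighbor classes seen from benign users:", shares[0])
```

A high benign share in the fraudster row is the heterophily that the paragraph describes: a layer that averages neighbor features will pull a fraudster's representation toward those of benign users.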

Recently, a few works have noticed this problem. They either use aggregation weights to reduce the adverse impact of dissimilar neighbors, or set similarity-aware thresholds to select and re-link similar nodes. For instance, GraphConsis [11] computes a consistency score between connected node pairs and uses it as the sampling probability. PC-GNN [6] combines label information and latent embeddings into a distance function to measure similarity. Although such methods can alleviate the inconsistency problem to some extent, they discard a lot of information when filtering out dissimilar neighbors, and thus may lead to sub-optimal performance.
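The information loss mentioned above is easy to see in a generic threshold filter. The sketch below is not the scoring rule of GraphConsis or PC-GNN; it assumes plain cosine similarity on raw features, and the names `filter_neighbors`, `threshold` and the toy feature vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def filter_neighbors(center, neighbors, features, threshold):
    """Keep neighbors whose similarity to the center node reaches the threshold."""
    kept = [n for n in neighbors
            if cosine(features[center], features[n]) >= threshold]
    dropped = [n for n in neighbors if n not in kept]
    return kept, dropped


# Hypothetical 2-d features for a center node "u" and three neighbors.
features = {
    "u": [1.0, 0.0],
    "a": [0.9, 0.1],   # similar to u
    "b": [0.0, 1.0],   # orthogonal to u
    "c": [-1.0, 0.2],  # nearly opposite to u
}
kept, dropped = filter_neighbors("u", ["a", "b", "c"], features, threshold=0.5)
print("kept:", kept)
print("dropped:", dropped)
```

Everything in `dropped` is excluded from aggregation, so whatever signal those edges carried (for example, that a node connects to many dissimilar users) is no longer available to the model.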

In this paper, we analyze the inconsistency problem in the graph-based fraud detection task, which has been obstructing a full understanding of this field. First, we clarify that the inconsistency problem is the bottleneck of graph fraud detection. According to [12], the underlying optimization process of GNNs is equivalent to minimizing the topology and attribute constraints, and Yang et al. [13] indicate that the degradation of performance is attributable to the compromise between topology and attribute. Because the camouflage behaviors (topology) of fraudsters are inconsistent with their essence (attribute), this conflict in fraud networks may impair the discriminative ability of GNNs. Second, the emphases of different datasets are diverse, and most existing methods are not satisfactory at fusing topological structures and node attributes [14]. For example, fraudsters may possess distinguishable attributes on some platforms, but their deceptive behaviors can confuse the detection model. Therefore, we are motivated to explore a novel method that minimizes the conflict between topology and attribute while effectively extracting the most task-relevant information from datasets.

arXiv:2210.12384v1 [cs.LG] 22 Oct 2022

We bring the concept of multi-view learning to the graph-based fraud detection task and propose a simple and effective model, Disentangled Information Graph Neural Networks (DIGNN). Technically, we first disentangle fraud networks into topology and attribute views. Next, we employ an attention mechanism to fuse the two view embeddings adaptively and extract task-relevant information. Surprisingly, we observe that this simple method surpasses all state-of-the-art baselines, which empirically supports the view that the conflict between topology and attribute causes the inconsistency problem. Besides, to further decrease the entanglement between topology and attribute and improve performance, we design a new optimization objective based on information theory, which resorts to variational bounds to minimize the mutual information between the two views and maximize the mutual information between view embeddings and original inputs.
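The fusion step described above can be illustrated with a small attention sketch for a single node. This is a minimal illustration under stated assumptions, not the authors' implementation: the scoring rule (a dot product between a query vector `q` and the tanh of each view embedding), the name `fuse_views`, and the example vectors are all hypothetical, and in a real model `q` and the view encoders would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(z_topo, z_attr, q):
    """Fuse one node's topology-view and attribute-view embeddings.

    z_topo, z_attr : embeddings of the same node from the two views
    q              : hypothetical query vector that scores each view
    Returns the fused embedding and the per-view attention weights.
    """
    views = [z_topo, z_attr]
    # Score each view by projecting a squashed copy of its embedding onto q.
    scores = [sum(qi * math.tanh(zi) for qi, zi in zip(q, z)) for z in views]
    alpha = softmax(scores)  # one weight per view, summing to 1
    # The fused embedding is the attention-weighted sum of the two views.
    fused = [sum(a * z[k] for a, z in zip(alpha, views)) for k in range(len(q))]
    return fused, alpha


# Hypothetical embeddings for one node.
z_topo = [0.2, -0.1, 0.0]
z_attr = [1.5, 0.8, -0.3]
q = [1.0, 1.0, 1.0]

fused, alpha = fuse_views(z_topo, z_attr, q)
print("attention weights (topology, attribute):", alpha)
print("fused embedding:", fused)
```

Because the weights are computed from the data rather than fixed, a dataset in which attributes are more discriminative can lean on the attribute view, and one in which structure is more informative can lean on the topology view, which is one way to read the "data-specific preference" mentioned in the abstract.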

We conduct extensive experiments to compare our proposed model with existing graph-based fraud detection models, and the results demonstrate the effectiveness of our model. In summary, the contributions of this paper are as follows:

• We analyze the cause of the inconsistency problem and point out that it is mainly attributed to the conflict between topology and attribute. In light of this, we propose a simple yet effective model, DIGNN, which first disentangles the fraud network into two views and then fuses them with an attention mechanism.

• We propose a novel optimization objective based on mutual information and theoretically derive its upper bound for tractable calculation.

• We verify the effectiveness of our model on real-world fraud detection datasets. The results show that our model significantly improves performance in terms of all commonly adopted metrics.
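The objective named in the second contribution can be written schematically. The form below is only a sketch of the two directions stated in the introduction (less mutual information between the two views, more between each view embedding and its input); the symbols $Z_T$, $Z_A$ (topology and attribute view embeddings), $A$ (adjacency), $X$ (node attributes), the parameters $\theta$ and the trade-off weight $\beta$ are assumptions introduced here for illustration, and the paper's actual objective and its variational bound may differ.

```latex
% Schematic only: symbols and the weight \beta are assumptions, not the paper's notation.
\min_{\theta}\;
  \underbrace{I(Z_T;\, Z_A)}_{\text{entanglement between views}}
  \;-\; \beta \,\bigl[\, I(Z_T;\, A) \;+\; I(Z_A;\, X) \,\bigr]
```

Mutual information terms of this kind are generally intractable to compute exactly, which is why a tractable bound is needed in practice.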

II. RELATED WORK

A. Graph-based Fraud Detection

The core idea of the graph-based fraud detection task is to take advantage of GNNs to obtain discriminative node embeddings and identify the malicious ones in the latent space. Examples include [11], [15], [16] for review fraud detection, [2], [3] for fake news detection and [5], [6], [17]–[19] for financial fraud detection. Ma et al. [20] provide a comprehensive investigation of graph-based fraud detection.

Fig. 1. (a) Illustration of graph-based fraud detection (legend: benign, fraudster, heterophilic relation, homophilic relation). (b) Neighbor distribution of fraudsters and benign users in the YelpChi dataset.

Most existing GNN methods hold the homophily assumption that neighboring nodes share the same labels or similar features. However, fraudsters try to conceal themselves, so their features are inconsistent with their camouflage behaviors. Some graph-based fraud detection works have noticed this problem. GraphConsis [11] is the first to formulate and tackle the inconsistency problem, and introduces three kinds of inconsistency that exist in fraud networks. CARE-GNN [15] devises a label-aware similarity measure to find informative neighboring nodes and utilizes reinforcement learning to select similar neighbors. FRAUDRE [21] aggregates the differences between adjacent node pairs. PC-GNN [6] devises a choose operation to select beneficial neighbors based on feature similarity. IHGAT [22] encodes both sequence-like intentions and relationships among transactions to leverage cross-interaction information.

Our model differs from all of the above works. Instead of measuring similarity between adjacent node pairs, we disentangle topology and attribute and treat graph learning as a multi-view learning problem.

B. Multi-view on GNNs

Topology and attribute are two essential components of graphs. However, existing state-of-the-art GNN models are unable to effectively fuse topological structure and node attributes. AM-GCN [14] uses k-nearest neighbors to construct a feature graph and combines it with the topological structure view and common embeddings. SCRL [23] designs a self-supervised approach to maximize the agreement between the embeddings in the topology graph and the feature graph. A recent work [13] claims that the interference between topology and attribute is mainly ascribed to compromises between them. LINKX [24] processes node attributes and topological structure in an orthogonal manner. In this paper, we also follow
