

An Attention-based Long Short-Term Memory Framework for Detection of Bitcoin Scams

Puyang Zhao∗, Wei Tian∗, Lefu Xiao∗, Xinhui Liu∗, Jingjin Wu∗†

∗Department of Statistics and Data Science, BNU-HKBU United International College, Zhuhai, Guangdong, P. R. China

†Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science

Email: puyangzhao.27@gmail.com; s230202702@mail.uic.edu.cn; p930005056@mail.uic.edu.cn;

xinhui liu@outlook.com; jj.wu@ieee.org

Abstract—Bitcoin is the most common cryptocurrency involved in cyber scams. Cybercriminals often exploit the pseudonymity and privacy protection mechanisms associated with Bitcoin transactions to make their scams virtually untraceable. Among fraudulent Bitcoin activities, the Ponzi scheme has attracted particularly significant attention. This paper considers a multi-class classification problem to determine whether a transaction is involved in a Ponzi scheme, is involved in another cyber scam, or is a non-scam transaction. We design a dedicated crawler to collect data and propose a novel Attention-based Long Short-Term Memory (A-LSTM) method for the classification problem. The experimental results show that the proposed model has better efficiency and accuracy than existing approaches, including Random Forest, Extra Trees, Gradient Boosting, and classical LSTM. With correctly identified scam features, our proposed A-LSTM achieves an F1-score over 82% for the original data and outperforms the existing approaches.

Index Terms—Bitcoin, Blockchain, Data mining, Attention-based LSTM, Fraud detection, Multi-class classification

I. INTRODUCTION

Bitcoin is the first decentralized cryptocurrency. As of October 2021, Bitcoin had a market share of around 45%, the highest among all cryptocurrencies [1], and it is expected to continue dominating the crypto market for the foreseeable future. This paper studies techniques to detect cybercrime activities conducted through Bitcoin.

There are multiple forms of cybercrime involving Bitcoin transactions, such as Ponzi schemes, cryptojacking, and e-mail fraud. Among these, Ponzi schemes represent one of the most prevalent types of cybercrime. Statistics show that almost $7 billion was generated in cryptocurrency revenue by Ponzi schemes in 2019, nearly twice the amount generated by all other cyber fraud categories combined in 2020 [2].

A general trend is that more and more investors are becoming victims of cyber scams involving cryptocurrencies due to inadequate and ineffective intervention and prevention measures. Thus, one of the essential steps is to detect cyber scams in their early stages to ensure the proper functioning of the cyber society. In this paper, we classify all Bitcoin transactions into three categories: 1) transactions involved in a Ponzi scheme, 2) transactions involved in other types of scams, or 3) normal non-scam transactions, with the aim of preventing scams in advance or detecting them at an early stage of the fraud.

∗Corresponding author.

This work is partly supported by Zhuhai Basic and Applied Basic Research Foundation Grant ZH22017003200018PWC, partly supported by the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Project code 2022B1212010006, and in part by Guangdong Higher Education Upgrading Plan (2021-2025) UIC R0400001-22.

In this paper, we develop a framework that can accurately detect Ponzi schemes and other scams conducted through Bitcoin transactions with a novel deep learning method called attention-based Long Short-Term Memory (A-LSTM). The main contributions are summarized as follows.

• We design a crawler that automatically collects information on Bitcoin transactions potentially involving scams from known Bitcoin addresses, so that we obtain firsthand information. The crawler automatically parses websites based on a dictionary of Ponzi-related words such as “Ponzi”, “profit”, “HYI”, “multiplier”, “investment”, and “MLM”. With the crawler, we collect a number of Bitcoin addresses that initiated transactions, and then build a dataset considerably larger than those used in similar existing studies.

• From the transaction information, we study the features that distinguish normal transactions from those involving cyber scams. We identify the five most influential features in Bitcoin scam detection, providing insights into the detection of such scams: (i) active days; (ii) number of outs; (iii) input number; (iv) the total number of BTC spent; (v) number of addresses received. These features are explained in detail later.

• We adopt the A-LSTM mechanism, which suits the features of our constructed dataset, to classify the transactions in our framework. We compare the performance of our proposed A-LSTM approach with four popular supervised learning models, namely Random Forest [3], Extra Trees [4], Gradient Boosting [5], and classical LSTM [6]. We also integrate resampling methods with each of these methods, aiming to solve the imbalance problem in the dataset. We demonstrate that, while resampling is a traditional method for solving the imbalance problem, it is not applicable to the A-LSTM model. This is because the resampling method would introduce a large amount of [...] resampling gives even better results than other methods with resampling.

arXiv:2210.14408v1 [cs.CR] 26 Oct 2022

Fig. 1 presents an overview of our proposed framework. The rest of this paper is organized as follows. Section II summarizes the existing related works. Section III describes our methodologies for identifying and collecting addresses of Ponzi schemes, and for constructing a dataset of Ponzi scheme related features. Section IV describes the steps of the proposed classification approaches in detail. Section V compares the effectiveness of the strategies in terms of correctly classifying transactions. Finally, Section VI concludes the paper and provides some potential future research directions.

Fig. 1. Workflow of our Bitcoin scam detection framework

II. RELATED WORKS

Since Bitcoin scammers frequently change their IP addresses to evade cyber regulators, it is quite difficult to gather sufficient data to perform meaningful analysis. Most existing Bitcoin scam studies collected the Bitcoin addresses involved through manual or semi-automated searches [7]. However, these methods fail when the fraudulent addresses are not disclosed, for example, in private communication or when a transaction is conducted on the deep or dark web. Manual or semi-automated collection is therefore relatively inefficient when the majority of scams are carried out using hidden addresses. Another consequence is that, although the number of scams and relevant studies is rapidly increasing, few public datasets are available for further analysis. For example, Bartoletti et al. [8] considered a small and imbalanced dataset consisting of 32 Ponzi scam cases and 6,400 normal cases. Some more recent studies (e.g., [9]) combined the datasets of several previous studies, aiming to draw meaningful conclusions from sufficient data. However, a downside of this approach is that the timeliness of the dataset is not preserved, and the features of scams from different periods may not be the same. Therefore, when dealing with these cases, we believe that it is best to use tools that automatically search [8] the Bitcoin blockchain for suspicious behavior and identify new addresses associated with fraudulent activities within a rather short period of time.

In terms of classifiers, the most popular ones used in existing studies include RIPPER [10] and Random Forest (RF). According to [11], RF was the best classifier, obtaining an F1 score of 78.7% for Ponzi schemes. While more data mining methods have been employed to detect Ponzi schemes in recent studies, the highest F1 scores among them have not exceeded that in [11]. A potential reason is that other Bitcoin scams, such as sextortion, blackmail scams, and pyramid schemes [12], are also increasing. These scams share certain features with Ponzi schemes. Existing data mining-based classifiers for this purpose are all binary, which means that they only distinguish whether a transaction is involved in a Ponzi scheme or not, and thus may produce many false positives when detecting Ponzi schemes [9].
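To make the false-positive argument concrete, the following minimal sketch computes a one-vs-rest F1 score for each class of a three-class problem. The helper name `per_class_f1`, the label strings, and the six toy predictions are illustrative assumptions only; they are not data or code from the cited studies.

```python
def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, treating each label in turn as the positive class."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Toy example (hypothetical): one "other scam" transaction is mistaken for a Ponzi scheme.
LABELS = ["ponzi", "other", "normal"]
y_true = ["ponzi", "ponzi", "other", "other", "normal", "normal"]
y_pred = ["ponzi", "ponzi", "ponzi", "other", "normal", "normal"]

scores = per_class_f1(y_true, y_pred, LABELS)
print(scores)
```

In this toy case the single misclassified "other" transaction is a false positive for the Ponzi class, so the Ponzi F1 drops to 0.8 even though every true Ponzi transaction is found. Scams that resemble Ponzi schemes cost a detector precision in this way.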

To the best of our knowledge, our proposed method is the first to add attention mechanisms to identify Bitcoin scams. Detailed explanations for the new mechanism will be provided in later sections.

III. DATA COLLECTION

Our work is arranged as follows. First, we collect Bitcoin addresses that investors could use to send money to scam operators. The next step is to create a set of features relevant to scam classification, compute their values on our addresses, and then train classifiers based on this dataset. After that, we formalize a detection model for Bitcoin scams as a multi-class classification problem, where the task is to distinguish Ponzi schemes, other scams, and non-scam transactions.
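As a rough illustration of the feature-computation step, the sketch below derives a few per-address features from a list of transaction records. Everything here is an assumption made for illustration: the record layout (`day`, `direction`, `peer`, `btc`), the function name `address_features`, and the way each feature is defined. The paper's own feature definitions may differ.

```python
from datetime import date


def address_features(records):
    """Compute simple per-address features from hypothetical transaction records.

    Each record is a dict with keys:
      day       - calendar date of the transaction
      direction - "in" (address receives) or "out" (address spends)
      peer      - counterparty address (placeholder strings here)
      btc       - amount transferred, in BTC
    """
    outs = [r for r in records if r["direction"] == "out"]
    ins = [r for r in records if r["direction"] == "in"]
    return {
        # Assumed reading: number of distinct days with any activity.
        "active_days": len({r["day"] for r in records}),
        "n_out": len(outs),
        "n_in": len(ins),
        "btc_spent": sum(r["btc"] for r in outs),
        # Assumed reading: distinct addresses that sent funds to this address.
        "n_senders": len({r["peer"] for r in ins}),
    }


# Hypothetical records for a single address.
records = [
    {"day": date(2021, 1, 1), "direction": "in", "peer": "addrA", "btc": 0.5},
    {"day": date(2021, 1, 1), "direction": "in", "peer": "addrB", "btc": 0.25},
    {"day": date(2021, 1, 3), "direction": "out", "peer": "addrC", "btc": 0.5},
    {"day": date(2021, 1, 4), "direction": "in", "peer": "addrA", "btc": 1.0},
]

features = address_features(records)
print(features)
```

Each address then becomes one feature vector paired with one of three labels (Ponzi scheme, other scam, non-scam), which is the input shape a multi-class classifier expects.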

A. Collection of Bitcoin addresses

We first search Reddit and Bitcointalk.org, the two largest Bitcoin communities, to collect addresses that conducted Bitcoin transactions. We investigate advertisements related to High-Yield Investment Programs (HYIPs), which are investment schemes that promise extraordinarily high returns, up to 100%, as a majority of HYIPs are Ponzi schemes. Then, we follow the link in each advertisement to the corresponding web page to find the Bitcoin address where the funds are deposited. Only in rare cases do these advertisements explicitly state where the funds are deposited; therefore, we also obtain the address by visiting the website hosting the program.
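The page-scanning step can be sketched as follows, assuming the page text has already been downloaded. The keyword tuple mirrors the dictionary words quoted in the Introduction. The regular expression is a common syntactic filter for legacy Base58 addresses (starting with 1 or 3); it does not verify checksums and ignores other address formats. The function name `scan_page` and the sample text are hypothetical, and this is not the authors' crawler.

```python
import re

# Dictionary words quoted in the paper, lower-cased for case-insensitive matching.
KEYWORDS = ("ponzi", "profit", "hyi", "multiplier", "investment", "mlm")

# Syntactic filter only: legacy addresses start with 1 or 3 and use the
# Base58 alphabet (no 0, O, I, or l). Checksums are NOT validated here.
ADDRESS_RE = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


def scan_page(text):
    """Return (matched keywords, candidate addresses) for one page of text."""
    lowered = text.lower()
    hits = [k for k in KEYWORDS if k in lowered]
    # Only extract addresses from pages that mention at least one keyword.
    addresses = sorted(set(ADDRESS_RE.findall(text))) if hits else []
    return hits, addresses


# Hypothetical page text; the address is the well-known genesis-block address,
# used purely as a syntactically valid placeholder.
sample = "Join our HYIP! 100% profit daily. Deposit to 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa today."
hits, addresses = scan_page(sample)
print(hits, addresses)
```

Substring matching is deliberately loose (for example, "hyi" matches "HYIP"), so a real pipeline would still need manual or automated verification of each candidate address.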

Moreover, we thoroughly study “The 2021 Crypto Crime Report” [2] to identify Bitcoin addresses involved in Ponzi schemes and other scams, and collect Bitcoin addresses from other existing relevant papers with publicly available data. In addition, we find Bitcoin addresses related to cyber scams reported on the Bitcoin Abuse Database website. Duplicate addresses found in the previous steps are discarded. Overall, we identify 160 deposit addresses involved in Ponzi schemes and 442 involved in other scams. A small proportion of the addresses are shown in Table I.
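Merging addresses from several sources and discarding duplicates can be sketched as below. The source lists, the placeholder address strings, and the rule that the first label seen for an address wins are all assumptions for illustration; the paper does not state how conflicting labels are resolved.

```python
from collections import Counter


def merge_sources(*sources):
    """Merge (address, label) pairs, keeping the first label seen per address."""
    merged = {}
    for source in sources:
        for address, label in source:
            # Base58 addresses are case-sensitive, so only whitespace is normalized.
            merged.setdefault(address.strip(), label)
    return merged


# Hypothetical inputs; "addr_*" strings stand in for real Bitcoin addresses.
forum_ads = [("addr_1", "ponzi"), ("addr_2", "ponzi")]
crime_report = [("addr_2", "ponzi"), ("addr_3", "other")]
abuse_database = [("addr_3 ", "other"), ("addr_4", "other")]

merged = merge_sources(forum_ads, crime_report, abuse_database)
counts = Counter(merged.values())
print(len(merged), dict(counts))
```

Counting labels after the merge gives the per-class totals of unique addresses, which is the form in which the 160 and 442 figures above are reported.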
