An Attention-based Long Short-Term Memory
Framework for Detection of Bitcoin Scams
Puyang Zhao, Wei Tian, Lefu Xiao, Xinhui Liu, Jingjin Wu
Department of Statistics and Data Science, BNU-HKBU United International College, Zhuhai, Guangdong, P. R. China
Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science
Email: puyangzhao.27@gmail.com; s230202702@mail.uic.edu.cn; p930005056@mail.uic.edu.cn;
xinhui liu@outlook.com; jj.wu@ieee.org
Abstract—Bitcoin is the most common cryptocurrency involved
in cyber scams. Cybercriminals often exploit the pseudonymity and
privacy protection mechanisms associated with Bitcoin transactions
to make their scams virtually untraceable. The Ponzi
scheme has attracted particularly significant attention among
the Bitcoin fraudulent activities. This paper considers a multi-
class classification problem to determine whether a transaction
is involved in Ponzi schemes or other cyber scams, or is a non-scam
transaction. We design a dedicated crawler to collect data and
propose a novel Attention-based Long Short-Term Memory
(A-LSTM) method for the classification problem.
The experimental results show that the proposed model has
better efficiency and accuracy than existing approaches, including
Random Forest, Extra Trees, Gradient Boosting, and classical
LSTM. With correctly identified scam features, our proposed A-LSTM
achieves an F1-score over 82% for the original data and
outperforms the existing approaches.
Index Terms—Bitcoin, Blockchain, Data mining, Attention-
based LSTM, Fraud detection, Multi-class classification
I. INTRODUCTION
Bitcoin is the first decentralized cryptocurrency. As of Oc-
tober 2021, Bitcoin had a market share of around 45%, being
the highest among all cryptocurrencies [1], and is expected
to continue dominating the crypto market in the foreseeable
future. This paper studies techniques to detect cybercrime
activities conducted via Bitcoin.
There are multiple forms of cybercrime involving Bitcoin
transactions, such as Ponzi schemes, cryptojacking, and e-
mail frauds. Among these, Ponzi schemes represent one of
the most prevalent types of cybercrime. Statistics show that
almost $7 billion was generated in cryptocurrency revenue by
Ponzi schemes in 2019, nearly twice the amount generated by
all other cyber fraud categories combined in 2020 [2].
A general trend is that more and more investors are becoming
victims of cyber scams involving cryptocurrencies because
intervention and prevention measures are inadequate and ineffective.
Thus, one of the essential steps is to detect cyber scams in
their early stages to ensure the proper functioning of the cyber
society. In this paper, we classify all Bitcoin transactions into
three categories: 1) transactions involved in a Ponzi scheme,
2) transactions involved in other types of scams, or 3) normal
non-scam transactions, so that scams can be prevented in
advance or detected at an early stage.
Corresponding author.
This work is partly supported by Zhuhai Basic and Applied Basic Research
Foundation Grant ZH22017003200018PWC, and partly supported by the
Guangdong Provincial Key Laboratory of Interdisciplinary Research and
Application for Data Science, BNU-HKBU United International College,
Project code 2022B1212010006 and in part by Guangdong Higher Education
Upgrading Plan (2021-2025) UIC R0400001-22.
In this paper, we develop a framework that can accurately
detect Ponzi schemes and other scams conducted through Bitcoin
transactions with a novel deep learning method called
attention-based Long Short-Term Memory (A-LSTM). The
main contributions are summarized as follows.
• We design a crawler that automatically collects information
on Bitcoin transactions potentially involving scams from
known Bitcoin addresses, so that we obtain firsthand
information. The crawler automatically parses websites based
on a dictionary that contains Ponzi-related words like
“Ponzi”, “profit”, “HYI”, “multiplier”, “investment”, “MLM”.
With the crawler, we collect a number of Bitcoin addresses
that initiated transactions, and then build a dataset
considerably larger than those used in similar existing studies.
• From the transaction information, we study the features
that distinguish normal transactions from those involving
cyber scams. We identify the five most influential features
in Bitcoin scam detection, providing insights into the
detection of such scams. They are (i) active days; (ii)
number of outs; (iii) input number; (iv) the total number
of BTC spent; (v) number of addresses received. These
features are explained in detail later.
• We adopt the A-LSTM mechanism that suits the features
of our constructed dataset to classify the transactions
in our framework. We compare the performance of our
proposed A-LSTM approach with four popular supervised
learning models, namely Random Forest [3], Extra
Trees [4], Gradient Boosting [5] and classical LSTM [6].
We also integrate resampling methods with each of these
methods, aiming to solve the imbalance problem in the
dataset. We demonstrate that, while resampling is a
traditional method for solving the imbalance problem, it
is not suitable for the A-LSTM model, because resampling
introduces a large amount of noise into A-LSTM. On the
other hand, A-LSTM without resampling gives even better
results than the other methods with resampling.
978-1-6654-9144-0/22/$31.00 ©2022 IEEE
arXiv:2210.14408v1 [cs.CR] 26 Oct 2022
Fig. 1 presents an overview of our proposed framework.
The rest of this paper is organized as follows. Section II
summarizes the existing related works. Section III describes
our methodologies for identifying and collecting addresses of
Ponzi schemes, and for constructing a dataset of Ponzi-scheme-related
features. Section IV describes the steps of the proposed
classification approaches in detail. Section V compares the
effectiveness of strategies in terms of correctly classifying
transactions. Finally, Section VI concludes the paper and
provides some potential future research directions.
Fig. 1. Workflow of our Bitcoin scam detection framework
II. RELATED WORKS
Since Bitcoin scammers frequently change their IP addresses
to evade cyber regulators, it is quite difficult to gather
sufficient data to perform meaningful analysis. Most existing
Bitcoin scam studies collected the Bitcoin addresses involved
through manual or semi-automated searches [7]. Nevertheless,
it should be noted that these methods would fail when the
fraudulent addresses are not disclosed, for example, in private
communication or when a transaction is conducted on the deep
or dark web. Therefore, manual or semi-automated collection
methods are relatively inefficient when the majority of scams are
carried out using hidden addresses. Another consequence of
this fact is that, although the number of scams and relevant
studies is rapidly increasing, few public datasets are available
for further analysis. For example, in Bartoletti et al. [8], a small
and imbalanced dataset consisting of 32 Ponzi scam cases and
6,400 normal cases was considered. Some more recent studies
(e.g., [9]) combined the datasets of several previous studies
aiming at making meaningful conclusions with sufficient data.
However, a downside of this approach is that the timeliness
of the dataset is not preserved, and the features of scams
from different periods may not be the same. Therefore, when
dealing with these cases, we believe that it is preferable to use
tools that automatically search the Bitcoin blockchain for
suspicious behavior [8] and identify new addresses associated with
fraudulent activities within a short period of time.
In terms of classifiers, the most popular ones used in existing
studies include RIPPER [10] and Random Forest (RF). According
to [11], RF was the best classifier, obtaining an F1 score of 78.7%
for Ponzi schemes. While more data mining
methods were employed to detect Ponzi schemes in recent
studies, the highest F1 scores among them have not exceeded
that in [11]. A potential reason is that other Bitcoin scams,
such as sextortion, blackmail scam, and pyramid schemes [12],
are also increasing. These scams share certain features
with Ponzi schemes. Existing data mining-based classifiers for
this purpose are all binary, which means that they only distin-
guish whether a transaction is involved in a Ponzi scheme or
not, and thus may cause plenty of false positives for detecting
Ponzi schemes [9].
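To make the F1 scores quoted above concrete, the following minimal sketch computes a per-class F1 score, F1 = 2PR/(P+R), for a three-class labelling. The label names and the toy predictions are illustrative assumptions, not data from the studies cited. In this toy case, confusion between Ponzi schemes and other scams lowers the F1 of both scam classes, while the non-scam class is unaffected.

```python
from typing import Dict, List

# Hypothetical label names for the three classes discussed in the text.
LABELS = ["ponzi", "other_scam", "non_scam"]


def per_class_f1(y_true: List[str], y_pred: List[str]) -> Dict[str, float]:
    """Return the one-vs-rest F1 score of every class in LABELS."""
    scores = {}
    for label in LABELS:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Toy example: one Ponzi case and one other-scam case are swapped.
y_true = ["ponzi", "ponzi", "other_scam", "other_scam", "non_scam", "non_scam"]
y_pred = ["ponzi", "other_scam", "other_scam", "ponzi", "non_scam", "non_scam"]
f1 = per_class_f1(y_true, y_pred)
print(f1)
```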
To the best of our knowledge, our proposed method is the
first to add attention mechanisms to identify Bitcoin scams.
Detailed explanations for the new mechanism will be provided
in later sections.
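As a rough illustration of what an attention mechanism adds on top of a recurrent model, the sketch below applies generic dot-product attention pooling to a sequence of hidden states. It is not necessarily the architecture used in this paper: the `query` vector, the toy hidden states, and the function names are assumptions made for illustration only.

```python
import math
from typing import List, Tuple


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: List[List[float]], query: List[float]
) -> Tuple[List[float], List[float]]:
    """Weight each time step by softmax(h_t . query) and sum the states."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy hidden states standing in for LSTM outputs at three time steps.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]  # assumed learned vector; fixed here for illustration
weights, context = attention_pool(hidden_states, query)
print(weights, context)
```

The context vector is a weighted average of the hidden states, so time steps that score higher against the query contribute more to the representation passed to the classifier.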
III. DATA COLLECTION
Our work is arranged as follows. First, we collect Bitcoin
addresses that investors could use to send money to scam
operators. The next step is to create a set of features relevant
to scam classification, compute their values on our addresses,
and then train classifiers based on this dataset. After that, we
formalize a detection model for Bitcoin scams as a multi-class
classification problem, where the task is to distinguish Ponzi
schemes, other scams, and non-scam transactions.
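The feature-construction step can be sketched as follows for a single address. The precise feature definitions are not given in this section, so the interpretations used here are assumptions: active days as distinct days with any transfer, outs and inputs as counts of outgoing and incoming transfers, and addresses received as distinct senders. The `Transfer` record and the placeholder addresses are likewise hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Transfer:
    """Hypothetical simplified record of one Bitcoin transfer."""
    day: date
    sender: str
    receiver: str
    btc: float


def address_features(address: str, transfers: List[Transfer]) -> Dict[str, float]:
    """Compute five per-address features under the assumed definitions."""
    incoming = [t for t in transfers if t.receiver == address]
    outgoing = [t for t in transfers if t.sender == address]
    return {
        "active_days": len({t.day for t in incoming + outgoing}),
        "num_outs": len(outgoing),
        "num_inputs": len(incoming),
        "total_btc_spent": sum(t.btc for t in outgoing),
        "num_sender_addresses": len({t.sender for t in incoming}),
    }


# Toy ledger: "S" is the address under study; "A", "B", "C" are placeholders.
transfers = [
    Transfer(date(2021, 1, 1), "A", "S", 0.5),
    Transfer(date(2021, 1, 1), "B", "S", 0.25),
    Transfer(date(2021, 1, 2), "A", "S", 1.0),
    Transfer(date(2021, 1, 3), "S", "C", 1.5),
]
features = address_features("S", transfers)
print(features)
```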
A. Collection of Bitcoin addresses
We first search Reddit and Bitcointalk.org, the two largest
Bitcoin communities, to collect addresses that conducted Bitcoin
transactions. We investigate advertisements related to
High-Yield Investment Programs (HYIPs), which are investment
schemes that promise extraordinarily high returns, up to
100%, because a majority of HYIPs are Ponzi schemes. Then, we
follow the web address given in each advertisement and search
the linked page to find the Bitcoin address where the funds were
deposited. Only in rare cases do the advertisements themselves
explicitly state where the funds are deposited; in the other
cases, we obtain the address by visiting the website hosting the program.
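The keyword-driven search for deposit addresses can be sketched as below. This is a simplified offline illustration, not the crawler used in this work: it filters page text with the Ponzi-related dictionary quoted in Section I and then extracts strings that look like Bitcoin addresses. The regular expression is an assumed approximation of the Base58 and Bech32 address formats without checksum validation, and `DEMO_ADDRESS` is a synthetic string rather than a real address.

```python
import re
from typing import List

# Dictionary of Ponzi-related words from the text, lower-cased for matching.
KEYWORDS = ["ponzi", "profit", "hyi", "multiplier", "investment", "mlm"]

# Assumed approximate pattern: legacy Base58 (1.../3...) or Bech32 (bc1...).
ADDRESS_RE = re.compile(
    r"\b(?:[13][a-km-zA-HJ-NP-Z1-9]{25,34}|bc1[ac-hj-np-z02-9]{11,71})\b"
)


def matched_keywords(text: str) -> List[str]:
    """Return the dictionary words that occur in the text (substring match)."""
    lowered = text.lower()
    return [k for k in KEYWORDS if k in lowered]


def candidate_addresses(text: str) -> List[str]:
    """Return address-like strings, but only for pages that hit the dictionary."""
    if not matched_keywords(text):
        return []
    seen: List[str] = []
    for addr in ADDRESS_RE.findall(text):
        if addr not in seen:
            seen.append(addr)
    return seen


# Synthetic, checksum-invalid string that merely has the shape of an address.
DEMO_ADDRESS = "1DemoAddressForTests" + "X" * 14
page = f"Double your investment in 24h! Guaranteed profit. Deposit to {DEMO_ADDRESS} today."
print(matched_keywords(page), candidate_addresses(page))
```

Plain substring matching is deliberately permissive (for instance, "profit" also matches "nonprofit"), so in practice the candidates would still need manual or rule-based verification.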
Moreover, we thoroughly study “The 2021 Crypto Crime
Report” [2] to identify Bitcoin addresses involved in Ponzi
schemes and other scams, and collect Bitcoin addresses from
other existing relevant papers with publicly available data. In
addition, we find Bitcoin addresses related to cyber scams reported
on the Bitcoin Abuse Database website. Addresses that
duplicate ones found in previous steps are discarded. Overall,
we identify 160 deposit addresses involved in Ponzi schemes
and 442 involved in other scams. A small proportion of the
addresses are shown in Table I.
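The merging of address sources with duplicate removal can be sketched as follows. The source lists, placeholder addresses, and the rule that the first label seen for an address is kept are all assumptions for illustration; the text does not specify how conflicting labels are handled.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def merge_sources(sources: Iterable[List[Tuple[str, str]]]) -> Dict[str, str]:
    """Merge (address, label) lists, keeping the first label seen per address.

    Addresses are compared exactly because Base58 addresses are case-sensitive.
    """
    merged: Dict[str, str] = {}
    for source in sources:
        for address, label in source:
            merged.setdefault(address.strip(), label)
    return merged


# Toy sources with placeholder addresses; "addr2" appears in two of them.
forum_ads = [("addr1", "ponzi"), ("addr2", "ponzi")]
crime_report = [("addr2", "ponzi"), ("addr3", "other_scam")]
abuse_database = [("addr4", "other_scam"), ("addr3", "other_scam")]

dataset = merge_sources([forum_ads, crime_report, abuse_database])
counts = Counter(dataset.values())
print(counts)
```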