BABD: A Bitcoin Address Behavior Dataset for Address Behavior Pattern
Analysis
- URL: http://arxiv.org/abs/2204.05746v1
- Date: Sun, 10 Apr 2022 06:46:51 GMT
- Authors: Yuexin Xiang, Wei Ren, Hang Gao, Ding Bao, Yuchen Lei, Tiantian Li,
Qingqing Yang, Wenmao Liu, Tianqing Zhu, and Kim-Kwang Raymond Choo
- Abstract summary: We build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021.
This dataset contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled samples.
We use our proposed dataset on common machine learning models, namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost.
- Score: 36.42552617883664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cryptocurrencies are no longer just the preferred option for cybercriminal
activities on darknets, owing to their increasing adoption in mainstream
applications. This is partly due to the transparency associated with the
underpinning ledgers, where any individual can access the record of a
transaction on the public ledger. In this paper, we build a dataset
comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This
dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin
addresses, 5 categories of indicators with 148 features, and 544,462 labeled
data. We then use our proposed dataset on common machine learning models,
namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer
perceptron, and XGBoost. The results show that the accuracy rates of these
machine learning models on our proposed dataset are between 93.24% and 96.71%.
We also analyze the proposed features and their relationships from the
experiments, and propose a k-hop subgraph generation algorithm to extract a
k-hop subgraph from the entire Bitcoin transaction graph constructed by the
directed heterogeneous multigraph starting from a specific Bitcoin address node
(e.g., a known transaction associated with a criminal investigation).
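The abstract does not describe how the k-hop subgraph is generated, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the transaction graph is given as a list of directed `(source, target, value)` edges between address nodes and transaction nodes, that hops are counted without regard to edge direction, and that the result is the induced subgraph on all nodes within k hops. The function name `k_hop_subgraph` and the node names are hypothetical.

```python
from collections import defaultdict, deque


def k_hop_subgraph(edges, start, k):
    """Return (depth, sub_edges) for the nodes within k hops of `start`.

    `edges` is a list of (source, target, value) tuples. Parallel edges are
    allowed (multigraph) and are all kept in the output.
    Assumption: hops ignore edge direction, so both senders and receivers
    around the start node are reached.
    """
    # Undirected neighbor sets used only for the traversal.
    neighbors = defaultdict(set)
    for src, dst, _value in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    # Breadth-first search that stops expanding at depth k.
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue
        for nxt in neighbors[node]:
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    # Induced subgraph: keep every original directed edge whose two
    # endpoints were both reached.
    sub_edges = [e for e in edges if e[0] in depth and e[1] in depth]
    return depth, sub_edges


# Hypothetical toy graph: addresses pay into transactions, transactions pay out.
edges = [
    ("addr_A", "tx_1", 0.5),
    ("tx_1", "addr_B", 0.3),
    ("tx_1", "addr_C", 0.2),
    ("addr_B", "tx_2", 0.3),
    ("tx_2", "addr_D", 0.3),
    ("addr_A", "tx_1", 0.1),  # parallel edge between the same two nodes
]

depth, sub_edges = k_hop_subgraph(edges, "addr_A", 2)
```

With `k = 2` and `addr_A` as the start node, the sketch reaches `tx_1`, `addr_B`, and `addr_C` and keeps the four edges among them, while `tx_2` and `addr_D` fall outside the subgraph.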
Related papers
- ORBITAAL: A Temporal Graph Dataset of Bitcoin Entity-Entity Transactions [0.0]
ORBITAAL is the first comprehensive dataset based on temporal graph formalism.
The dataset covers all Bitcoin transactions from January 2009 to January 2021.
This dataset also provides details on entities such as their global BTC balance and associated public addresses.
arXiv Detail & Related papers (2024-08-26T09:48:45Z)
- Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z)
- Demystifying Fraudulent Transactions and Illicit Nodes in the Bitcoin Network for Financial Forensics [8.97719386315469]
This paper presents a holistic applied data science approach to fraud detection in the Bitcoin network.
First, we contribute the Elliptic++ dataset, which extends the Elliptic transaction dataset to include over 822k Bitcoin wallet addresses (nodes).
Second, we perform fraud detection tasks on all four graphs by using diverse machine learning algorithms.
arXiv Detail & Related papers (2023-05-25T18:36:54Z)
- Chainlet Orbits: Topological Address Embedding for the Bitcoin Blockchain [15.099255988459602]
The rise of cryptocurrencies like Bitcoin, which enable transactions with a degree of pseudonymity, has led to a surge in various illicit activities.
We introduce an effective solution called Chainlet Orbits to embed Bitcoin addresses by leveraging their topological characteristics in transactions.
Our approach enables the use of interpretable and explainable machine learning models in as little as 15 minutes for most days on the Bitcoin transaction network.
arXiv Detail & Related papers (2023-05-18T21:16:59Z)
- Blockchain Large Language Models [65.7726590159576]
This paper presents a dynamic, real-time approach to detecting anomalous blockchain transactions.
The proposed tool, BlockGPT, generates tracing representations of blockchain activity and trains from scratch a large language model to act as a real-time Intrusion Detection System.
arXiv Detail & Related papers (2023-04-25T11:56:18Z)
- Demystifying Bitcoin Address Behavior via Graph Neural Networks [20.002509270755443]
BAClassifier is a tool that can automatically classify bitcoin addresses based on their behaviors.
We construct and release a large-scale annotated dataset that consists of over 2 million real-world bitcoin addresses.
arXiv Detail & Related papers (2022-11-26T14:55:50Z)
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z)
- Comprehensive Graph-conditional Similarity Preserving Network for Unsupervised Cross-modal Hashing [97.44152794234405]
Unsupervised cross-modal hashing (UCMH) has become a hot topic recently.
In this paper, we devise a deep graph-neighbor coherence preserving network (DGCPN).
DGCPN regulates comprehensive similarity preserving losses by exploiting three types of data similarities.
arXiv Detail & Related papers (2020-12-25T07:40:59Z)
- Bitcoin Transaction Forecasting with Deep Network Representation Learning [16.715475608359046]
This paper presents a novel approach to developing a Bitcoin transaction forecast model, DLForecast, by leveraging deep neural networks for learning Bitcoin transaction network representations.
We construct a time-decaying reachability graph and a time-decaying transaction pattern graph, aiming at capturing different types of spatial-temporal Bitcoin transaction patterns.
We show that our spatial-temporal forecasting model is efficient, with a fast runtime, and effective, with forecasting accuracy over 60%; it improves prediction performance by 50% compared to a forecasting model built on the static graph baseline.
arXiv Detail & Related papers (2020-07-15T21:11:32Z)
- Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications.
This paper defines a novel problem dubbed Inverse Graph Identification (IGI).
We propose a simple yet effective method that performs node-level message passing using a Graph Attention Network (GAT) under the protocol of GI.
arXiv Detail & Related papers (2020-07-12T12:06:17Z)