The Temporal Graph of Bitcoin Transactions
- URL: http://arxiv.org/abs/2510.20028v1
- Date: Wed, 22 Oct 2025 21:10:46 GMT
- Title: The Temporal Graph of Bitcoin Transactions
- Authors: Vahid Jalili
- Abstract summary: Since its 2009 genesis block, the Bitcoin network has processed >1.08 billion (B) transactions representing >8.72B BTC. We present an ML-compatible graph modeling Bitcoin's economic topology by reconstructing the flow of funds.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Since its 2009 genesis block, the Bitcoin network has processed >1.08 billion (B) transactions representing >8.72B BTC, offering rich potential for machine learning (ML); yet its pseudonymity and the obscured flow of funds inherent in its UTXO-based design have rendered this data largely inaccessible for ML research. Addressing this gap, we present an ML-compatible graph modeling Bitcoin's economic topology by reconstructing the flow of funds. This temporal, heterogeneous graph encompasses the complete transaction history up to block \cutoffHeight, consisting of >2.4B nodes and >39.72B edges. Additionally, we provide custom sampling methods yielding node and edge feature vectors of sampled communities, tools to load and analyze the Bitcoin graph data within specialized graph databases, and ready-to-use database snapshots. This comprehensive dataset and toolkit empower the ML community to tackle Bitcoin's intricate ecosystem at scale, driving progress in applications such as anomaly detection, address classification, market analysis, and large-scale graph ML benchmarking. Dataset and code are available at https://github.com/B1AAB/EBA
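To make "reconstructing the flow of funds" concrete, here is a minimal sketch of how UTXO-style transactions could be turned into timestamped address-to-address edges. It is not the paper's method: the input format, the function name `build_flow_edges`, the `"coinbase"` placeholder node, and the proportional split of input value across outputs are all assumptions made for illustration, and transaction fees are ignored.

```python
from collections import defaultdict


def build_flow_edges(transactions):
    """Aggregate UTXO-style transactions into weighted, timestamped edges.

    Each transaction is a dict with "height" (block height) plus "inputs"
    and "outputs", both lists of (address, value) pairs. This record
    layout is hypothetical, not the dataset's schema.
    """
    edges = defaultdict(float)  # (source, target, height) -> value moved
    for tx in transactions:
        height = tx["height"]
        if not tx["inputs"]:
            # Newly minted coins have no spending address, so attach them
            # to a placeholder source node (an assumption of this sketch).
            for target, value in tx["outputs"]:
                edges[("coinbase", target, height)] += value
            continue
        total_in = sum(value for _, value in tx["inputs"])
        if total_in == 0:
            continue
        # Assumed attribution rule: each input funds every output in
        # proportion to the input's share of the total input value.
        for source, in_value in tx["inputs"]:
            for target, out_value in tx["outputs"]:
                edges[(source, target, height)] += out_value * in_value / total_in
    return dict(edges)


example = [
    {"height": 0, "inputs": [], "outputs": [("A", 50.0)]},
    {"height": 1, "inputs": [("A", 50.0)], "outputs": [("B", 30.0), ("A", 20.0)]},
]
edges = build_flow_edges(example)
```

Keeping the block height in the edge key is what makes the resulting graph temporal; the self-edge from `A` to `A` in the example corresponds to change returned to the sender.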
Related papers
- Unlocking Advanced Graph Machine Learning Insights through Knowledge Completion on Neo4j Graph Database [1.1059590443280725]
This paper proposes an innovative architecture that integrates a Knowledge Completion phase into GDB-GML applications. We show how revealing hidden knowledge can heavily impact datasets' behavior and metrics. Experimental results demonstrate that our intuition radically reshapes both topology and overall dataset dynamics.
arXiv Detail & Related papers (2025-11-14T15:27:31Z) - GraphHash: Graph Clustering Enables Parameter Efficiency in Recommender Systems [51.64666652517944]
This paper introduces GraphHash, the first graph-based approach that leverages modularity-based bipartite graph clustering to reduce embedding table sizes. By employing fast clustering algorithms, GraphHash serves as a computationally efficient proxy for message-passing during preprocessing. In experiments, GraphHash substantially outperforms diverse hashing baselines on both retrieval and click-through-rate prediction tasks.
arXiv Detail & Related papers (2024-12-23T03:37:58Z) - Linear Transformer Topological Masking with Graph Random Features [52.717865653036796]
We show how to parameterise topological masks as a learnable function of a weighted adjacency matrix.
Our efficient masking algorithms provide strong performance gains for tasks on image and point cloud data.
arXiv Detail & Related papers (2024-10-04T14:24:06Z) - The Shape of Money Laundering: Subgraph Representation Learning on the Blockchain with the Elliptic2 Dataset [6.209290101460395]
Subgraph representation learning is a technique for analyzing local structures (or shapes) within complex networks.
We introduce Elliptic2, a large graph dataset containing 122K labeled subgraphs of Bitcoin clusters.
We find immediate practical value in this approach and the potential for a new standard in anti-money laundering and forensic analytics in cryptocurrencies.
arXiv Detail & Related papers (2024-04-29T21:19:41Z) - Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z) - Live Graph Lab: Towards Open, Dynamic and Real Transaction Graphs with NFT [28.08921595650609]
We introduce the concept of Live Graph Lab for temporal graphs, which enables open, dynamic and real transaction graphs from blockchains.
We instantiate a live graph with NFT transaction network and investigate its dynamics to provide new observations and insights.
arXiv Detail & Related papers (2023-10-18T04:54:22Z) - BABD: A Bitcoin Address Behavior Dataset for Address Behavior Pattern Analysis [36.42552617883664]
We build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021.
This dataset contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled data.
We use our proposed dataset on common machine learning models, namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost.
arXiv Detail & Related papers (2022-04-10T06:46:51Z) - MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs [55.66953093401889]
We present a masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data.
Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training.
arXiv Detail & Related papers (2022-01-07T16:48:07Z) - Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction [123.20238648121445]
We propose a new self-supervised learning framework, Graph Information Aided Node feature exTraction (GIANT).
GIANT makes use of the eXtreme Multi-label Classification (XMC) formalism, which is crucial for fine-tuning the language model based on graph information.
We demonstrate the superior performance of GIANT over the standard GNN pipeline on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2021-10-29T19:55:12Z) - Blockchain Phishing Scam Detection via Multi-channel Graph Classification [1.6980621769406918]
Phishing scam detection methods will protect possible victims and build a healthier blockchain ecosystem.
We defined the transaction pattern graphs for users and transformed the phishing scam detection into a graph classification task.
The proposed multi-channel graph classification model (MCGC) is better able to detect potential phishing by extracting the transaction pattern features of the target users.
arXiv Detail & Related papers (2021-08-19T02:59:55Z) - GraphMI: Extracting Private Graph Data from Graph Neural Networks [59.05178231559796]
We present Graph Model Inversion attack (GraphMI), which aims to extract private graph data of the training graph by inverting GNN.
Specifically, we propose a projected gradient module to tackle the discreteness of graph edges while preserving the sparsity and smoothness of graph features.
We design a graph auto-encoder module to efficiently exploit graph topology, node attributes, and target model parameters for edge inference.
arXiv Detail & Related papers (2021-06-05T07:07:52Z) - Efficient Graph Deep Learning in TensorFlow with tf_geometric [53.237754811019464]
We introduce tf_geometric, an efficient and friendly library for graph deep learning.
tf_geometric provides kernel libraries for building Graph Neural Networks (GNNs) as well as implementations of popular GNNs.
The kernel libraries consist of infrastructures for building efficient GNNs, including graph data structures, graph map-reduce framework, graph mini-batch strategy, etc.
arXiv Detail & Related papers (2021-01-27T17:16:36Z) - Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels? [89.13567439679709]
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications.
This paper defines a novel problem dubbed Inverse Graph Identification (IGI).
We propose a simple yet effective method that makes the node-level message passing process using Graph Attention Network (GAT) under the protocol of GI.
arXiv Detail & Related papers (2020-07-12T12:06:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.