RiskTagger: An LLM-based Agent for Automatic Annotation of Web3 Crypto Money Laundering Behaviors
- URL: http://arxiv.org/abs/2510.17848v1
- Date: Sun, 12 Oct 2025 08:54:28 GMT
- Title: RiskTagger: An LLM-based Agent for Automatic Annotation of Web3 Crypto Money Laundering Behaviors
- Authors: Dan Lin, Yanli Ding, Weipeng Zou, Jiachi Chen, Xiapu Luo, Jiajing Wu, Zibin Zheng
- Abstract summary: RiskTagger is a large-language-model-based agent for the automatic annotation of crypto laundering behaviors in Web3. RiskTagger is designed to replace or complement human annotators by addressing three key challenges: extracting clues from complex unstructured reports, reasoning over multichain transaction paths, and producing auditor-friendly explanations.
- Score: 65.80108147440863
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the rapid growth of Web3 has driven the development of decentralized finance, user anonymity and cross-chain asset flows make on-chain laundering behaviors more covert and complex. In this context, constructing high-quality anti-money laundering (AML) datasets has become essential for risk-control systems and on-chain forensic analysis, yet current practices still rely heavily on manual efforts with limited efficiency and coverage. In this paper, we introduce RiskTagger, a large-language-model-based agent for the automatic annotation of crypto laundering behaviors in Web3. RiskTagger is designed to replace or complement human annotators by addressing three key challenges: extracting clues from complex unstructured reports, reasoning over multichain transaction paths, and producing auditor-friendly explanations. RiskTagger implements an end-to-end multi-module agent, integrating a key-clue extractor, a multichain fetcher with a laundering-behavior reasoner, and a data explainer, forming a data annotation pipeline. Experiments on the real-world Bybit Hack case (with the highest stolen asset value) demonstrate that RiskTagger achieves 100% accuracy in clue extraction, 84.1% consistency with expert judgment, and 90% coverage in explanation generation. Overall, RiskTagger automates laundering behavior annotation while improving transparency and scalability in AML research.
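The abstract names four stages (key-clue extractor, multichain fetcher, laundering-behavior reasoner, data explainer) without giving their interfaces. The following is a minimal sketch of how such an annotation pipeline could be wired together, not the authors' implementation. Every name in it (`TOY_TRANSFERS`, `extract_clues`, `fetch_outgoing`, `trace_and_tag`, `Annotation`) is hypothetical, a regular expression and a breadth-first search stand in for the LLM-driven extraction and reasoning, and an in-memory list stands in for real chain queries.

```python
import re
from collections import deque
from dataclasses import dataclass

# Hypothetical toy ledger of (chain, sender, receiver) transfers.
# A real fetcher would query per-chain APIs instead.
TOY_TRANSFERS = [
    ("ethereum", "0x" + "a" * 40, "0x" + "b" * 40),
    ("ethereum", "0x" + "b" * 40, "0x" + "c" * 40),
    ("bsc", "0x" + "c" * 40, "0x" + "d" * 40),
    ("ethereum", "0x" + "e" * 40, "0x" + "f" * 40),  # unrelated to the case
]

ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


@dataclass
class Annotation:
    address: str
    hops: int
    explanation: str


def extract_clues(report):
    """Stage 1 stand-in: pull seed addresses out of an unstructured report."""
    return sorted({match.lower() for match in ADDRESS_RE.findall(report)})


def fetch_outgoing(address):
    """Stage 2 stand-in: return (chain, receiver) pairs sent by an address."""
    return [(chain, dst) for chain, src, dst in TOY_TRANSFERS if src == address]


def trace_and_tag(seeds, max_hops=3):
    """Stages 3 and 4 stand-in: follow funds breadth-first from the seeds,
    tag each newly reached address, and attach a short explanation."""
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    annotations = []
    while queue:
        current = queue.popleft()
        if hops[current] >= max_hops:
            continue  # do not expand beyond the hop budget
        for chain, dst in fetch_outgoing(current):
            if dst in hops:
                continue  # already tagged via a shorter or equal path
            hops[dst] = hops[current] + 1
            annotations.append(
                Annotation(
                    address=dst,
                    hops=hops[dst],
                    explanation=(
                        f"received funds from {current[:8]}... on {chain}, "
                        f"{hops[dst]} hop(s) from a reported address"
                    ),
                )
            )
            queue.append(dst)
    return annotations


if __name__ == "__main__":
    report = "Stolen funds left 0x" + "A" * 40 + " shortly after the hack."
    for tag in trace_and_tag(extract_clues(report)):
        print(tag.address, tag.hops, tag.explanation)
```

In this sketch the hop budget (`max_hops`) is the only control on how far tracing spreads. The abstract reports that RiskTagger's reasoner was compared against expert judgment, so a real system would presumably make that decision per transaction rather than by distance alone.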
Related papers
- StableAML: Machine Learning for Behavioral Wallet Detection in Stablecoin Anti-Money Laundering on Ethereum [1.6492745888221318]
Global illicit fund flows exceed an estimated $3.1 trillion annually, with stablecoins emerging as a preferred laundering medium due to their liquidity. This study analyzes a dataset and uses behavioral features to develop a robust AML framework. By automating high-precision detection, we propose an approach that effectively raises the economic cost of financial misconduct without stifling innovation.
arXiv Detail & Related papers (2026-02-19T21:13:39Z)
- From Transactions to Exploits: Automated PoC Synthesis for Real-World DeFi Attacks [15.23851315830671]
We present the first automated framework for verifiable proofs-of-concept (PoCs) directly from on-chain attack executions. TracExp localizes attack-relevant execution contexts from noisy, multi-contract traces. We evaluate TracExp on 321 real-world attacks over the past 20 months.
arXiv Detail & Related papers (2026-01-23T11:52:50Z)
- Detection of Crowdsourcing Cryptocurrency Laundering via Multi-Task Collaboration [6.593202318405946]
Crowdsourcing laundering is a new form of money laundering on stablecoins. Crowdsourcing laundering transactions exhibit diverse patterns and a polycentric structure. We propose the Multi-Task Collaborative Crowdsourcing Laundering Detection framework.
arXiv Detail & Related papers (2025-12-02T08:58:11Z)
- AgentFold: Long-Horizon Web Agents with Proactive Context Management [98.54523771369018]
LLM-based web agents show immense promise for information seeking, yet their effectiveness is hindered by a fundamental trade-off in context management. We introduce AgentFold, a novel agent paradigm centered on proactive context management. With simple supervised fine-tuning, our AgentFold-30B-A3B agent achieves 36.2% on BrowseComp and 47.3% on BrowseComp-ZH.
arXiv Detail & Related papers (2025-10-28T17:51:50Z)
- Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain [82.98626829232899]
Fine-tuning AI agents on data from their own interactions introduces a critical security vulnerability within the AI supply chain. We show that adversaries can easily poison the data collection pipeline to embed hard-to-detect backdoors.
arXiv Detail & Related papers (2025-10-03T12:47:21Z)
- WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning [73.91893534088798]
WebSailor is a complete post-training methodology designed to instill this crucial capability. Our approach involves generating novel, high-uncertainty tasks through structured sampling and information obfuscation. WebSailor significantly outperforms all open-source agents in complex information-seeking tasks.
arXiv Detail & Related papers (2025-09-16T17:57:03Z)
- MPOCryptoML: Multi-Pattern based Off-Chain Crypto Money Laundering Detection [2.2530496464901106]
We propose MPOCryptoML to effectively detect multiple laundering patterns in cryptocurrency transactions. MPOCryptoML includes the development of a multi-source Personalized PageRank algorithm to identify random laundering patterns. We show consistent performance gains, with improvements up to 9.13% in precision, up to 10.16% in recall, up to 7.63% in F1-score, and up to 10.19% in accuracy.
arXiv Detail & Related papers (2025-08-18T06:06:32Z)
- GARG-AML against Smurfing: A Scalable and Interpretable Graph-Based Framework for Anti-Money Laundering [5.4807970361321585]
This paper introduces a novel graph-based method, GARG-AML, for efficient and effective anti-money laundering (AML). It quantifies smurfing risk, a popular money laundering method, by providing each node in the network with a single interpretable score. The proposed method strikes a balance among computational efficiency, detection power and transparency.
arXiv Detail & Related papers (2025-06-04T11:30:37Z)
- Backdoor Cleaning without External Guidance in MLLM Fine-tuning [76.82121084745785]
Believe Your Eyes (BYE) is a data filtering framework that leverages attention entropy patterns as self-supervised signals to identify and filter backdoor samples. It achieves near-zero attack success rates while maintaining clean-task performance.
arXiv Detail & Related papers (2025-05-22T17:11:58Z)
- Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and Directions [51.43521977132062]
Money laundering is a financial crime that obscures the origin of illicit funds. The proliferation of mobile payment platforms and smart IoT devices has significantly complicated anti-money laundering investigations. This paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML.
arXiv Detail & Related papers (2025-03-13T05:19:44Z)
- Beyond Static Datasets: A Behavior-Driven Entity-Specific Simulation to Overcome Data Scarcity and Train Effective Crypto Anti-Money Laundering Models [0.23020018305241333]
Money laundering is a key crime to be mitigated to also suspend the movement of funds from other illicit activities. It is getting extremely difficult to identify money laundering in crypto transactions owing to many layering strategies available today. In this paper, we propose behavior embedded entity-specific money laundering-like transaction simulation.
arXiv Detail & Related papers (2025-01-01T06:58:05Z)
- LookAhead: Preventing DeFi Attacks via Unveiling Adversarial Contracts [15.071155232677643]
Decentralized Finance (DeFi) has resulted in financial losses exceeding 3 billion US dollars. Current detection tools face significant challenges in identifying attack activities effectively. We propose LookAhead, a new framework for detecting DeFi attacks via unveiling adversarial contracts.
arXiv Detail & Related papers (2024-01-14T11:39:33Z)