Leveraging Ensemble-Based Semi-Supervised Learning for Illicit Account Detection in Ethereum DeFi Transactions
- URL: http://arxiv.org/abs/2412.02408v2
- Date: Wed, 15 Oct 2025 11:57:47 GMT
- Title: Leveraging Ensemble-Based Semi-Supervised Learning for Illicit Account Detection in Ethereum DeFi Transactions
- Authors: Shabnam Fazliani, Mohammad Mowlavi Sorond, Arsalan Masoudifard, Shaghayegh Fazliani,
- Abstract summary: We propose a robust solution for safeguarding the DeFi ecosystem.<n>We use an Isolation Forest model for initial detection and a self-training mechanism to iteratively generate pseudo-labels for unlabeled accounts.<n>Experiments on 6,903,860 transactions with extensive DeFi interaction coverage demonstrate that SLEID significantly outperforms supervised and semi-supervised baselines.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of smart contracts has enabled the rapid rise of Decentralized Finance (DeFi) on the Ethereum blockchain, offering substantial rewards in financial innovation and inclusivity. This growth, however, is accompanied by significant security risks such as illicit accounts engaged in fraud. Effective detection is further limited by the scarcity of labeled data and the evolving tactics of malicious accounts. To address these challenges with a robust solution for safeguarding the DeFi ecosystem, we propose $\textbf{SLEID}$, a $\textbf{S}$elf-$\textbf{L}$earning $\textbf{E}$nsemble-based $\textbf{I}$llicit account $\textbf{D}$etection framework. SLEID uses an Isolation Forest model for initial outlier detection and a self-training mechanism to iteratively generate pseudo-labels for unlabeled accounts, enhancing detection accuracy. Experiments on 6,903,860 Ethereum transactions with extensive DeFi interaction coverage demonstrate that SLEID significantly outperforms supervised and semi-supervised baselines with $\textbf{+2.56}$ percentage-point precision, comparable recall, and $\textbf{+0.90}$ percentage-point F1 -- particularly for the minority illicit class -- alongside $\textbf{+3.74}$ percentage-points higher accuracy and improvements in PR-AUC, while substantially reducing reliance on labeled data.
Related papers
- StableAML: Machine Learning for Behavioral Wallet Detection in Stablecoin Anti-Money Laundering on Ethereum [1.6492745888221318]
Global illicit fund flows exceed an estimated $3.1 trillion annually, with stablecoins emerging as a preferred laundering medium due to their liquidity.<n>This study analyzes an dataset and uses behavioral features to develop a robust AML framework.<n>By automating high-precision detection, we propose an approach that effectively raises the economic cost of financial misconduct without stifling innovation.
arXiv Detail & Related papers (2026-02-19T21:13:39Z) - FedGraph-VASP: Privacy-Preserving Federated Graph Learning with Post-Quantum Security for Cross-Institutional Anti-Money Laundering [0.0]
We present FedGraph-VASP, a privacy-generative graph learning framework for anti-money laundering detection.<n>Our key contribution is a Boundary Embedding Exchange protocol that shares only compressed, non-invertible graph neural network representations of boundary accounts.
arXiv Detail & Related papers (2026-01-25T18:14:26Z) - SSR: Safeguarding Staking Rewards by Defining and Detecting Logical Defects in DeFi Staking [55.62033436283969]
Decentralized Finance (DeFi) staking is one of the most prominent applications within the DeFi ecosystem.<n> logical defects in DeFi staking could enable attackers to claim unwarranted rewards.<n>We developed SSR (Safeguarding Staking Reward), a static analysis tool designed to detect logical defects in DeFi staking contracts.
arXiv Detail & Related papers (2026-01-09T15:01:41Z) - Spectral Sentinel: Scalable Byzantine-Robust Decentralized Federated Learning via Sketched Random Matrix Theory on Blockchain [0.0]
Byzantine clients poison gradients under heterogeneous (Non-IID) data.<n>We propose Spectral Sentinel, a Byzantine detection and aggregation framework.<n>We implement the full system with blockchain integration on Polygon networks.
arXiv Detail & Related papers (2025-12-14T09:43:03Z) - RiskTagger: An LLM-based Agent for Automatic Annotation of Web3 Crypto Money Laundering Behaviors [65.80108147440863]
RiskTagger is a large-language-model-based agent for the automatic annotation of crypto laundering behaviors in Web3.<n>RiskTagger is designed to replace or complement human annotators by addressing three key challenges: extracting clues from complex unstructured reports, reasoning over multichain transaction paths, and producing auditor-friendly explanations.
arXiv Detail & Related papers (2025-10-12T08:54:28Z) - Probabilistically Tightened Linear Relaxation-based Perturbation Analysis for Neural Network Verification [83.25968588249776]
We present a novel framework that combines over-approximation techniques from LiRPA-based approaches with a sampling-based method to compute tight intermediate reachable sets.<n>With negligible computational overhead, $textttPT-LiRPA$ exploiting the estimated reachable sets, significantly tightens the lower and upper linear bounds of a neural network's output.
arXiv Detail & Related papers (2025-07-07T18:45:53Z) - Saffron-1: Safety Inference Scaling [69.61130284742353]
SAFFRON is a novel inference scaling paradigm tailored explicitly for safety assurance.<n>Central to our approach is the introduction of a multifurcation reward model (MRM) that significantly reduces the required number of reward model evaluations.<n>We publicly release our trained multifurcation reward model (Saffron-1) and the accompanying token-level safety reward dataset (Safety4M)
arXiv Detail & Related papers (2025-06-06T18:05:45Z) - Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM [0.7018579932647147]
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts.
This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection.
arXiv Detail & Related papers (2025-04-07T12:32:14Z) - Unsupervised Detection of Fraudulent Transactions in E-commerce Using Contrastive Learning [9.199789653471269]
E-commerce platforms are facing an increasing number of fraud threats.
Traditional fraud detection methods rely on supervised learning, which requires large amounts of labeled data.
This study proposes an unsupervised e-commerce fraud detection algorithm based on SimCLR.
arXiv Detail & Related papers (2025-03-24T16:14:16Z) - Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and Directions [51.43521977132062]
Money laundering is a financial crime that obscures the origin of illicit funds.
The proliferation of mobile payment platforms and smart IoT devices has significantly complicated anti-money laundering investigations.
This paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML.
arXiv Detail & Related papers (2025-03-13T05:19:44Z) - A Label-Free Heterophily-Guided Approach for Unsupervised Graph Fraud Detection [60.09453163562244]
We propose a Heterophily-guided Unsupervised Graph fraud dEtection approach (HUGE) for unsupervised GFD.
In the estimation module, we design a novel label-free heterophily metric called HALO, which captures the critical graph properties for GFD.
In the alignment-based fraud detection module, we develop a joint-GNN architecture with ranking loss and asymmetric alignment loss.
arXiv Detail & Related papers (2025-02-18T22:07:36Z) - Hide in Plain Sight: Clean-Label Backdoor for Auditing Membership Inference [16.893873979953593]
We propose a novel clean-label backdoor-based approach for stealthy data auditing.
Our approach employs an optimal trigger generated by a shadow model that mimics target model's behavior.
The proposed method enables robust data auditing through blackbox access, achieving high attack success rates across diverse datasets.
arXiv Detail & Related papers (2024-11-24T20:56:18Z) - Towards Resource-Efficient Federated Learning in Industrial IoT for Multivariate Time Series Analysis [50.18156030818883]
Anomaly and missing data constitute a thorny problem in industrial applications.
Deep learning enabled anomaly detection has emerged as a critical direction.
The data collected in edge devices contain user privacy.
arXiv Detail & Related papers (2024-11-06T15:38:31Z) - Enhancing Security in Federated Learning through Adaptive
Consensus-Based Model Update Validation [2.28438857884398]
This paper introduces an advanced approach for fortifying Federated Learning (FL) systems against label-flipping attacks.
We propose a consensus-based verification process integrated with an adaptive thresholding mechanism.
Our results indicate a significant mitigation of label-flipping attacks, bolstering the FL system's resilience.
arXiv Detail & Related papers (2024-03-05T20:54:56Z) - Enhancing Credit Card Fraud Detection A Neural Network and SMOTE Integrated Approach [4.341096233663623]
This research proposes an innovative methodology combining Neural Networks (NN) and Synthet ic Minority Over-sampling Technique (SMOTE) to enhance the detection performance.
The study addresses the inherent imbalance in credit card transaction data, focusing on technical advancements for robust and precise fraud detection.
arXiv Detail & Related papers (2024-02-27T02:26:04Z) - Securing Transactions: A Hybrid Dependable Ensemble Machine Learning
Model using IHT-LR and Grid Search [2.4374097382908477]
We introduce a state-of-the-art hybrid ensemble (ENS) Machine learning (ML) model that intelligently combines multiple algorithms to enhance fraud identification.
Our experiments are conducted on a publicly available credit card dataset comprising 284,807 transactions.
The proposed model achieves impressive accuracy rates of 99.66%, 99.73%, 98.56%, and 99.79%, and a perfect 100% for the DT, RF, KNN, and ENS models, respectively.
arXiv Detail & Related papers (2024-02-22T09:01:42Z) - LookAhead: Preventing DeFi Attacks via Unveiling Adversarial Contracts [15.071155232677643]
Decentralized Finance (DeFi) has resulted in financial losses exceeding 3 billion US dollars.<n>Current detection tools face significant challenges in identifying attack activities effectively.<n>We propose LookAhead, a new framework for detecting DeFi attacks via unveiling adversarial contracts.
arXiv Detail & Related papers (2024-01-14T11:39:33Z) - Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
GNNs are vulnerable to the model stealing attack, a nefarious endeavor geared towards duplicating the target model via query permissions.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z) - Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z) - Leveraging Machine Learning for Multichain DeFi Fraud Detection [5.213509776274283]
We present a framework for extracting features from different chains, including the largest one, and it is evaluated over an extensive dataset.
Different Machine Learning methods were employed, such as XGBoost and a Neural Network for identifying fraud accounts detection interacting with DeFi.
We demonstrate that the introduction of novel DeFi-related features, significantly improves the evaluation results.
arXiv Detail & Related papers (2023-05-17T15:48:21Z) - Uncertainty Estimation by Fisher Information-based Evidential Deep
Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($mathcalI$-EDL)
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z) - Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision.
Our proposed textttBayesAug significantly reduces the false positive rate over 12.50% compared with the previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z) - Exploring Robustness of Unsupervised Domain Adaptation in Semantic
Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which pose a security concern in this field; and (ii) although commonly used self-supervision (e.g., rotation and jigsaw) benefits image tasks such as classification and recognition, they fail to provide the critical supervision signals that could learn discriminative representation for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep
Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z) - Towards Credit-Fraud Detection via Sparsely Varying Gaussian
Approximations [0.0]
We propose a credit card fraud detection concept incorporating the uncertainty in our prediction system to ensure better judgment in such a crucial task.
We perform the same with different sets of kernels and the different number of inducing data points to show the best accuracy was obtained.
arXiv Detail & Related papers (2020-07-14T16:56:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.