PhishingHook: Catching Phishing Ethereum Smart Contracts leveraging EVM Opcodes
- URL: http://arxiv.org/abs/2506.19480v1
- Date: Tue, 24 Jun 2025 10:16:47 GMT
- Title: PhishingHook: Catching Phishing Ethereum Smart Contracts leveraging EVM Opcodes
- Authors: Pasquale De Rosa, Simon Queyrut, YƩrom-David Bromberg, Pascal Felber, Valerio Schiavoni,
- Abstract summary: PhishingHook is a framework that applies machine learning techniques to detect phishing activities in smart contracts.<n>We experimentally compare 16 techniques, belonging to four main categories (Histogram Similaritys, Vision Models, Language Models and Vulnerability Detection Models) using 7,000 real-world malware smart contracts.<n>Our results demonstrate the efficiency of PhishingHook in performing phishing classification systems, with about 90% average accuracy among all the models.
- Score: 2.427531959511691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Ethereum Virtual Machine (EVM) is a decentralized computing engine. It enables the Ethereum blockchain to execute smart contracts and decentralized applications (dApps). The increasing adoption of Ethereum sparked the rise of phishing activities. Phishing attacks often target users through deceptive means, e.g., fake websites, wallet scams, or malicious smart contracts, aiming to steal sensitive information or funds. A timely detection of phishing activities in the EVM is therefore crucial to preserve the user trust and network integrity. Some state-of-the art approaches to phishing detection in smart contracts rely on the online analysis of transactions and their traces. However, replaying transactions often exposes sensitive user data and interactions, with several security concerns. In this work, we present PhishingHook, a framework that applies machine learning techniques to detect phishing activities in smart contracts by directly analyzing the contract's bytecode and its constituent opcodes. We evaluate the efficacy of such techniques in identifying malicious patterns, suspicious function calls, or anomalous behaviors within the contract's code itself before it is deployed or interacted with. We experimentally compare 16 techniques, belonging to four main categories (Histogram Similarity Classifiers, Vision Models, Language Models and Vulnerability Detection Models), using 7,000 real-world malware smart contracts. Our results demonstrate the efficiency of PhishingHook in performing phishing classification systems, with about 90% average accuracy among all the models. We support experimental reproducibility, and we release our code and datasets to the research community.
Related papers
- Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Despite Etherscan's 78,047,845 smart contracts deployed on (as of May 26, 2025), a mere 767,520 ( 1%) are open source.<n>This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode.<n>We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z) - Scam Detection for Ethereum Smart Contracts: Leveraging Graph Representation Learning for Secure Blockchain [1.2180334969164464]
This paper proposes to use graphical representation learning technology to find transaction patterns and distinguish malicious transaction contracts.<n>Our research opens up more possibilities for trust and security in the ecosystem.
arXiv Detail & Related papers (2024-12-16T21:56:01Z) - Dual-view Aware Smart Contract Vulnerability Detection for Ethereum [5.002702845720439]
We propose a Dual-view Aware Smart Contract Vulnerability Detection Framework named DVDet.
The framework initially converts the source code and bytecode of smart contracts into weighted graphs and control flow sequences.
Comprehensive experiments on the dataset show that our method outperforms others in detecting vulnerabilities.
arXiv Detail & Related papers (2024-06-29T06:47:51Z) - All Your Tokens are Belong to Us: Demystifying Address Verification Vulnerabilities in Solidity Smart Contracts [24.881450403784786]
Vulnerabilities in the process of address verification can lead to great security issues.
We design and implement AVVERIFIER, a lightweight taint analyzer based on static EVM opcode simulation.
After a large-scale evaluation of over 5 million smart contracts, we have identified 812 vulnerable smart contracts that were undisclosed by our community.
arXiv Detail & Related papers (2024-05-31T01:02:07Z) - Understanding crypter-as-a-service in a popular underground marketplace [51.328567400947435]
Crypters are pieces of software whose main goal is to transform a target binary so it can avoid detection from Anti Viruses (AVs) applications.
The crypter-as-a-service model has gained popularity, in response to the increased sophistication of detection mechanisms.
This paper provides the first study on an online underground market dedicated to crypter-as-a-service.
arXiv Detail & Related papers (2024-05-20T08:35:39Z) - Enhancing Smart Contract Security Analysis with Execution Property Graphs [48.31617821205042]
We introduce Clue, a dynamic analysis framework specifically designed for a runtime virtual machine.<n>Clue captures critical information during contract executions, employing a novel graph-based representation, the Execution Property Graph.<n> evaluation results reveal Clue's superior performance with high true positive rates and low false positive rates, outperforming state-of-the-art tools.
arXiv Detail & Related papers (2023-05-23T13:16:42Z) - Blockchain Large Language Models [65.7726590159576]
This paper presents a dynamic, real-time approach to detecting anomalous blockchain transactions.
The proposed tool, BlockGPT, generates tracing representations of blockchain activity and trains from scratch a large language model to act as a real-time Intrusion Detection System.
arXiv Detail & Related papers (2023-04-25T11:56:18Z) - TTAGN: Temporal Transaction Aggregation Graph Network for Ethereum
Phishing Scams Detection [11.20384152151594]
Existing phishing scams detection technology mostly uses machine learning or network representation learning to mine the key information from the transaction network to identify phishing addresses.
We propose a Temporal Transaction Aggregation Graph Network (TTAGN) to enhance phishing detection performance.
Our TTAGN (92.8% AUC, and 81.6% F1score) outperforms the state-of-the-art methods, and the effectiveness of temporal edges representation and edge2node module is also demonstrated.
arXiv Detail & Related papers (2022-04-28T12:17:00Z) - Towards Web Phishing Detection Limitations and Mitigation [21.738240693843295]
We show how phishing sites bypass Machine Learning-based detection.
Experiments with 100K phishing/benign sites show promising accuracy (98.8%)
We propose Anti-SubtlePhish, a more resilient model based on logistic regression.
arXiv Detail & Related papers (2022-04-03T04:26:04Z) - Smart Contract Vulnerability Detection: From Pure Neural Network to
Interpretable Graph Feature and Expert Pattern Fusion [48.744359070088166]
Conventional smart contract vulnerability detection methods heavily rely on fixed expert rules.
Recent deep learning approaches alleviate this issue but fail to encode useful expert knowledge.
We develop automatic tools to extract expert patterns from the source code.
We then cast the code into a semantic graph to extract deep graph features.
arXiv Detail & Related papers (2021-06-17T07:12:13Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep
Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.