MultiCFV: Detecting Control Flow Vulnerabilities in Smart Contracts Leveraging Multimodal Deep Learning
- URL: http://arxiv.org/abs/2508.01346v1
- Date: Sat, 02 Aug 2025 12:49:41 GMT
- Title: MultiCFV: Detecting Control Flow Vulnerabilities in Smart Contracts Leveraging Multimodal Deep Learning
- Authors: Hongli Peng, Xiaoqi Li, Wenkai Li,
- Abstract summary: Many smart contracts have been found to contain vulnerabilities and errors, leading to the loss of assets within the blockchain.<n>This paper proposes a multimodal deep learning approach, MultiCFV, which is designed specifically to analyze and detect erroneous control flow vulnerability.<n> Experimental results demonstrate our method effectively combines structural, syntactic, and semantic information, improving the accuracy of smart contract vulnerability detection and clone detection.
- Score: 0.8225825738565354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The introduction of smart contract functionality marks the advent of the blockchain 2.0 era, enabling blockchain technology to support digital currency transactions and complex distributed applications. However, many smart contracts have been found to contain vulnerabilities and errors, leading to the loss of assets within the blockchain. Despite a range of tools that have been developed to identify vulnerabilities in smart contracts at the source code or bytecode level, most rely on a single modality, reducing performance, accuracy, and limited generalization capabilities. This paper proposes a multimodal deep learning approach, MultiCFV, which is designed specifically to analyze and detect erroneous control flow vulnerability, as well as identify code clones in smart contracts. Bytecode is generated from source code to construct control flow graphs, with graph embedding techniques extracting graph features. Abstract syntax trees are used to obtain syntax features, while code comments capture key commentary words and comment features. These three feature vectors are fused to create a database for code inspection, which is used to detect similar code and identify contract vulnerabilities. Experimental results demonstrate our method effectively combines structural, syntactic, and semantic information, improving the accuracy of smart contract vulnerability detection and clone detection.
Related papers
- Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Despite Etherscan's 78,047,845 smart contracts deployed on (as of May 26, 2025), a mere 767,520 ( 1%) are open source.<n>This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode.<n>We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z) - Is Compression Really Linear with Code Intelligence? [60.123628177110206]
textitFormat Annealing is a lightweight, transparent training methodology designed to assess the intrinsic capabilities of pre-trained models equitably.<n>Our empirical results reveal a fundamental logarithmic relationship between measured code intelligence and bits-per-character (BPC)<n>Our work provides a more nuanced understanding of compression's role in developing code intelligence and contributes a robust evaluation framework in the code domain.
arXiv Detail & Related papers (2025-05-16T16:59:14Z) - Malicious Code Detection in Smart Contracts via Opcode Vectorization [0.8225825738565354]
Security problems of smart contracts become increasingly prominent.<n>The existence of malicious codes may lead to the loss of user assets and system crash.<n>In this paper, a simple study is carried out on malicious code detection of intelligent contracts based on machine learning.
arXiv Detail & Related papers (2025-04-17T07:51:48Z) - SmartBugBert: BERT-Enhanced Vulnerability Detection for Smart Contract Bytecode [0.7018579932647147]
This paper introduces SmartBugBert, a novel approach that combines BERT-based deep learning with control flow graph (CFG) analysis to detect vulnerabilities directly from bytecode.<n>Our method first decompiles smart contract bytecode into optimized opcode sequences, extracts semantic features using TF-IDF, constructs control flow graphs to capture execution logic, and isolates vulnerable CFG fragments for targeted analysis.
arXiv Detail & Related papers (2025-04-07T12:30:12Z) - Combining GPT and Code-Based Similarity Checking for Effective Smart Contract Vulnerability Detection [0.0]
We present SimilarGPT, a vulnerability identification tool for smart contract.<n>The main concept of SimilarGPT is to measure the similarity between the code under inspection and the secure code from third-party libraries.<n>We propose optimizing the detection sequence using topological ordering to enhance logical coherence and reduce false positives during detection.
arXiv Detail & Related papers (2024-12-24T07:15:48Z) - Blockchain Smart Contract Threat Detection Technology Based on Symbolic
Execution [0.0]
Reentrancy vulnerability, which is hidden and complex, poses a great threat to smart contracts.
In this paper, we propose a smart contract threat detection technology based on symbolic execution.
The experimental results show that this method significantly increases both detection efficiency and accuracy.
arXiv Detail & Related papers (2023-12-24T03:27:03Z) - VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts
by Multimodal Learning with Graph Neural Network and Language Model [0.0]
VulnSense is a comprehensive approach to efficiently detect vulnerabilities in smart contracts.
Our framework combines three types of features from smart contracts including source code, opcode sequences, and control flow graph.
We employ Bidirectional Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM) and Graph Neural Network (GNN) models to extract and analyze these features.
The experimental outcomes demonstrate the superior performance of our proposed approach, achieving an average accuracy of 77.96% across all three categories of vulnerable smart contracts.
arXiv Detail & Related papers (2023-09-15T15:26:44Z) - Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - Smart Contract Vulnerability Detection: From Pure Neural Network to
Interpretable Graph Feature and Expert Pattern Fusion [48.744359070088166]
Conventional smart contract vulnerability detection methods heavily rely on fixed expert rules.
Recent deep learning approaches alleviate this issue but fail to encode useful expert knowledge.
We develop automatic tools to extract expert patterns from the source code.
We then cast the code into a semantic graph to extract deep graph features.
arXiv Detail & Related papers (2021-06-17T07:12:13Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep
Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z) - A Transformer-based Approach for Source Code Summarization [86.08359401867577]
We learn code representation for summarization by modeling the pairwise relationship between code tokens.
We show that despite the approach is simple, it outperforms the state-of-the-art techniques by a significant margin.
arXiv Detail & Related papers (2020-05-01T23:29:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.