Related papers: SmartBugBert: BERT-Enhanced Vulnerability Detection for Smart Contract Bytecode

SmartBugBert: BERT-Enhanced Vulnerability Detection for Smart Contract Bytecode

URL: http://arxiv.org/abs/2504.05002v1
Date: Mon, 07 Apr 2025 12:30:12 GMT
Title: SmartBugBert: BERT-Enhanced Vulnerability Detection for Smart Contract Bytecode
Authors: Jiuyang Bu, Wenkai Li, Zongwei Li, Zeng Zhang, Xiaoqi Li,
Abstract summary: This paper introduces SmartBugBert, a novel approach that combines BERT-based deep learning with control flow graph (CFG) analysis to detect vulnerabilities directly from bytecode.<n>Our method first decompiles smart contract bytecode into optimized opcode sequences, extracts semantic features using TF-IDF, constructs control flow graphs to capture execution logic, and isolates vulnerable CFG fragments for targeted analysis.
Score: 0.7018579932647147
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Smart contracts deployed on blockchain platforms are vulnerable to various security vulnerabilities. However, only a small number of Ethereum contracts have released their source code, so vulnerability detection at the bytecode level is crucial. This paper introduces SmartBugBert, a novel approach that combines BERT-based deep learning with control flow graph (CFG) analysis to detect vulnerabilities directly from bytecode. Our method first decompiles smart contract bytecode into optimized opcode sequences, extracts semantic features using TF-IDF, constructs control flow graphs to capture execution logic, and isolates vulnerable CFG fragments for targeted analysis. By integrating both semantic and structural information through a fine-tuned BERT model and LightGBM classifier, our approach effectively identifies four critical vulnerability types: transaction-ordering, access control, self-destruct, and timestamp dependency vulnerabilities. Experimental evaluation on 6,157 Ethereum smart contracts demonstrates that SmartBugBert achieves 90.62% precision, 91.76% recall, and 91.19% F1-score, significantly outperforming existing detection methods. Ablation studies confirm that the combination of semantic features with CFG information substantially enhances detection performance. Furthermore, our approach maintains efficient detection speed (0.14 seconds per contract), making it practical for large-scale vulnerability assessment.

Related papers

Combining GPT and Code-Based Similarity Checking for Effective Smart Contract Vulnerability Detection [0.0]
We present SimilarGPT, a vulnerability identification tool for smart contract.<n>The main concept of SimilarGPT is to measure the similarity between the code under inspection and the secure code from third-party libraries.<n>We propose optimizing the detection sequence using topological ordering to enhance logical coherence and reduce false positives during detection.
arXiv Detail & Related papers (2024-12-24T07:15:48Z)
CryptoFormalEval: Integrating LLMs and Formal Verification for Automated Cryptographic Protocol Vulnerability Detection [41.94295877935867]
We introduce a benchmark to assess the ability of Large Language Models to autonomously identify vulnerabilities in new cryptographic protocols. We created a dataset of novel, flawed, communication protocols and designed a method to automatically verify the vulnerabilities found by the AI agents.
arXiv Detail & Related papers (2024-11-20T14:16:55Z)
GNN-Based Code Annotation Logic for Establishing Security Boundaries in C Code [41.10157750103835]
Securing sensitive operations in today's interconnected software landscape is crucial yet challenging. Modern platforms rely on Trusted Execution Environments (TEEs) to isolate security sensitive code from the main system. Code Logic (CAL) is a pioneering tool that automatically identifies security sensitive components for TEE isolation.
arXiv Detail & Related papers (2024-11-18T13:40:03Z)
COBRA: Interaction-Aware Bytecode-Level Vulnerability Detector for Smart Contracts [4.891180928768215]
We propose COBRA, a framework that integrates semantic context and function interfaces to detect vulnerabilities in smart contracts. To infer the function signatures that are not present in signature databases, we present SRIF, which automatically learns the rules of function signatures from the smart contract bytecodes. Experimental results demonstrate that SRIF can achieve 94.76% F1-score for function signature inference.
arXiv Detail & Related papers (2024-10-28T03:55:09Z)
Vulnerability Scanners for Ethereum Smart Contracts: A Large-Scale Study [44.25093111430751]
In 2023 alone, such vulnerabilities led to substantial financial losses exceeding a billion of US dollars. Various tools have been developed to detect and mitigate vulnerabilities in smart contracts. This study investigates the gap between the effectiveness of existing security scanners and the vulnerabilities that still persist in practice.
arXiv Detail & Related papers (2023-12-27T11:26:26Z)
Blockchain Smart Contract Threat Detection Technology Based on Symbolic Execution [0.0]
Reentrancy vulnerability, which is hidden and complex, poses a great threat to smart contracts. In this paper, we propose a smart contract threat detection technology based on symbolic execution. The experimental results show that this method significantly increases both detection efficiency and accuracy.
arXiv Detail & Related papers (2023-12-24T03:27:03Z)
VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model [0.0]
VulnSense is a comprehensive approach to efficiently detect vulnerabilities in smart contracts. Our framework combines three types of features from smart contracts including source code, opcode sequences, and control flow graph. We employ Bidirectional Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM) and Graph Neural Network (GNN) models to extract and analyze these features. The experimental outcomes demonstrate the superior performance of our proposed approach, achieving an average accuracy of 77.96% across all three categories of vulnerable smart contracts.
arXiv Detail & Related papers (2023-09-15T15:26:44Z)
Enhancing Smart Contract Security Analysis with Execution Property Graphs [48.31617821205042]
We introduce Clue, a dynamic analysis framework specifically designed for a runtime virtual machine. Clue captures critical information during contract executions, employing a novel graph-based representation, the Execution Property Graph. evaluation results reveal Clue's superior performance with high true positive rates and low false positive rates, outperforming state-of-the-art tools.
arXiv Detail & Related papers (2023-05-23T13:16:42Z)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph. VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable. We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts. We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
Robust Encodings: A Framework for Combating Adversarial Typos [85.70270979772388]
NLP systems are easily fooled by small perturbations of inputs. Existing procedures to defend against such perturbations provide guaranteed robustness to worst-case attacks. We introduce robust encodings (RobEn) that confer guaranteed robustness without making compromises on model architecture.
arXiv Detail & Related papers (2020-05-04T01:28:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.