COBRA: Interaction-Aware Bytecode-Level Vulnerability Detector for Smart Contracts
- URL: http://arxiv.org/abs/2410.20712v1
- Date: Mon, 28 Oct 2024 03:55:09 GMT
- Title: COBRA: Interaction-Aware Bytecode-Level Vulnerability Detector for Smart Contracts
- Authors: Wenkai Li, Xiaoqi Li, Zongwei Li, Yuqing Zhang
- Abstract summary: We propose COBRA, a framework that integrates semantic context and function interfaces to detect vulnerabilities in smart contracts.
To infer the function signatures that are not present in signature databases, we present SRIF, which automatically learns the rules of function signatures from the smart contract bytecodes.
Experimental results demonstrate that SRIF can achieve 94.76% F1-score for function signature inference.
- Score: 4.891180928768215
- Abstract: The detection of vulnerabilities in smart contracts remains a significant challenge. While numerous tools are available for analyzing smart contracts in source code, only about 1.79% of smart contracts on Ethereum are open-source. Most existing tools that target bytecode consider only the semantic logic context and disregard the function interface information in the bytecode. In this paper, we propose COBRA, a novel framework that integrates semantic context and function interfaces to detect vulnerabilities in smart contract bytecode. To the best of our knowledge, COBRA is the first framework that combines these two features. Moreover, to infer the function signatures that are not present in signature databases, we present SRIF (Signatures Reverse Inference from Functions), which automatically learns the rules of function signatures from smart contract bytecode. The bytecode associated with the function signatures is collected by constructing a control flow graph (CFG) for SRIF training. We optimize the semantic context using the operation code in static single assignment (SSA) format. Finally, we integrate the context and function interface representations in the latent space as the contract feature embedding. The contract features in the hidden space are decoded for vulnerability classification with a decoder and attention module. Experimental results demonstrate that SRIF achieves a 94.76% F1-score for function signature inference. Furthermore, when the ground-truth ABI exists, COBRA achieves a 93.45% F1-score for vulnerability classification. In the absence of an ABI, the inferred function features are fed to the encoder, and the system achieves an 89.46% recall rate.
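To make "function interface information in the bytecode" concrete, here is a minimal Python sketch that recovers 4-byte function selectors from the common `PUSH4 <selector> EQ` pattern that Solidity-style dispatchers emit. It is only an illustration of where interface hints live in bytecode, not COBRA's or SRIF's actual method; the helper name `extract_selectors` and the hand-assembled toy bytecode are assumptions made for this example.

```python
# Illustrative sketch only: NOT the COBRA/SRIF pipeline described in the abstract.
# EVM opcode values used below (PUSH1..PUSH32 = 0x60..0x7f, EQ = 0x14).
PUSH1, PUSH4, PUSH32, EQ = 0x60, 0x63, 0x7F, 0x14


def extract_selectors(bytecode_hex):
    """Return selectors that appear as `PUSH4 <sel>` immediately followed by EQ."""
    if bytecode_hex.startswith("0x"):
        bytecode_hex = bytecode_hex[2:]
    code = bytes.fromhex(bytecode_hex)
    selectors = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            # PUSHn carries n immediate bytes that must be skipped, otherwise
            # data bytes could be misread as opcodes.
            width = op - PUSH1 + 1
            imm = code[pc + 1 : pc + 1 + width]
            nxt = pc + 1 + width
            if op == PUSH4 and len(imm) == 4 and nxt < len(code) and code[nxt] == EQ:
                sel = "0x" + imm.hex()
                if sel not in selectors:
                    selectors.append(sel)
            pc = nxt
        else:
            pc += 1
    return selectors


# Hypothetical hand-assembled dispatcher fragment (not taken from a real contract):
#   DUP1 PUSH4 a9059cbb EQ PUSH2 0010 JUMPI
#   DUP1 PUSH4 70a08231 EQ PUSH2 0020 JUMPI
toy = "8063a9059cbb1461001057" + "806370a082311461002057"
print(extract_selectors(toy))
```

The two selectors in the toy input are the well-known ones for `transfer(address,uint256)` and `balanceOf(address)`. A selector alone does not reveal parameter types when it is missing from a signature database, which is the gap the abstract says SRIF is meant to fill.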
Related papers
- Identifying Smart Contract Security Issues in Code Snippets from Stack Overflow [34.79673982473015]
We introduce SOChecker, a tool to identify potential vulnerabilities in incomplete SO smart contract code snippets.
Results show that SOChecker achieves an F1 score of 68.2%, greatly surpassing GPT-3.5 and GPT-4.
Our findings underscore the need to improve the security of code snippets from Q&A websites.
arXiv Detail & Related papers (2024-07-18T08:25:16Z)
- Effective Targeted Testing of Smart Contracts [0.0]
Since smart contracts are immutable, their bugs cannot be fixed, which may lead to significant monetary losses.
Our framework, Griffin, tackles this deficiency by employing a targeted symbolic execution technique for generating test data.
This paper discusses how smart contracts differ from legacy software in targeted symbolic execution and how these differences can affect the tool structure.
arXiv Detail & Related papers (2024-07-05T04:38:11Z)
- FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.
We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language.
We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
- SigRec: Automatic Recovery of Function Signatures in Smart Contracts [40.20115707680234]
It is challenging to recover function signatures from contract bytecode, since neither debug information nor type information is present in the bytecode.
We develop SigRec, a new tool for recovering function signatures from contract bytecode without needing source code or function signature databases.
arXiv Detail & Related papers (2023-05-11T18:03:39Z)
- SimCLF: A Simple Contrastive Learning Framework for Function-level Binary Embeddings [2.1222884030559315]
We propose SimCLF: A Simple Contrastive Learning Framework for Function-level Binary Embeddings.
We take an unsupervised learning approach and formulate binary code similarity detection as instance discrimination.
SimCLF directly operates on disassembled binary functions and could be implemented with any encoder.
arXiv Detail & Related papers (2022-09-06T12:09:45Z)
- Enhancing Semantic Code Search with Multimodal Contrastive Learning and Soft Data Augmentation [50.14232079160476]
We propose a new approach with multimodal contrastive learning and soft data augmentation for code search.
We conduct extensive experiments to evaluate the effectiveness of our approach on a large-scale dataset with six programming languages.
arXiv Detail & Related papers (2022-04-07T08:49:27Z)
- HCC: A Language-Independent Hardening Contract Compiler for Smart Contracts [5.379572824182189]
We propose the first practical smart contract compiler, called HCC.
HCC inserts security hardening checks at the source-code level based on a novel and language-independent code property graph (CPG) notation.
arXiv Detail & Related papers (2022-03-01T11:25:32Z)
- ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
- Contrastive Code Representation Learning [95.86686147053958]
We show that the popular reconstruction-based BERT model is sensitive to source code edits, even when the edits preserve semantics.
We propose ContraCode: a contrastive pre-training task that learns code functionality, not form.
arXiv Detail & Related papers (2020-07-09T17:59:06Z)
- Robust Encodings: A Framework for Combating Adversarial Typos [85.70270979772388]
NLP systems are easily fooled by small perturbations of inputs.
Existing procedures to defend against such perturbations provide guaranteed robustness to worst-case attacks.
We introduce robust encodings (RobEn) that confer guaranteed robustness without making compromises on model architecture.
arXiv Detail & Related papers (2020-05-04T01:28:18Z)