SigRec: Automatic Recovery of Function Signatures in Smart Contracts
- URL: http://arxiv.org/abs/2305.07067v1
- Date: Thu, 11 May 2023 18:03:39 GMT
- Title: SigRec: Automatic Recovery of Function Signatures in Smart Contracts
- Authors: Ting Chen, Zihao Li, Xiapu Luo, Xiaofeng Wang, Ting Wang, Zheyuan He, Kezhao Fang, Yufei Zhang, Hang Zhu, Hongwei Li, Yan Cheng, Xiaosong Zhang
- Abstract summary: It is challenging to recover function signatures from contract bytecode, since neither debug information nor type information is present in the bytecode.
We develop SigRec, a new tool for recovering function signatures from contract bytecode without the need of source code and function signature databases.
- Score: 40.20115707680234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Millions of smart contracts have been deployed onto Ethereum for providing
various services, whose functions can be invoked. For this purpose, the caller
needs to know the function signature of a callee, which includes its function
id and parameter types. Such signatures are critical to many applications
focusing on smart contracts, e.g., reverse engineering, fuzzing, attack
detection, and profiling. Unfortunately, it is challenging to recover the
function signatures from contract bytecode, since neither debug information nor
type information is present in the bytecode. To address this issue, prior
approaches rely on source code, or a collection of known signatures from
incomplete databases or incomplete heuristic rules, which, however, are far
from adequate and cannot cope with the rapid growth of new contracts. In this
paper, we propose a novel solution that leverages how functions are handled by
the Ethereum virtual machine (EVM) to automatically recover function signatures. In
particular, we exploit how smart contracts determine the functions to be
invoked to locate and extract function ids, and propose a new approach named
type-aware symbolic execution (TASE) that utilizes the semantics of EVM
operations on parameters to identify the number and the types of parameters.
Moreover, we develop SigRec, a new tool for recovering function signatures from
contract bytecode without the need of source code and function signature
databases. The extensive experimental results show that SigRec outperforms all
existing tools, achieving an unprecedented 98.7 percent accuracy within 0.074
seconds. We further demonstrate that the recovered function signatures are
useful in attack detection, fuzzing and reverse engineering of EVM bytecode.
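
The abstract says that function ids are located by exploiting how a contract decides which function to invoke. A minimal sketch of that idea follows, assuming the common compiler-generated dispatcher shape in which each 4-byte function id is pushed with `PUSH4` and immediately compared with `EQ`. This is a simplified illustration, not SigRec's implementation: the names `disassemble`, `extract_function_ids`, and `SAMPLE` are hypothetical, the sample bytecode is hand-written, and parameter-type recovery (TASE) is not attempted.

```python
from typing import Iterator, List, Optional, Tuple

# EVM opcode values used by this sketch.
PUSH1, PUSH4, PUSH32 = 0x60, 0x63, 0x7F
EQ = 0x14


def disassemble(code: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (offset, opcode, immediate) triples, skipping over PUSH data."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSH1..PUSH32 carry 1..32 immediate bytes; other opcodes carry none.
        size = op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0
        yield pc, op, code[pc + 1 : pc + 1 + size]
        pc += 1 + size


def extract_function_ids(code: bytes) -> List[str]:
    """Collect 4-byte constants that are pushed and then compared with EQ.

    Assumption: the dispatcher uses the `PUSH4 <id> EQ` pattern. Real
    contracts may use other layouts, which this heuristic would miss.
    """
    ids: List[str] = []
    prev: Optional[Tuple[int, bytes]] = None
    for _, op, imm in disassemble(code):
        if op == EQ and prev is not None and prev[0] == PUSH4 and len(prev[1]) == 4:
            fid = "0x" + prev[1].hex()
            if fid not in ids:
                ids.append(fid)
        prev = (op, imm)
    return ids


# Hand-written dispatcher fragment (illustrative only):
#   DUP1 PUSH4 a9059cbb EQ PUSH2 0040 JUMPI
#   DUP1 PUSH4 70a08231 EQ PUSH2 0080 JUMPI
#   STOP
SAMPLE = bytes.fromhex(
    "8063a9059cbb1461004057"
    "806370a082311461008057"
    "00"
)

if __name__ == "__main__":
    print(extract_function_ids(SAMPLE))
```

The two ids in the sample are the widely used selectors for `transfer(address,uint256)` and `balanceOf(address)`. Skipping PUSH immediates during disassembly matters here, because data bytes such as the `0x80` inside the jump target `0x0080` would otherwise be misread as opcodes.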
Related papers
- Automating Comment Generation for Smart Contract from Bytecode [11.143538294203026]
In practice, only 13% of smart contracts deployed on the blockchain are associated with source code.
We propose SmartBT (Smart contract Bytecode Translator) for automatically translating smart contract bytecode into fine-grained natural language description.
arXiv Detail & Related papers (2025-03-19T14:45:40Z)
- SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models.
We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts.
Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z)
- ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.
This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
- COBRA: Interaction-Aware Bytecode-Level Vulnerability Detector for Smart Contracts [4.891180928768215]
We propose COBRA, a framework that integrates semantic context and function interfaces to detect vulnerabilities in smart contracts.
To infer the function signatures that are not present in signature databases, we present SRIF, which automatically learns the rules of function signatures from the smart contract bytecodes.
Experimental results demonstrate that SRIF can achieve 94.76% F1-score for function signature inference.
arXiv Detail & Related papers (2024-10-28T03:55:09Z)
- Functional Adaptor Signatures: Beyond All-or-Nothing Blockchain-based Payments [7.8925011858865695]
We propose functional adaptor signatures (FAS), a cryptographic primitive and show how it can be used to enable functional sales.
We formalize the security properties of FAS, among which is a new notion called witness privacy to capture seller's privacy.
We present multiple variants of witness privacy, namely, witness hiding, witness indistinguishability, and zero-knowledge.
arXiv Detail & Related papers (2024-10-14T23:17:03Z)
- Identifying Smart Contract Security Issues in Code Snippets from Stack Overflow [34.79673982473015]
We introduce SOChecker, a tool to identify potential vulnerabilities in incomplete SO smart contract code snippets.
Results show that SOChecker achieves an F1 score of 68.2%, greatly surpassing GPT-3.5 and GPT-4.
Our findings underscore the need to improve the security of code snippets from Q&A websites.
arXiv Detail & Related papers (2024-07-18T08:25:16Z)
- Effective Targeted Testing of Smart Contracts [0.0]
Since smart contracts are immutable, their bugs cannot be fixed, which may lead to significant monetary losses.
Our framework, Griffin, tackles this deficiency by employing a targeted symbolic execution technique for generating test data.
This paper discusses how smart contracts differ from legacy software in targeted symbolic execution and how these differences can affect the tool structure.
arXiv Detail & Related papers (2024-07-05T04:38:11Z)
- FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.
We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language.
We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
- Specification Mining for Smart Contracts with Trace Slicing and Predicate Abstraction [10.723903783651537]
We propose a specification mining approach to infer contract specifications from past transaction histories.
Our approach derives high-level behavioral automata of function invocations, accompanied by program invariants statistically inferred from the transaction histories.
arXiv Detail & Related papers (2024-03-20T03:39:51Z)
- CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics.
We conduct extensive experiments on 7 Large Language Models, achieving state-of-the-art average Attack Success Rate (ASR).
Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
- ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
- Contrastive Code Representation Learning [95.86686147053958]
We show that the popular reconstruction-based BERT model is sensitive to source code edits, even when the edits preserve semantics.
We propose ContraCode: a contrastive pre-training task that learns code functionality, not form.
arXiv Detail & Related papers (2020-07-09T17:59:06Z)