Related papers: SigRec: Automatic Recovery of Function Signatures in Smart Contracts

SigRec: Automatic Recovery of Function Signatures in Smart Contracts

URL: http://arxiv.org/abs/2305.07067v1
Date: Thu, 11 May 2023 18:03:39 GMT
Title: SigRec: Automatic Recovery of Function Signatures in Smart Contracts
Authors: Ting Chen, Zihao Li, Xiapu Luo, Xiaofeng Wang, Ting Wang, Zheyuan He, Kezhao Fang, Yufei Zhang, Hang Zhu, Hongwei Li, Yan Cheng, Xiaosong Zhang
Abstract summary: It is challenging to recover function signatures from contract bytecode, since neither debug information nor type information is present in the bytecode. We develop SigRec, a new tool for recovering function signatures from contract bytecode without the need of source code and function signature databases.
Score: 40.20115707680234
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Millions of smart contracts have been deployed onto Ethereum for providing various services, whose functions can be invoked. For this purpose, the caller needs to know the function signature of a callee, which includes its function id and parameter types. Such signatures are critical to many applications focusing on smart contracts, e.g., reverse engineering, fuzzing, attack detection, and profiling. Unfortunately, it is challenging to recover the function signatures from contract bytecode, since neither debug information nor type information is present in the bytecode. To address this issue, prior approaches rely on source code, or a collection of known signatures from incomplete databases or incomplete heuristic rules, which, however, are far from adequate and cannot cope with the rapid growth of new contracts. In this paper, we propose a novel solution that leverages how functions are handled by Ethereum virtual machine (EVM) to automatically recover function signatures. In particular, we exploit how smart contracts determine the functions to be invoked to locate and extract function ids, and propose a new approach named type-aware symbolic execution (TASE) that utilizes the semantics of EVM operations on parameters to identify the number and the types of parameters. Moreover, we develop SigRec, a new tool for recovering function signatures from contract bytecode without the need of source code and function signature databases. The extensive experimental results show that SigRec outperforms all existing tools, achieving an unprecedented 98.7 percent accuracy within 0.074 seconds. We further demonstrate that the recovered function signatures are useful in attack detection, fuzzing and reverse engineering of EVM bytecode.

Related papers

Automating Comment Generation for Smart Contract from Bytecode [11.143538294203026]
In practice, only 13% of smart contracts deployed on the component to the blockchain are associated with source code. We propose SmartBT (Smart contract Bytecode Translator) for automatically translating smart contract bytecode into fine-grained natural language description.
arXiv Detail & Related papers (2025-03-19T14:45:40Z)
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models. We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts. Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z)
ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
COBRA: Interaction-Aware Bytecode-Level Vulnerability Detector for Smart Contracts [4.891180928768215]
We propose COBRA, a framework that integrates semantic context and function interfaces to detect vulnerabilities in smart contracts. To infer the function signatures that are not present in signature databases, we present SRIF, which automatically learns the rules of function signatures from the smart contract bytecodes. Experimental results demonstrate that SRIF can achieve 94.76% F1-score for function signature inference.
arXiv Detail & Related papers (2024-10-28T03:55:09Z)
Functional Adaptor Signatures: Beyond All-or-Nothing Blockchain-based Payments [7.8925011858865695]
We propose functional adaptor signatures (FAS), a cryptographic primitive and show how it can be used to enable functional sales. We formalize the security properties of FAS, among which is a new notion called witness privacy to capture seller's privacy. We present multiple variants of witness privacy, namely, witness hiding, witness indistinguishability, and zero-knowledge.
arXiv Detail & Related papers (2024-10-14T23:17:03Z)
Identifying Smart Contract Security Issues in Code Snippets from Stack Overflow [34.79673982473015]
We introduce SOChecker, a tool to identify potential vulnerabilities in incomplete SO smart contract code snippets. Results show that SOChecker achieves an F1 score of 68.2%, greatly surpassing GPT-3.5 and GPT-4. Our findings underscore the need to improve the security of code snippets from Q&A websites.
arXiv Detail & Related papers (2024-07-18T08:25:16Z)
Effective Targeted Testing of Smart Contracts [0.0]
Since smart contracts are immutable, their bugs cannot be fixed, which may lead to significant monetary losses. Our framework, Griffin, tackles this deficiency by employing a targeted symbolic execution technique for generating test data. This paper discusses how smart contracts differ from legacy software in targeted symbolic execution and how these differences can affect the tool structure.
arXiv Detail & Related papers (2024-07-05T04:38:11Z)
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
Specification Mining for Smart Contracts with Trace Slicing and Predicate Abstraction [10.723903783651537]
We propose a specification mining approach to infer contract specifications from past transactionhistories. Our approach derives high-level behavioral automata of function invocations, accompanied byprogram invariants statistically inferred from the transaction histories.
arXiv Detail & Related papers (2024-03-20T03:39:51Z)
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models [49.60006012946767]
We propose CodeChameleon, a novel jailbreak framework based on personalized encryption tactics. We conduct extensive experiments on 7 Large Language Models, achieving state-of-the-art average Attack Success Rate (ASR) Remarkably, our method achieves an 86.6% ASR on GPT-4-1106.
arXiv Detail & Related papers (2024-02-26T16:35:59Z)
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process. It incorporates a similarity-based retriever and a pre-trained code language model. It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable. We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts. We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
Contrastive Code Representation Learning [95.86686147053958]
We show that the popular reconstruction-based BERT model is sensitive to source code edits, even when the edits preserve semantics. We propose ContraCode: a contrastive pre-training task that learns code functionality, not form.
arXiv Detail & Related papers (2020-07-09T17:59:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.