Related papers: SC-Bench: A Large-Scale Dataset for Smart Contract Auditing

URL: http://arxiv.org/abs/2410.06176v1
Date: Tue, 8 Oct 2024 16:23:50 GMT
Title: SC-Bench: A Large-Scale Dataset for Smart Contract Auditing
Authors: Shihao Xia, Mengting He, Linhai Song, Yiying Zhang
Abstract summary: We present SC-Bench, the first dataset for automated smart-contract auditing research. SC-Bench consists of 5,377 real-world smart contracts and 15,975 violations of Ethereum standards called ERCs. We evaluate SC-Bench using GPT-4 by prompting it with both the contracts and ERC rules. Our results show that without the oracle, GPT-4 detects only 0.9% of the violations, and with the oracle, it detects 22.9% of them.
Score: 5.787866021952808
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: There is a huge demand to ensure that smart contracts listed on blockchain platforms comply with safety and economic standards. Today, manual auditing is commonly used to achieve this goal. ML-based automated techniques promise to reduce human effort and the resulting monetary costs. However, unlike other domains where ML techniques have had huge successes, no systematic ML techniques have been proposed or applied to smart contract auditing. We present SC-Bench, the first dataset for automated smart-contract auditing research. SC-Bench consists of 5,377 real-world smart contracts running on Ethereum, a widely used blockchain platform, and 15,975 violations of Ethereum standards called ERCs. Of these violations, 139 are real violations made by programmers. The rest are errors we systematically injected to reflect violations of different ERC rules. We evaluate SC-Bench using GPT-4 by prompting it with both the contracts and ERC rules. In addition, we manually identify each violated rule and the corresponding code site (i.e., the oracle) and prompt GPT-4 with this information in the form of a True-or-False question. Our results show that without the oracle, GPT-4 detects only 0.9% of the violations, and with the oracle, it detects 22.9% of them. These results show the room for improvement in ML-based techniques for smart-contract auditing.

Related papers

Process Reward Models That Think [86.88809596842428]
Step-by-step verifiers -- also known as process reward models (PRMs) -- are a key ingredient for test-time scaling. This work aims to build data-efficient PRMs as verbalized step-wise reward models that verify every step in the solution by generating a verification chain-of-thought (CoT). We propose ThinkPRM, a long CoT verifier fine-tuned on orders of magnitude fewer process labels than those required by discriminative PRMs.
arXiv Detail & Related papers (2025-04-23T15:44:54Z)
Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and Directions [51.43521977132062]
Money laundering is a financial crime that obscures the origin of illicit funds. The proliferation of mobile payment platforms and smart IoT devices has significantly complicated anti-money laundering investigations. This paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML.
arXiv Detail & Related papers (2025-03-13T05:19:44Z)
SmartLLM: Smart Contract Auditing using Custom Generative AI [0.0]
This paper introduces SmartLLM, a novel approach leveraging fine-tuned LLaMA 3.1 models with Retrieval-Augmented Generation (RAG). By integrating domain-specific knowledge from ERC standards, SmartLLM achieves superior performance compared to static analysis tools like Mythril and Slither. Experimental results demonstrate a perfect recall of 100% and an accuracy score of 70%, highlighting the model's robustness in identifying vulnerabilities.
arXiv Detail & Related papers (2025-02-17T06:22:05Z)
SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models [8.697080709545352]
This paper introduces SymGPT, a tool that combines the natural language understanding of large language models (LLMs) with the formal guarantees of symbolic execution. We conduct an empirical study of 132 ERC rules from three widely used ERC standards, examining their content, security implications, and natural language descriptions. We then synthesize constraints from the formalized rules to represent scenarios where violations may occur and use symbolic execution to detect them. Our evaluation shows that SymGPT identifies 5,783 ERC rule violations in 4,000 real-world contracts, including 1,375 violations with clear attack paths for stealing financial assets.
arXiv Detail & Related papers (2025-02-11T15:34:00Z)
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors [64.9938658716425]
Existing evaluations of large language models' (LLMs) ability to recognize and reject unsafe user requests face three limitations. First, existing methods often use coarse-grained taxonomies of unsafe topics and over-represent some fine-grained topics. Second, linguistic characteristics and formatting of prompts, such as different languages and dialects, are often overlooked and only implicitly considered in many evaluations. Third, existing evaluations rely on large LLMs for evaluation, which can be expensive.
arXiv Detail & Related papers (2024-06-20T17:56:07Z)
All Your Tokens are Belong to Us: Demystifying Address Verification Vulnerabilities in Solidity Smart Contracts [24.881450403784786]
Vulnerabilities in the process of address verification can lead to great security issues. We design and implement AVVERIFIER, a lightweight taint analyzer based on static EVM opcode simulation. After a large-scale evaluation of over 5 million smart contracts, we have identified 812 vulnerable smart contracts that were undisclosed by our community.
arXiv Detail & Related papers (2024-05-31T01:02:07Z)
Automated Attack Synthesis for Constant Product Market Makers [7.276711049655224]
Composability bugs are issues that lead to erroneous behaviors when multiple smart contracts operate together. CPMM-Exploiter automatically detects and generates end-to-end exploits for composability bugs. It successfully generated 18 new exploits, which can result in 12.9K USD profit in total.
arXiv Detail & Related papers (2024-04-08T08:35:15Z)
AuditGPT: Auditing Smart Contracts with ChatGPT [8.697080709545352]
AuditGPT is a tool to automatically and comprehensively verify ERC rules against smart contracts. It pinpoints 418 ERC rule violations and reports only 18 false positives, showcasing its effectiveness and accuracy.
arXiv Detail & Related papers (2024-04-05T07:19:13Z)
Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications [18.138452572457552]
iAudit is a framework for intuitive smart contract auditing with justifications. On a dataset of 263 real smart contract vulnerabilities, iAudit achieves an F1 score of 91.21% and an accuracy of 91.11%.
arXiv Detail & Related papers (2024-03-24T09:26:53Z)
Vulnerability Scanners for Ethereum Smart Contracts: A Large-Scale Study [44.25093111430751]
In 2023 alone, such vulnerabilities led to substantial financial losses exceeding a billion US dollars. Various tools have been developed to detect and mitigate vulnerabilities in smart contracts. This study investigates the gap between the effectiveness of existing security scanners and the vulnerabilities that still persist in practice.
arXiv Detail & Related papers (2023-12-27T11:26:26Z)
Blockchain Large Language Models [65.7726590159576]
This paper presents a dynamic, real-time approach to detecting anomalous blockchain transactions. The proposed tool, BlockGPT, generates tracing representations of blockchain activity and trains from scratch a large language model to act as a real-time Intrusion Detection System.
arXiv Detail & Related papers (2023-04-25T11:56:18Z)
Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms are tight, but they only give tight estimates under implausible worst-case assumptions. We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable. We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts. We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection [184.563345153682]
We develop an instance-aware and context-focused unified framework for weakly supervised learning. It employs an instance-aware self-training algorithm and a learnable Concrete DropBlock while devising a memory-efficient sequential batch back-propagation. Our proposed method achieves state-of-the-art results on COCO (12.1% AP, 24.8% AP50), VOC 2007 (54.9% AP), and VOC 2012 (52.1% AP).
arXiv Detail & Related papers (2020-04-09T17:57:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.