AuditGPT: Auditing Smart Contracts with ChatGPT
- URL: http://arxiv.org/abs/2404.04306v1
- Date: Fri, 5 Apr 2024 07:19:13 GMT
- Title: AuditGPT: Auditing Smart Contracts with ChatGPT
- Authors: Shihao Xia, Shuai Shao, Mengting He, Tingting Yu, Linhai Song, Yiying Zhang
- Abstract summary: AuditGPT is a tool to automatically and comprehensively verify ERC rules against smart contracts.
It pinpoints 418 ERC rule violations and reports only 18 false positives, showcasing its effectiveness and accuracy.
- Score: 8.697080709545352
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: To govern smart contracts running on Ethereum, multiple Ethereum Request for Comment (ERC) standards have been developed, each containing a set of rules to guide the behaviors of smart contracts. Violating the ERC rules could cause serious security issues and financial loss, signifying the importance of verifying that smart contracts follow ERCs. Today's practices of such verification are to either manually audit each individual contract or use expert-developed, limited-scope program-analysis tools, both of which are far from effective in identifying ERC rule violations. This paper presents a tool named AuditGPT that leverages large language models (LLMs) to automatically and comprehensively verify ERC rules against smart contracts. To build AuditGPT, we first conduct an empirical study on 222 ERC rules specified in four popular ERCs to understand their content, their security impacts, their specification in natural language, and their implementation in Solidity. Guided by the study, we construct AuditGPT by separating the large, complex auditing process into small, manageable tasks and designing prompts specialized for each ERC rule type to enhance LLMs' auditing performance. In the evaluation, AuditGPT successfully pinpoints 418 ERC rule violations and reports only 18 false positives, showcasing its effectiveness and accuracy. Moreover, AuditGPT outperforms an auditing service provided by security experts in effectiveness, accuracy, and cost, demonstrating its advancement over state-of-the-art smart-contract auditing practices.
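The abstract's central design idea, breaking the audit into one small task per ERC rule and giving each rule type its own specialized prompt, can be illustrated with a short sketch. The code below is not the authors' implementation: the rule categories, prompt templates, and the `call_llm` helper are assumptions made purely for illustration.

```python
# Minimal sketch (not the authors' code) of the per-rule auditing workflow described
# in the abstract: each ERC rule becomes its own small LLM task, with a prompt
# specialized to the rule's type. Categories, prompts, and call_llm are hypothetical.
from dataclasses import dataclass

@dataclass
class ERCRule:
    erc: str        # e.g. "ERC-20"
    rule_type: str  # e.g. "throw", "return-value", "emit-event" (illustrative categories)
    text: str       # the natural-language rule from the standard

# Hypothetical prompt templates, one per rule type, mirroring the idea of
# specializing prompts by ERC rule type.
PROMPT_TEMPLATES = {
    "throw": "Does the contract below revert when required by this rule?\nRule: {rule}\nContract:\n{code}",
    "return-value": "Does every relevant function return the value this rule requires?\nRule: {rule}\nContract:\n{code}",
    "emit-event": "Is the event required by this rule emitted everywhere it is mandated?\nRule: {rule}\nContract:\n{code}",
}

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g. a ChatGPT API request); returns the model's verdict."""
    raise NotImplementedError("wire this to your LLM provider")

def audit_contract(solidity_source: str, rules: list[ERCRule]) -> list[dict]:
    """Run one small, rule-specific auditing task per ERC rule and collect flagged violations."""
    findings = []
    for rule in rules:
        template = PROMPT_TEMPLATES.get(rule.rule_type)
        if template is None:
            continue  # rule types outside this sketch are skipped
        verdict = call_llm(template.format(rule=rule.text, code=solidity_source))
        if verdict.strip().lower().startswith("no"):  # assumed answer format
            findings.append({"erc": rule.erc, "rule": rule.text, "verdict": verdict})
    return findings
```

The per-rule decomposition mirrors the paper's stated design choice of splitting the large, complex auditing process into small, manageable tasks so that each LLM query stays focused on a single rule.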
Related papers
- SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models [8.697080709545352]
This paper introduces SymGPT, a tool that combines the natural language understanding of large language models (LLMs) with the formal guarantees of symbolic execution.
We conduct an empirical study of 132 ERC rules from three widely used ERC standards, examining their content, security implications, and natural language descriptions.
We then synthesize constraints from the formalized rules to represent scenarios where violations may occur and use symbolic execution to detect them.
Our evaluation shows that SymGPT identifies 5,783 ERC rule violations in 4,000 real-world contracts, including 1,375 violations with clear attack paths for stealing financial assets.
arXiv Detail & Related papers (2025-02-11T15:34:00Z)
- NLP-based Regulatory Compliance -- Using GPT 4.0 to Decode Regulatory Documents [0.0]
This study evaluates GPT-4.0's ability to identify conflicts within regulatory requirements.
Using metrics such as precision, recall, and F1 score, the experiment demonstrates GPT-4.0's effectiveness in detecting inconsistencies.
arXiv Detail & Related papers (2024-12-29T22:14:59Z)
- SC-Bench: A Large-Scale Dataset for Smart Contract Auditing [5.787866021952808]
We present SC-Bench, the first dataset for automated smart-contract auditing research.
SC-Bench consists of 5,377 real-world smart contracts and 15,975 violations of Ethereum standards called ERCs.
We evaluate SC-Bench using GPT-4 by prompting it with both the contracts and ERC rules.
Our results show that without the oracle, GPT-4 detects only 0.9% of the violations, and with the oracle, it detects 22.9% of them.
arXiv Detail & Related papers (2024-10-08T16:23:50Z)
- Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z)
- A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter [15.28361088402754]
This paper introduces a novel context-driven prompting technique for smart contract co-auditing.
Our approach employs three techniques for context scoping and augmentation, including code scoping, which chunks long code into self-contained segments.
Our method demonstrated a detection rate of 96% for vulnerable functions, outperforming the native prompting approach, which detected only 53%.
arXiv Detail & Related papers (2024-06-26T05:14:35Z)
- SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors [64.9938658716425]
Existing evaluations of large language models' (LLMs) ability to recognize and reject unsafe user requests face three limitations.
First, existing methods often use coarse-grained taxonomies of unsafe topics and over-represent some fine-grained topics.
Second, linguistic characteristics and formatting of prompts, such as different languages and dialects, are often overlooked and only implicitly considered in many evaluations.
Third, existing evaluations rely on large LLMs for evaluation, which can be expensive.
arXiv Detail & Related papers (2024-06-20T17:56:07Z)
- The Decisive Power of Indecision: Low-Variance Risk-Limiting Audits and Election Contestation via Marginal Mark Recording [51.82772358241505]
Risk-limiting audits (RLAs) are techniques for verifying the outcomes of large elections.
We define new families of audits that improve efficiency and offer advances in statistical power.
New audits are enabled by revisiting the standard notion of a cast-vote record so that it can declare multiple possible mark interpretations.
arXiv Detail & Related papers (2024-02-09T16:23:54Z)
- A Framework for Assurance Audits of Algorithmic Systems [2.2342503377379725]
We propose the criterion audit as an operationalizable compliance and assurance external audit framework.
We argue that AI audits should similarly provide assurance to their stakeholders about AI organizations' ability to govern their algorithms in ways that mitigate harms and uphold human values.
We conclude by offering a critical discussion on the benefits, inherent limitations, and implementation challenges of applying practices of the more mature financial auditing industry to AI auditing.
arXiv Detail & Related papers (2024-01-26T14:38:54Z)
- Formally Verifying a Real World Smart Contract [52.30656867727018]
In this article, we present our search for a tool capable of formally verifying a real-world smart contract written in a recent version of Solidity.
arXiv Detail & Related papers (2023-07-05T14:30:21Z)
- DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models [92.6951708781736]
This work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5.
We find that GPT models can be easily misled to generate toxic and biased outputs and leak private information.
Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps.
arXiv Detail & Related papers (2023-06-20T17:24:23Z)
- Consistency Analysis of ChatGPT [65.268245109828]
This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour.
Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions.
arXiv Detail & Related papers (2023-03-11T01:19:01Z)