AuditGPT: Auditing Smart Contracts with ChatGPT
- URL: http://arxiv.org/abs/2404.04306v1
- Date: Fri, 5 Apr 2024 07:19:13 GMT
- Title: AuditGPT: Auditing Smart Contracts with ChatGPT
- Authors: Shihao Xia, Shuai Shao, Mengting He, Tingting Yu, Linhai Song, Yiying Zhang,
- Abstract summary: AuditGPT is a tool to automatically and comprehensively verify ERC rules against smart contracts.
It pinpoints 418 ERC rule violations and only reports 18 false positives, showcasing its effectiveness and accuracy.
- Score: 8.697080709545352
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: To govern smart contracts running on Ethereum, multiple Ethereum Request for Comment (ERC) standards have been developed, each containing a set of rules to guide the behaviors of smart contracts. Violating the ERC rules could cause serious security issues and financial loss, signifying the importance of verifying smart contracts follow ERCs. Today's practices of such verification are to either manually audit each single contract or use expert-developed, limited-scope program-analysis tools, both of which are far from being effective in identifying ERC rule violations. This paper presents a tool named AuditGPT that leverages large language models (LLMs) to automatically and comprehensively verify ERC rules against smart contracts. To build AuditGPT, we first conduct an empirical study on 222 ERC rules specified in four popular ERCs to understand their content, their security impacts, their specification in natural language, and their implementation in Solidity. Guided by the study, we construct AuditGPT by separating the large, complex auditing process into small, manageable tasks and design prompts specialized for each ERC rule type to enhance LLMs' auditing performance. In the evaluation, AuditGPT successfully pinpoints 418 ERC rule violations and only reports 18 false positives, showcasing its effectiveness and accuracy. Moreover, AuditGPT beats an auditing service provided by security experts in effectiveness, accuracy, and cost, demonstrating its advancement over state-of-the-art smart-contract auditing practices.
Related papers
- SC-Bench: A Large-Scale Dataset for Smart Contract Auditing [5.787866021952808]
We present SC-Bench, the first dataset for automated smart-contract auditing research.
SC-Bench consists of 5,377 real-world smart contracts and 15,975 violations of standards on Ehereum called ERCs.
We evaluate SC-Bench using GPT-4 by prompting it with both the contracts and ERC rules.
Our results show that without the oracle, GPT-4 can only detect 0.9% violations, and with the oracle, it detects 22.9% violations.
arXiv Detail & Related papers (2024-10-08T16:23:50Z) - Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - RegNLP in Action: Facilitating Compliance Through Automated Information Retrieval and Answer Generation [51.998738311700095]
Regulatory documents, characterized by their length, complexity and frequent updates, are challenging to interpret.
RegNLP is a multidisciplinary subfield aimed at simplifying access to and interpretation of regulatory rules and obligations.
ObliQA dataset contains 27,869 questions derived from the Abu Dhabi Global Markets (ADGM) financial regulation document collection.
arXiv Detail & Related papers (2024-09-09T14:44:19Z) - A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter [15.28361088402754]
This paper introduces a novel context-driven prompting technique for smart contract co-auditing.
Our approach employs three techniques for context scoping and augmentation, encompassing code scoping to chunk long code into self-contained code segments.
Our method demonstrated a detection rate of 96% for vulnerable functions, outperforming the native prompting approach, which detected only 53%.
arXiv Detail & Related papers (2024-06-26T05:14:35Z) - Evaluation of ChatGPT's Smart Contract Auditing Capabilities Based on
Chain of Thought [8.04987973069845]
This study explores the potential of enhancing smart contract security audits using the GPT-4 model.
We utilized a dataset of 35 smart contracts from the SolidiFI-benchmark vulnerability library, containing 732 vulnerabilities.
We found GPT-4 performed poorly in detecting smart contract vulnerabilities, with a high Precision of 96.6%, but a low Recall of 37.8%, and an F1-score of 41.1%.
arXiv Detail & Related papers (2024-02-19T10:33:29Z) - The Decisive Power of Indecision: Low-Variance Risk-Limiting Audits and Election Contestation via Marginal Mark Recording [51.82772358241505]
Risk-limiting audits (RLAs) are techniques for verifying the outcomes of large elections.
We define new families of audits that improve efficiency and offer advances in statistical power.
New audits are enabled by revisiting the standard notion of a cast-vote record so that it can declare multiple possible mark interpretations.
arXiv Detail & Related papers (2024-02-09T16:23:54Z) - A Framework for Assurance Audits of Algorithmic Systems [2.2342503377379725]
We propose the criterion audit as an operationalizable compliance and assurance external audit framework.
We argue that AI audits should similarly provide assurance to their stakeholders about AI organizations' ability to govern their algorithms in ways that harms and uphold human values.
We conclude by offering a critical discussion on the benefits, inherent limitations, and implementation challenges of applying practices of the more mature financial auditing industry to AI auditing.
arXiv Detail & Related papers (2024-01-26T14:38:54Z) - Who Audits the Auditors? Recommendations from a field scan of the
algorithmic auditing ecosystem [0.971392598996499]
We provide the first comprehensive field scan of the AI audit ecosystem.
We identify emerging best practices as well as methods and tools that are becoming commonplace.
We outline policy recommendations to improve the quality and impact of these audits.
arXiv Detail & Related papers (2023-10-04T01:40:03Z) - Formally Verifying a Real World Smart Contract [52.30656867727018]
We search for a tool capable of formally verifying a real-world smart contract written in a recent version of Solidity.
In this article, we present our search for a tool capable of formally verifying a real-world smart contract written in a recent version of Solidity.
arXiv Detail & Related papers (2023-07-05T14:30:21Z) - DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT
Models [92.6951708781736]
This work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5.
We find that GPT models can be easily misled to generate toxic and biased outputs and leak private information.
Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps.
arXiv Detail & Related papers (2023-06-20T17:24:23Z) - Consistency Analysis of ChatGPT [65.268245109828]
This paper investigates the trustworthiness of ChatGPT and GPT-4 regarding logically consistent behaviour.
Our findings suggest that while both models appear to show an enhanced language understanding and reasoning ability, they still frequently fall short of generating logically consistent predictions.
arXiv Detail & Related papers (2023-03-11T01:19:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.