A Multi-Agent Framework for Automated Vulnerability Detection and Repair in Solidity and Move Smart Contracts
- URL: http://arxiv.org/abs/2502.18515v1
- Date: Sat, 22 Feb 2025 20:30:47 GMT
- Title: A Multi-Agent Framework for Automated Vulnerability Detection and Repair in Solidity and Move Smart Contracts
- Authors: Rabimba Karanjai, Sam Blackshear, Lei Xu, Weidong Shi
- Abstract summary: This paper presents Smartify, a novel multi-agent framework leveraging Large Language Models (LLMs) to automatically detect and repair vulnerabilities in Solidity and Move smart contracts. Smartify employs a team of specialized agents working on different specially fine-tuned LLMs to analyze code based on underlying programming concepts and language-specific security principles. Our results show that Smartify achieves state-of-the-art performance, surpassing existing LLMs and enhancing the capabilities of general-purpose models such as Llama 3.1.
- Score: 4.3764649156831235
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The rapid growth of the blockchain ecosystem and the increasing value locked in smart contracts necessitate robust security measures. While languages like Solidity and Move aim to improve smart contract security, vulnerabilities persist. This paper presents Smartify, a novel multi-agent framework leveraging Large Language Models (LLMs) to automatically detect and repair vulnerabilities in Solidity and Move smart contracts. Unlike traditional methods that rely solely on vast pre-training datasets, Smartify employs a team of specialized agents working on different specially fine-tuned LLMs to analyze code based on underlying programming concepts and language-specific security principles. We evaluated Smartify on a dataset for Solidity and a curated dataset for Move, demonstrating its effectiveness in fixing a wide range of vulnerabilities. Our results show that Smartify (Gemma2+codegemma) achieves state-of-the-art performance, surpassing existing LLMs and enhancing the capabilities of general-purpose models such as Llama 3.1. Notably, Smartify can incorporate language-specific knowledge, such as the nuances of Move, without requiring massive language-specific pre-training datasets. This work offers a detailed analysis of various LLMs' performance on smart contract repair, highlighting the strengths of our multi-agent approach and providing a blueprint for developing more secure and reliable decentralized applications in the growing blockchain landscape. We also provide a detailed recipe for extending this approach to similar use cases.
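To make the pipeline concrete, below is a minimal Python sketch of the two-stage, detect-then-repair agent loop the abstract describes. It is an illustration only: the `call_llm` helper, prompt formats, and model names are assumptions made for the example, not Smartify's actual interface.

```python
# Illustrative sketch of a multi-agent detect-then-repair loop.
# All names, prompts, and model identifiers here are hypothetical.
from dataclasses import dataclass


@dataclass
class Finding:
    kind: str        # e.g. "reentrancy" or "unchecked external call"
    location: str    # human-readable locator within the contract
    rationale: str   # why the detector agent flagged it


def call_llm(model: str, prompt: str) -> str:
    """Placeholder for a call to a fine-tuned LLM (e.g. a Gemma2 or
    CodeGemma variant); swap in a real inference client here."""
    raise NotImplementedError


def detect(contract_src: str, language: str) -> list[Finding]:
    """Detector agent: ask a language-aware model to list suspected
    vulnerabilities, grounded in language-specific security principles."""
    prompt = (
        f"You are a {language} security auditor. List vulnerabilities in the "
        f"contract below as 'kind | location | rationale' lines.\n\n{contract_src}"
    )
    findings = []
    for line in call_llm("vulnerability-detector", prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            findings.append(Finding(*parts))
    return findings


def repair(contract_src: str, language: str, findings: list[Finding]) -> str:
    """Repair agent: ask a code-oriented model to patch only the reported
    issues while preserving the contract's intended behaviour."""
    issues = "\n".join(f"- {f.kind} at {f.location}: {f.rationale}" for f in findings)
    prompt = (
        f"Fix only these issues in the {language} contract below and return "
        f"the full corrected source.\n\n{issues}\n\n{contract_src}"
    )
    return call_llm("code-repairer", prompt)


def detect_and_repair(contract_src: str, language: str = "Solidity") -> str:
    """Pipeline entry point: detector findings feed the repair agent."""
    findings = detect(contract_src, language)
    return repair(contract_src, language, findings) if findings else contract_src
```

In the paper's framing, each agent would run on a different specially fine-tuned model (for example a Gemma2-based detector paired with a CodeGemma-based repairer), and the `language` argument is where Solidity- or Move-specific security guidance would be injected.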
Related papers
- Generative Large Language Model usage in Smart Contract Vulnerability Detection [8.720242549772154]
This paper presents a systematic review of the current LLM-based smart contract vulnerability detection tools.
We compare them against traditional static and dynamic analysis tools Slither and Mythril.
Our analysis highlights key areas where each performs better and shows that while these tools show promise, the LLM-based tools available for testing are not ready to replace more traditional tools.
arXiv Detail & Related papers (2025-04-07T02:33:40Z)
- SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models.
We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts.
Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z)
- Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection [0.0]
We train and test machine learning algorithms to classify smart contract codes according to type in order to compare model performance.
Our research combines machine learning and large language models to provide a rich and interpretable framework for detecting different smart contract vulnerabilities.
arXiv Detail & Related papers (2025-01-04T08:32:53Z)
- Smart-LLaMA: Two-Stage Post-Training of Large Language Models for Smart Contract Vulnerability Detection and Explanation [21.39496709865097]
Existing smart contract vulnerability detection methods face three main issues, including insufficient dataset quality, with datasets lacking detailed explanations and precise vulnerability locations.
We propose Smart-LLaMA, an advanced detection method based on the LLaMA language model.
arXiv Detail & Related papers (2024-11-09T15:49:42Z)
- SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation [89.24729958546168]
We present SPA-Bench, a comprehensive SmartPhone Agent Benchmark designed to evaluate (M)LLM-based agents.
SPA-Bench offers three key contributions: (1) a diverse set of tasks covering system and third-party apps in both English and Chinese, focusing on features commonly used in daily routines; (2) a plug-and-play framework enabling real-time agent interaction with Android devices; and (3) a novel evaluation pipeline that automatically assesses agent performance across multiple dimensions.
arXiv Detail & Related papers (2024-10-19T17:28:48Z)
- LLM-SmartAudit: Advanced Smart Contract Vulnerability Detection [3.1409266162146467]
This paper introduces LLM-SmartAudit, a novel framework to detect and analyze vulnerabilities in smart contracts.
Using a multi-agent conversational approach, LLM-SmartAudit employs a collaborative system with specialized agents to enhance the audit process.
Our framework can detect complex logic vulnerabilities that traditional tools have previously overlooked.
arXiv Detail & Related papers (2024-10-12T06:24:21Z)
- SmartML: Towards a Modeling Language for Smart Contracts [0.3277163122167434]
This paper proposes SmartML, a modeling language for smart contracts that is platform independent and easy to comprehend.
We detail its formal semantics and type system with a focus on its role in addressing security vulnerabilities.
arXiv Detail & Related papers (2024-03-11T11:27:53Z)
- AgentScope: A Flexible yet Robust Multi-Agent Platform [66.64116117163755]
AgentScope is a developer-centric multi-agent platform with message exchange as its core communication mechanism.
The abundant syntactic tools, built-in agents and service functions, user-friendly interfaces for application demonstration and utility monitor, zero-code programming workstation, and automatic prompt tuning mechanism significantly lower the barriers to both development and deployment.
arXiv Detail & Related papers (2024-02-21T04:11:28Z)
- HasTEE+: Confidential Cloud Computing and Analytics with Haskell [50.994023665559496]
Confidential computing enables the protection of confidential code and data in a co-tenanted cloud deployment using specialized hardware isolation units called Trusted Execution Environments (TEEs).
TEEs offer low-level C/C++-based toolchains that are susceptible to inherent memory safety vulnerabilities and lack language constructs to monitor explicit and implicit information-flow leaks.
We address the above with HasTEE+, a domain-specific language (DSL) embedded in Haskell that enables programming TEEs in a high-level language with strong type safety.
arXiv Detail & Related papers (2024-01-17T00:56:23Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [98.18244218156492]
Large Language Models (LLMs) have significantly advanced natural language processing.
As their applications expand into multi-agent environments, there arises a need for a comprehensive evaluation framework.
This work introduces a novel competition-based benchmark framework to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- RoFL: Attestable Robustness for Secure Federated Learning [59.63865074749391]
Federated Learning allows a large number of clients to train a joint model without the need to share their private data.
To ensure the confidentiality of the client updates, Federated Learning systems employ secure aggregation.
We present RoFL, a secure Federated Learning system that improves robustness against malicious clients.
arXiv Detail & Related papers (2021-07-07T15:42:49Z)