Guiding LLM-based Smart Contract Generation with Finite State Machine
- URL: http://arxiv.org/abs/2505.08542v1
- Date: Tue, 13 May 2025 13:13:26 GMT
- Title: Guiding LLM-based Smart Contract Generation with Finite State Machine
- Authors: Hao Luo, Yuhao Lin, Xiao Yan, Xintong Hu, Yuxiang Wang, Qiming Zeng, Hao Wang, Jiawei Jiang,
- Abstract summary: We propose FSM-SCG, a smart contract generation framework based on finite state machine (FSM) and Large Language Models (LLMs)<n>Compared to the best baseline, FSM-SCG improves the compilation success rate of generated smart contract code by at most 48%, and reduces the average vulnerability risk score by approximately 68%.
- Score: 24.841721855191857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Smart contract is a kind of self-executing code based on blockchain technology with a wide range of application scenarios, but the traditional generation method relies on manual coding and expert auditing, which has a high threshold and low efficiency. Although Large Language Models (LLMs) show great potential in programming tasks, they still face challenges in smart contract generation w.r.t. effectiveness and security. To solve these problems, we propose FSM-SCG, a smart contract generation framework based on finite state machine (FSM) and LLMs, which significantly improves the quality of the generated code by abstracting user requirements to generate FSM, guiding LLMs to generate smart contracts, and iteratively optimizing the code with the feedback of compilation and security checks. The experimental results show that FSM-SCG significantly improves the quality of smart contract generation. Compared to the best baseline, FSM-SCG improves the compilation success rate of generated smart contract code by at most 48%, and reduces the average vulnerability risk score by approximately 68%.
Related papers
- LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation [56.84049855266145]
We propose a Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates evolutionary mutation strategies, and hybrid testing techniques.<n>LLAMA achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities.<n>These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.
arXiv Detail & Related papers (2025-07-16T09:46:58Z) - Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality.<n>We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z) - SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models.<n>We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts.<n>Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z) - SmartLLM: Smart Contract Auditing using Custom Generative AI [0.0]
This paper introduces SmartLLM, a novel approach leveraging fine-tuned LLaMA 3.1 models with Retrieval-Augmented Generation (RAG)<n>By integrating domain-specific knowledge from ERC standards, SmartLLM achieves superior performance compared to static analysis tools like Mythril and Slither.<n> Experimental results demonstrate a perfect recall of 100% and an accuracy score of 70%, highlighting the model's robustness in identifying vulnerabilities.
arXiv Detail & Related papers (2025-02-17T06:22:05Z) - Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection [0.0]
We train and test machine learning algorithms to classify smart contract codes according to type in order to compare model performance.<n>Our research combines machine learning and large language models to provide a rich and interpretable framework for detecting different smart contract vulnerabilities.
arXiv Detail & Related papers (2025-01-04T08:32:53Z) - SmartLLMSentry: A Comprehensive LLM Based Smart Contract Vulnerability Detection Framework [0.0]
This paper introduces SmartLLMSentry, a novel framework that leverages large language models (LLMs) to advance smart contract vulnerability detection.<n>We created a specialized dataset of five randomly selected vulnerabilities for model training and evaluation.<n>Our results show an exact match accuracy of 91.1% with sufficient data, although GPT-4 demonstrated reduced performance compared to GPT-3 in rule generation.
arXiv Detail & Related papers (2024-11-28T16:02:01Z) - FTSmartAudit: A Knowledge Distillation-Enhanced Framework for Automated Smart Contract Auditing Using Fine-Tuned LLMs [17.76505488643214]
This paper investigates the feasibility of using smaller, fine-tuned models to achieve comparable or even superior results in smart contract auditing.<n>We introduce the FTSmartAudit framework, which is designed to develop cost-effective, specialized models for smart contract auditing.<n>Our contributions include: (1) a single-task learning framework that streamlines data preparation, training, evaluation, and continuous learning; (2) a robust dataset generation method utilizing domain-special knowledge distillation to produce high-quality datasets from advanced models like GPT-4o; and (3) an adaptive learning strategy to maintain model accuracy and robustness.
arXiv Detail & Related papers (2024-10-17T09:09:09Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs)
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that emulates software process models based on multiple Large Language Model (LLM) agents.
We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
arXiv Detail & Related papers (2024-03-23T14:04:48Z) - SCLA: Automated Smart Contract Summarization via LLMs and Control Flow Prompt [2.539913845592959]
We propose SCLA, an LLM-based method that enhances summarization by integrating a Control Flow Graph (CFG) and semantic facts from the code's control flow into a semantically enriched prompt.<n>We validate the effectiveness of SCLA through comprehensive experiments on a dataset of 40,000 real-world smart contracts.<n>The experiment shows that SCLA significantly improves summarization quality, outperforming the SOTA baselines with improvements of 26.7%, 23.2%, 16.7%, and 14.7% in BLEU-4, METEOR, ROUGE-L, and BLEURT scores, respectively.
arXiv Detail & Related papers (2024-02-07T13:58:26Z) - StepCoder: Improve Code Generation with Reinforcement Learning from
Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks.
FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z) - An Empirical Study of AI-based Smart Contract Creation [4.801455786801489]
Large language models (LLMs) like ChatGPT and Google Palm2 for smart contract generation seem to be the first well-established instance of an AI pair programmer.
The main objective of this study is to assess the quality of generated code provided by LLMs for smart contracts.
arXiv Detail & Related papers (2023-08-05T21:38:57Z) - CodeRL: Mastering Code Generation through Pretrained Models and Deep
Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.