Related papers: A Preference-Driven Methodology for High-Quality Solidity Code Generation

A Preference-Driven Methodology for High-Quality Solidity Code Generation

URL: http://arxiv.org/abs/2506.03006v2
Date: Fri, 06 Jun 2025 08:39:17 GMT
Title: A Preference-Driven Methodology for High-Quality Solidity Code Generation
Authors: Zhiyuan Peng, Xin Yin, Chenhao Ying, Chao Ni, Yuan Luo,
Abstract summary: We propose textbfmytitle, a novel framework that extends standard DPO beyond human preferences to incorporate quantifiable blockchain-specific metrics.<n>Our framework introduces a comprehensive evaluation methodology with four complementary metrics: Pass@k (functional correctness), Compile@k (syntactic correctness), Gas@k (gas efficiency), and Secure@k (security assessment)<n>Our framework significantly outperforms existing approaches across all critical dimensions, achieving 66.7% Pass@5, 58.9% Gas@5, and 62.5% Secure@5.
Score: 11.139579355590332
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While Large Language Models (LLMs) have demonstrated remarkable progress in generating functionally correct Solidity code, they continue to face critical challenges in producing gas-efficient and secure code, which are critical requirements for real-world smart contract deployment. Although recent advances leverage Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) for code preference alignment, existing approaches treat functional correctness, gas optimization, and security as independent objectives, resulting in contracts that may achieve operational soundness but suffer from prohibitive execution costs or dangerous vulnerabilities. To address these limitations, we propose \textbf{\mytitle}, a novel framework that extends standard DPO beyond human preferences to incorporate quantifiable blockchain-specific metrics, enabling holistic multi-objective optimization specifically tailored for smart contract generation. Our framework introduces a comprehensive evaluation methodology with four complementary metrics: Pass@k (functional correctness), Compile@k (syntactic correctness), Gas@k (gas efficiency), and Secure@k (security assessment), providing rigorous multi-dimensional contract evaluation. Through extensive experimentation, we demonstrate that \mytitle significantly outperforms existing approaches across all critical dimensions, achieving 66.7\% Pass@5, 58.9\% Gas@5, and 62.5\% Secure@5, while generating production-ready smart contracts that are functionally correct, cost-efficient, and secure.

Related papers

An Optimisation Framework for Unsupervised Environment Design [88.29733214939544]
unsupervised environment design (UED) aims to maximise agent's general robustness.<n>We provide a provably convergent algorithm in the zero-sum setting.<n>We empirically verify the efficacy of our method.
arXiv Detail & Related papers (2025-05-27T03:07:26Z)
Guiding LLM-based Smart Contract Generation with Finite State Machine [24.841721855191857]
We propose FSM-SCG, a smart contract generation framework based on finite state machine (FSM) and Large Language Models (LLMs)<n>Compared to the best baseline, FSM-SCG improves the compilation success rate of generated smart contract code by at most 48%, and reduces the average vulnerability risk score by approximately 68%.
arXiv Detail & Related papers (2025-05-13T13:13:26Z)
Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM [0.7018579932647147]
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts.<n>This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection.
arXiv Detail & Related papers (2025-04-07T12:32:14Z)
Revisiting Locally Differentially Private Protocols: Towards Better Trade-offs in Privacy, Utility, and Attack Resistance [4.5282933786221395]
Local Differential Privacy (LDP) offers strong privacy protection, especially in settings in which the server collecting the data is untrusted.<n>We introduce a general multi-objective optimization framework for refining LDP protocols.<n>Our framework enables modular and context-aware deployment of LDP mechanisms with tunable privacy-utility trade-offs.
arXiv Detail & Related papers (2025-03-03T12:41:01Z)
SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models.<n>We propose a Retrieval-Augmented Code Repair framework to verify functional correctness of smart contracts.<n>Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
arXiv Detail & Related papers (2025-03-03T01:55:20Z)
SmartLLM: Smart Contract Auditing using Custom Generative AI [0.0]
This paper introduces SmartLLM, a novel approach leveraging fine-tuned LLaMA 3.1 models with Retrieval-Augmented Generation (RAG)<n>By integrating domain-specific knowledge from ERC standards, SmartLLM achieves superior performance compared to static analysis tools like Mythril and Slither.<n> Experimental results demonstrate a perfect recall of 100% and an accuracy score of 70%, highlighting the model's robustness in identifying vulnerabilities.
arXiv Detail & Related papers (2025-02-17T06:22:05Z)
Automated Proof Generation for Rust Code via Self-Evolution [69.25795662658356]
We introduce SAFE, a framework that overcomes the lack of human-written snippets to enable automated proof generation of Rust code.<n> SAFE re-purposes the large number of synthesized incorrect proofs to train the self-ging capability of the fine-tuned models.<n>We achieve a 52.52% accuracy rate in a benchmark crafted by human experts, a significant leap over GPT-4o's performance of 14.39%.
arXiv Detail & Related papers (2024-10-21T08:15:45Z)
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency. CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
arXiv Detail & Related papers (2024-10-08T01:36:15Z)
OATH: Efficient and Flexible Zero-Knowledge Proofs of End-to-End ML Fairness [13.986886689256128]
Zero-Knowledge Proofs of Fairness address fairness noncompliance by allowing a service provider to verify that their model serves diverse demographics equitably. We present OATH, a framework that is deployably efficient with client-facing communication and an offline audit phase. OATH provides a 1343x improvement to runtime over previous work for neural network ZKPoF, and scales up to much larger models.
arXiv Detail & Related papers (2024-09-17T16:00:35Z)
ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees [68.33498595506941]
We introduce a novel uncertainty measure based on self-consistency theory. We then develop a conformal uncertainty criterion by integrating the uncertainty condition aligned with correctness into the CP algorithm. Empirical evaluations indicate that our uncertainty measure outperforms prior state-of-the-art methods.
arXiv Detail & Related papers (2024-06-29T17:33:07Z)
RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content [62.685566387625975]
Current mitigation strategies, while effective, are not resilient under adversarial attacks. This paper introduces Resilient Guardrails for Large Language Models (RigorLLM), a novel framework designed to efficiently moderate harmful and unsafe inputs.
arXiv Detail & Related papers (2024-03-19T07:25:02Z)
Bayesian Optimization with Formal Safety Guarantees via Online Conformal Prediction [36.14499894307206]
Black-box zero-th order optimization is a central primitive for applications in fields as diverse as finance, physics, and engineering. In this paper, we study scenarios in which feedback is also provided on the safety of the attempted solution. A novel BO-based approach is introduced that satisfies safety requirements irrespective of properties of the constraint function.
arXiv Detail & Related papers (2023-06-30T17:26:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.