Smart-LLaMA-DPO: Reinforced Large Language Model for Explainable Smart Contract Vulnerability Detection
- URL: http://arxiv.org/abs/2506.18245v1
- Date: Mon, 23 Jun 2025 02:24:07 GMT
- Title: Smart-LLaMA-DPO: Reinforced Large Language Model for Explainable Smart Contract Vulnerability Detection
- Authors: Lei Yu, Zhirong Huang, Hang Yuan, Shiqi Cheng, Li Yang, Fengjun Zhang, Chenjie Shen, Jiajia Ma, Jingyuan Zhang, Junyi Lu, Chun Zuo,
- Abstract summary: Existing vulnerability detection methods face two main issues.<n>They lack comprehensive coverage and high-quality explanations for preference learning.<n>Large language models (LLMs) often struggle with accurately interpreting specific concepts in smart contract security.
- Score: 15.694744168599055
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Smart contract vulnerability detection remains a major challenge in blockchain security. Existing vulnerability detection methods face two main issues: (1) Existing datasets lack comprehensive coverage and high-quality explanations for preference learning. (2) Large language models (LLMs) often struggle with accurately interpreting specific concepts in smart contract security. Empirical analysis shows that even after continual pre-training (CPT) and supervised fine-tuning (SFT), LLMs may misinterpret the execution order of state changes, resulting in incorrect explanations despite making correct detection decisions. To address these challenges, we propose Smart-LLaMA-DPO based on LLaMA-3.1-8B. We construct a comprehensive dataset covering four major vulnerability types and machine-unauditable vulnerabilities, including precise labels, explanations, and locations for SFT, as well as high-quality and low-quality output pairs for Direct Preference Optimization (DPO). Second, we perform CPT using large-scale smart contract to enhance the LLM's understanding of specific security practices in smart contracts. Futhermore, we conduct SFT with our comprehensive dataset. Finally, we apply DPO, leveraging human feedback and a specially designed loss function that increases the probability of preferred explanations while reducing the likelihood of non-preferred outputs. We evaluate Smart-LLaMA-DPO on four major vulnerability types: reentrancy, timestamp dependence, integer overflow/underflow, and delegatecall, as well as machine-unauditable vulnerabilities. Our method significantly outperforms state-of-the-art baselines, with average improvements of 10.43% in F1 score and 7.87% in accuracy. Moreover, both LLM evaluation and human evaluation confirm that our method generates more correct, thorough, and clear explanations.
Related papers
- SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection [14.581402965011117]
We propose SAEL, an LLM-based framework for smart contract vulnerability detection.<n>We first design targeted prompts to guide LLMs in identifying vulnerabilities and generating explanations.<n>Next, we apply prompt-tuning on CodeT5 and T5 to process contract code and explanations, enhancing task-specific performance.
arXiv Detail & Related papers (2025-07-30T04:28:00Z) - Boosting Vulnerability Detection of LLMs via Curriculum Preference Optimization with Synthetic Reasoning Data [22.557961978833386]
We propose a novel framework for large language models (LLMs) that excels at mining vulnerability patterns.<n>Specifically, we construct forward and backward reasoning processes for vulnerability and corresponding fixed code, ensuring the synthesis of high-quality reasoning data.<n>We show that ReVD sets new state-of-the-art for LLM-based software vulnerability detection, e.g., 12.24%-22.77% improvement in the accuracy.
arXiv Detail & Related papers (2025-06-09T03:25:23Z) - Reinforcing Video Reasoning with Focused Thinking [65.85683941058916]
We propose TW-GRPO, a novel framework that enhances visual reasoning with focused thinking and dense reward granularity.<n>Specifically, we employ a token weighting mechanism that prioritizes tokens with high informational density.<n>We also reformulate RL training by shifting from single-choice to multi-choice QA tasks.
arXiv Detail & Related papers (2025-05-30T15:42:19Z) - MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models [16.16186929130931]
Smart contract vulnerabilities pose significant security risks to blockchain systems.<n>We propose a smart contract vulnerability detection framework based on mixture-of-experts tuning (MOE-Tuning) of large language models.<n> Experiments show that MOS significantly outperforms existing methods with average improvements of 6.32% in F1 score and 4.80% in accuracy.
arXiv Detail & Related papers (2025-04-16T16:33:53Z) - SafeMLRM: Demystifying Safety in Multi-modal Large Reasoning Models [50.34706204154244]
Acquiring reasoning capabilities catastrophically degrades inherited safety alignment.<n>Certain scenarios suffer 25 times higher attack rates.<n>Despite tight reasoning-answer safety coupling, MLRMs demonstrate nascent self-correction.
arXiv Detail & Related papers (2025-04-09T06:53:23Z) - PredictaBoard: Benchmarking LLM Score Predictability [50.47497036981544]
Large Language Models (LLMs) often fail unpredictably.<n>This poses a significant challenge to ensuring their safe deployment.<n>We present PredictaBoard, a novel collaborative benchmarking framework.
arXiv Detail & Related papers (2025-02-20T10:52:38Z) - Smart-LLaMA: Two-Stage Post-Training of Large Language Models for Smart Contract Vulnerability Detection and Explanation [21.39496709865097]
Existing smart contract vulnerability detection methods face three main issues.
Insufficient quality of datasets, lacking detailed explanations and precise vulnerability locations.
We propose Smart-LLaMA, an advanced detection method based on the LLaMA language model.
arXiv Detail & Related papers (2024-11-09T15:49:42Z) - LLM-SmartAudit: Advanced Smart Contract Vulnerability Detection [3.1409266162146467]
This paper introduces LLM-SmartAudit, a novel framework to detect and analyze vulnerabilities in smart contracts.
Using a multi-agent conversational approach, LLM-SmartAudit employs a collaborative system with specialized agents to enhance the audit process.
Our framework can detect complex logic vulnerabilities that traditional tools have previously overlooked.
arXiv Detail & Related papers (2024-10-12T06:24:21Z) - Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z) - Token-Level Adversarial Prompt Detection Based on Perplexity Measures
and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep
Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.