Large Language Model based Smart Contract Auditing with LLMBugScanner
- URL: http://arxiv.org/abs/2512.02069v1
- Date: Sat, 29 Nov 2025 19:13:44 GMT
- Title: Large Language Model based Smart Contract Auditing with LLMBugScanner
- Authors: Yining Yuan, Yifei Wang, Yichang Xu, Zachary Yahn, Sihao Hu, Ling Liu
- Abstract summary: Smart contract auditing presents several challenges for large language models (LLMs). No single model performs consistently well across all vulnerability types or contract structures. LLMBugScanner combines domain knowledge adaptation with ensemble reasoning to improve robustness and generalization.
- Score: 16.70822025530469
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents LLMBugScanner, a large language model (LLM) based framework for smart contract vulnerability detection using fine-tuning and ensemble learning. Smart contract auditing presents several challenges for LLMs: different pretrained models exhibit varying reasoning abilities, and no single model performs consistently well across all vulnerability types or contract structures. These limitations persist even after fine-tuning individual LLMs. To address these challenges, LLMBugScanner combines domain knowledge adaptation with ensemble reasoning to improve robustness and generalization. Through domain knowledge adaptation, we fine-tune LLMs on complementary datasets to capture both general code semantics and instruction-guided vulnerability reasoning, using parameter-efficient tuning to reduce computational cost. Through ensemble reasoning, we leverage the complementary strengths of multiple LLMs and apply a consensus-based conflict resolution strategy to produce more reliable vulnerability assessments. We conduct extensive experiments across multiple popular LLMs and compare LLMBugScanner with both pretrained and fine-tuned individual models. Results show that LLMBugScanner achieves consistent accuracy improvements and stronger generalization, demonstrating that it provides a principled, cost-effective, and extensible framework for smart contract auditing.
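The abstract describes ensemble reasoning with a consensus-based conflict resolution strategy but does not spell out the rule here. As a rough illustration only, the sketch below uses plain majority voting over per-model labels, with ties broken by a caller-supplied model priority. The function name `consensus_vote`, the model names, the label strings, and the tie-break rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Optional


def consensus_vote(
    predictions: Dict[str, str],
    priority: Optional[List[str]] = None,
) -> str:
    """Pick the label most models agree on (illustrative majority vote).

    predictions maps a model name to the vulnerability label it reported.
    priority is an assumed tie-break order of model names; if omitted,
    the insertion order of `predictions` is used.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")

    counts = Counter(predictions.values())
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}

    # Clear winner: a single label has the most votes.
    if len(tied) == 1:
        return next(iter(tied))

    # Tie: defer to the first model in priority order whose label is tied.
    order = priority if priority is not None else list(predictions)
    for model in order:
        if predictions.get(model) in tied:
            return predictions[model]

    # The priority list named no model holding a tied label;
    # fall back to a deterministic alphabetical choice.
    return sorted(tied)[0]


votes = {
    "model_a": "reentrancy",
    "model_b": "reentrancy",
    "model_c": "integer-overflow",
}
result = consensus_vote(votes)
```

Here `result` is the label that two of the three hypothetical models agree on. A weighted vote, for example by each model's validation accuracy per vulnerability type, would be a natural variation, but whether LLMBugScanner does this is not stated in the abstract.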
Related papers
- LLMs in Code Vulnerability Analysis: A Proof of Concept [0.3441021278275805]
Traditional software security analysis methods struggle to keep pace with the scale and complexity of modern software. This paper explores the incorporation of code-specific and general-purpose Large Language Models to automate critical software security tasks.
arXiv Detail & Related papers (2026-01-13T16:16:11Z)
- Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads [104.9566359759396]
We propose a lightweight alternative for step-level reasoning verification based on data-driven uncertainty scores. Our findings suggest that the internal states of LLMs encode their uncertainty and can serve as reliable signals for reasoning verification.
arXiv Detail & Related papers (2025-11-09T03:38:29Z)
- ParaVul: A Parallel Large Language Model and Retrieval-Augmented Framework for Smart Contract Vulnerability Detection [43.41293570032631]
ParaVul is a retrieval-augmented framework to improve the reliability and accuracy of smart contract vulnerability detection. We develop Sparse Low-Rank Adaptation (SLoRA) for LLM fine-tuning. We construct a vulnerability contract dataset and develop a hybrid Retrieval-Augmented Generation (RAG) system.
arXiv Detail & Related papers (2025-10-20T03:23:41Z)
- Beyond Majority Voting: LLM Aggregation by Leveraging Higher-Order Information [57.397381631496906]
We develop two new aggregation algorithms called Optimal Weight (OW) and Inverse Surprising Popularity (ISP). Our theoretical analysis shows these methods provably mitigate inherent limitations of majority voting under mild assumptions. We empirically validate our algorithms on synthetic datasets, popular LLM fine-tuning benchmarks such as UltraFeedback and MMLU, and a real-world healthcare setting ARMMAN.
arXiv Detail & Related papers (2025-10-01T22:21:50Z)
- Ensembling Large Language Models for Code Vulnerability Detection: An Empirical Evaluation [69.8237598448941]
This study investigates the potential of ensemble learning to enhance the performance of Large Language Models (LLMs) in source code vulnerability detection. We propose Dynamic Gated Stacking (DGS), a Stacking variant tailored for vulnerability detection.
arXiv Detail & Related papers (2025-09-16T03:48:22Z)
- CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward [50.97588334916863]
We develop CompassVerifier, an accurate and robust lightweight verifier model for evaluation and outcome reward. It demonstrates multi-domain competency spanning math, knowledge, and diverse reasoning tasks, with the capability to process various answer types. We introduce VerifierBench benchmark comprising model outputs collected from multiple data sources, augmented through manual analysis of metaerror patterns to enhance CompassVerifier.
arXiv Detail & Related papers (2025-08-05T17:55:24Z)
- SAEL: Leveraging Large Language Models with Adaptive Mixture-of-Experts for Smart Contract Vulnerability Detection [14.581402965011117]
We propose SAEL, an LLM-based framework for smart contract vulnerability detection. We first design targeted prompts to guide LLMs in identifying vulnerabilities and generating explanations. Next, we apply prompt-tuning on CodeT5 and T5 to process contract code and explanations, enhancing task-specific performance.
arXiv Detail & Related papers (2025-07-30T04:28:00Z)
- Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection [0.0]
We train and test machine learning algorithms to classify smart contract codes according to type in order to compare model performance. Our research combines machine learning and large language models to provide a rich and interpretable framework for detecting different smart contract vulnerabilities.
arXiv Detail & Related papers (2025-01-04T08:32:53Z)
- LLM-SmartAudit: Advanced Smart Contract Vulnerability Detection [3.1409266162146467]
This paper introduces LLM-SmartAudit, a novel framework to detect and analyze vulnerabilities in smart contracts.
Using a multi-agent conversational approach, LLM-SmartAudit employs a collaborative system with specialized agents to enhance the audit process.
Our framework can detect complex logic vulnerabilities that traditional tools have previously overlooked.
arXiv Detail & Related papers (2024-10-12T06:24:21Z)
- Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification [52.095460362197336]
Large language models (LLMs) struggle with consistent and accurate reasoning.
LLMs are trained primarily on correct solutions, reducing their ability to detect and learn from errors.
We propose a novel collaborative method integrating Chain-of-Thought (CoT) and Program-of-Thought (PoT) solutions for verification.
arXiv Detail & Related papers (2024-10-05T05:21:48Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs). We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)