An Effective and Cost-Efficient Agentic Framework for Ethereum Smart Contract Auditing
- URL: http://arxiv.org/abs/2601.17833v1
- Date: Sun, 25 Jan 2026 13:28:37 GMT
- Title: An Effective and Cost-Efficient Agentic Framework for Ethereum Smart Contract Auditing
- Authors: Xiaohui Hu, Wun Yu Chan, Yuejie Shi, Qumeng Sun, Wei-Cheng Wang, Chiachih Wu, Haoyu Wang, Ningyu He
- Abstract summary: Heimdallr is an automated auditing agent designed to overcome these hurdles through four core innovations. It minimizes context overhead while preserving essential business logic. It then employs reasoning to detect complex vulnerabilities and automatically chain functional exploits.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Smart contract security is paramount, but identifying intricate business-logic vulnerabilities remains a persistent challenge because existing solutions consistently fall short: manual auditing is unscalable, static analysis tools are plagued by false positives, and fuzzers struggle to navigate deep logic states within complex systems. Even emerging AI-based methods suffer from hallucinations, context constraints, and a heavy reliance on expensive, proprietary Large Language Models. In this paper, we introduce Heimdallr, an automated auditing agent designed to overcome these hurdles through four core innovations. By reorganizing code at the function level, Heimdallr minimizes context overhead while preserving essential business logic. It then employs heuristic reasoning to detect complex vulnerabilities and automatically chain functional exploits. Finally, a cascaded verification layer validates these findings to eliminate false positives. Notably, this approach achieves high performance on lightweight, open-source models like GPT-OSS-120B without relying on proprietary systems. Our evaluations demonstrate exceptional performance: Heimdallr successfully reconstructed 17 out of 20 real-world attacks occurring after June 2025, which together caused $384M in losses, and uncovered 4 confirmed zero-day vulnerabilities that safeguarded $400M in TVL. Compared to SOTA baselines, including both official industrial tools and academic tools, Heimdallr reduces analysis time by up to 97.59% and financial costs by up to 98.77% while improving detection precision by over 93.66%. Notably, when applied to auditing contests, Heimdallr achieves a 92.45% detection rate at a negligible cost of $2.31 per 10K LOC. We provide production-ready auditing services and release valuable benchmarks for future work.
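The abstract describes two of Heimdallr's mechanisms: reorganizing contract code at the function level so that each model query carries minimal context, and a cascaded verification layer that passes candidate findings through successive checks to filter false positives. A minimal Python sketch of both ideas follows. It is purely illustrative, not the Heimdallr implementation: the names `split_functions`, `FunctionChunk`, and `cascaded_verify` are invented here, and a production tool would parse the Solidity AST rather than rely on a regex.

```python
import re
from dataclasses import dataclass

@dataclass
class FunctionChunk:
    """One function-level unit of a contract, small enough for an LLM query."""
    name: str
    body: str

def split_functions(solidity_source: str) -> list[FunctionChunk]:
    """Split a Solidity contract into function-level chunks so each query
    carries only the code relevant to one entry point (naive regex version;
    a real tool would walk the AST)."""
    chunks = []
    for match in re.finditer(r"function\s+(\w+)[^{]*\{", solidity_source):
        # Walk forward from the opening brace to its matching close.
        depth, i = 0, match.start()
        while i < len(solidity_source):
            if solidity_source[i] == "{":
                depth += 1
            elif solidity_source[i] == "}":
                depth -= 1
                if depth == 0:
                    break
            i += 1
        chunks.append(FunctionChunk(match.group(1),
                                    solidity_source[match.start():i + 1]))
    return chunks

def cascaded_verify(findings, checks):
    """Cascaded verification: a candidate finding survives only if every
    stage confirms it, which discards false positives cheaply."""
    return [f for f in findings if all(check(f) for check in checks)]
```

In this toy form, chunking keeps per-query context proportional to one function rather than the whole contract, and the verification cascade is just an `all()` over independently cheap checks; the paper's version additionally preserves cross-function business logic when building chunks.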
Related papers
- RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories [58.32028251925354]
Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area. We introduce RealSec-bench, a new benchmark for secure code generation meticulously constructed from real-world, high-risk Java repositories.
arXiv Detail & Related papers (2026-01-30T08:29:01Z) - SSR: Safeguarding Staking Rewards by Defining and Detecting Logical Defects in DeFi Staking [55.62033436283969]
Decentralized Finance (DeFi) staking is one of the most prominent applications within the DeFi ecosystem. Logical defects in DeFi staking could enable attackers to claim unwarranted rewards. We developed SSR (Safeguarding Staking Reward), a static analysis tool designed to detect logical defects in DeFi staking contracts.
arXiv Detail & Related papers (2026-01-09T15:01:41Z) - LLM-Powered Detection of Price Manipulation in DeFi [12.59175486585742]
Decentralized Finance (DeFi) smart contracts manage billions of dollars, making them a prime target for exploits. Price manipulation vulnerabilities, often executed via flash loans, are a devastating class of attacks causing significant financial losses. We propose PMDetector, a hybrid framework combining static analysis with Large Language Model (LLM)-based reasoning.
arXiv Detail & Related papers (2025-10-24T09:13:30Z) - Towards Secure and Explainable Smart Contract Generation with Security-Aware Group Relative Policy Optimization [18.013438474903314]
We propose SmartCoder-R1, a framework for secure and explainable smart contract generation. We train the model to emulate human security analysis. SmartCoder-R1 establishes a new state of the art, achieving top performance across five key metrics.
arXiv Detail & Related papers (2025-09-12T03:14:50Z) - CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale [45.97598662617568]
We introduce CyberGym, a large-scale benchmark featuring 1,507 real-world vulnerabilities across 188 software projects. We show that CyberGym leads to the discovery of 35 zero-day vulnerabilities and 17 historically incomplete patches. These results underscore that CyberGym is not only a robust benchmark for measuring AI's progress in cybersecurity but also a platform for creating direct, real-world security impact.
arXiv Detail & Related papers (2025-06-03T07:35:14Z) - AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage. We show that scaling agentic reasoning systems at test time substantially enhances robustness without compromising model utility. Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z) - Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [71.7892165868749]
Commercial Large Language Model (LLM) APIs create a fundamental trust problem: users pay for specific models but have no guarantee that providers deliver them faithfully. We formalize this model substitution problem and evaluate detection methods under realistic adversarial conditions. We propose and evaluate the use of Trusted Execution Environments (TEEs) as one practical and robust solution.
arXiv Detail & Related papers (2025-04-07T03:57:41Z) - Do Automated Fixes Truly Mitigate Smart Contract Exploits? [7.570246812206772]
This paper introduces a novel and systematic experimental framework for evaluating the exploit mitigation of program repair tools for smart contracts. We qualitatively and quantitatively analyze 20 state-of-the-art APR tools using a dataset of 143 vulnerable smart contracts. Our findings reveal substantial disparities in the state of the art, with exploit mitigation rates ranging from a low of 29% to a high of 74%.
arXiv Detail & Related papers (2025-01-08T16:31:59Z) - AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z) - Retrieval Augmented Generation Integrated Large Language Models in Smart Contract Vulnerability Detection [0.0]
Decentralized Finance (DeFi) has been accompanied by substantial financial losses due to smart contract vulnerabilities.
With attacks becoming more frequent, the necessity and demand for auditing services has escalated.
This study builds upon existing frameworks by integrating Retrieval-Augmented Generation (RAG) with large language models (LLMs).
arXiv Detail & Related papers (2024-07-20T10:46:42Z) - G$^2$uardFL: Safeguarding Federated Learning Against Backdoor Attacks through Attributed Client Graph Clustering [116.4277292854053]
Federated Learning (FL) offers collaborative model training without data sharing.
FL is vulnerable to backdoor attacks, where poisoned model weights lead to compromised system integrity.
We present G$^2$uardFL, a protective framework that reinterprets the identification of malicious clients as an attributed graph clustering problem.
arXiv Detail & Related papers (2023-06-08T07:15:04Z) - Smart Contract and DeFi Security Tools: Do They Meet the Needs of Practitioners? [10.771021805354911]
Attacks targeting smart contracts are increasing, causing an estimated $6.45 billion in financial losses.
We aim to shed light on the effectiveness of automated security tools in identifying vulnerabilities that can lead to high-profile attacks.
Our findings reveal a stark reality: the tools could have prevented a mere 8% of the attacks in our dataset, amounting to $149 million out of the $2.3 billion in losses.
arXiv Detail & Related papers (2023-04-06T10:27:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.