TraceLLM: Security Diagnosis Through Traces and Smart Contracts in Ethereum
- URL: http://arxiv.org/abs/2509.03037v1
- Date: Wed, 03 Sep 2025 05:53:56 GMT
- Title: TraceLLM: Security Diagnosis Through Traces and Smart Contracts in Ethereum
- Authors: Shuzheng Wang, Yue Huang, Zhuoer Xu, Yuming Huang, Jing Tang,
- Abstract summary: TraceLLM establishes the first benchmark for joint trace and contract code-driven security analysis.<n>TraceLLM identifies attacker and victim addresses with 85.19% precision and produces automated reports with 70.37% factual precision.<n>Across real-world incidents, TraceLLM automatically generates reports with 66.22% expert-verified accuracy, demonstrating strong generalizability.
- Score: 23.800363904247146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ethereum smart contracts hold tens of billions of USD in DeFi and NFTs, yet comprehensive security analysis remains difficult due to unverified code, proxy-based architectures, and the reliance on manual inspection of complex execution traces. Existing approaches fall into two main categories: anomaly transaction detection, which flags suspicious transactions but offers limited insight into specific attack strategies hidden in execution traces inside transactions, and code vulnerability detection, which cannot analyze unverified contracts and struggles to show how identified flaws are exploited in real incidents. As a result, analysts must still manually align transaction traces with contract code to reconstruct attack scenarios and conduct forensics. To address this gap, TraceLLM is proposed as a framework that leverages LLMs to integrate execution trace-level detection with decompiled contract code. We introduce a new anomaly execution path identification algorithm and an LLM-refined decompile tool to identify vulnerable functions and provide explicit attack paths to LLM. TraceLLM establishes the first benchmark for joint trace and contract code-driven security analysis. For comparison, proxy baselines are created by jointly transmitting the results of three representative code analysis along with raw traces to LLM. TraceLLM identifies attacker and victim addresses with 85.19\% precision and produces automated reports with 70.37\% factual precision across 27 cases with ground truth expert reports, achieving 25.93\% higher accuracy than the best baseline. Moreover, across 148 real-world Ethereum incidents, TraceLLM automatically generates reports with 66.22\% expert-verified accuracy, demonstrating strong generalizability.
Related papers
- RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories [58.32028251925354]
Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area.<n>We introduce RealSec-bench, a new benchmark for secure code generation meticulously constructed from real-world, high-risk Java repositories.
arXiv Detail & Related papers (2026-01-30T08:29:01Z) - Bridging Expert Reasoning and LLM Detection: A Knowledge-Driven Framework for Malicious Packages [10.858565849895314]
Open-source ecosystems such as NPM and PyPI are increasingly targeted by supply chain attacks.<n>We present IntelGuard, a retrieval-augmented generation (RAG) based framework that integrates expert analytical reasoning into automated malicious package detection.
arXiv Detail & Related papers (2026-01-23T05:31:12Z) - Trace: Securing Smart Contract Repository Against Access Control Vulnerability [58.02691083789239]
GitHub hosts numerous smart contract repositories containing source code, documentation, and configuration files.<n>Third-party developers often reference, reuse, or fork code from these repositories during custom development.<n>Existing tools for detecting smart contract vulnerabilities are limited in their ability to handle complex repositories.
arXiv Detail & Related papers (2025-10-22T05:18:28Z) - Generic Adversarial Smart Contract Detection with Semantics and Uncertainty-Aware LLM [18.01454017110476]
FinDet is a generic adversarial smart contracts detection framework.<n>It takes as input only the EVM-bytecode contracts and identifies adversarial ones with high balanced accuracy.<n>Our comprehensive evaluation shows that FinDet achieves a BAC of 0.9223 and a TPR of 0.8950, significantly outperforming existing baselines.
arXiv Detail & Related papers (2025-09-23T12:52:05Z) - VulAgent: Hypothesis-Validation based Multi-Agent Vulnerability Detection [55.957275374847484]
VulAgent is a multi-agent vulnerability detection framework based on hypothesis validation.<n>It implements a semantics-sensitive, multi-view detection pipeline, each aligned to a specific analysis perspective.<n>On average, VulAgent improves overall accuracy by 6.6%, increases the correct identification rate of vulnerable--fixed code pairs by up to 450%, and reduces the false positive rate by about 36%.
arXiv Detail & Related papers (2025-09-15T02:25:38Z) - NATLM: Detecting Defects in NFT Smart Contracts Leveraging LLM [0.8225825738565354]
This paper presents a framework called NATLM, designed to detect potential defects in NFT smart contracts.<n>The framework effectively identifies four common types of vulnerabilities in NFT smart contracts: ERC-721 Reentrancy, Public Burn, Risky Mutable Proxy, and Unlimited Minting.<n> Experimental results indicate that NATLM analyzed 8,672 collected NFT smart contracts, achieving an overall precision of 87.72%, a recall of 89.58%, and an F1 score of 88.94%.
arXiv Detail & Related papers (2025-08-02T12:56:27Z) - Decompiling Smart Contracts with a Large Language Model [51.49197239479266]
Despite Etherscan's 78,047,845 smart contracts deployed on (as of May 26, 2025), a mere 767,520 ( 1%) are open source.<n>This opacity necessitates the automated semantic analysis of on-chain smart contract bytecode.<n>We introduce a pioneering decompilation pipeline that transforms bytecode into human-readable and semantically faithful Solidity code.
arXiv Detail & Related papers (2025-06-24T13:42:59Z) - ETrace:Event-Driven Vulnerability Detection in Smart Contracts via LLM-Based Trace Analysis [14.24781559851732]
We present ETrace, a novel event-driven vulnerability detection framework for smart contracts.<n>By extracting fine-grained event sequences from transaction logs, the framework leverages Large Language Models (LLMs) as adaptive semantic interpreters.<n>ETrace implements pattern-matching to establish causal links between transaction behavior patterns and known attack behaviors.
arXiv Detail & Related papers (2025-06-18T18:18:19Z) - Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [60.881609323604685]
Large Language Models (LLMs) accessed via black-box APIs introduce a trust challenge.<n>Users pay for services based on advertised model capabilities.<n> providers may covertly substitute the specified model with a cheaper, lower-quality alternative to reduce operational costs.<n>This lack of transparency undermines fairness, erodes trust, and complicates reliable benchmarking.
arXiv Detail & Related papers (2025-04-07T03:57:41Z) - Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Reasoning [58.57194301645823]
Large language models (LLMs) are increasingly integrated into real-world personalized applications.<n>The valuable and often proprietary nature of the knowledge bases used in RAG introduces the risk of unauthorized usage by adversaries.<n>Existing methods that can be generalized as watermarking techniques to protect these knowledge bases typically involve poisoning or backdoor attacks.<n>We propose name for harmless' copyright protection of knowledge bases.
arXiv Detail & Related papers (2025-02-10T09:15:56Z) - Towards Exception Safety Code Generation with Intermediate Representation Agents Framework [54.03528377384397]
Large Language Models (LLMs) often struggle with robust exception handling in generated code, leading to fragile programs that are prone to runtime errors.<n>We propose Seeker, a novel multi-agent framework that enforces exception safety in LLM generated code through an Intermediate Representation (IR) approach.<n>Seeker decomposes exception handling into five specialized agents: Scanner, Detector, Predator, Ranker, and Handler.
arXiv Detail & Related papers (2024-10-09T14:45:45Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - Efficiently Detecting Reentrancy Vulnerabilities in Complex Smart Contracts [35.26195628798847]
Existing vulnerability detection tools perform poorly in terms of efficiency and successful detection rates for vulnerabilities in complex contracts.
SliSE provides a robust and efficient method for detection of Reentrancy vulnerabilities for complex contracts.
arXiv Detail & Related papers (2024-03-17T16:08:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.