Related papers: NATLM: Detecting Defects in NFT Smart Contracts Leveraging LLM

NATLM: Detecting Defects in NFT Smart Contracts Leveraging LLM

URL: http://arxiv.org/abs/2508.01351v1
Date: Sat, 02 Aug 2025 12:56:27 GMT
Title: NATLM: Detecting Defects in NFT Smart Contracts Leveraging LLM
Authors: Yuanzheng Niu, Xiaoqi Li, Wenkai Li,
Abstract summary: This paper presents a framework called NATLM, designed to detect potential defects in NFT smart contracts.<n>The framework effectively identifies four common types of vulnerabilities in NFT smart contracts: ERC-721 Reentrancy, Public Burn, Risky Mutable Proxy, and Unlimited Minting.<n> Experimental results indicate that NATLM analyzed 8,672 collected NFT smart contracts, achieving an overall precision of 87.72%, a recall of 89.58%, and an F1 score of 88.94%.
Score: 0.8225825738565354
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Security issues are becoming increasingly significant with the rapid evolution of Non-fungible Tokens (NFTs). As NFTs are traded as digital assets, they have emerged as prime targets for cyber attackers. In the development of NFT smart contracts, there may exist undiscovered defects that could lead to substantial financial losses if exploited. To tackle this issue, this paper presents a framework called NATLM(NFT Assistant LLM), designed to detect potential defects in NFT smart contracts. The framework effectively identifies four common types of vulnerabilities in NFT smart contracts: ERC-721 Reentrancy, Public Burn, Risky Mutable Proxy, and Unlimited Minting. Relying exclusively on large language models (LLMs) for defect detection can lead to a high false-positive rate. To enhance detection performance, NATLM integrates static analysis with LLMs, specifically Gemini Pro 1.5. Initially, NATLM employs static analysis to extract structural, syntactic, and execution flow information from the code, represented through Abstract Syntax Trees (AST) and Control Flow Graphs (CFG). These extracted features are then combined with vectors of known defect examples to create a matrix for input into the knowledge base. Subsequently, the feature vectors and code vectors of the analyzed contract are compared with the contents of the knowledge base. Finally, the LLM performs deep semantic analysis to enhance detection capabilities, providing a more comprehensive and accurate identification of potential security issues. Experimental results indicate that NATLM analyzed 8,672 collected NFT smart contracts, achieving an overall precision of 87.72%, a recall of 89.58%, and an F1 score of 88.94%. The results outperform other baseline experiments, successfully identifying four common types of defects.

Related papers

LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation [56.84049855266145]
We propose a Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates evolutionary mutation strategies, and hybrid testing techniques.<n>LLAMA achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities.<n>These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.
arXiv Detail & Related papers (2025-07-16T09:46:58Z)
Smart-LLaMA-DPO: Reinforced Large Language Model for Explainable Smart Contract Vulnerability Detection [15.694744168599055]
Existing vulnerability detection methods face two main issues.<n>They lack comprehensive coverage and high-quality explanations for preference learning.<n>Large language models (LLMs) often struggle with accurately interpreting specific concepts in smart contract security.
arXiv Detail & Related papers (2025-06-23T02:24:07Z)
AI-Based Vulnerability Analysis of NFT Smart Contracts [6.378351117969227]
This study proposes an AI-driven approach to detect vulnerabilities in NFT smart contracts.<n>We collected 16,527 public smart contract codes, classifying them into five vulnerability categories: Risky Mutable Proxy, ERC-721 Reentrancy, Unlimited Minting, Missing Requirements, and Public Burn.<n>A random forest model was implemented to improve robustness through random data/feature sampling and multitree integration.
arXiv Detail & Related papers (2025-04-18T08:55:31Z)
MOS: Towards Effective Smart Contract Vulnerability Detection through Mixture-of-Experts Tuning of Large Language Models [16.16186929130931]
Smart contract vulnerabilities pose significant security risks to blockchain systems.<n>We propose a smart contract vulnerability detection framework based on mixture-of-experts tuning (MOE-Tuning) of large language models.<n> Experiments show that MOS significantly outperforms existing methods with average improvements of 6.32% in F1 score and 4.80% in accuracy.
arXiv Detail & Related papers (2025-04-16T16:33:53Z)
Enhancing Smart Contract Vulnerability Detection in DApps Leveraging Fine-Tuned LLM [0.7018579932647147]
Decentralized applications (DApps) face significant security risks due to vulnerabilities in smart contracts.<n>This paper proposes a novel approach leveraging fine-tuned Large Language Models (LLMs) to enhance smart contract vulnerability detection.
arXiv Detail & Related papers (2025-04-07T12:32:14Z)
Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning [79.48839334040197]
Instruction fine-tuning (IFT) can increase the informativeness of large language models (LLMs), but may reduce their truthfulness.<n>In this paper, we empirically demonstrate how unfamiliar knowledge in IFT datasets can negatively affect the truthfulness of LLMs.<n>We introduce two new IFT paradigms, $UNIT_cut$ and $UNIT_ref$, to address this issue.
arXiv Detail & Related papers (2025-02-17T16:10:30Z)
MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark [57.999567012489706]
We propose a contamination-free and more challenging benchmark called MMLU-CF.<n>This benchmark reassesses LLMs' understanding of world knowledge by averting both unintentional and malicious data leakage.<n>Our evaluation of mainstream LLMs reveals that the powerful GPT-4o achieves merely a 5-shot score of 73.4% and a 0-shot score of 71.9% on the test set.
arXiv Detail & Related papers (2024-12-19T18:58:04Z)
Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses. Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives. The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications [18.138452572457552]
iAudit is a framework for intuitive smart contract auditing with justifications. On a dataset of 263 real smart contract vulnerabilities, iAudit achieves an F1 score of 91.21% and an accuracy of 91.11%.
arXiv Detail & Related papers (2024-03-24T09:26:53Z)
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability. In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling. Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
Definition and Detection of Defects in NFT Smart Contracts [34.359991158202796]
Defects in NFT smart contracts could be exploited by attackers to harm the security and reliability of the NFT ecosystem. In this paper, we introduce 5 defects in NFT smart contracts and propose a tool named NFTGuard to detect these defects. We find that 1,331 contracts contain at least one of the 5 defects, and the overall precision achieved by our tool is 92.6%.
arXiv Detail & Related papers (2023-05-25T08:17:05Z)
SymNMF-Net for The Symmetric NMF Problem [62.44067422984995]
We propose a neural network called SymNMF-Net for the Symmetric NMF problem. We show that the inference of each block corresponds to a single iteration of the optimization. Empirical results on real-world datasets demonstrate the superiority of our SymNMF-Net.
arXiv Detail & Related papers (2022-05-26T08:17:39Z)
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable. We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts. We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.