Smart Contract Intent Detection with Pre-trained Programming Language Model
- URL: http://arxiv.org/abs/2508.20086v3
- Date: Fri, 03 Oct 2025 14:43:31 GMT
- Title: Smart Contract Intent Detection with Pre-trained Programming Language Model
- Authors: Youwei Huang, Jianwen Li, Sen Fang, Yao Li, Peng Yang, Bin Hu,
- Abstract summary: Malicious developer intents in smart contracts constitute significant security threats to decentralized applications. In this study, we present an enhanced version of this model, SmartIntentNN2 (Smart Contract Intent Neural Network V2). The primary enhancement is the integration of a BERT-based pre-trained programming language model. On the same evaluation set of 10,000 smart contracts, SmartIntentNN2 achieves superior performance with an accuracy of 0.9789, precision of 0.9090, recall of 0.9476, and an F1 score of 0.9279.
- Score: 8.693208013894653
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Malicious developer intents in smart contracts constitute significant security threats to decentralized applications, leading to substantial economic losses. To address this, SmartIntentNN was previously introduced as a deep learning model for detecting unsafe developer intents. By combining the Universal Sentence Encoder, a K-means clustering-based intent highlighting mechanism, and a Bidirectional Long Short-Term Memory (BiLSTM) network, the model achieved an F1 score of 0.8633 on an evaluation set of 10,000 real-world smart contracts across ten distinct intent categories. In this study, we present an enhanced version of this model, SmartIntentNN2 (Smart Contract Intent Neural Network V2). The primary enhancement is the integration of a BERT-based pre-trained programming language model, which we domain-adaptively pre-train on a dataset of 16,000 real-world smart contracts using a Masked Language Modeling objective. SmartIntentNN2 retains the BiLSTM-based multi-label classification network for intent detection. On the same evaluation set of 10,000 smart contracts, SmartIntentNN2 achieves superior performance with an accuracy of 0.9789, precision of 0.9090, recall of 0.9476, and an F1 score of 0.9279, substantially outperforming its predecessor and other baseline models. Notably, SmartIntentNN2 also delivers a 65.5% relative improvement in F1 score over GPT-4.1 on this specialized task. These results establish SmartIntentNN2 as a new state-of-the-art model for smart contract intent detection.
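The paper's primary enhancement is domain-adaptive pre-training with a Masked Language Modeling objective. As a minimal, dependency-free sketch of what BERT-style masking does to a token stream, the snippet below applies the standard 80/10/10 masking recipe; that split and the toy Solidity token stream are standard-practice assumptions for illustration, not details taken from the paper:

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """BERT-style masking: ~15% of tokens are selected; of those,
    80% become [MASK], 10% become a random vocabulary token, and
    10% stay unchanged. Returns the corrupted sequence plus
    per-position labels (the original token where selected, None
    elsewhere) -- the MLM loss is computed only at labeled positions."""
    rng = random.Random(seed)
    vocab = sorted(set(tokens))
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)              # model must recover this token
            r = rng.random()
            if r < 0.8:
                corrupted.append(mask_token)
            elif r < 0.9:
                corrupted.append(rng.choice(vocab))
            else:
                corrupted.append(tok)
        else:
            labels.append(None)             # excluded from the loss
            corrupted.append(tok)
    return corrupted, labels

# Toy Solidity-like token stream for illustration
tokens = "function transfer ( address to , uint256 amount ) public".split()
corrupted, labels = mask_tokens(tokens, seed=7)
```

In an actual pipeline this corruption is applied on the fly per batch, so the model sees different maskings of the same contract across epochs.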
Related papers
- MI$^2$DAS: A Multi-Layer Intrusion Detection Framework with Incremental Learning for Securing Industrial IoT Networks [47.386868423451595]
MI$^2$DAS is a multi-layer intrusion detection framework that integrates anomaly-based hierarchical traffic pooling and open-set recognition. Experiments conducted on the Edge-IIoTset dataset demonstrate strong performance across all layers. These results showcase MI$^2$DAS as an effective, scalable, and adaptive framework for enhancing IIoT security.
arXiv Detail & Related papers (2026-02-27T09:37:05Z) - CREDIT: Certified Ownership Verification of Deep Neural Networks Against Model Extraction Attacks [54.04030169323115]
We introduce CREDIT, a certified ownership verification scheme against Model Extraction Attacks (MEAs). We quantify the similarity between DNN models, propose a practical verification threshold, and provide rigorous theoretical guarantees for ownership verification based on this threshold. We extensively evaluate our approach on several mainstream datasets across different domains and tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2026-02-23T23:36:25Z) - SmartLLM: Smart Contract Auditing using Custom Generative AI [0.0]
This paper introduces SmartLLM, a novel approach leveraging fine-tuned LLaMA 3.1 models with Retrieval-Augmented Generation (RAG). By integrating domain-specific knowledge from ERC standards, SmartLLM achieves superior performance compared to static analysis tools like Mythril and Slither. Experimental results demonstrate a perfect recall of 100% and an accuracy score of 70%, highlighting the model's robustness in identifying vulnerabilities.
arXiv Detail & Related papers (2025-02-17T06:22:05Z) - SmartLLMSentry: A Comprehensive LLM Based Smart Contract Vulnerability Detection Framework [0.0]
This paper introduces SmartLLMSentry, a novel framework that leverages large language models (LLMs) to advance smart contract vulnerability detection. We created a specialized dataset of five randomly selected vulnerabilities for model training and evaluation. Our results show an exact match accuracy of 91.1% with sufficient data, although GPT-4 demonstrated reduced performance compared to GPT-3 in rule generation.
arXiv Detail & Related papers (2024-11-28T16:02:01Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - Prompt Tuning for Zero-shot Compositional Learning [53.090335182962605]
We propose a framework named Multi-Modal Prompt Tuning (MMPT) to inherit the "knowledgeable" property from the large pre-trained vision-language model.
On the UT-Zappos dataset, MMPT pushes the AUC score to $29.8$, while the previous best score is $26.5$.
On the more challenging MIT-States dataset, the AUC score of MMPT is 1.5 times better than the current state-of-the-art.
arXiv Detail & Related papers (2023-12-02T07:32:24Z) - Co-guiding for Multi-intent Spoken Language Understanding [53.30511968323911]
We propose a novel model termed Co-guiding Net, which implements a two-stage framework achieving the mutual guidances between the two tasks.
For the first stage, we propose single-task supervised contrastive learning, and for the second stage, we propose co-guiding supervised contrastive learning.
Experiment results on multi-intent SLU show that our model outperforms existing models by a large margin.
arXiv Detail & Related papers (2023-11-22T08:06:22Z) - Schooling to Exploit Foolish Contracts [4.18804572788063]
We introduce SCooLS, our Smart Contract Learning (Semi-supervised) engine.
SCooLS uses neural networks to analyze contract bytecode and identifies specific vulnerable functions.
SCooLS outperforms existing tools, with an accuracy of 98.4%, an F1 score of 90.5%, and an exceptionally low false positive rate of only 0.8%.
arXiv Detail & Related papers (2023-04-21T04:25:08Z) - SmartIntentNN: Towards Smart Contract Intent Detection [5.9789082082171525]
We introduce SmartIntentNN (Smart Contract Intent Neural Network), a deep learning-based tool designed to automate the detection of developers' intent in smart contracts.
Our approach integrates a Universal Sentence Encoder for contextual representation of smart contract code and employs a K-means clustering algorithm to highlight intent-related code features.
Evaluations on 10,000 real-world smart contracts demonstrate that SmartIntentNN surpasses all baselines, achieving an F1 score of 0.8633.
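The K-means intent-highlighting mechanism clusters code embeddings and amplifies the cluster treated as intent-related before classification. The dependency-free sketch below illustrates the idea; the "scale the cluster farthest from the global mean" rule and the scale factor are illustrative assumptions, not the paper's exact heuristic:

```python
import math
import random

def _mean(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(dim) / len(points) for dim in zip(*points)]

def kmeans(points, k=2, iters=20, seed=0):
    """Plain Lloyd's algorithm: returns cluster assignments and centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:                      # keep old centroid if cluster empties
                centroids[c] = _mean(members)
    return assign, centroids

def highlight(embeddings, k=2, scale=2.0):
    """Scale embeddings in the cluster farthest from the global mean --
    a stand-in for 'intent-related' features (assumed rule)."""
    assign, centroids = kmeans(embeddings, k)
    g = _mean(embeddings)
    far = max(range(k), key=lambda c: math.dist(centroids[c], g))
    return [[x * scale for x in e] if a == far else list(e)
            for e, a in zip(embeddings, assign)]

# Toy 2-D "embeddings": two tight groups, one of which gets amplified
embs = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
highlighted = highlight(embs)
```

In the real model the embeddings come from the sentence encoder and the downstream classifier consumes the rescaled vectors, so the highlighting acts as a soft attention over code segments.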
arXiv Detail & Related papers (2022-11-24T15:36:35Z) - Deep Smart Contract Intent Detection [5.642524477190184]
SmartIntentNN is a deep learning model designed to automatically detect development intents in smart contracts. We trained and evaluated SmartIntentNN on a dataset containing over 40,000 real-world smart contracts.
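Both model versions keep a BiLSTM multi-label head: each intent category receives an independent sigmoid score, and an intent is reported when its score clears a threshold, so one contract can carry several intents at once. A minimal sketch of that decision rule follows; the category names and the 0.5 threshold are illustrative assumptions, not taken from the paper:

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intent-category names, for illustration only.
CATEGORIES = ["fee", "disableTrading", "blacklist", "mint", "honeypot"]

def predict_intents(logits, categories=CATEGORIES, threshold=0.5):
    """Multi-label decision: each intent fires independently of the others,
    so the output can be empty, a single intent, or several."""
    return [c for c, z in zip(categories, logits) if sigmoid(z) >= threshold]

detected = predict_intents([2.1, -0.3, 0.8, -1.7, 3.0])
```

This independent-threshold design is what distinguishes multi-label intent detection from multi-class classification, where exactly one category would be chosen via softmax.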
arXiv Detail & Related papers (2022-11-19T15:40:26Z) - Robustness Certificates for Implicit Neural Networks: A Mixed Monotone Contractive Approach [60.67748036747221]
Implicit neural networks offer competitive performance and reduced memory consumption.
However, they can remain brittle with respect to input adversarial perturbations.
This paper proposes a theoretical and computational framework for robustness verification of implicit neural networks.
arXiv Detail & Related papers (2021-12-10T03:08:55Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z) - DeBERTa: Decoding-enhanced BERT with Disentangled Attention [119.77305080520718]
We propose a new model architecture DeBERTa that improves the BERT and RoBERTa models using two novel techniques.
We show that these techniques significantly improve the efficiency of model pre-training and the performance of both natural language understanding (NLU) and natural language generation (NLG) downstream tasks.
arXiv Detail & Related papers (2020-06-05T19:54:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.