A Bytecode-based Approach for Smart Contract Classification
- URL: http://arxiv.org/abs/2106.15497v1
- Date: Mon, 31 May 2021 03:00:29 GMT
- Title: A Bytecode-based Approach for Smart Contract Classification
- Authors: Chaochen Shi, Yong Xiang, Robin Ram Mohan Doss, Jiangshan Yu, Keshav
Sood, Longxiang Gao
- Abstract summary: The number of smart contracts deployed on blockchain platforms is growing exponentially, which makes it difficult for users to find desired services by manual screening.
Current research on smart contract classification focuses on Natural Language Processing (NLP) solutions which are based on contract source code.
This paper proposes a classification model based on features from contract bytecode instead of source code to solve these problems.
- Score: 10.483992071557195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of blockchain technologies, the number of smart
contracts deployed on blockchain platforms is growing exponentially, which
makes it difficult for users to find desired services by manual screening. The
automatic classification of smart contracts can provide blockchain users with
keyword-based contract searching and help them manage smart contracts
effectively. Current research on smart contract classification focuses on
Natural Language Processing (NLP) solutions which are based on contract source
code. However, more than 94% of smart contracts are not open-source, so the
application scenarios of NLP methods are very limited. Meanwhile, NLP models
are vulnerable to adversarial attacks. This paper proposes a classification
model based on features from contract bytecode instead of source code to solve
these problems. We also use feature selection and ensemble learning to optimize
the model. Our experimental studies on over 3,300 real-world Ethereum smart
contracts show that our model can classify smart contracts without source code
and has better performance than baseline models. Our model also has good
resistance to adversarial attacks compared with NLP-based models. In addition,
our analysis reveals that account features used in many smart contract
classification models have little effect on classification and can be excluded.
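The abstract above describes the overall pipeline: derive features from contract bytecode, apply feature selection, and train an ensemble classifier. The following is a rough, hedged sketch of that idea; the opcode table, feature design, and model choices are illustrative placeholders, not the authors' exact configuration.

```python
# Sketch only: opcode-frequency features from EVM bytecode feeding an ensemble
# classifier. The opcode table is partial and the feature/model choices are
# illustrative assumptions, not the paper's exact design.
from collections import Counter

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Minimal, partial EVM opcode table; unknown opcodes fall back to their hex value.
OPCODES = {0x00: "STOP", 0x01: "ADD", 0x35: "CALLDATALOAD", 0x54: "SLOAD",
           0x55: "SSTORE", 0xf1: "CALL", 0xfd: "REVERT", 0xff: "SELFDESTRUCT"}

def opcode_histogram(bytecode_hex: str) -> Counter:
    """Count opcodes in a hex-encoded contract, skipping PUSH1..PUSH32 immediates."""
    data = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    counts, i = Counter(), 0
    while i < len(data):
        op = data[i]
        counts[OPCODES.get(op, f"0x{op:02x}")] += 1
        i += 1 + (op - 0x5f if 0x60 <= op <= 0x7f else 0)  # skip PUSHn operand bytes
    return counts

def train_classifier(bytecodes, labels, k="all"):
    """Vectorize opcode counts, optionally keep the k best features, fit an ensemble.

    Set k to an integer (e.g. 200) to mimic the feature-selection step on real data.
    """
    ensemble = VotingClassifier([
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ], voting="soft")
    model = make_pipeline(DictVectorizer(), SelectKBest(chi2, k=k), ensemble)
    model.fit([opcode_histogram(b) for b in bytecodes], labels)
    return model
```

Here `labels` can be any per-contract category labels; in the paper the targets are service-oriented contract classes.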
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning the economic interests of different stakeholders in online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges, from the robust design of contracts to balancing exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z)
- Efficacy of Various Large Language Models in Generating Smart Contracts [0.0]
This study analyzes the application of code-generating Large Language Models to the creation of Solidity smart contracts.
We also discovered a novel way of generating smart contracts through new prompting strategies.
arXiv Detail & Related papers (2024-06-28T17:31:47Z) - Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection [8.121484960948303]
We propose Contrastive Learning Enhanced Automated Recognition Approach for Smart Contract Vulnerabilities, named Clear.
In particular, Clear employs a contrastive learning (CL) model to capture the fine-grained correlation information among contracts.
We show that Clear achieves the best performance among all baseline methods, with a 9.73%-39.99% higher F1-score than existing deep learning methods.
arXiv Detail & Related papers (2024-04-27T09:13:25Z)
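The Clear entry above relies on a contrastive objective over contract representations to capture fine-grained correlations. The snippet below is a generic NT-Xent-style contrastive loss over paired embeddings, included only as a hedged illustration of the technique; it is not Clear's actual loss or implementation.

```python
# Generic NT-Xent-style contrastive loss over two batches of embeddings, where
# z1[i] and z2[i] are representations of two "views" of the same contract.
# Illustration of contrastive learning in general, not Clear's objective.
import numpy as np

def nt_xent_loss(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.5) -> float:
    z = np.concatenate([z1, z2], axis=0).astype(float)
    z /= np.linalg.norm(z, axis=1, keepdims=True)               # cosine-similarity space
    sim = z @ z.T / temperature                                 # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                              # drop self-similarity
    n = len(z1)
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # matching view index
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), positives].mean())

# Toy usage with random stand-in embeddings for a batch of 16 contracts.
rng = np.random.default_rng(0)
loss = nt_xent_loss(rng.normal(size=(16, 64)), rng.normal(size=(16, 64)))
```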
- VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model [0.0]
VulnSense is a comprehensive approach to efficiently detect vulnerabilities in smart contracts.
Our framework combines three types of features from smart contracts: source code, opcode sequences, and control flow graphs.
We employ Bidirectional Encoder Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM), and Graph Neural Network (GNN) models to extract and analyze these features.
The experimental outcomes demonstrate the superior performance of our proposed approach, achieving an average accuracy of 77.96% across all three categories of vulnerable smart contracts.
arXiv Detail & Related papers (2023-09-15T15:26:44Z)
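VulnSense above fuses three views of a contract: source code, opcode sequences, and the control flow graph. The sketch below shows only the generic late-fusion pattern, with placeholder embedding matrices standing in for the BERT, BiLSTM, and GNN encoders; it is an assumption-laden illustration, not the VulnSense implementation.

```python
# Late-fusion sketch: concatenate per-modality contract embeddings and classify.
# The random matrices below stand in for BERT (source), BiLSTM (opcodes) and
# GNN (control-flow-graph) encoder outputs, which are not implemented here.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_fusion_classifier(source_vecs, opcode_vecs, cfg_vecs, labels):
    X = np.concatenate([source_vecs, opcode_vecs, cfg_vecs], axis=1)  # one row per contract
    return LogisticRegression(max_iter=2000).fit(X, labels)

# Toy usage: 128 contracts, three 64-dimensional views, binary vulnerability labels.
rng = np.random.default_rng(0)
views = [rng.normal(size=(128, 64)) for _ in range(3)]
labels = rng.integers(0, 2, size=128)
clf = train_fusion_classifier(*views, labels)
```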
- Delegating Data Collection in Decentralized Machine Learning [67.0537668772372]
Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection.
We design optimal and near-optimal contracts that deal with two fundamental information asymmetries.
We show that a principal can cope with such asymmetry via simple linear contracts that achieve a (1-1/e) fraction of the optimal utility.
arXiv Detail & Related papers (2023-09-04T22:16:35Z)
- SmartIntentNN: Towards Smart Contract Intent Detection [5.9789082082171525]
We introduce SmartIntentNN (Smart Contract Intent Neural Network), a deep learning-based tool designed to automate the detection of developers' intent in smart contracts.
Our approach integrates a Universal Sentence Encoder for contextual representation of smart contract code and employs a K-means clustering algorithm to highlight intent-related code features.
Evaluations on 10,000 real-world smart contracts demonstrate that SmartIntentNN surpasses all baselines, achieving an F1-score of 0.8633.
arXiv Detail & Related papers (2022-11-24T15:36:35Z)
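SmartIntentNN above pairs sentence-level embeddings of contract code with K-means clustering to highlight intent-related features. The sketch below covers only the clustering step, with random placeholder vectors instead of Universal Sentence Encoder outputs; the distance-to-centroid score is an assumption for illustration, not the paper's exact highlighting mechanism.

```python
# K-means over code-snippet embeddings; the distance of each snippet to its
# cluster centroid serves here as a crude salience signal. Placeholder vectors
# stand in for Universal Sentence Encoder outputs; not SmartIntentNN itself.
import numpy as np
from sklearn.cluster import KMeans

def centroid_distances(embeddings: np.ndarray, n_clusters: int = 8) -> np.ndarray:
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    return np.linalg.norm(embeddings - km.cluster_centers_[km.labels_], axis=1)

# Toy usage with random stand-in embeddings for 200 code snippets.
scores = centroid_distances(np.random.default_rng(1).normal(size=(200, 512)))
```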
- Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and Expert Pattern Fusion [48.744359070088166]
Conventional smart contract vulnerability detection methods heavily rely on fixed expert rules.
Recent deep learning approaches alleviate this issue but fail to encode useful expert knowledge.
We develop automatic tools to extract expert patterns from the source code.
We then cast the code into a semantic graph to extract deep graph features.
arXiv Detail & Related papers (2021-06-17T07:12:13Z)
- ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
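One plausible way to realize the DNN-plus-transfer-learning idea in the ESCORT entry above is a shared encoder with one output head per vulnerability type, so that a new vulnerability class only requires a new head. The sketch below illustrates that pattern; layer sizes, names, and the freezing strategy are assumptions, not ESCORT's published architecture.

```python
# Multi-head detector sketch: a shared encoder feeds one sigmoid head per
# vulnerability type; adding a head and freezing the encoder mimics the
# transfer-learning extension described above. Sizes and names are illustrative.
import torch
import torch.nn as nn

class MultiHeadDetector(nn.Module):
    def __init__(self, input_dim: int, vuln_types: list, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({name: nn.Linear(hidden, 1) for name in vuln_types})

    def forward(self, x: torch.Tensor) -> dict:
        h = self.encoder(x)
        return {name: torch.sigmoid(head(h)) for name, head in self.heads.items()}

    def add_vulnerability(self, name: str) -> None:
        """Attach a new head and freeze the shared encoder before fine-tuning."""
        self.heads[name] = nn.Linear(self.encoder[0].out_features, 1)
        for p in self.encoder.parameters():
            p.requires_grad = False

# Toy usage: per-type probabilities for a batch of 4 contracts, then extension.
model = MultiHeadDetector(input_dim=256, vuln_types=["reentrancy", "overflow"])
probs = model(torch.randn(4, 256))
model.add_vulnerability("tx_origin")
```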
- Blockchain Assisted Decentralized Federated Learning (BLADE-FL) with Lazy Clients [124.48732110742623]
We propose a novel framework by integrating blockchain into Federated Learning (FL).
BLADE-FL has a good performance in terms of privacy preservation, tamper resistance, and effective cooperation of learning.
It gives rise to a new problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noises to conceal their cheating behaviors.
arXiv Detail & Related papers (2020-12-02T12:18:27Z)
- Blockchain Enabled Smart Contract Based Applications: Deficiencies with the Software Development Life Cycle Models [0.0]
The immutability of the blocks, where the smart contracts are stored, causes conflicts with the traditional Software Development Life Cycle (SDLC) models.
This research article addresses this current problem by first exploring the six traditional SDLC models.
It advocates that there is an urgent need to develop new standard model(s) to address the arising issues.
arXiv Detail & Related papers (2020-01-21T03:48:46Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
arXiv Detail & Related papers (2020-01-15T18:32:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.