A Bytecode-based Approach for Smart Contract Classification
- URL: http://arxiv.org/abs/2106.15497v1
- Date: Mon, 31 May 2021 03:00:29 GMT
- Title: A Bytecode-based Approach for Smart Contract Classification
- Authors: Chaochen Shi, Yong Xiang, Robin Ram Mohan Doss, Jiangshan Yu, Keshav
Sood, Longxiang Gao
- Abstract summary: The number of smart contracts deployed on blockchain platforms is growing exponentially, which makes it difficult for users to find desired services by manual screening.
Current research on smart contract classification focuses on Natural Language Processing (NLP) solutions which are based on contract source code.
This paper proposes a classification model based on features from contract bytecode instead of source code to solve these problems.
- Score: 10.483992071557195
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of blockchain technologies, the number of smart
contracts deployed on blockchain platforms is growing exponentially, which
makes it difficult for users to find desired services by manual screening. The
automatic classification of smart contracts can provide blockchain users with
keyword-based contract searching and help them manage smart contracts
effectively. Current research on smart contract classification focuses on
Natural Language Processing (NLP) solutions which are based on contract source
code. However, more than 94% of smart contracts are not open-source, so the
application scenarios of NLP methods are very limited. Meanwhile, NLP models
are vulnerable to adversarial attacks. This paper proposes a classification
model based on features from contract bytecode instead of source code to solve
these problems. We also use feature selection and ensemble learning to optimize
the model. Our experimental studies on over 3,300 real-world Ethereum smart
contracts show that our model can classify smart contracts without source code
and has better performance than baseline models. Our model also has good
resistance to adversarial attacks compared with NLP-based models. In addition,
our analysis reveals that account features used in many smart contract
classification models have little effect on classification and can be excluded.
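The abstract above describes the overall pipeline: derive features from contract bytecode, apply feature selection, and train an ensemble classifier. The following is a rough, hedged sketch of that idea; the opcode table, feature design, and model choices are illustrative placeholders, not the authors' exact configuration.

```python
# Sketch only: opcode-frequency features from EVM bytecode feeding an ensemble
# classifier. The opcode table is partial and the feature/model choices are
# illustrative assumptions, not the paper's exact design.
from collections import Counter

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Minimal, partial EVM opcode table; unknown opcodes fall back to their hex value.
OPCODES = {0x00: "STOP", 0x01: "ADD", 0x35: "CALLDATALOAD", 0x54: "SLOAD",
           0x55: "SSTORE", 0xf1: "CALL", 0xfd: "REVERT", 0xff: "SELFDESTRUCT"}

def opcode_histogram(bytecode_hex: str) -> Counter:
    """Count opcodes in a hex-encoded contract, skipping PUSH1..PUSH32 immediates."""
    data = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    counts, i = Counter(), 0
    while i < len(data):
        op = data[i]
        counts[OPCODES.get(op, f"0x{op:02x}")] += 1
        i += 1 + (op - 0x5f if 0x60 <= op <= 0x7f else 0)  # skip PUSHn operand bytes
    return counts

def train_classifier(bytecodes, labels, k="all"):
    """Vectorize opcode counts, optionally keep the k best features, fit an ensemble.

    Set k to an integer (e.g. 200) to mimic the feature-selection step on real data.
    """
    ensemble = VotingClassifier([
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ], voting="soft")
    model = make_pipeline(DictVectorizer(), SelectKBest(chi2, k=k), ensemble)
    model.fit([opcode_histogram(b) for b in bytecodes], labels)
    return model
```

Here `labels` can be any per-contract category labels; in the paper the targets are service-oriented contract classes.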
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning the economic interests of different stakeholders in online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges, from the robust design of contracts to balancing exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z)
- Efficacy of Various Large Language Models in Generating Smart Contracts [0.0]
This study analyzes the application of code-generating Large Language Models to the creation of Solidity smart contracts.
We also discovered a novel way of generating smart contracts through new prompting strategies.
arXiv Detail & Related papers (2024-06-28T17:31:47Z) - Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection [8.121484960948303]
We propose Contrastive Learning Enhanced Automated Recognition Approach for Smart Contract Vulnerabilities, named Clear.
In particular, Clear employs a contrastive learning (CL) model to capture the fine-grained correlation information among contracts.
We show that Clear achieves the best performance among all baseline methods, with a 9.73%-39.99% higher F1-score than existing deep learning methods.
arXiv Detail & Related papers (2024-04-27T09:13:25Z)
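The Clear entry above relies on a contrastive objective over contract representations to capture fine-grained correlations. The snippet below is a generic NT-Xent-style contrastive loss over paired embeddings, included only as a hedged illustration of the technique; it is not Clear's actual loss or implementation.

```python
# Generic NT-Xent-style contrastive loss over two batches of embeddings, where
# z1[i] and z2[i] are representations of two "views" of the same contract.
# Illustration of contrastive learning in general, not Clear's objective.
import numpy as np

def nt_xent_loss(z1: np.ndarray, z2: np.ndarray, temperature: float = 0.5) -> float:
    z = np.concatenate([z1, z2], axis=0).astype(float)
    z /= np.linalg.norm(z, axis=1, keepdims=True)               # cosine-similarity space
    sim = z @ z.T / temperature                                 # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                              # drop self-similarity
    n = len(z1)
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # matching view index
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), positives].mean())

# Toy usage with random stand-in embeddings for a batch of 16 contracts.
rng = np.random.default_rng(0)
loss = nt_xent_loss(rng.normal(size=(16, 64)), rng.normal(size=(16, 64)))
```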
- VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model [0.0]
VulnSense is a comprehensive approach to efficiently detect vulnerabilities in smart contracts.
Our framework combines three types of features from smart contracts: source code, opcode sequences, and control flow graphs.
We employ Bidirectional Encoder Representations from Transformers (BERT), Bidirectional Long Short-Term Memory (BiLSTM), and Graph Neural Network (GNN) models to extract and analyze these features.
The experimental outcomes demonstrate the superior performance of our proposed approach, achieving an average accuracy of 77.96% across all three categories of vulnerable smart contracts.
arXiv Detail & Related papers (2023-09-15T15:26:44Z)
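VulnSense above fuses three views of a contract: source code, opcode sequences, and the control flow graph. The sketch below shows only the generic late-fusion pattern, with placeholder embedding matrices standing in for the BERT, BiLSTM, and GNN encoders; it is an assumption-laden illustration, not the VulnSense implementation.

```python
# Late-fusion sketch: concatenate per-modality contract embeddings and classify.
# The random matrices below stand in for BERT (source), BiLSTM (opcodes) and
# GNN (control-flow-graph) encoder outputs, which are not implemented here.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_fusion_classifier(source_vecs, opcode_vecs, cfg_vecs, labels):
    X = np.concatenate([source_vecs, opcode_vecs, cfg_vecs], axis=1)  # one row per contract
    return LogisticRegression(max_iter=2000).fit(X, labels)

# Toy usage: 128 contracts, three 64-dimensional views, binary vulnerability labels.
rng = np.random.default_rng(0)
views = [rng.normal(size=(128, 64)) for _ in range(3)]
labels = rng.integers(0, 2, size=128)
clf = train_fusion_classifier(*views, labels)
```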
- Delegating Data Collection in Decentralized Machine Learning [67.0537668772372]
Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection.
We design optimal and near-optimal contracts that deal with two fundamental information asymmetries.
We show that a principal can cope with such asymmetry via simple linear contracts that achieve a (1-1/e) fraction of the optimal utility.
arXiv Detail & Related papers (2023-09-04T22:16:35Z)
- SmartIntentNN: Towards Smart Contract Intent Detection [5.9789082082171525]
We introduce SmartIntentNN (Smart Contract Intent Neural Network), a deep learning-based tool designed to automate the detection of developers' intent in smart contracts.
Our approach integrates a Universal Sentence Encoder for contextual representation of smart contract code and employs a K-means clustering algorithm to highlight intent-related code features.
Evaluations on 10,000 real-world smart contracts demonstrate that SmartIntentNN surpasses all baselines, achieving an F1-score of 0.8633.
arXiv Detail & Related papers (2022-11-24T15:36:35Z)
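SmartIntentNN above pairs sentence-level embeddings of contract code with K-means clustering to highlight intent-related features. The sketch below covers only the clustering step, with random placeholder vectors instead of Universal Sentence Encoder outputs; the distance-to-centroid score is an assumption for illustration, not the paper's exact highlighting mechanism.

```python
# K-means over code-snippet embeddings; the distance of each snippet to its
# cluster centroid serves here as a crude salience signal. Placeholder vectors
# stand in for Universal Sentence Encoder outputs; not SmartIntentNN itself.
import numpy as np
from sklearn.cluster import KMeans

def centroid_distances(embeddings: np.ndarray, n_clusters: int = 8) -> np.ndarray:
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(embeddings)
    return np.linalg.norm(embeddings - km.cluster_centers_[km.labels_], axis=1)

# Toy usage with random stand-in embeddings for 200 code snippets.
scores = centroid_distances(np.random.default_rng(1).normal(size=(200, 512)))
```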
- Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and Expert Pattern Fusion [48.744359070088166]
Conventional smart contract vulnerability detection methods heavily rely on fixed expert rules.
Recent deep learning approaches alleviate this issue but fail to encode useful expert knowledge.
We develop automatic tools to extract expert patterns from the source code.
We then cast the code into a semantic graph to extract deep graph features.
arXiv Detail & Related papers (2021-06-17T07:12:13Z)
- ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
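One plausible way to realize the DNN-plus-transfer-learning idea in the ESCORT entry above is a shared encoder with one output head per vulnerability type, so that a new vulnerability class only requires a new head. The sketch below illustrates that pattern; layer sizes, names, and the freezing strategy are assumptions, not ESCORT's published architecture.

```python
# Multi-head detector sketch: a shared encoder feeds one sigmoid head per
# vulnerability type; adding a head and freezing the encoder mimics the
# transfer-learning extension described above. Sizes and names are illustrative.
import torch
import torch.nn as nn

class MultiHeadDetector(nn.Module):
    def __init__(self, input_dim: int, vuln_types: list, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({name: nn.Linear(hidden, 1) for name in vuln_types})

    def forward(self, x: torch.Tensor) -> dict:
        h = self.encoder(x)
        return {name: torch.sigmoid(head(h)) for name, head in self.heads.items()}

    def add_vulnerability(self, name: str) -> None:
        """Attach a new head and freeze the shared encoder before fine-tuning."""
        self.heads[name] = nn.Linear(self.encoder[0].out_features, 1)
        for p in self.encoder.parameters():
            p.requires_grad = False

# Toy usage: per-type probabilities for a batch of 4 contracts, then extension.
model = MultiHeadDetector(input_dim=256, vuln_types=["reentrancy", "overflow"])
probs = model(torch.randn(4, 256))
model.add_vulnerability("tx_origin")
```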
- Blockchain Assisted Decentralized Federated Learning (BLADE-FL) with Lazy Clients [124.48732110742623]
We propose a novel framework by integrating blockchain into Federated Learning (FL).
BLADE-FL has a good performance in terms of privacy preservation, tamper resistance, and effective cooperation of learning.
It gives rise to a new problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noises to conceal their cheating behaviors.
arXiv Detail & Related papers (2020-12-02T12:18:27Z)
- Blockchain Enabled Smart Contract Based Applications: Deficiencies with the Software Development Life Cycle Models [0.0]
The immutability of the blocks, where the smart contracts are stored, causes conflicts with the traditional Software Development Life Cycle (SDLC) models.
This research article addresses this current problem by first exploring the six traditional SDLC models.
It advocates that there is an urgent need to develop new standard model(s) to address the arising issues.
arXiv Detail & Related papers (2020-01-21T03:48:46Z)
- AvgOut: A Simple Output-Probability Measure to Eliminate Dull Responses [97.50616524350123]
We build dialogue models that are dynamically aware of what utterances or tokens are dull without any feature-engineering.
The first model, MinAvgOut, directly maximizes the diversity score through the output distributions of each batch.
The second model, Label Fine-Tuning (LFT), prepends to the source sequence a label continuously scaled by the diversity score to control the diversity level.
The third model, RL, adopts Reinforcement Learning and treats the diversity score as a reward signal.
arXiv Detail & Related papers (2020-01-15T18:32:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.