Related papers: Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing

URL: http://arxiv.org/abs/2004.00362v1
Date: Sat, 21 Mar 2020 20:48:09 GMT
Title: Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing
Authors: Ajay K. Gogineni, S. Swayamjyoti, Devadatta Sahoo, Kisor K. Sahu, Raj kishore
Abstract summary: Symbolic tools like OYENTE and MAIAN are typically used for vulnerability prediction in smart contracts. We used Average Stochastic Gradient Descent Weight-Dropped LSTM (AWD-LSTM), which is a variant of LSTM, to perform classification. We have achieved a weighted average Fbeta score of 90.0%.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vulnerability detection and safety of smart contracts are of paramount importance because of their immutable nature. Symbolic tools like OYENTE and MAIAN are typically used for vulnerability prediction in smart contracts. As these tools are computationally expensive, they are typically used to detect vulnerabilities only up to some predefined invocation depth. These tools require more search time as the invocation depth increases. Since the number of smart contracts is increasing exponentially, it is difficult to analyze the contracts using these traditional tools. Recently a machine learning technique called Long Short Term Memory (LSTM) has been used for binary classification, i.e., to predict whether a smart contract is vulnerable or not. This technique requires nearly constant search time as the invocation depth increases. In the present article, we have shown a multi-class classification, where we classify a smart contract into the Suicidal, Prodigal, Greedy, or Normal category. We used Average Stochastic Gradient Descent Weight-Dropped LSTM (AWD-LSTM), which is a variant of LSTM, to perform classification. We reduced the class imbalance (a large number of normal contracts as compared to other categories) by considering only the distinct opcode combinations for normal contracts. We have achieved a weighted average Fbeta score of 90.0%. Hence, such techniques can be used to analyze a large number of smart contracts and help to improve the security of these contracts.
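The two evaluation-related steps mentioned in the abstract (keeping only distinct opcode combinations for normal contracts, and reporting a support-weighted Fbeta score) can be sketched roughly as follows. This is an illustrative sketch, not the authors' code: the function names `dedupe_normal` and `weighted_fbeta` are hypothetical, "distinct opcode combination" is assumed here to mean an identical opcode sequence, and the value of beta is left as a parameter because the abstract does not state it.

```python
from collections import Counter


def dedupe_normal(contracts, normal_label="Normal"):
    """Keep all non-normal contracts, but only one normal contract
    per distinct opcode sequence (assumed reading of the abstract)."""
    seen = set()
    kept = []
    for opcodes, label in contracts:
        if label == normal_label:
            key = tuple(opcodes)  # hashable form of the opcode sequence
            if key in seen:
                continue  # duplicate normal contract: skip it
            seen.add(key)
        kept.append((opcodes, label))
    return kept


def weighted_fbeta(y_true, y_pred, beta=1.0):
    """Per-class F-beta, averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = beta ** 2 * precision + recall
        f = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
        score += (n / total) * f
    return score


# Toy data: opcode sequences are made up for illustration only.
contracts = [
    (["PUSH1", "PUSH1", "MSTORE"], "Normal"),
    (["PUSH1", "PUSH1", "MSTORE"], "Normal"),    # duplicate, dropped
    (["CALLER", "SELFDESTRUCT"], "Suicidal"),
    (["CALLER", "SELFDESTRUCT"], "Suicidal"),    # non-normal, kept
    (["PUSH1", "CALLVALUE"], "Normal"),
]
balanced = dedupe_normal(contracts)
```

Only the majority class is deduplicated in this sketch, so the minority classes keep every example, which is one plausible way to read the imbalance reduction described above.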

Related papers

Malicious Code Detection in Smart Contracts via Opcode Vectorization [0.8225825738565354]
Security problems of smart contracts are becoming increasingly prominent. The existence of malicious code may lead to the loss of user assets and system crashes. In this paper, a simple study is carried out on malicious code detection in smart contracts based on machine learning.
arXiv Detail & Related papers (2025-04-17T07:51:48Z)
Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection [49.15148871877941]
Next-token distribution outputs offer a theoretically appealing approach for detection of large language models (LLMs). We propose the Perplexity Attention Weighted Network (PAWN), which uses the last hidden states of the LLM and positions to weight the sum of a series of features based on metrics from the next-token distribution across the sequence length. PAWN shows competitive and even better performance in-distribution than the strongest baselines with a fraction of their trainable parameters.
arXiv Detail & Related papers (2025-01-07T17:00:49Z)
Leveraging Large Language Models and Machine Learning for Smart Contract Vulnerability Detection [0.0]
We train and test machine learning algorithms to classify smart contract code according to type in order to compare model performance. Our research combines machine learning and large language models to provide a rich and interpretable framework for detecting different smart contract vulnerabilities.
arXiv Detail & Related papers (2025-01-04T08:32:53Z)
Versioned Analysis of Software Quality Indicators and Self-admitted Technical Debt in Ethereum Smart Contracts with Ethstractor [2.052808596154225]
This paper proposes Ethstractor, the first smart contract collection tool for gathering a dataset of versioned smart contracts. The collected dataset is then used to evaluate the reliability of code metrics as indicators of vulnerabilities in smart contracts.
arXiv Detail & Related papers (2024-07-22T18:27:29Z)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate these demands by compressing and accelerating large language models. We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization. Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z)
Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection [8.121484960948303]
We propose Contrastive Learning Enhanced Automated Recognition Approach for Smart Contract Vulnerabilities, named Clear. In particular, Clear employs a contrastive learning (CL) model to capture the fine-grained correlation information among contracts. We show that Clear achieves optimal performance over all baseline methods, with a 9.73%-39.99% higher F1-score than existing deep learning methods.
arXiv Detail & Related papers (2024-04-27T09:13:25Z)
Formally Verifying a Real World Smart Contract [52.30656867727018]
In this article, we present our search for a tool capable of formally verifying a real-world smart contract written in a recent version of Solidity.
arXiv Detail & Related papers (2023-07-05T14:30:21Z)
Enhancing Smart Contract Security Analysis with Execution Property Graphs [48.31617821205042]
We introduce Clue, a dynamic analysis framework specifically designed for a runtime virtual machine. Clue captures critical information during contract executions, employing a novel graph-based representation, the Execution Property Graph. Evaluation results reveal Clue's superior performance with high true positive rates and low false positive rates, outperforming state-of-the-art tools.
arXiv Detail & Related papers (2023-05-23T13:16:42Z)
HyMo: Vulnerability Detection in Smart Contracts using a Novel Multi-Modal Hybrid Model [1.16095700765361]
Existing analysis techniques are capable of identifying a large number of smart contract security flaws, but they rely too much on rigid criteria established by specialists. We propose HyMo as a multi-modal hybrid deep learning model, which considers multiple input representations to capture multimodality. We show that our hybrid HyMo model has excellent smart contract vulnerability detection performance.
arXiv Detail & Related papers (2023-04-25T19:16:21Z)
SmartIntentNN: Towards Smart Contract Intent Detection [5.9789082082171525]
We introduce SmartIntentNN (Smart Contract Intent Neural Network), a deep learning-based tool designed to automate the detection of developers' intent in smart contracts. Our approach integrates a Universal Sentence Encoder for contextual representation of smart contract code, and employs a K-means clustering algorithm to highlight intent-related code features. Evaluations on 10,000 real-world smart contracts demonstrate that SmartIntentNN surpasses all baselines, achieving an F1-score of 0.8633.
arXiv Detail & Related papers (2022-11-24T15:36:35Z)
Robustness Certificates for Implicit Neural Networks: A Mixed Monotone Contractive Approach [60.67748036747221]
Implicit neural networks offer competitive performance and reduced memory consumption. They can remain brittle with respect to input adversarial perturbations. This paper proposes a theoretical and computational framework for robustness verification of implicit neural networks.
arXiv Detail & Related papers (2021-12-10T03:08:55Z)
Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and Expert Pattern Fusion [48.744359070088166]
Conventional smart contract vulnerability detection methods heavily rely on fixed expert rules. Recent deep learning approaches alleviate this issue but fail to encode useful expert knowledge. We develop automatic tools to extract expert patterns from the source code. We then cast the code into a semantic graph to extract deep graph features.
arXiv Detail & Related papers (2021-06-17T07:12:13Z)
A Bytecode-based Approach for Smart Contract Classification [10.483992071557195]
The number of smart contracts deployed on blockchain platforms is growing exponentially, which makes it difficult for users to find desired services by manual screening. Current research on smart contract classification focuses on Natural Language Processing (NLP) solutions which are based on contract source code. This paper proposes a classification model based on features from contract bytecode instead of source code to solve these problems.
arXiv Detail & Related papers (2021-05-31T03:00:29Z)
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable. We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts. We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)
Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles. Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center. We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes. A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.