Deep-Learning-based Vulnerability Detection in Binary Executables
- URL: http://arxiv.org/abs/2212.01254v1
- Date: Fri, 25 Nov 2022 10:33:33 GMT
- Title: Deep-Learning-based Vulnerability Detection in Binary Executables
- Authors: Andreas Schaad, Dominik Binder
- Abstract summary: We present a supervised deep learning approach using recurrent neural networks for the application of vulnerability detection based on binary executables.
A dataset with 50,651 samples of vulnerable code in the form of a standardized LLVM Intermediate Representation is used.
A binary classification was established for detecting the presence of an arbitrary vulnerability, and a multi-class model was trained for the identification of the exact vulnerability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The identification of vulnerabilities is an important element in the software
development life cycle to ensure the security of software. While vulnerability
identification based on the source code is a well-studied field, the
identification of vulnerabilities on the basis of a binary executable without the
corresponding source code is more challenging. Recent research [1] has shown
how such detection can be achieved by deep learning methods. However, that
particular approach is limited to the identification of only 4 types of
vulnerabilities. Subsequently, we analyze to what extent we could cover the
identification of a larger variety of vulnerabilities. Therefore, a supervised
deep learning approach using recurrent neural networks for the application of
vulnerability detection based on binary executables is used. The underlying
basis is a dataset with 50,651 samples of vulnerable code in the form of a
standardized LLVM Intermediate Representation. The vectorised features of a
Word2Vec model are used to train different variations of three basic
architectures of recurrent neural networks (GRU, LSTM, SRNN). A binary
classification was established for detecting the presence of an arbitrary
vulnerability, and a multi-class model was trained for the identification of
the exact vulnerability, which achieved an out-of-sample accuracy of 88% and
77%, respectively. Differences in the detection of different vulnerabilities
were also observed, with non-vulnerable samples being detected with a
particularly high precision of over 98%. Thus, the methodology presented allows
the accurate detection of 23 (compared to 4 [1]) types of vulnerabilities.
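The pipeline described in the abstract (tokenised LLVM IR, token embeddings, a recurrent network, a binary decision) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the tokenizer regex, the `%ID` placeholder used to normalise local names, the dimensions `EMBED_DIM` and `HIDDEN_DIM`, and the class `GRUClassifier` are all hypothetical, the random embedding table merely stands in for a trained Word2Vec model, and the weights are untrained, so the output probability carries no meaning.

```python
import math
import random
import re

random.seed(0)  # fixed seed so repeated runs of the sketch behave the same

EMBED_DIM = 8   # assumed embedding size; the abstract does not state one
HIDDEN_DIM = 6  # assumed GRU state size; also an assumption


def tokenize(ir_text):
    """Split LLVM IR text into tokens and replace local names (%3, %buf)
    with a single placeholder. This normalisation is an assumption."""
    tokens = re.findall(r"[%@][\w.]+|[A-Za-z_][\w.]*|\d+|[^\s\w]", ir_text)
    return ["%ID" if t.startswith("%") else t for t in tokens]


def make_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRUClassifier:
    """Single-layer GRU (biases omitted) followed by a sigmoid output unit."""

    def __init__(self, vocab):
        # Random vectors stand in for embeddings learned by Word2Vec.
        self.embed = {
            tok: [random.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
            for tok in sorted(vocab)
        }
        # One input matrix and one recurrent matrix per gate:
        # update (z), reset (r), candidate state (n).
        self.w = {g: make_matrix(HIDDEN_DIM, EMBED_DIM) for g in "zrn"}
        self.u = {g: make_matrix(HIDDEN_DIM, HIDDEN_DIM) for g in "zrn"}
        self.out = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.w["z"], x), matvec(self.u["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.w["r"], x), matvec(self.u["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]  # reset gate scales the old state
        n = [math.tanh(a + b) for a, b in zip(matvec(self.w["n"], x), matvec(self.u["n"], rh))]
        # The update gate interpolates between the old state and the candidate.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def predict_proba(self, tokens):
        h = [0.0] * HIDDEN_DIM
        for tok in tokens:
            x = self.embed.get(tok, [0.0] * EMBED_DIM)  # unknown token -> zero vector
            h = self.step(x, h)
        return sigmoid(sum(w * hi for w, hi in zip(self.out, h)))


sample = "%3 = alloca i32, align 4\nstore i32 0, i32* %3, align 4"
tokens = tokenize(sample)
model = GRUClassifier(vocab=set(tokens))
p = model.predict_proba(tokens)
print(tokens[:7])
print(f"P(vulnerable) = {p:.3f} (untrained weights, illustrative only)")
```

In a real setting the embedding table would come from a Word2Vec model trained on the IR corpus and the GRU weights would be learned from labeled samples; a multi-class variant would replace the single sigmoid unit with a softmax over the vulnerability classes.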
Related papers
- VulEval: Towards Repository-Level Evaluation of Software Vulnerability Detection [14.312197590230994]
A repository-level evaluation system named VulEval aims at evaluating the detection performance of inter- and intra-procedural vulnerabilities simultaneously.
VulEval consists of a large-scale dataset, with a total of 4,196 CVE entries, 232,239 functions, and corresponding 4,699 repository-level source code in C/C++ programming languages.
arXiv Detail & Related papers (2024-04-24T02:16:11Z)
- Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation [29.72520866016839]
Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks.
Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task.
FGVulDet employs multiple classifiers to discern characteristics of various vulnerability types and combines their outputs to identify the specific type of vulnerability.
FGVulDet is trained on a large-scale dataset from GitHub, encompassing five different types of vulnerabilities.
arXiv Detail & Related papers (2024-04-15T09:10:52Z)
- Learning to Quantize Vulnerability Patterns and Match to Locate Statement-Level Vulnerabilities [19.6975205650411]
A vulnerability codebook is learned, which consists of quantized vectors representing various vulnerability patterns.
During inference, the codebook is iterated to match all learned patterns and predict the presence of potential vulnerabilities.
Our approach was extensively evaluated on a real-world dataset comprising more than 188,000 C/C++ functions.
arXiv Detail & Related papers (2023-05-26T04:13:31Z)
- VUDENC: Vulnerability Detection with Deep Learning on a Natural Codebase for Python [8.810543294798485]
VUDENC is a deep learning-based vulnerability detection tool.
It learns features of vulnerable code from a large and real-world Python corpus.
VUDENC achieves a recall of 78%-87%, a precision of 82%-96%, and an F1 score of 80%-90%.
arXiv Detail & Related papers (2022-01-20T20:29:22Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Feature Encoding with AutoEncoders for Weakly-supervised Anomaly Detection [46.76220474310698]
Weakly-supervised anomaly detection aims at learning an anomaly detector from a limited amount of labeled data and abundant unlabeled data.
Recent works build deep neural networks for anomaly detection by discriminatively mapping the normal samples and abnormal samples to different regions in the feature space or fitting different distributions.
This paper proposes a novel strategy to transform the input data into a more meaningful representation that could be used for anomaly detection.
arXiv Detail & Related papers (2021-05-22T16:23:05Z)
- Anomaly Detection in Cybersecurity: Unsupervised, Graph-Based and Supervised Learning Methods in Adversarial Environments [63.942632088208505]
Inherent to today's operating environment is the practice of adversarial machine learning.
In this work, we examine the feasibility of unsupervised learning and graph-based methods for anomaly detection.
We incorporate a realistic adversarial training mechanism when training our supervised models to enable strong classification performance in adversarial environments.
arXiv Detail & Related papers (2021-05-14T10:05:10Z)
- Multi-attentional Deepfake Detection [79.80308897734491]
Face forgery by deepfake is widely spread over the internet and has raised severe societal concerns.
We propose a new multi-attentional deepfake detection network. Specifically, it consists of three key components: 1) multiple spatial attention heads to make the network attend to different local parts; 2) a textural feature enhancement block to zoom in on the subtle artifacts in shallow features; 3) aggregation of the low-level textural features and high-level semantic features, guided by the attention maps.
arXiv Detail & Related papers (2021-03-03T13:56:14Z)
- Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z)
- BiDet: An Efficient Binarized Object Detector [96.19708396510894]
We propose a binarized neural network learning method called BiDet for efficient object detection.
Our BiDet fully utilizes the representational capacity of the binary neural networks for object detection by redundancy removal.
Our method outperforms the state-of-the-art binary neural networks by a sizable margin.
arXiv Detail & Related papers (2020-03-09T08:16:16Z)
- $\mu$VulDeePecker: A Deep Learning-Based System for Multiclass Vulnerability Detection [24.98991662345816]
We propose the first deep learning-based system for multiclass vulnerability detection, dubbed $\mu$VulDeePecker.
The key insight underlying $\mu$VulDeePecker is the concept of code attention, which can capture information that can help pinpoint types of vulnerabilities.
Experiments show that $\mu$VulDeePecker is effective for multiclass vulnerability detection and that accommodating control-dependence can lead to higher detection capabilities.
arXiv Detail & Related papers (2020-01-08T01:47:22Z)