Vulnerability Detection Using Two-Stage Deep Learning Models
- URL: http://arxiv.org/abs/2305.09673v1
- Date: Mon, 8 May 2023 22:12:34 GMT
- Title: Vulnerability Detection Using Two-Stage Deep Learning Models
- Authors: Mohamed Mjd Alhafi and Mohammad Hammade and Khloud Al Jallad
- Abstract summary: Two deep learning models are proposed for vulnerability detection in C/C++ source code.
The first stage is a CNN that detects whether the source code contains any vulnerability.
The second stage is a CNN-LSTM that classifies the vulnerability into one of 50 different types.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Application security is an essential part of developing modern software, as many attacks exploit vulnerabilities in software. The number of attacks is increasing globally due to technological advancements. Companies must include security in every stage of developing, testing, and deploying their software in order to prevent data breaches. There are several non-AI-based methods for detecting software vulnerabilities, such as Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST); however, these approaches have substantial false-positive and false-negative rates. On the other hand, researchers have been interested in developing AI-based vulnerability detection systems employing deep learning models such as BERT and BLSTM. In this paper, we propose a two-stage solution consisting of two deep learning models for vulnerability detection in C/C++ source code: the first stage is a CNN that detects whether the source code contains any vulnerability (a binary classification model), and the second stage is a CNN-LSTM that classifies the vulnerability into one of 50 different types (a multiclass classification model). Experiments were conducted on the SySeVR dataset. Results show an accuracy of 99% for the first stage and 98% for the second.
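To make the two-stage design concrete, below is a minimal sketch in Keras of how such a cascade could be wired together. The vocabulary size, sequence length, and all layer widths are illustrative assumptions, not values reported in the paper, and tokenization of C/C++ code into integer sequences is assumed to happen upstream.

```python
# A minimal sketch of the two-stage pipeline, using Keras.
# All hyperparameters below are illustrative assumptions.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 10_000   # assumed token vocabulary for C/C++ code
SEQ_LEN = 500         # assumed maximum token-sequence length
NUM_CLASSES = 50      # the 50 vulnerability types of the second stage

def build_stage1_cnn() -> keras.Sequential:
    """Stage 1: binary classifier -- does the code contain a vulnerability?"""
    return keras.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.Conv1D(128, 5, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])

def build_stage2_cnn_lstm() -> keras.Sequential:
    """Stage 2: multiclass classifier -- which of the 50 types is it?"""
    return keras.Sequential([
        layers.Embedding(VOCAB_SIZE, 128),
        layers.Conv1D(128, 5, activation="relu"),
        layers.MaxPooling1D(2),
        layers.LSTM(128),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])

def predict_two_stage(stage1, stage2, token_ids):
    """Run stage 1 as a gate; only flagged samples reach stage 2."""
    x = np.asarray(token_ids)[None, :SEQ_LEN]  # batch of one sequence
    if stage1.predict(x, verbose=0)[0, 0] < 0.5:
        return "not vulnerable"
    return int(np.argmax(stage2.predict(x, verbose=0)[0]))
```

In this cascade, the second model is only consulted when the first flags the input as vulnerable, mirroring the binary-then-multiclass split described in the abstract.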
Related papers
- Automated Vulnerability Detection Using Deep Learning Technique [1.1710022685486914]
The study demonstrates that deep learning techniques, particularly with CodeBERT's advanced contextual understanding, can significantly improve vulnerability detection.
Our approach transforms source code into vector representations and trains a Long Short-Term Memory (LSTM) model to identify vulnerable patterns.
arXiv Detail & Related papers (2024-10-29T11:51:51Z)
- Comparison of Static Application Security Testing Tools and Large Language Models for Repo-level Vulnerability Detection [11.13802281700894]
Static Application Security Testing (SAST) is usually utilized to scan source code for security vulnerabilities.
Deep learning (DL)-based methods have demonstrated their potential in software vulnerability detection.
This paper compares 15 diverse SAST tools with 12 popular or state-of-the-art open-source LLMs in detecting software vulnerabilities.
arXiv Detail & Related papers (2024-07-23T07:21:14Z)
- Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation [29.72520866016839]
Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks.
Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task.
FGVulDet employs multiple classifiers to discern characteristics of various vulnerability types and combines their outputs to identify the specific type of vulnerability.
FGVulDet is trained on a large-scale dataset from GitHub, encompassing five different types of vulnerabilities.
arXiv Detail & Related papers (2024-04-15T09:10:52Z)
- When Authentication Is Not Enough: On the Security of Behavioral-Based Driver Authentication Systems [53.2306792009435]
We develop two lightweight driver authentication systems based on Random Forest and Recurrent Neural Network architectures.
We are the first to propose attacks against these systems by developing two novel evasion attacks, SMARTCAN and GANCAN.
Through our contributions, we aid practitioners in safely adopting these systems, help reduce car thefts, and enhance driver security.
arXiv Detail & Related papers (2023-06-09T14:33:26Z)
- Learning to Quantize Vulnerability Patterns and Match to Locate Statement-Level Vulnerabilities [19.6975205650411]
A vulnerability codebook is learned, which consists of quantized vectors representing various vulnerability patterns.
During inference, the codebook is iterated to match all learned patterns and predict the presence of potential vulnerabilities.
Our approach was extensively evaluated on a real-world dataset comprising more than 188,000 C/C++ functions.
arXiv Detail & Related papers (2023-05-26T04:13:31Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- Cross Project Software Vulnerability Detection via Domain Adaptation and Max-Margin Principle [21.684043656053106]
Software vulnerabilities (SVs) have become a common, serious and crucial concern due to the ubiquity of computer software.
We propose a novel end-to-end approach to tackle these two crucial issues.
Our method improves the F1-measure, the most important metric in SVD, by 1.83% to 6.25% over the second-best method on the datasets used.
arXiv Detail & Related papers (2022-09-19T23:47:22Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Certifiers Make Neural Networks Vulnerable to Availability Attacks [70.69104148250614]
We show for the first time that fallback strategies can be deliberately triggered by an adversary.
In addition to naturally occurring abstains for some inputs and perturbations, the adversary can use training-time attacks to deliberately trigger the fallback.
We design two novel availability attacks, which show the practical relevance of these threats.
arXiv Detail & Related papers (2021-08-25T15:49:10Z)
- Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)