Toward Improved Deep Learning-based Vulnerability Detection
- URL: http://arxiv.org/abs/2403.03024v1
- Date: Tue, 5 Mar 2024 14:57:28 GMT
- Title: Toward Improved Deep Learning-based Vulnerability Detection
- Authors: Adriana Sejfia, Satyaki Das, Saad Shafiq, Nenad Medvidović
- Abstract summary: Vulnerabilities in datasets have to be represented in a certain way, e.g., code lines, functions, or program slices within which the vulnerabilities exist.
The detectors learn how base units can be vulnerable and then predict whether other base units are vulnerable.
We have hypothesized that this focus on individual base units harms the ability of the detectors to properly detect those vulnerabilities that span multiple base units.
We present our study and a framework that can be used to help DL-based detectors toward the proper inclusion of MBU vulnerabilities.
- Score: 6.212044762686268
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning (DL) has been a common thread across several recent techniques
for vulnerability detection. The rise of large, publicly available datasets of
vulnerabilities has fueled the learning process underpinning these techniques.
While these datasets help the DL-based vulnerability detectors, they also
constrain these detectors' predictive abilities. Vulnerabilities in these
datasets have to be represented in a certain way, e.g., code lines, functions,
or program slices within which the vulnerabilities exist. We refer to this
representation as a base unit. The detectors learn how base units can be
vulnerable and then predict whether other base units are vulnerable. We have
hypothesized that this focus on individual base units harms the ability of the
detectors to properly detect those vulnerabilities that span multiple base
units (or MBU vulnerabilities). For vulnerabilities such as these, a correct
detection occurs when all comprising base units are detected as vulnerable.
Verifying how existing techniques perform in detecting all parts of a
vulnerability is important to establish their effectiveness for other
downstream tasks. To evaluate our hypothesis, we conducted a study focusing on
three prominent DL-based detectors: ReVeal, DeepWukong, and LineVul. Our study
shows that all three detectors contain MBU vulnerabilities in their respective
datasets. Further, we observed significant accuracy drops when detecting these
types of vulnerabilities. We present our study and a framework that can be used
to help DL-based detectors toward the proper inclusion of MBU vulnerabilities.
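The evaluation criterion stated in the abstract, that an MBU vulnerability counts as detected only when all of its base units are flagged, can be illustrated with a minimal Python sketch. This is not the authors' framework: the function names, the `(vulnerability_id, predicted_vulnerable)` input format, and the sample identifiers are assumptions made purely for illustration. It contrasts per-base-unit recall with vulnerability-level recall over truly vulnerable base units.

```python
from typing import Dict, Hashable, Sequence, Tuple

# Hypothetical input format: one entry per truly vulnerable base unit,
# pairing the ID of the vulnerability it belongs to with the detector's verdict.
Prediction = Tuple[Hashable, bool]


def detected_vulnerabilities(predictions: Sequence[Prediction]) -> Dict[Hashable, bool]:
    """Mark a vulnerability as detected only if ALL of its base units were flagged."""
    detected: Dict[Hashable, bool] = {}
    for vuln_id, predicted_vulnerable in predictions:
        detected[vuln_id] = detected.get(vuln_id, True) and predicted_vulnerable
    return detected


def base_unit_recall(predictions: Sequence[Prediction]) -> float:
    """Fraction of vulnerable base units that the detector flagged."""
    if not predictions:
        return 0.0
    return sum(flag for _, flag in predictions) / len(predictions)


def vulnerability_recall(predictions: Sequence[Prediction]) -> float:
    """Fraction of vulnerabilities whose base units were all flagged."""
    detected = detected_vulnerabilities(predictions)
    if not detected:
        return 0.0
    return sum(detected.values()) / len(detected)


# Illustrative data only: "vuln-B" spans two base units, one of which is missed.
sample = [
    ("vuln-A", True),
    ("vuln-B", True),
    ("vuln-B", False),
    ("vuln-C", True),
    ("vuln-C", True),
]
print(base_unit_recall(sample))      # 4 of 5 base units flagged
print(vulnerability_recall(sample))  # 2 of 3 vulnerabilities fully detected
```

In this toy data the detector flags most base units, yet one of the three vulnerabilities is only partially detected, so the vulnerability-level figure is lower than the base-unit figure. This is the kind of gap the study's hypothesis concerns.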
Related papers
- VulEval: Towards Repository-Level Evaluation of Software Vulnerability Detection [14.312197590230994]
A repository-level evaluation system named VulEval aims at evaluating the detection performance of inter- and intra-procedural vulnerabilities simultaneously.
VulEval consists of a large-scale dataset, with a total of 4,196 CVE entries, 232,239 functions, and corresponding 4,699 repository-level source code in C/C++ programming languages.
arXiv Detail & Related papers (2024-04-24T02:16:11Z)
- Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation [29.72520866016839]
Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks.
Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task.
FGVulDet employs multiple classifiers to discern characteristics of various vulnerability types and combines their outputs to identify the specific type of vulnerability.
FGVulDet is trained on a large-scale dataset from GitHub, encompassing five different types of vulnerabilities.
arXiv Detail & Related papers (2024-04-15T09:10:52Z)
- FaultGuard: A Generative Approach to Resilient Fault Prediction in Smart Electrical Grids [53.2306792009435]
FaultGuard is the first framework for fault type and zone classification resilient to adversarial attacks.
We propose a low-complexity fault prediction model and an online adversarial training technique to enhance robustness.
Our model outclasses the state-of-the-art for resilient fault prediction benchmarking, with an accuracy of up to 0.958.
arXiv Detail & Related papers (2024-03-26T08:51:23Z)
- The Vulnerability Is in the Details: Locating Fine-grained Information of Vulnerable Code Identified by Graph-based Detectors [33.395068754566935]
VULEXPLAINER is a tool for locating vulnerability-critical code lines from coarse-level vulnerable code snippets.
It can flag the vulnerability-triggering code statements with an accuracy of around 90% against eight common C/C++ vulnerabilities.
arXiv Detail & Related papers (2024-01-05T10:15:04Z)
- DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection [55.70982767084996]
A critical yet frequently overlooked challenge in the field of deepfake detection is the lack of a standardized, unified, comprehensive benchmark.
We present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions.
DeepfakeBench contains 15 state-of-the-art detection methods, 9 datasets, a series of deepfake detection evaluation protocols and analysis tools, as well as comprehensive evaluations.
arXiv Detail & Related papers (2023-07-04T01:34:41Z)
- A Comprehensive Study of the Robustness for LiDAR-based 3D Object Detectors against Adversarial Attacks [84.10546708708554]
3D object detectors are increasingly crucial for security-critical tasks.
It is imperative to understand their robustness against adversarial attacks.
This paper presents the first comprehensive evaluation and analysis of the robustness of LiDAR-based 3D detectors under adversarial attacks.
arXiv Detail & Related papers (2022-12-20T13:09:58Z)
- DCDetector: An IoT terminal vulnerability mining system based on distributed deep ensemble learning under source code representation [2.561778620560749]
The goal of the research is to intelligently detect vulnerabilities in source code written in high-level languages such as C/C++.
This enables us to propose a code representation of sensitive sentence-related slices of source code, and to detect vulnerabilities by designing a distributed deep ensemble learning model.
Experiments show that this method can reduce the false positive rate of traditional static analysis and improve the performance and accuracy of machine learning.
arXiv Detail & Related papers (2022-11-29T14:19:14Z)
- Detection of Adversarial Supports in Few-shot Classifiers Using Feature Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets.
We make use of feature preserving autoencoder filtering and also the concept of self-similarity of a support set to perform this detection.
Our method is attack-agnostic and also the first to explore detection for few-shot classifiers to the best of our knowledge.
arXiv Detail & Related papers (2020-12-09T14:13:41Z)
- No Need to Know Physics: Resilience of Process-based Model-free Anomaly Detection for Industrial Control Systems [95.54151664013011]
We present a novel framework to generate adversarial spoofing signals that violate physical properties of the system.
We analyze four anomaly detectors published at top security conferences.
arXiv Detail & Related papers (2020-12-07T11:02:44Z)
- Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process [63.75363908696257]
We review the methods that have been applied to network data with the purpose of developing an intrusion detector.
We discuss the techniques used for the capture, preparation, and transformation of the data, as well as the data mining and evaluation methods.
As a result of this literature review, we investigate some open issues which will need to be considered for further research in the area of network security.
arXiv Detail & Related papers (2020-01-27T11:21:05Z)
- $\mu$VulDeePecker: A Deep Learning-Based System for Multiclass Vulnerability Detection [24.98991662345816]
We propose the first deep learning-based system for multiclass vulnerability detection, dubbed $\mu$VulDeePecker.
The key insight underlying $\mu$VulDeePecker is the concept of code attention, which can capture information that can help pinpoint types of vulnerabilities.
Experiments show that $mu$VulDeePecker is effective for multiclass vulnerability detection and that accommodating control-dependence can lead to higher detection capabilities.
arXiv Detail & Related papers (2020-01-08T01:47:22Z)