An Empirical Study of Deep Learning Models for Vulnerability Detection
- URL: http://arxiv.org/abs/2212.08109v2
- Date: Mon, 19 Dec 2022 02:41:33 GMT
- Title: An Empirical Study of Deep Learning Models for Vulnerability Detection
- Authors: Benjamin Steenhoek, Md Mahbubur Rahman, Richard Jiles, and Wei Le
- Abstract summary: We surveyed and reproduced 9 state-of-the-art deep learning models on 2 widely used vulnerability detection datasets.
We investigated model capabilities, training data, and model interpretation.
Our findings can help better understand model results, provide guidance on preparing training data, and improve the robustness of the models.
- Score: 4.243592852049963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning (DL) models of code have recently reported great progress for
vulnerability detection. In some cases, DL-based models have outperformed
static analysis tools. Although many great models have been proposed, we do not
yet have a good understanding of these models. This limits the further
advancement of model robustness, debugging, and deployment for the
vulnerability detection. In this paper, we surveyed and reproduced 9
state-of-the-art (SOTA) deep learning models on 2 widely used vulnerability
detection datasets: Devign and MSR. We investigated 6 research questions in
three areas, namely model capabilities, training data, and model
interpretation. We experimentally demonstrated the variability between
different runs of a model and the low agreement among different models'
outputs. We investigated models trained for specific types of vulnerabilities
compared to a model that is trained on all the vulnerabilities at once. We
explored the types of programs DL may consider "hard" to handle. We
investigated the relations of training data sizes and training data composition
with model performance. Finally, we studied model interpretations and analyzed
important features that the models used to make predictions. We believe that
our findings can help better understand model results, provide guidance on
preparing training data, and improve the robustness of the models. All of our
datasets, code, and results are available at
https://figshare.com/s/284abfba67dba448fdc2.
Related papers
- Learning-based Models for Vulnerability Detection: An Extensive Study [3.1317409221921144]
We extensively and comprehensively investigate two types of state-of-the-art learning-based approaches.
We experimentally demonstrate the priority of sequence-based models and the limited abilities of both graph-based models.
arXiv Detail & Related papers (2024-08-14T13:01:30Z) - Identifying and Mitigating Model Failures through Few-shot CLIP-aided
Diffusion Generation [65.268245109828]
We propose an end-to-end framework to generate text descriptions of failure modes associated with spurious correlations.
These descriptions can be used to generate synthetic data using generative models, such as diffusion models.
Our experiments have shown remarkable textbfimprovements in accuracy ($sim textbf21%$) on hard sub-populations.
arXiv Detail & Related papers (2023-12-09T04:43:49Z) - Do Language Models Learn Semantics of Code? A Case Study in
Vulnerability Detection [7.725755567907359]
We analyze the models using three distinct methods: interpretability tools, attention analysis, and interaction matrix analysis.
We develop two annotation methods which highlight the bug semantics inside the model's inputs.
Our findings indicate that it is helpful to provide the model with information of the bug semantics, that the model can attend to it, and motivate future work in learning more complex path-based bug semantics.
arXiv Detail & Related papers (2023-11-07T16:31:56Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models [74.58014281829946]
We analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on public models.
Our evaluation empirically shows the performance of these attacks/defenses can vary significantly on public models compared to self-trained models.
arXiv Detail & Related papers (2023-10-19T11:49:22Z) - Towards Causal Deep Learning for Vulnerability Detection [31.59558109518435]
We introduce do calculus based causal learning to software engineering models.
Our results show that CausalVul consistently improved the model accuracy, robustness and OOD performance.
arXiv Detail & Related papers (2023-10-12T00:51:06Z) - A study on the impact of pre-trained model on Just-In-Time defect
prediction [10.205110163570502]
We build six models: RoBERTaJIT, CodeBERTJIT, BARTJIT, PLBARTJIT, GPT2JIT, and CodeGPTJIT, each with a distinct pre-trained model as its backbone.
We investigate the performance of the models when using Commit code and Commit message as inputs, as well as the relationship between training efficiency and model distribution.
arXiv Detail & Related papers (2023-09-05T15:34:22Z) - A Comprehensive Evaluation and Analysis Study for Chinese Spelling Check [53.152011258252315]
We show that using phonetic and graphic information reasonably is effective for Chinese Spelling Check.
Models are sensitive to the error distribution of the test set, which reflects the shortcomings of models.
The commonly used benchmark, SIGHAN, can not reliably evaluate models' performance.
arXiv Detail & Related papers (2023-07-25T17:02:38Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - Deep Learning Models for Knowledge Tracing: Review and Empirical
Evaluation [2.423547527175807]
We review and evaluate a body of deep learning knowledge tracing (DLKT) models with openly available and widely-used data sets.
The evaluated DLKT models have been reimplemented for assessing and replicability of previously reported results.
arXiv Detail & Related papers (2021-12-30T14:19:27Z) - Firearm Detection via Convolutional Neural Networks: Comparing a
Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.