Utilization of machine learning for the detection of self-admitted
vulnerabilities
- URL: http://arxiv.org/abs/2309.15619v1
- Date: Wed, 27 Sep 2023 12:38:12 GMT
- Title: Utilization of machine learning for the detection of self-admitted
vulnerabilities
- Authors: Moritz Mock
- Abstract summary: Technical debt is a metaphor that describes not-quite-right code introduced for short-term needs.
Developers are aware of it and admit it in source code comments, which is called Self-Admitted Technical Debt (SATD).
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Motivation: Technical debt is a metaphor that describes not-quite-right code
introduced for short-term needs. Developers are aware of it and admit it in
source code comments, which is called Self-Admitted Technical Debt (SATD).
Therefore, SATD indicates weak code that developers are aware of. Problem
statement: Manually inspecting source code for vulnerabilities is time-consuming; automating this inspection is therefore a crucial aspect of developing software, as it helps practitioners reduce that effort and focus on the vulnerable parts of the source code. Proposal: Accurately identify and better
understand the semantics of self-admitted technical debt (SATD) by leveraging
NLP and NL-PL approaches to detect vulnerabilities and the related SATD.
Finally, a CI/CD pipeline will be proposed to make the vulnerability discovery
process easily accessible to practitioners.
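As a minimal illustration of the SATD-detection idea sketched in the abstract (and not the paper's actual approach, which leverages NLP and NL-PL models), the following Python sketch flags comments that contain common self-admission markers; the marker list, the C-style comment syntax, and the example function are assumptions made for the demo.

```python
# Minimal keyword-based sketch for spotting SATD comments in source code.
# Real SATD detectors are usually trained NLP classifiers; this heuristic,
# the marker list, and the sample code below are illustrative assumptions.
import re

SATD_MARKERS = re.compile(
    r"\b(todo|fixme|hack|workaround|temporary|kludge)\b", re.IGNORECASE
)

def extract_comments(source: str) -> list[str]:
    """Collect // line comments and /* ... */ block comments from C-like source."""
    line_comments = re.findall(r"//(.*)", source)
    block_comments = re.findall(r"/\*(.*?)\*/", source, re.DOTALL)
    return line_comments + block_comments

def find_satd(source: str) -> list[str]:
    """Return the comments that look like self-admitted technical debt."""
    return [c.strip() for c in extract_comments(source) if SATD_MARKERS.search(c)]

if __name__ == "__main__":
    code = """
    int divide(int a, int b) {
        // FIXME: no check for b == 0, temporary workaround for the demo
        return a / b;
    }
    """
    for comment in find_satd(code):
        print("SATD candidate:", comment)
```

In a CI/CD setting such as the one the abstract proposes, a check of this kind could run on each commit and report the flagged comments alongside vulnerability findings.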
Related papers
- The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z)
- Vulnerability Handling of AI-Generated Code -- Existing Solutions and Open Challenges [0.0]
We focus on approaches for vulnerability detection, localization, and repair in AI-generated code.
We highlight open challenges that must be addressed in order to establish a reliable and scalable vulnerability handling process of AI-generated code.
arXiv Detail & Related papers (2024-08-16T06:31:44Z)
- Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation [41.831831628421675]
Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection.
We propose CFExplainer, a novel counterfactual explainer for GNN-based vulnerability detection.
arXiv Detail & Related papers (2024-04-24T06:52:53Z)
- The Vulnerability Is in the Details: Locating Fine-grained Information of Vulnerable Code Identified by Graph-based Detectors [33.395068754566935]
VULEXPLAINER is a tool for locating vulnerability-critical code lines from coarse-level vulnerable code snippets.
It can flag the vulnerability-triggering code statements with an accuracy of around 90% against eight common C/C++ vulnerabilities.
arXiv Detail & Related papers (2024-01-05T10:15:04Z)
- Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map CWE categories to the PROMISE_exp dataset. Naive Bayes, support vector machine (SVM), decision tree, neural network, and convolutional neural network (CNN) algorithms were tested (a toy sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
- Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning? [5.603751223376071]
We present a practical system that leverages deep learning on a large-scale data set of vulnerable code patterns.
We show that our approach improves on state-of-the-art vulnerability detection models by 10%.
arXiv Detail & Related papers (2023-05-23T01:21:55Z)
- Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
- A Hierarchical Deep Neural Network for Detecting Lines of Codes with Vulnerabilities [6.09170287691728]
Software vulnerabilities, caused by unintentional flaws in source code, are the main root cause of cyberattacks.
We propose a deep learning approach to detect vulnerabilities from their LLVM IR representations based on the techniques that have been used in natural language processing.
arXiv Detail & Related papers (2022-11-15T21:21:27Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because parsed code naturally admits graph structures, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
- Multi-context Attention Fusion Neural Network for Software Vulnerability Identification [4.05739885420409]
We propose a deep learning model that learns to detect some of the common categories of security vulnerabilities in source code efficiently.
The model builds an accurate understanding of code semantics with far fewer learnable parameters.
The proposed AI achieves 98.40% F1-score on specific CWEs from the benchmarked NIST SARD dataset.
arXiv Detail & Related papers (2021-04-19T11:50:36Z)
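For the entry "Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications" above, here is a minimal scikit-learn sketch of the general TF-IDF-plus-SVM idea; the tiny requirement set and the CWE labels are invented for illustration and are not taken from that paper or from the PROMISE_exp dataset.

```python
# Toy sketch: classify requirement sentences into CWE categories with
# TF-IDF features and a linear SVM. Data and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

requirements = [
    "The system shall store user passwords in the database.",
    "User input from the login form is passed to the SQL query.",
    "The application shall log all failed authentication attempts.",
    "Uploaded file names are used directly to build file system paths.",
]
cwe_labels = ["CWE-256", "CWE-89", "CWE-287", "CWE-22"]  # hypothetical mapping

# Fit a simple text-classification pipeline on the toy data.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(requirements, cwe_labels)

new_requirement = ["Search terms entered by the user are concatenated into a SQL statement."]
print(model.predict(new_requirement))  # expected to suggest CWE-89 (SQL injection)
```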