Utilization of machine learning for the detection of self-admitted
vulnerabilities
- URL: http://arxiv.org/abs/2309.15619v1
- Date: Wed, 27 Sep 2023 12:38:12 GMT
- Title: Utilization of machine learning for the detection of self-admitted
vulnerabilities
- Authors: Moritz Mock
- Abstract summary: Technical debt is a metaphor that describes not-quite-right code introduced for short-term needs.
Developers are aware of it and admit it in source code comments, which is called Self-Admitted Technical Debt (SATD).
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Motivation: Technical debt is a metaphor that describes not-quite-right code
introduced for short-term needs. Developers are aware of it and admit it in
source code comments, which is called Self-Admitted Technical Debt (SATD).
Therefore, SATD indicates weak code that developers are aware of. Problem
statement: Manually inspecting source code for vulnerabilities is time-consuming; automating this inspection is therefore a crucial aspect of developing software, as it helps practitioners reduce that effort and focus on the vulnerable parts of the source code. Proposal: Accurately identify and better
understand the semantics of self-admitted technical debt (SATD) by leveraging
NLP and NL-PL approaches to detect vulnerabilities and the related SATD.
Finally, a CI/CD pipeline will be proposed to make the vulnerability discovery
process easily accessible to practitioners.
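As a minimal illustration of the SATD-detection idea sketched in the abstract (and not the paper's actual approach, which leverages NLP and NL-PL models), the following Python sketch flags comments that contain common self-admission markers; the marker list, the C-style comment syntax, and the example function are assumptions made for the demo.

```python
# Minimal keyword-based sketch for spotting SATD comments in source code.
# Real SATD detectors are usually trained NLP classifiers; this heuristic,
# the marker list, and the sample code below are illustrative assumptions.
import re

SATD_MARKERS = re.compile(
    r"\b(todo|fixme|hack|workaround|temporary|kludge)\b", re.IGNORECASE
)

def extract_comments(source: str) -> list[str]:
    """Collect // line comments and /* ... */ block comments from C-like source."""
    line_comments = re.findall(r"//(.*)", source)
    block_comments = re.findall(r"/\*(.*?)\*/", source, re.DOTALL)
    return line_comments + block_comments

def find_satd(source: str) -> list[str]:
    """Return the comments that look like self-admitted technical debt."""
    return [c.strip() for c in extract_comments(source) if SATD_MARKERS.search(c)]

if __name__ == "__main__":
    code = """
    int divide(int a, int b) {
        // FIXME: no check for b == 0, temporary workaround for the demo
        return a / b;
    }
    """
    for comment in find_satd(code):
        print("SATD candidate:", comment)
```

In a CI/CD setting such as the one the abstract proposes, a check of this kind could run on each commit and report the flagged comments alongside vulnerability findings.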
Related papers
- The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z)
- Vulnerability Handling of AI-Generated Code -- Existing Solutions and Open Challenges [0.0]
We focus on approaches for vulnerability detection, localization, and repair in AI-generated code.
We highlight open challenges that must be addressed in order to establish a reliable and scalable vulnerability handling process of AI-generated code.
arXiv Detail & Related papers (2024-08-16T06:31:44Z)
- Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation [41.831831628421675]
Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection.
We propose CFExplainer, a novel counterfactual explainer for GNN-based vulnerability detection.
arXiv Detail & Related papers (2024-04-24T06:52:53Z)
- The Vulnerability Is in the Details: Locating Fine-grained Information of Vulnerable Code Identified by Graph-based Detectors [33.395068754566935]
VULEXPLAINER is a tool for locating vulnerability-critical code lines from coarse-level vulnerable code snippets.
It can flag the vulnerability-triggering code statements with an accuracy of around 90% against eight common C/C++ vulnerabilities.
arXiv Detail & Related papers (2024-01-05T10:15:04Z)
- Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map CWE categories to the PROMISE_exp dataset. Naive Bayes, support vector machine (SVM), decision tree, neural network, and convolutional neural network (CNN) algorithms were tested (a toy sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
- Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning? [5.603751223376071]
We present a practical system that leverages deep learning on a large-scale data set of vulnerable code patterns.
We show that our approach improves on state-of-the-art vulnerability detection models by 10%.
arXiv Detail & Related papers (2023-05-23T01:21:55Z)
- Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
- A Hierarchical Deep Neural Network for Detecting Lines of Codes with Vulnerabilities [6.09170287691728]
Software vulnerabilities, caused by unintentional flaws in source code, are the main root cause of cyberattacks.
We propose a deep learning approach to detect vulnerabilities from their LLVM IR representations based on the techniques that have been used in natural language processing.
arXiv Detail & Related papers (2022-11-15T21:21:27Z)
- VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because parsed code naturally admits graph structures, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
- Multi-context Attention Fusion Neural Network for Software Vulnerability Identification [4.05739885420409]
We propose a deep learning model that learns to detect some of the common categories of security vulnerabilities in source code efficiently.
The model builds an accurate understanding of code semantics with far fewer learnable parameters.
The proposed AI achieves 98.40% F1-score on specific CWEs from the benchmarked NIST SARD dataset.
arXiv Detail & Related papers (2021-04-19T11:50:36Z)
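For the entry "Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications" above, here is a minimal scikit-learn sketch of the general TF-IDF-plus-SVM idea; the tiny requirement set and the CWE labels are invented for illustration and are not taken from that paper or from the PROMISE_exp dataset.

```python
# Toy sketch: classify requirement sentences into CWE categories with
# TF-IDF features and a linear SVM. Data and labels are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

requirements = [
    "The system shall store user passwords in the database.",
    "User input from the login form is passed to the SQL query.",
    "The application shall log all failed authentication attempts.",
    "Uploaded file names are used directly to build file system paths.",
]
cwe_labels = ["CWE-256", "CWE-89", "CWE-287", "CWE-22"]  # hypothetical mapping

# Fit a simple text-classification pipeline on the toy data.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(requirements, cwe_labels)

new_requirement = ["Search terms entered by the user are concatenated into a SQL statement."]
print(model.predict(new_requirement))  # expected to suggest CWE-89 (SQL injection)
```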