What Can Self-Admitted Technical Debt Tell Us About Security? A
Mixed-Methods Study
- URL: http://arxiv.org/abs/2401.12768v3
- Date: Sat, 2 Mar 2024 17:06:17 GMT
- Title: What Can Self-Admitted Technical Debt Tell Us About Security? A
Mixed-Methods Study
- Authors: Nicolás E. Díaz Ferreyra, Mojtaba Shahin, Mansooreh Zahedi, Sodiq Quadri and Ricardo Scandariato
- Abstract summary: Self-Admitted Technical Debt (SATD) can be a dreadful source of information on potentially exploitable vulnerabilities and security flaws.
This work investigates the security implications of SATD from a technical and developer-centred perspective.
- Score: 6.286506087629511
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-Admitted Technical Debt (SATD) encompasses a wide array of sub-optimal
design and implementation choices reported in software artefacts (e.g., code
comments and commit messages) by developers themselves. Such reports have been
central to the study of software maintenance and evolution over the last
decades. However, they can also be deemed dreadful sources of information on
potentially exploitable vulnerabilities and security flaws. This work
investigates the security implications of SATD from a technical and
developer-centred perspective. On the one hand, it analyses whether security
pointers disclosed inside SATD sources can be used to characterise
vulnerabilities in Open-Source Software (OSS) projects and repositories. On the
other hand, it delves into developers' perspectives regarding the motivations
behind this practice, its prevalence, and its potential negative consequences.
We followed a mixed-methods approach consisting of (i) the analysis of a
preexisting dataset containing 8,812 SATD instances and (ii) an online survey
with 222 OSS practitioners. We gathered 201 SATD instances through the dataset
analysis and mapped them to different Common Weakness Enumeration (CWE)
identifiers. Overall, 25 different types of CWEs were spotted across commit
messages, pull requests, code comments, and issue sections, of which 8 appear
among MITRE's Top-25 most dangerous ones. The survey shows that software
practitioners often place security pointers across SATD artefacts to promote a
security culture among their peers and help them spot flaky code sections,
among other motives. However, they also consider such a practice risky as it
may facilitate vulnerability exploits. Our findings suggest that preserving the
contextual integrity of security pointers disseminated across SATD artefacts is
critical to safeguard both commercial and OSS solutions against zero-day
attacks.
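To make the dataset-analysis side of the study more concrete, below is a minimal Python sketch of how security pointers in SATD sources could, in principle, be surfaced and tagged with candidate CWE identifiers. It is an illustration written for this summary only: the SATD markers and the keyword-to-CWE mapping are assumptions, not the authors' actual annotation procedure.

import re

# Hypothetical mapping from security-related keywords to candidate CWE ids
# (illustrative only; not the paper's annotation scheme).
KEYWORD_TO_CWE = {
    r"\bsql injection\b": "CWE-89",
    r"\bbuffer overflow\b": "CWE-120",
    r"\bhard-?coded (password|credential)s?\b": "CWE-798",
    r"\b(xss|cross-site scripting)\b": "CWE-79",
    r"\brace condition\b": "CWE-362",
}

# Common self-admitted technical debt markers found in code comments.
SATD_MARKERS = re.compile(r"\b(TODO|FIXME|HACK|XXX)\b", re.IGNORECASE)

def flag_security_satd(comment: str) -> list:
    """Return candidate CWE ids for a SATD comment, or an empty list."""
    if not SATD_MARKERS.search(comment):
        return []  # not self-admitted technical debt
    return [cwe for pattern, cwe in KEYWORD_TO_CWE.items()
            if re.search(pattern, comment, re.IGNORECASE)]

if __name__ == "__main__":
    comment = "TODO: query is built by string concatenation, possible SQL injection"
    print(flag_security_satd(comment))  # prints ['CWE-89']

A real pipeline would still require manual validation of each match, as the authors did when mapping 201 SATD instances to CWE identifiers.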
Related papers
- RedCode: Risky Code Execution and Generation Benchmark for Code Agents [50.81206098588923]
RedCode is a benchmark for risky code execution and generation.
RedCode-Exec provides challenging prompts that could lead to risky code execution.
RedCode-Gen provides 160 prompts with function signatures and docstrings as input to assess whether code agents will follow instructions.
arXiv Detail & Related papers (2024-11-12T13:30:06Z)
- Insights and Current Gaps in Open-Source LLM Vulnerability Scanners: A Comparative Analysis [1.5149711185416004]
This report presents a comparative analysis of open-source vulnerability scanners for conversational large language models (LLMs).
Our study evaluates prominent scanners - Garak, Giskard, PyRIT, and CyberSecEval - that adapt red-teaming practices to expose vulnerabilities.
arXiv Detail & Related papers (2024-10-21T21:36:03Z)
- The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z)
- Unintentional Security Flaws in Code: Automated Defense via Root Cause Analysis [2.899501205987888]
We developed an automated vulnerability root cause (RC) toolkit called T5-RCGCN.
It combines T5 language model embeddings with a graph convolutional network (GCN) for vulnerability classification and localization.
We tested T5-RCGCN with 56 junior developers across three datasets, showing a 28.9% improvement in code security compared to previous methods.
arXiv Detail & Related papers (2024-08-30T18:26:59Z)
- Static Application Security Testing (SAST) Tools for Smart Contracts: How Far Are We? [14.974832502863526]
In recent years, the importance of smart contract security has been heightened by the increasing number of attacks against them.
To address this issue, a multitude of static application security testing (SAST) tools have been proposed for detecting vulnerabilities in smart contracts.
In this paper, we propose an up-to-date and fine-grained taxonomy that includes 45 unique vulnerability types for smart contracts.
arXiv Detail & Related papers (2024-04-28T13:40:18Z)
- Fixing Smart Contract Vulnerabilities: A Comparative Analysis of Literature and Developer's Practices [6.09162202256218]
We refer to the vulnerability-fixing approaches found in the literature as guidelines.
It is not clear to what extent developers adhere to these guidelines, nor whether there are other viable common solutions and what they are.
The goal of our research is to fill knowledge gaps related to developers' observance of existing guidelines and to propose new and viable solutions to security vulnerabilities.
arXiv Detail & Related papers (2024-03-12T09:55:54Z)
- DevPhish: Exploring Social Engineering in Software Supply Chain Attacks on Developers [0.3754193239793766]
Adversaries utilize Social Engineering (SocE) techniques specifically aimed at software developers.
This paper aims to comprehensively explore the existing and emerging SocE tactics employed by adversaries to trick Software Engineers (SWEs) into delivering malicious software.
arXiv Detail & Related papers (2024-02-28T15:24:43Z)
- CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
- Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP [83.66405397421907]
We rethink the research paradigm of textual adversarial samples in security scenarios.
We first collect, process, and release Advbench, a collection of security datasets.
Next, we propose a simple method based on rules that can easily fulfill the actual adversarial goals to simulate real-world attack methods.
arXiv Detail & Related papers (2022-10-19T15:53:36Z)
- Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision (e.g., rotation and jigsaw) benefits image tasks such as classification and recognition, it fails to provide the critical supervision signals that could learn discriminative representations for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z)
- Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)