Related papers: Automatically Estimating the Effort Required to Repay Self-Admitted Technical Debt

URL: http://arxiv.org/abs/2309.06020v1
Date: Tue, 12 Sep 2023 07:40:18 GMT
Title: Automatically Estimating the Effort Required to Repay Self-Admitted Technical Debt
Authors: Yikun Li, Mohamed Soliman, Paris Avgeriou
Abstract summary: Self-Admitted Technical Debt (SATD) is a specific form of technical debt documented by developers within software artifacts. We propose a novel approach for automatically estimating SATD repayment effort, utilizing a comprehensive dataset. Our findings show that different types of SATD require varying levels of repayment effort, with code/design, requirement, and test debt demanding greater effort compared to non-SATD items.
Score: 1.8208834479445897
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Technical debt refers to the consequences of sub-optimal decisions made during software development that prioritize short-term benefits over long-term maintainability. Self-Admitted Technical Debt (SATD) is a specific form of technical debt, explicitly documented by developers within software artifacts such as source code comments and commit messages. As SATD can hinder software development and maintenance, it is crucial to address and prioritize it effectively. However, current methodologies lack the ability to automatically estimate the repayment effort of SATD based on its textual descriptions. To address this limitation, we propose a novel approach for automatically estimating SATD repayment effort, utilizing a comprehensive dataset comprising 341,740 SATD items from 2,568,728 commits across 1,060 Apache repositories. Our findings show that different types of SATD require varying levels of repayment effort, with code/design, requirement, and test debt demanding greater effort compared to non-SATD items, while documentation debt requires less. We introduce and evaluate machine learning methodologies, particularly BERT and TextCNN, which outperform classic machine learning methods and the naive baseline in estimating repayment effort. Additionally, we summarize keywords associated with varying levels of repayment effort that occur during SATD repayment. Our contributions aim to enhance the prioritization of SATD repayment effort and resource allocation efficiency, ultimately benefiting software development and maintainability.

Related papers

Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute (TTC) scaling framework that leverages increased inference-time computation instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z)
Leveraging multi-task learning to improve the detection of SATD and vulnerability [2.5385600700122737]
Self-Admitted Technical Debt (SATD) items are comments in the code that indicate not-quite-right code introduced for short-term needs. VulSATD is a deep learner that detects vulnerable and SATD code based on CodeBERT.
arXiv Detail & Related papers (2025-01-27T10:31:07Z)
Negativity in Self-Admitted Technical Debt: How Sentiment Influences Prioritization [50.07057212504773]
Self-Admitted Technical Debt, or SATD, is a self-admission of technical debt present in a software system. About a quarter of descriptions of SATD in software systems express some form of negativity or negative emotions. Our study shows how developers actively use negativity in SATD to determine how urgently a particular instance of TD should be addressed.
arXiv Detail & Related papers (2025-01-02T05:33:43Z)
Evidence is All We Need: Do Self-Admitted Technical Debts Impact Method-Level Maintenance? [1.0377683220196874]
Self-Admitted Technical Debt (SATD) refers to the phenomenon where developers explicitly acknowledge technical debt through comments in the source code. This paper aims to empirically investigate the influence of SATD on various facets of software maintenance at the method level.
arXiv Detail & Related papers (2024-11-21T01:21:35Z)
Improving the detection of technical debt in Java source code with an enriched dataset [12.07607688189035]
Technical debt (TD) is the additional work and costs that emerge when developers opt for a quick and easy solution to a problem. Recent research has focused on detecting Self-Admitted Technical Debts (SATDs) by analyzing comments embedded in source code. We curated the first ever dataset of TD identified by code comments, coupled with its associated source code.
arXiv Detail & Related papers (2024-11-08T10:12:33Z)
An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving the remarkable zero-shot generalization capabilities. Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool. We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
Self-Admitted Technical Debt Detection Approaches: A Decade Systematic Review [5.670597842524448]
Technical debt (TD) represents the long-term costs associated with suboptimal design or code decisions in software development. Self-Admitted Technical Debt (SATD) occurs when developers explicitly acknowledge these trade-offs. Automated detection of SATD has become an increasingly important research area.
arXiv Detail & Related papers (2023-12-19T12:01:13Z)
DebtViz: A Tool for Identifying, Measuring, Visualizing, and Monitoring Self-Admitted Technical Debt [1.6201475185215248]
Technical debt, specifically Self-Admitted Technical Debt (SATD), remains a significant challenge for software developers and managers. This paper presents DebtViz, an innovative SATD tool designed to automatically detect, classify, visualize and monitor various types of SATD in source code comments and issue tracking systems.
arXiv Detail & Related papers (2023-08-25T01:05:38Z)
Towards Automatically Addressing Self-Admitted Technical Debt: How Far Are We? [17.128428286986573]
This paper empirically investigates the extent to which technical debt can be automatically paid back by neural-based generative models. We start by extracting a dataset of 5,039 Self-Admitted Technical Debt (SATD) removals from 595 open-source projects. We use this dataset to experiment with seven different generative deep learning (DL) model configurations.
arXiv Detail & Related papers (2023-08-17T12:27:32Z)
SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286]
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs). We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer. We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
arXiv Detail & Related papers (2023-05-16T17:55:51Z)
Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions. We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. We prove the first best-of-both-worlds no-regret guarantees which hold in the absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z)
Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates [110.92598350897192]
Q-Learning has proven effective at learning a policy to perform control tasks. Estimation noise becomes a bias after the max operator in the policy improvement step. We present Unbiased Soft Q-Learning (UQL), which extends the work of EQL from two-action, finite-state spaces to multi-action, infinite-state Markov Decision Processes.
arXiv Detail & Related papers (2021-10-28T00:07:19Z)
Explanations of Machine Learning predictions: a mandatory step for its application to Operational Processes [61.20223338508952]
Credit Risk Modelling plays a paramount role. Recent machine and deep learning techniques have been applied to the task. We suggest using the LIME technique to tackle the explainability problem in this field.
arXiv Detail & Related papers (2020-12-30T10:27:59Z)
Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation [49.69139684065241]
Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems. In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint.
arXiv Detail & Related papers (2020-04-02T17:04:52Z)
