Self-Admitted Technical Debt Detection Approaches: A Decade Systematic Review
- URL: http://arxiv.org/abs/2312.15020v3
- Date: Sat, 21 Sep 2024 19:56:56 GMT
- Title: Self-Admitted Technical Debt Detection Approaches: A Decade Systematic Review
- Authors: Edi Sutoyo, Andrea Capiluppi,
- Abstract summary: Technical debt (TD) represents the long-term costs associated with suboptimal design or code decisions in software development.
Self-Admitted Technical Debt (SATD) occurs when developers explicitly acknowledge these trade-offs.
automated detection of SATD has become an increasingly important research area.
- Score: 5.670597842524448
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Technical debt (TD) represents the long-term costs associated with suboptimal design or code decisions in software development, often made to meet short-term delivery goals. Self-Admitted Technical Debt (SATD) occurs when developers explicitly acknowledge these trade-offs in the codebase, typically through comments or annotations. Automated detection of SATD has become an increasingly important research area, particularly with the rise of natural language processing (NLP), machine learning (ML), and deep learning (DL) techniques that aim to streamline SATD detection. This systematic literature review provides a comprehensive analysis of SATD detection approaches published between 2014 and 2024, focusing on the evolution of techniques from NLP-based models to more advanced ML, DL, and Transformers-based models such as BERT. The review identifies key trends in SATD detection methodologies and tools, evaluates the effectiveness of different approaches using metrics like precision, recall, and F1-score, and highlights the primary challenges in this domain, including dataset heterogeneity, model generalizability, and the explainability of models. The findings suggest that while early NLP methods laid the foundation for SATD detection, more recent advancements in DL and Transformers models have significantly improved detection accuracy. However, challenges remain in scaling these models for broader industrial use. This SLR offers insights into current research gaps and provides directions for future work, aiming to improve the robustness and practicality of SATD detection tools.
Related papers
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era [59.279784235147254]
This survey provides an in-depth summary of the latest approaches that are based on recurrent models for sequential data processing.
The emerging picture suggests that there is room for thinking of novel routes, constituted by learning algorithms which depart from the standard Backpropagation Through Time.
arXiv Detail & Related papers (2024-06-13T12:51:22Z) - A Novel Generative AI-Based Framework for Anomaly Detection in Multicast Messages in Smart Grid Communications [0.0]
Cybersecurity breaches in digital substations pose significant challenges to the stability and reliability of power system operations.
This paper proposes a task-oriented dialogue system for anomaly detection (AD) in datasets of multicast messages.
It has a lower potential error and better scalability and adaptability than a process that considers the cybersecurity guidelines recommended by humans.
arXiv Detail & Related papers (2024-06-08T13:28:50Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15% points relative.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - Uncertainty Estimation of Transformers' Predictions via Topological Analysis of the Attention Matrices [3.1466086042810884]
Transformer-based language models have set new benchmarks across a wide range of NLP tasks.
reliably estimating the uncertainty of their predictions remains a significant challenge.
We tackle these limitations by harnessing the geometry of attention maps across multiple heads and layers to assess model confidence.
Our method significantly outperforms existing uncertainty estimation techniques on benchmarks for acceptability judgments and artificial text detection.
arXiv Detail & Related papers (2023-08-22T09:17:45Z) - The Devil is in the Errors: Leveraging Large Language Models for
Fine-grained Machine Translation Evaluation [93.01964988474755]
AutoMQM is a prompting technique which asks large language models to identify and categorize errors in translations.
We study the impact of labeled data through in-context learning and finetuning.
We then evaluate AutoMQM with PaLM-2 models, and we find that it improves performance compared to just prompting for scores.
arXiv Detail & Related papers (2023-08-14T17:17:21Z) - Deep Transfer Learning for Automatic Speech Recognition: Towards Better
Generalization [3.6393183544320236]
Speech recognition has become an important challenge when using deep learning (DL)
It requires large-scale training datasets and high computational and storage resources.
Deep transfer learning (DTL) has been introduced to overcome these issues.
arXiv Detail & Related papers (2023-04-27T21:08:05Z) - Measuring Improvement of F$_1$-Scores in Detection of Self-Admitted
Technical Debt [5.750379648650073]
We improve SATD detection with a novel approach that leverages the Bidirectional Representations from Transformers (BERT) architecture.
We find that our trained BERT model improves over the best performance of all previous methods in 19 of the 20 projects in cross-project scenarios.
Future research will look into ways to diversify SATD datasets in order to maximize the latent power in large BERT models.
arXiv Detail & Related papers (2023-03-16T19:47:38Z) - On the Reliability and Explainability of Language Models for Program
Generation [15.569926313298337]
We study the capabilities and limitations of automated program generation approaches.
We employ advanced explainable AI approaches to highlight the tokens that significantly contribute to the code transformation.
Our analysis reveals that, in various experimental scenarios, language models can recognize code grammar and structural information, but they exhibit limited robustness to changes in input sequences.
arXiv Detail & Related papers (2023-02-19T14:59:52Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches, is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.