Improved IR-based Bug Localization with Intelligent Relevance Feedback
- URL: http://arxiv.org/abs/2501.10542v1
- Date: Fri, 17 Jan 2025 20:29:38 GMT
- Title: Improved IR-based Bug Localization with Intelligent Relevance Feedback
- Authors: Asif Mohammed Samir, Mohammad Masudur Rahman
- Abstract summary: Software bugs pose a significant challenge during development and maintenance, and practitioners spend nearly 50% of their time dealing with bugs.
Many existing techniques adopt Information Retrieval (IR) to localize a reported bug using textual and semantic relevance between bug reports and source code.
We present a novel technique for bug localization - BRaIn - that addresses the contextual gaps by assessing the relevance between bug reports and code.
- Score: 2.9312156642007294
- Abstract: Software bugs pose a significant challenge during development and maintenance, and practitioners spend nearly 50% of their time dealing with bugs. Many existing techniques adopt Information Retrieval (IR) to localize a reported bug using textual and semantic relevance between bug reports and source code. However, they often struggle to bridge a critical gap between bug reports and code that requires in-depth contextual understanding, which goes beyond textual or semantic relevance. In this paper, we present a novel technique for bug localization - BRaIn - that addresses the contextual gaps by assessing the relevance between bug reports and code with Large Language Models (LLMs). It then leverages the LLM's feedback (a.k.a., Intelligent Relevance Feedback) to reformulate queries and re-rank source documents, improving bug localization. We evaluate BRaIn using a benchmark dataset, Bench4BL, and three performance metrics and compare it against six baseline techniques from the literature. Our experimental results show that BRaIn outperforms baselines by 87.6%, 89.5%, and 48.8% margins in MAP, MRR, and HIT@K, respectively. Additionally, it can localize approximately 52% of bugs that cannot be localized by the baseline techniques due to the poor quality of corresponding bug reports. By addressing the contextual gaps and introducing Intelligent Relevance Feedback, BRaIn not only advances the theory but also improves the practice of IR-based bug localization.
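The abstract describes BRaIn's mechanism only at a high level: an LLM judges the relevance of initially retrieved code, and that feedback drives query reformulation and re-ranking. The Python sketch below illustrates one way such an Intelligent-Relevance-Feedback loop could be assembled; the crude lexical retriever, the stubbed `llm_judge_relevance` function, the Rocchio-style query expansion, and all names are illustrative assumptions, not the authors' implementation.

```python
"""A minimal sketch of an LLM-driven relevance-feedback loop for IR-based bug
localization, based only on the high-level description in the abstract.
The retrieval scorer, the prompt placeholder, and all function names are
illustrative assumptions, not BRaIn's actual implementation."""
import re
from collections import Counter

def tokenize(text):
    """Lowercase alphanumeric tokens; a stand-in for a real code/text tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query, corpus, k=10):
    """Crude lexical retrieval: rank documents by query-term overlap."""
    q = Counter(tokenize(query))
    scored = []
    for doc_id, text in corpus.items():
        d = Counter(tokenize(text))
        scored.append((sum(min(q[t], d[t]) for t in q), doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:k]]

def llm_judge_relevance(bug_report, code):
    """Placeholder for the LLM relevance judgment (the 'intelligent' feedback).
    A real system would prompt an LLM with the bug report and the code and
    parse a relevant / not-relevant verdict; here we fake it with a keyword check."""
    return any(t in tokenize(code) for t in tokenize(bug_report)[:5])

def localize(bug_report, corpus, k=10):
    # 1) First-pass retrieval with the raw bug report as the query.
    candidates = retrieve(bug_report, corpus, k)
    # 2) LLM feedback: keep the documents the LLM judges relevant.
    relevant = [d for d in candidates if llm_judge_relevance(bug_report, corpus[d])]
    # 3) Query reformulation: expand the query with frequent terms from the
    #    judged-relevant documents (a Rocchio-style expansion).
    expansion = Counter(t for d in relevant for t in tokenize(corpus[d]))
    new_query = bug_report + " " + " ".join(t for t, _ in expansion.most_common(10))
    # 4) Re-rank with the reformulated query, placing judged-relevant files first.
    reranked = retrieve(new_query, corpus, k)
    return sorted(reranked, key=lambda d: (d not in relevant, reranked.index(d)))

if __name__ == "__main__":
    corpus = {"Parser.java": "throws NullPointerException when input is empty",
              "Logger.java": "writes log lines to disk"}
    print(localize("NullPointerException on empty input", corpus))
```

In a real system, the retriever would be a proper IR engine (e.g., BM25 or Lucene) over indexed source files, and the relevance judgment would come from prompting an LLM rather than a keyword heuristic.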
Related papers
- Enhancing IR-based Fault Localization using Large Language Models [5.032687557488094]
This paper enhances IR-based Fault Localization (IRFL) by categorizing bug reports based on programming entities, stack traces, and natural language text.
To address inaccuracies in queries, we introduce a user and conversational-based query reformulation approach, termed LLmiRQ+.
Evaluation on 46 projects with 6,340 bug reports yields an MRR of 0.6770 and MAP of 0.5118, surpassing seven state-of-the-art IRFL techniques.
arXiv Detail & Related papers (2024-12-04T22:47:51Z) - Understanding Code Understandability Improvements in Code Reviews [79.16476505761582]
We analyzed 2,401 code review comments from Java open-source projects on GitHub.
83.9% of suggestions for improvement were accepted and integrated, with fewer than 1% later reverted.
arXiv Detail & Related papers (2024-10-29T12:21:23Z) - BLAZE: Cross-Language and Cross-Project Bug Localization via Dynamic Chunking and Hard Example Learning [1.9854146581797698]
BLAZE is an approach that employs dynamic chunking and hard example learning.
It fine-tunes a GPT-based model using challenging bug cases to enhance cross-project and cross-language bug localization.
BLAZE achieves improvements of up to 120% in Top-1 accuracy, 144% in Mean Average Precision (MAP), and 100% in Mean Reciprocal Rank (MRR).
arXiv Detail & Related papers (2024-07-24T20:44:36Z) - DebugBench: Evaluating Debugging Capability of Large Language Models [80.73121177868357]
DebugBench is a benchmark for Large Language Models (LLMs)
It covers four major bug categories and 18 minor types in C++, Java, and Python.
We evaluate two commercial and four open-source models in a zero-shot scenario.
arXiv Detail & Related papers (2024-01-09T15:46:38Z) - On Using GUI Interaction Data to Improve Text Retrieval-based Bug Localization [10.717184444794505]
We investigate the hypothesis that, for end user-facing applications, connecting information in a bug report with information from the GUI can improve upon existing techniques for bug localization.
We source the current largest dataset of fully-localized and reproducible real bugs for Android apps, with corresponding bug reports.
arXiv Detail & Related papers (2023-10-12T07:14:22Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen)
RAP-Gen explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports [0.0]
Retrieving similar bug reports from an existing database can help reduce the time and effort required to resolve bugs.
We explored several embedding models such as TF-IDF (Baseline), FastText, Gensim, BERT, and ADA.
Our study provides insights into the effectiveness of different embedding methods for retrieving similar bug reports and highlights the impact of selecting the appropriate one for this task.
arXiv Detail & Related papers (2023-08-17T21:36:56Z) - Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation [55.92852268168816]
N-gram matching-based evaluation metrics, such as BLEU and chrF, are widely utilized across a range of natural language generation (NLG) tasks.
Recent studies have revealed a weak correlation between these matching-based metrics and human evaluations.
We propose to utilize multiple references to enhance the consistency between these metrics and human evaluations.
arXiv Detail & Related papers (2023-08-06T14:49:26Z) - Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z) - BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z) - The Forgotten Role of Search Queries in IR-based Bug Localization: An Empirical Study [17.809196793565224]
This article critically examines the state-of-the-art query selection practices in IR-based bug localization.
We exploit a Genetic Algorithm-based approach to construct optimal or near-optimal search queries from 2,320 bug reports.
We demonstrate a 27%-34% improvement in the performance of non-optimal queries through the application of our actionable insights.
arXiv Detail & Related papers (2021-08-11T17:37:50Z)
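The results above are reported with standard ranking metrics: Mean Reciprocal Rank (MRR), Mean Average Precision (MAP), and Hit@K (Top-K accuracy). For reference, here is a small, self-contained Python sketch of how these metrics are conventionally computed from a ranked list of source files and the known buggy files for each bug report; it is an illustrative implementation, not any of the listed papers' evaluation scripts.

```python
"""Standard ranking metrics (MRR, MAP, Hit@K) used to report bug-localization
results; an illustrative implementation, not any paper's evaluation script."""

def reciprocal_rank(ranked, relevant):
    # 1 / rank of the first buggy file, 0 if none is retrieved.
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / i
    return 0.0

def average_precision(ranked, relevant):
    # Mean of precision values at each rank where a buggy file appears.
    hits, precisions = 0, []
    for i, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(relevant) if relevant else 0.0

def hit_at_k(ranked, relevant, k):
    # 1 if any buggy file appears in the top-k results, else 0.
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def evaluate(queries, k=10):
    """`queries` is a list of (ranked_files, buggy_files) pairs, one per bug report."""
    n = len(queries)
    return {
        "MRR": sum(reciprocal_rank(r, rel) for r, rel in queries) / n,
        "MAP": sum(average_precision(r, rel) for r, rel in queries) / n,
        f"HIT@{k}": sum(hit_at_k(r, rel, k) for r, rel in queries) / n,
    }

if __name__ == "__main__":
    ranked = ["Logger.java", "Parser.java", "Cache.java"]
    buggy = {"Parser.java"}
    print(evaluate([(ranked, buggy)]))  # MRR=0.5, MAP=0.5, HIT@10=1.0
```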