Related papers: RLocator: Reinforcement Learning for Bug Localization

RLocator: Reinforcement Learning for Bug Localization

URL: http://arxiv.org/abs/2305.05586v3
Date: Mon, 30 Sep 2024 15:29:35 GMT
Title: RLocator: Reinforcement Learning for Bug Localization
Authors: Partha Chakraborty, Mahmoud Alfadel, Meiyappan Nagappan,
Abstract summary: We present RLocator, a Reinforcement Learning-based bug localization approach. We experimentally evaluate it based on a benchmark dataset of 8,316 bug reports from six popular Apache projects. RLocator achieves a Mean Reciprocal Rank (MRR) of 0.62, a Mean Average Precision (MAP) of 0.59, and a Top 1 score of 0.46.
Score: 1.9854146581797698
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Software developers spend a significant portion of time fixing bugs in their projects. To streamline this process, bug localization approaches have been proposed to identify the source code files that are likely responsible for a particular bug. Prior work proposed several similarity-based machine-learning techniques for bug localization. Despite significant advances in these techniques, they do not directly optimize the evaluation measures. We argue that directly optimizing evaluation measures can positively contribute to the performance of bug localization approaches. Therefore, In this paper, we utilize Reinforcement Learning (RL) techniques to directly optimize the ranking metrics. We propose RLocator, a Reinforcement Learning-based bug localization approach. We formulate RLocator using a Markov Decision Process (MDP) to optimize the evaluation measures directly. We present the technique and experimentally evaluate it based on a benchmark dataset of 8,316 bug reports from six highly popular Apache projects. The results of our evaluation reveal that RLocator achieves a Mean Reciprocal Rank (MRR) of 0.62, a Mean Average Precision (MAP) of 0.59, and a Top 1 score of 0.46. We compare RLocator with two state-of-the-art bug localization tools, FLIM and BugLocator. Our evaluation reveals that RLocator outperforms both approaches by a substantial margin, with improvements of 38.3% in MAP, 36.73% in MRR, and 23.68% in the Top K metric. These findings highlight that directly optimizing evaluation measures considerably contributes to performance improvement of the bug localization problem.

Related papers

MIB: A Mechanistic Interpretability Benchmark [77.35046700898326]
We propose MIB, a benchmark with two tracks spanning four tasks and five models. Using MIB, we find that attribution and mask optimization methods perform best on circuit localization. For causal variable localization, we find that the supervised DAS method performs best, while SAE features are not better than neurons.
arXiv Detail & Related papers (2025-04-17T17:55:45Z)
Improved IR-based Bug Localization with Intelligent Relevance Feedback [2.9312156642007294]
Software bugs pose a significant challenge during development and maintenance, and practitioners spend nearly 50% of their time dealing with bugs. Many existing techniques adopt Information Retrieval (IR) to localize a reported bug using textual and semantic relevance between bug reports and source code. We present a novel technique for bug localization - BRaIn - that addresses the contextual gaps by assessing the relevance between bug reports and code.
arXiv Detail & Related papers (2025-01-17T20:29:38Z)
Continuously Learning Bug Locations [11.185300073739098]
We evaluate the potential of using Continual Learning (CL) techniques in multiple sub-tasks setting for bug localization. We show that CL techniques perform better than DL-based techniques by up to 61% in terms of Mean Reciprocal Rank (MRR), 44% in terms of Mean Average Precision (MAP), 83% in terms of top@1, 56% in terms of top@5, and 66% in terms of top@10 metrics in non-stationary setting.
arXiv Detail & Related papers (2024-12-15T19:37:15Z)
Preference Optimization for Reasoning with Pseudo Feedback [100.62603571434167]
We introduce a novel approach to generate pseudo feedback for reasoning tasks by framing the labeling of solutions as an evaluation against associated test cases. We conduct experiments on both mathematical reasoning and coding tasks using pseudo feedback for preference optimization, and observe improvements across both tasks.
arXiv Detail & Related papers (2024-11-25T12:44:02Z)
BLAZE: Cross-Language and Cross-Project Bug Localization via Dynamic Chunking and Hard Example Learning [1.9854146581797698]
BLAZE is an approach that employs dynamic chunking and hard example learning. It fine-tunes a GPT-based model using challenging bug cases to enhance cross-project and cross-language bug localization. BLAZE achieves up to an increase of 120% in Top 1 accuracy, 144% in Mean Average Precision (MAP), and 100% in Mean Reciprocal Rank (MRR)
arXiv Detail & Related papers (2024-07-24T20:44:36Z)
Leveraging Stack Traces for Spectrum-based Fault Localization in the Absence of Failing Tests [44.13331329339185]
We introduce a new approach, SBEST, that integrates stack trace data with test coverage to enhance fault localization. Our approach shows a significant improvement, increasing Mean Average Precision (MAP) by 32.22% and Mean Reciprocal Rank (MRR) by 17.43% over traditional stack trace ranking methods.
arXiv Detail & Related papers (2024-05-01T15:15:52Z)
Lower-Left Partial AUC: An Effective and Efficient Optimization Metric for Recommendation [52.45394284415614]
We propose a new optimization metric, Lower-Left Partial AUC (LLPAUC), which is computationally efficient like AUC but strongly correlates with Top-K ranking metrics. LLPAUC considers only the partial area under the ROC curve in the Lower-Left corner to push the optimization focus on Top-K.
arXiv Detail & Related papers (2024-02-29T13:58:33Z)
Less is More: Fewer Interpretable Region via Submodular Subset Selection [54.07758302264416]
This paper re-models the above image attribution problem as a submodular subset selection problem. We construct a novel submodular function to discover more accurate small interpretation regions. For correctly predicted samples, the proposed method improves the Deletion and Insertion scores with an average of 4.9% and 2.5% gain relative to HSIC-Attribution.
arXiv Detail & Related papers (2024-02-14T13:30:02Z)
Enhancing Redundancy-based Automated Program Repair by Fine-grained Pattern Mining [18.3896381051331]
We propose a new repair technique named Repatt, which incorporates a two-level pattern mining process for guiding effective patch generation. We have conducted an experiment on the widely-used Defects4J benchmark and compared Repatt with eight state-of-the-art APR approaches.
arXiv Detail & Related papers (2023-12-26T08:42:32Z)
Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC) AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability. We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z)
ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention [48.697458429460184]
Two factors, information bottleneck sensitivity and inconsistency between different attention topologies, could affect the performance of the Sparse Transformer. This paper proposes a well-designed model named ERNIE-Sparse. It consists of two distinctive parts: (i) Hierarchical Sparse Transformer (HST) to sequentially unify local and global information, and (ii) Self-Attention Regularization (SAR) to minimize the distance for transformers with different attention topologies.
arXiv Detail & Related papers (2022-03-23T08:47:01Z)
Out-of-Vocabulary Entities in Link Prediction [1.9036571490366496]
Link prediction is often used as a proxy to evaluate the quality of embeddings. As benchmarks are crucial for the fair comparison of algorithms, ensuring their quality is tantamount to providing a solid ground for developing better solutions. We provide an implementation of an approach for spotting and removing such entities and provide corrected versions of the datasets WN18RR, FB15K-237, and YAGO3-10.
arXiv Detail & Related papers (2021-05-26T12:58:18Z)
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection [85.53263670166304]
One-stage detector basically formulates object detection as dense classification and localization. Recent trend for one-stage detectors is to introduce an individual prediction branch to estimate the quality of localization. This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization.
arXiv Detail & Related papers (2020-06-08T07:24:33Z)
CPM R-CNN: Calibrating Point-guided Misalignment in Object Detection [30.819685214855685]
CPM R-CNN contains three efficient modules to optimize anchor-based point-guided method. Compared with Faster R-CNN and Grid R-CNN based on ResNet-101 with FPN, our approach can substantially improve detection mAP by 3.3% and 1.5% respectively.
arXiv Detail & Related papers (2020-03-07T12:29:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.