Related papers: BugRepro: Enhancing Android Bug Reproduction with Domain-Specific Knowledge Integration

URL: http://arxiv.org/abs/2505.14528v2
Date: Thu, 29 May 2025 13:03:01 GMT
Title: BugRepro: Enhancing Android Bug Reproduction with Domain-Specific Knowledge Integration
Authors: Hongrong Yin, Jinhong Huang, Yao Li, Yunwei Dong, Tao Zhang
Abstract summary: BugRepro is a novel technique that integrates domain-specific knowledge to enhance the accuracy and efficiency of bug reproduction. BugRepro significantly outperforms two state-of-the-art methods.
Score: 4.833035081314386
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mobile application development is a fast-paced process where maintaining high-quality user experiences is crucial. Bug reproduction, a key aspect of maintaining app quality, often faces significant challenges. Specifically, when descriptions in bug reports are ambiguous or difficult to comprehend, current approaches fail to extract accurate information. Moreover, modern applications exhibit inherent complexity with multiple pages and diverse functionalities, making it challenging for existing methods to map the relevant information in bug reports to the corresponding UI elements that need to be manipulated. To address these challenges, we propose BugRepro, a novel technique that integrates domain-specific knowledge to enhance the accuracy and efficiency of bug reproduction. BugRepro adopts a Retrieval-Augmented Generation (RAG) approach. It retrieves similar bug reports along with their corresponding steps to reproduce (S2R) entities from an example-rich RAG document. In addition, BugRepro explores the graphical user interface (GUI) of the app and extracts transition graphs from the user interface to incorporate app-specific knowledge to guide large language models (LLMs) in their exploration process. Our experiments demonstrate that BugRepro significantly outperforms two state-of-the-art methods (ReCDroid and AdbGPT). For S2R entity extraction accuracy, it achieves a 7.57 to 28.89 percentage point increase over prior methods. For the bug reproduction success rate, the improvement reaches 74.55% and 152.63%. In reproduction efficiency, the gains are 0.72% and 76.68%.
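The retrieval step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the similarity measure, the layout of the RAG document, or the prompt format, so this sketch substitutes a plain bag-of-words cosine similarity, a hypothetical three-entry corpus, and hypothetical helper names (`retrieve_examples`, `build_prompt`). It is not BugRepro's implementation.

```python
import math
import re
from collections import Counter

# Hypothetical example store: (bug report text, S2R entities).
# The real RAG document format is not described in the abstract.
CORPUS = [
    ("Crash after tapping save in settings",
     [("tap", "Settings"), ("tap", "Save")]),
    ("Login fails with empty password",
     [("input", "Password", ""), ("tap", "Login")]),
    ("Image does not load in gallery view",
     [("tap", "Gallery")]),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_examples(query, corpus, k=2):
    """Return up to k (report, s2r) pairs most similar to the query.

    Entries with zero similarity are dropped.
    """
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(report))), report, s2r)
        for report, s2r in corpus
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(report, s2r) for score, report, s2r in scored[:k] if score > 0]


def build_prompt(query, examples):
    """Assemble a few-shot prompt from retrieved examples (assumed format)."""
    lines = []
    for report, s2r in examples:
        lines.append(f"Bug report: {report}")
        lines.append(f"S2R entities: {s2r}")
    lines.append(f"Bug report: {query}")
    lines.append("S2R entities:")
    return "\n".join(lines)
```

Calling `retrieve_examples` on a new report and passing the result to `build_prompt` yields a prompt in which the retrieved reports and their S2R entities precede the new report, which is the general shape of retrieval-augmented few-shot prompting.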

Related papers

Improving Factuality with Explicit Working Memory [68.39261790277615]
Large language models can generate factually inaccurate content, a problem known as hallucination. We introduce EWE (Explicit Working Memory), a novel approach that enhances factuality in long-form text generation by integrating a working memory that receives real-time feedback from external resources.
arXiv Detail & Related papers (2024-12-24T00:55:59Z)
Semantic GUI Scene Learning and Video Alignment for Detecting Duplicate Video-based Bug Reports [16.45808969240553]
Video-based bug reports are increasingly being used to document bugs for programs centered around a graphical user interface (GUI). We introduce a new approach, called JANUS, that adapts the scene-learning capabilities of vision transformers to capture subtle visual and textual patterns that manifest on app UI screens. JANUS also makes use of a video alignment technique capable of adaptive weighting of video frames to account for typical bug manifestation patterns.
arXiv Detail & Related papers (2024-07-11T15:48:36Z)
Feedback-Driven Automated Whole Bug Report Reproduction for Android Apps [23.460238111094608]
ReBL is a novel feedback-driven approach to reproduce Android bug reports. It is more flexible and context-aware than the traditional step-by-step entity matching approach. It has the capability of handling non-crash functional bug reports.
arXiv Detail & Related papers (2024-07-06T19:58:03Z)
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models [102.72940700598055]
In reasoning tasks, even a minor error can cascade into inaccurate results. We develop a method that avoids introducing external resources, relying instead on perturbations to the input. Our training approach randomly masks certain tokens within the chain of thought, a technique we found to be particularly effective for reasoning tasks.
arXiv Detail & Related papers (2024-03-04T16:21:54Z)
On Using GUI Interaction Data to Improve Text Retrieval-based Bug Localization [10.717184444794505]
We investigate the hypothesis that, for end user-facing applications, connecting information in a bug report with information from the GUI, can improve upon existing techniques for bug localization. We source the current largest dataset of fully-localized and reproducible real bugs for Android apps, with corresponding bug reports.
arXiv Detail & Related papers (2023-10-12T07:14:22Z)
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now face an increasing risk of containing factual errors when handled by generative models. We propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z)
Prompting Is All You Need: Automated Android Bug Replay with Large Language Models [28.69675481931385]
We propose AdbGPT, a new lightweight approach to automatically reproduce the bugs from bug reports through prompt engineering. AdbGPT leverages few-shot learning and chain-of-thought reasoning to elicit human knowledge and logical reasoning from LLMs. Our evaluations demonstrate the effectiveness and efficiency of AdbGPT, which reproduces 81.3% of bug reports in 253.6 seconds.
arXiv Detail & Related papers (2023-06-03T03:03:52Z)
Auto-labelling of Bug Report using Natural Language Processing [0.0]
Rule and Query-based solutions recommend a long list of potential similar bug reports with no clear ranking. In this paper, we have proposed a solution using a combination of NLP techniques. It uses a custom data transformer, a deep neural network, and a non-generalizing machine learning method to retrieve existing identical bug reports.
arXiv Detail & Related papers (2022-12-13T02:32:42Z)
Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers. We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization. We provide a general benchmark with a diversity of real and synthetic Java bugs. We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation [61.99379022383108]
We propose new deep learning models to solve the bug triage problem. The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network. To improve the quality of ranking, we propose using additional information from version control system annotations.
arXiv Detail & Related papers (2022-01-14T00:16:57Z)
VSAC: Efficient and Accurate Estimator for H and F [68.65610177368617]
VSAC is a RANSAC-type robust estimator with a number of novelties. It is significantly faster than all its predecessors and runs on average in 1-2 ms on a CPU. It is two orders of magnitude faster and yet as precise as MAGSAC++, the currently most accurate estimator of two-view geometry.
arXiv Detail & Related papers (2021-06-18T17:04:57Z)
RepPoints V2: Verification Meets Regression for Object Detection [65.120827759348]
We introduce verification tasks into the localization prediction of RepPoints. RepPoints v2 provides consistent improvements of about 2.0 mAP over the original RepPoints. We show that the proposed approach can more generally elevate other object detection frameworks as well as applications such as instance segmentation.
arXiv Detail & Related papers (2020-07-16T17:57:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.