Past, Present, and Future of Bug Tracking in the Generative AI Era
- URL: http://arxiv.org/abs/2510.08005v1
- Date: Thu, 09 Oct 2025 09:42:30 GMT
- Title: Past, Present, and Future of Bug Tracking in the Generative AI Era
- Authors: Utku Boran Torun, Mehmet Taha Demircan, Mahmut Furkan Gön, Eray Tüzün
- Abstract summary: Traditional bug tracking systems rely heavily on manual reporting, reproduction, triaging, and resolution. We propose an AI-powered bug tracking framework that augments existing tools with intelligent, large language model (LLM)-driven automation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional bug tracking systems rely heavily on manual reporting, reproduction, triaging, and resolution, each carried out by different stakeholders such as end users, customer support, developers, and testers. This division of responsibilities requires significant coordination and widens the communication gap between non-technical users and technical teams, slowing the process from bug discovery to resolution. Moreover, current systems are highly asynchronous; users often wait hours or days for a first response, delaying fixes and contributing to frustration. This paper examines the evolution of bug tracking, from early paper-based reporting to today's web-based and SaaS platforms. Building on this trajectory, we propose an AI-powered bug tracking framework that augments existing tools with intelligent, large language model (LLM)-driven automation. Our framework addresses two main challenges: reducing time-to-fix and minimizing human overhead. Users report issues in natural language, while AI agents refine reports, attempt reproduction, and request missing details. Reports are then classified, invalid ones resolved through no-code fixes, and valid ones localized and assigned to developers. LLMs also generate candidate patches, with human oversight ensuring correctness. By integrating automation into each phase, our framework accelerates response times, improves collaboration, and strengthens software maintenance practices for a more efficient, user-centric future.
Related papers
- Steering LLMs via Scalable Interactive Oversight [74.12746881843044]
As Large Language Models increasingly automate complex, long-horizon tasks such as "vibe coding", a supervision gap has emerged. This presents a critical challenge in scalable oversight: enabling humans to responsibly steer AI systems on tasks that surpass their own ability to specify or verify.
arXiv Detail & Related papers (2026-02-04T04:52:00Z) - Talk Less, Verify More: Improving LLM Assistants with Semantic Checks and Execution Feedback [14.593478824805542]
This paper introduces two complementary verification techniques: Q*, which performs reverse translation and semantic matching between code and user intent, and Feedback+, which incorporates execution feedback to guide code refinement. Evaluations on three benchmark datasets, Spider, Bird, and GSM8K, demonstrate that both Q* and Feedback+ reduce error rates and task completion time.
arXiv Detail & Related papers (2026-01-01T06:10:06Z) - Toward Training Superintelligent Software Agents through Self-Play SWE-RL [66.11447353341926]
Self-play SWE-RL is a first step toward training paradigms for superintelligent software agents. Our approach takes minimal data assumptions, only requiring access to sandboxed repositories with source code and installed dependencies. Our results, albeit early, suggest a path where agents autonomously gather extensive learning experiences from real-world software repositories.
arXiv Detail & Related papers (2025-12-21T00:49:40Z) - Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [60.04362496037186]
We present the first controlled study of developer interactions with coding agents. We evaluate two leading copilot-style and agentic coding assistants. Our results show agents can assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z) - From Code Generation to Software Testing: AI Copilot with Context-Based RAG [8.28588489551341]
We propose a novel perspective on software testing by positing bug detection and coding with fewer bugs as two interconnected problems. We introduce Copilot for Testing, an automated testing system that synchronizes bug detection with updates. Our evaluation demonstrates a 31.2% improvement in bug detection accuracy, a 12.6% increase in critical test coverage, and a 10.5% higher user acceptance rate.
arXiv Detail & Related papers (2025-04-02T16:20:05Z) - AEGIS: An Agent-based Framework for General Bug Reproduction from Issue Descriptions [10.686849324750556]
AEGIS (Automated gEneral buG reproductIon Scripts generation framework) is the first agent-based framework for this task. AEGIS improves the relative resolved rate of Agentless by 12.5%.
arXiv Detail & Related papers (2024-11-27T03:16:47Z) - LLMs as Continuous Learners: Improving the Reproduction of Defective Code in Software Issues [62.12404317786005]
EvoCoder is a continuous learning framework for issue code reproduction.
Our results show a 20% improvement in issue reproduction rates over existing SOTA methods.
arXiv Detail & Related papers (2024-11-21T08:49:23Z) - MarsCode Agent: AI-native Automated Bug Fixing [7.909344108948294]
We introduce MarsCode Agent, a novel framework that leverages large language models to automatically identify and repair bugs in software code.
Our approach follows a systematic process of planning, bug reproduction, fault localization, candidate patch generation, and validation to ensure high-quality bug fixes.
Our results show that MarsCode Agent achieves a high success rate in bug fixing compared to most of the existing automated approaches.
arXiv Detail & Related papers (2024-09-02T02:24:38Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - Prompting Is All You Need: Automated Android Bug Replay with Large Language Models [28.69675481931385]
We propose AdbGPT, a new lightweight approach to automatically reproduce the bugs from bug reports through prompt engineering.
AdbGPT leverages few-shot learning and chain-of-thought reasoning to elicit human knowledge and logical reasoning from LLMs.
Our evaluations demonstrate the effectiveness and efficiency of AdbGPT, which reproduces 81.3% of bug reports in 253.6 seconds.
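The AdbGPT summary above combines few-shot examples with chain-of-thought prompting to extract reproduction steps from free-form bug reports. A minimal illustration of that general prompting pattern (the example pairs, instruction wording, and `build_repro_prompt` helper are invented here, not AdbGPT's actual prompts):

```python
# One worked example pairs a raw report with an explicit reasoning trace
# followed by the extracted steps: the chain-of-thought part of the prompt.
FEW_SHOT = [
    (
        "Bug: app crashes after tapping 'Login' twice",
        "Reasoning: the report mentions tapping 'Login' two times in a row.\n"
        "Steps: 1. Open the app  2. Tap 'Login'  3. Tap 'Login' again",
    ),
]


def build_repro_prompt(report: str) -> str:
    """Assemble a few-shot, chain-of-thought prompt asking an LLM to
    turn a free-form bug report into numbered reproduction steps."""
    parts = [
        "Extract step-by-step reproduction steps from the bug report. "
        "Think step by step before listing the steps."
    ]
    for bug, answer in FEW_SHOT:            # few-shot demonstrations
        parts.append(f"{bug}\n{answer}")
    parts.append(f"Bug: {report}\nReasoning:")  # cue the model to reason first
    return "\n\n".join(parts)
```

The trailing `Reasoning:` cue asks the model to emit its reasoning before the steps, which is what makes the demonstrations chain-of-thought rather than plain input-output pairs.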
arXiv Detail & Related papers (2023-06-03T03:03:52Z) - Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z) - Learning to Communicate and Correct Pose Errors [75.03747122616605]
We study the setting proposed in V2VNet, where nearby self-driving vehicles jointly perform object detection and motion forecasting in a cooperative manner.
We propose a novel neural reasoning framework that learns to communicate, to estimate potential errors, and to reach a consensus about those errors.
arXiv Detail & Related papers (2020-11-10T18:19:40Z)