Agentic Software Issue Resolution with Large Language Models: A Survey
- URL: http://arxiv.org/abs/2512.22256v1
- Date: Wed, 24 Dec 2025 08:05:10 GMT
- Title: Agentic Software Issue Resolution with Large Language Models: A Survey
- Authors: Zhonghao Jiang, David Lo, Zhongxin Liu,
- Abstract summary: Software issue resolution aims to address real-world issues in software repositories based on natural language descriptions provided by users.<n>Large language models (LLMs) in reasoning and generative capabilities have made significant progress in automated software issue resolution.<n>Recently, LLM-based agentic systems have become mainstream for software issue resolution.
- Score: 9.583478737157531
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Software issue resolution aims to address real-world issues in software repositories (e.g., bug fixing and efficiency optimization) based on natural language descriptions provided by users, representing a key aspect of software maintenance. With the rapid development of large language models (LLMs) in reasoning and generative capabilities, LLM-based approaches have made significant progress in automated software issue resolution. However, real-world software issue resolution is inherently complex and requires long-horizon reasoning, iterative exploration, and feedback-driven decision making, which demand agentic capabilities beyond conventional single-step approaches. Recently, LLM-based agentic systems have become mainstream for software issue resolution. Advancements in agentic software issue resolution not only greatly enhance software maintenance efficiency and quality but also provide a realistic environment for validating agentic systems' reasoning, planning, and execution capabilities, bridging artificial intelligence and software engineering. This work presents a systematic survey of 126 recent studies at the forefront of LLM-based agentic software issue resolution research. It outlines the general workflow of the task and establishes a taxonomy across three dimensions: benchmarks, techniques, and empirical studies. Furthermore, it highlights how the emergence of agentic reinforcement learning has brought a paradigm shift in the design and training of agentic systems for software engineering. Finally, it summarizes key challenges and outlines promising directions for future research.
Related papers
- Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development.<n> benchmarks like SWE-bench revealed this task as profoundly difficult for large language models.<n>This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z) - LLM-Based Agentic Systems for Software Engineering: Challenges and Opportunities [0.03437656066916039]
This concept paper systematically reviews the emerging paradigm of LLM-based multi-agent systems.<n>We delve into a wide range of topics such as language model selection, SE evaluation benchmarks, state-of-the-art agentic frameworks and communication protocols.
arXiv Detail & Related papers (2026-01-14T19:28:30Z) - Studying and Automating Issue Resolution for Software Quality [1.7767466724342065]
This research aims to address challenges through three complementary directions.<n>First, we enhance issue report quality by proposing techniques that leverage LLM reasoning and application-specific information.<n>Second, we empirically characterize developer in both traditional and AI-augmented systems.<n>Third, we automate cognitively demanding resolution tasks, including buggy UI localization and solution identification.
arXiv Detail & Related papers (2025-12-11T02:44:40Z) - A Comprehensive Survey on Benchmarks and Solutions in Software Engineering of LLM-Empowered Agentic System [56.40989626804489]
This survey provides the first holistic analysis of Large Language Models-powered software engineering.<n>We review over 150 recent papers and propose a taxonomy along two key dimensions: (1) Solutions, categorized into prompt-based, fine-tuning-based, and agent-based paradigms, and (2) Benchmarks, including tasks such as code generation, translation, and repair.
arXiv Detail & Related papers (2025-10-10T06:56:50Z) - Large Language Models in Operations Research: Methods, Applications, and Challenges [9.208082097215314]
Operations research (OR) supports complex system decision-making, with broad applications in transportation, supply chain management, and production scheduling.<n>Traditional approaches that rely on expert-driven modeling and manual parameter tuning often struggle with large-scale, dynamic, and multi-constraint problems.<n>This paper systematically reviews progress in applying large language models (LLMs) to OR, categorizing existing methods into three pathways: automatic modeling, auxiliary optimization, and direct solving.
arXiv Detail & Related papers (2025-09-18T01:52:19Z) - A Survey on Code Generation with LLM-based Agents [61.474191493322415]
Code generation agents powered by large language models (LLMs) are revolutionizing the software development paradigm.<n>LLMs are characterized by three core features.<n>This paper presents a systematic survey of the field of LLM-based code generation agents.
arXiv Detail & Related papers (2025-07-31T18:17:36Z) - Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time instead of larger models.<n>Our framework incorporates two complementary strategies: internal TTC and external TTC.<n>We demonstrate our textbf32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z) - Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
Lingma SWE-GPT series learns from and simulating real-world code submission activities.
Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs)
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - LLM-Based Multi-Agent Systems for Software Engineering: Literature Review, Vision and the Road Ahead [14.834072370183106]
This paper explores the transformative potential of integrating Large Language Models into Multi-Agent (LMA) systems.<n>By leveraging the collaborative and specialized abilities of multiple agents, LMA systems enable autonomous problem-solving, improve robustness, and provide scalable solutions for managing the complexity of real-world software projects.
arXiv Detail & Related papers (2024-04-07T07:05:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.