RepoRepair: Leveraging Code Documentation for Repository-Level Automated Program Repair
- URL: http://arxiv.org/abs/2603.01048v1
- Date: Sun, 01 Mar 2026 11:06:24 GMT
- Title: RepoRepair: Leveraging Code Documentation for Repository-Level Automated Program Repair
- Authors: Zhongqiang Pan, Chuanyi Li, Wenkang Zhong, Yi Feng, Bin Luo, Vincent Ng
- Abstract summary: We propose RepoRepair, a novel documentation-enhanced approach for repository-level fault localization and program repair. Our core insight is to leverage LLMs to generate hierarchical code documentation (from functions to files) for code repositories. RepoRepair first employs a text-based LLM to generate file/function-level code documentation for repositories, which serves as auxiliary knowledge to guide fault localization.
- Score: 30.23781155493087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated program repair (APR) struggles to scale from isolated functions to full repositories, as it demands a global, task-aware understanding to locate necessary changes. Current methods, limited by context and reliant on shallow retrieval or costly agent iterations, falter on complex cross-file issues. To this end, we propose RepoRepair, a novel documentation-enhanced approach for repository-level fault localization and program repair. Our core insight is to leverage LLMs to generate hierarchical code documentation (from functions to files) for code repositories, creating structured semantic abstractions that enable LLMs to comprehend repository-level context and dependencies. Specifically, RepoRepair first employs a text-based LLM (e.g., DeepSeek-V3) to generate file/function-level code documentation for repositories, which serves as auxiliary knowledge to guide fault localization. Subsequently, based on the fault localization results and the issue description, a powerful LLM (e.g., Claude-4) attempts to repair the identified suspicious code snippets. Evaluated on SWE-bench Lite, RepoRepair achieves a 45.7% repair rate at a low cost of $0.44 per fix. On SWE-bench Multimodal, it delivers state-of-the-art performance with a 37.1% repair rate despite a higher cost of $0.56 per fix, demonstrating robust and cost-effective performance across diverse problem domains.
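The abstract describes a three-stage pipeline: generate per-file documentation with a text LLM, use that documentation to localize the fault, then repair the suspicious code with a stronger LLM. The sketch below illustrates that control flow only; every function and data shape here is an illustrative stand-in (the doc generator and repairer are deterministic stubs, not calls to DeepSeek-V3 or Claude-4), not the authors' implementation.

```python
"""Minimal sketch of a documentation-enhanced repair pipeline in the
spirit of RepoRepair. All names and logic are illustrative stand-ins."""

def generate_docs(repo: dict[str, str]) -> dict[str, str]:
    # Stage 1 (stand-in): summarize each file. A real system would
    # prompt a text LLM per function/file to produce documentation.
    return {path: f"Summary of {path}: " + src.splitlines()[0]
            for path, src in repo.items()}

def localize_fault(issue: str, docs: dict[str, str]) -> list[str]:
    # Stage 2 (stand-in): rank files by naive term overlap between the
    # issue text and the generated documentation, return the top-1.
    issue_terms = set(issue.lower().split())
    ranked = sorted(
        docs,
        key=lambda p: -len(issue_terms & set(docs[p].lower().split())),
    )
    return ranked[:1]

def repair(issue: str, repo: dict[str, str], suspects: list[str]) -> dict[str, str]:
    # Stage 3 (stand-in): a real system would prompt a strong LLM for a
    # patch to each suspicious snippet; here we just tag the file.
    return {p: "# patched per issue: " + issue + "\n" + repo[p] for p in suspects}

repo = {
    "auth.py": "# login and session handling\ndef login(): ...",
    "math_utils.py": "# numeric helpers\ndef mean(xs): ...",
}
issue = "login fails when session expires"

docs = generate_docs(repo)
suspects = localize_fault(issue, docs)
patch = repair(issue, repo, suspects)
print(suspects)  # → ['auth.py']
```

The point of the sketch is the data flow: documentation acts as a cheap semantic index, so localization never needs the full source in context, which is what keeps the per-fix cost low.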
Related papers
- SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair [22.745971570878435]
We propose a Suggestion-Guided multi-Agent framework for repository-level software repair. SGAgent introduces a suggestion phase to strengthen the transition from localization to repair. Three specialized sub-agents collaborate to achieve automated end-to-end software repair.
arXiv Detail & Related papers (2026-02-27T03:32:47Z) - Outcome-Conditioned Reasoning Distillation for Resolving Software Issues [49.16055123488827]
We present an Outcome-Conditioned Reasoning Distillation (O-CRD) framework that uses resolved in-repository issues with verified patches as supervision. Starting from a historical fix, the method reconstructs a stage-wise repair trace backward from the verified outcome. On SWE-Bench Lite, this approach increases Pass@1 by 10.4% with GPT-4o, 8.6% with DeepSeek-V3, and 10.3% with GPT-5.
arXiv Detail & Related papers (2026-01-30T18:25:39Z) - Repairing Regex Vulnerabilities via Localization-Guided Instructions [6.033257307910245]
Regular expressions (regexes) expose systems to regular expression denial of service (ReDoS). Current approaches, however, are hampered by a trade-off. We introduce a hybrid framework, localized repair (LRR), designed to harness generalization while enforcing reliability.
arXiv Detail & Related papers (2025-10-10T06:15:43Z) - RelRepair: Enhancing Automated Program Repair by Retrieving Relevant Code [11.74568238259256]
RelRepair retrieves relevant project-specific code to enhance automated program repair. We evaluate RelRepair on two widely studied datasets, Defects4J V1.2 and ManySStuBs4J.
arXiv Detail & Related papers (2025-09-20T14:07:28Z) - RepoDebug: Repository-Level Multi-Task and Multi-Language Debugging Evaluation of Large Language Models [49.83481415540291]
Large Language Models (LLMs) have exhibited significant proficiency in code debugging. This paper introduces RepoDebug, a multi-task and multi-language repository-level code debugging dataset. We conduct evaluation experiments on 10 LLMs, where Claude 3.5 Sonnet, the best-performing model, still cannot perform well in repository-level debugging.
arXiv Detail & Related papers (2025-09-04T10:13:21Z) - SweRank: Software Issue Localization with Code Ranking [109.3289316191729]
SweRank is an efficient retrieve-and-rerank framework for software issue localization. We construct SweLoc, a large-scale dataset curated from public GitHub repositories. We show that SweRank achieves state-of-the-art performance, outperforming both prior ranking models and costly agent-based systems.
arXiv Detail & Related papers (2025-05-07T19:44:09Z) - Enhancing repository-level software repair via repository-aware knowledge graphs [13.747293341707563]
Repository-level software repair faces challenges in bridging semantic gaps between issue descriptions and code patches. Existing approaches, which rely on large language models (LLMs), are hindered by semantic ambiguities, limited understanding of structural context, and insufficient reasoning capabilities. We propose a novel repository-aware knowledge graph (KG) that accurately links repository artifacts (issues and pull requests) and entities (files, classes, and functions). A path-guided repair mechanism leverages KG-mined paths; tracing these paths lets us augment the context with relevant information and explanations.
arXiv Detail & Related papers (2025-03-27T17:21:47Z) - ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages. This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - When Large Language Models Confront Repository-Level Automatic Program
Repair: How Well They Done? [13.693311241492827]
We introduce RepoBugs, a new benchmark comprising 124 typical repository-level bugs from open-source repositories.
Preliminary experiments using GPT-3.5, given only the function where the error is located, reveal that the repair rate on RepoBugs is only 22.58%.
We propose a simple and universal repository-level context extraction method (RLCE) designed to provide more precise context for repository-level code repair tasks.
arXiv Detail & Related papers (2024-03-01T11:07:41Z) - ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in an end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z) - RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.