Related papers: DynaFix: Iterative Automated Program Repair Driven by Execution-Level Dynamic Information

DynaFix: Iterative Automated Program Repair Driven by Execution-Level Dynamic Information

URL: http://arxiv.org/abs/2512.24635v1
Date: Wed, 31 Dec 2025 05:13:34 GMT
Title: DynaFix: Iterative Automated Program Repair Driven by Execution-Level Dynamic Information
Authors: Zhili Huang, Ling Xu, Chao Liu, Weifeng Sun, Xu Zhang, Yan Lei, Meng Yan, Hongyu Zhang,
Abstract summary: Automated Program Repair (APR) aims to automatically generate correct patches for buggy programs.<n>Recent approaches leveraging large language models (LLMs) have shown promise but face limitations.<n>We propose DynaFix, an execution-level dynamic information-driven APR method that iteratively leverages runtime information to refine the repair process.
Score: 20.300297868395454
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Automated Program Repair (APR) aims to automatically generate correct patches for buggy programs. Recent approaches leveraging large language models (LLMs) have shown promise but face limitations. Most rely solely on static analysis, ignoring runtime behaviors. Some attempt to incorporate dynamic signals, but these are often restricted to training or fine-tuning, or injected only once into the repair prompt, without iterative use. This fails to fully capture program execution. Current iterative repair frameworks typically rely on coarse-grained feedback, such as pass/fail results or exception types, and do not leverage fine-grained execution-level information effectively. As a result, models struggle to simulate human stepwise debugging, limiting their effectiveness in multi-step reasoning and complex bug repair. To address these challenges, we propose DynaFix, an execution-level dynamic information-driven APR method that iteratively leverages runtime information to refine the repair process. In each repair round, DynaFix captures execution-level dynamic information such as variable states, control-flow paths, and call stacks, transforming them into structured prompts to guide LLMs in generating candidate patches. If a patch fails validation, DynaFix re-executes the modified program to collect new execution information for the next attempt. This iterative loop incrementally improves patches based on updated feedback, similar to the stepwise debugging practices of human developers. We evaluate DynaFix on the Defects4J v1.2 and v2.0 benchmarks. DynaFix repairs 186 single-function bugs, a 10% improvement over state-of-the-art baselines, including 38 bugs previously unrepaired. It achieves correct patches within at most 35 attempts, reducing the patch search space by 70% compared with existing methods, thereby demonstrating both effectiveness and efficiency in repairing complex bugs.

Related papers

Specification Vibing for Automated Program Repair [8.68148153927532]
VibeRepair is a specification-centric APR technique that treats repair as behavior-specification repair rather than ad-hoc code editing.<n>On Defects4J v1.2, VibeRepair correctly repairs 174 bugs, exceeding the strongest state-of-the-art baseline by 28 bugs.<n>On Defects4J v2.0, it repairs 178 bugs, outperforming prior approaches by 33 bugs, representing a 23% improvement.
arXiv Detail & Related papers (2026-02-09T04:44:58Z)
InspectCoder: Dynamic Analysis-Enabled Self Repair through interactive LLM-Debugger Collaboration [71.18377595277018]
Large Language Models (LLMs) frequently generate buggy code with complex logic errors that are challenging to diagnose.<n>We present InspectCoder, the first agentic program repair system that empowers LLMs to actively conduct dynamic analysis via interactive debugger control.
arXiv Detail & Related papers (2025-10-21T06:26:29Z)
PathFix: Automated Program Repair with Expected Path [17.544454427324712]
We introduce a new APR method named PathFix to generate patches for repairing buggy code.<n>It is based on one observation: if a buggy program is repairable, at least one expected path is supposed to replace the fault path in the patched program.<n> Experimental results show that PathFix outperforms existing solutions, particularly in handling complex program structures.
arXiv Detail & Related papers (2025-10-16T06:21:49Z)
Repair Ingredients Are All You Need: Improving Large Language Model-Based Program Repair via Repair Ingredients Search [41.50068103527948]
We propose ReinFix, a framework that searches for repair ingredients throughout the reasoning and solution phases of bug fixing.<n>During the solution phase, ReinFix searches for external ingredients from historical bug fixes with similar bug patterns.<n> Evaluations on two popular benchmarks demonstrate the effectiveness of our approach over SOTA baselines.
arXiv Detail & Related papers (2025-06-29T06:02:11Z)
The Art of Repair: Optimizing Iterative Program Repair with Instruction-Tuned Models [48.073219761367184]
We investigate an APR pipeline that balances the generation of multiple outputs and multiple rounds of iteration.<n>We fine-tune each model on an APR dataset with three sizes (1K, 30K, 65K) and two techniques (Full Fine-Tuning and LoRA)<n>Our results show that by using only a fraction (1%) of the fine-tuning dataset, we can achieve improvements of up to 78% in the number of plausible patches generated.
arXiv Detail & Related papers (2025-05-05T18:06:51Z)
Show Me Why It's Correct: Saving 1/3 of Debugging Time in Program Repair with Interactive Runtime Comparison [18.933377426587015]
We propose an interactive approach called iFix to facilitate patch understanding and comparison.<n>iFix performs static analysis to identify runtime variables related to the buggy statement.<n>It captures runtime values during execution for each patch, allowing users to compare and contrast their runtime behavior.
arXiv Detail & Related papers (2025-03-01T20:52:49Z)
ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.<n>This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
Towards Practical and Useful Automated Program Repair for Debugging [4.216808129651161]
PracAPR is an interactive repair system that works in an Integrated Development Environment (IDE) PracAPR does not require a test suite or program re-execution.
arXiv Detail & Related papers (2024-07-12T03:19:54Z)
VDebugger: Harnessing Execution Feedback for Debugging Visual Programs [103.61860743476933]
We introduce V Debugger, a critic-refiner framework trained to localize and debug visual programs by tracking execution step by step. V Debugger identifies and corrects program errors leveraging detailed execution feedback, improving interpretability and accuracy. Evaluations on six datasets demonstrate V Debugger's effectiveness, showing performance improvements of up to 3.2% in downstream task accuracy.
arXiv Detail & Related papers (2024-06-19T11:09:16Z)
NExT: Teaching Large Language Models to Reason about Code Execution [50.93581376646064]
Large language models (LLMs) of code are typically trained on the surface textual form of programs. We propose NExT, a method to teach LLMs to inspect the execution traces of programs and reason about their run-time behavior.
arXiv Detail & Related papers (2024-04-23T01:46:32Z)
Boosting Redundancy-based Automated Program Repair by Fine-grained Pattern Mining [18.7107522872479]
We propose a new repair technique named Repatt, which incorporates a two-level pattern mining process for guiding effective patch generation.<n>We have conducted an experiment on the widely-used Defects4J benchmark and compared Repatt with ten state-of-the-art APR approaches.
arXiv Detail & Related papers (2023-12-26T08:42:32Z)
RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen) RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs. We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.