Pull Requests as a Training Signal for Repo-Level Code Editing
- URL: http://arxiv.org/abs/2602.07457v1
- Date: Sat, 07 Feb 2026 09:22:25 GMT
- Title: Pull Requests as a Training Signal for Repo-Level Code Editing
- Authors: Qinglin Zhu, Tianyu Chen, Shuai Lu, Lei Ji, Runcong Zhao, Murong Ma, Xiangxiang Dai, Yulan He, Lin Gui, Peng Cheng, Yeyun Gong
- Abstract summary: Clean Pull Request (Clean-PR) is a mid-training paradigm that leverages real-world GitHub pull requests as a training signal for repository-level editing. We introduce a scalable pipeline that converts noisy pull request diffs into Search/Replace edit blocks through reconstruction and validation. On SWE-bench, our model significantly outperforms the instruction-tuned baseline, achieving absolute improvements of 13.6% on SWE-bench Lite and 12.3% on SWE-bench Verified.
- Score: 49.82435173554125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Repository-level code editing requires models to understand complex dependencies and execute precise multi-file modifications across a large codebase. While recent gains on SWE-bench rely heavily on complex agent scaffolding, it remains unclear how much of this capability can be internalised via high-quality training signals. To address this, we propose Clean Pull Request (Clean-PR), a mid-training paradigm that leverages real-world GitHub pull requests as a training signal for repository-level editing. We introduce a scalable pipeline that converts noisy pull request diffs into Search/Replace edit blocks through reconstruction and validation, resulting in the largest publicly available corpus of 2 million pull requests spanning 12 programming languages. Using this training signal, we perform a mid-training stage followed by an agentless-aligned supervised fine-tuning process with error-driven data augmentation. On SWE-bench, our model significantly outperforms the instruction-tuned baseline, achieving absolute improvements of 13.6% on SWE-bench Lite and 12.3% on SWE-bench Verified. These results demonstrate that repository-level code understanding and editing capabilities can be effectively internalised into model weights under a simplified, agentless protocol, without relying on heavy inference-time scaffolding.
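The abstract mentions converting pull request diffs into Search/Replace edit blocks "through reconstruction and validation" but does not give the block format or the checks used. The following is a minimal sketch of one plausible validation step, assuming a block is a (path, search text, replacement text) triple and that a set of blocks is valid when applying it to the pre-PR files reproduces the post-PR files exactly. The names `EditBlock`, `apply_block`, and `validate_blocks` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class EditBlock:
    """Hypothetical container for one Search/Replace edit (not the paper's schema)."""
    path: str      # file the edit applies to
    search: str    # exact text expected in the pre-PR file
    replace: str   # text that should take its place


def apply_block(source: str, block: EditBlock) -> str:
    """Apply one block, requiring the search text to match exactly once."""
    count = source.count(block.search)
    if count != 1:
        # Zero matches means the block is stale; several mean it is ambiguous.
        raise ValueError(f"search text matched {count} times in {block.path}")
    return source.replace(block.search, block.replace, 1)


def validate_blocks(
    before: dict[str, str],
    after: dict[str, str],
    blocks: list[EditBlock],
) -> bool:
    """Return True if applying all blocks to `before` reproduces `after`."""
    files = dict(before)  # copy so the caller's snapshot is left untouched
    try:
        for block in blocks:
            files[block.path] = apply_block(files[block.path], block)
    except (KeyError, ValueError):
        # Unknown file, missing search text, or ambiguous match.
        return False
    return files == after
```

Under these assumptions, a pull request whose blocks fail to reproduce the merged state would be dropped or reconstructed rather than kept as a training example; the unique-match requirement is a design choice made here so that each edit is unambiguous.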
Related papers
- Refinement Provenance Inference: Detecting LLM-Refined Training Prompts from Model Behavior [58.751981587234916]
This paper formalizes the Refinement Provenance Inference (RPI) audit task. We propose RePro, a logit-based framework that fuses teacher-forced likelihood features with logit-ranking signals. During training, RePro learns a transferable representation via shadow fine-tuning, and uses a lightweight linear head to infer provenance on unseen victims without training-data access.
arXiv Detail & Related papers (2026-01-05T10:16:41Z)
- One Tool Is Enough: Reinforcement Learning for Repository-Level LLM Agents [16.281864564259827]
RepoNavigator is an agent equipped with a single execution-aware tool: jumping to the definition of an invoked symbol. RepoNavigator is trained end-to-end via Reinforcement Learning directly from a pretrained model, without any closed-source distillation.
arXiv Detail & Related papers (2025-12-24T05:27:53Z)
- Empowering RepoQA-Agent based on Reinforcement Learning Driven by Monte-carlo Tree Search [70.63903518295785]
We introduce RepoSearch-R1, a novel agentic reinforcement learning framework driven by Monte Carlo Tree Search. Based on RepoSearch-R1, we construct a RepoQA-Agent specifically designed for repository question-answering tasks.
arXiv Detail & Related papers (2025-10-30T09:10:36Z)
- Agentic Reinforcement Learning for Real-World Code Repair [7.512134741776294]
We tackle the challenge of training reliable code-fixing agents in real repositories. We developed a verifiable pipeline with success defined as post-fix build validation. We introduced a scalable simplified pipeline for large-scale reinforcement learning.
arXiv Detail & Related papers (2025-10-24T23:25:02Z)
- Reinforcement Learning for Machine Learning Engineering Agents [52.03168614623642]
We show that agents backed by weaker models that improve via reinforcement learning can outperform agents backed by much larger but static models. We propose duration-aware gradient updates in a distributed asynchronous RL framework to amplify high-cost but high-reward actions. We also propose environment instrumentation to offer partial credit, distinguishing almost-correct programs from those that fail early.
arXiv Detail & Related papers (2025-09-01T18:04:10Z)
- Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles [1.387448620257867]
Large Language Models (LLMs) have shown strong capabilities in code generation and comprehension, yet their application to complex software engineering tasks often suffers from low precision and limited interpretability. We present Repeton, a fully open-source framework that leverages LLMs for precise and automated code manipulation in real-world Git.
arXiv Detail & Related papers (2025-06-09T19:36:40Z)
- SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [56.9361004704428]
Large Language Models (LLMs) have demonstrated remarkable proficiency across a variety of complex tasks. SWE-Fixer is a novel open-source framework designed to effectively and efficiently resolve GitHub issues. We assess our approach on the SWE-Bench Lite and Verified benchmarks, achieving competitive performance among open-source models.
arXiv Detail & Related papers (2025-01-09T07:54:24Z)
- Repository Structure-Aware Training Makes SLMs Better Issue Resolver [20.095559504482885]
We introduce ReSAT (Repository Structure-Aware Training) to enhance the model's understanding of repository structure and its issue-resolving ability. We construct two types of training data: (1) localization training data, a multi-level progressive localization dataset that improves code understanding and localization capability; (2) code edit training data, which improves context-based code editing capability.
arXiv Detail & Related papers (2024-12-26T03:01:32Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.