From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning
- URL: http://arxiv.org/abs/2601.13384v1
- Date: Mon, 19 Jan 2026 20:33:53 GMT
- Title: From Completion to Editing: Unlocking Context-Aware Code Infilling via Search-and-Replace Instruction Tuning
- Authors: Jiajun Zhang, Zeyu Cui, Jiaxi Yang, Lei Zhang, Yuheng Jing, Zeyao Ma, Tianyi Bai, Zilei Wang, Qiang Liu, Liang Wang, Binyuan Hui, Junyang Lin
- Abstract summary: We propose a framework that internalizes the agentic verification-and-editing mechanism into a unified, single-pass inference process. With minimal data, SRI-Coder enables Chat models to surpass the completion performance of their Base counterparts. Unlike FIM-style tuning, SRI preserves general coding competencies and maintains inference latency comparable to standard FIM.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The dominant Fill-in-the-Middle (FIM) paradigm for code completion is constrained by its rigid inability to correct contextual errors and reliance on unaligned, insecure Base models. While Chat LLMs offer safety and Agentic workflows provide flexibility, they suffer from performance degradation and prohibitive latency, respectively. To resolve this dilemma, we propose Search-and-Replace Infilling (SRI), a framework that internalizes the agentic verification-and-editing mechanism into a unified, single-pass inference process. By structurally grounding edits via an explicit search phase, SRI harmonizes completion tasks with the instruction-following priors of Chat LLMs, extending the paradigm from static infilling to dynamic context-aware editing. We synthesize a high-quality dataset, SRI-200K, and fine-tune the SRI-Coder series. Extensive evaluations demonstrate that with minimal data (20k samples), SRI-Coder enables Chat models to surpass the completion performance of their Base counterparts. Crucially, unlike FIM-style tuning, SRI preserves general coding competencies and maintains inference latency comparable to standard FIM. We empower the entire Qwen3-Coder series with SRI, encouraging the developer community to leverage this framework for advanced auto-completion and assisted development.
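The abstract describes edits that are "structurally grounded via an explicit search phase": rather than emitting raw infill text, the model emits a search block that must match existing code and a replace block that supersedes it. The paper does not specify the exact edit format here, so the following is only a minimal illustrative sketch of how such a search-and-replace edit could be applied; the function name and the uniqueness check are assumptions, not the authors' implementation.

```python
def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one search-and-replace edit, rejecting it if the search
    block does not match the source exactly once (the grounding step)."""
    count = source.count(search)
    if count == 0:
        raise ValueError("search block not found; edit rejected")
    if count > 1:
        raise ValueError("search block is ambiguous; edit rejected")
    return source.replace(search, replace, 1)

# Hypothetical example: the model corrects a contextual error (an
# undefined variable name) while infilling, which static FIM-style
# completion cannot express as it can only append text at the cursor.
code = "def area(r):\n    return 3.14159 * radius ** 2\n"
edited = apply_search_replace(
    code,
    search="return 3.14159 * radius ** 2",
    replace="return 3.14159 * r ** 2",
)
```

Because the search phase must match the existing context verbatim, a malformed or stale edit fails loudly instead of silently corrupting the file, which is the verification behavior the abstract attributes to agentic workflows.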
Related papers
- SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration [7.89414068452646]
SWE-CI is the first repository-level benchmark built upon the Continuous Integration loop. It aims to shift the evaluation paradigm for code generation from static, short-term functional correctness toward dynamic, long-term maintainability.
arXiv Detail & Related papers (2026-03-04T08:20:25Z) - OMG-Agent: Toward Robust Missing Modality Generation with Decoupled Coarse-to-Fine Agentic Workflows [9.617220633655716]
We present the Omni-Modality Generation Agent (OMG-Agent).
arXiv Detail & Related papers (2026-02-04T02:25:40Z) - RealSec-bench: A Benchmark for Evaluating Secure Code Generation in Real-World Repositories [58.32028251925354]
Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, but their proficiency in producing secure code remains a critical, under-explored area. We introduce RealSec-bench, a new benchmark for secure code generation meticulously constructed from real-world, high-risk Java repositories.
arXiv Detail & Related papers (2026-01-30T08:29:01Z) - Veri-Sure: A Contract-Aware Multi-Agent Framework with Temporal Tracing and Formal Verification for Correct RTL Code Generation [4.723302382132762]
Silicon-grade correctness remains bottlenecked by: (i) limited test coverage and reliability of simulation-centric evaluation, (ii) regressions and repair hallucinations, and (iii) semantic drift as intent is reinterpreted across agent handoffs. We propose Veri-Sure, a multi-agent framework that establishes a design contract to align agents' intent and uses a patching mechanism guided by static dependency slicing to perform precise, localized repairs.
arXiv Detail & Related papers (2026-01-27T16:10:23Z) - PRISM: Purified Representation and Integrated Semantic Modeling for Generative Sequential Recommendation [28.629759086187352]
We propose a novel generative recommendation framework, PRISM, with Purified Representation and Integrated Semantic Modeling. PRISM consistently outperforms state-of-the-art baselines across four real-world datasets.
arXiv Detail & Related papers (2026-01-23T08:50:16Z) - AgentCyTE: Leveraging Agentic AI to Generate Cybersecurity Training & Experimentation Scenarios [0.19999259391104388]
We present AgentCyTE, a framework integrating large language models with deterministic, schema-constrained network emulation. AgentCyTE observes scenario outcomes, validates correctness, and iteratively enhances realism and consistency.
arXiv Detail & Related papers (2025-10-29T05:44:12Z) - CIR-CoT: Towards Interpretable Composed Image Retrieval via End-to-End Chain-of-Thought Reasoning [93.05917922306196]
Composed Image Retrieval (CIR) aims to find a target image from a reference image and a modification text. CIR-CoT is the first end-to-end retrieval-oriented MLLM designed to integrate explicit Chain-of-Thought (CoT) reasoning.
arXiv Detail & Related papers (2025-10-09T09:41:45Z) - Joint Learning using Mixture-of-Expert-Based Representation for Enhanced Speech Generation and Robust Emotion Recognition [54.44798086835314]
Speech emotion recognition (SER) plays a critical role in building emotion-aware speech systems, but its performance degrades significantly under noisy conditions. We propose the Sparse Mixture-of-Experts Representation Integration Technique (Sparse MERIT), a flexible MTL framework that applies frame-wise expert routing over self-supervised speech representations. Experiments on the MSP-Podcast corpus show that Sparse MERIT consistently outperforms baseline models on both SER and SE tasks.
arXiv Detail & Related papers (2025-09-10T10:18:56Z) - Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition [95.54406667705999]
Pangu Embedded is an efficient Large Language Model (LLM) reasoner developed on Ascend Neural Processing Units (NPUs). It addresses the significant computational costs and inference latency challenges prevalent in existing reasoning-optimized LLMs. It delivers rapid responses and state-of-the-art reasoning quality within a single, unified model architecture.
arXiv Detail & Related papers (2025-05-28T14:03:02Z) - MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning [36.3918410061572]
MA-RAG addresses the inherent ambiguities and reasoning challenges in complex information-seeking tasks. Unlike conventional RAG methods that rely on end-to-end fine-tuning or isolated component enhancements, MA-RAG orchestrates a collaborative set of specialized AI agents. Our results highlight the effectiveness of collaborative, modular reasoning in retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-26T15:05:18Z) - Can Code Language Models Learn Clarification-Seeking Behaviors? [4.788534218705066]
We present ClarifyCoder, a framework with synthetic data generation and instruction-tuning. We show that ClarifyCoder achieves a 63% communication rate and a 52% good question rate on ambiguous tasks.
arXiv Detail & Related papers (2025-04-23T00:34:39Z) - Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs). RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions. We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA). We introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
arXiv Detail & Related papers (2025-04-21T04:56:47Z) - SagaLLM: Context Management, Validation, and Transaction Guarantees for Multi-Agent LLM Planning [2.1331883629523634]
SagaLLM is a structured multi-agent architecture designed to address four foundational limitations of current LLM-based planning systems. It bridges this gap by integrating the Saga transactional pattern with persistent memory, automated compensation, and independent validation agents. It achieves significant improvements in consistency, validation accuracy, and adaptive coordination under uncertainty.
arXiv Detail & Related papers (2025-03-15T01:43:03Z) - Semantic Integrity Constraints: Declarative Guardrails for AI-Augmented Data Processing Systems [39.23499993745249]
We introduce semantic integrity constraints (SICs) for specifying and enforcing correctness conditions over LLM outputs in semantic queries. SICs generalize traditional database integrity constraints to semantic settings, supporting common types of constraints, such as grounding, soundness, and exclusion. We present a system design for integrating SICs into query planning and runtime and discuss its realization in AI-augmented DPSs.
arXiv Detail & Related papers (2025-03-01T19:59:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.