Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition
- URL: http://arxiv.org/abs/2603.01814v1
- Date: Mon, 02 Mar 2026 12:50:40 GMT
- Title: Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition
- Authors: Mingwei Liu, Zhenxi Chen, Zheng Pei, Zihao Wang, Yanlin Wang, Zibin Zheng
- Abstract summary: RAIM is a multi-design and architecture-aware framework for repository-level feature addition. It shifts away from linear patching by generating multiple diverse implementation designs. Experiments on the NoCode-bench Verified dataset demonstrate that RAIM establishes new state-of-the-art performance.
- Score: 53.50448142467294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implementing new features across an entire codebase presents a formidable challenge for Large Language Models (LLMs). This proactive task requires a deep understanding of the global system architecture to prevent unintended disruptions to legacy functionalities. Conventional pipeline and agentic frameworks often fall short in this area because they suffer from architectural blindness and rely on greedy single-path code generation. To overcome these limitations, we propose RAIM, a multi-design and architecture-aware framework for repository-level feature addition. This framework introduces a localization mechanism that conducts multi-round explorations over a repository-scale code graph to accurately pinpoint dispersed cross-file modification targets. Crucially, RAIM shifts away from linear patching by generating multiple diverse implementation designs. The system then employs a rigorous impact-aware selection process based on static and dynamic analysis to choose the most architecturally sound patch and avoid system regressions. Comprehensive experiments on the NoCode-bench Verified dataset demonstrate that RAIM establishes a new state-of-the-art performance with a 39.47% success rate, achieving a 36.34% relative improvement over the strongest baseline. Furthermore, the approach exhibits robust generalization across various foundation models and empowers open-weight models like DeepSeek-v3.2 to surpass baseline systems powered by leading proprietary models. Detailed ablation studies confirm that the multi-design generation and impact validation modules are critical to effectively managing complex dependencies and reducing code errors. These findings highlight the vital role of structural awareness in automated software evolution.
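The abstract's core loop, generating several diverse implementation designs and then choosing the most architecturally sound patch via static and dynamic impact analysis, can be sketched as follows. This is a minimal illustrative sketch, not RAIM's actual implementation: the `Candidate` fields, the stand-in scoring signals (static issues flagged, regression tests passed), and the tie-breaking rule are all assumptions for exposition.

```python
# Hypothetical sketch of multi-design generation with impact-aware
# selection, in the spirit of the RAIM abstract. The candidate pool,
# scoring signals, and tie-breaking rule are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Candidate:
    design: str         # natural-language implementation plan
    patch: str          # patch produced from the plan (placeholder here)
    static_issues: int  # issues flagged by static impact analysis
    tests_passed: int   # regression tests passing after applying the patch


def select_patch(candidates):
    """Impact-aware selection: prefer the patch passing the most
    regression tests; break ties by fewest static-analysis issues."""
    return max(candidates, key=lambda c: (c.tests_passed, -c.static_issues))


# Toy pool of three diverse designs for a single feature request.
pool = [
    Candidate("extend base class", "diff-a", static_issues=2, tests_passed=40),
    Candidate("add plugin hook",   "diff-b", static_issues=0, tests_passed=42),
    Candidate("copy-paste module", "diff-c", static_issues=5, tests_passed=42),
]

best = select_patch(pool)
print(best.design)  # prints "add plugin hook"
```

The selection rule favors "add plugin hook" over "copy-paste module": both pass 42 tests, but the former triggers no static-analysis issues, mirroring the abstract's preference for architecturally sound patches over superficially working ones.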
Related papers
- A Systematic Study of LLM-Based Architectures for Automated Patching [7.9821766277253845]
We present a controlled evaluation of four large language model (LLM)-based patching paradigms. We analyze patch correctness, failure modes, token usage, and execution time across real-world vulnerability tasks. Our results reveal clear architectural trade-offs: fixed pipelines are efficient but brittle, single-agent systems balance flexibility and cost, and multi-agent designs improve at the expense of substantially higher overhead.
arXiv Detail & Related papers (2026-03-01T20:26:22Z)
- Integrating Diverse Assignment Strategies into DETRs [61.61489761918158]
Label assignment is a critical component in object detectors, particularly within DETR-style frameworks. We propose LoRA-DETR, a flexible and lightweight framework that seamlessly integrates diverse assignment strategies into any DETR-style detector.
arXiv Detail & Related papers (2026-01-14T07:28:54Z)
- Iterative Structured Pruning for Large Language Models with Multi-Domain Calibration [73.40887151631088]
Large Language Models (LLMs) have achieved remarkable success across a wide spectrum of natural language processing tasks. Their ever-growing scale introduces significant barriers to real-world deployment, including substantial computational overhead, memory footprint, and inference latency. In this work, we explore structured pruning, which eliminates entire architectural components and maintains compatibility with standard hardware accelerators.
arXiv Detail & Related papers (2026-01-06T03:09:31Z)
- AR-MOT: Autoregressive Multi-object Tracking [56.09738000988466]
We propose a novel autoregressive paradigm that formulates MOT as a sequence generation task within a large language model (LLM) framework. This design enables the model to output structured results through flexible sequence construction, without requiring any task-specific heads. To enhance region-level visual perception, we introduce an Object Tokenizer based on a pretrained detector.
arXiv Detail & Related papers (2026-01-05T09:17:28Z)
- Towards Understanding Bugs in Distributed Training and Inference Frameworks for Large Language Models [7.486731499255164]
This paper conducts the first large-scale empirical analysis of 308 fixed bugs across three popular distributed training/inference frameworks: DeepSpeed, Megatron-LM, and Colossal-AI. We examine bug symptoms, root causes, bug identification and fixing efforts, and common low-effort fixing strategies.
arXiv Detail & Related papers (2025-06-12T07:24:59Z)
- Patchwork: A Unified Framework for RAG Serving [6.430565435912026]
Retrieval Augmented Generation (RAG) has emerged as a new paradigm for enhancing Large Language Model reliability through integration with external knowledge sources. We introduce Patchwork, a comprehensive end-to-end RAG serving framework designed to address these efficiency bottlenecks.
arXiv Detail & Related papers (2025-05-01T18:58:26Z)
- An Efficient and Mixed Heterogeneous Model for Image Restoration [71.85124734060665]
Current mainstream approaches are based on three architectural paradigms: CNNs, Transformers, and Mambas. We propose RestorMixer, an efficient and general-purpose IR model based on mixed-architecture fusion.
arXiv Detail & Related papers (2025-04-15T08:19:12Z)
- Restore Anything Model via Efficient Degradation Adaptation [129.38475243424563]
RAM takes a unified path that leverages inherent similarities across various degradations to enable efficient and comprehensive restoration. Experiments confirm RAM's SOTA performance, reducing model complexity by approximately 82% in trainable parameters and 85% in FLOPs.
arXiv Detail & Related papers (2024-07-18T10:26:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.