MaintainCoder: Maintainable Code Generation Under Dynamic Requirements
- URL: http://arxiv.org/abs/2503.24260v1
- Date: Mon, 31 Mar 2025 16:06:47 GMT
- Title: MaintainCoder: Maintainable Code Generation Under Dynamic Requirements
- Authors: Zhengren Wang, Rui Ling, Chufan Wang, Yongan Yu, Zhiyu Li, Feiyu Xiong, Wentao Zhang
- Abstract summary: We propose MaintainCoder to handle dynamic requirements with minimal rework. It integrates the Waterfall model, design patterns, and multi-agent collaboration. Our work not only lays the foundation for maintainable code generation, but also highlights the need for more holistic code quality research.
- Score: 22.419305172274512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern code generation has made significant strides in functional correctness and execution efficiency. However, these systems often overlook a critical dimension of real-world software development: maintainability. To handle dynamic requirements with minimal rework, we propose MaintainCoder as a pioneering solution. It integrates the Waterfall model, design patterns, and multi-agent collaboration to systematically enhance cohesion, reduce coupling, and improve adaptability. We also introduce MaintainBench, a benchmark comprising requirement changes and corresponding dynamic metrics of maintenance effort. Experiments demonstrate that existing code generation methods struggle to meet maintainability standards when requirements evolve. In contrast, MaintainCoder improves maintainability metrics by 14-30% while also achieving higher correctness (pass@k). Our work not only lays the foundation for maintainable code generation, but also highlights the need for more holistic code quality research. Resources: https://github.com/IAAR-Shanghai/MaintainCoder.
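The pass@k correctness metric cited above is conventionally computed with the unbiased estimator of Chen et al. (2021): given n generated samples per problem, of which c pass all tests, pass@k = 1 - C(n-c, k)/C(n, k). The abstract does not restate the formula, so the following minimal sketch assumes the standard convention:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn from n generations (c of them correct) passes."""
    if n - c < k:
        return 1.0  # every size-k subset contains a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples per problem, 37 passing, budget k = 10
print(round(pass_at_k(200, 37, 10), 4))
```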
Related papers
- ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination [19.115931862737508]
We introduce ReCoDe, a decentralized, hybrid framework that merges the reliability of optimization-based controllers with the adaptability of reinforcement learning. In this work, we focus on applications of ReCoDe to multi-agent navigation tasks requiring intricate, context-based movements and consensus. We give empirical (real robot) and theoretical evidence that retaining a user-defined controller, even when it is imperfect, is more efficient than learning from scratch.
arXiv Detail & Related papers (2025-07-25T10:47:39Z) - UniConFlow: A Unified Constrained Generalization Framework for Certified Motion Planning with Flow Matching Models [16.275286046169594]
Generative models have become increasingly powerful tools for robot motion generation, enabling flexible and multimodal trajectory generation across various tasks. This paper proposes UniConFlow, a unified flow matching framework for trajectory generation that systematically incorporates both equality and inequality constraints. We evaluate on mobile navigation and high-dimensional manipulation tasks, demonstrating improved safety and feasibility compared to state-of-the-art constrained generative planners.
arXiv Detail & Related papers (2025-06-03T14:48:04Z) - Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
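The summary does not specify REAL's reward design. Purely as a hedged illustration of combining program-analysis feedback with functional tests, a blended reward might look like the sketch below; the weights and signal names are assumptions, not the paper's method:

```python
def quality_reward(tests_passed: int, tests_total: int,
                   analysis_warnings: int, max_warnings: int = 20) -> float:
    """Toy reward blending functional and static-analysis signals.
    The 0.7/0.3 weighting is illustrative, not from the paper."""
    functional = tests_passed / max(tests_total, 1)
    analysis = 1.0 - min(analysis_warnings, max_warnings) / max_warnings
    return 0.7 * functional + 0.3 * analysis

print(quality_reward(tests_passed=8, tests_total=10, analysis_warnings=4))  # 0.8
```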
arXiv Detail & Related papers (2025-05-28T17:57:47Z) - SolBench: A Dataset and Benchmark for Evaluating Functional Correctness in Solidity Code Completion and Repair [51.0686873716938]
We introduce SolBench, a benchmark for evaluating the functional correctness of Solidity smart contracts generated by code completion models.
We also propose a Retrieval-Augmented Code Repair framework to improve the functional correctness of generated smart contracts.
Results show that code repair and retrieval techniques effectively enhance the correctness of smart contract completion while reducing computational costs.
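As a sketch of how such a verify-then-repair loop with retrieval might be wired (all callables here are hypothetical placeholders, not SolBench's actual API):

```python
from typing import Callable, List

def complete_with_repair(
    spec: str,
    generate: Callable[[str], str],
    run_checks: Callable[[str], List[str]],
    retrieve: Callable[[str, List[str]], List[str]],
    repair: Callable[[str, List[str], List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Generate a completion, then iteratively verify and repair it
    using error feedback plus retrieved reference snippets."""
    code = generate(spec)
    for _ in range(max_rounds):
        failures = run_checks(code)          # empty list means functionally correct
        if not failures:
            break
        snippets = retrieve(spec, failures)  # fetch similar verified code
        code = repair(code, failures, snippets)
    return code
```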
arXiv Detail & Related papers (2025-03-03T01:55:20Z) - CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation [24.090719826360342]
We introduce CodeIF, the first benchmark designed to assess the abilities of Large Language Models (LLMs) to adhere to task-oriented instructions within code generation scenarios.
We conduct extensive experiments with LLMs, analyzing their strengths and limitations in meeting the demands of these tasks.
arXiv Detail & Related papers (2025-02-26T14:19:49Z) - ConAIR:Consistency-Augmented Iterative Interaction Framework to Enhance the Reliability of Code Generation [17.68163468068264]
We propose ConAIR, a Consistency-Augmented Iterative Interaction Framework that enhances the reliability of code generation.
We show that with minimal human effort, performance can be significantly enhanced.
arXiv Detail & Related papers (2024-11-23T15:26:24Z) - See-Saw Generative Mechanism for Scalable Recursive Code Generation with Generative AI [0.0]
This paper introduces the See-Saw generative mechanism, a novel methodology for dynamic and iterative code generation.
The proposed approach alternates between main code updates and dependency generation to ensure alignment and functionality.
The mechanism ensures that all code components are synchronized and functional, enabling scalable and efficient project generation.
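A minimal sketch of that alternation, with the generators passed in as hypothetical callables (not the paper's actual interfaces):

```python
from typing import Callable, Dict, Tuple

def see_saw(
    spec: str,
    gen_main: Callable[[str, Dict[str, str]], str],
    gen_deps: Callable[[str, str], Dict[str, str]],
    consistent: Callable[[str, Dict[str, str]], bool],
    rounds: int = 4,
) -> Tuple[str, Dict[str, str]]:
    """Alternate: regenerate dependencies against the main code,
    then update the main code against the fresh dependencies."""
    main, deps = gen_main(spec, {}), {}
    for _ in range(rounds):
        deps = gen_deps(spec, main)
        main = gen_main(spec, deps)
        if consistent(main, deps):
            break
    return main, deps
```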
arXiv Detail & Related papers (2024-11-16T18:54:56Z) - LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging [80.17238673443127]
LiNeS is a post-training editing technique designed to preserve pre-trained generalization while enhancing fine-tuned task performance. LiNeS demonstrates significant improvements in both single-task and multi-task settings across various benchmarks in vision and natural language processing.
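The core idea is to scale post-training weight updates layer by layer. A hedged sketch of a linear depth schedule applied to the fine-tuning residual follows; the exact parameterization in the paper may differ:

```python
from typing import List
import numpy as np

def lines_scale(pre: List[np.ndarray], ft: List[np.ndarray],
                alpha: float = 0.1) -> List[np.ndarray]:
    """Scale each layer's fine-tuning residual (ft - pre) linearly with depth:
    shallow layers stay close to the pre-trained weights (preserving
    generalization), deep layers keep most of the fine-tuned update."""
    L = len(pre)
    out = []
    for l, (w0, w1) in enumerate(zip(pre, ft)):
        lam = alpha + (1.0 - alpha) * l / max(L - 1, 1)  # alpha at layer 0, 1.0 at the top
        out.append(w0 + lam * (w1 - w0))
    return out
```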
arXiv Detail & Related papers (2024-10-22T16:26:05Z) - Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z) - MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents [7.4159044558995335]
We introduce MOSS (llM-oriented Operating System Simulation), a novel framework that integrates code generation with a dynamic context management system.
At its core, the framework employs an Inversion of Control container in conjunction with decorators to enforce the least knowledge principle.
We show how this framework can enhance the efficiency and capabilities of agent development and highlight its advantages in moving towards Turing-complete agents.
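To illustrate the pattern named above (an Inversion of Control container plus decorators enforcing the least knowledge principle), here is a toy sketch; the registry design and names are invented for illustration, not MOSS's actual API:

```python
import inspect

_REGISTRY = {}  # interface name -> provider

def provide(name, obj):
    """Register a capability under an interface name."""
    _REGISTRY[name] = obj

def inject(fn):
    """Decorator: supply only the dependencies the function declares by
    parameter name, so it never sees the container as a whole."""
    params = list(inspect.signature(fn).parameters)
    def wrapper(**kwargs):
        for name in params:
            if name not in kwargs and name in _REGISTRY:
                kwargs[name] = _REGISTRY[name]
        return fn(**kwargs)
    return wrapper

provide("filesystem", {"read": lambda path: f"<contents of {path}>"})

@inject
def summarize(path, filesystem):
    # The agent code only declares 'filesystem'; the container injects it.
    return filesystem["read"](path)

print(summarize(path="README.md"))
```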
arXiv Detail & Related papers (2024-09-24T14:30:21Z) - Better Python Programming for all: With the focus on Maintainability [5.669063174637433]
This study aims to enhance the maintainability of code generated by Large Language Models (LLMs).
Our approach uses a dataset specifically designed for training and evaluating the model.
After fine-tuning an LLM to prioritize code maintainability, our evaluations indicate that this model significantly improves code maintainability standards.
arXiv Detail & Related papers (2024-08-17T08:14:22Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned with the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - On the Impacts of Contexts on Repository-Level Code Generation [5.641402231731082]
We present RepoExec, a novel benchmark designed to evaluate repository-level code generation. We focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts.
arXiv Detail & Related papers (2024-06-17T10:45:22Z) - SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that uses multiple Large Language Model (LLM) agents to emulate software process models.
We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
arXiv Detail & Related papers (2024-03-23T14:04:48Z) - MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks [50.61968901704187]
We introduce a framework for MoT instruction tuning, designed to promote the decomposition of tasks into logical sub-tasks and sub-modules. Our investigations reveal that, through the cultivation and utilization of sub-modules, MoTCoder significantly improves both the modularity and correctness of the generated solutions.
arXiv Detail & Related papers (2023-12-26T08:49:57Z) - ReCode: Robustness Evaluation of Code Generation Models [90.10436771217243]
We propose ReCode, a comprehensive robustness evaluation benchmark for code generation models.
We customize over 30 transformations specifically for code, targeting docstrings, function and variable names, code syntax, and code format.
With human annotators, we verified that over 90% of the perturbed prompts do not alter the semantic meaning of the original prompt.
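For flavor, one semantics-preserving perturbation in this spirit is renaming a variable throughout a prompt. ReCode ships 30+ curated transformations, so this regex-based rename is only a simplified stand-in:

```python
import re

def rename_variable(prompt: str, old: str, new: str) -> str:
    """Replace whole-word occurrences of a variable name in a code prompt."""
    return re.sub(rf"\b{re.escape(old)}\b", new, prompt)

prompt = '''def sum_list(nums):
    """Return the sum of nums."""
    total = 0
    for x in nums:
        total += x
    return total'''
print(rename_variable(prompt, "nums", "values"))
```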
arXiv Detail & Related papers (2022-12-20T14:11:31Z) - COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics [69.8062252611486]
COLD decoding is a flexible framework that can be applied directly to off-the-shelf left-to-right language models.
Our experiments on constrained generation tasks point to the effectiveness of our approach, in terms of both automatic and human evaluation.
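The Langevin-dynamics core can be illustrated on a toy quadratic energy. In COLD, the energy combines fluency and constraint terms over soft token representations; this sketch only shows the update rule itself:

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([2.0, -1.0, 0.5])   # stand-in for a constraint
grad_E = lambda y: y - target          # gradient of E(y) = 0.5 * ||y - target||^2

y = rng.normal(size=3)                 # soft "sequence" initialization
eta, sigma = 0.1, 0.01                 # step size and noise scale
for _ in range(500):
    y = y - eta * grad_E(y) + sigma * rng.normal(size=3)  # Langevin step

print(y.round(2))  # settles near the energy minimum, i.e., the constraint
```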
arXiv Detail & Related papers (2022-02-23T18:59:27Z) - NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics [73.96837492216204]
We propose NeuroLogic A*esque, a decoding algorithm that incorporates estimates of future cost.
We develop lookahead heuristics that are efficient for large-scale language models.
Our approach outperforms competitive baselines on five generation tasks, and achieves new state-of-the-art performance on table-to-text generation, constrained machine translation, and keyword-constrained generation.
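The lookahead idea can be sketched on a toy bigram LM: score each candidate token by its log-probability plus a bonus when a cheap greedy rollout shows a required keyword is still reachable. The toy model below is invented for illustration; the paper's heuristics are considerably more elaborate:

```python
import math

BIGRAMS = {  # toy bigram LM: token -> {next_token: probability}
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a":   {"dog": 0.9, "cat": 0.1},
    "cat": {"sat": 1.0}, "dog": {"ran": 1.0},
    "sat": {"</s>": 1.0}, "ran": {"</s>": 1.0},
}

def greedy_rollout(token, depth):
    """Cheap lookahead: follow the most likely continuation."""
    out = []
    for _ in range(depth):
        dist = BIGRAMS.get(token, {"</s>": 1.0})
        token = max(dist, key=dist.get)
        out.append(token)
    return out

def choose_next(prev, keyword, depth=3, lam=2.0):
    """Pick the next token by logprob + lambda * [keyword reachable ahead]."""
    best, best_score = None, -math.inf
    for tok, p in BIGRAMS[prev].items():
        future = greedy_rollout(tok, depth)
        bonus = lam if tok == keyword or keyword in future else 0.0
        score = math.log(p) + bonus
        if score > best_score:
            best, best_score = tok, score
    return best

print(choose_next("<s>", keyword="dog"))  # "a": lower logprob, but lookahead reaches "dog"
```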
arXiv Detail & Related papers (2021-12-16T09:22:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.