Related papers: Lifecycle-Aware code generation: Leveraging Software Engineering Phases in LLMs

URL: http://arxiv.org/abs/2510.24019v1
Date: Tue, 28 Oct 2025 02:54:02 GMT
Title: Lifecycle-Aware code generation: Leveraging Software Engineering Phases in LLMs
Authors: Xing Xing, Wei Wang, Lipeng Ma, Weidong Yang, Junjie Zheng
Abstract summary: We introduce a lifecycle-aware framework that incorporates intermediate artifacts into both the training and inference stages. Experiments show that lifecycle-level fine-tuning improves code correctness by up to 75% over the same model before fine-tuning. Open-source LLMs, once fine-tuned under our framework, match or slightly outperform models pretrained on code.
Score: 12.70863561286374
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent progress in large language models (LLMs) has advanced automatic code generation, yet most approaches rely on direct, single-step translation from problem descriptions to code, disregarding structured software engineering practices. We introduce a lifecycle-aware framework that systematically incorporates intermediate artifacts such as requirements analysis, state machine modeling, and pseudocode into both the training and inference stages. This design aligns code generation with standard software development phases and enables more structured reasoning. Experiments show that lifecycle-level fine-tuning improves code correctness by up to 75% over the same model before fine-tuning, with performance gains compounding across intermediate stages. Multi-step inference consistently surpasses single-step generation, demonstrating the effectiveness of intermediate scaffolding. Notably, open-source LLMs, once fine-tuned under our framework, match or slightly outperform models pretrained on code. When applied to DeepSeek-Coder-1.3B, our framework yields relative CodeBLEU improvements of 34.3%, 20.0%, 11.2%, and 22.3% over ChatGPT-3.5, ChatGPT-4o-mini, DeepSeek-R1, and LLaMA-8B, respectively. Our pipeline also proves robust with up to 80% less training data, confirming its resilience. Ablation studies further reveal that each intermediate artifact contributes distinctly to final code quality, with state machine modeling yielding the most substantial impact. Our source code and detailed experimental data are available at https://anonymous.4open.science/r/Lifecycle-Aware-3CCB.
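The multi-step inference described in the abstract can be pictured as a chain in which each intermediate artifact is fed into the prompt for the next stage. The following is only a minimal sketch under assumptions, not the authors' implementation: the stage names are taken from the abstract, while `run_pipeline`, `stub_generate`, and the prompt wording are hypothetical placeholders, and a real setup would pass a call to a fine-tuned model in place of the stub.

```python
from typing import Callable, Dict, List

# Stage order taken from the abstract; the identifiers are illustrative.
STAGES: List[str] = ["requirements", "state_machine", "pseudocode", "code"]


def run_pipeline(
    problem: str,
    generate: Callable[[str], str],
    stages: List[str] = STAGES,
) -> Dict[str, str]:
    """Run the stages in order, feeding earlier artifacts into later prompts."""
    artifacts: Dict[str, str] = {}
    context = f"Problem:\n{problem}"
    for stage in stages:
        # Hypothetical prompt format: accumulated context plus a stage request.
        prompt = f"{context}\n\nProduce the {stage} artifact."
        output = generate(prompt)
        artifacts[stage] = output
        # Append the new artifact so the next stage can condition on it.
        context += f"\n\n[{stage}]\n{output}"
    return artifacts


def stub_generate(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): echoes the requested stage name."""
    # The stage name is the third word of the final prompt line.
    stage = prompt.splitlines()[-1].split()[2]
    return f"<{stage} draft>"


if __name__ == "__main__":
    for name, text in run_pipeline("Reverse a string.", stub_generate).items():
        print(name, "->", text)
```

Single-step generation corresponds to calling `run_pipeline` with `stages=["code"]`, which gives a simple way to compare the two modes with the same generator.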

Related papers

Environment-Aware Code Generation: How far are We? [52.69113158357018]
It is unclear whether large language models (LLMs) can reliably generate executable code tailored to a user's specific environment. We present the first systematic study of Environment-Aware Code Generation (EACG), where generated code must be functionally correct and directly executable under arbitrary software configurations. Our results show that current LLMs struggle with environment-specific code generation, while our adaptations improve environment compatibility and executability.
arXiv Detail & Related papers (2026-01-18T04:58:15Z)
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence [150.3696990310269]
Large language models (LLMs) have transformed automated software development by enabling direct translation of natural language descriptions into functional code. We provide a comprehensive synthesis and practical guide (a series of analytic and probing experiments) about code LLMs. We analyze the code capability of the general LLMs (GPT-4, Claude, LLaMA) and code-specialized LLMs (StarCoder, Code LLaMA, DeepSeek-Coder, and QwenCoder).
arXiv Detail & Related papers (2025-11-23T17:09:34Z)
Evaluating Software Process Models for Multi-Agent Class-Level Code Generation [5.545076518491288]
Large Language Models (LLMs) are increasingly used to automate software development. This work examines how process structure and role shape multi-agent specialization for class-level code generation.
arXiv Detail & Related papers (2025-11-12T22:53:12Z)
Benchmarking Correctness and Security in Multi-Turn Code Generation [41.75392001830794]
We introduce MT-Sec, the first benchmark to evaluate correctness and security in multi-turn coding scenarios. We evaluate 32 open- and closed-source models and three agent scaffoldings on MT-Sec. We find that while agent-generated scaffoldings boost single-turn code generation performance, they are not quite as effective in multi-turn evaluations.
arXiv Detail & Related papers (2025-10-13T01:20:46Z)
Beyond Single LLMs: Enhanced Code Generation via Multi-Stage Performance-Guided LLM Orchestration [12.674888937998086]
Large Language Models (LLMs) have become the predominant paradigm for automated code generation. This paper challenges the single-model convention by introducing a multi-stage, performance-guided orchestration framework. Perch orchestrates top-performing LLMs for each task context through stage-wise validation and rollback mechanisms.
arXiv Detail & Related papers (2025-10-01T19:07:16Z)
Reinforcement Learning for Machine Learning Engineering Agents [52.03168614623642]
We show that agents backed by weaker models that improve via reinforcement learning can outperform agents backed by much larger, but static models. We propose duration-aware gradient updates in a distributed asynchronous RL framework to amplify high-cost but high-reward actions. We also propose environment instrumentation to offer partial credit, distinguishing almost-correct programs from those that fail early.
arXiv Detail & Related papers (2025-09-01T18:04:10Z)
SynthCoder: A Synthetical Strategy to Tune LLMs for Code Completion [7.668823606571788]
Code completion is a prominent application of Large Language Models (LLMs) in software engineering. This paper proposes SynthCoder, a model that integrates leading industry practices to achieve state-of-the-art on the Fill-in-the-Middle (FIM) code completion task.
arXiv Detail & Related papers (2025-08-21T12:23:49Z)
Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks. However, improvement is plateauing due to the exhaustion of readily available high-quality data. We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z)
UnitCoder: Scalable Iterative Code Synthesis with Unit Test Guidance [65.01483640267885]
Large Language Models (LLMs) have demonstrated remarkable capabilities in various tasks, yet code generation remains a major challenge. We introduce UnitCoder, a systematic pipeline leveraging model-generated unit tests to guide and validate the code generation process. Our work presents a scalable approach that leverages model-generated unit tests to guide the synthesis of high-quality code data from pre-training corpora.
arXiv Detail & Related papers (2025-02-17T05:37:02Z)
SOEN-101: Code Generation by Emulating Software Process Models Using Large Language Model Agents [50.82665351100067]
FlowGen is a code generation framework that emulates software process models based on multiple Large Language Model (LLM) agents. We evaluate FlowGenScrum on four benchmarks: HumanEval, HumanEval-ET, MBPP, and MBPP-ET.
arXiv Detail & Related papers (2024-03-23T14:04:48Z)
LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system. We build a novel data-cleaning pipeline that uses these principles to transform existing programs. We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.