Exploring the Power of Diffusion Large Language Models for Software Engineering: An Empirical Investigation
- URL: http://arxiv.org/abs/2510.04605v1
- Date: Mon, 06 Oct 2025 09:13:25 GMT
- Title: Exploring the Power of Diffusion Large Language Models for Software Engineering: An Empirical Investigation
- Authors: Jingyao Zhang, Tianlin Li, Xiaoyu Zhang, Qiang Hu, Bin Shi,
- Abstract summary: Diffusion LLMs (DLLMs) offer a promising alternative to Autoregressive Large Language Models (AR-LLMs). 7B-parameter DLLMs outperform AR-LLMs with a 30% average accuracy improvement, achieving a 113% gain on cross-file repair.
- Score: 27.11701612946034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autoregressive Large Language Models (AR-LLMs) are widely used in software engineering (SE) but face limitations in processing code structure information and suffer from high inference latency. Diffusion LLMs (DLLMs) offer a promising alternative with global bidirectional encoding and decoupled generation steps. This work presents the first comprehensive evaluation of DLLMs across the software development lifecycle, including code generation, defect detection, and program repair. On a large-scale benchmark of 52,937 tasks, 7B-parameter DLLMs outperform AR-LLMs with a 30% average accuracy improvement, achieving a 113% gain on cross-file repair, while maintaining superior efficiency and reduced latency. Our results establish DLLMs as a superior paradigm for SE tasks.
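To make the contrast in the abstract concrete — sequential next-token prediction versus parallel iterative denoising — the sketch below is a minimal toy illustration. The `score_tokens` function, the vocabulary, and the confidence-based unmasking schedule are hypothetical stand-ins, not the paper's method; they only show why a diffusion-style decoder can need far fewer sequential steps than the number of tokens it generates.

```python
# Toy contrast between autoregressive and diffusion-style decoding.
# `score_tokens` is a random stand-in for a real model's per-position predictions.
import random

VOCAB = ["def", "f", "(", "x", ")", ":", "return", "x", "+", "1"]
MASK = "<mask>"

def score_tokens(sequence):
    """Stand-in model call: one (token, confidence) guess per position."""
    return [(random.choice(VOCAB), random.random()) for _ in sequence]

def autoregressive_decode(length):
    # One model call per emitted token; each call conditions only on the prefix.
    out = []
    for _ in range(length):
        token, _ = score_tokens(out + [MASK])[-1]  # predict the next position
        out.append(token)
    return out  # `length` sequential steps

def diffusion_decode(length, steps=4):
    # Start fully masked; every step re-scores ALL positions bidirectionally
    # and commits only the most confident ones, so steps can be << length.
    seq = [MASK] * length
    commit_per_step = max(1, length // steps)
    for _ in range(steps):
        scores = score_tokens(seq)
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        masked.sort(key=lambda i: scores[i][1], reverse=True)
        for i in masked[:commit_per_step]:
            seq[i] = scores[i][0]
    # Fill any positions the fixed step budget left masked.
    return [tok if tok != MASK else random.choice(VOCAB) for tok in seq]

print("AR  :", autoregressive_decode(8))
print("DLLM:", diffusion_decode(8, steps=4))
```

In this toy form, the autoregressive loop makes eight sequential model calls for eight tokens, while the diffusion loop makes four; that decoupling of generation steps from sequence length is the intuition behind the latency claims in the abstract.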
Related papers
- SLMFix: Leveraging Small Language Models for Error Fixing with Reinforcement Learning [39.94602104823846]
Large language models (LLMs) generate programs that contain syntactic errors and fail to complete the given tasks. In this work, we propose SLMFix, a novel code generation pipeline that leverages a small language model (SLM) finetuned using reinforcement learning (RL) techniques.
arXiv Detail & Related papers (2025-11-24T18:56:47Z) - Saber: An Efficient Sampling with Adaptive Acceleration and Backtracking Enhanced Remasking for Diffusion Language Model [98.35868970993232]
Diffusion language models (DLMs) are emerging as a powerful and promising alternative to the dominant autoregressive paradigm. We introduce efficient Sampling with Adaptive acceleration and Backtracking Enhanced Remasking (i.e., Saber) to achieve better inference speed and output quality in code generation.
arXiv Detail & Related papers (2025-10-20T23:38:12Z) - Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models [82.87985794856803]
Large Language Models (LLMs) have achieved state-of-the-art performance on a broad range of Natural Language Processing (NLP) tasks. Recently, Diffusion Language Models (DLMs) have emerged as a promising alternative architecture.
arXiv Detail & Related papers (2025-10-05T10:50:52Z) - Beyond Single LLMs: Enhanced Code Generation via Multi-Stage Performance-Guided LLM Orchestration [12.674888937998086]
Large Language Models (LLMs) have become the predominant paradigm for automated code generation. This paper challenges the single-model convention by introducing a multi-stage, performance-guided orchestration framework. Perch orchestrates top-performing LLMs for each task context through stage-wise validation and rollback mechanisms.
arXiv Detail & Related papers (2025-10-01T19:07:16Z) - Wide-In, Narrow-Out: Revokable Decoding for Efficient and Effective DLLMs [57.69190972274813]
Diffusion Large Language Models (DLLMs) have emerged as a compelling alternative to Autoregressive models. Existing DLLMs are plagued by a severe quality-speed trade-off, where faster parallel decoding leads to significant performance degradation. We introduce Wide-In, Narrow-Out (WINO), a training-free decoding algorithm that enables revokable decoding in DLLMs.
arXiv Detail & Related papers (2025-07-24T16:51:33Z) - The CodeInverter Suite: Control-Flow and Data-Mapping Augmented Binary Decompilation with LLMs [43.591384969171614]
We develop the CodeInverter Suite to improve binary decompilation, using control flow graphs and explicit data mappings. Our CIM-6.7B achieves state-of-the-art decompilation performance.
arXiv Detail & Related papers (2025-03-10T11:52:48Z) - Quantizing Large Language Models for Code Generation: A Differentiated Replication [51.85505914274633]
Large Language Models (LLMs) have shown an impressive capability in code generation and, specifically, in automatically implementing requirements described in natural language. However, LLMs pose significant challenges related to their memory (and, consequently, carbon) footprint. The new frontier for LLM quantization is 4-bit precision, resulting in an average memory footprint reduction of 70%.
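As a rough back-of-the-envelope check on that figure (not taken from the paper itself), the sketch below estimates the weight-storage footprint of a 7B-parameter model at 16-bit versus 4-bit precision; the nominal saving is 75%, and real deployments land somewhat lower once embeddings, quantization scales, and other higher-precision overheads are counted.

```python
# Illustrative weight-memory estimate for 4-bit quantization (assumptions, not measurements).
PARAMS = 7e9          # 7B-parameter model
BYTES_FP16 = 2.0      # 16 bits per weight
BYTES_INT4 = 0.5      # 4 bits per weight

fp16_gb = PARAMS * BYTES_FP16 / 1e9
int4_gb = PARAMS * BYTES_INT4 / 1e9
print(f"fp16 weights: {fp16_gb:.1f} GB, int4 weights: {int4_gb:.1f} GB")
print(f"nominal reduction: {1 - int4_gb / fp16_gb:.0%}")  # ~75% before overheads
```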
arXiv Detail & Related papers (2025-03-10T09:26:08Z) - Improving the Ability of Pre-trained Language Model by Imparting Large Language Model's Experience [4.814313782484443]
Large Language Models (LLMs) and pre-trained Language Models (LMs) have achieved impressive success on many software engineering tasks. We use LLMs to generate domain-specific data, thereby improving the performance of pre-trained LMs on the target tasks.
arXiv Detail & Related papers (2024-08-16T06:37:59Z) - What's Wrong with Your Code Generated by Large Language Models? An Extensive Study [80.18342600996601]
Large language models (LLMs) produce code that is shorter yet more complicated compared to canonical solutions.
We develop a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types.
We propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback.
arXiv Detail & Related papers (2024-07-08T17:27:17Z) - MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-tuning [25.03477973238162]
Fine-tuning approaches for large language models (LLMs) on program repair tasks overlook the need to reason about the logic behind code changes. We apply MORepair to fine-tune four open-source LLMs with different sizes and architectures. We show that our fine-tuning strategy yields superior performance compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-04-19T05:36:21Z) - Exploring Data-Efficient Adaptation of Large Language Models for Code Generation [64.5583894165813]
We propose a novel adaptation approach named DEED, which stands for Data-Efficient adaptation with Error-Driven learning for code generation. Experimental results show that, compared to other mainstream fine-tuning approaches, DEED achieves superior performance with little training data.
arXiv Detail & Related papers (2024-02-29T16:09:02Z) - LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable leads to improved code generation performance of the system.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z) - Accelerating LLaMA Inference by Enabling Intermediate Layer Decoding via Instruction Tuning with LITE [62.13435256279566]
Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks.
However, their large size makes their inference slow and computationally expensive.
We show that instruction tuning with LITE enables these layers to acquire 'good' generation ability without affecting the generation ability of the final layer.
arXiv Detail & Related papers (2023-10-28T04:07:58Z)