Related papers: CoDA: Coding LM via Diffusion Adaptation

URL: http://arxiv.org/abs/2510.03270v1
Date: Sat, 27 Sep 2025 05:41:55 GMT
Title: CoDA: Coding LM via Diffusion Adaptation
Authors: Haolin Chen, Shiyu Wang, Can Qin, Bo Pang, Zuxin Liu, Jielin Qiu, Jianguo Zhang, Yingbo Zhou, Zeyuan Chen, Ran Xu, Shelby Heinecke, Silvio Savarese, Caiming Xiong, Huan Wang, Weiran Yao
Abstract summary: CoDA pairs large-scale diffusion pre-training with code-centric mid-training and instruction tuning. On HumanEval, MBPP, and EvalPlus, CoDA-1.7B-Instruct matches or surpasses diffusion models up to 7B parameters.
Score: 102.62730448092888
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Diffusion language models promise bidirectional context and infilling capabilities that autoregressive coders lack, yet practical systems remain heavyweight. We introduce CoDA, a 1.7B-parameter diffusion coder trained on TPU with a fully open-source training pipeline. CoDA pairs large-scale diffusion pre-training with code-centric mid-training and instruction tuning, enabling confidence-guided sampling that keeps inference latency competitive. On HumanEval, MBPP, and EvalPlus, CoDA-1.7B-Instruct matches or surpasses diffusion models up to 7B parameters. Our release includes model checkpoints, evaluation harnesses, and TPU training pipelines to accelerate research on lightweight diffusion-based coding assistants.

Related papers

Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents [55.43139356528315]
Consistency Models (CMs) are trained to be consistent on flow ordinary differential equation trajectories. CMs typically require prolonged training with large batch sizes to obtain competitive sample quality. We propose a new loss function, called the manifold feature distance (MFD), which provides manifold-aligned tangents that point toward the data manifold.
arXiv Detail & Related papers (2025-10-01T08:35:18Z)
DiffusionNFT: Online Diffusion Reinforcement with Forward Process [99.94852379720153]
Diffusion Negative-aware FineTuning (DiffusionNFT) is a new online RL paradigm that optimizes diffusion models directly on the forward process via flow matching. DiffusionNFT is up to $25\times$ more efficient than FlowGRPO in head-to-head comparisons, while being CFG-free.
arXiv Detail & Related papers (2025-09-19T16:09:33Z)
Dream-Coder 7B: An Open Diffusion Language Model for Code [99.14959222355988]
We present Dream-Coder 7B, an open-source discrete diffusion language model for code generation that exhibits emergent any-order generation capabilities. Unlike traditional autoregressive (AR) models that decode strictly left-to-right, Dream-Coder 7B adaptively determines its decoding strategy based on the coding task.
arXiv Detail & Related papers (2025-09-01T05:30:56Z)
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference [53.65845680932835]
Conditional Adapter (CoDA) is a parameter-efficient transfer learning method that also improves inference efficiency. Across a variety of language, vision, and speech tasks, CoDA achieves a 2x to 8x inference speed-up.
arXiv Detail & Related papers (2023-04-11T03:17:37Z)
Denoising Diffusion Autoencoders are Unified Self-supervised Learners [58.194184241363175]
This paper shows that the networks in diffusion models, namely denoising diffusion autoencoders (DDAE), are unified self-supervised learners. DDAE has already learned strongly linear-separable representations within its intermediate layers without auxiliary encoders. Our diffusion-based approach achieves 95.9% and 50.0% linear evaluation accuracies on CIFAR-10 and Tiny-ImageNet.
arXiv Detail & Related papers (2023-03-17T04:20:47Z)
Deep Diffusion Models for Robust Channel Estimation [1.7259824817932292]
We introduce a novel approach for multiple-input multiple-output (MIMO) channel estimation using deep diffusion models. Our method uses a deep neural network that is trained to estimate the gradient of the log-likelihood of wireless channels at any point in high-dimensional space.
arXiv Detail & Related papers (2021-11-16T01:32:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
