CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think
- URL: http://arxiv.org/abs/2603.02547v1
- Date: Tue, 03 Mar 2026 03:05:15 GMT
- Title: CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think
- Authors: Junzhe Shen, Jieru Zhao, Ziwei He, Zhouhan Lin
- Abstract summary: CoDAR is a two-stage framework that keeps diffusion entirely continuous in an embedding space while learning a strong, context-conditional discretizer. Experiments on LM1B and OpenWebText demonstrate that CoDAR substantially improves generation quality over latent diffusion.
- Score: 17.27394520177311
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study why continuous diffusion language models (DLMs) have lagged behind discrete diffusion approaches despite their appealing continuous generative dynamics. Under a controlled token--recovery study, we identify token rounding, the final projection from denoised embeddings to tokens, as a primary bottleneck. Building on these insights, we propose CoDAR (Continuous Diffusion with Contextual AutoRegressive Decoder), a two--stage framework that keeps diffusion entirely continuous in an embedding space while learning a strong, context--conditional discretizer: an autoregressive Transformer decoder that cross--attends to the denoised embedding sequence and performs contextualized rounding to tokens. Experiments on LM1B and OpenWebText demonstrate that CoDAR substantially improves generation quality over latent diffusion and becomes competitive with strong discrete DLMs, while exposing a simple decoder--temperature knob to navigate the fluency--diversity trade off.
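The abstract describes a two-stage decode: continuous diffusion produces a denoised embedding sequence, and an autoregressive Transformer decoder cross-attends to it to perform contextualized rounding to tokens, with a temperature knob on the decoder's output distribution. The paper does not publish implementation details here, so the following is a minimal, hypothetical PyTorch sketch of that second stage; all module sizes, names (`ContextualRounder`, `round_tokens`), and the BOS convention are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class ContextualRounder(nn.Module):
    """Hypothetical sketch of a CoDAR-style stage-2 discretizer: an
    autoregressive Transformer decoder that cross-attends to the denoised
    embedding sequence (the diffusion output) and emits tokens one at a time.
    Sizes and names are illustrative, not from the paper."""

    def __init__(self, vocab_size=1000, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    @torch.no_grad()
    def round_tokens(self, denoised, max_len=16, temperature=1.0, bos_id=0):
        # denoised: (batch, seq, d_model) embeddings from the continuous stage.
        batch = denoised.size(0)
        tokens = torch.full((batch, 1), bos_id, dtype=torch.long)
        for _ in range(max_len):
            tgt = self.tok_emb(tokens)
            # Causal mask over already-emitted tokens; cross-attention to
            # `denoised` is what makes the rounding context-conditional.
            causal = nn.Transformer.generate_square_subsequent_mask(tgt.size(1))
            h = self.decoder(tgt, denoised, tgt_mask=causal)
            # The decoder-temperature knob from the abstract: low temperature
            # favors fluency, high temperature favors diversity.
            logits = self.lm_head(h[:, -1]) / temperature
            next_tok = torch.multinomial(torch.softmax(logits, dim=-1), 1)
            tokens = torch.cat([tokens, next_tok], dim=1)
        return tokens[:, 1:]  # drop the BOS column
```

Usage: pass the denoised embedding sequence as the decoder's `memory` and sample autoregressively; sweeping `temperature` traces the fluency-diversity trade-off the abstract mentions.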
Related papers
- Diffusion-DRF: Differentiable Reward Flow for Video Diffusion Fine-Tuning [72.16213872139748]
Diffusion-DRF is a differentiable reward flow for fine-tuning video diffusion models. It backpropagates VLM feedback through the diffusion denoising chain. It improves video quality and semantic alignment while mitigating reward hacking and collapse.
arXiv Detail & Related papers (2026-01-07T18:05:08Z) - Bridging the Discrete-Continuous Gap: Unified Multimodal Generation via Coupled Manifold Discrete Absorbing Diffusion [60.186310080523135]
The bifurcation of generative modeling into autoregressive approaches for discrete data (text) and diffusion approaches for continuous data (images) hinders the development of truly unified multimodal systems. We propose CoM-DAD, a novel probabilistic framework that reformulates multimodal generation as a hierarchical dual-process. Our method demonstrates superior stability over standard masked modeling, establishing a new paradigm for scalable, unified text-image generation.
arXiv Detail & Related papers (2026-01-07T16:21:19Z) - CANDI: Hybrid Discrete-Continuous Diffusion Models [36.61898210733147]
We show how noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. We propose CANDI, a hybrid framework that decouples discrete and continuous corruption. This unlocks the benefits of continuous diffusion for discrete spaces.
arXiv Detail & Related papers (2025-10-26T03:24:31Z) - Latent Discrete Diffusion Models [18.979326092796896]
We study discrete diffusion for language and other categorical data. We propose Latent Discrete Diffusion Models (LDDM). We present two instantiations: (i) FUJI-LDDMs, which perform fully joint denoising of tokens and latents, and (ii) SEQ-LDDMs, which sequentially resolve the latent and then the discrete chain conditionally on it. For both variants we derive ELBO-style objectives and discuss design choices that yield latents which are informative yet amenable to diffusion modeling.
arXiv Detail & Related papers (2025-10-20T21:26:52Z) - Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner [66.86440230599656]
We argue that diffusion language models do not necessarily need to operate in the discrete space. In particular, we prove that continuous diffusion models have stronger expressivity than discrete diffusions and looped transformers. We propose Coevolutionary Continuous Diffusion (CCDD), which defines a joint multimodal diffusion process on the union of a continuous representation space and a discrete token space.
arXiv Detail & Related papers (2025-10-03T17:44:41Z) - Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling [87.34677262370924]
Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an 'information void' where semantic information that could be inferred from unmasked tokens is lost between denoising steps. We introduce Continuously Augmented Discrete Diffusion, a framework that augments the discrete state space with a paired diffusion in a continuous latent space.
arXiv Detail & Related papers (2025-10-01T18:00:56Z) - Authentic Discrete Diffusion Model [72.31371542619121]
The Authentic Discrete Diffusion (ADD) framework redefines prior pseudo-discrete approaches. ADD reformulates the diffusion input by directly using float-encoded one-hot class data. Experiments demonstrate that ADD achieves superior performance on classification tasks compared to the baseline.
arXiv Detail & Related papers (2025-10-01T15:51:10Z) - Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models [12.446047799880587]
Token-level diffusion does not model word-order dependencies explicitly. Passage-level diffusion struggles to learn robust representations for long-form text. We propose Segment-Level Diffusion, a framework that enhances diffusion-based text generation.
arXiv Detail & Related papers (2024-12-15T22:47:44Z) - ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer [95.80384464922147]
ACDiT is a blockwise Conditional Diffusion Transformer. It offers a flexible interpolation between token-wise autoregression and full-sequence diffusion. We show that ACDiT performs best among all autoregressive baselines on image and video generation tasks.
arXiv Detail & Related papers (2024-12-10T18:13:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the accuracy of the information it presents and is not responsible for any consequences of its use.