CANDI: Hybrid Discrete-Continuous Diffusion Models
- URL: http://arxiv.org/abs/2510.22510v2
- Date: Tue, 28 Oct 2025 19:55:41 GMT
- Title: CANDI: Hybrid Discrete-Continuous Diffusion Models
- Authors: Patrick Pynadath, Jiaxin Shi, Ruqi Zhang
- Abstract summary: We show how noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. We propose CANDI, a hybrid framework that decouples discrete and continuous corruption. This unlocks the benefits of continuous diffusion for discrete spaces.
- Score: 36.61898210733147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While continuous diffusion has shown remarkable success in continuous domains such as image generation, its direct application to discrete data has underperformed compared to purely discrete formulations. This gap is counterintuitive, given that continuous diffusion learns score functions that enable joint evolution across multiple positions. To understand this gap, we introduce token identifiability as an analytical framework for understanding how Gaussian noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. We reveal that these mechanisms scale differently with vocabulary size, creating a temporal dissonance: at noise levels where discrete corruption preserves enough structure for conditional learning, continuous denoising is trivial; at noise levels where continuous denoising is meaningful, discrete corruption destroys nearly all conditional structure. To solve this, we propose CANDI (Continuous ANd DIscrete diffusion), a hybrid framework that decouples discrete and continuous corruption, enabling simultaneous learning of both conditional structure and continuous geometry. We empirically validate the temporal dissonance phenomenon and demonstrate that CANDI successfully avoids it. This unlocks the benefits of continuous diffusion for discrete spaces: on controlled generation, CANDI enables classifier-based guidance with off-the-shelf classifiers through simple gradient addition; on text generation, CANDI outperforms masked diffusion at low NFE, demonstrating the value of learning continuous gradients for discrete spaces. Code is available on the project page: https://patrickpynadath1.github.io/candi-lander
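To make the two corruption mechanisms and the gradient-addition guidance concrete, here is a minimal PyTorch sketch. It is our own illustration under stated assumptions, not the authors' implementation (the official code is on the project page above): the function names are hypothetical, the classifier is assumed to map noisy embeddings to class logits, and projecting its gradient onto vocabulary logits through the embedding matrix is our guess at one plausible realization of "simple gradient addition".

```python
import torch

def hybrid_corrupt(tokens, emb, t_disc, sigma, mask_id):
    # Discrete identity corruption: independently replace a t_disc fraction
    # of tokens with the [MASK] id.
    mask = torch.rand(tokens.shape, device=tokens.device) < t_disc
    z = torch.where(mask, torch.full_like(tokens, mask_id), tokens)
    # Continuous rank degradation: additive Gaussian noise in embedding space.
    x = emb(z)
    return z, x + sigma * torch.randn_like(x)

def token_rank(x_noisy, emb, tokens):
    # Rank of the true token's embedding among the whole vocabulary, by
    # distance to the noisy point; rank 0 = the token is still identifiable.
    d = torch.cdist(x_noisy, emb.weight.unsqueeze(0))   # (B, L, V)
    order = d.argsort(dim=-1)                           # nearest first
    return (order == tokens.unsqueeze(-1)).int().argmax(dim=-1)

def add_classifier_guidance(logits, x_noisy, classifier, emb, target, scale=1.0):
    # "Simple gradient addition": differentiate an off-the-shelf classifier's
    # log-probability of `target` w.r.t. the noisy embeddings, then map that
    # gradient to vocabulary logits via the embedding matrix (our assumption).
    x = x_noisy.detach().requires_grad_(True)
    log_p = torch.log_softmax(classifier(x), dim=-1)[..., target].sum()
    (g,) = torch.autograd.grad(log_p, x)                # (B, L, D)
    return logits + scale * g @ emb.weight.T            # (B, L, V)

# Toy usage: ranks degrade as sigma grows, and how fast they degrade depends
# on vocabulary size -- the scaling mismatch the abstract calls temporal
# dissonance.
emb = torch.nn.Embedding(32000, 512)
tokens = torch.randint(0, 32000, (2, 16))
z, x_noisy = hybrid_corrupt(tokens, emb, t_disc=0.3, sigma=0.5, mask_id=0)
print(token_rank(x_noisy, emb, tokens).float().mean())
```

Because the two corruptions are applied with independent knobs (`t_disc` and `sigma`), the model can be trained at noise levels where conditional structure survives discretely while the continuous denoising task remains nontrivial, which is the decoupling the abstract describes.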
Related papers
- CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think [17.27394520177311]
CoDAR is a two-stage framework that keeps diffusion entirely continuous in an embedding space while learning a strong, context-conditional discretizer. Experiments on LM1B and OpenWebText demonstrate that CoDAR substantially improves generation quality over latent diffusion.
arXiv Detail & Related papers (2026-03-03T03:05:15Z) - Rejection Mixing: Fast Semantic Propagation of Mask Tokens for Efficient DLLM Inference [58.189320101488725]
DLLMs promise fast non-autoregressive inference but suffer a severe quality-speed trade-off in parallel decoding. We address this by integrating continuous representations into the discrete decoding process, as they preserve rich inter-position dependencies. We propose ReMix, a framework that introduces a novel Continuous Mixing State as an intermediate between the initial masked state and the final decoded token state.
arXiv Detail & Related papers (2026-02-26T11:08:11Z) - Foundations of Diffusion Models in General State Spaces: A Self-Contained Introduction [54.95522167029998]
This article is a self-contained primer on diffusion over general state spaces. We develop the discrete-time view (forward noising via Markov kernels and learned reverse dynamics) alongside its continuous-time limits. A common variational treatment yields the ELBO that underpins standard training losses.
arXiv Detail & Related papers (2025-12-04T18:55:36Z) - Coevolutionary Continuous Discrete Diffusion: Make Your Diffusion Language Model a Latent Reasoner [66.86440230599656]
We argue that diffusion language models do not necessarily need to be in the discrete space. In particular, we prove that continuous diffusion models have stronger expressivity than discrete diffusions and looped transformers. We propose Coevolutionary Continuous Diffusion (CCDD), which defines a joint multimodal diffusion process on the union of a continuous representation space and a discrete token space.
arXiv Detail & Related papers (2025-10-03T17:44:41Z) - Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling [87.34677262370924]
Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an 'information void' where semantic information that could be inferred from unmasked tokens is lost between denoising steps. We introduce Continuously Augmented Discrete Diffusion, a framework that augments the discrete state space with a paired diffusion in a continuous latent space.
arXiv Detail & Related papers (2025-10-01T18:00:56Z) - Constrained Discrete Diffusion [61.81569616239755]
This paper introduces Constrained Discrete Diffusion (CDD), a novel integration of differentiable constraint optimization within the diffusion process. CDD imposes constraints directly on the discrete diffusion sampling process, yielding a training-free and effective approach (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2025-03-12T19:48:12Z) - Unified Discrete Diffusion for Categorical Data [37.56355078250024]
We present a series of mathematical simplifications of the variational lower bound that enable more accurate and easy-to-optimize training for discrete diffusion.
We derive a simple formulation for backward denoising that enables exact and accelerated sampling, and importantly, an elegant unification of discrete-time and continuous-time discrete diffusion.
arXiv Detail & Related papers (2024-02-06T04:42:36Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistically informed forward process that corrupts the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM achieves better generation quality than state-of-the-art diffusion models, with greater efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises [38.72460741779243]
We introduce DINOISER to facilitate diffusion models for sequence generation by manipulating noises.
Experiments show that DINOISER enables consistent improvement over the baselines of previous diffusion-based sequence generative models.
arXiv Detail & Related papers (2023-02-20T15:14:46Z)
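As a companion to the Constrained Discrete Diffusion entry above, here is a minimal, hypothetical sketch of training-free constraint imposition via a differentiable relaxation. The `constrain_logits` helper and its projected-descent loop are our assumptions about one plausible realization, not the paper's actual algorithm.

```python
import torch
import torch.nn.functional as F

def constrain_logits(logits, penalty, steps=5, lr=0.5):
    # Relax the categorical token distribution onto the simplex, descend a
    # differentiable penalty measuring constraint violation, and return
    # corrected logits that can be dropped into any sampling step.
    z = logits.detach().clone().requires_grad_(True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        penalty(F.softmax(z, dim=-1)).backward()
        opt.step()
    return z.detach()

# Hypothetical constraint: forbid a set of banned token ids by penalizing
# the probability mass they receive at every position.
banned = torch.tensor([13, 42])
penalty = lambda p: p[..., banned].sum()
corrected = constrain_logits(torch.randn(2, 16, 32000), penalty)
```

Because the correction acts only on the sampling-time logits, no retraining of the diffusion model is needed, which matches the training-free claim in the CDD summary.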