Unifying Masked Diffusion Models with Various Generation Orders and Beyond
- URL: http://arxiv.org/abs/2602.02112v1
- Date: Mon, 02 Feb 2026 13:54:32 GMT
- Title: Unifying Masked Diffusion Models with Various Generation Orders and Beyond
- Authors: Chunsan Hong, Sanghyun Lee, Jong Chul Ye
- Abstract summary: Masked diffusion models (MDMs) are a potential alternative to autoregressive models (ARMs) for language generation. We propose the order-expressive masked diffusion model (OeMDM) for a broad class of diffusion generative processes. We also introduce the learnable-order masked diffusion model (LoMDM), which jointly learns the generation ordering and diffusion backbone.
- Score: 56.70289720766803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Masked diffusion models (MDMs) are a potential alternative to autoregressive models (ARMs) for language generation, but generation quality depends critically on the generation order. Prior work either hard-codes an ordering (e.g., blockwise left-to-right) or learns an ordering policy for a pretrained MDM, which incurs extra cost and can yield suboptimal solutions due to the two-stage optimization. Motivated by this, we propose the order-expressive masked diffusion model (OeMDM), which covers a broad class of diffusion generative processes with various generation orders and allows MDM, ARM, and block diffusion to be interpreted in a single framework. Building on OeMDM, we introduce the learnable-order masked diffusion model (LoMDM), which jointly learns the generation ordering and the diffusion backbone through a single objective from scratch, enabling the model to generate text in a context-dependent order. Empirically, we confirm that LoMDM outperforms various discrete diffusion models across multiple language modeling benchmarks.
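As a rough illustration of the unifying view in the abstract, the sketch below implements a masked-diffusion sampler whose unmasking order is supplied by a pluggable policy: a left-to-right policy recovers ARM-style decoding, restricting choices to the left-most unfinished block would recover block diffusion, and a confidence-based policy matches common MDM sampling. The `model` interface, the `MASK` id, and the policy signatures are illustrative assumptions, not the authors' OeMDM/LoMDM API; in LoMDM the ordering is additionally learned jointly with the backbone rather than fixed as here.

```python
import torch

MASK = 0  # assumed mask-token id (illustrative, not from the paper)

def sample(model, seq_len, order_policy, steps):
    """Unmask all positions over several rounds; `order_policy` decides
    WHICH masked positions to reveal at each round."""
    x = torch.full((1, seq_len), MASK, dtype=torch.long)
    masked = torch.ones(seq_len, dtype=torch.bool)
    per_step = max(1, seq_len // steps)
    while masked.any():
        with torch.no_grad():
            logits = model(x)                 # (1, seq_len, vocab_size)
        probs = logits.softmax(-1)
        k = min(per_step, int(masked.sum()))
        for i in order_policy(probs, masked, k):
            x[0, i] = torch.multinomial(probs[0, i], 1).item()
            masked[i] = False
    return x

def left_to_right(probs, masked, k):
    # Always reveal the left-most masked positions: ARM-style decoding.
    return masked.nonzero(as_tuple=True)[0][:k].tolist()

def by_confidence(probs, masked, k):
    # Reveal the positions the model is most confident about, a common
    # heuristic order for vanilla MDM sampling. Restricting this choice
    # to the left-most unfinished block would mimic block diffusion.
    conf = probs[0].max(-1).values
    conf[~masked] = -1.0
    return conf.topk(k).indices.tolist()

# Usage with a dummy uniform-logits model:
# out = sample(lambda x: torch.zeros(1, x.shape[1], 100),
#              seq_len=16, order_policy=by_confidence, steps=4)
```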
Related papers
- Towards Latent Diffusion Suitable For Text [7.293508593001522]
We introduce Neural Flow Diffusion Models for language generation, an extension of NFDM that enables the straightforward application of continuous diffusion models to discrete state spaces. Our model substantially reduces the likelihood gap with autoregressive models of the same size, while achieving sample quality comparable to that of previous latent diffusion models.
arXiv Detail & Related papers (2026-01-07T20:50:59Z)
- Bridging the Discrete-Continuous Gap: Unified Multimodal Generation via Coupled Manifold Discrete Absorbing Diffusion [60.186310080523135]
The bifurcation of generative modeling into autoregressive approaches for discrete data (text) and diffusion approaches for continuous data (images) hinders the development of truly unified multimodal systems. We propose CoM-DAD, a novel probabilistic framework that reformulates multimodal generation as a hierarchical dual-process. Our method demonstrates superior stability over standard masked modeling, establishing a new paradigm for scalable, unified text-image generation.
arXiv Detail & Related papers (2026-01-07T16:21:19Z)
- Improving Text Style Transfer using Masked Diffusion Language Models with Inference-time Scaling [37.795834398730555]
Masked diffusion language models (MDMs) have recently gained traction as a viable generative framework for natural language. We propose a verifier-based inference-time scaling method that aids in finding a better candidate generation during the denoising process of the MDM. Our experiments demonstrate the application of MDMs to standard text-style transfer tasks and establish MDMs as a better alternative to autoregressive language models.
arXiv Detail & Related papers (2025-08-14T18:01:22Z)
- Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling [48.96034602889216]
Variational Autoencoding Discrete Diffusion (VADD) is a novel framework that enhances discrete diffusion with latent variable modeling. By introducing an auxiliary recognition model, VADD enables stable training via variational lower bounds and amortized inference over the training set. Empirical results on 2D toy data, pixel-level image generation, and text generation demonstrate that VADD consistently outperforms MDM baselines.
arXiv Detail & Related papers (2025-05-23T01:45:47Z)
- Generalized Interpolating Discrete Diffusion [65.74168524007484]
Masked diffusion is a popular choice due to its simplicity and effectiveness. We introduce generalized interpolating discrete diffusion (GIDD), a new family of processes that offers greater flexibility in the design of the noising process. Exploiting GIDD's flexibility, we explore a hybrid approach combining masking and uniform noise, leading to improved sample quality.
arXiv Detail & Related papers (2025-03-06T14:30:55Z)
- Remasking Discrete Diffusion Models with Inference-Time Scaling [21.362017006523086]
We introduce the remasking diffusion model (ReMDM) sampler, a method that can be applied to pretrained masked diffusion models in a principled way. Most interestingly, ReMDM endows discrete diffusion with a form of inference-time compute scaling (a hypothetical sketch of remasking follows this list).
arXiv Detail & Related papers (2025-03-01T02:37:51Z)
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
- DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models [81.84866217721361]
DiffusionBERT is a new generative masked language model based on discrete diffusion models.
We propose a new noise schedule for the forward diffusion process that controls the degree of noise added at each step.
Experiments on unconditional text generation demonstrate that DiffusionBERT achieves significant improvement over existing diffusion models for text.
arXiv Detail & Related papers (2022-11-28T03:25:49Z)
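The ReMDM entry above flags remasking as a route to inference-time scaling. The snippet below is a hypothetical illustration of that general idea, not the ReMDM algorithm: after a denoising round, the lowest-confidence already-revealed tokens are pushed back to MASK, so that extra sampling steps can revisit and improve them. The `remask_step` name, the `remask_frac` parameter, and the confidence heuristic are all assumptions for this sketch.

```python
import torch

MASK = 0  # assumed mask-token id, matching the sketch above

def remask_step(x, probs, revealed, remask_frac=0.1):
    """Push the lowest-confidence revealed tokens back to MASK so that
    later denoising rounds can revisit them (extra inference compute).
    Assumes at least one position in `revealed` is True."""
    # Confidence of each currently revealed token under the model.
    conf = probs[0].gather(-1, x[0].unsqueeze(-1)).squeeze(-1)
    conf[~revealed] = float("inf")   # never "remask" still-masked slots
    k = max(1, int(remask_frac * int(revealed.sum())))
    idx = conf.topk(k, largest=False).indices
    x[0, idx] = MASK
    revealed[idx] = False
    return x, revealed
```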