Simple Denoising Diffusion Language Models
- URL: http://arxiv.org/abs/2510.22926v1
- Date: Mon, 27 Oct 2025 02:05:26 GMT
- Title: Simple Denoising Diffusion Language Models
- Authors: Huaisheng Zhu, Zhengyu Chen, Shijie Zhou, Zhihui Xie, Yige Yuan, Zhimeng Guo, Siyuan Xu, Hangfan Zhang, Vasant Honavar, Teng Xiao,
- Abstract summary: We propose a simplified denoising-based loss for Uniform-state Diffusion Models (USDMs)<n>We introduce a simple modification to our denoising loss with contrastive-inspired negative gradients, which is practical and yield additional improvements in generation quality.
- Score: 32.08084107137018
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have recently been extended to language generation through Masked Diffusion Language Models (MDLMs), which achieve performance competitive with strong autoregressive models. However, MDLMs tend to degrade in the few-step regime and cannot directly adopt existing few-step distillation methods designed for continuous diffusion models, as they lack the intrinsic property of mapping from noise to data. Recent Uniform-state Diffusion Models (USDMs), initialized from a uniform prior, alleviate some limitations but still suffer from complex loss formulations that hinder scalability. In this work, we propose a simplified denoising-based loss for USDMs that optimizes only noise-replaced tokens, stabilizing training and matching ELBO-level performance. Furthermore, by framing denoising as self-supervised learning, we introduce a simple modification to our denoising loss with contrastive-inspired negative gradients, which is practical and yield additional improvements in generation quality.
Related papers
- One-step Language Modeling via Continuous Denoising [36.18484491074519]
We show that language models leveraging flow-based continuous denoising can outperform discrete diffusion in both quality and speed.<n>Our work calls into question the widely held hypothesis that discrete diffusion processes are necessary for generative modeling over discrete modalities.
arXiv Detail & Related papers (2026-02-18T19:23:07Z) - Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment [14.097906894386066]
PostDiff is a training-free framework for accelerating pre-trained diffusion models.<n>We show that PostDiff can significantly improve the fidelity-efficiency trade-off of state-of-the-art diffusion models.
arXiv Detail & Related papers (2025-08-08T09:29:37Z) - Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models.<n>Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement.<n>We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z) - Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling [48.96034602889216]
Variencoding Discrete Diffusion (VADD) is a novel framework that enhances discrete diffusion with latent variable modeling.<n>By introducing an auxiliary recognition model, VADD enables stable training via variational lower bounds and amortized inference over the training set.<n> Empirical results on 2D toy data, pixel-level image generation, and text generation demonstrate that VADD consistently outperforms MDM baselines.
arXiv Detail & Related papers (2025-05-23T01:45:47Z) - Restoration Score Distillation: From Corrupted Diffusion Pretraining to One-Step High-Quality Generation [82.39763984380625]
We propose textitRestoration Score Distillation (RSD), a principled generalization of Denoising Score Distillation (DSD)<n>RSD accommodates a broader range of corruption types, such as blurred, incomplete, or low-resolution images.<n>It consistently surpasses its teacher model across diverse restoration tasks on both natural and scientific datasets.
arXiv Detail & Related papers (2025-05-19T17:21:03Z) - One-Step Diffusion Model for Image Motion-Deblurring [85.76149042561507]
We propose a one-step diffusion model for deblurring (OSDD), a novel framework that reduces the denoising process to a single step.<n>To tackle fidelity loss in diffusion models, we introduce an enhanced variational autoencoder (eVAE), which improves structural restoration.<n>Our method achieves strong performance on both full and no-reference metrics.
arXiv Detail & Related papers (2025-03-09T09:39:57Z) - Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios [10.57695963534794]
Methods based on VAEs are accompanied by issues of local jitter and global instability.
We introduce a conditional GAN to capture audio control signals and implicitly match the multimodal denoising distribution between the diffusion and denoising steps.
arXiv Detail & Related papers (2024-10-27T07:25:11Z) - One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion
Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference.
OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
arXiv Detail & Related papers (2023-11-27T12:02:42Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistic-informed forward process which adds corruptions to the text through strategically soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection [80.20339155618612]
DiffusionAD is a novel anomaly detection pipeline comprising a reconstruction sub-network and a segmentation sub-network.<n>A rapid one-step denoising paradigm achieves hundreds of times acceleration while preserving comparable reconstruction quality.<n>Considering the diversity in the manifestation of anomalies, we propose a norm-guided paradigm to integrate the benefits of multiple noise scales.
arXiv Detail & Related papers (2023-03-15T16:14:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.