Related papers: Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling

Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling

URL: http://arxiv.org/abs/2507.08390v1
Date: Fri, 11 Jul 2025 08:00:47 GMT
Title: Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling
Authors: Meihua Dang, Jiaqi Han, Minkai Xu, Kai Xu, Akash Srivastava, Stefano Ermon,
Abstract summary: We introduce a novel inference-time scaling approach based on particle Gibbs sampling for discrete diffusion models.<n>Our method consistently outperforms prior inference-time strategies on reward-guided text generation tasks.
Score: 62.640128548633946
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Discrete diffusion models have emerged as a powerful paradigm for language modeling, rivaling auto-regressive models by training-time scaling. However, inference-time scaling in discrete diffusion models remains relatively under-explored. In this work, we study sampling-based approaches for achieving high-quality text generation from discrete diffusion models in reward-guided settings. We introduce a novel inference-time scaling approach based on particle Gibbs sampling for discrete diffusion models. The particle Gibbs sampling algorithm iteratively refines full diffusion trajectories using conditional Sequential Monte Carlo as its transition mechanism. This process ensures that the updated samples progressively improve and move closer to the reward-weighted target distribution. Unlike existing inference-time scaling methods, which are often limited to single diffusion trajectories, our approach leverages iterative refinement across multiple trajectories. Within this framework, we further analyze the trade-offs between four key axes for inference-time scaling under fixed compute budgets: particle Gibbs iterations, particle count, denoising steps, and reward estimation cost. Empirically, our method consistently outperforms prior inference-time strategies on reward-guided text generation tasks, achieving significant improvement in accuracy under varying compute budgets.

Related papers

Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing [10.542645300983878]
We propose an inference-time scaling approach for pretrained flow models.<n>We show that SDE-based generation, particularly variance-preserving (VP) interpolant-based generation, improves particle sampling methods for inference-time scaling in flow models.
arXiv Detail & Related papers (2025-03-25T06:30:45Z)
Accelerated Diffusion Models via Speculative Sampling [89.43940130493233]
Speculative sampling is a popular technique for accelerating inference in Large Language Models.<n>We extend speculative sampling to diffusion models, which generate samples via continuous, vector-valued Markov chains.<n>We propose various drafting strategies, including a simple and effective approach that does not require training a draft model.
arXiv Detail & Related papers (2025-01-09T16:50:16Z)
Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points. Our results align with state-of-the-art achievements for diffusion models in $mathbbRd$ and further underscore the advantages of discrete diffusion models in comparison to the $mathbbRd$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z)
Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function.<n>We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods.<n>Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z)
Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time [49.598085130313514]
We propose discrete non-Markov diffusion models (DNDM), which naturally induce the predetermined transition time set.<n>This enables a training-free sampling algorithm that significantly reduces the number of function evaluations.<n>We study the transition from finite to infinite step sampling, offering new insights into bridging the gap between discrete and continuous-time processes.
arXiv Detail & Related papers (2023-12-14T18:14:11Z)
Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [49.800746112114375]
We propose a novel post-training quantization method (Progressive and Relaxing) for text-to-image diffusion models. We are the first to achieve quantization for Stable Diffusion XL while maintaining the performance.
arXiv Detail & Related papers (2023-11-10T09:10:09Z)
A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE. We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z)
Fast Inference in Denoising Diffusion Models via MMD Finetuning [23.779985842891705]
We present MMD-DDM, a novel method for fast sampling of diffusion models. Our approach is based on the idea of using the Maximum Mean Discrepancy (MMD) to finetune the learned distribution with a given budget of timesteps. Our findings show that the proposed method is able to produce high-quality samples in a fraction of the time required by widely-used diffusion models.
arXiv Detail & Related papers (2023-01-19T09:48:07Z)
How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution. We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.