Related papers: Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models

Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models

URL: http://arxiv.org/abs/2508.12361v1
Date: Sun, 17 Aug 2025 13:35:38 GMT
Title: Navigating the Exploration-Exploitation Tradeoff in Inference-Time Scaling of Diffusion Models
Authors: Xun Su, Jianming Huang, Yang Yusen, Zhongxi Fang, Hiroyuki Kasai,
Abstract summary: Inference-time scaling has achieved remarkable success in language models, yet its adaptation to diffusion models remains underexplored.<n>We propose two strategies: Schedule and Adaptive Temperature.<n>Our methods significantly enhance sample quality without increasing the total number of Noise Evaluations.
Score: 11.813933389519358
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Inference-time scaling has achieved remarkable success in language models, yet its adaptation to diffusion models remains underexplored. We observe that the efficacy of recent Sequential Monte Carlo (SMC)-based methods largely stems from globally fitting the The reward-tilted distribution, which inherently preserves diversity during multi-modal search. However, current applications of SMC to diffusion models face a fundamental dilemma: early-stage noise samples offer high potential for improvement but are difficult to evaluate accurately, whereas late-stage samples can be reliably assessed but are largely irreversible. To address this exploration-exploitation trade-off, we approach the problem from the perspective of the search algorithm and propose two strategies: Funnel Schedule and Adaptive Temperature. These simple yet effective methods are tailored to the unique generation dynamics and phase-transition behavior of diffusion models. By progressively reducing the number of maintained particles and down-weighting the influence of early-stage rewards, our methods significantly enhance sample quality without increasing the total number of Noise Function Evaluations. Experimental results on multiple benchmarks and state-of-the-art text-to-image diffusion models demonstrate that our approach outperforms previous baselines.

Related papers

G$^2$RPO: Granular GRPO for Precise Reward in Flow Models [74.21206048155669]
We propose a novel Granular-GRPO (G$2$RPO) framework that achieves precise and comprehensive reward assessments of sampling directions.<n>We introduce a Multi-Granularity Advantage Integration module that aggregates advantages computed at multiple diffusion scales.<n>Our G$2$RPO significantly outperforms existing flow-based GRPO baselines.
arXiv Detail & Related papers (2025-10-02T12:57:12Z)
Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [62.640128548633946]
We introduce a novel inference-time scaling approach based on particle Gibbs sampling for discrete diffusion models.<n>Our method consistently outperforms prior inference-time strategies on reward-guided text generation tasks.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
Provable Maximum Entropy Manifold Exploration via Diffusion Models [58.89696361871563]
Exploration is critical for solving real-world decision-making problems such as scientific discovery.<n>We introduce a novel framework that casts exploration as entropy over approximate data manifold implicitly defined by a pre-trained diffusion model.<n>We develop an algorithm based on mirror descent that solves the exploration problem as sequential fine-tuning of a pre-trained diffusion model.
arXiv Detail & Related papers (2025-06-18T11:59:15Z)
Training-free Diffusion Model Alignment with Sampling Demons [15.400553977713914]
We propose an optimization approach, dubbed Demon, to guide the denoising process at inference time without backpropagation through reward functions or model retraining.<n>Our approach works by controlling noise distribution in denoising steps to concentrate density on regions corresponding to high rewards through optimization.<n>Our experiments show that the proposed approach significantly improves the average aesthetics scores text-to-image generation.
arXiv Detail & Related papers (2024-10-08T07:33:49Z)
Learning Diffusion Priors from Observations by Expectation Maximization [6.224769485481242]
We present a novel method based on the expectation-maximization algorithm for training diffusion models from incomplete and noisy observations only. As part of our method, we propose and motivate an improved posterior sampling scheme for unconditional diffusion models.
arXiv Detail & Related papers (2024-05-22T15:04:06Z)
MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process [26.661721555671626]
We introduce a novel Multi-Granularity Time Series (MG-TSD) model, which achieves state-of-the-art predictive performance. Our approach does not rely on additional external data, making it versatile and applicable across various domains.
arXiv Detail & Related papers (2024-03-09T01:15:03Z)
Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts. We present Diffusion-TracIn that incorporates this temporal dynamics and observe that samples' loss gradient norms are highly dependent on timestep. We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
Fast Diffusion EM: a diffusion model for blind inverse problems with application to deconvolution [0.0]
Current methods assume the degradation to be known and provide impressive results in terms of restoration and diversity. In this work, we leverage the efficiency of those models to jointly estimate the restored image and unknown parameters of the kernel model. Our method alternates between approximating the expected log-likelihood of the problem using samples drawn from a diffusion model and a step to estimate unknown model parameters.
arXiv Detail & Related papers (2023-09-01T06:47:13Z)
Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps. We introduce a novel approach that tackles the problem by matching implicit and explicit factors. We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z)
ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories [144.03939123870416]
We propose a novel conditional diffusion model by introducing conditions into the forward process. We use extra latent space to allocate an exclusive diffusion trajectory for each condition based on some shifting rules. We formulate our method, which we call textbfShiftDDPMs, and provide a unified point of view on existing related methods.
arXiv Detail & Related papers (2023-02-05T12:48:21Z)
How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution. We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction [31.61199061999173]
Diffusion models have a critical downside - they are inherently slow to sample from, needing few thousand steps of iteration to generate images from pure Gaussian noise. We show that starting from Gaussian noise is unnecessary. Instead, starting from a single forward diffusion with better initialization significantly reduces the number of sampling steps in the reverse conditional diffusion. New sampling strategy, dubbed ComeCloser-DiffuseFaster (CCDF), also reveals a new insight on how the existing feedforward neural network approaches for inverse problems can be synergistically combined with the diffusion models.
arXiv Detail & Related papers (2021-12-09T04:28:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.