Inference-Time Alignment for Diffusion Models via Doob's Matching
- URL: http://arxiv.org/abs/2601.06514v1
- Date: Sat, 10 Jan 2026 10:28:06 GMT
- Title: Inference-Time Alignment for Diffusion Models via Doob's Matching
- Authors: Jinyuan Chang, Chenguang Duan, Yuling Jiao, Yi Xu, Jerry Zhijian Yang
- Abstract summary: Inference-time alignment for diffusion models aims to adapt a pre-trained diffusion model toward a target distribution without retraining the base score network. We introduce Doob's matching, a novel framework for guidance estimation grounded in Doob's $h$-transform. We prove non-asymptotic convergence guarantees for the generated distributions in the 2-Wasserstein distance.
- Score: 16.416975860645724
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference-time alignment for diffusion models aims to adapt a pre-trained diffusion model toward a target distribution without retraining the base score network, thereby preserving the generative capacity of the base model while enforcing desired properties at inference time. A central mechanism for achieving such alignment is guidance, which modifies the sampling dynamics through an additional drift term. In this work, we introduce Doob's matching, a novel framework for guidance estimation grounded in Doob's $h$-transform. Our approach formulates guidance as the gradient of the logarithm of an underlying Doob's $h$-function and employs gradient-penalized regression to simultaneously estimate both the $h$-function and its gradient, resulting in a consistent estimator of the guidance. Theoretically, we establish non-asymptotic convergence rates for the estimated guidance. Moreover, we analyze the resulting controllable diffusion processes and prove non-asymptotic convergence guarantees for the generated distributions in the 2-Wasserstein distance.
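For orientation, the display below records the textbook form of Doob's $h$-transform that this framing rests on; the base drift $b$, diffusion coefficient $\sigma$, and terminal reweighting $r$ are generic placeholders, not notation taken from the paper itself:

```latex
% Standard Doob h-transform (background sketch; b, sigma, r are generic):
dX_t = b(X_t, t)\,dt + \sigma(t)\,dW_t,
\qquad
h(x, t) = \mathbb{E}\bigl[\, r(X_T) \mid X_t = x \,\bigr],

d\tilde{X}_t = \Bigl[\, b(\tilde{X}_t, t)
    + \sigma(t)\sigma(t)^{\top}\, \nabla_x \log h(\tilde{X}_t, t) \,\Bigr] dt
    + \sigma(t)\, dW_t .
```

Read this way, the "guidance" of the abstract is the extra drift built from $\nabla_x \log h$, and Doob's matching estimates $h$ and its gradient jointly by regression rather than solving for $h$ in closed form.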
Related papers
- Deep Bootstrap [15.173771421020751]
We propose a novel deep bootstrap framework for nonparametric regression based on conditional diffusion models. With the expressive capacity of diffusion models, our method facilitates efficient sampling from high-dimensional or multimodal distributions.
arXiv Detail & Related papers (2026-02-11T07:20:20Z)
- Binary Flow Matching: Prediction-Loss Space Alignment for Robust Learning [23.616336786063552]
Flow matching has emerged as a powerful framework for generative modeling. We identify a latent structural mismatch that arises when it is coupled with velocity-based objectives. We prove that re-aligning the objective to the signal space eliminates the singular weighting.
arXiv Detail & Related papers (2026-02-11T02:02:30Z)
- Diffusion Models: A Mathematical Introduction [3.8673630752805437]
We present a self-contained derivation of diffusion-based generative models. We construct denoising diffusion probabilistic models from first principles. Readers can both follow the theory and implement the corresponding algorithms in practice.
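As general background for this entry (not a formula quoted from the paper), the first-principles construction of DDPMs starts from the Gaussian forward noising process, whose well-known closed form is:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right),
\qquad
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t) I\right),
\quad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s).
```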
arXiv Detail & Related papers (2025-11-13T16:20:52Z)
- Effective Test-Time Scaling of Discrete Diffusion through Iterative Refinement [51.54933696252104]
We introduce Iterative Reward-Guided Refinement (IterRef), a novel test-time scaling method tailored to discrete diffusion. We formalize this process within a Multiple-Try Metropolis framework, proving convergence to the reward-aligned distribution. IterRef achieves striking gains under low compute budgets, far surpassing prior state-of-the-art baselines.
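To illustrate the kind of accept/reject mechanics a Multiple-Try Metropolis framework builds on, here is a minimal single-try Metropolis refinement loop; `propose` and `log_reward` are hypothetical placeholders, and this is generic background rather than IterRef's actual algorithm:

```python
import math
import random

def metropolis_refine(x, propose, log_reward, n_steps=10):
    """Generic Metropolis refinement toward a reward-tilted distribution.

    `propose` and `log_reward` are hypothetical placeholders; with a
    symmetric proposal, accepting with probability min(1, r(x')/r(x))
    targets the reward-aligned distribution.
    """
    for _ in range(n_steps):
        x_new = propose(x)  # e.g., re-denoise part of the current sample
        log_accept = log_reward(x_new) - log_reward(x)
        if log_accept >= 0 or random.random() < math.exp(log_accept):
            x = x_new
    return x
```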
arXiv Detail & Related papers (2025-11-04T02:33:23Z)
- TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling [53.61290359948953]
Tangential Amplifying Guidance (TAG) operates solely on trajectory signals without modifying the underlying diffusion model. We formalize this guidance process by leveraging a first-order Taylor expansion. TAG is a plug-and-play, architecture-agnostic module that improves diffusion sampling fidelity with minimal computational overhead.
arXiv Detail & Related papers (2025-10-06T06:53:29Z)
- Semantically-Guided Inference for Conditional Diffusion Models: Enhancing Covariate Consistency in Time Series Forecasting [6.716179859091235]
SemGuide is a plug-and-play, inference-time method that enhances covariate consistency in conditional diffusion models. Our approach introduces a scoring network to assess the semantic alignment between intermediate diffusion states and future covariates.
arXiv Detail & Related papers (2025-08-03T14:04:04Z)
- Inference-Time Scaling of Diffusion Language Models with Particle Gibbs Sampling [70.8832906871441]
We study how to steer generation toward desired rewards without retraining the models. Prior methods typically resample or filter within a single denoising trajectory, optimizing rewards step-by-step without trajectory-level refinement. We introduce particle Gibbs sampling for diffusion language models (PG-DLM), a novel inference-time algorithm enabling trajectory-level refinement while preserving generation perplexity.
arXiv Detail & Related papers (2025-07-11T08:00:47Z)
- Training-Free Stein Diffusion Guidance: Posterior Correction for Sampling Beyond High-Density Regions [46.59494117137471]
Training-free diffusion guidance provides a flexible way to leverage off-the-shelf classifiers without additional training. We introduce Stein Diffusion Guidance (SDG), a novel training-free framework grounded in a surrogate SOC objective. Experiments on molecular low-density sampling tasks suggest that SDG consistently surpasses standard training-free guidance methods.
arXiv Detail & Related papers (2025-07-07T21:14:27Z)
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
- Gradient Guidance for Diffusion Models: An Optimization Perspective [45.6080199096424]
This paper studies a form of gradient guidance for adapting a pre-trained diffusion model towards optimizing user-specified objectives.
We establish a mathematical framework for guided diffusion to systematically study its optimization theory and algorithmic design.
arXiv Detail & Related papers (2024-04-23T04:51:02Z)
- Convergence Analysis of Flow Matching in Latent Space with Transformers [7.069772598731282]
We present theoretical convergence guarantees for ODE-based generative models, specifically flow matching.
We use a pre-trained autoencoder network to map high-dimensional original inputs to a low-dimensional latent space, where a transformer network is trained to predict the velocity field of the transformation from a standard normal distribution to the target latent distribution.
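For context, the velocity-field objective this entry analyzes is, in its generic conditional flow matching form, a simple regression along a linear interpolation path; the PyTorch sketch below illustrates that textbook objective, with `model` a hypothetical velocity network rather than the paper's transformer:

```python
import torch

def flow_matching_loss(model, x1):
    """Textbook conditional flow matching loss with a linear path.

    A generic sketch, not code from the paper; `model(x_t, t)` is
    assumed to predict a velocity field of the same shape as x1.
    """
    x0 = torch.randn_like(x1)                             # source: standard normal
    t = torch.rand(x1.shape[0], *([1] * (x1.dim() - 1)))  # per-sample time in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1                         # point on the straight-line path
    target_velocity = x1 - x0                             # d/dt of the path (constant here)
    return ((model(x_t, t) - target_velocity) ** 2).mean()
```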
arXiv Detail & Related papers (2024-04-03T07:50:53Z)
- Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
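As background on the baseline this entry adapts, textbook AIS accumulates incremental log-weights along a sequence of geometric bridges between a tractable density and the target; the sketch below shows that standard update (with hypothetical `log_p0`, `log_p1`, and `transition` placeholders), not the paper's constant-rate variant:

```python
import numpy as np

def ais_weights(log_p0, log_p1, transition, x, betas):
    """Textbook AIS along geometric bridges pi_b ~ p0^(1-b) * p1^b.

    `transition(x, log_target)` should be any MCMC kernel leaving the
    current bridge invariant (e.g., a Metropolis step); it is a
    placeholder here. Returns samples and per-sample log weights.
    """
    log_w = np.zeros(len(x))
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        # Incremental weight: ratio of consecutive bridge densities at x.
        log_w += (b_next - b_prev) * (log_p1(x) - log_p0(x))
        # Move samples with a kernel targeting the next bridge.
        log_target = lambda y, b=b_next: (1 - b) * log_p0(y) + b * log_p1(y)
        x = transition(x, log_target)
    return x, log_w
```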
arXiv Detail & Related papers (2023-06-27T08:15:28Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)