Iterative Tilting for Diffusion Fine-Tuning
- URL: http://arxiv.org/abs/2512.03234v1
- Date: Tue, 02 Dec 2025 21:07:46 GMT
- Title: Iterative Tilting for Diffusion Fine-Tuning
- Authors: Jean Pachebat, Giovanni Conforti, Alain Durmus, Yazid Janati
- Abstract summary: Iterative tilting is a gradient-free method for fine-tuning diffusion models toward reward-tilted distributions. We validate on a two-dimensional Gaussian mixture with linear reward, where the exact tilted distribution is available in closed form.
- Score: 14.620291917371937
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce iterative tilting, a gradient-free method for fine-tuning diffusion models toward reward-tilted distributions. The method decomposes a large reward tilt $\exp(\lambda r)$ into $N$ sequential smaller tilts, each admitting a tractable score update via first-order Taylor expansion. This requires only forward evaluations of the reward function and avoids backpropagating through sampling chains. We validate on a two-dimensional Gaussian mixture with linear reward, where the exact tilted distribution is available in closed form.
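
The validation setting mentioned in the abstract can be illustrated with a short sketch. It is not the authors' algorithm: it contains no diffusion model, no score update, and no Taylor expansion. It only shows, under stated assumptions, why a Gaussian mixture with linear reward $r(x) = a^\top x$ has a closed-form tilted distribution, and that $N$ sequential tilts of size $\lambda/N$ compose exactly to one tilt of size $\lambda$. Tilting a component $\mathcal{N}(\mu_k, \Sigma_k)$ by $\exp(\lambda a^\top x)$ shifts its mean to $\mu_k + \lambda \Sigma_k a$ and multiplies its weight by $\exp(\lambda a^\top \mu_k + \tfrac{1}{2}\lambda^2 a^\top \Sigma_k a)$. The function names, the mixture parameters, the reward direction `a`, and the values of `lam` and `n_steps` are all hypothetical choices made for illustration.

```python
import numpy as np


def tilt_gaussian_mixture(weights, means, covs, a, lam):
    """Exactly tilt a Gaussian mixture by exp(lam * a @ x) (linear reward).

    Each component N(mu_k, S_k) becomes N(mu_k + lam * S_k a, S_k), and its
    weight is multiplied by exp(lam * a @ mu_k + 0.5 * lam**2 * a @ S_k a).
    """
    weights = np.asarray(weights, dtype=float)   # shape (K,)
    means = np.asarray(means, dtype=float)       # shape (K, d)
    covs = np.asarray(covs, dtype=float)         # shape (K, d, d)
    a = np.asarray(a, dtype=float)               # shape (d,)

    cov_a = covs @ a                             # shape (K, d): S_k a per component
    log_w = np.log(weights) + lam * (means @ a) + 0.5 * lam**2 * (cov_a @ a)
    log_w -= log_w.max()                         # stabilise before exponentiating
    new_weights = np.exp(log_w)
    new_weights /= new_weights.sum()
    return new_weights, means + lam * cov_a, covs


def iterative_tilt(weights, means, covs, a, lam, n_steps):
    """Apply n_steps sequential tilts of size lam / n_steps."""
    for _ in range(n_steps):
        weights, means, covs = tilt_gaussian_mixture(
            weights, means, covs, a, lam / n_steps
        )
    return weights, means, covs


# Hypothetical two-component mixture in two dimensions.
weights = [0.5, 0.5]
means = [[-2.0, 0.0], [2.0, 0.0]]
covs = [np.eye(2), 0.5 * np.eye(2)]
a = [1.0, 0.5]   # direction of the linear reward r(x) = a @ x
lam = 2.0        # total tilt strength

direct = tilt_gaussian_mixture(weights, means, covs, a, lam)
stepwise = iterative_tilt(weights, means, covs, a, lam, n_steps=10)

print("weights agree:", np.allclose(direct[0], stepwise[0]))
print("means agree:  ", np.allclose(direct[1], stepwise[1]))
```

In this toy case every small tilt is computed exactly, so the stepwise and direct results coincide. The point of the method described in the abstract is that each small tilt is instead approximated by a tractable score update, which an exact reference like this one can be used to check.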
Related papers
- Effective Test-Time Scaling of Discrete Diffusion through Iterative Refinement [51.54933696252104]
We introduce Iterative Reward-Guided Refinement (IterRef), a novel test-time scaling method tailored to discrete diffusion. We formalize this process within a Multiple-Try Metropolis framework, proving convergence to the reward-aligned distribution. IterRef achieves striking gains under low compute budgets, far surpassing prior state-of-the-art baselines.
arXiv Detail & Related papers (2025-11-04T02:33:23Z) - Learn to Guide Your Diffusion Model [84.82855046749657]
We study a technique for improving quality of samples from conditional diffusion models. We learn guidance weights $\omega_{c,(s,t)}$, which are functions of the conditioning $c$, the time $t$ from which we denoise, and the time $s$ towards which we denoise. We extend our framework to reward guided sampling, enabling the model to target distributions tilted by a reward function.
arXiv Detail & Related papers (2025-10-01T12:21:48Z) - Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities [93.13866975467549]
We propose Progressive Inference-Time Annealing (PITA) to learn diffusion-based samplers. PITA combines two complementary techniques: Annealing of the Boltzmann distribution and Diffusion smoothing. It enables equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates.
arXiv Detail & Related papers (2025-06-19T17:14:22Z) - Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness [51.302674884611335]
This work introduces a hybrid non-Euclidean optimization method which generalizes norm clipping by combining steepest descent and conditional gradient approaches. We discuss how to instantiate the algorithms for deep learning and demonstrate their properties on image classification and language modeling.
arXiv Detail & Related papers (2025-06-02T17:34:29Z) - A Stein Gradient Descent Approach for Doubly Intractable Distributions [5.63014864822787]
We propose a novel Monte Carlo Stein variational gradient descent (MC-SVGD) approach for inference for doubly intractable distributions. The proposed method achieves substantial computational gains over existing algorithms, while providing comparable inferential performance for the posterior distributions.
arXiv Detail & Related papers (2024-10-28T13:42:27Z) - Gradual Domain Adaptation via Manifold-Constrained Distributionally Robust Optimization [0.4732176352681218]
This paper addresses the challenge of gradual domain adaptation within a class of manifold-constrained data distributions.
We propose a methodology rooted in Distributionally Robust Optimization (DRO) with an adaptive Wasserstein radius.
Our bounds rely on a newly introduced compatibility measure, which fully characterizes the error propagation dynamics along the sequence.
arXiv Detail & Related papers (2024-10-17T22:07:25Z) - Amortized Posterior Sampling with Diffusion Prior Distillation [55.03585818289934]
Amortized Posterior Sampling is a novel variational inference approach for efficient posterior sampling in inverse problems. Our method trains a conditional flow model to minimize the divergence between the variational distribution and the posterior distribution implicitly defined by the diffusion model. Unlike existing methods, our approach is unsupervised, requires no paired training data, and is applicable to both Euclidean and non-Euclidean domains.
arXiv Detail & Related papers (2024-07-25T09:53:12Z) - Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance [52.093434664236014]
Recent diffusion models provide a promising zero-shot solution to noisy linear inverse problems without retraining for specific inverse problems.
Inspired by this finding, we propose to improve recent methods by using more principled covariance determined by maximum likelihood estimation.
arXiv Detail & Related papers (2024-02-03T13:35:39Z) - Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients [0.8749675983608172]
We present an unbiased method for posterior means based on kinetic Langevin dynamics.
Our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem.
Our results demonstrate that in large-scale applications, the unbiased algorithm we present can be 2-3 orders of magnitude more efficient than the "gold-standard" randomized Hamiltonian Monte Carlo.
arXiv Detail & Related papers (2023-11-08T21:19:52Z) - Nearly $d$-Linear Convergence Bounds for Diffusion Models via Stochastic Localization [40.808942894229325]
We provide the first convergence bounds which are linear in the data dimension.
We show that diffusion models require at most $\tilde O\left(\frac{d \log^2(1/\delta)}{\varepsilon^2}\right)$ steps to approximate an arbitrary distribution.
arXiv Detail & Related papers (2023-08-07T16:01:14Z) - Gradient Coding with Iterative Block Leverage Score Sampling [42.21200677508463]
We generalize the leverage score sampling sketch for $\ell_2$-subspace embeddings, to accommodate sampling subsets of the transformed data.
This is then used to derive an approximate coded computing approach for first-order methods.
arXiv Detail & Related papers (2023-08-06T12:22:12Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Fluctuation without dissipation: Microcanonical Langevin Monte Carlo [0.0]
Langevin Monte Carlo sampling algorithms are inspired by physical systems in a heat bath. We show that the fluctuation-dissipation theorem is not required because only the configuration space distribution, and not the full phase space distribution, needs to be canonical. We propose a continuous-time Microcanonical Langevin Monte Carlo (MCLMC) as a dissipation-free system of stochastic differential equations (SDE).
arXiv Detail & Related papers (2023-03-31T17:24:33Z) - Fast Margin Maximization via Dual Acceleration [52.62944011696364]
We present and analyze a momentum-based method for training linear classifiers with an exponentially-tailed loss.
This momentum-based method is derived via the convex dual of the maximum-margin problem, and specifically by applying Nesterov acceleration to this dual.
arXiv Detail & Related papers (2021-07-01T16:36:39Z)