AdjointDEIS: Efficient Gradients for Diffusion Models
- URL: http://arxiv.org/abs/2405.15020v1
- Date: Thu, 23 May 2024 19:51:33 GMT
- Title: AdjointDEIS: Efficient Gradients for Diffusion Models
- Authors: Zander W. Blasingame, Chen Liu
- Abstract summary: We propose a novel method for solving the optimization of the latents and parameters of diffusion models.
We exploit the unique construction of diffusion SDEs to further simplify the formulation of the adjoint diffusion SDE.
The proposed adjoint diffusion solvers can efficiently compute the gradients for both the probability flow ODE and diffusion SDE.
- Score: 2.0795007613453445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The optimization of the latents and parameters of diffusion models with respect to some differentiable metric defined on the output of the model is a challenging and complex problem. The sampling for diffusion models is done by solving either the probability flow ODE or diffusion SDE wherein a neural network approximates the score function or related quantity, allowing a numerical ODE/SDE solver to be used. However, naive backpropagation techniques are memory intensive, requiring the storage of all intermediate states, and face additional complexity in handling the injected noise from the diffusion term of the diffusion SDE. We propose a novel method based on the stochastic adjoint sensitivity method to calculate the gradient with respect to the initial noise, conditional information, and model parameters by solving an additional SDE whose solution is the gradient of the diffusion SDE. We exploit the unique construction of diffusion SDEs to further simplify the formulation of the adjoint diffusion SDE and use a change-of-variables to simplify the solution to an exponentially weighted integral. Using this formulation we derive a custom solver for the adjoint SDE as well as the simpler adjoint ODE. The proposed adjoint diffusion solvers can efficiently compute the gradients for both the probability flow ODE and diffusion SDE for latents and parameters of the model. Lastly, we demonstrate the effectiveness of the adjoint diffusion solvers on the face morphing problem.
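The memory argument in the abstract can be illustrated with the generic continuous adjoint method on a toy ODE. The sketch below is not the AdjointDEIS solver and involves no diffusion model: it assumes a scalar ODE dx/dt = theta * x, a loss L = 0.5 * x(T)^2, and a fixed-step RK4 integrator, all chosen here purely for illustration. The names `rk4_step`, `integrate`, and `adjoint_gradients` are hypothetical. The point it shows is that the gradients with respect to the initial state and the parameter come from solving one additional ODE backward in time, without storing intermediate states.

```python
import math


def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y); h may be negative."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def integrate(f, y0, t0, t1, n_steps):
    """Integrate from t0 to t1 (forward or backward) keeping only the current state."""
    h = (t1 - t0) / n_steps
    t, y = t0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def adjoint_gradients(x0, theta, T=1.0, n_steps=200):
    """Return (loss, dL/dx0, dL/dtheta) for dx/dt = theta * x, L = 0.5 * x(T)^2."""
    # Forward pass: only the final state x(T) is kept.
    (xT,) = integrate(lambda t, y: [theta * y[0]], [x0], 0.0, T, n_steps)
    loss = 0.5 * xT ** 2

    # Augmented backward system with state [x, a, g]:
    #   dx/dt = theta * x          (re-solves the state backward in time)
    #   da/dt = -a * df/dx         (adjoint state, a(T) = dL/dx(T) = x(T))
    #   dg/dt = -a * df/dtheta     (accumulates dL/dtheta, g(T) = 0)
    def augmented(t, y):
        x, a, g = y
        return [theta * x, -a * theta, -a * x]

    _, grad_x0, grad_theta = integrate(augmented, [xT, xT, 0.0], T, 0.0, n_steps)
    return loss, grad_x0, grad_theta


if __name__ == "__main__":
    loss, gx, gt = adjoint_gradients(x0=1.5, theta=0.7)
    # Closed-form reference for this toy problem: x(T) = x0 * exp(theta * T).
    print(gx, 1.5 * math.exp(1.4))
    print(gt, 1.5 ** 2 * math.exp(1.4))
```

For this linear toy problem the closed-form gradients are dL/dx0 = x0 * exp(2 * theta * T) and dL/dtheta = T * x0^2 * exp(2 * theta * T), so the script prints each adjoint estimate next to its reference value. Handling the injected noise of a diffusion SDE, which the abstract identifies as the harder case, is outside the scope of this sketch.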
Related papers
- Closing the ODE-SDE gap in score-based diffusion models through the Fokker-Planck equation [0.562479170374811]
We rigorously describe the range of dynamics and approximations that arise when training score-based diffusion models.
We show numerically that conventional score-based diffusion models can exhibit significant differences between ODE- and SDE-induced distributions.
arXiv Detail & Related papers (2023-11-27T16:44:50Z)
- Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z)
- Elucidating the solution space of extended reverse-time SDE for diffusion models [54.23536653351234]
Diffusion models (DMs) demonstrate potent image generation capabilities in various generative modeling tasks.
Their primary limitation lies in slow sampling speed, requiring hundreds or thousands of sequential function evaluations to generate high-quality images.
We formulate the sampling process as an extended reverse-time SDE, unifying prior explorations into ODEs and SDEs.
We devise fast and training-free samplers, ER-SDE-Solvers, achieving state-of-the-art performance across all samplers.
arXiv Detail & Related papers (2023-09-12T12:27:17Z)
- SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models [66.67616086310662]
Diffusion Probabilistic Models (DPMs) have achieved considerable success in generation tasks.
As sampling from DPMs is equivalent to solving a diffusion SDE or ODE, which is time-consuming, numerous fast sampling methods built upon improved differential equation solvers have been proposed.
We propose SA-Solver, an improved and efficient stochastic Adams method for solving the diffusion SDE to generate high-quality data.
arXiv Detail & Related papers (2023-09-10T12:44:54Z)
- Latent SDEs on Homogeneous Spaces [9.361372513858043]
We consider the problem of variational Bayesian inference in a latent variable model where a (possibly complex) observed geometric process is governed by the solution of a latent stochastic differential equation (SDE).
Experiments demonstrate that a latent SDE of the proposed type can be learned efficiently by means of an existing one-step Euler-Maruyama scheme.
arXiv Detail & Related papers (2023-06-28T14:18:52Z)
- Eliminating Lipschitz Singularities in Diffusion Models [51.806899946775076]
We show that diffusion models frequently exhibit infinite Lipschitz constants near the zero point of timesteps.
This poses a threat to the stability and accuracy of the diffusion process, which relies on integral operations.
We propose a novel approach, dubbed E-TSDM, which eliminates the Lipschitz singularities of the diffusion model near zero.
arXiv Detail & Related papers (2023-06-20T03:05:28Z)
- A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data.
This is, however, challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable.
We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z)
- Score-based Generative Modeling Through Backward Stochastic Differential Equations: Inversion and Generation [6.2255027793924285]
The proposed BSDE-based diffusion model represents a novel approach to diffusion modeling, which extends the application of stochastic differential equations (SDEs) in machine learning.
We demonstrate the theoretical guarantees of the model, the benefits of using Lipschitz networks for score matching, and its potential applications in various areas such as diffusion inversion, conditional diffusion, and uncertainty quantification.
arXiv Detail & Related papers (2023-04-26T01:15:35Z)
- Large-Scale Wasserstein Gradient Flows [84.73670288608025]
We introduce a scalable scheme to approximate Wasserstein gradient flows.
Our approach relies on input convex neural networks (ICNNs) to discretize the JKO steps.
As a result, we can sample from the measure at each step of the gradient diffusion and compute its density.
arXiv Detail & Related papers (2021-06-01T19:21:48Z)