Bespoke Solvers for Generative Flow Models
- URL: http://arxiv.org/abs/2310.19075v1
- Date: Sun, 29 Oct 2023 16:58:31 GMT
- Title: Bespoke Solvers for Generative Flow Models
- Authors: Neta Shaul, Juan Perez, Ricky T. Q. Chen, Ali Thabet, Albert Pumarola,
Yaron Lipman
- Abstract summary: Existing methods to alleviate the costly sampling process include model distillation and designing dedicated ODE solvers.
"Bespoke solvers" are a novel framework for constructing custom ODE solvers tailored to the ODE of a given pre-trained flow model.
- Score: 33.20695061095209
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion or flow-based models are powerful generative paradigms that are
notoriously hard to sample from, as samples are defined as solutions to
high-dimensional Ordinary or Stochastic Differential Equations (ODEs/SDEs)
which require a large Number of Function Evaluations (NFE) to approximate well.
Existing methods to alleviate the costly sampling process include model
distillation and designing dedicated ODE solvers. However, distillation is
costly to train and sometimes can deteriorate quality, while dedicated solvers
still require relatively large NFE to produce high quality samples. In this
paper we introduce "Bespoke solvers", a novel framework for constructing custom
ODE solvers tailored to the ODE of a given pre-trained flow model. Our approach
optimizes an order-consistent and parameter-efficient solver (e.g., with 80
learnable parameters), is trained for roughly 1% of the GPU time required for
training the pre-trained model, and significantly improves approximation and
generation quality compared to dedicated solvers. For example, a Bespoke solver
for a CIFAR10 model produces samples with Fréchet Inception Distance (FID) of
2.73 with 10 NFE, and gets to within 1% of the Ground Truth (GT) FID (2.59) for
this model with only 20 NFE. On the more challenging ImageNet 64×64, Bespoke
samples at 2.2 FID with 10 NFE, and gets within 2% of the GT FID (1.71) with
20 NFE.
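The abstract only describes the method at a high level. As a rough illustration of the underlying idea (not the authors' algorithm, which additionally enforces order consistency and learns a transformation of the sampling path), the following minimal PyTorch sketch freezes a flow model's velocity field, parameterizes a low-NFE solver with a handful of learnable quantities (time points and per-step scales here), and trains them so the cheap solver matches a fine-grained reference solution. The toy velocity field, the parameterization, and all hyperparameters are assumptions made for this example.

```python
# Illustrative sketch of a "bespoke"-style solver: learn a few per-step
# parameters so that an n-step Euler-like scheme matches a many-step
# reference solution of a frozen flow ODE.
import torch

def velocity(x, t):
    # Stand-in for the frozen, pre-trained flow model u_theta(x, t);
    # the toy field below is an assumption made for this example.
    return -x * (1.0 - t)

def solve(x0, times, scales):
    # Parameter-efficient candidate solver: an Euler-style scheme whose
    # step locations and per-step scales are the learnable parameters.
    x = x0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        x = x + scales[i] * dt * velocity(x, times[i])
    return x

def reference(x0, n_fine=1000):
    # "Ground truth" endpoint: many small Euler steps (high NFE).
    x, ts = x0, torch.linspace(0.0, 1.0, n_fine + 1)
    for i in range(n_fine):
        x = x + (ts[i + 1] - ts[i]) * velocity(x, ts[i])
    return x

n_steps = 10                                            # low-NFE budget
raw_times = torch.nn.Parameter(torch.zeros(n_steps))    # learnable step spacing
scales = torch.nn.Parameter(torch.ones(n_steps))        # learnable per-step scales
opt = torch.optim.Adam([raw_times, scales], lr=1e-2)

for step in range(500):
    x0 = torch.randn(256, 2)                            # noise samples
    # Build a monotone time grid on [0, 1] from the learnable spacing.
    interior = torch.cumsum(torch.softmax(raw_times, dim=0), dim=0)[:-1]
    times = torch.cat([torch.zeros(1), interior, torch.ones(1)])
    with torch.no_grad():
        target = reference(x0)                          # fine reference solution
    loss = torch.mean((solve(x0, times, scales) - target) ** 2)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Only the few solver parameters are trained here, which is why, as the abstract notes, such an optimization costs a small fraction of the GPU time used to train the flow model itself.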
Related papers
- Inductive Moment Matching [80.96561758341664]
We propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure.
IMM surpasses diffusion models on ImageNet-256x256 with 1.99 FID using only 8 inference steps and achieves state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
arXiv Detail & Related papers (2025-03-10T17:37:39Z)
- S4S: Solving for a Diffusion Model Solver [52.99341671532249]
Diffusion models (DMs) create samples from a data distribution by starting from random noise and solving a reverse-time ordinary differential equation (ODE).
We propose a new method that learns a good solver for the DM, which we call Solving for the Solver (S4S).
In all settings, S4S uniformly improves the sample quality relative to traditional ODE solvers.
arXiv Detail & Related papers (2025-02-24T18:55:54Z)
- One-Step Diffusion Distillation through Score Implicit Matching [74.91234358410281]
We present Score Implicit Matching (SIM), a new approach to distilling pre-trained diffusion models into single-step generator models.
SIM shows strong empirical performances for one-step generators.
By applying SIM to a leading transformer-based diffusion model, we distill a single-step generator for text-to-image generation.
arXiv Detail & Related papers (2024-10-22T08:17:20Z)
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along probability flow (PF) ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
- Simple and Fast Distillation of Diffusion Models [39.79747569096888]
We propose Simple and Fast Distillation (SFD) of diffusion models, which simplifies the paradigm used in existing methods.
SFD achieves 4.53 FID (NFE=2) on CIFAR-10 with only 0.64 hours of fine-tuning on a single NVIDIA A100 GPU.
arXiv Detail & Related papers (2024-09-29T12:13:06Z)
- Directly Denoising Diffusion Models [6.109141407163027]
We present Directly Denoising Diffusion Model (DDDM), a simple and generic approach for generating realistic images with few-step sampling.
Our model achieves FID scores of 2.57 and 2.33 on CIFAR-10 in one-step and two-step sampling respectively, surpassing those obtained from GANs and distillation-based models.
For ImageNet 64x64, our approach stands as a competitive contender against leading models.
arXiv Detail & Related papers (2024-05-22T11:20:32Z)
- A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models [21.305868355976394]
In this paper, we propose a unified sampling framework (USF) to study the optional solver strategies at each timestep.
Under this framework, we reveal that taking different solving strategies at different timesteps may help further decrease the truncation error.
We demonstrate that S^3 can find outstanding solver schedules which outperform state-of-the-art sampling methods.
arXiv Detail & Related papers (2023-12-12T13:19:40Z)
- Improved Techniques for Training Consistency Models [13.475711217989975]
We present improved techniques for consistency training, where consistency models learn directly from data without distillation.
We propose a lognormal noise schedule for the consistency training objective, and propose to double the total number of discretization steps every set number of training iterations.
These modifications enable consistency models to achieve FID scores of 2.51 and 3.25 on CIFAR-10 and ImageNet 64×64 respectively in a single sampling step.
arXiv Detail & Related papers (2023-10-22T05:33:38Z)
- Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion [56.38386580040991]
Consistency Trajectory Model (CTM) is a generalization of Consistency Models (CM).
CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance.
Unlike CM, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods.
arXiv Detail & Related papers (2023-10-01T05:07:17Z)
- Distilling ODE Solvers of Diffusion Models into Smaller Steps [32.49916706943228]
We introduce Distilled-ODE solvers, a straightforward distillation approach grounded in ODE solver formulations.
Our method seamlessly integrates the strengths of both learning-free and learning-based sampling.
Our method incurs negligible computational overhead compared to previous distillation techniques.
arXiv Detail & Related papers (2023-09-28T13:12:18Z)
- Consistency Models [89.68380014789861]
We propose a new family of models that generate high quality samples by directly mapping noise to data.
They support fast one-step generation by design, while still allowing multistep sampling to trade compute for sample quality (a minimal sketch of this sampling loop appears after this list).
They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training.
arXiv Detail & Related papers (2023-03-02T18:30:16Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained on pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained on the latent-space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
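The Consistency Models entry above describes one-step and multistep sampling only in words; the snippet below is a minimal sketch of that sampling loop. It assumes a trained consistency function `consistency_fn(x, sigma)` and an illustrative noise schedule; the published re-noising rule and schedules differ in detail, so treat this as a sketch rather than the authors' sampler.

```python
# Minimal sketch of multistep consistency sampling: start from pure noise,
# map it to a data estimate, then re-noise to a smaller noise level and repeat.
import torch

def multistep_consistency_sample(consistency_fn, shape, sigmas):
    # sigmas: decreasing noise levels, e.g. [80.0, 24.0, 5.8, 0.7] (assumed values).
    x = sigmas[0] * torch.randn(shape)             # start from pure noise
    sample = consistency_fn(x, sigmas[0])          # one-step generation
    for sigma in sigmas[1:]:
        x = sample + sigma * torch.randn(shape)    # re-noise the current estimate
        sample = consistency_fn(x, sigma)          # refine with one more evaluation
    return sample
```

Each extra iteration costs one additional function evaluation, which is the compute-for-quality trade-off the entry refers to.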
This list is automatically generated from the titles and abstracts of the papers on this site.