Inductive Moment Matching
- URL: http://arxiv.org/abs/2503.07565v6
- Date: Mon, 31 Mar 2025 16:02:38 GMT
- Title: Inductive Moment Matching
- Authors: Linqi Zhou, Stefano Ermon, Jiaming Song
- Abstract summary: We propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. IMM surpasses diffusion models on ImageNet-256x256 with 1.99 FID using only 8 inference steps and achieves state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
- Score: 80.96561758341664
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models and Flow Matching generate high-quality samples but are slow at inference, and distilling them into few-step models often leads to instability and extensive tuning. To resolve these trade-offs, we propose Inductive Moment Matching (IMM), a new class of generative models for one- or few-step sampling with a single-stage training procedure. Unlike distillation, IMM does not require pre-training initialization and optimization of two networks; and unlike Consistency Models, IMM guarantees distribution-level convergence and remains stable under various hyperparameters and standard model architectures. IMM surpasses diffusion models on ImageNet-256x256 with 1.99 FID using only 8 inference steps and achieves state-of-the-art 2-step FID of 1.98 on CIFAR-10 for a model trained from scratch.
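The abstract describes IMM only at a high level; as an illustration of the general moment-matching idea (training by matching statistics of two sample distributions), here is a minimal kernel-MMD sketch in NumPy. This is NOT the paper's IMM objective; the function names and the RBF bandwidth are assumptions for illustration only.

```python
# Illustrative sketch of a kernel-based moment-matching (MMD) discrepancy,
# the kind of distribution-level objective that moment-matching generative
# models build on. Not the paper's exact IMM loss.
import numpy as np

def rbf_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix between two sample batches of shape (n, d)."""
    sq_dists = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd_squared(samples_p, samples_q, bandwidth=1.0):
    """Biased squared-MMD estimate between two empirical distributions."""
    k_pp = rbf_kernel(samples_p, samples_p, bandwidth)
    k_qq = rbf_kernel(samples_q, samples_q, bandwidth)
    k_pq = rbf_kernel(samples_p, samples_q, bandwidth)
    return k_pp.mean() + k_qq.mean() - 2.0 * k_pq.mean()

rng = np.random.default_rng(0)
# Two batches from the same distribution vs. one batch shifted by 3.
same = mmd_squared(rng.normal(size=(256, 2)), rng.normal(size=(256, 2)))
shifted = mmd_squared(rng.normal(size=(256, 2)), rng.normal(3.0, 1.0, size=(256, 2)))
print(same < shifted)  # matched distributions give a smaller MMD
```

Minimizing such a discrepancy between the model's few-step pushforward and a target distribution gives distribution-level (rather than pointwise) training signal, which is the sense in which the abstract contrasts IMM with consistency models.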
Related papers
- One-Step Diffusion Distillation through Score Implicit Matching [74.91234358410281]
We present Score Implicit Matching (SIM), a new approach to distilling pre-trained diffusion models into single-step generator models.
SIM shows strong empirical performance for one-step generators.
By applying SIM to a leading transformer-based diffusion model, we distill a single-step generator for text-to-image generation.
arXiv Detail & Related papers (2024-10-22T08:17:20Z)
- One Step Diffusion via Shortcut Models [109.72495454280627]
We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples.
Shortcut models condition the network on the current noise level and also on the desired step size, allowing the model to skip ahead in the generation process.
Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.
arXiv Detail & Related papers (2024-10-16T13:34:40Z)
- Directly Denoising Diffusion Models [6.109141407163027]
We present Directly Denoising Diffusion Model (DDDM), a simple and generic approach for generating realistic images with few-step sampling.
Our model achieves FID scores of 2.57 and 2.33 on CIFAR-10 in one-step and two-step sampling respectively, surpassing those obtained from GANs and distillation-based models.
For ImageNet 64x64, our approach stands as a competitive contender against leading models.
arXiv Detail & Related papers (2024-05-22T11:20:32Z)
- Multistep Consistency Models [24.443707181138553]
A 1-step consistency model is a conventional consistency model, whereas an $\infty$-step consistency model is a diffusion model.
By increasing the sample budget from a single step to 2-8 steps, we can more easily train models that generate higher-quality samples.
We show that our method scales to a text-to-image diffusion model, generating samples that are close to the quality of the original model.
arXiv Detail & Related papers (2024-03-11T15:26:34Z)
- Bespoke Solvers for Generative Flow Models [33.20695061095209]
Existing methods to alleviate the costly sampling process include model distillation and designing dedicated ODE solvers.
"Bespoke solvers" are a novel framework for constructing custom ODE solvers tailored to the ODE of a given pre-trained flow model.
arXiv Detail & Related papers (2023-10-29T16:58:31Z)
- Improved Techniques for Training Consistency Models [13.475711217989975]
We present improved techniques for consistency training, where consistency models learn directly from data without distillation.
We propose a lognormal noise schedule for the consistency training objective, and propose doubling the total discretization steps every set number of training iterations.
These modifications enable consistency models to achieve FID scores of 2.51 and 3.25 on CIFAR-10 and ImageNet $64\times 64$ respectively in a single sampling step.
arXiv Detail & Related papers (2023-10-22T05:33:38Z)
- Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion [56.38386580040991]
Consistency Trajectory Model (CTM) is a generalization of Consistency Models (CM).
CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance.
Unlike CM, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods.
arXiv Detail & Related papers (2023-10-01T05:07:17Z)
- Consistency Models [89.68380014789861]
We propose a new family of models that generate high quality samples by directly mapping noise to data.
They support fast one-step generation by design, while still allowing multistep sampling to trade compute for sample quality.
They also support zero-shot data editing, such as image inpainting, colorization, and super-resolution, without requiring explicit training.
arXiv Detail & Related papers (2023-03-02T18:30:16Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
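The distillation target in the last paper is a classifier-free guided diffusion model. The standard classifier-free guidance rule, which such a distilled model must absorb into a single network, extrapolates from the unconditional prediction toward the conditional one. A minimal sketch, where `eps_cond` and `eps_uncond` are hypothetical stand-ins for a denoiser's two noise predictions (not code from any of the papers above):

```python
# Sketch of the classifier-free guidance combination rule.
import numpy as np

def guided_eps(eps_cond, eps_uncond, w):
    """Extrapolate from the unconditional noise prediction toward the
    conditional one with guidance weight w."""
    return eps_uncond + w * (eps_cond - eps_uncond)

eps_c = np.array([1.0, 0.0])  # conditional prediction (toy values)
eps_u = np.array([0.0, 0.0])  # unconditional prediction (toy values)
print(guided_eps(eps_c, eps_u, 1.0))  # w=1 recovers the conditional prediction
print(guided_eps(eps_c, eps_u, 0.0))  # w=0 recovers the unconditional prediction
```

At sampling time the guided prediction replaces the raw model output in each denoising step, which is why undistilled guided sampling costs two network evaluations per step and why distilling it into one evaluation is attractive.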
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.