Bellman Optimal Stepsize Straightening of Flow-Matching Models
- URL: http://arxiv.org/abs/2312.16414v3
- Date: Tue, 20 Feb 2024 14:25:25 GMT
- Title: Bellman Optimal Stepsize Straightening of Flow-Matching Models
- Authors: Bao Nguyen, Binh Nguyen, Viet Anh Nguyen
- Abstract summary: This paper introduces the Bellman Optimal Stepsize Straightening (BOSS) technique for distilling flow-matching generative models.
BOSS targets few-step, efficient image sampling while adhering to a computational budget constraint.
Our results reveal that BOSS achieves substantial gains in efficiency while maintaining competitive sample quality.
- Score: 14.920260435839992
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Flow matching is a powerful framework for generating high-quality samples in
various applications, especially image synthesis. However, the intensive
computational demands of these models, particularly during the fine-tuning and
sampling processes, pose significant challenges for low-resource scenarios.
This paper introduces the Bellman Optimal Stepsize Straightening (BOSS)
technique for distilling flow-matching generative models; it targets few-step,
efficient image sampling while adhering to a computational budget
constraint. First, this technique involves a dynamic programming algorithm that
optimizes the stepsizes of the pretrained network. Then, it refines the
velocity network to match the optimal stepsizes, aiming to straighten the
generation paths. Extensive experimental evaluations across image generation
tasks demonstrate the efficacy of BOSS in terms of both resource utilization
and image quality. Our results reveal that BOSS achieves substantial gains in
efficiency while maintaining competitive sample quality, effectively bridging
the gap between low-resource constraints and the demanding requirements of
flow-matching generative models. Our paper also supports the responsible
development of artificial intelligence, offering a more sustainable generative
model that reduces computational costs and environmental footprint. Our code
can be found at https://github.com/nguyenngocbaocmt02/BOSS.
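To make the two-stage procedure in the abstract concrete, below is a minimal sketch of a Bellman-style stepsize selection, assuming a fine grid of N+1 time points and a precomputed surrogate cost matrix cost[i][j] for the discretization error of jumping from grid point i to j with the pretrained velocity network. The grid size, the cost definition, and the function name are illustrative assumptions, not the authors' reference implementation (see the official repository above for that).

```python
import numpy as np

def bellman_stepsize_selection(cost: np.ndarray, K: int) -> list:
    """Choose K jumps over a fine grid of N+1 time points (0..N) that
    minimize the total discretization cost of going from t=0 to t=N.

    cost[i, j] is an assumed precomputed error of jumping from grid point
    i to grid point j (i < j) with the pretrained velocity network.
    """
    N = cost.shape[0] - 1
    INF = float("inf")
    # best[k, j]: minimal cost of reaching grid point j in exactly k jumps
    best = np.full((K + 1, N + 1), INF)
    best[0, 0] = 0.0
    parent = np.zeros((K + 1, N + 1), dtype=int)
    for k in range(1, K + 1):
        for j in range(k, N + 1):
            # Bellman recursion: extend the best (k-1)-jump path by one jump i -> j
            cands = best[k - 1, :j] + cost[:j, j]
            i = int(np.argmin(cands))
            best[k, j] = cands[i]
            parent[k, j] = i
    # Backtrack the selected grid points from the terminal point t=N
    steps, j = [N], N
    for k in range(K, 0, -1):
        j = int(parent[k, j])
        steps.append(j)
    return steps[::-1]  # K+1 grid points: [0, i_1, ..., i_{K-1}, N]
```

Once the K stepsizes are fixed, the second stage described in the abstract would fine-tune the velocity network so that a single jump across each selected interval matches the pretrained model's multi-step trajectory, straightening the generation paths; that distillation loop is omitted here for brevity.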
Related papers
- One Step Diffusion via Shortcut Models [109.72495454280627]
We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples.
Shortcut models condition the network on the current noise level and also on the desired step size, allowing the model to skip ahead in the generation process.
Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.
arXiv Detail & Related papers (2024-10-16T13:34:40Z)
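The step-size conditioning described in the shortcut-models entry above can be illustrated with a small sketch; the architecture, embedding sizes, and names below are assumptions made for the example, not that paper's actual model.

```python
import torch
import torch.nn as nn

class StepSizeConditionedNet(nn.Module):
    """Toy network conditioned on the current noise level t and the desired
    step size d, in the spirit of the shortcut-model idea (illustrative only)."""

    def __init__(self, dim: int = 64, hidden: int = 256):
        super().__init__()
        self.cond = nn.Linear(2, hidden)  # joint embedding of (t, d)
        self.backbone = nn.Sequential(
            nn.Linear(dim + hidden, hidden),
            nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
        # x: (B, dim) current sample, t: (B,) noise level, d: (B,) step size
        c = self.cond(torch.stack([t, d], dim=-1))
        return self.backbone(torch.cat([x, c], dim=-1))

# A single "shortcut" update skips ahead by d in one network evaluation:
# x_next = x + d.unsqueeze(-1) * net(x, t, d)
```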
- Efficient Point Cloud Classification via Offline Distillation Framework and Negative-Weight Self-Distillation Technique [46.266960248570086]
We introduce an innovative offline recording strategy that avoids the simultaneous loading of both teacher and student models.
This approach feeds a multitude of augmented samples into the teacher model, recording both the data augmentation parameters and the corresponding logit outputs.
Experimental results demonstrate that the proposed distillation strategy enables the student model to achieve performance comparable to state-of-the-art models.
arXiv Detail & Related papers (2024-09-03T16:12:12Z)
- Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis [82.72941975704374]
Non-autoregressive Transformers (NATs) have been recognized for their rapid generation.
We re-evaluate the full potential of NATs by revisiting the design of their training and inference strategies.
We propose to go beyond existing methods by directly solving for the optimal strategies in an automatic framework.
arXiv Detail & Related papers (2024-06-08T13:52:20Z)
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
- Model-Agnostic Human Preference Inversion in Diffusion Models [31.992947353231564]
We propose a novel sampling design to achieve high-quality one-step image generation aligned with human preferences.
Our approach, Prompt Adaptive Human Preference Inversion (PAHI), optimizes the noise distributions for each prompt based on human preferences.
Our experiments showcase that the tailored noise distributions significantly improve image quality with only a marginal increase in computational cost.
arXiv Detail & Related papers (2024-04-01T03:18:12Z)
- A-SDM: Accelerating Stable Diffusion through Redundancy Removal and Performance Optimization [54.113083217869516]
In this work, we first identify the computationally redundant parts of the network.
We then prune the redundant blocks of the model while maintaining network performance.
Third, we propose a global-regional interactive (GRI) attention mechanism to speed up the computationally intensive attention part.
arXiv Detail & Related papers (2023-12-24T15:37:47Z)
- One-Step Diffusion Distillation via Deep Equilibrium Models [64.11782639697883]
We introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image.
Our method enables fully offline training with just noise/image pairs from the diffusion model.
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a $5\times$ larger ViT in terms of FID scores.
arXiv Detail & Related papers (2023-12-12T07:28:40Z)
- EDGE++: Improved Training and Sampling of EDGE [17.646159460584926]
We propose enhancements to the EDGE model to address these issues.
Specifically, we introduce a degree-specific noise schedule that optimizes the number of active nodes at each timestep.
We also present an improved sampling scheme that fine-tunes the generative process, allowing for better control over the similarity between the synthesized and the true network.
arXiv Detail & Related papers (2023-10-22T22:54:20Z)
- Flow Matching in Latent Space [2.9330609943398525]
Flow matching is a framework for training generative models that exhibits impressive empirical performance.
We propose to apply flow matching in the latent spaces of pretrained autoencoders, which offers improved computational efficiency.
Our work stands as a pioneering contribution in the integration of various conditions into flow matching for conditional generation tasks.
arXiv Detail & Related papers (2023-07-17T17:57:56Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design, ensuring that a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions, replacing time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization [75.72231742114951]
Large-scale pre-trained sequence-to-sequence models like BART and T5 achieve state-of-the-art performance on many generative NLP tasks.
These models pose a great challenge in resource-constrained scenarios owing to their large memory requirements and high latency.
We propose to jointly distill and quantize the model, where knowledge is transferred from the full-precision teacher model to the quantized and distilled low-precision student model.
arXiv Detail & Related papers (2022-03-21T18:04:25Z)
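For the joint distillation-and-quantization idea in the DQ-BART entry above, a minimal sketch of the two ingredients follows; the symmetric fake-quantization scheme, bit width, temperature, and function names are assumptions for illustration, not that paper's actual recipe.

```python
import torch
import torch.nn.functional as F

def fake_quantize(w: torch.Tensor, bits: int = 8) -> torch.Tensor:
    """Symmetric per-tensor fake quantization of a weight tensor (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence transferring knowledge from the full-precision teacher
    to the low-precision (fake-quantized) student."""
    T = temperature
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
```

In a training loop, the student's weights would be fake-quantized in the forward pass and the distillation loss computed against the frozen teacher's logits, so that quantization and distillation are optimized jointly.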