Related papers: Learning to Discretize Denoising Diffusion ODEs

Learning to Discretize Denoising Diffusion ODEs

URL: http://arxiv.org/abs/2405.15506v1
Date: Fri, 24 May 2024 12:51:23 GMT
Title: Learning to Discretize Denoising Diffusion ODEs
Authors: Vinh Tong, Anji Liu, Trung-Dung Hoang, Guy Van den Broeck, Mathias Niepert,
Abstract summary: We propose LD3, a framework for learning time discretization while sampling from the diffusion ODE encapsulated by DPMs. We demonstrate analytically and empirically that LD3 enhances sampling efficiency compared to distillation-based methods.
Score: 41.50816120270017
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion Probabilistic Models (DPMs) are powerful generative models showing competitive performance in various domains, including image synthesis and 3D point cloud generation. However, sampling from pre-trained DPMs involves multiple neural function evaluations (NFE) to transform Gaussian noise samples into images, resulting in higher computational costs compared to single-step generative models such as GANs or VAEs. Therefore, a crucial problem is to reduce NFE while preserving generation quality. To this end, we propose LD3, a lightweight framework for learning time discretization while sampling from the diffusion ODE encapsulated by DPMs. LD3 can be combined with various diffusion ODE solvers and consistently improves performance without retraining resource-intensive neural networks. We demonstrate analytically and empirically that LD3 enhances sampling efficiency compared to distillation-based methods, without the extensive computational overhead. We evaluate our method with extensive experiments on 5 datasets, covering unconditional and conditional sampling in both pixel-space and latent-space DPMs. For example, in about 5 minutes of training on a single GPU, our method reduces the FID score from 6.63 to 2.68 on CIFAR10 (7 NFE), and in around 20 minutes, decreases the FID from 8.51 to 5.03 on class-conditional ImageNet-256 (5 NFE). LD3 complements distillation methods, offering a more efficient approach to sampling from pre-trained diffusion models.

Related papers

UAVTwin: Neural Digital Twins for UAVs using Gaussian Splatting [57.63613048492219]
We present UAVTwin, a method for creating digital twins from real-world environments and facilitating data augmentation for training downstream models embedded in unmanned aerial vehicles (UAVs) This is achieved by integrating 3D Gaussian Splatting (3DGS) for reconstructing backgrounds along with controllable synthetic human models that display diverse appearances and actions in multiple poses.
arXiv Detail & Related papers (2025-04-02T22:17:30Z)
Distilling Multi-view Diffusion Models into 3D Generators [4.3238419212557115]
We introduce DD3G, a formulation that Distills a multi-view Diffusion model (MV-DM) into a 3D Generator using gaussian splatting. DD3G compresses and integrates extensive visual and spatial knowledge from the MV-DM. We propose PEPD, a generator consisting of Pattern Extraction and Progressive Decoding phases, which enables efficient fusion of probabilistic flow.
arXiv Detail & Related papers (2025-04-01T06:32:48Z)
DSplats: 3D Generation by Denoising Splats-Based Multiview Diffusion Models [67.50989119438508]
We introduce DSplats, a novel method that directly denoises multiview images using Gaussian-based Reconstructors to produce realistic 3D assets. Our experiments demonstrate that DSplats not only produces high-quality, spatially consistent outputs, but also sets a new standard in single-image to 3D reconstruction.
arXiv Detail & Related papers (2024-12-11T07:32:17Z)
Efficient NeRF Optimization -- Not All Samples Remain Equally Hard [9.404889815088161]
We propose an application of online hard sample mining for efficient training of Neural Radiance Fields (NeRF) NeRF models produce state-of-the-art quality for many 3D reconstruction and rendering tasks but require substantial computational resources.
arXiv Detail & Related papers (2024-08-06T13:49:01Z)
cDVGAN: One Flexible Model for Multi-class Gravitational Wave Signal and Glitch Generation [0.7853804618032806]
We present a novel conditional model in the Generative Adrial Network framework for simulating multiple classes of time-domain observations. Our proposed cDVGAN outperforms 4 different baseline GAN models in replicating the features of the three classes. Our experiments show that training convolutional neural networks with our cDVGAN-generated data improves the detection of samples embedded in detector noise.
arXiv Detail & Related papers (2024-01-29T17:59:26Z)
StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances. First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss. Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z)
DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics [23.030972042695275]
Diffusion models (DPMs) have exhibited excellent performance for high-fidelity image generation while suffering from inefficient sampling. Recent works accelerate the sampling procedure by proposing fast ODE solvers that leverage the specific ODE form of DPMs. We propose a novel formulation towards the optimal parameterization during sampling that minimizes the first-order discretization error.
arXiv Detail & Related papers (2023-10-20T04:23:12Z)
Parallel Sampling of Diffusion Models [76.3124029406809]
Diffusion models are powerful generative models but suffer from slow sampling. We present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel.
arXiv Detail & Related papers (2023-05-25T17:59:42Z)
Fast Sampling of Diffusion Models via Operator Learning [74.37531458470086]
We use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion models. Compared to other fast sampling methods that have a sequential nature, we are the first to propose a parallel decoding method. We show our method achieves state-of-the-art FID of 3.78 for CIFAR-10 and 7.83 for ImageNet-64 in the one-model-evaluation setting.
arXiv Detail & Related papers (2022-11-24T07:30:27Z)
BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis [45.58131296169655]
Diffusion probabilistic models (DPMs) and their extensions have emerged as competitive generative models yet confront challenges of efficient sampling. We propose a new bilateral denoising diffusion model that parameterizes both the forward and reverse processes with a schedule network and a score network. We show that the new surrogate objective can achieve a lower bound of the log marginal likelihood tighter than a conventional surrogate.
arXiv Detail & Related papers (2022-03-25T08:53:12Z)
Bilateral Denoising Diffusion Models [34.507876199641665]
Denoising diffusion probabilistic models (DDPMs) have emerged as competitive generative models. We propose novel bilateral denoising diffusion models (BDDMs) which take significantly fewer steps to generate high-quality samples.
arXiv Detail & Related papers (2021-08-26T13:23:41Z)
Hyperspectral Classification Based on Lightweight 3-D-CNN With Transfer Learning [67.40866334083941]
We propose an end-to-end 3-D lightweight convolutional neural network (CNN) for limited samples-based HSI classification. Compared with conventional 3-D-CNN models, the proposed 3-D-LWNet has a deeper network structure, less parameters, and lower computation cost. Our model achieves competitive performance for HSI classification compared to several state-of-the-art methods.
arXiv Detail & Related papers (2020-12-07T03:44:35Z)
Denoising Diffusion Implicit Models [117.03720513930335]
We present denoising diffusion implicit models (DDIMs) for iterative implicit probabilistic models with the same training procedure as DDPMs. DDIMs can produce high quality samples $10 times$ to $50 times$ faster in terms of wall-clock time compared to DDPMs.
arXiv Detail & Related papers (2020-10-06T06:15:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.