Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
- URL: http://arxiv.org/abs/2410.09873v1
- Date: Sun, 13 Oct 2024 15:19:18 GMT
- Title: Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy
- Authors: Hancheng Ye, Jiakang Yuan, Renqiu Xia, Xiangchao Yan, Tao Chen, Junchi Yan, Botian Shi, Bo Zhang
- Abstract summary: We propose AdaptiveDiffusion to reduce noise prediction steps during the denoising process.
Our method can significantly speed up the denoising process while generating results identical to those of the original process, achieving up to an average 2-5x speedup.
- Score: 44.09909260046396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have recently achieved great success in the synthesis of high-quality images and videos. However, the existing denoising techniques in diffusion models are commonly based on step-by-step noise predictions, which suffer from high computation costs, resulting in prohibitive latency for interactive applications. In this paper, we propose AdaptiveDiffusion to relieve this bottleneck by adaptively reducing the number of noise prediction steps during the denoising process. Our method explores the potential of skipping as many noise prediction steps as possible while keeping the final denoised results identical to the original full-step ones. Specifically, the skipping strategy is guided by the third-order latent difference, which indicates the stability between timesteps during the denoising process and thus allows previous noise prediction results to be reused. Extensive experiments on image and video diffusion models demonstrate that our method can significantly speed up the denoising process while generating results identical to the original process, achieving up to an average 2-5x speedup without quality degradation.
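The skipping rule described in the abstract can be pictured with a short sketch. The following is a minimal, hypothetical Python rendering of the idea only: `noise_model`, `scheduler_step`, and `THRESHOLD` are illustrative placeholders rather than names from the authors' released code, and the exact criterion used in the paper may differ.

```python
# Hypothetical sketch of a third-order latent-difference skipping rule in the
# spirit of AdaptiveDiffusion. Not the authors' implementation.
import torch

THRESHOLD = 0.01  # assumed stability threshold; would need tuning per model


def adaptive_denoise(latent, timesteps, noise_model, scheduler_step):
    """Denoising loop that reuses the previous noise prediction whenever the
    third-order difference of recent latents indicates a stable trajectory."""
    history = []          # most recent previous latents, newest last
    cached_noise = None   # last actually computed noise prediction
    for t in timesteps:
        skip = False
        if len(history) >= 3 and cached_noise is not None:
            # Third-order finite difference of the latent trajectory:
            # x_t - 3*x_{t-1} + 3*x_{t-2} - x_{t-3}
            d3 = latent - 3 * history[-1] + 3 * history[-2] - history[-3]
            skip = d3.abs().mean().item() < THRESHOLD
        if skip:
            noise = cached_noise            # reuse the previous prediction
        else:
            noise = noise_model(latent, t)  # run the full noise prediction
            cached_noise = noise
        history.append(latent)
        history = history[-3:]              # keep only what the difference needs
        latent = scheduler_step(latent, noise, t)
    return latent
```

In this sketch, a small third-order difference is read as a sign that the denoising trajectory is locally smooth, so skipping the expensive model call and reusing the cached prediction should leave the final result essentially unchanged; when the difference grows, the model is queried again.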
Related papers
- Conditional GAN for Enhancing Diffusion Models in Efficient and Authentic Global Gesture Generation from Audios [10.57695963534794]
Methods based on VAEs suffer from local jitter and global instability.
We introduce a conditional GAN to capture audio control signals and implicitly match the multimodal denoising distribution between the diffusion and denoising steps.
arXiv Detail & Related papers (2024-10-27T07:25:11Z)
- Combining Pre- and Post-Demosaicking Noise Removal for RAW Video [2.772895608190934]
Denoising is one of the fundamental steps of the processing pipeline that converts data captured by a camera sensor into a display-ready image or video.
We propose a self-similarity-based denoising scheme that weights both a pre- and a post-demosaicking denoiser for Bayer-patterned CFA video data.
We show that a balance between the two leads to better image quality, and we empirically find that higher noise levels benefit from a higher influence of the pre-demosaicking denoiser.
arXiv Detail & Related papers (2024-10-03T15:20:19Z)
- ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
- Efficient Diffusion Model for Image Restoration by Residual Shifting [63.02725947015132]
This study proposes a novel and efficient diffusion model for image restoration.
Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration.
Our method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks.
arXiv Detail & Related papers (2024-03-12T05:06:07Z)
- Diffusion Posterior Proximal Sampling for Image Restoration [27.35952624032734]
We present a refined paradigm for diffusion-based image restoration.
Specifically, we opt for a sample consistent with the measurement identity at each generative step.
The number of candidate samples used for selection is adaptively determined based on the signal-to-noise ratio of the timestep.
arXiv Detail & Related papers (2024-02-25T04:24:28Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- AdaDiff: Adaptive Step Selection for Fast Diffusion [88.8198344514677]
We introduce AdaDiff, a framework designed to learn instance-specific step usage policies.
AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function.
Our approach achieves visual quality similar to that of a baseline using a fixed 50 denoising steps.
arXiv Detail & Related papers (2023-11-24T11:20:38Z)
- Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner [84.97253871387028]
A diffusion model, which is formulated to produce an image using thousands of denoising steps, usually suffers from a slow inference speed.
We propose a timestep aligner that helps find a more accurate integral direction for a particular interval at the minimum cost.
Experiments show that our plug-in design can be trained efficiently and boost the inference performance of various state-of-the-art acceleration methods.
arXiv Detail & Related papers (2023-10-14T02:19:07Z)
- SVNR: Spatially-variant Noise Removal with Denoising Diffusion [43.2405873681083]
We present a novel formulation of denoising diffusion that assumes a more realistic, spatially-variant noise model.
In experiments we demonstrate the advantages of our approach over a strong diffusion model baseline, as well as over a state-of-the-art single image denoising method.
arXiv Detail & Related papers (2023-06-28T09:32:00Z)
- Learning Model-Blind Temporal Denoisers without Ground Truths [46.778450578529814]
Denoisers trained with synthetic data often fail to cope with the diversity of unknown noises.
Previous image-based methods lead to noise overfitting if directly applied to video denoisers.
We propose a general framework for video denoising networks that successfully addresses these challenges.
arXiv Detail & Related papers (2020-07-07T07:19:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.