Zero-Shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model
- URL: http://arxiv.org/abs/2407.01960v4
- Date: Sat, 01 Feb 2025 01:58:42 GMT
- Title: Zero-Shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model
- Authors: Cong Cao, Huanjing Yue, Xin Liu, Jingyu Yang
- Abstract summary: We propose the first framework for zero-shot video restoration and enhancement based on the pre-trained image diffusion model. Our method is a plug-and-play module that can be inserted into any diffusion-based image restoration or enhancement methods.
- Score: 15.170889156729777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based zero-shot image restoration and enhancement models have achieved great success in various tasks of image restoration and enhancement. However, directly applying them to video restoration and enhancement results in severe temporal flickering artifacts. In this paper, we propose the first framework for zero-shot video restoration and enhancement based on the pre-trained image diffusion model. By replacing the spatial self-attention layer with the proposed short-long-range (SLR) temporal attention layer, the pre-trained image diffusion model can take advantage of the temporal correlation between frames. We further propose temporal consistency guidance, spatial-temporal noise sharing, and an early stopping sampling strategy to improve temporally consistent sampling. Our method is a plug-and-play module that can be inserted into any diffusion-based image restoration or enhancement methods to further improve their performance. Experimental results demonstrate the superiority of our proposed method. Our code is available at https://github.com/cao-cong/ZVRD.
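The abstract's central idea, letting attention look across frames instead of only within one frame, can be illustrated with a small sketch. This is not the authors' SLR layer, whose exact design is not given on this page. It is a generic windowed cross-frame attention in plain Python, under loud assumptions: tokens are used directly as queries, keys, and values (no learned projections), and the function names and the `radius` parameter are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of token vectors.

    Assumption: identity Q/K/V projections, so tokens are used as-is.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return out


def spatial_self_attention(frames):
    """Per-frame baseline: each frame attends only to its own tokens."""
    return [attend(f, f, f) for f in frames]


def temporal_attention(frames, radius=1):
    """Each frame attends to the tokens of all frames within `radius`.

    `radius` is a hypothetical window size; radius=0 reduces to the
    per-frame baseline above.
    """
    out = []
    for t, frame in enumerate(frames):
        lo = max(0, t - radius)
        hi = min(len(frames), t + radius + 1)
        # Keys and values are gathered from the neighbouring frames too.
        context = [tok for g in frames[lo:hi] for tok in g]
        out.append(attend(frame, context, context))
    return out
```

Here `frames` is a list of frames, each a list of equal-length token vectors. With `radius=0` the temporal variant matches per-frame attention; a larger radius lets every output token mix in information from adjacent frames, which is the kind of cross-frame coupling the abstract relies on to reduce flicker.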
Related papers
- Temporal-Consistent Video Restoration with Pre-trained Diffusion Models [51.47188802535954]
Video restoration (VR) aims to recover high-quality videos from degraded ones.
Recent zero-shot VR methods using pre-trained diffusion models (DMs) suffer from approximation errors during reverse diffusion and insufficient temporal consistency.
We present a novel Maximum a Posteriori (MAP) framework that directly parameterizes video frames in the seed space of DMs, eliminating approximation errors.
arXiv Detail & Related papers (2025-03-19T03:41:56Z)
- Sequential Posterior Sampling with Diffusion Models [15.028061496012924]
We propose a novel approach that models the transition dynamics to improve the efficiency of sequential diffusion posterior sampling in conditional image synthesis.
We demonstrate the effectiveness of our approach on a real-world dataset of high frame rate cardiac ultrasound images.
Our method opens up new possibilities for real-time applications of diffusion models in imaging and other domains requiring real-time inference.
arXiv Detail & Related papers (2024-09-09T07:55:59Z)
- Solving Video Inverse Problems Using Image Diffusion Models [58.464465016269614]
We introduce an innovative video inverse solver that leverages only image diffusion models.
Our method treats the time dimension of a video as the batch dimension of image diffusion models.
We also introduce a batch-consistent sampling strategy that encourages consistency across batches.
arXiv Detail & Related papers (2024-09-04T09:48:27Z)
- DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models [9.145545884814327]
We present DiffIR2VR-Zero, a zero-shot framework that enables any pre-trained image restoration model to perform high-quality video restoration without additional training.
Our framework works with any image restoration diffusion model, providing a versatile solution for video enhancement without task-specific training or modifications.
arXiv Detail & Related papers (2024-07-01T17:59:12Z)
- Blind Image Restoration via Fast Diffusion Inversion [17.139433082780037]
Blind Image Restoration via fast Diffusion (BIRD) is a blind IR method that jointly optimizes the degradation model parameters and the restored image.
A key idea in our method is not to modify the reverse sampling, i.e., not to alter all the intermediate latents, once an initial noise is sampled.
We experimentally validate BIRD on several image restoration tasks and show that it achieves state of the art performance on all of them.
arXiv Detail & Related papers (2024-05-29T23:38:12Z)
- Lossy Image Compression with Foundation Diffusion Models [10.407650300093923]
In this work we formulate the removal of quantization error as a denoising task, using diffusion to recover lost information in the transmitted image latent.
Our approach allows us to perform less than 10% of the full diffusion generative process and requires no architectural changes to the diffusion model.
arXiv Detail & Related papers (2024-04-12T16:23:42Z)
- ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
- Efficient Diffusion Model for Image Restoration by Residual Shifting [63.02725947015132]
This study proposes a novel and efficient diffusion model for image restoration.
Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration.
Our method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks.
arXiv Detail & Related papers (2024-03-12T05:06:07Z)
- Diffusion Posterior Proximal Sampling for Image Restoration [27.35952624032734]
We present a refined paradigm for diffusion-based image restoration.
Specifically, we opt for a sample consistent with the measurement identity at each generative step.
The number of candidate samples used for selection is adaptively determined based on the signal-to-noise ratio of the timestep.
arXiv Detail & Related papers (2024-02-25T04:24:28Z)
- Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation [112.08287900261898]
This paper proposes a novel self-cascade diffusion model for rapid adaptation to higher-resolution image and video generation.
Our approach achieves a 5X training speed-up and requires only an additional 0.002M tuning parameters.
Experiments demonstrate that our approach can quickly adapt to higher resolution image and video synthesis by fine-tuning for just 10k steps, with virtually no additional inference time.
arXiv Detail & Related papers (2024-02-16T07:48:35Z)
- ExposureDiffusion: Learning to Expose for Low-light Image Enhancement [87.08496758469835]
This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure model.
Our method obtains significantly improved performance and reduced inference time compared with vanilla diffusion models.
The proposed framework can work with real-paired datasets, SOTA noise models, and different backbone networks.
arXiv Detail & Related papers (2023-07-15T04:48:35Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from long inference times, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
- Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models [9.245782611878752]
We enhance the diffusion model in several aspects such as network architecture, noise level, denoising steps, training image size, and perceptual/scheduler scores.
We also propose a U-Net based latent diffusion model which performs diffusion in a low-resolution latent space while preserving high-resolution information from the original input for the decoding process.
These modifications allow us to apply diffusion models to various image restoration tasks, including real-world shadow removal, HR non-homogeneous dehazing, stereo super-resolution, and bokeh effect transformation.
arXiv Detail & Related papers (2023-04-17T14:06:49Z)
- ADIR: Adaptive Diffusion for Image Reconstruction [46.838084286784195]
We propose a conditional sampling scheme that exploits the prior learned by diffusion models.
We then combine it with a novel approach for adapting pretrained diffusion denoising networks to their input.
We show that our proposed 'adaptive diffusion for image reconstruction' approach achieves a significant improvement in the super-resolution, deblurring, and text-based editing tasks.
arXiv Detail & Related papers (2022-12-06T18:39:58Z)