Zero-Shot Video Restoration and Enhancement with Assistance of Video Diffusion Models
- URL: http://arxiv.org/abs/2601.21922v1
- Date: Thu, 29 Jan 2026 16:14:07 GMT
- Title: Zero-Shot Video Restoration and Enhancement with Assistance of Video Diffusion Models
- Authors: Cong Cao, Huanjing Yue, Shangbin Xie, Xin Liu, Jingyu Yang
- Abstract summary: We propose the first framework that utilizes the rapidly-developed video diffusion model to assist the image-based method in maintaining more temporal consistency. We propose homologous latents fusion, heterogenous latents fusion, and a COT-based fusion ratio strategy to utilize both homologous and heterogenous text-to-video diffusion models to complement the image method. Our method is training-free and can be applied to any diffusion-based image restoration and enhancement methods.
- Score: 23.205162529582747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although diffusion-based zero-shot image restoration and enhancement methods have achieved great success, applying them to video restoration or enhancement will lead to severe temporal flickering. In this paper, we propose the first framework that utilizes the rapidly-developed video diffusion model to assist the image-based method in maintaining more temporal consistency for zero-shot video restoration and enhancement. We propose homologous latents fusion, heterogenous latents fusion, and a COT-based fusion ratio strategy to utilize both homologous and heterogenous text-to-video diffusion models to complement the image method. Moreover, we propose temporal-strengthening post-processing to utilize the image-to-video diffusion model to further improve temporal consistency. Our method is training-free and can be applied to any diffusion-based image restoration and enhancement methods. Experimental results demonstrate the superiority of the proposed method.
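The abstract does not spell out how the image-method latents and the video-diffusion latents are combined. As a rough illustration only, the sketch below assumes a simple per-frame linear blend controlled by a single fusion ratio; the function names `fuse_latents` and `flicker`, the fixed `ratio`, and the toy data are all hypothetical and are not taken from the paper (in particular, the fixed ratio merely stands in for the COT-based fusion ratio strategy).

```python
def fuse_latents(image_latents, video_latents, ratio):
    """Blend per-frame latents as (1 - ratio) * image + ratio * video.

    Assumption: a plain linear blend; the paper's actual fusion rule
    is not given in the abstract.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    if len(image_latents) != len(video_latents):
        raise ValueError("frame counts differ")
    return [
        [(1.0 - ratio) * a + ratio * b for a, b in zip(img, vid)]
        for img, vid in zip(image_latents, video_latents)
    ]


def flicker(latents):
    """Mean absolute change between consecutive frames (a toy flicker proxy)."""
    diffs = [
        abs(b - a)
        for prev, cur in zip(latents, latents[1:])
        for a, b in zip(prev, cur)
    ]
    return sum(diffs) / len(diffs)


# Toy data: 4 frames with 2 latent values each.
# The per-frame (image) latents alternate, mimicking temporal flicker;
# the video latents are constant, mimicking a temporally smooth prior.
image_latents = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
video_latents = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

fused = fuse_latents(image_latents, video_latents, ratio=0.5)
print(flicker(image_latents), flicker(fused))
```

In this toy setting, a larger ratio pulls the result toward the smoother video latents and lowers the flicker proxy, at the cost of moving away from the per-frame image result; that trade-off is presumably what a fusion ratio strategy has to balance.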
Related papers
- Improving Temporal Consistency and Fidelity at Inference-time in Perceptual Video Restoration by Zero-shot Image-based Diffusion Models [5.61537470581101]
We address the challenge of improving temporal coherence in video restoration using zero-shot image-based diffusion models. We propose two complementary inference-time strategies: Perceptual Straightening Guidance (PSG) and Ensemble Sampling (MPES).
arXiv Detail & Related papers (2025-10-29T11:40:06Z)
- LVTINO: LAtent Video consisTency INverse sOlver for High Definition Video Restoration [3.2944592608677614]
We propose LVTINO, the first zero-shot or plug-and-play inverse solver for high definition video restoration with priors encoded by VCMs. Our conditioning mechanism bypasses the need for automatic differentiation and achieves state-of-the-art video reconstruction quality with only a few neural function evaluations.
arXiv Detail & Related papers (2025-10-01T18:10:08Z)
- Solving Video Inverse Problems Using Image Diffusion Models [58.464465016269614]
We introduce an innovative video inverse solver that leverages only image diffusion models. Our method treats the time dimension of a video as the batch dimension of image diffusion models. We also introduce a batch-consistent sampling strategy that encourages consistency across batches.
arXiv Detail & Related papers (2024-09-04T09:48:27Z)
- Zero-Shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model [15.170889156729777]
We propose the first framework for zero-shot video restoration and enhancement based on the pre-trained image diffusion model. Our method is a plug-and-play module that can be inserted into any diffusion-based image restoration or enhancement methods.
arXiv Detail & Related papers (2024-07-02T05:31:59Z)
- DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models [9.145545884814327]
We present DiffIR2VR-Zero, a zero-shot framework that enables any pre-trained image restoration model to perform high-quality video restoration without additional training. Our framework works with any image restoration diffusion model, providing a versatile solution for video enhancement without task-specific training or modifications.
arXiv Detail & Related papers (2024-07-01T17:59:12Z)
- ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations.
We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
- FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation [85.29772293776395]
We introduce FRESCO, which uses intra-frame correspondence alongside inter-frame correspondence to establish a more robust spatial-temporal constraint.
This enhancement ensures a more consistent transformation of semantically similar content across frames.
Our approach involves an explicit update of features to achieve high spatial-temporal consistency with the input video.
arXiv Detail & Related papers (2024-03-19T17:59:18Z)
- Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation [112.08287900261898]
This paper proposes a novel self-cascade diffusion model for rapid adaptation to higher-resolution image and video generation.
Our approach achieves a 5X training speed-up and requires only an additional 0.002M tuning parameters.
Experiments demonstrate that our approach can quickly adapt to higher resolution image and video synthesis by fine-tuning for just 10k steps, with virtually no additional inference time.
arXiv Detail & Related papers (2024-02-16T07:48:35Z)
- Diffusion Models for Image Restoration and Enhancement: A Comprehensive Survey [73.86861112002593]
We present a comprehensive review of recent diffusion model-based methods on image restoration. We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR. We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z)
- Low-Light Image Enhancement with Wavelet-based Diffusion Models [50.632343822790006]
Diffusion models have achieved promising results in image restoration tasks, yet suffer from long inference times, excessive computational resource consumption, and unstable restoration.
We propose a robust and efficient Diffusion-based Low-Light image enhancement approach, dubbed DiffLL.
arXiv Detail & Related papers (2023-06-01T03:08:28Z)
- ADIR: Adaptive Diffusion for Image Reconstruction [42.90778718695398]
Denoising diffusion models have recently achieved remarkable success in image generation, capturing rich information about natural image statistics. We introduce a conditional sampling framework that leverages the powerful priors learned by diffusion models while enforcing consistency with the available measurements. We employ LoRA-based adaptation using images that are semantically and visually similar to the degraded input, efficiently retrieved from a large and diverse dataset.
arXiv Detail & Related papers (2022-12-06T18:39:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.