Related papers: Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models

URL: http://arxiv.org/abs/2407.10285v1
Date: Sun, 14 Jul 2024 17:59:56 GMT
Title: Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models
Authors: Qinyu Yang, Haoxin Chen, Yong Zhang, Menghan Xia, Xiaodong Cun, Zhixun Su, Ying Shan
Abstract summary: We propose a novel formulation that considers both visual quality and consistency of content. Consistency of content is ensured by a proposed loss function that maintains the structure of the input, while visual quality is improved by utilizing the denoising process of pretrained diffusion models.
Score: 47.518487213173785
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In order to improve the quality of synthesized videos, currently, one predominant method involves retraining an expert diffusion model and then implementing a noising-denoising process for refinement. Despite the significant training costs, maintaining consistency of content between the original and enhanced videos remains a major challenge. To tackle this challenge, we propose a novel formulation that considers both visual quality and consistency of content. Consistency of content is ensured by a proposed loss function that maintains the structure of the input, while visual quality is improved by utilizing the denoising process of pretrained diffusion models. To address the formulated optimization problem, we have developed a plug-and-play noise optimization strategy, referred to as Noise Calibration. By refining the initial random noise through a few iterations, the content of the original video can be largely preserved, and the enhancement effect demonstrates a notable improvement. Extensive experiments have demonstrated the effectiveness of the proposed method.

Related papers

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos [32.14142910911528]
Video diffusion models (VDMs) facilitate the generation of high-quality videos. Recent studies have uncovered the existence of "golden noises" that can enhance video quality during generation. We propose ScalingNoise, a plug-and-play inference-time search strategy that identifies golden initial noises for the diffusion sampling process.
arXiv Detail & Related papers (2025-03-20T17:54:37Z)
Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion [22.988212617368095]
We propose GLC-Diffusion, a tuning-free method for long video generation. It models the long video denoising process by establishing Global-Local Collaborative Denoising. We also propose a Video Motion Consistency Refinement (VMCR) module that computes the gradient of pixel-wise and frequency-wise losses.
arXiv Detail & Related papers (2025-01-08T05:49:39Z)
Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy [44.09909260046396]
We propose AdaptiveDiffusion to reduce noise prediction steps during the denoising process. Our method can significantly speed up the denoising process while generating identical results to the original process, achieving up to an average 25x speedup.
arXiv Detail & Related papers (2024-10-13T15:19:18Z)
VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide [48.22321420680046]
VideoGuide is a novel framework that enhances the temporal consistency of pretrained text-to-video (T2V) models. It improves temporal quality by interpolating the guiding model's denoised samples into the sampling model's denoising process. The proposed method brings about significant improvement in temporal consistency and image fidelity.
arXiv Detail & Related papers (2024-10-06T05:46:17Z)
Combining Pre- and Post-Demosaicking Noise Removal for RAW Video [2.772895608190934]
Denoising is one of the fundamental steps of the processing pipeline that converts data captured by a camera sensor into a display-ready image or video. We propose a self-similarity-based denoising scheme that weights both a pre- and a post-demosaicking denoiser for Bayer-patterned CFA video data. We show that a balance between the two leads to better image quality, and we empirically find that higher noise levels benefit from a higher influence pre-demosaicking.
arXiv Detail & Related papers (2024-10-03T15:20:19Z)
FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process [120.91393949012014]
FreeEnhance is a framework for content-consistent image enhancement using off-the-shelf image diffusion models. In the noising stage, FreeEnhance is devised to add lighter noise to the region with higher frequency to preserve the high-frequency patterns in the original image. In the denoising stage, we present three target properties as constraints to regularize the predicted noise, enhancing images with high acutance and high visual quality.
arXiv Detail & Related papers (2024-09-11T17:58:50Z)
Zero-Shot Video Editing through Adaptive Sliding Score Distillation [51.57440923362033]
This study proposes a novel paradigm of video-based score distillation, facilitating direct manipulation of original video content. We propose an Adaptive Sliding Score Distillation strategy, which incorporates both global and local video guidance to reduce the impact of editing errors.
arXiv Detail & Related papers (2024-06-07T12:33:59Z)
FreeInit: Bridging Initialization Gap in Video Diffusion Models [42.38240625514987]
FreeInit is able to compensate for the gap between training and inference, thus effectively improving the subject appearance and temporal consistency of generation results. Experiments demonstrate that FreeInit consistently enhances the generation quality of various text-to-video diffusion models without additional training or fine-tuning.
arXiv Detail & Related papers (2023-12-12T18:59:16Z)
VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation [88.49030739715701]
This work presents a decomposed diffusion process via resolving the per-frame noise into a base noise that is shared among all frames and a residual noise that varies along the time axis. Experiments on various datasets confirm that our approach, termed as VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation.
arXiv Detail & Related papers (2023-03-15T02:16:39Z)
Encoding in the Dark Grand Challenge: An Overview [60.9261003831389]
We propose a Grand Challenge on encoding low-light video sequences. VVC achieves a high performance compared to simply denoising the video source prior to encoding. The quality of the video streams can be further improved by employing a post-processing image enhancement method.
arXiv Detail & Related papers (2020-05-07T08:22:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.