Recurrent Self-Supervised Video Denoising with Denser Receptive Field
- URL: http://arxiv.org/abs/2308.03608v1
- Date: Mon, 7 Aug 2023 14:09:08 GMT
- Title: Recurrent Self-Supervised Video Denoising with Denser Receptive Field
- Authors: Zichun Wang, Yulun Zhang, Debing Zhang, Ying Fu
- Abstract summary: Self-supervised video denoising has seen decent progress through the use of blind spot networks.
Previous self-supervised video denoising methods suffer from significant information loss and texture destruction in either the whole reference frame or neighbor frames.
We propose RDRF for self-supervised video denoising, which fully exploits both the reference and neighbor frames with a denser receptive field.
- Score: 33.3711070590966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised video denoising has seen decent progress through the use of
blind spot networks. However, under their blind spot constraints, previous
self-supervised video denoising methods suffer from significant information
loss and texture destruction in either the whole reference frame or neighbor
frames, due to their inadequate consideration of the receptive field. Moreover,
the limited number of available neighbor frames in previous methods leads to
the discarding of distant temporal information. Nonetheless, simply adopting
existing recurrent frameworks does not work, since they easily break the
constraints on the receptive field imposed by self-supervision. In this paper,
we propose RDRF for self-supervised video denoising, which not only fully
exploits both the reference and neighbor frames with a denser receptive field,
but also better leverages the temporal information from both local and distant
neighbor features. First, towards a comprehensive utilization of information
from both reference and neighbor frames, RDRF realizes a denser receptive field
by taking more neighbor pixels along the spatial and temporal dimensions.
Second, it features a self-supervised recurrent video denoising framework,
which concurrently integrates distant and near-neighbor temporal features. This
enables long-term bidirectional information aggregation, while mitigating error
accumulation in the plain recurrent framework. Our method exhibits superior
performance on both synthetic and real video denoising datasets. Code will be
available at https://github.com/Wang-XIaoDingdd/RDRF.
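The blind-spot constraint at the heart of this line of work can be illustrated with a toy example. The sketch below is a minimal PyTorch illustration, not the RDRF architecture: a convolution whose centre tap is masked out, so the output at each pixel never depends on that pixel's own noisy value, which is what lets a noisy frame serve as its own training target. All class and variable names here are invented for the illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CentreMaskedConv2d(nn.Conv2d):
    """Convolution whose centre weight is zeroed: the output at each pixel
    never sees that pixel's own (noisy) input value."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__(in_ch, out_ch, kernel_size, padding=kernel_size // 2)
        mask = torch.ones_like(self.weight)
        mask[:, :, kernel_size // 2, kernel_size // 2] = 0.0
        self.register_buffer("blind_mask", mask)

    def forward(self, x):
        # The mask keeps the pixel itself out of the receptive field.
        return F.conv2d(x, self.weight * self.blind_mask, self.bias,
                        padding=self.padding)

if __name__ == "__main__":
    noisy = torch.randn(1, 3, 64, 64)             # one noisy frame
    # 1x1 convolutions after the masked layer preserve the blind spot;
    # stacking further 3x3 convolutions would leak the centre pixel back in.
    net = nn.Sequential(CentreMaskedConv2d(3, 16), nn.ReLU(),
                        nn.Conv2d(16, 3, 1))
    loss = F.mse_loss(net(noisy), noisy)          # target is the input itself
    loss.backward()
```

The paper's contribution is precisely about relaxing the information loss this constraint causes: densifying the receptive field over spatial and temporal neighbours while still excluding the pixel being predicted.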
Related papers
- Temporal As a Plugin: Unsupervised Video Denoising with Pre-Trained Image Denoisers [30.965705043127144]
In this paper, we propose a novel unsupervised video denoising framework, named 'Temporal As a Plugin' (TAP).
By incorporating temporal modules, our method can harness temporal information across noisy frames, complementing its power of spatial denoising.
Compared to other unsupervised video denoising methods, our framework demonstrates superior performance on both sRGB and raw video denoising datasets.
arXiv Detail & Related papers (2024-09-17T15:05:33Z) - TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models [94.24861019513462]
TRIP is a new recipe for the image-to-video diffusion paradigm.
It pivots on an image noise prior derived from the static image to jointly trigger inter-frame relational reasoning.
Extensive experiments on WebVid-10M, DTDB and MSR-VTT datasets demonstrate TRIP's effectiveness.
arXiv Detail & Related papers (2024-03-25T17:59:40Z) - Video Dynamics Prior: An Internal Learning Approach for Robust Video
Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns neural modules by optimizing over a corrupted sequence, leveraging the spatio-temporal coherence and internal statistics of the video.
arXiv Detail & Related papers (2023-12-13T01:57:11Z) - RIGID: Recurrent GAN Inversion and Editing of Real Face Videos [73.97520691413006]
GAN inversion is indispensable for applying the powerful editability of GAN to real images.
Existing methods invert video frames individually, often leading to undesired, temporally inconsistent results.
We propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID).
Our framework learns the inherent coherence between input frames in an end-to-end manner.
arXiv Detail & Related papers (2023-08-11T12:17:24Z) - RViDeformer: Efficient Raw Video Denoising Transformer with a Larger
Benchmark Dataset [16.131438855407175]
There is no large dataset with realistic motions for supervised raw video denoising.
We construct a video denoising dataset (named ReCRVD) with 120 groups of noisy-clean videos.
We propose an efficient raw video denoising transformer network (RViDeformer) that explores both short and long-distance correlations.
arXiv Detail & Related papers (2023-05-01T11:06:58Z) - Real-time Streaming Video Denoising with Bidirectional Buffers [48.57108807146537]
Real-time denoising algorithms are typically adopted on the user device to remove the noise involved during the shooting and transmission of video streams.
Recent multi-output inference works propagate bidirectional temporal features with a parallel or recurrent framework.
We propose a Bidirectional Streaming Video Denoising framework to achieve high-fidelity real-time denoising for streaming videos with both past and future temporal receptive fields (a toy sketch of this bidirectional pattern appears after this list).
arXiv Detail & Related papers (2022-07-14T14:01:03Z) - Unidirectional Video Denoising by Mimicking Backward Recurrent Modules
with Look-ahead Forward Ones [72.68740880786312]
Bidirectional recurrent networks (BiRNN) have exhibited appealing performance in several video restoration tasks.
However, BiRNN is intrinsically offline because it uses backward recurrent modules to propagate from the last frame to the current one.
We present a novel recurrent network consisting of forward and look-ahead recurrent modules for unidirectional video denoising.
arXiv Detail & Related papers (2022-04-12T05:33:15Z) - Multi-Stage Raw Video Denoising with Adversarial Loss and Gradient Mask [14.265454188161819]
We propose a learning-based approach for denoising raw videos captured under low lighting conditions.
We first explicitly align the neighboring frames to the current frame using a convolutional neural network (CNN).
We then fuse the registered frames using another CNN to obtain the final denoised frame.
arXiv Detail & Related papers (2021-03-04T06:57:48Z) - Learning Model-Blind Temporal Denoisers without Ground Truths [46.778450578529814]
Denoisers trained with synthetic data often fail to cope with the diversity of unknown noises.
Previous image-based methods lead to noise overfitting if directly applied to video denoisers.
We propose a general framework for video denoising networks that successfully addresses these challenges.
arXiv Detail & Related papers (2020-07-07T07:19:48Z)
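Several of the papers above, like RDRF itself, rely on recurrent propagation of temporal features. The sketch below is a minimal PyTorch illustration of the bidirectional pattern, not drawn from any specific paper here: one recurrence carries past information forward, a second carries future information backward, and a small head fuses both hidden states per frame. Real systems add alignment, blind-spot constraints, and safeguards against error accumulation; all names are illustrative.

```python
import torch
import torch.nn as nn

class BiRecurrentDenoiser(nn.Module):
    """Toy bidirectional recurrent denoiser: every output frame aggregates
    temporal information from both past and future neighbours."""

    def __init__(self, ch=3, hid=16):
        super().__init__()
        self.hid = hid
        self.fwd = nn.Conv2d(ch + hid, hid, 3, padding=1)   # past -> present
        self.bwd = nn.Conv2d(ch + hid, hid, 3, padding=1)   # future -> present
        self.head = nn.Conv2d(2 * hid, ch, 1)               # fuse both streams

    def forward(self, frames):                    # frames: (T, C, H, W)
        T, _, H, W = frames.shape
        h = frames.new_zeros(self.hid, H, W)
        fwd_states = []
        for t in range(T):                        # forward recurrence
            x = torch.cat([frames[t], h], 0).unsqueeze(0)
            h = torch.relu(self.fwd(x))[0]
            fwd_states.append(h)
        h = frames.new_zeros(self.hid, H, W)
        out = []
        for t in reversed(range(T)):              # backward recurrence
            x = torch.cat([frames[t], h], 0).unsqueeze(0)
            h = torch.relu(self.bwd(x))[0]
            fused = torch.cat([fwd_states[t], h], 0).unsqueeze(0)
            out.append(self.head(fused)[0])
        return torch.stack(out[::-1])             # restore temporal order

if __name__ == "__main__":
    video = torch.randn(5, 3, 32, 32)             # five noisy frames
    print(BiRecurrentDenoiser()(video).shape)     # torch.Size([5, 3, 32, 32])
```

The backward recurrence is what makes such models offline; the look-ahead design in the unidirectional denoising paper above mimics it with a short causal buffer instead.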