Real-time Streaming Video Denoising with Bidirectional Buffers
- URL: http://arxiv.org/abs/2207.06937v1
- Date: Thu, 14 Jul 2022 14:01:03 GMT
- Title: Real-time Streaming Video Denoising with Bidirectional Buffers
- Authors: Chenyang Qi, Junming Chen, Xin Yang, Qifeng Chen
- Abstract summary: Real-time denoising algorithms are typically adopted on the user device to remove the noise introduced during the shooting and transmission of video streams.
Recent multi-output inference works propagate bidirectional temporal features with a parallel or recurrent framework.
We propose a Bidirectional Streaming Video Denoising (BSVD) framework to achieve high-fidelity real-time denoising for streaming videos with both past and future temporal receptive fields.
- Score: 48.57108807146537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video streams are delivered continuously to save the cost of storage and
device memory. Real-time denoising algorithms are typically adopted on the user
device to remove the noise introduced during the shooting and transmission of
video streams. However, sliding-window-based methods feed multiple input frames
for a single output and lack computational efficiency. Recent multi-output
inference works propagate bidirectional temporal features with a parallel or
recurrent framework, which either suffers from performance drops on the
temporal edges of clips or cannot achieve online inference. In this paper, we
propose a Bidirectional Streaming Video Denoising (BSVD) framework to achieve
high-fidelity real-time denoising for streaming videos with both past and
future temporal receptive fields. Bidirectional temporal fusion for online
inference was considered not applicable in MoViNet. However, we introduce a
novel Bidirectional Buffer Block as the core module of our BSVD, which makes
bidirectional fusion possible in our pipeline-style inference. In addition, our
method is concise and flexible enough to be used in both non-blind and blind
video denoising. We compare our model with various state-of-the-art video
denoising models qualitatively and quantitatively on synthetic and real noise.
Our method outperforms previous methods in terms of restoration fidelity and
runtime. Our source code is publicly available at
https://github.com/ChenyangQiQi/BSVD
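
To make the Bidirectional Buffer Block idea concrete, here is a minimal sketch of a buffered block for pipeline-style streaming inference, assuming a simple convolutional fusion of (past, current, future) features. The class name, shapes, and fusion operator are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class BidirectionalBufferBlock(nn.Module):
    """A buffered block for pipeline-style streaming inference.

    Each block holds the features of the two most recent frames and
    emits the output for frame t only after frame t+1 has arrived, so
    every output fuses one past, the current, and one future frame.
    """

    def __init__(self, channels: int):
        super().__init__()
        # fuse (past, current, future) features into one output feature
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=3, padding=1)
        self.prev = None  # feature of frame t-1
        self.curr = None  # feature of frame t, waiting for its future frame

    def forward(self, x):
        """x: (B, C, H, W) feature of the newest frame, or None to flush."""
        if self.curr is None:  # pipeline still filling up
            self.prev = None
            self.curr = x
            return None
        past = self.prev if self.prev is not None else torch.zeros_like(self.curr)
        future = x if x is not None else torch.zeros_like(self.curr)
        out = self.fuse(torch.cat([past, self.curr, future], dim=1))
        self.prev, self.curr = self.curr, x  # slide the buffers forward
        return out

# usage: feed frames one at a time; pass None once to flush the pipeline
block = BidirectionalBufferBlock(channels=16)
frames = [torch.randn(1, 16, 64, 64) for _ in range(5)]
outputs = [block(x) for x in frames + [None]]
outputs = [y for y in outputs if y is not None]  # 5 outputs, 1-frame latency
```

Stacking N such blocks gives every output frame a temporal receptive field of N past and N future frames at a fixed latency of N frames: once the pipeline is full, each incoming frame produces exactly one denoised frame, so bidirectional context comes at constant per-frame cost.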
Related papers
- Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models [64.2445487645478]
Large Language Models have shown remarkable efficacy in generating streaming data such as text and audio, thanks to their temporally uni-directional attention mechanism.
We present Live2Diff, the first attempt at designing a video diffusion model with uni-directional temporal attention, specifically targeting live streaming video translation.
arXiv Detail & Related papers (2024-07-11T17:34:51Z) - SF-V: Single Forward Video Generation Model [57.292575082410785]
- SF-V: Single Forward Video Generation Model [57.292575082410785]
We propose a novel approach to obtain single-step video generation models by leveraging adversarial training to fine-tune pre-trained models.
Experiments demonstrate that our method achieves competitive generation quality of synthesized videos with significantly reduced computational overhead.
arXiv Detail & Related papers (2024-06-06T17:58:27Z) - FIFO-Diffusion: Generating Infinite Videos from Text without Training [44.65468310143439]
FIFO-Diffusion is conceptually capable of generating infinitely long videos without additional training.
Our method maintains a queue of frames at progressively increasing noise levels: it dequeues a fully denoised frame at the head while enqueuing a new random noise frame at the tail.
We demonstrate the promising results and effectiveness of the proposed method on existing text-to-video generation baselines.
arXiv Detail & Related papers (2024-05-19T07:48:41Z) - Recurrent Self-Supervised Video Denoising with Denser Receptive Field [33.3711070590966]
- Recurrent Self-Supervised Video Denoising with Denser Receptive Field [33.3711070590966]
Self-supervised video denoising has seen decent progress through the use of blind spot networks.
Previous self-supervised video denoising methods suffer from significant information loss and texture destruction in either the whole reference frame or neighbor frames.
We propose RDRF for self-supervised video denoising, which fully exploits both the reference and neighbor frames with a denser receptive field.
arXiv Detail & Related papers (2023-08-07T14:09:08Z) - VideoFusion: Decomposed Diffusion Models for High-Quality Video
Generation [88.49030739715701]
This work presents a decomposed diffusion process via resolving the per-frame noise into a base noise that is shared among all frames and a residual noise that varies along the time axis.
Experiments on various datasets confirm that our approach, termed VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation.
arXiv Detail & Related papers (2023-03-15T02:16:39Z) - Low Latency Video Denoising for Online Conferencing Using CNN
Architectures [4.7805617044617446]
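
The noise decomposition described in the VideoFusion summary above can be sketched in a few lines; `alpha`, the base/residual mixing ratio, is an illustrative parameter, and this sketch does not attempt the paper's separate prediction of the two components:

```python
import torch

def decomposed_noise(num_frames, frame_shape, alpha=0.5):
    """Per-frame noise = shared base component + per-frame residual.
    The sqrt coefficients keep unit variance, so each frame's noise is
    still standard Gaussian while being correlated across frames."""
    base = torch.randn(frame_shape)                    # shared by all frames
    residual = torch.randn(num_frames, *frame_shape)   # varies along time
    return alpha ** 0.5 * base + (1 - alpha) ** 0.5 * residual
```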
- Low Latency Video Denoising for Online Conferencing Using CNN Architectures [4.7805617044617446]
We propose a pipeline for real-time video denoising with low runtime cost and high perceptual quality.
A custom noise detector analyzer provides real-time feedback to adapt the weights and improve the models' output.
arXiv Detail & Related papers (2023-02-17T00:55:54Z) - Learning Task-Oriented Flows to Mutually Guide Feature Alignment in
Synthesized and Real Video Denoising [137.5080784570804]
Video denoising aims at removing noise from videos to recover clean ones.
Some existing works show that optical flow can help denoising by exploiting additional spatio-temporal clues from nearby frames.
We propose a new multi-scale refined optical flow-guided video denoising method, which is more robust to different noise levels.
arXiv Detail & Related papers (2022-08-25T00:09:18Z) - End to End Lip Synchronization with a Temporal AutoEncoder [95.94432031144716]
We study the problem of syncing the lip movement in a video with the audio stream.
Our solution finds an optimal alignment using a dual-domain recurrent neural network.
As an application, we demonstrate our ability to robustly align text-to-speech generated audio with an existing video stream.
arXiv Detail & Related papers (2022-03-30T12:00:18Z) - Restore from Restored: Video Restoration with Pseudo Clean Video [28.057705167363327]
We propose a self-supervised video denoising method called "restore-from-restored".
This method fine-tunes a pre-trained network by using a pseudo clean video during the test phase.
We analyze the restoration performance of the fine-tuned video denoising networks with the proposed self-supervision-based learning algorithm.
arXiv Detail & Related papers (2020-03-09T17:37:28Z)