Video Dynamics Prior: An Internal Learning Approach for Robust Video
Enhancements
- URL: http://arxiv.org/abs/2312.07835v1
- Date: Wed, 13 Dec 2023 01:57:11 GMT
- Title: Video Dynamics Prior: An Internal Learning Approach for Robust Video
Enhancements
- Authors: Gaurav Shrivastava, Ser-Nam Lim, Abhinav Shrivastava
- Abstract summary: We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns the weights of neural modules by optimizing over the corrupted test sequence, leveraging the spatio-temporal coherence and internal statistics of videos.
- Score: 83.5820690348833
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we present a novel robust framework for low-level vision
tasks, including denoising, object removal, frame interpolation, and
super-resolution, that does not require any external training data corpus. Our
proposed approach directly learns the weights of neural modules by optimizing
over the corrupted test sequence, leveraging the spatio-temporal coherence and
internal statistics of videos. Furthermore, we introduce a novel spatial
pyramid loss that leverages the property of spatio-temporal patch recurrence in
a video across the different scales of the video. This loss enhances robustness
to unstructured noise in both the spatial and temporal domains. This further
results in our framework being highly robust to degradation in input frames and
yields state-of-the-art results on downstream tasks such as denoising, object
removal, and frame interpolation. To validate the effectiveness of our
approach, we conduct qualitative and quantitative evaluations on standard video
datasets such as DAVIS, UCF-101, and VIMEO90K-T.
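The abstract does not spell out the spatial pyramid loss, so the following is only a minimal sketch of the general idea of a multi-scale reconstruction loss, not the paper's formulation. The function names (`downsample2x`, `l1`, `pyramid_l1_loss`), the 2x2 average pooling, the L1 distance, the number of levels, and the equal weighting of levels are all assumptions made for illustration. Frames are plain nested lists so the example has no dependencies.

```python
# Illustrative sketch only: a generic multi-scale (pyramid) L1 loss.
# This is NOT the paper's exact loss; the pooling scheme, number of
# levels, and equal weighting are assumptions made for this example.

def downsample2x(frame):
    """Halve height and width by averaging each 2x2 block (odd edges are dropped)."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * i][2 * j] + frame[2 * i][2 * j + 1]
             + frame[2 * i + 1][2 * j] + frame[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def l1(a, b):
    """Mean absolute difference between two equally sized 2-D grids."""
    total = sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def pyramid_l1_loss(pred, target, levels=3):
    """Average the L1 loss over up to `levels` scales of a 2x pooling pyramid."""
    losses = []
    for _ in range(levels):
        losses.append(l1(pred, target))
        if len(pred) < 2 or len(pred[0]) < 2:
            break  # too small to downsample again
        pred, target = downsample2x(pred), downsample2x(target)
    return sum(losses) / len(losses)


if __name__ == "__main__":
    target = [[0.0] * 4 for _ in range(4)]
    # Zero-mean checkerboard noise versus a constant (structured) offset.
    noise = [[1.0 if (i + j) % 2 == 0 else -1.0 for j in range(4)] for i in range(4)]
    offset = [[1.0] * 4 for _ in range(4)]
    print("noise :", pyramid_l1_loss(noise, target))
    print("offset:", pyramid_l1_loss(offset, target))
```

In this toy setup, the checkerboard noise averages out after one pooling step and is penalized only at the finest level, while the constant offset is penalized at every level. That gives some intuition for why comparing frames across scales can reduce sensitivity to unstructured noise.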
Related papers
- Video Frame Interpolation Transformer [86.20646863821908]
We propose a Transformer-based video interpolation framework that allows content-aware aggregation weights and captures long-range dependencies through self-attention.
To avoid the high computational cost of global self-attention, we introduce local attention into video interpolation.
In addition, we develop a multi-scale frame synthesis scheme to fully realize the potential of Transformers.
Date: 2021-11-27T05:35:10Z
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - Deep Video Matting via Spatio-Temporal Alignment and Aggregation [63.6870051909004]
We propose a deep learning-based video matting framework that employs a novel spatio-temporal feature aggregation module (STFAM).
To eliminate frame-by-frame trimap annotations, a lightweight interactive trimap propagation network is also introduced.
Our framework significantly outperforms conventional video matting and deep image matting methods.
arXiv Detail & Related papers (2021-04-22T17:42:08Z) - Frame-rate Up-conversion Detection Based on Convolutional Neural Network
for Learning Spatiotemporal Features [7.895528973776606]
This paper proposes a frame-rate conversion detection network (FCDNet) that learns forensic features caused by FRUC in an end-to-end fashion.
FCDNet takes a stack of consecutive frames as input and learns interpolation artifacts using network blocks designed to extract spatiotemporal features.
arXiv Detail & Related papers (2021-03-25T08:47:46Z) - Motion-blurred Video Interpolation and Extrapolation [72.3254384191509]
We present a novel framework for deblurring, interpolating and extrapolating sharp frames from a motion-blurred video in an end-to-end manner.
To ensure temporal coherence across predicted frames and address potential temporal ambiguity, we propose a simple, yet effective flow-based rule.
arXiv Detail & Related papers (2021-03-04T12:18:25Z) - Robust Unsupervised Video Anomaly Detection by Multi-Path Frame
Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z) - Unsupervised Video Decomposition using Spatio-temporal Iterative
Inference [31.97227651679233]
Multi-object scene decomposition is a fast-emerging problem in representation learning.
We show that our model has a high accuracy even without color information.
We demonstrate the decomposition, segmentation, and prediction capabilities of our model and show that it outperforms the state of the art on several benchmark datasets.
Date: 2020-06-25T22:57:17Z