Diagnosing and Preventing Instabilities in Recurrent Video Processing
- URL: http://arxiv.org/abs/2010.05099v2
- Date: Sat, 17 Oct 2020 14:44:21 GMT
- Title: Diagnosing and Preventing Instabilities in Recurrent Video Processing
- Authors: Thomas Tanay, Aivar Sootla, Matteo Maggioni, Puneet K. Dokania, Philip
Torr, Ales Leonardis and Gregory Slabaugh
- Abstract summary: We show that recurrent video processing models tend to fail catastrophically at inference time on long video sequences.
We introduce a diagnostic tool which produces adversarial input sequences optimized to trigger instabilities.
We then introduce Stable Rank Normalization of the Layers (SRNL), a new algorithm that enforces these constraints.
- Score: 23.39527368516591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent models are becoming a popular choice for video enhancement tasks
such as video denoising. In this work, we focus on their stability as dynamical
systems and show that they tend to fail catastrophically at inference time on
long video sequences. To address this issue, we (1) introduce a diagnostic tool
which produces adversarial input sequences optimized to trigger instabilities
and that can be interpreted as visualizations of spatio-temporal receptive
fields, and (2) propose two approaches to enforce the stability of a model:
constraining the spectral norm or constraining the stable rank of its
convolutional layers. We then introduce Stable Rank Normalization of the Layers
(SRNL), a new algorithm that enforces these constraints, and verify
experimentally that it successfully results in stable recurrent video
processing.
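The abstract describes two concrete pieces: a diagnostic that searches for adversarial input sequences which trigger instabilities, and layer-wise constraints (on the spectral norm or the stable rank of the convolutional layers) that enforce stability. The sketch below is a minimal, illustrative PyTorch rendering of both ideas, assuming a toy recurrent denoiser of my own design; it is not the authors' code, and the spectral-norm clip uses the common matricised-kernel approximation rather than the SRNL algorithm named in the paper.

```python
# Minimal sketch of the two ideas above. ToyRecurrentDenoiser and all
# hyperparameters are hypothetical stand-ins, not the authors' architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyRecurrentDenoiser(nn.Module):
    """One recurrent step: new hidden state from the current frame and old state."""
    def __init__(self, channels=8):
        super().__init__()
        self.conv = nn.Conv2d(channels + 3, channels, kernel_size=3, padding=1)

    def forward(self, frame, state):
        return F.relu(self.conv(torch.cat([frame, state], dim=1)))

def find_adversarial_sequence(model, n_frames=16, size=32, channels=8,
                              steps=200, lr=0.05):
    """Diagnostic idea: gradient-ascend on a short input sequence so the hidden
    state grows as much as possible after repeated recurrent steps."""
    frames = (0.5 + 0.05 * torch.randn(n_frames, 1, 3, size, size)).requires_grad_(True)
    opt = torch.optim.Adam([frames], lr=lr)
    for _ in range(steps):
        state = torch.zeros(1, channels, size, size)
        for t in range(n_frames):
            state = model(frames[t].clamp(0.0, 1.0), state)
        loss = -state.norm()          # maximise the final state magnitude
        opt.zero_grad()
        loss.backward()
        opt.step()
    return frames.detach().clamp(0.0, 1.0)

@torch.no_grad()
def clip_spectral_norm(conv, max_sigma=1.0, n_iter=20):
    """Stability idea: estimate the largest singular value of the matricised
    kernel by power iteration and rescale the weight so it stays below
    max_sigma. This is the usual spectral-normalisation approximation of the
    convolution's operator norm, not the exact value and not SRNL itself."""
    w = conv.weight.reshape(conv.weight.shape[0], -1)   # (out, in * k * k)
    v = torch.randn(w.shape[1])
    u = torch.zeros(w.shape[0])
    for _ in range(n_iter):                             # power iteration
        u = F.normalize(w @ v, dim=0)
        v = F.normalize(w.t() @ u, dim=0)
    sigma = torch.dot(u, w @ v)
    if sigma > max_sigma:
        conv.weight.mul_(max_sigma / sigma)

if __name__ == "__main__":
    model = ToyRecurrentDenoiser()
    bad_frames = find_adversarial_sequence(model)   # sequence that excites the model
    clip_spectral_norm(model.conv)                   # constrain the layer afterwards
```

In practice such a constraint would presumably be applied to every convolutional layer during training; the paper's SRNL goes further and also controls the stable rank, which this sketch does not attempt.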
Related papers
- Video Dynamics Prior: An Internal Learning Approach for Robust Video
Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns the weights of neural modules by optimizing over the corrupted test sequence, leveraging spatio-temporal coherence and the internal statistics of videos.
arXiv Detail & Related papers (2023-12-13T01:57:11Z)
- Fast Full-frame Video Stabilization with Iterative Optimization [21.962533235492625]
We propose an iterative optimization-based learning approach using synthetic datasets for video stabilization.
We develop a two-level (coarse-to-fine) stabilizing algorithm based on the probabilistic flow field.
We take a divide-and-conquer approach and propose a novel multiframe fusion strategy to render full-frame stabilized views.
arXiv Detail & Related papers (2023-07-24T13:24:19Z)
- Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), have notable limitations: GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
- GPU-accelerated SIFT-aided source identification of stabilized videos [63.084540168532065]
We exploit the parallelization capabilities of Graphics Processing Units (GPUs) in the framework of stabilised frames inversion.
We propose to exploit SIFT features to estimate the camera momentum and to identify less stabilized temporal segments.
Experiments confirm the effectiveness of the proposed approach in reducing the required computational time and improving the source identification accuracy.
arXiv Detail & Related papers (2022-07-29T07:01:31Z)
- Unsupervised Flow-Aligned Sequence-to-Sequence Learning for Video Restoration [85.3323211054274]
How to properly model the inter-frame relation within the video sequence is an important but unsolved challenge for video restoration (VR).
In this work, we propose an unsupervised flow-aligned sequence-to-sequence model (S2SVR) to address this problem.
S2SVR shows superior performance in multiple VR tasks, including video deblurring, video super-resolution, and compressed video quality enhancement.
arXiv Detail & Related papers (2022-05-20T14:14:48Z)
- Deep Motion Blind Video Stabilization [4.544151613454639]
This work aims to declutter this over-complicated formulation of video stabilization with the help of a novel dataset.
We successfully learn motion-blind full-frame video stabilization by employing strictly conventional generative techniques.
Our method achieves a $\sim 3\times$ speed-up over the currently available fastest video stabilization methods.
arXiv Detail & Related papers (2020-11-19T07:26:06Z)
- Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains a frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z)
- Training Generative Adversarial Networks by Solving Ordinary Differential Equations [54.23691425062034]
We study the continuous-time dynamics induced by GAN training.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error.
We experimentally verify that well-known ODE solvers (such as Runge-Kutta) can stabilise training.
arXiv Detail & Related papers (2020-10-28T15:23:49Z)