Learning Truncated Causal History Model for Video Restoration
- URL: http://arxiv.org/abs/2410.03936v2
- Date: Tue, 15 Oct 2024 15:57:10 GMT
- Title: Learning Truncated Causal History Model for Video Restoration
- Authors: Amirhosein Ghasemabadi, Muhammad Kamran Janjua, Mohammad Salameh, Di Niu
- Abstract summary: TURTLE learns the truncated causal history model for efficient and high-performing video restoration.
We report new state-of-the-art results on a multitude of video restoration benchmark tasks.
- Score: 14.381907888022615
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One key challenge to video restoration is to model the transition dynamics of video frames governed by motion. In this work, we propose TURTLE to learn the truncated causal history model for efficient and high-performing video restoration. Unlike traditional methods that process a range of contextual frames in parallel, TURTLE enhances efficiency by storing and summarizing a truncated history of the input frame latent representation into an evolving historical state. This is achieved through a sophisticated similarity-based retrieval mechanism that implicitly accounts for inter-frame motion and alignment. The causal design in TURTLE enables recurrence in inference through state-memorized historical features while allowing parallel training by sampling truncated video clips. We report new state-of-the-art results on a multitude of video restoration benchmark tasks, including video desnowing, nighttime video deraining, video raindrops and rain streak removal, video super-resolution, real-world and synthetic video deblurring, and blind video denoising while reducing the computational cost compared to existing best contextual methods on all these tasks.
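To make the history mechanism concrete, here is a minimal NumPy sketch of the idea the abstract describes: a truncated FIFO buffer of past frame latents, queried with cosine-similarity attention so each token of the current frame retrieves its best-matching history features wherever motion moved them. All names here (`TruncatedHistoryState`, `history_len`, the 0.1 temperature) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class TruncatedHistoryState:
    """Keeps the latents of the last `history_len` frames (the truncated history)."""

    def __init__(self, history_len=4):
        self.history_len = history_len
        self.buffer = []  # oldest first; each entry is an (N, C) latent array

    def update(self, latent):
        self.buffer.append(latent)
        if len(self.buffer) > self.history_len:
            self.buffer.pop(0)  # evict the oldest frame latent

    def retrieve(self, query):
        """Similarity-based retrieval: soft-attend over the stored history.

        Cosine-similarity weights implicitly handle inter-frame motion, since
        each query token pulls from wherever its best match appears in history.
        """
        if not self.buffer:
            return query
        history = np.concatenate(self.buffer, axis=0)          # (T*N, C)
        q = query / np.linalg.norm(query, axis=-1, keepdims=True)
        k = history / np.linalg.norm(history, axis=-1, keepdims=True)
        logits = q @ k.T / 0.1                                 # (N, T*N)
        logits -= logits.max(axis=-1, keepdims=True)           # stable softmax
        attn = np.exp(logits)
        attn /= attn.sum(axis=-1, keepdims=True)
        return attn @ history                                  # (N, C)

rng = np.random.default_rng(0)
state = TruncatedHistoryState(history_len=4)
for t in range(6):                            # a stream of 6 frames
    latent = rng.standard_normal((16, 32))    # 16 tokens, 32 channels per frame
    summary = state.retrieve(latent)          # history features aligned to frame t
    state.update(latent)
```

Because the state is updated causally, inference can run recurrently over arbitrarily long videos, while training only ever needs clips of `history_len` frames.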
Related papers
- DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models [9.145545884814327]
This paper introduces a method for zero-shot video restoration using pre-trained image restoration diffusion models.
We show that our method achieves top performance in zero-shot video restoration.
Our technique works with any 2D restoration diffusion model, offering a versatile and powerful tool for video enhancement tasks without extensive retraining.
arXiv Detail & Related papers (2024-07-01T17:59:12Z) - VJT: A Video Transformer on Joint Tasks of Deblurring, Low-light Enhancement and Denoising [45.349350685858276]
The video restoration task aims to recover high-quality videos from low-quality observations.
Videos often suffer from different types of degradation, such as blur, low light, and noise.
We propose an efficient end-to-end video transformer approach for the joint task of video deblurring, low-light enhancement, and denoising.
arXiv Detail & Related papers (2024-01-26T10:27:56Z) - ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations [13.38405890753946]
We introduce a self-supervised method, Consistent Video Restoration through Turbulence (ConVRT).
ConVRT is a test-time optimization method featuring a neural video representation designed to enhance temporal consistency in restoration.
A key innovation of ConVRT is the integration of a pretrained vision-language model (CLIP) for semantic-oriented supervision.
arXiv Detail & Related papers (2023-12-07T20:19:48Z) - Just a Glimpse: Rethinking Temporal Information for Video Continual Learning [58.7097258722291]
We propose a novel replay mechanism for effective video continual learning based on individual/single frames.
Under extreme memory constraints, video diversity plays a more significant role than temporal information.
Our method achieves state-of-the-art performance, outperforming the previous state-of-the-art by up to 21.49%.
arXiv Detail & Related papers (2023-05-28T19:14:25Z) - Video Event Restoration Based on Keyframes for Video Anomaly Detection [9.18057851239942]
Existing deep neural network based video anomaly detection (VAD) methods mostly follow the route of frame reconstruction or frame prediction.
We introduce a brand-new VAD paradigm to break through these limitations.
We propose a novel U-shaped Swin Transformer Network with Dual Skip Connections (USTN-DSC) for video event restoration.
arXiv Detail & Related papers (2023-04-11T10:13:19Z) - Recurrent Video Restoration Transformer with Guided Deformable Attention [116.1684355529431]
We propose RVRT, which processes local neighboring frames in parallel within a globally recurrent framework.
RVRT achieves state-of-the-art performance on benchmark datasets with balanced model size, testing memory and runtime.
arXiv Detail & Related papers (2022-06-05T10:36:09Z) - Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z) - VRT: A Video Restoration Transformer [126.79589717404863]
Video restoration (e.g., video super-resolution) aims to restore high-quality frames from low-quality frames.
We propose a Video Restoration Transformer (VRT) with parallel frame prediction and long-range temporal dependency modelling abilities.
arXiv Detail & Related papers (2022-01-28T17:54:43Z) - Self-Conditioned Probabilistic Learning of Video Rescaling [70.10092286301997]
We propose a self-conditioned probabilistic framework for video rescaling to learn the paired downscaling and upscaling procedures simultaneously.
We decrease the entropy of the information lost in the downscaling by maximizing its conditioned probability on the strong spatial-temporal prior information.
We extend the framework to a lossy video compression system, in which a gradient estimator for non-differentiable industrial lossy codecs is proposed.
arXiv Detail & Related papers (2021-07-24T15:57:15Z) - Noisy-LSTM: Improving Temporal Awareness for Video Semantic Segmentation [29.00635219317848]
This paper presents a new model named Noisy-LSTM, which is trainable in an end-to-end manner.
We also present a simple yet effective training strategy, which replaces a frame in video sequence with noises.
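As a rough illustration of that training strategy (a minimal sketch; the function name, the replacement probability, and the use of uniform noise are assumptions, not the paper's exact recipe):

```python
import random
import numpy as np

def replace_frame_with_noise(clip, p=0.3, rng=None):
    """With probability p, replace one randomly chosen frame of the clip with
    noise, discouraging the model from over-relying on any single frame."""
    rng = rng or random.Random()
    clip = [frame.copy() for frame in clip]  # keep the caller's clip intact
    if rng.random() < p:
        idx = rng.randrange(len(clip))
        noise = np.random.default_rng().uniform(0.0, 1.0, size=clip[idx].shape)
        clip[idx] = noise.astype(clip[idx].dtype)
    return clip

frames = [np.zeros((4, 4), dtype=np.float32) for _ in range(5)]
augmented = replace_frame_with_noise(frames, p=1.0)  # force a replacement
```

The perturbed clip is fed to the model in place of the clean sequence, so the network must rely on surrounding frames for temporal context rather than memorizing any single input frame.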
arXiv Detail & Related papers (2020-10-19T13:08:15Z) - Temporal Context Aggregation for Video Retrieval with Contrastive
Learning [81.12514007044456]
We propose TCA, a video representation learning framework that incorporates long-range temporal information between frame-level features.
The proposed method shows a significant performance advantage (17% mAP on FIVR-200K) over state-of-the-art methods with video-level features.
arXiv Detail & Related papers (2020-08-04T05:24:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.