Enhanced Spatio-Temporal Interaction Learning for Video Deraining: A
Faster and Better Framework
- URL: http://arxiv.org/abs/2103.12318v1
- Date: Tue, 23 Mar 2021 05:19:35 GMT
- Title: Enhanced Spatio-Temporal Interaction Learning for Video Deraining: A
Faster and Better Framework
- Authors: Kaihao Zhang, Dongxu Li, Wenhan Luo, Wen-Yan Lin, Fang Zhao, Wenqi
Ren, Wei Liu, Hongdong Li
- Abstract summary: Video deraining is an important task in computer vision, as unwanted rain hampers the visibility of videos and degrades the robustness of most outdoor vision systems.
We present a new end-to-end deraining framework, named Enhanced Spatio-Temporal Interaction Network (ESTINet).
ESTINet considerably boosts current state-of-the-art video deraining quality and speed.
- Score: 93.37833982180538
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video deraining is an important task in computer vision as the unwanted rain
hampers the visibility of videos and deteriorates the robustness of most
outdoor vision systems. Despite the significant success which has been achieved
for video deraining recently, two major challenges remain: 1) how to exploit
the vast information among continuous frames to extract powerful
spatio-temporal features across both the spatial and temporal domains, and 2)
how to restore high-quality derained videos with a high-speed approach. In this
paper, we present a new end-to-end video deraining framework, named Enhanced
Spatio-Temporal Interaction Network (ESTINet), which considerably boosts
current state-of-the-art video deraining quality and speed. The ESTINet takes
advantage of deep residual networks and convolutional long short-term
memory, which can capture the spatial features and temporal correlations among
continuous frames at very little computational cost. Extensive
experiments on three public datasets show that the proposed ESTINet runs
faster than competing approaches while outperforming state-of-the-art
methods.
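The abstract pairs residual spatial features with a convolutional LSTM that carries temporal state across frames. As an illustration only, here is a minimal NumPy sketch of a single-channel ConvLSTM cell stepped over a toy frame sequence; all names, kernel sizes, and shapes are hypothetical and are not taken from the ESTINet paper.

```python
import numpy as np

def conv2d_same(x, k):
    """Zero-padded 'same' 2D cross-correlation of feature map x with kernel k."""
    H, W = x.shape
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ConvLSTMCell:
    """Hypothetical single-channel ConvLSTM: LSTM gates computed by convolutions."""
    def __init__(self, ksize=3, seed=0):
        rng = np.random.default_rng(seed)
        # one input-to-gate and one hidden-to-gate kernel per gate (i, f, o, g)
        self.kx = {g: rng.normal(0, 0.1, (ksize, ksize)) for g in "ifog"}
        self.kh = {g: rng.normal(0, 0.1, (ksize, ksize)) for g in "ifog"}

    def step(self, x, h, c):
        gate = lambda g: conv2d_same(x, self.kx[g]) + conv2d_same(h, self.kh[g])
        i = sigmoid(gate("i"))          # input gate
        f = sigmoid(gate("f"))          # forget gate
        o = sigmoid(gate("o"))          # output gate
        g = np.tanh(gate("g"))          # candidate cell update
        c = f * c + i * g               # cell state: temporal memory across frames
        h = o * np.tanh(c)              # hidden state: spatio-temporal feature map
        return h, c

# Toy "video": per-frame feature maps (stand-ins for residual-network features)
frames = [np.random.default_rng(t).normal(size=(8, 8)) for t in range(4)]
cell = ConvLSTMCell()
h = c = np.zeros((8, 8))
for x in frames:
    h, c = cell.step(x, h, c)
print(h.shape)
```

Because the gates are convolutions rather than dense layers, the hidden state preserves the 2D spatial layout while still accumulating information over time, which is the property the abstract attributes to combining residual networks with convolutional LSTM.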
Related papers
- Online Video Quality Enhancement with Spatial-Temporal Look-up Tables [42.07242907586958]
Low latency rates are crucial for online video-based applications, such as video conferencing and cloud gaming.
Existing quality enhancement methods are limited by slow inference speed and the requirement for temporal information contained in future frames.
We propose STLVQE, specifically designed to address the rarely studied online video quality enhancement (Online-VQE) problem.
arXiv Detail & Related papers (2023-11-22T06:49:44Z) - ReBotNet: Fast Real-time Video Enhancement [59.08038313427057]
Most restoration networks are slow, suffer from high computational bottlenecks, and cannot be used for real-time video enhancement.
In this work, we design an efficient and fast framework to perform real-time enhancement for practical use-cases like live video calls and video streams.
To evaluate our method, we curate two new datasets that emulate real-world video call and streaming scenarios, and show extensive results on multiple datasets where ReBotNet outperforms existing approaches with lower computation, reduced memory requirements, and faster inference time.
arXiv Detail & Related papers (2023-03-23T17:58:05Z) - Video Dehazing via a Multi-Range Temporal Alignment Network with
Physical Prior [117.6741444489174]
Video dehazing aims to recover haze-free frames with high visibility and contrast.
This paper presents a novel framework to explore the physical haze priors and aggregate temporal information.
We construct the first large-scale outdoor video dehazing benchmark dataset.
arXiv Detail & Related papers (2023-03-17T03:44:17Z) - Video Salient Object Detection via Contrastive Features and Attention
Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z) - An Efficient Recurrent Adversarial Framework for Unsupervised Real-Time
Video Enhancement [132.60976158877608]
We propose an efficient adversarial video enhancement framework that learns directly from unpaired video examples.
In particular, our framework introduces new recurrent cells that consist of interleaved local and global modules for implicit integration of spatial and temporal information.
The proposed design allows our recurrent cells to efficiently propagate temporal information across frames and reduces the need for high-complexity networks.
arXiv Detail & Related papers (2020-12-24T00:03:29Z) - Fast Video Salient Object Detection via Spatiotemporal Knowledge
Distillation [20.196945571479002]
We present a lightweight network tailored for video salient object detection.
Specifically, we combine a saliency guidance embedding structure and spatial knowledge distillation to refine the spatial features.
In the temporal aspect, we propose a temporal knowledge distillation strategy, which allows the network to learn the robust temporal features.
arXiv Detail & Related papers (2020-10-20T04:48:36Z) - Exploring Rich and Efficient Spatial Temporal Interactions for Real Time
Video Salient Object Detection [87.32774157186412]
Mainstream methods formulate their video saliency mainly from two independent venues, i.e., the spatial and temporal branches.
In this paper, we propose a spatiotemporal network to achieve such improvement in a fully interactive fashion.
Our method is easy to implement yet effective, achieving high-quality video saliency detection at a real-time speed of 50 FPS.
arXiv Detail & Related papers (2020-08-07T03:24:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.