ASF-Net: Robust Video Deraining via Temporal Alignment and Online
Adaptive Learning
- URL: http://arxiv.org/abs/2309.00956v1
- Date: Sat, 2 Sep 2023 14:50:13 GMT
- Title: ASF-Net: Robust Video Deraining via Temporal Alignment and Online
Adaptive Learning
- Authors: Xinwei Xue, Jia He, Long Ma, Xiangyu Meng, Wenlin Li, Risheng Liu
- Abstract summary: We propose a new computational paradigm, Alignment-Shift-Fusion Network (ASF-Net), which incorporates a temporal shift module.
We construct a LArge-scale RAiny video dataset (LARA) to support the development of this research community.
Our proposed approach exhibits superior performance on three benchmarks and compelling visual quality in real-world scenarios.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent times, learning-based methods for video deraining have demonstrated
commendable results. However, there are two critical challenges that these
methods are yet to address: exploiting temporal correlations among adjacent
frames and ensuring adaptability to unknown real-world scenarios. To overcome
these challenges, we explore video deraining along two axes: paradigm design and
learning strategy construction. Specifically, we propose a new computational
paradigm, Alignment-Shift-Fusion Network (ASF-Net), which incorporates a
temporal shift module. This module is novel to this field and provides deeper
exploration of temporal information by facilitating the exchange of
channel-level information within the feature space. To fully unleash the
model's representational capacity, we further construct a LArge-scale RAiny
video dataset (LARA), which also supports the broader development of this
research community. Building on the newly constructed dataset, we refine the
parameter learning process by developing an innovative re-degraded learning
strategy. This
strategy bridges the gap between synthetic and real-world scenes, resulting in
stronger scene adaptability. Our proposed approach exhibits superior
performance on three benchmarks and compelling visual quality in real-world
scenarios, underscoring its efficacy. The code is available at
https://github.com/vis-opt-group/ASF-Net.
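A note on the temporal shift idea: the abstract describes the module only at a high level. In the video-understanding literature, temporal shift modules typically move a fraction of feature channels one step forward or backward along the time axis, so each frame's features mix with those of its neighbours at essentially zero extra compute. The snippet below is a minimal sketch of that generic channel-wise temporal shift, not ASF-Net's actual implementation (which lives in the repository linked above); the function name, tensor layout, and shift fraction are illustrative assumptions.

```python
import torch

def temporal_shift(x: torch.Tensor, shift_div: int = 8) -> torch.Tensor:
    """Generic channel-wise temporal shift (illustrative sketch, not ASF-Net's code).

    x: features of shape (N, T, C, H, W) -- batch, time, channels, height, width.
    shift_div: 1/shift_div of the channels shift backward in time, another
               1/shift_div shift forward; the remaining channels stay in place.
    """
    n, t, c, h, w = x.shape
    fold = c // shift_div
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                  # frame t receives frame t+1
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]  # frame t receives frame t-1
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # untouched channels
    return out

# Example: 2 clips, 5 frames, 64 channels, 32x32 feature maps.
feats = torch.randn(2, 5, 64, 32, 32)
print(temporal_shift(feats).shape)  # torch.Size([2, 5, 64, 32, 32])
```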
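The re-degraded learning strategy is likewise not specified in the abstract. A common pattern for bridging synthetic and real domains in restoration is a re-degradation consistency loss: derain a real frame, synthetically re-apply rain to the prediction, and train the model to recover its own pseudo-clean estimate from the re-degraded input. The sketch below illustrates only that generic pattern under stated assumptions; `add_synthetic_rain`, the loss choice, and the training step are hypothetical placeholders, not the authors' procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def add_synthetic_rain(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical rain synthesizer: additive streak-like noise (placeholder)."""
    streaks = torch.randn_like(x).clamp(min=0) * 0.1
    return (x + streaks).clamp(0.0, 1.0)

def redegradation_loss(model: nn.Module, real_rainy: torch.Tensor) -> torch.Tensor:
    """One self-supervised consistency term of a generic re-degradation scheme."""
    with torch.no_grad():
        pseudo_clean = model(real_rainy)          # model's own estimate, frozen
    re_rained = add_synthetic_rain(pseudo_clean)  # re-degrade the estimate
    restored = model(re_rained)                   # derain the re-degraded input
    # Consistency: the model should recover its own pseudo-clean estimate.
    return F.l1_loss(restored, pseudo_clean)

# Toy usage with an identity "model" standing in for a real deraining network.
loss = redegradation_loss(nn.Identity(), torch.rand(1, 3, 64, 64))
print(loss.item())
```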
Related papers
- Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images captured in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in modeling the distribution discrepancy across different scenes.
We introduce a bilevel paradigm to model this latent correspondence.
A bilevel learning framework is constructed to endow the encoder with scene-irrelevant generality across diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning [54.67880602409801]
In this paper, we study the problem of pre-training world models with abundant in-the-wild videos for efficient learning of visual control tasks.
We introduce Contextualized World Models (ContextWM) that explicitly separate context and dynamics modeling.
Our experiments show that in-the-wild video pre-training equipped with ContextWM can significantly improve the sample efficiency of model-based reinforcement learning.
arXiv Detail & Related papers (2023-05-29T14:29:12Z)
- Deeply-Coupled Convolution-Transformer with Spatial-temporal Complementary Learning for Video-based Person Re-identification [91.56939957189505]
We propose a novel spatial-temporal complementary learning framework named Deeply-Coupled Convolution-Transformer (DCCT) for high-performance video-based person Re-ID.
Our framework attains better performance than most state-of-the-art methods.
arXiv Detail & Related papers (2023-04-27T12:16:44Z)
- Video-SwinUNet: Spatio-temporal Deep Learning Framework for VFSS Instance Segmentation [10.789826145990016]
This paper presents a deep learning framework for medical video segmentation.
Our framework explicitly extracts features from neighbouring frames across the temporal dimension.
It incorporates them with a temporal feature blender, which then tokenises the high-level temporal feature to form a strong global feature encoded via a Swin Transformer.
arXiv Detail & Related papers (2023-02-22T12:09:39Z)
- FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification [49.06447472006251]
We propose a novel deep neural network, termed FuTH-Net, to model not only holistic features, but also temporal relations for aerial video classification.
Our model is evaluated on two aerial video classification datasets, ERA and Drone-Action, and achieves state-of-the-art results.
arXiv Detail & Related papers (2022-09-22T21:15:58Z)
- Spatiotemporal Inconsistency Learning for DeepFake Video Detection [51.747219106855624]
We present a novel temporal modeling paradigm in the temporal inconsistency module (TIM) by exploiting the temporal difference over adjacent frames along both horizontal and vertical directions.
The information supplement module (ISM) simultaneously utilizes spatial information from the spatial inconsistency module (SIM) and temporal information from TIM to establish a more comprehensive spatial-temporal representation.
arXiv Detail & Related papers (2021-09-04T13:05:37Z)
- Fast Video Salient Object Detection via Spatiotemporal Knowledge Distillation [20.196945571479002]
We present a lightweight network tailored for video salient object detection.
Specifically, we combine a saliency guidance embedding structure and spatial knowledge distillation to refine the spatial features.
In the temporal aspect, we propose a temporal knowledge distillation strategy, which allows the network to learn robust temporal features.
arXiv Detail & Related papers (2020-10-20T04:48:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.