STDAN: Deformable Attention Network for Space-Time Video
Super-Resolution
- URL: http://arxiv.org/abs/2203.06841v1
- Date: Mon, 14 Mar 2022 03:40:35 GMT
- Title: STDAN: Deformable Attention Network for Space-Time Video
Super-Resolution
- Authors: Hai Wang, Xiaoyu Xiang, Yapeng Tian, Wenming Yang, Qingmin Liao
- Abstract summary: We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature interpolation (LSTFI) module, which is capable of excavating abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated.
- Score: 39.18399652834573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The target of space-time video super-resolution (STVSR) is to increase the
spatial-temporal resolution of low-resolution (LR) and low frame rate (LFR)
videos. Recent approaches based on deep learning have made significant
improvements, but most of them only use two adjacent frames, that is,
short-term features, to synthesize the missing frame embedding, which fails to
fully explore the information flow of consecutive input LR frames. In
addition, existing STVSR models hardly exploit the temporal contexts explicitly
to assist high-resolution (HR) frame reconstruction. To address these issues,
in this paper, we propose a deformable attention network called STDAN for
STVSR. First, we devise a long-short term feature interpolation (LSTFI) module,
which is capable of excavating abundant content from more neighboring input
frames for the interpolation process through a bidirectional RNN structure.
Second, we put forward a spatial-temporal deformable feature aggregation
(STDFA) module, in which spatial and temporal contexts in dynamic video frames
are adaptively captured and aggregated to enhance SR reconstruction.
Experimental results on several datasets demonstrate that our approach
outperforms state-of-the-art STVSR methods.
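The long-short term interpolation idea in the abstract can be illustrated with a toy sketch. This is not the authors' LSTFI module: the function names (`blend`, `recurrent_pass`, `interpolate_features`), the scalar `decay` recurrence, and the use of plain float lists as "features" are all assumptions made for illustration. The sketch only shows how a forward and a backward recurrent pass let each synthesized in-between feature depend on more than its two adjacent frames.

```python
from typing import List

Feature = List[float]  # hypothetical stand-in for a per-frame feature map


def blend(a: Feature, b: Feature, w: float) -> Feature:
    """Element-wise convex combination w * a + (1 - w) * b."""
    return [w * x + (1.0 - w) * y for x, y in zip(a, b)]


def recurrent_pass(frames: List[Feature], decay: float) -> List[Feature]:
    """Toy recurrence (assumed, not from the paper):
    h_0 = x_0 and h_t = decay * h_{t-1} + (1 - decay) * x_t."""
    hidden = [frames[0]]
    for x in frames[1:]:
        hidden.append(blend(hidden[-1], x, decay))
    return hidden


def interpolate_features(frames: List[Feature], decay: float = 0.5) -> List[Feature]:
    """Return 2N - 1 features: the N inputs interleaved with N - 1 synthesized ones."""
    if not frames:
        return []
    # Forward state at t summarizes frames 0..t; backward state at t summarizes t..N-1.
    forward = recurrent_pass(frames, decay)
    backward = recurrent_pass(frames[::-1], decay)[::-1]
    out: List[Feature] = []
    for t in range(len(frames) - 1):
        out.append(frames[t])
        # Missing frame between t and t+1: past context from the forward state
        # at t, future context from the backward state at t+1.
        out.append(blend(forward[t], backward[t + 1], 0.5))
    out.append(frames[-1])
    return out


if __name__ == "__main__":
    print(interpolate_features([[0.0], [2.0], [4.0]]))
    # prints [[0.0], [1.5], [2.0], [2.5], [4.0]]
```

In this toy run the feature synthesized between the first two frames is 1.5 rather than their plain average of 1.0, because the backward state at the second frame already carries information from the third frame.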
Related papers
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range
Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert the given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z)
- Temporal Consistency Learning of inter-frames for Video Super-Resolution [38.26035126565062]
Video super-resolution (VSR) is a task that aims to reconstruct high-resolution (HR) frames from the low-resolution (LR) reference frame and multiple neighboring frames.
Existing methods generally explore information propagation and frame alignment to improve the performance of VSR.
We propose a Temporal Consistency learning Network (TCNet) for VSR in an end-to-end manner, to enhance the consistency of the reconstructed videos.
arXiv Detail & Related papers (2022-11-03T08:23:57Z)
- Enhancing Space-time Video Super-resolution via Spatial-temporal Feature
Interaction [9.456643513690633]
The aim of space-time video super-resolution (STVSR) is to increase both the frame rate and the spatial resolution of a video.
Recent approaches solve STVSR using end-to-end deep neural networks.
We propose a spatial-temporal feature interaction network to enhance STVSR by exploiting both spatial and temporal correlations.
arXiv Detail & Related papers (2022-07-18T22:10:57Z)
- Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time
Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate for a given video.
Existing methods typically struggle to efficiently leverage information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
arXiv Detail & Related papers (2021-10-13T15:21:30Z)
- Temporal Modulation Network for Controllable Space-Time Video
Super-Resolution [66.06549492893947]
Space-time video super-resolution aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos.
Deformable convolution based methods have achieved promising STVSR performance, but they could only infer the intermediate frame pre-defined in the training stage.
We propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction.
arXiv Detail & Related papers (2021-04-21T17:10:53Z)
- Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video
Super-Resolution [100.11355888909102]
Space-time video super-resolution aims at generating a high-resolution (HR) slow-motion video from a low-resolution (LR) and low frame rate (LFR) video sequence.
We present a one-stage space-time video super-resolution framework, which can directly reconstruct an HR slow-motion video sequence from an input LR and LFR video.
arXiv Detail & Related papers (2021-04-15T17:59:23Z)
- Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution [95.26202278535543]
A simple solution is to split the task into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
Temporal synthesis and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.