Enhancing Space-time Video Super-resolution via Spatial-temporal Feature
Interaction
- URL: http://arxiv.org/abs/2207.08960v3
- Date: Thu, 20 Apr 2023 06:48:18 GMT
- Title: Enhancing Space-time Video Super-resolution via Spatial-temporal Feature
Interaction
- Authors: Zijie Yue, Miaojing Shi
- Abstract summary: The aim of space-time video super-resolution (STVSR) is to increase both the frame rate and the spatial resolution of a video.
Recent approaches solve STVSR using end-to-end deep neural networks.
We propose a spatial-temporal feature interaction network to enhance STVSR by exploiting both spatial and temporal correlations.
- Score: 9.456643513690633
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The target of space-time video super-resolution (STVSR) is to increase both
the frame rate (also referred to as the temporal resolution) and the spatial
resolution of a given video. Recent approaches solve STVSR using end-to-end
deep neural networks. A popular solution is to first increase the frame rate of
the video; then perform feature refinement among different frame features; and
last increase the spatial resolutions of these features. The temporal
correlation among features of different frames is carefully exploited in this
process. The spatial correlation among features of different (spatial)
resolutions, despite also being very important, is not emphasized. In
this paper, we propose a spatial-temporal feature interaction network to
enhance STVSR by exploiting both spatial and temporal correlations among
features of different frames and spatial resolutions. Specifically, the
spatial-temporal frame interpolation module is introduced to interpolate low-
and high-resolution intermediate frame features simultaneously and
interactively. The spatial-temporal local and global refinement modules are
respectively deployed afterwards to exploit the spatial-temporal correlation
among different features for their refinement. Finally, a novel motion
consistency loss is employed to enhance the motion continuity among
reconstructed frames. We conduct experiments on three standard benchmarks,
Vid4, Vimeo-90K and Adobe240, and the results demonstrate that our method
outperforms state-of-the-art methods by a considerable margin. Our code will be available at
https://github.com/yuezijie/STINet-Space-time-Video-Super-resolution.
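To make the pipeline above concrete, here is a minimal PyTorch sketch of the two ideas that lend themselves to compact code: interpolating low- and high-resolution intermediate-frame features simultaneously with cross-resolution interaction, and a motion consistency term that compares temporal differences of reconstructed and ground-truth frames. All module and function names are illustrative assumptions, not the authors' implementation (the linked repository contains the real one), and the local/global refinement stages are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InteractiveFrameInterpolation(nn.Module):
    """Hypothetical sketch: interpolate the intermediate frame's features at
    low and high resolution at the same time, letting each branch borrow
    information from the other (the cross-resolution "interaction")."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.lr_blend = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.hr_blend = nn.Conv2d(2 * channels, channels, 3, padding=1)
        # 1x1 convs that inject one branch's features into the other.
        self.lr_from_hr = nn.Conv2d(channels, channels, 1)
        self.hr_from_lr = nn.Conv2d(channels, channels, 1)

    def forward(self, lr_pair, hr_pair):
        # lr_pair / hr_pair: features of the two neighboring frames,
        # shapes (B, C, H, W) and (B, C, sH, sW) respectively.
        lr_mid = self.lr_blend(torch.cat(lr_pair, dim=1))
        hr_mid = self.hr_blend(torch.cat(hr_pair, dim=1))
        # Interaction: resample each branch to the other's resolution.
        lr_mid = lr_mid + self.lr_from_hr(
            F.interpolate(hr_mid, size=lr_mid.shape[-2:],
                          mode="bilinear", align_corners=False))
        hr_mid = hr_mid + self.hr_from_lr(
            F.interpolate(lr_mid, size=hr_mid.shape[-2:],
                          mode="bilinear", align_corners=False))
        return lr_mid, hr_mid

def motion_consistency_loss(pred_frames, gt_frames):
    """Simple frame-difference proxy for motion consistency: penalize the
    mismatch between temporal differences of consecutive predicted frames
    and those of the ground truth. The paper's actual loss may differ.
    Inputs have shape (B, T, C, H, W)."""
    pred_diff = pred_frames[:, 1:] - pred_frames[:, :-1]
    gt_diff = gt_frames[:, 1:] - gt_frames[:, :-1]
    return F.l1_loss(pred_diff, gt_diff)
```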
Related papers
- HR-INR: Continuous Space-Time Video Super-Resolution via Event Camera [22.208120663778043]
Continuous space-time super-resolution (C-STVSR) aims to simultaneously enhance resolution and frame rate at an arbitrary scale.
We propose a novel C-STVSR framework, called HR-INR, which captures both holistic dependencies and regional motions based on implicit neural representation (INR).
We then propose a novel INR-based decoder with temporal embeddings to capture long-term dependencies with a larger temporal perception field.
arXiv Detail & Related papers (2024-05-22T06:51:32Z)
- Local-Global Temporal Difference Learning for Satellite Video Super-Resolution [55.69322525367221]
We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation.
To fully utilize the local and global temporal information within frames, we systematically model the short-term and long-term temporal discrepancies.
Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2023-04-10T07:04:40Z)
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert the given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z)
- STDAN: Deformable Attention Network for Space-Time Video Super-Resolution [39.18399652834573]
We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature interpolation (LSTFI) module, which is capable of extracting abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated.
arXiv Detail & Related papers (2022-03-14T03:40:35Z)
- MEGAN: Memory Enhanced Graph Attention Network for Space-Time Video Super-Resolution [8.111645835455658]
Space-time video super-resolution (STVSR) aims to construct a high space-time resolution video sequence from the corresponding low-frame-rate, low-resolution video sequence.
Inspired by the recent success of considering spatial-temporal information for space-time super-resolution, our main goal in this work is to take full account of the spatial and temporal correlations.
arXiv Detail & Related papers (2021-10-28T17:37:07Z)
- Temporal Modulation Network for Controllable Space-Time Video Super-Resolution [66.06549492893947]
Space-time video super-resolution aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos.
Deformable convolution-based methods have achieved promising STVSR performance, but they can only infer intermediate frames pre-defined in the training stage.
We propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction.
arXiv Detail & Related papers (2021-04-21T17:10:53Z)
- Exploring Rich and Efficient Spatial Temporal Interactions for Real Time Video Salient Object Detection [87.32774157186412]
Mainstream methods formulate video saliency mainly from two independent venues, i.e., the spatial and temporal branches.
In this paper, we propose a spatio-temporal network to achieve such improvement in a fully interactive fashion.
Our method is easy to implement yet effective, achieving high-quality video saliency detection at a real-time speed of 50 FPS.
arXiv Detail & Related papers (2020-08-07T03:24:04Z)
- Co-Saliency Spatio-Temporal Interaction Network for Person Re-Identification in Videos [85.6430597108455]
We propose a novel Co-Saliency Spatio-Temporal Interaction Network (CSTNet) for person re-identification in videos.
It captures the common salient foreground regions among video frames and explores the spatial-temporal long-range context interdependency from such regions.
Multiple spatial-temporal interaction modules within CSTNet are proposed, which exploit the spatial and temporal long-range context interdependencies of such features, together with their spatial-temporal correlation.
arXiv Detail & Related papers (2020-04-10T10:23:58Z)
- Space-Time-Aware Multi-Resolution Video Enhancement [25.90440000711309]
The proposed model, called STARnet, super-resolves jointly in space and time.
We show that STARnet improves the performances of space-time, spatial, and temporal video super-resolution by substantial margins on publicly available datasets.
arXiv Detail & Related papers (2020-03-30T00:33:17Z)
- Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal synthesis and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)
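To illustrate the one-stage design in the Zooming Slow-Mo entry just above: instead of chaining a VFI network and a VSR network at the pixel level, a one-stage model synthesizes the intermediate frame's features and upsamples all features jointly. Below is a minimal sketch of that idea only, with placeholder convolutions standing in for the paper's actual deformable interpolation and aggregation modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OneStageSTVSR(nn.Module):
    """One-stage space-time SR sketch: interpolate in feature space, then
    upsample. All layers are hypothetical placeholders, not the paper's
    architecture."""
    def __init__(self, channels: int = 64, scale: int = 4):
        super().__init__()
        self.scale = scale
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        self.interp = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.decode = nn.Conv2d(channels, 3 * scale * scale, 3, padding=1)

    def forward(self, frame0, frame1):
        f0, f1 = self.encode(frame0), self.encode(frame1)
        # Synthesize the intermediate frame's features directly, instead of
        # first rendering an LR intermediate frame (the two-stage route).
        f_mid = self.interp(torch.cat([f0, f1], dim=1))
        out = []
        for f in (f0, f_mid, f1):
            # Pixel-shuffle upsampling to high resolution.
            out.append(F.pixel_shuffle(self.decode(f), self.scale))
        return torch.stack(out, dim=1)  # (B, T=3, 3, sH, sW): HR triplet
```

A two-stage pipeline would instead render an LR intermediate frame with a VFI model and pass it to a separate VSR model; operating on shared features avoids that intermediate rendering and lets temporal synthesis and spatial upsampling inform each other.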