MEGAN: Memory Enhanced Graph Attention Network for Space-Time Video
Super-Resolution
- URL: http://arxiv.org/abs/2110.15327v1
- Date: Thu, 28 Oct 2021 17:37:07 GMT
- Title: MEGAN: Memory Enhanced Graph Attention Network for Space-Time Video
Super-Resolution
- Authors: Chenyu You, Lianyi Han, Aosong Feng, Ruihan Zhao, Hui Tang, Wei Fan
- Abstract summary: Space-time video super-resolution (STVSR) aims to construct a high space-time resolution video sequence from the corresponding low-frame-rate, low-resolution video sequence.
Inspired by the recent success of exploiting spatial-temporal information for space-time super-resolution, our main goal in this work is to take full account of spatial and temporal correlations.
- Score: 8.111645835455658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Space-time video super-resolution (STVSR) aims to construct a high space-time
resolution video sequence from the corresponding low-frame-rate, low-resolution
video sequence. Inspired by the recent success of exploiting spatial-temporal
information for space-time super-resolution, our main goal in this work is to
take full account of the spatial and temporal correlations within the video
sequences of fast dynamic events. To this end, we propose a novel one-stage
memory enhanced graph attention network (MEGAN) for space-time video
super-resolution. Specifically, we build a novel long-range memory graph
aggregation (LMGA) module to dynamically capture correlations along the channel
dimensions of the feature maps and adaptively aggregate channel features to
enhance the feature representations. We introduce a non-local residual block,
which enables each channel-wise feature to attend to global spatial hierarchical
features. In addition, we adopt a progressive fusion module to further enhance
the representation ability by extensively exploiting spatial-temporal
correlations from multiple frames. Experimental results demonstrate that our
method achieves better quantitative and visual results than state-of-the-art
methods.
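
The abstract describes LMGA as dynamically capturing correlations along the channel dimension and adaptively re-aggregating channel features, but gives no implementation details. Below is a minimal sketch of that idea, treating each channel as a graph node under single-head attention; the module name, the pooled node descriptors, and all layer sizes are assumptions for illustration, not MEGAN's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelGraphAttention(nn.Module):
    """Hypothetical LMGA-style block: each feature channel is a graph node,
    nodes attend to each other, and the attention weights re-aggregate the
    channels' spatial maps. Illustrative sketch, not MEGAN's implementation."""

    def __init__(self, channels: int, embed_dim: int = 32):
        super().__init__()
        # Each channel is described by its (average, max) pooled statistics.
        self.to_query = nn.Linear(2, embed_dim)
        self.to_key = nn.Linear(2, embed_dim)
        self.scale = embed_dim ** -0.5
        # Zero-initialized gate: the block starts as an identity mapping.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        desc = torch.stack([x.mean(dim=(2, 3)), x.amax(dim=(2, 3))], dim=-1)  # (B, C, 2)
        q, k = self.to_query(desc), self.to_key(desc)                 # (B, C, D) each
        attn = F.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)  # (B, C, C)
        agg = attn @ x.flatten(2)                                     # (B, C, H*W)
        return x + self.gamma * agg.view(b, c, h, w)
```

A block like this can be dropped after any convolutional stage; since `gamma` starts at zero, the layer initially acts as an identity and can be added to an existing backbone without disturbing it.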
Related papers
- Global Spatial-Temporal Information-based Residual ConvLSTM for Video Space-Time Super-Resolution [29.74501891293423]
We propose a convolutional neural network (CNN) for space-time video super-resolution, namely GIRNet.
To generate highly accurate features, the proposed network integrates a feature-level temporal module with deformable convolutions and a global spatial-temporal information-based residual convolutional long short-term memory (convLSTM).
Experiments on the Vimeo90K dataset show that the proposed method outperforms state-of-the-art techniques in peak signal-to-noise ratio (by 1.45 dB, 1.14 dB, and 0.02 dB over STARnet, TMNet, and 3DAttGAN, respectively).
arXiv Detail & Related papers (2024-07-11T13:01:44Z)
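
For readers unfamiliar with the convLSTM mentioned in the GIRNet summary above, a standard ConvLSTM cell (in the style of Shi et al., 2015) is sketched below. This is generic background only; GIRNet's residual, global-information variant is not reproduced here.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Standard ConvLSTM cell: LSTM gates computed with convolutions so the
    recurrent state keeps its spatial layout. Generic background, not
    GIRNet's residual variant."""

    def __init__(self, in_ch: int, hid_ch: int, kernel: int = 3):
        super().__init__()
        # One convolution produces all four gates at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel,
                               padding=kernel // 2)

    def forward(self, x, state):
        h, c = state  # hidden and cell state, each (B, hid_ch, H, W)
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

# Rolling the cell over a feature sequence:
# cell = ConvLSTMCell(64, 64)
# h = c = torch.zeros(1, 64, 32, 32)
# for feat in per_frame_features:
#     h, (h, c) = cell(feat, (h, c))
```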
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert the given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z)
- Enhancing Space-time Video Super-resolution via Spatial-temporal Feature Interaction [9.456643513690633]
The aim of space-time video super-resolution (STVSR) is to increase both the frame rate and the spatial resolution of a video.
Recent approaches solve STVSR using end-to-end deep neural networks.
We propose a spatial-temporal feature interaction network to enhance STVSR by exploiting both spatial and temporal correlations.
arXiv Detail & Related papers (2022-07-18T22:10:57Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performances with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
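
The core idea behind VideoINR above, decoding video from continuous space-time coordinates, can be illustrated with a toy coordinate MLP. The real model conditions the implicit function on encoder features and decouples spatial and temporal components; this self-contained sketch omits all of that.

```python
import torch
import torch.nn as nn

class SpaceTimeINR(nn.Module):
    """Toy implicit neural representation: map a continuous (x, y, t)
    coordinate to an RGB value, so the video can be queried at any spatial
    resolution and frame rate. VideoINR additionally conditions on encoder
    features, which this sketch omits."""

    def __init__(self, hidden: int = 256, depth: int = 4):
        super().__init__()
        layers, dims = [], [3] + [hidden] * depth + [3]
        for i in range(len(dims) - 1):
            layers.append(nn.Linear(dims[i], dims[i + 1]))
            if i < len(dims) - 2:
                layers.append(nn.ReLU(inplace=True))
        self.mlp = nn.Sequential(*layers)

    def forward(self, coords: torch.Tensor) -> torch.Tensor:
        return self.mlp(coords)  # (..., 3) coordinates -> (..., 3) RGB

# Querying an arbitrary resolution at an arbitrary timestamp t in [0, 1]:
# ys, xs = torch.meshgrid(torch.linspace(-1, 1, 270),
#                         torch.linspace(-1, 1, 480), indexing="ij")
# coords = torch.stack([xs, ys, torch.full_like(xs, 0.25)], dim=-1)
# frame = SpaceTimeINR()(coords)  # (270, 480, 3)
```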
- STDAN: Deformable Attention Network for Space-Time Video Super-Resolution [39.18399652834573]
We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature interpolation (LSTFI) module, which is capable of extracting abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated.
arXiv Detail & Related papers (2022-03-14T03:40:35Z)
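
The deformable aggregation mentioned in the STDAN summary above can be sketched with torchvision's stock deformable convolution: offsets predicted from a reference/neighbor feature pair steer where the neighbor is sampled. The layer sizes and wiring are assumptions for illustration, not STDAN's STDFA module.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformableAggregation(nn.Module):
    """Sketch of deformable feature aggregation: offsets predicted from the
    concatenated reference/neighbor features tell a deformable convolution
    where to sample the neighbor so its content aligns with the reference.
    Layer sizes are illustrative assumptions, not STDAN's STDFA module."""

    def __init__(self, channels: int, kernel: int = 3):
        super().__init__()
        # 2 offset values (dy, dx) per kernel tap.
        self.offsets = nn.Conv2d(2 * channels, 2 * kernel * kernel, 3, padding=1)
        self.deform = DeformConv2d(channels, channels, kernel,
                                   padding=kernel // 2)

    def forward(self, ref: torch.Tensor, neighbor: torch.Tensor) -> torch.Tensor:
        off = self.offsets(torch.cat([ref, neighbor], dim=1))  # (B, 2*k*k, H, W)
        return self.deform(neighbor, off)
```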
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
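
A generic co-attention fusion of low-level and high-level feature maps, loosely in the spirit of the formulation mentioned in the summary above, might look like the following; this is an illustrative guess at the general mechanism, not the paper's module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoAttentionFusion(nn.Module):
    """Illustrative co-attention: an affinity matrix between low-level and
    high-level feature maps lets every low-level position gather high-level
    context before fusion. A guess at the general mechanism, not the
    paper's exact module."""

    def __init__(self, channels: int):
        super().__init__()
        self.proj_low = nn.Conv2d(channels, channels, 1)
        self.proj_high = nn.Conv2d(channels, channels, 1)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # `high` is assumed already upsampled to `low`'s spatial size.
        b, c, h, w = low.shape
        q = self.proj_low(low).flatten(2).transpose(1, 2)      # (B, HW, C)
        k = self.proj_high(high).flatten(2)                    # (B, C, HW)
        attn = F.softmax(q @ k * c ** -0.5, dim=-1)            # (B, HW, HW)
        ctx = attn @ high.flatten(2).transpose(1, 2)           # (B, HW, C)
        ctx = ctx.transpose(1, 2).view(b, c, h, w)
        return self.fuse(torch.cat([low, ctx], dim=1))
```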
- Temporal Modulation Network for Controllable Space-Time Video Super-Resolution [66.06549492893947]
Space-time video super-resolution aims to increase the spatial and temporal resolutions of low-resolution and low-frame-rate videos.
Deformable convolution based methods have achieved promising STVSR performance, but they can only infer intermediate frames pre-defined in the training stage.
We propose a Temporal Modulation Network (TMNet) to interpolate arbitrary intermediate frame(s) with accurate high-resolution reconstruction.
arXiv Detail & Related papers (2021-04-21T17:10:53Z)
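
TMNet's controllable interpolation, summarized above, hinges on conditioning the network on a continuous time t. A FiLM-style sketch of such time conditioning follows; note that TMNet's actual Temporal Modulation Block modulates deformable-convolution-based frame alignment rather than plain feature maps, so this is only a simplified stand-in.

```python
import torch
import torch.nn as nn

class TemporalModulation(nn.Module):
    """FiLM-style sketch of time conditioning: a continuous instant t in
    [0, 1] is embedded and used to scale/shift feature maps, steering one
    network toward any intermediate frame. Simplified stand-in; TMNet's
    block modulates deformable frame alignment instead."""

    def __init__(self, channels: int, t_dim: int = 64):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(1, t_dim), nn.ReLU(inplace=True))
        self.to_scale = nn.Linear(t_dim, channels)
        self.to_shift = nn.Linear(t_dim, channels)

    def forward(self, feat: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W); t: (B, 1) interpolation instant.
        e = self.embed(t)
        scale = self.to_scale(e)[..., None, None]  # (B, C, 1, 1)
        shift = self.to_shift(e)[..., None, None]
        return feat * (1 + scale) + shift
```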
- Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- Video Super-resolution with Temporal Group Attention [127.21615040695941]
We propose a novel method that can effectively incorporate temporal information in a hierarchical way.
The input sequence is divided into several groups, each corresponding to a different frame rate.
It achieves favorable performance against state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2020-07-21T04:54:30Z)
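
The grouping scheme in the Temporal Group Attention summary above, splitting neighbors by temporal distance so that each group resembles the sequence sampled at a different frame rate, can be sketched as follows; the exact grouping rule here is an assumption for illustration.

```python
import torch

def group_frames_by_rate(frames: torch.Tensor, ref: int) -> list:
    """Split a frame stack (T, C, H, W) into groups whose members lie at the
    same temporal distance from the reference frame, so each group looks like
    the sequence sampled at a different frame rate. The grouping rule is an
    illustrative assumption, not the paper's exact scheme."""
    groups: dict = {}
    for i in range(frames.shape[0]):
        d = abs(i - ref)
        if d > 0:
            groups.setdefault(d, []).append(frames[i])
    # Each group keeps the reference frame plus its equally distant neighbors.
    return [torch.stack([frames[ref]] + g) for _, g in sorted(groups.items())]

# Example: 7 frames with the reference in the middle yields 3 groups,
# at temporal distances 1, 2, and 3.
# clips = group_frames_by_rate(torch.randn(7, 3, 64, 64), ref=3)
```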
- Space-Time-Aware Multi-Resolution Video Enhancement [25.90440000711309]
The proposed model, STARnet, super-resolves jointly in space and time.
We show that STARnet improves the performance of space-time, spatial, and temporal video super-resolution by substantial margins on publicly available datasets.
arXiv Detail & Related papers (2020-03-30T00:33:17Z)