Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video
Super-Resolution via Cycle-Projected Mutual Learning
- URL: http://arxiv.org/abs/2205.05264v1
- Date: Wed, 11 May 2022 04:30:47 GMT
- Title: Spatial-Temporal Space Hand-in-Hand: Spatial-Temporal Video
Super-Resolution via Cycle-Projected Mutual Learning
- Authors: Mengshun Hu and Kui Jiang and Liang Liao and Jing Xiao and Junjun
Jiang and Zheng Wang
- Abstract summary: We propose a Cycle-projected Mutual learning network (CycMu-Net) for ST-VSR.
CycMu-Net makes full use of spatial-temporal correlations via the mutual learning between S-VSR and T-VSR.
Our method significantly outperforms state-of-the-art methods.
- Score: 48.68503274323906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatial-Temporal Video Super-Resolution (ST-VSR) aims to generate
super-resolved videos with higher resolution (HR) and higher frame rate (HFR).
Quite intuitively, pioneering two-stage methods complete ST-VSR by directly
combining two sub-tasks, Spatial Video Super-Resolution (S-VSR) and Temporal
Video Super-Resolution (T-VSR), but ignore the reciprocal relations between
them. Specifically, 1) T-VSR to S-VSR: temporal correlations provide
additional clues for accurate spatial detail representation; 2) S-VSR to
T-VSR: abundant spatial information contributes to the refinement of temporal
prediction. To this end, we propose a one-stage Cycle-projected Mutual
learning network (CycMu-Net) for ST-VSR, which makes full use of
spatial-temporal correlations via mutual learning between S-VSR and T-VSR.
Specifically, we exploit the mutual information between the two tasks via
iterative up-and-down projections, in which spatial and temporal features are
fully fused and distilled to help high-quality video reconstruction. Besides
extensive experiments on benchmark datasets, we also evaluate our proposed
CycMu-Net on the individual S-VSR and T-VSR tasks, demonstrating that our
method significantly outperforms state-of-the-art methods.
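To make the up-and-down projection idea concrete, here is a minimal PyTorch sketch of one cycle: two neighboring LR features are fused (the T-VSR direction), projected up to HR space (the S-VSR direction), projected back down, and refined with a back-projection residual. The module names, channel sizes, and the residual rule are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the cycle-projection idea from the abstract: the spatial
# (S-VSR) and temporal (T-VSR) directions exchange information through
# iterative up-and-down projections. All names and sizes are illustrative
# assumptions, not CycMu-Net's actual implementation.
import torch
import torch.nn as nn

class CycleProjectionBlock(nn.Module):
    def __init__(self, ch=64, scale=4):
        super().__init__()
        # up-projection: LR feature -> HR feature (S-VSR direction)
        self.up = nn.Sequential(nn.Conv2d(ch, ch * scale**2, 3, padding=1),
                                nn.PixelShuffle(scale), nn.ReLU(inplace=True))
        # down-projection: HR feature -> LR feature (back-projection)
        self.down = nn.Sequential(nn.Conv2d(ch, ch, scale, stride=scale),
                                  nn.ReLU(inplace=True))
        # temporal fusion of two neighboring LR features (T-VSR direction)
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, f0, f1):
        mid = self.fuse(torch.cat([f0, f1], dim=1))  # temporal interpolation
        hr = self.up(mid)                            # project to HR space
        back = self.down(hr)                         # project back to LR space
        mid = mid + (mid - back)                     # back-projection residual
        return mid, hr

# toy usage: two 64-channel LR feature maps -> refined LR + HR features
f0, f1 = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
mid, hr = CycleProjectionBlock()(f0, f1)
print(mid.shape, hr.shape)  # (1, 64, 32, 32) and (1, 64, 128, 128)
```

CycMu-Net stacks such exchanges iteratively; this single block only illustrates the direction of information flow between the two sub-tasks.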
Related papers
- Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors [80.92195378575671]
We describe a strong baseline for arbitrary-scale video super-resolution (AVSR).
We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network (sketched after this entry).
Comprehensive experiments show that ST-AVSR significantly improves super-resolution quality, generalization ability, and inference speed over the state-of-the-art.
arXiv Detail & Related papers (2024-07-13T15:27:39Z)
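The prior can be illustrated in a few lines of PyTorch: a frozen, pre-trained VGG-16 is tapped at several depths, and the resulting multi-scale activations serve as fixed guidance features. Which layers to tap and how ST-AVSR injects the prior into its baseline are assumptions here; only the general mechanism is shown.

```python
# Multi-scale structural/textural prior from a frozen pre-trained VGG-16.
# The tapped layer indices are a common choice (relu1_2 .. relu4_3), assumed
# here for illustration; ST-AVSR's exact usage may differ.
import torch
from torchvision.models import vgg16, VGG16_Weights

class VGGPrior(torch.nn.Module):
    def __init__(self, taps=(3, 8, 15, 22)):
        super().__init__()
        self.features = vgg16(weights=VGG16_Weights.DEFAULT).features.eval()
        self.taps = set(taps)
        for p in self.features.parameters():
            p.requires_grad_(False)      # the prior network stays frozen

    @torch.no_grad()
    def forward(self, x):
        priors = []
        for i, layer in enumerate(self.features):
            x = layer(x)
            if i in self.taps:
                priors.append(x)         # collect multi-scale feature maps
        return priors

frame = torch.rand(1, 3, 64, 64)
for p in VGGPrior()(frame):
    print(tuple(p.shape))                # four feature maps at falling scales
```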
- Enhancing Perceptual Quality in Video Super-Resolution through Temporally-Consistent Detail Synthesis using Diffusion Models [17.570136632211693]
We present StableVSR, a VSR method based on diffusion models (DMs) that enhances the perceptual quality of upscaled videos by synthesizing realistic and temporally-consistent details (see the toy sketch after this entry).
We demonstrate the effectiveness of StableVSR in enhancing the perceptual quality of upscaled videos while achieving better temporal consistency than existing state-of-the-art VSR methods.
arXiv Detail & Related papers (2023-11-27T15:14:38Z)
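As a rough illustration of temporally-conditioned detail synthesis, the toy denoiser below predicts the noise in the current HR sample while seeing the bicubic-upscaled LR frame and the previous super-resolved frame; a real DM would run many reverse steps with a proper noise schedule. Everything here (the tiny network, the single-step update) is a placeholder, not StableVSR's architecture or sampler.

```python
# Toy illustration of diffusion-based VSR with temporal conditioning: each
# denoising step is conditioned on the upscaled LR frame and the previous
# HR output. Placeholder network and update rule, not StableVSR itself.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDenoiser(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 3 + 3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, x_t, lr_up, prev_hr):
        # predict the noise in x_t given the LR frame (spatial guidance)
        # and the previous HR output (temporal guidance)
        return self.net(torch.cat([x_t, lr_up, prev_hr], dim=1))

denoiser = TinyDenoiser()
lr = torch.rand(1, 3, 32, 32)
lr_up = F.interpolate(lr, scale_factor=4, mode='bicubic', align_corners=False)
prev_hr = torch.rand(1, 3, 128, 128)   # previous super-resolved frame
x_t = torch.randn(1, 3, 128, 128)      # current noisy HR sample
eps = denoiser(x_t, lr_up, prev_hr)
x_prev = x_t - 0.1 * eps               # stand-in for one reverse step
```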
- Scale-Adaptive Feature Aggregation for Efficient Space-Time Video Super-Resolution [14.135298731079164]
We propose a novel Scale-Adaptive Feature Aggregation (SAFA) network that adaptively selects sub-networks with different processing scales for individual samples (sketched after this entry).
Our SAFA network outperforms recent state-of-the-art methods such as TMNet and VideoINR by over 0.5 dB PSNR on average, while requiring fewer than half the parameters and only one-third of the computational cost.
arXiv Detail & Related papers (2023-10-26T10:18:51Z)
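A minimal sketch of per-sample scale selection, assuming a hard argmax gate over a few single-conv sub-networks; SAFA's actual selection mechanism is more elaborate and trained end-to-end.

```python
# Per-sample scale selection: a light gate scores several processing scales
# and each sample is routed through the winning sub-network. Gate design,
# sub-networks, and hard routing are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAdaptiveBlock(nn.Module):
    def __init__(self, ch=32, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(ch, len(scales)))
        # one sub-network per processing scale
        self.subnets = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=1) for _ in scales)

    def forward(self, x):
        idx = self.gate(x).argmax(dim=1)           # chosen scale per sample
        out = torch.empty_like(x)
        for i, (s, net) in enumerate(zip(self.scales, self.subnets)):
            sel = idx == i
            if sel.any():
                y = F.avg_pool2d(x[sel], s) if s > 1 else x[sel]
                y = net(y)                          # process at reduced scale
                if s > 1:
                    y = F.interpolate(y, scale_factor=s, mode='bilinear',
                                      align_corners=False)
                out[sel] = y
        return out

x = torch.randn(4, 32, 64, 64)
print(ScaleAdaptiveBlock()(x).shape)   # torch.Size([4, 32, 64, 64])
```

Samples routed to coarser scales cost fewer FLOPs, which is where the efficiency claim comes from; a trained gate would use a differentiable relaxation rather than a hard argmax.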
- You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution [14.624610700550754]
We propose an efficient recurrent network with bidirectional interaction for ST-VSR.
It first performs backward inference from future to past, and then follows forward inference to super-resolve intermediate frames (see the sketch after this entry).
Our method outperforms state-of-the-art methods in efficiency, reducing computational cost by about 22%.
arXiv Detail & Related papers (2022-07-13T17:01:16Z)
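The backward-then-forward recurrence can be sketched as two passes over the sequence: a backward pass propagates future information to the past, then a forward pass reuses those states for reconstruction. The cell design and state fusion are illustrative assumptions, and for brevity this sketch only super-resolves the input frames, omitting the intermediate-frame synthesis.

```python
# Backward-then-forward recurrent inference: hidden states from the backward
# pass (future -> past) feed the forward pass. Cells are toy single convs.
import torch
import torch.nn as nn

class BidirectionalSR(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.backward_cell = nn.Conv2d(3 + ch, ch, 3, padding=1)
        self.forward_cell = nn.Conv2d(3 + 2 * ch, ch, 3, padding=1)
        self.to_rgb = nn.Sequential(nn.Conv2d(ch, 3 * 16, 3, padding=1),
                                    nn.PixelShuffle(4))   # x4 upsampling

    def forward(self, frames):                 # frames: (B, T, 3, H, W)
        b, t, _, h, w = frames.shape
        ch = self.backward_cell.out_channels
        # 1) backward inference: future -> past
        state, back = frames.new_zeros(b, ch, h, w), []
        for i in reversed(range(t)):
            state = torch.relu(self.backward_cell(
                torch.cat([frames[:, i], state], dim=1)))
            back.insert(0, state)
        # 2) forward inference, reusing the backward states
        state, outs = frames.new_zeros(b, ch, h, w), []
        for i in range(t):
            state = torch.relu(self.forward_cell(
                torch.cat([frames[:, i], back[i], state], dim=1)))
            outs.append(self.to_rgb(state))
        return torch.stack(outs, dim=1)        # (B, T, 3, 4H, 4W)

video = torch.rand(1, 5, 3, 32, 32)
print(BidirectionalSR()(video).shape)  # torch.Size([1, 5, 3, 128, 128])
```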
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that a Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate (sketched after this entry).
We show that VideoINR achieves competitive performance with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
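The implicit-representation idea reduces to an MLP that maps a continuous space-time coordinate (x, y, t), together with a locally sampled video feature, to an RGB value, so any resolution and frame rate can be queried. The feature sampling and network sizes below are simplified assumptions.

```python
# Coordinate-based decoding: an MLP maps (feature, x, y, t) -> RGB, so the
# same representation can be queried on any space-time grid.
import torch
import torch.nn as nn

class CoordinateDecoder(nn.Module):
    def __init__(self, feat_dim=16, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))

    def forward(self, feats, coords):
        # feats: (N, feat_dim) local video features; coords: (N, 3) in [0, 1]
        return self.mlp(torch.cat([feats, coords], dim=1))

decoder = CoordinateDecoder()
# query an arbitrary grid at an intermediate time step t = 0.5
H, W = 64, 64
ys, xs = torch.meshgrid(torch.linspace(0, 1, H), torch.linspace(0, 1, W),
                        indexing='ij')
coords = torch.stack([xs.flatten(), ys.flatten(),
                      torch.full((H * W,), 0.5)], dim=1)
feats = torch.randn(H * W, 16)   # stand-in for sampled encoder features
rgb = decoder(feats, coords).view(H, W, 3)
```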
- STDAN: Deformable Attention Network for Space-Time Video Super-Resolution [39.18399652834573]
We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature interpolation (LSTFI) module, which is capable of extracting abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated (see the sketch after this entry).
arXiv Detail & Related papers (2022-03-14T03:40:35Z)
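The aggregation step can be approximated with torchvision's generic deformable convolution: offsets predicted from the reference and neighbor features steer where the neighbor is sampled before aggregation. STDAN's actual LSTFI/STDFA modules are more involved; this only shows the alignment primitive.

```python
# Deformable alignment primitive: predicted offsets let a deformable conv
# sample the neighbor frame's features adaptively, aligning them to the
# reference before aggregation.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class DeformableAlign(nn.Module):
    def __init__(self, ch=32, k=3):
        super().__init__()
        self.k = k
        # one (dx, dy) pair per kernel location, predicted from both features
        self.offset_pred = nn.Conv2d(2 * ch, 2 * k * k, 3, padding=1)
        self.weight = nn.Parameter(torch.randn(ch, ch, k, k) * 0.01)

    def forward(self, ref, neighbor):
        offset = self.offset_pred(torch.cat([ref, neighbor], dim=1))
        return deform_conv2d(neighbor, offset, self.weight,
                             padding=self.k // 2)

ref, nbr = torch.randn(1, 32, 32, 32), torch.randn(1, 32, 32, 32)
aligned = DeformableAlign()(ref, nbr)  # neighbor features aligned to ref
print(aligned.shape)                   # torch.Size([1, 32, 32, 32])
```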
- Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate of a given video.
Existing methods typically have difficulty efficiently leveraging information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network, instead of ConvLSTM, to leverage knowledge between adjacent frames (the flow-reuse intuition from the title is sketched after this entry).
arXiv Detail & Related papers (2021-10-13T15:21:30Z)
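The flow-reuse intuition: estimate optical flow once between two key frames, then approximate the flow to any intermediate time by scaling it, rather than re-estimating flow per synthesized frame. The linear-motion assumption and the plain backward warp below are illustrative, not the paper's coarse-to-fine recurrent design.

```python
# Flow reuse under a linear-motion assumption: one estimated flow field is
# rescaled to intermediate time steps and used for backward warping.
import torch
import torch.nn.functional as F

def backward_warp(img, flow):
    # img: (B, C, H, W); flow: (B, 2, H, W) in pixels, channels (dx, dy)
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing='ij')
    grid = torch.stack([xs, ys], dim=0).float().unsqueeze(0) + flow
    # normalize sampling locations to [-1, 1] for grid_sample
    grid_x = 2 * grid[:, 0] / (w - 1) - 1
    grid_y = 2 * grid[:, 1] / (h - 1) - 1
    return F.grid_sample(img, torch.stack([grid_x, grid_y], dim=3),
                         align_corners=True)

frame0 = torch.rand(1, 3, 64, 64)
flow_0to1 = torch.randn(1, 2, 64, 64)                 # flow estimated once
warped_mid = backward_warp(frame0, 0.5 * flow_0to1)   # reused at t = 0.5
```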
- BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond [75.62146968824682]
Video super-resolution (VSR) approaches tend to have more components than their image counterparts.
We show a succinct pipeline, BasicVSR, that achieves appealing improvements in speed and restoration quality.
arXiv Detail & Related papers (2020-12-03T18:56:14Z)
- Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal interpolation and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from a low-frame-rate (LFR), low-resolution (LR) video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)