Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
- URL: http://arxiv.org/abs/2407.09919v1
- Date: Sat, 13 Jul 2024 15:27:39 GMT
- Title: Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors
- Authors: Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma
- Abstract summary: We describe a strong baseline for arbitrary-scale video super-resolution (AVSR).
We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network.
Comprehensive experiments show that ST-AVSR significantly improves super-resolution quality, generalization ability, and inference speed over the state-of-the-art.
- Score: 80.92195378575671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arbitrary-scale video super-resolution (AVSR) aims to enhance the resolution of video frames, potentially at various scaling factors, which presents several challenges regarding spatial detail reproduction, temporal consistency, and computational complexity. In this paper, we first describe a strong baseline for AVSR by putting together three variants of elementary building blocks: 1) a flow-guided recurrent unit that aggregates spatiotemporal information from previous frames, 2) a flow-refined cross-attention unit that selects spatiotemporal information from future frames, and 3) a hyper-upsampling unit that generates scale-aware and content-independent upsampling kernels. We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network. This prior has proven effective in discriminating structure and texture across different locations and scales, which is beneficial for AVSR. Comprehensive experiments show that ST-AVSR significantly improves super-resolution quality, generalization ability, and inference speed over the state-of-the-art. The code is available at https://github.com/shangwei5/ST-AVSR.
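The idea of "scale-aware and content-independent" upsampling kernels can be illustrated with a much simpler stand-in than the paper's hyper-upsampling unit: plain bilinear resampling at an arbitrary (possibly non-integer) scale, where the interpolation weights depend only on the scale factor and target coordinates, never on the pixel values. This is a minimal sketch of the general concept, not the authors' implementation; the function name is illustrative.

```python
import numpy as np

def arbitrary_scale_upsample(frame: np.ndarray, scale: float) -> np.ndarray:
    """Bilinear resampling of a 2D array at any scale factor.

    The blend weights below are scale-aware and content-independent:
    they are derived purely from `scale` and the output pixel grid,
    never from the values stored in `frame`.
    """
    h, w = frame.shape
    out_h, out_w = int(round(h * scale)), int(round(w * scale))
    # Map each output pixel centre back to a (fractional) source coordinate.
    ys = (np.arange(out_h) + 0.5) / scale - 0.5
    xs = (np.arange(out_w) + 0.5) / scale - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    wy = np.clip(ys - y0, 0.0, 1.0)[:, None]  # vertical blend weights
    wx = np.clip(xs - x0, 0.0, 1.0)[None, :]  # horizontal blend weights
    top = frame[y0][:, x0] * (1 - wx) + frame[y0][:, x1] * wx
    bot = frame[y1][:, x0] * (1 - wx) + frame[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy
```

A learned hyper-upsampling unit replaces the fixed bilinear weights with kernels predicted from the scale factor, but the key property sketched here is the same: one model serves every scale because the kernels are a function of the scale, not of the frame content.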
Related papers
- You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution [14.624610700550754]
We propose an efficient recurrent network with bidirectional interaction for ST-VSR.
It first performs backward inference from future to past, and then follows forward inference to super-resolve intermediate frames.
Our method outperforms state-of-the-art methods in efficiency, and reduces calculation cost by about 22%.
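The backward-then-forward recurrent scheme described above can be sketched with a toy two-pass recurrence. The "features" here are plain numbers standing in for real feature maps, and the 0.5 decay is an arbitrary illustrative choice, not anything from the paper.

```python
def bidirectional_propagate(frames):
    """Two-pass recurrent propagation: a backward pass first gathers
    information from future frames, then a forward pass fuses the
    running past state with that future summary for each output."""
    n = len(frames)
    # Backward pass: h_back[t] summarises frames t..n-1.
    h_back = [0.0] * n
    acc = 0.0
    for t in range(n - 1, -1, -1):
        acc = 0.5 * acc + frames[t]   # toy recurrent update
        h_back[t] = acc
    # Forward pass: combine past state with the future summary.
    outputs = []
    acc = 0.0
    for t in range(n):
        acc = 0.5 * acc + frames[t]
        outputs.append(acc + h_back[t])
    return outputs
```

The point of the ordering is that by the time the forward pass super-resolves frame t, every frame (past and future) has already contributed to its state, with only two sweeps over the sequence.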
arXiv Detail & Related papers (2022-07-13T17:01:16Z) - A New Dataset and Transformer for Stereoscopic Video Super-Resolution [4.332879001008757]
Stereo video super-resolution (SVSR) aims to enhance the resolution of low-resolution stereo video by reconstructing high-resolution frames.
Key challenges in SVSR are preserving the stereo-consistency and temporal-consistency, without which viewers may experience 3D fatigue.
In this paper, we propose a novel Transformer-based model for SVSR, namely Trans-SVSR.
arXiv Detail & Related papers (2022-04-21T11:49:29Z) - Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Transformer for Video Super-Resolution (TTVSR)
arXiv Detail & Related papers (2022-04-08T03:37:39Z) - STDAN: Deformable Attention Network for Space-Time Video Super-Resolution [39.18399652834573]
We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature (LSTFI) module, which is capable of exploiting abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated.
arXiv Detail & Related papers (2022-03-14T03:40:35Z) - Fast Online Video Super-Resolution with Deformable Attention Pyramid [172.16491820970646]
Video super-resolution (VSR) has many applications that pose strict causal, real-time, and latency constraints, including video streaming and TV.
We propose a recurrent VSR architecture based on a deformable attention pyramid (DAP)
arXiv Detail & Related papers (2022-02-03T17:49:04Z) - BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment [90.81396836308085]
We show that by empowering recurrent framework with enhanced propagation and alignment, one can exploit video information more effectively.
Our model BasicVSR++ surpasses BasicVSR by 0.82 dB in PSNR with a similar number of parameters.
BasicVSR++ generalizes well to other video restoration tasks such as compressed video enhancement.
arXiv Detail & Related papers (2021-04-27T17:58:31Z) - BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond [75.62146968824682]
Video super-resolution (VSR) approaches tend to have more components than their image counterparts.
We show a succinct pipeline, BasicVSR, that achieves appealing improvements in terms of speed and restoration quality.
arXiv Detail & Related papers (2020-12-03T18:56:14Z) - Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR). However, temporal interpolation and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.