PP-MSVSR: Multi-Stage Video Super-Resolution
- URL: http://arxiv.org/abs/2112.02828v1
- Date: Mon, 6 Dec 2021 07:28:52 GMT
- Title: PP-MSVSR: Multi-Stage Video Super-Resolution
- Authors: Lielin Jiang and Na Wang and Qingqing Dang and Rui Liu and Baohua Lai
- Abstract summary: The key to the Video Super-Resolution (VSR) task is to make full use of complementary information across frames to reconstruct the high-resolution sequence.
We propose a multi-stage VSR deep architecture, dubbed PP-MSVSR, with a local fusion module, an auxiliary loss, and a re-align module.
Experiments substantiate that PP-MSVSR achieves a PSNR of 28.13 dB with only 1.45M parameters.
- Score: 4.039183755023383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Different from the Single Image Super-Resolution (SISR) task, the key
to the Video Super-Resolution (VSR) task is to make full use of complementary
information across frames to reconstruct the high-resolution sequence. Since
frames differ in motion and scene content, accurately aligning multiple frames
and effectively fusing them has always been central to VSR research. To utilize
the rich complementary information of neighboring frames, we propose in this
paper a multi-stage VSR deep architecture, dubbed PP-MSVSR, with a local fusion
module, an auxiliary loss, and a re-align module that refine the enhanced result
progressively. Specifically, to strengthen the fusion of features across frames
during feature propagation, a local fusion module is designed in stage-1 to
perform local feature fusion before propagation. Moreover, we introduce an
auxiliary loss in stage-2 to make the features produced by the propagation
module retain more information correlated with the HR space, and a re-align
module in stage-3 to make full use of the feature information from the previous
stage. Extensive experiments substantiate that PP-MSVSR achieves promising
performance on the Vid4 dataset, reaching a PSNR of 28.13 dB with only 1.45M
parameters, and that PP-MSVSR-L exceeds all state-of-the-art methods on the
REDS4 dataset, albeit with a considerable number of parameters. Code and models
will be released in
PaddleGAN\footnote{https://github.com/PaddlePaddle/PaddleGAN.}.
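The three-stage data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (the real model is a deep convolutional network in PaddleGAN); it uses simple NumPy array operations, with running means standing in for the learned fusion, propagation, and re-align modules, purely to show how the stages chain together.

```python
import numpy as np

def local_fusion(frames):
    """Stage 1: fuse each frame with its immediate neighbors before propagation."""
    fused = []
    for i in range(len(frames)):
        lo, hi = max(0, i - 1), min(len(frames), i + 2)
        fused.append(np.mean(frames[lo:hi], axis=0))  # local window average
    return np.stack(fused)

def propagate(feats):
    """Stage 2: toy bidirectional propagation via running means; in the real
    model an auxiliary loss supervises these intermediate features against
    the HR target during training."""
    counts = np.arange(1, len(feats) + 1)[:, None, None]
    forward = np.cumsum(feats, axis=0) / counts            # past -> future
    backward = (np.cumsum(feats[::-1], axis=0) / counts)[::-1]  # future -> past
    return (forward + backward) / 2

def re_align(propagated, fused):
    """Stage 3: reuse stage-1 features to refine the propagated result."""
    return 0.5 * propagated + 0.5 * fused

frames = np.stack([np.random.rand(4, 4) for _ in range(5)])  # 5 toy LR "frames"
fused = local_fusion(frames)
out = re_align(propagate(fused), fused)
print(out.shape)  # one refined feature map per frame
```

In the actual architecture each stage's output would additionally pass through upsampling layers to produce the HR frames; the sketch only tracks the per-frame feature maps.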
Related papers
- Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors [80.92195378575671]
We describe a strong baseline for arbitrary-scale video super-resolution (AVSR).
We then introduce ST-AVSR by equipping our baseline with a multi-scale structural and textural prior computed from the pre-trained VGG network.
Comprehensive experiments show that ST-AVSR significantly improves super-resolution quality, generalization ability, and inference speed over the state-of-the-art.
arXiv Detail & Related papers (2024-07-13T15:27:39Z) - Joint Reference Frame Synthesis and Post Filter Enhancement for Versatile Video Coding [53.703894799335735]
This paper presents joint reference frame synthesis (RFS) and post-processing filter enhancement (PFE) for Versatile Video Coding (VVC).
Both RFS and PFE utilize the Space-Time Enhancement Network (STENet), which receives two input frames with artifacts and produces two enhanced frames with suppressed artifacts, along with an intermediate synthesized frame.
To reduce inference complexity, we propose joint inference of RFS and PFE (JISE), achieved through a single execution of STENet.
arXiv Detail & Related papers (2024-04-28T03:11:44Z) - You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution [14.624610700550754]
We propose an efficient recurrent network with bidirectional interaction for ST-VSR.
It first performs backward inference from future to past, and then follows forward inference to super-resolve intermediate frames.
Our method outperforms state-of-the-art methods in efficiency, reducing computational cost by about 22%.
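The backward-then-forward recurrence this entry describes can be sketched as follows. This is a hypothetical toy version, not the paper's network: a hidden state is first propagated from the last frame to the first, then a forward pass fuses each frame with its stored backward feature.

```python
import numpy as np

def bidirectional_pass(frames):
    """Toy backward-then-forward recurrent propagation over a frame sequence."""
    n = len(frames)
    backward = [None] * n
    hidden = np.zeros_like(frames[0])
    for i in range(n - 1, -1, -1):           # backward inference: future -> past
        hidden = 0.5 * hidden + 0.5 * frames[i]
        backward[i] = hidden
    outputs = []
    hidden = np.zeros_like(frames[0])
    for i in range(n):                        # forward inference: past -> future
        hidden = 0.5 * hidden + 0.5 * frames[i]
        outputs.append((hidden + backward[i]) / 2)  # fuse both directions
    return outputs

frames = [np.full((2, 2), float(i)) for i in range(4)]
outs = bidirectional_pass(frames)
print(len(outs), outs[0].shape)
```

Aligning once per direction and reusing the propagated features for every output frame is what lets such schemes cut computation relative to per-frame alignment.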
arXiv Detail & Related papers (2022-07-13T17:01:16Z) - STDAN: Deformable Attention Network for Space-Time Video Super-Resolution [39.18399652834573]
We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature interpolation (LSTFI) module, which is capable of extracting abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated.
arXiv Detail & Related papers (2022-03-14T03:40:35Z) - Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video Super-Resolution [100.11355888909102]
Space-time video super-resolution aims at generating a high-resolution (HR) slow-motion video from a low-resolution (LR) and low frame rate (LFR) video sequence.
We present a one-stage space-time video super-resolution framework, which can directly reconstruct an HR slow-motion video sequence from an input LR and LFR video.
arXiv Detail & Related papers (2021-04-15T17:59:23Z) - Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance these competing goals.
Our main proposal is a multi-stage architecture that progressively learns restoration functions for the degraded inputs.
The resulting tightly interlinked multi-stage architecture, named MPRNet, delivers strong performance gains on ten datasets.
arXiv Detail & Related papers (2021-02-04T18:57:07Z) - Deep Burst Super-Resolution [165.90445859851448]
We propose a novel architecture for the burst super-resolution task.
Our network takes multiple noisy RAW images as input, and generates a denoised, super-resolved RGB image as output.
In order to enable training and evaluation on real-world data, we additionally introduce the BurstSR dataset.
arXiv Detail & Related papers (2021-01-26T18:57:21Z) - Exploit Camera Raw Data for Video Super-Resolution via Hidden Markov
Model Inference [17.82232046395501]
We propose a new deep-learning Video Super-Resolution (VSR) method that can directly exploit camera sensor data.
The proposed method achieves superior VSR results compared to the state-of-the-art and can be adapted to any specific camera-ISP.
arXiv Detail & Related papers (2020-08-24T21:14:13Z) - Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal synthesis and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z) - Video Saliency Prediction Using Enhanced Spatiotemporal Alignment
Network [35.932447204088845]
We develop an effective feature alignment network tailored to video saliency prediction.
The network learns to align the features of the neighboring frames to the reference one in a coarse-to-fine manner.
The proposed model is trained end-to-end without any post-processing.
arXiv Detail & Related papers (2020-01-02T02:05:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.