Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution
- URL: http://arxiv.org/abs/2002.11616v1
- Date: Wed, 26 Feb 2020 16:59:48 GMT
- Title: Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video
Super-Resolution
- Authors: Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan P. Allebach,
Chenliang Xu
- Abstract summary: A simple solution is to split it into two sub-tasks: video
frame interpolation (VFI) and video super-resolution (VSR). However, temporal
interpolation and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
- Score: 95.26202278535543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we explore the space-time video super-resolution task, which
aims to generate a high-resolution (HR) slow-motion video from a low frame rate
(LFR), low-resolution (LR) video. A simple solution is to split it into two
sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal interpolation and spatial super-resolution are intra-related
in this task. Two-stage methods cannot fully exploit this natural
property. In addition, state-of-the-art VFI or VSR networks require a large
frame-synthesis or reconstruction module for predicting high-quality video
frames, which gives two-stage methods large model sizes and makes them
time-consuming. To overcome these problems, we propose a one-stage space-time
video super-resolution framework, which directly synthesizes an HR slow-motion
video from an LFR, LR video. Rather than synthesizing missing LR video frames
as VFI networks do, we first temporally interpolate the features of missing LR
video frames, capturing local temporal contexts, with the proposed feature
temporal interpolation network. Then, we propose a deformable ConvLSTM
to align and aggregate temporal information simultaneously, better leveraging
global temporal contexts. Finally, a deep reconstruction network is
adopted to predict HR slow-motion video frames. Extensive experiments on
benchmark datasets demonstrate that the proposed method not only achieves
better quantitative and qualitative performance but also is more than three
times faster than recent two-stage state-of-the-art methods, e.g., DAIN+EDVR
and DAIN+RBPN.
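To make the described pipeline concrete, below is a minimal PyTorch sketch of the one-stage idea: encode LR frames, temporally interpolate features for the missing in-between frames, aggregate them recurrently, and reconstruct HR output. It is an illustrative approximation based only on the abstract; the layer sizes, the fusion scheme, and the plain ConvLSTM cell (standing in for the paper's deformable ConvLSTM with feature alignment) are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureTemporalInterpolation(nn.Module):
    """Synthesizes the feature map of a missing intermediate frame from its two
    neighbors (a stand-in for the paper's feature temporal interpolation network)."""

    def __init__(self, channels=64):
        super().__init__()
        self.blend = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feat_prev, feat_next):
        # Fuse the two neighboring LR frame features (local temporal context).
        return F.relu(self.blend(torch.cat([feat_prev, feat_next], dim=1)))


class ConvLSTMCell(nn.Module):
    """Plain ConvLSTM cell; the paper's deformable ConvLSTM additionally aligns
    features before aggregation, which is omitted here for brevity."""

    def __init__(self, channels=64):
        super().__init__()
        self.gates = nn.Conv2d(2 * channels, 4 * channels, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)


class OneStageSTVSR(nn.Module):
    """Maps an LFR, LR clip directly to an HR slow-motion clip in one network."""

    def __init__(self, channels=64, scale=4):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)
        self.interp = FeatureTemporalInterpolation(channels)
        self.lstm = ConvLSTMCell(channels)
        # Cheap stand-in for the paper's deep reconstruction network.
        self.reconstruct = nn.Sequential(
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, lr_frames):  # lr_frames: (B, T, 3, H, W)
        feats = [self.encode(f) for f in lr_frames.unbind(dim=1)]
        # Interpolate features for the missing in-between frames.
        dense = []
        for prev, nxt in zip(feats[:-1], feats[1:]):
            dense += [prev, self.interp(prev, nxt)]
        dense.append(feats[-1])
        # Recurrently aggregate temporal information across all feature maps.
        b, c, h, w = dense[0].shape
        state = (dense[0].new_zeros(b, c, h, w), dense[0].new_zeros(b, c, h, w))
        outputs = []
        for feat in dense:
            agg, state = self.lstm(feat, state)
            outputs.append(self.reconstruct(agg))
        return torch.stack(outputs, dim=1)  # (B, 2T-1, 3, scale*H, scale*W)
```

Under these assumptions, a (B, T, 3, H, W) LFR, LR clip passed through OneStageSTVSR yields a (B, 2T-1, 3, 4H, 4W) HR slow-motion clip, i.e., interpolation and upscaling happen jointly in a single forward pass rather than in two cascaded networks.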
Related papers
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range
Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert the given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z)
- Towards Interpretable Video Super-Resolution via Alternating Optimization [115.85296325037565]
We study a practical space-time video super-resolution (STVSR) problem which aims at generating a high-framerate high-resolution sharp video from a low-framerate blurry video.
We propose an interpretable STVSR framework by leveraging both model-based and learning-based methods.
arXiv Detail & Related papers (2022-07-21T21:34:05Z)
- RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution [13.089535703790425]
Space-time video super-resolution (STVSR) is the task of interpolating videos with both Low Frame Rate (LFR) and Low Resolution (LR) to produce High-Frame-Rate (HFR) and also High-Resolution (HR) counterparts.
We propose using a spatial-temporal transformer that naturally incorporates the spatial and temporal super resolution modules into a single model.
arXiv Detail & Related papers (2022-03-27T02:16:26Z)
- STDAN: Deformable Attention Network for Space-Time Video Super-Resolution [39.18399652834573]
We propose a deformable attention network called STDAN for STVSR.
First, we devise a long-short term feature interpolation (LSTFI) module, which is capable of extracting abundant content from more neighboring input frames.
Second, we put forward a spatial-temporal deformable feature aggregation (STDFA) module, in which spatial and temporal contexts are adaptively captured and aggregated.
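For a rough sense of how deformable feature aggregation of this kind can be implemented (this is not STDAN's actual STDFA module, nor the deformable ConvLSTM of the main paper; the module name, offset predictor, and sizes are illustrative assumptions), one can predict sampling offsets from a reference/neighbor feature pair and let torchvision's DeformConv2d sample the neighbor adaptively:

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class DeformableAggregation(nn.Module):
    """Hypothetical module: align a neighboring frame's features to a reference
    frame with deformable convolution, then fuse them residually."""

    def __init__(self, channels=64, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        # Per-location sampling offsets, predicted from both frames jointly.
        self.offset_pred = nn.Conv2d(2 * channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        self.deform = DeformConv2d(channels, channels, kernel_size, padding=pad)

    def forward(self, ref_feat, nbr_feat):
        offsets = self.offset_pred(torch.cat([ref_feat, nbr_feat], dim=1))
        aligned = self.deform(nbr_feat, offsets)  # adaptively sample the neighbor
        return ref_feat + aligned                 # simple residual fusion
```

Multi-scale (coarse-to-fine) offset prediction and a modulation mask are common refinements of this basic pattern in published video restoration models.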
arXiv Detail & Related papers (2022-03-14T03:40:35Z)
- Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate for a given video.
Existing methods typically struggle to efficiently leverage information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
arXiv Detail & Related papers (2021-10-13T15:21:30Z)
- Zooming SlowMo: An Efficient One-Stage Framework for Space-Time Video Super-Resolution [100.11355888909102]
Space-time video super-resolution aims at generating a high-resolution (HR) slow-motion video from a low-resolution (LR) and low frame rate (LFR) video sequence.
We present a one-stage space-time video super-resolution framework, which can directly reconstruct an HR slow-motion video sequence from an input LR and LFR video.
arXiv Detail & Related papers (2021-04-15T17:59:23Z)
- Efficient Space-time Video Super Resolution using Low-Resolution Flow and Mask Upsampling [12.856102293479486]
This paper aims to generate High-resolution Slow-motion videos from Low Resolution and Low Frame rate videos.
A simplistic solution is to run Video Super Resolution and Video Frame Interpolation models sequentially.
Our model is lightweight and performs better than current state-of-the-art models on the REDS STSR validation set.
arXiv Detail & Related papers (2021-04-12T19:11:57Z)
- Video Face Super-Resolution with Motion-Adaptive Feedback Cell [90.73821618795512]
Video super-resolution (VSR) methods have recently achieved remarkable success due to the development of deep convolutional neural networks (CNNs).
In this paper, we propose a Motion-Adaptive Feedback Cell (MAFC), a simple but effective block, which can efficiently capture the motion compensation and feed it back to the network in an adaptive way.
arXiv Detail & Related papers (2020-02-15T13:14:10Z)