STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution
Video Prediction
- URL: http://arxiv.org/abs/2203.16084v1
- Date: Wed, 30 Mar 2022 06:24:00 GMT
- Title: STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution
Video Prediction
- Authors: Zheng Chang, Xinfeng Zhang, Shanshe Wang, Siwei Ma, and Wen Gao
- Abstract summary: We propose a Spatiotemporal Residual Predictive Model (STRPM) for high-resolution video prediction.
Experimental results show that STRPM can generate more satisfactory results compared with various existing methods.
- Score: 78.129039340528
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although many video prediction methods have obtained good performance on
low-resolution (64$\sim$128) videos, predictive models for high-resolution
(512$\sim$4K) videos have not been fully explored yet, even though they are more
meaningful given the increasing demand for high-quality videos. Compared with
low-resolution videos, high-resolution videos contain richer appearance
(spatial) information and more complex motion (temporal) information. In this
paper, we propose a Spatiotemporal Residual Predictive Model (STRPM) for
high-resolution video prediction. On the one hand, we propose a Spatiotemporal
Encoding-Decoding Scheme to preserve more spatiotemporal information for
high-resolution videos. In this way, the appearance details for each frame can
be greatly preserved. On the other hand, we design a Residual Predictive Memory
(RPM) which focuses on modeling the spatiotemporal residual features (STRF)
between previous and future frames instead of the whole frame, which can
greatly help capture the complex motion information in high-resolution videos.
In addition, the proposed RPM can supervise the spatial encoder and temporal
encoder to extract different features in the spatial domain and the temporal
domain, respectively. Moreover, the proposed model is trained using generative
adversarial networks (GANs) with a learned perceptual loss (LP-loss) to improve
the perceptual quality of the predictions. Experimental results show that STRPM
can generate more satisfactory results compared with various existing methods.
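To make the residual-modeling idea concrete, here is a minimal PyTorch sketch of a recurrent cell that predicts the residual between previous and future feature maps instead of regressing the whole frame, in the spirit of the RPM described above. The cell design, gating, and all names are illustrative assumptions, not the authors' implementation; the GAN objective and LP-loss mentioned in the abstract are out of scope here.

```python
import torch
import torch.nn as nn

class ResidualPredictiveCell(nn.Module):
    """Hypothetical cell: predict a residual and add it to the previous features."""
    def __init__(self, channels: int):
        super().__init__()
        # Both branches see [previous features, hidden state] along channels.
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)
        self.residual = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, prev_feat, hidden):
        x = torch.cat([prev_feat, hidden], dim=1)
        g = torch.sigmoid(self.gate(x))   # how strongly to apply the residual
        r = torch.tanh(self.residual(x))  # predicted residual features
        next_feat = prev_feat + g * r     # future = previous + predicted residual
        return next_feat, next_feat       # new features also serve as the state

# Usage: roll the cell forward over encoded frames, then decode the last output.
cell = ResidualPredictiveCell(channels=64)
feat = torch.randn(2, 64, 32, 32)        # encoded frame features (B, C, H, W)
hidden = torch.zeros_like(feat)
for _ in range(4):                       # predict four steps into the future
    feat, hidden = cell(feat, hidden)
```

Because only the residual is regressed, static appearance content passes through the additive skip untouched, which matches the paper's motivation of separating appearance from motion.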
Related papers
- Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution [151.1255837803585]
We propose a novel approach, pursuing Spatial Adaptation and Temporal Coherence (SATeCo) for video super-resolution.
SATeCo pivots on learning spatial-temporal guidance from low-resolution videos to calibrate both latent-space high-resolution video denoising and pixel-space video reconstruction.
Experiments conducted on the REDS4 and Vid4 datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-03-25T17:59:26Z)
- A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution [15.690562510147766]
Video Super-Resolution (VSR) using recurrent neural network architecture is a promising solution due to its efficient modeling of long-range temporal dependencies.
We propose a Codec Information Assisted Framework (CIAF) to boost and accelerate recurrent VSR models for compressed videos.
arXiv Detail & Related papers (2022-10-15T08:48:29Z)
- STIP: A SpatioTemporal Information-Preserving and Perception-Augmented Model for High-Resolution Video Prediction [78.129039340528]
We propose a SpatioTemporal Information-Preserving and Perception-Augmented Model (STIP) to solve the above two problems.
The proposed model aims to preserve the spatiotemporal information of videos during feature extraction and state transitions.
Experimental results show that the proposed STIP can predict videos with more satisfactory visual quality compared with a variety of state-of-the-art methods.
arXiv Detail & Related papers (2022-06-09T09:49:04Z)
- Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling [105.69197687940505]
We propose to explore the role of explicit temporal difference modeling in both LR and HR space.
To further enhance the super-resolution result, not only are spatial residual features extracted, but the difference between consecutive frames in the high-frequency domain is also computed (see the sketch below).
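A minimal sketch of the temporal-difference idea: each frame is split into low- and high-frequency components with a simple blur, and the frame-to-frame difference is taken in the high-frequency band. The box-blur split is an assumption for illustration, not the paper's exact filtering.

```python
import torch
import torch.nn.functional as F

def high_frequency(frame):
    """High-frequency residual: frame minus a 3x3 box-blurred copy of itself."""
    c = frame.shape[1]
    kernel = torch.full((c, 1, 3, 3), 1.0 / 9.0)        # per-channel box blur
    low = F.conv2d(frame, kernel, padding=1, groups=c)  # cheap low-pass filter
    return frame - low

prev_frame = torch.rand(1, 3, 64, 64)  # two consecutive frames, (B, C, H, W)
next_frame = torch.rand(1, 3, 64, 64)
# Temporal difference taken in the high-frequency domain, as described above.
hf_diff = high_frequency(next_frame) - high_frequency(prev_frame)
```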
arXiv Detail & Related papers (2022-04-14T17:07:33Z)
- Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z)
- A Novel Dual Dense Connection Network for Video Super-resolution [0.0]
Video super-resolution (VSR) refers to the reconstruction of high-resolution (HR) video from the corresponding low-resolution (LR) video.
We propose a novel dual dense connection network that can generate high-quality super-resolution (SR) results.
arXiv Detail & Related papers (2022-03-05T12:21:29Z)
- Video Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling [15.630742638440998]
We present two joint optimization approaches based on invertible neural networks with coupling layers (a minimal coupling-layer sketch follows this entry's summary).
Our Long Short-Term Memory Video Rescaling Network (LSTM-VRN) leverages temporal information in the low-resolution video to form an explicit prediction of the missing high-frequency information for upscaling.
Our Multi-input Multi-output Video Rescaling Network (MIMO-VRN) proposes a new strategy for downscaling and upscaling a group of video frames simultaneously.
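For reference, here is a minimal sketch of the kind of invertible coupling layer such rescaling networks build on: half of the channels are transformed conditioned on the other half, so the mapping can be inverted exactly. The tiny convolutional branch and the additive form are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    """Additive coupling layer: invertible by construction."""
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.net = nn.Sequential(
            nn.Conv2d(half, half, 3, padding=1), nn.ReLU(),
            nn.Conv2d(half, half, 3, padding=1),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        return torch.cat([x1, x2 + self.net(x1)], dim=1)  # y2 = x2 + f(x1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        return torch.cat([y1, y2 - self.net(y1)], dim=1)  # exact inversion

layer = AdditiveCoupling(channels=8)
x = torch.randn(1, 8, 16, 16)
assert torch.allclose(layer.inverse(layer(x)), x, atol=1e-6)  # round-trips exactly
```

Exact invertibility is what lets such networks recover the high-resolution video from its downscaled counterpart without a separate upscaling model.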
arXiv Detail & Related papers (2021-03-27T09:35:38Z)
- Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution [95.26202278535543]
A simple solution to space-time video super-resolution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR).
However, temporal interpolation and spatial super-resolution are intra-related in this task.
We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from a low-frame-rate (LFR), low-resolution (LR) video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.