Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution
- URL: http://arxiv.org/abs/2303.09826v2
- Date: Wed, 20 Sep 2023 03:52:59 GMT
- Title: Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution
- Authors: Zixi Tuo, Huan Yang, Jianlong Fu, Yujie Dun, Xueming Qian
- Abstract summary: We explore the characteristics of animation videos and leverage the rich priors in real-world animation data for a more practical animation VSR model.
We propose a multi-scale Vector-Quantized Degradation model for animation video Super-Resolution (VQD-SR) to decompose the local details from global structures.
A rich-content Real Animation Low-quality (RAL) video dataset is collected for extracting the priors.
- Score: 59.71387128485845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing real-world video super-resolution (VSR) methods focus on designing a
general degradation pipeline for open-domain videos while ignoring intrinsic
data characteristics, which strongly limits their performance when applied
to specific domains (e.g., animation videos). In this paper, we thoroughly
explore the characteristics of animation videos and leverage the rich priors in
real-world animation data for a more practical animation VSR model. In
particular, we propose a multi-scale Vector-Quantized Degradation model for
animation video Super-Resolution (VQD-SR) to decompose the local details from
global structures and transfer the degradation priors in real-world animation
videos to a learned vector-quantized codebook for degradation modeling. A
rich-content Real Animation Low-quality (RAL) video dataset is collected for
extracting the priors. We further propose a data enhancement strategy for
high-resolution (HR) training videos, based on our observation that existing HR
videos are mostly collected from the Web and contain conspicuous compression
artifacts. The proposed strategy effectively raises the upper bound of animation
VSR performance, regardless of the specific VSR model. Experimental results
demonstrate the superiority of the proposed VQD-SR over state-of-the-art
methods through extensive quantitative and qualitative evaluations on the
latest animation video super-resolution benchmark. The code and pre-trained
models can be downloaded at https://github.com/researchmm/VQD-SR.
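As an illustration of the core idea, the sketch below shows VQ-VAE-style nearest-neighbor codebook quantization, assuming a PyTorch setup. The class name DegradationCodebook and its parameters are illustrative placeholders, not the released VQD-SR code, whose codebook is multi-scale and trained on features extracted from real low-quality animation videos.

```python
import torch
import torch.nn as nn

class DegradationCodebook(nn.Module):
    """Toy nearest-neighbor vector quantizer (VQ-VAE style).

    Illustrative only: the actual VQD-SR codebook is multi-scale and
    learned from real low-quality animation videos.
    """

    def __init__(self, num_codes: int = 512, code_dim: int = 64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (B, C, H, W) feature map from a degradation encoder; assumes C == code_dim
        b, c, h, w = z.shape
        flat = z.permute(0, 2, 3, 1).reshape(-1, c)           # (B*H*W, C)

        # Squared L2 distance from every feature vector to every codebook entry
        dist = (flat.pow(2).sum(1, keepdim=True)
                - 2 * flat @ self.codebook.weight.t()
                + self.codebook.weight.pow(2).sum(1))
        idx = dist.argmin(dim=1)                               # nearest code per vector
        quant = self.codebook(idx).reshape(b, h, w, c).permute(0, 3, 1, 2)

        # Straight-through estimator so gradients still reach the encoder
        return z + (quant - z).detach()
```

In the paper's pipeline, quantized codes like these stand in for real-world degradation priors when synthesizing low-quality training pairs; the straight-through trick above is the standard way such a codebook is trained end to end.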
Related papers
- DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models [9.145545884814327]
This paper introduces a method for zero-shot video restoration using pre-trained image restoration diffusion models.
We show that our method achieves top performance in zero-shot video restoration.
Our technique works with any 2D restoration diffusion model, offering a versatile and powerful tool for video enhancement tasks without extensive retraining.
arXiv Detail & Related papers (2024-07-01T17:59:12Z)
- Retargeting video with an end-to-end framework [14.270721529264929]
We present an end-to-end RETVI method to retarget videos to arbitrary ratios.
Our system outperforms previous work in quality and running time.
arXiv Detail & Related papers (2023-11-08T04:56:41Z)
- AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos [23.71771590274543]
This paper studies the problem of real-world video super-resolution (VSR) for animation videos, and reveals three key improvements for practical animation VSR.
We propose to learn such basic operators from real low-quality animation videos, and incorporate the learned ones into the degradation generation pipeline.
Our method, AnimeSR, is capable of restoring real-world low-quality animation videos effectively and efficiently, achieving superior performance to previous state-of-the-art methods.
arXiv Detail & Related papers (2022-06-14T17:57:11Z)
- VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate.
We show that VideoINR achieves competitive performances with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
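To make the continuous space-time decoding concrete, here is a minimal sketch assuming PyTorch: an implicit decoder samples an encoded feature map at arbitrary normalized (x, y, t) coordinates and maps each sample to RGB. The layer sizes and the grid_sample-based lookup are simplified assumptions, not the actual VideoINR architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContinuousDecoder(nn.Module):
    """Decode an encoded video feature map at arbitrary (x, y, t) query points."""

    def __init__(self, feat_dim: int = 64, hidden: int = 256):
        super().__init__()
        # Input: a bilinearly sampled feature plus its normalized (x, y, t) coordinate
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # RGB output
        )

    def forward(self, feats: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # feats:  (B, C, H, W) encoded features of a reference frame
        # coords: (B, N, 3) query coordinates in [-1, 1] for (x, y, t)
        xy = coords[..., :2].unsqueeze(2)                        # (B, N, 1, 2)
        sampled = F.grid_sample(feats, xy, align_corners=False)  # (B, C, N, 1)
        sampled = sampled.squeeze(-1).permute(0, 2, 1)           # (B, N, C)
        return self.mlp(torch.cat([sampled, coords], dim=-1))    # (B, N, 3)
```

Because the decoder is queried per coordinate, the same network can render any spatial resolution or frame rate simply by changing the density of the query grid.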
- VRAG: Region Attention Graphs for Content-Based Video Retrieval [85.54923500208041]
Region Attention Graph networks (VRAG) improve on state-of-the-art video-level methods.
VRAG represents videos at a finer granularity via region-level features and encodes video-temporal dynamics through region-level relations.
We show that the performance gap between video-level and frame-level methods can be reduced by segmenting videos into shots and using shot embeddings for video retrieval.
arXiv Detail & Related papers (2022-05-18T16:50:45Z)
- Learning Trajectory-Aware Transformer for Video Super-Resolution [50.49396123016185]
Video super-resolution aims to restore a sequence of high-resolution (HR) frames from their low-resolution (LR) counterparts.
Existing approaches usually align and aggregate video frames from limited adjacent frames.
We propose a novel Trajectory-aware Transformer for Video Super-Resolution (TTVSR).
arXiv Detail & Related papers (2022-04-08T03:37:39Z)
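A rough sketch of trajectory-aware attention, assuming PyTorch: a query location attends only to the features lying on its pre-computed trajectory across the support frames, instead of attending to every spatial position. The trajectory indexing and single-head attention below are simplified assumptions, not the paper's implementation.

```python
import torch

def trajectory_attention(query, frame_feats, traj):
    """Attend a query pixel to the features along its motion trajectory.

    query:       (B, C)           feature of the query location in the current frame
    frame_feats: (B, T, C, H, W)  features of T support frames
    traj:        (B, T, 2)        long tensor of (y, x) trajectory positions per frame
    Returns a (B, C) aggregated feature (illustrative single-head attention).
    """
    b, t, c, h, w = frame_feats.shape
    ys, xs = traj[..., 0], traj[..., 1]                        # (B, T) each
    batch_idx = torch.arange(b).unsqueeze(1).expand(b, t)      # (B, T)
    time_idx = torch.arange(t).unsqueeze(0).expand(b, t)       # (B, T)

    # Gather the feature vector lying on the trajectory in every frame: (B, T, C)
    keys = frame_feats[batch_idx, time_idx, :, ys, xs]

    # Scaled dot-product attention of the query against trajectory features
    attn = torch.softmax((keys @ query.unsqueeze(-1)).squeeze(-1) / c ** 0.5, dim=1)
    return (attn.unsqueeze(-1) * keys).sum(dim=1)              # (B, C)
```

Restricting attention to trajectories keeps the cost linear in the number of frames, which is what lets such models aggregate information from far-away frames.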
- STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction [78.129039340528]
We propose a Spatiotemporal Residual Predictive Model (STRPM) for high-resolution video prediction.
Experimental results show that STRPM generates more satisfactory results than various existing methods.
arXiv Detail & Related papers (2022-03-30T06:24:00Z)
- Real-Time Video Super-Resolution by Joint Local Inference and Global Parameter Estimation [0.0]
We present a novel approach to synthesizing training data by simulating two digital-camera image-capture processes at different scales.
Our method produces image-pairs in which both images have properties of natural images.
We present an efficient CNN architecture that enables real-time video SR on low-power edge devices.
arXiv Detail & Related papers (2021-05-06T16:35:09Z)
- Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories [56.91664227337115]
We introduce a collaborative memory mechanism that encodes information across multiple sampled clips of a video at each training iteration.
This enables the learning of long-range dependencies beyond a single clip.
Our proposed framework is end-to-end trainable and significantly improves the accuracy of video classification at a negligible computational overhead.
arXiv Detail & Related papers (2021-04-02T18:59:09Z)
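A minimal sketch of the collaborative-memory idea, assuming PyTorch: features from several clips sampled from the same video are pooled into a shared memory and fed back to each clip before classification. The average-pooling fusion and the layer names are illustrative assumptions, not the paper's exact mechanism.

```python
import torch
import torch.nn as nn

class CollaborativeMemory(nn.Module):
    """Share information across the clips sampled from one video."""

    def __init__(self, feat_dim: int = 512, num_classes: int = 400):
        super().__init__()
        self.fuse = nn.Linear(2 * feat_dim, feat_dim)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, clip_feats: torch.Tensor) -> torch.Tensor:
        # clip_feats: (num_clips, B, C) backbone features of clips from the same video
        memory = clip_feats.mean(dim=0, keepdim=True)           # (1, B, C) shared memory
        memory = memory.expand_as(clip_feats)                   # broadcast to every clip
        enriched = torch.relu(self.fuse(torch.cat([clip_feats, memory], dim=-1)))
        # Video-level prediction: average the per-clip logits
        return self.classifier(enriched).mean(dim=0)            # (B, num_classes)
```

Because the memory is rebuilt from all sampled clips at every training iteration, each clip's prediction can draw on context well beyond its own few seconds of video.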