Related papers: RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content

URL: http://arxiv.org/abs/2411.13362v1
Date: Wed, 20 Nov 2024 14:36:06 GMT
Title: RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content
Authors: Yuxuan Jiang, Jakub Nawała, Chen Feng, Fan Zhang, Xiaoqing Zhu, Joel Sole, David Bull
Abstract summary: Super-resolution (SR) is a key technique for improving the visual quality of video content. To support real-time playback, it is important to implement fast SR models while preserving reconstruction quality. This paper proposes a low-complexity SR method, RTSR, designed to enhance the visual quality of compressed video content.
Score: 10.569678424799616
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Super-resolution (SR) is a key technique for improving the visual quality of video content by increasing its spatial resolution while reconstructing fine details. SR has been employed in many applications including video streaming, where compressed low-resolution content is typically transmitted to end users and then reconstructed with a higher resolution and enhanced quality. To support real-time playback, it is important to implement fast SR models while preserving reconstruction quality; however, most existing solutions, in particular those based on complex deep neural networks, fail to do so. To address this issue, this paper proposes a low-complexity SR method, RTSR, designed to enhance the visual quality of compressed video content, focusing on resolution up-scaling from a) 360p to 1080p and from b) 540p to 4K. The proposed approach utilizes a CNN-based network architecture, which was optimized for AV1 (SVT)-encoded content at various quantization levels based on a dual-teacher knowledge distillation method. This method was submitted to the AIM 2024 Video Super-Resolution Challenge, specifically targeting the Efficient/Mobile Real-Time Video Super-Resolution competition. It achieved the best trade-off between complexity and coding performance (measured in PSNR, SSIM and VMAF) among all six submissions. The code will be available soon.
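The abstract mentions a dual-teacher knowledge distillation method but does not give its loss function. As a rough illustration only, the sketch below shows one common way such an objective can be formed: a reconstruction term against the ground truth plus two weighted terms that pull the student output toward each teacher's output. The function names, the use of an L1 distance, and the weights `w_a` and `w_b` are all assumptions made for this example, not details taken from the paper.

```python
# Hypothetical sketch of a dual-teacher distillation objective.
# Nothing here is the RTSR authors' implementation; the L1 distance and
# the weights w_a / w_b are assumptions chosen for illustration.

def l1(a, b):
    """Mean absolute error between two equally sized flat pixel lists."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def dual_teacher_loss(student, target, teacher_a, teacher_b, w_a=0.5, w_b=0.5):
    """Reconstruction loss plus weighted terms toward two teacher outputs."""
    return (
        l1(student, target)
        + w_a * l1(student, teacher_a)
        + w_b * l1(student, teacher_b)
    )


# Toy 3-pixel "frames" standing in for real network outputs.
student = [0.2, 0.4, 0.6]
target = [0.0, 0.5, 1.0]
teacher_a = [0.1, 0.5, 0.9]
teacher_b = [0.0, 0.4, 0.8]

loss = dual_teacher_loss(student, target, teacher_a, teacher_b)
print(f"dual-teacher loss: {loss:.4f}")
```

In a real training setup the lists would be image tensors and the teachers would be larger pretrained SR networks; the weighting between the three terms is a tuning choice.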

Related papers

Compressed Video Super-Resolution based on Hierarchical Encoding [24.869991871048764]
VSR-HE upscales low-resolution videos by a ratio of four, from 180p to 720p or from 270p to 1080p. The proposed VSR-HE has been officially submitted to the ICME 2025 Grand Challenge on VSR for Video Conferencing.
arXiv Detail & Related papers (2025-06-17T10:26:07Z)
ICME 2025 Grand Challenge on Video Super-Resolution for Video Conferencing [11.461315814208437]
Super-Resolution (SR) is a critical task in computer vision, focusing on reconstructing high-resolution (HR) images from low-resolution (LR) inputs. Video Super-Resolution (VSR) extends this to the temporal domain, aiming to enhance video quality using methods like local, uni-, bi-directional propagation, or traditional upscaling followed by restoration. This challenge addresses VSR for conferencing, where LR videos are encoded with H.265 at fixed QPs. The goal is to upscale videos by a specific factor, providing HR outputs with enhanced perceptual quality under a low-delay scenario using causal
arXiv Detail & Related papers (2025-06-13T22:46:27Z)
FCA2: Frame Compression-Aware Autoencoder for Modular and Fast Compressed Video Super-Resolution [68.77813885751308]
State-of-the-art (SOTA) compressed video super-resolution (CVSR) models face persistent challenges, including prolonged inference time, complex training pipelines, and reliance on auxiliary information. We propose an efficient and scalable solution inspired by the structural and statistical similarities between hyperspectral images (HSI) and video data. Our approach introduces a compression-driven dimensionality reduction strategy that reduces computational complexity, accelerates inference, and enhances the extraction of temporal information across frames.
arXiv Detail & Related papers (2025-06-13T07:59:52Z)
RepNet-VSR: Reparameterizable Architecture for High-Fidelity Video Super-Resolution [12.274092278786966]
We propose a Reparameterizable Architecture for High-Fidelity Video Super-Resolution method, named RepNet-VSR, for real-time 4x video super-resolution. The proposed model achieves 27.79 dB PSNR when processing 180p to 720p frames in 103 ms per 10 frames on a MediaTek Dimensity NPU.
arXiv Detail & Related papers (2025-04-22T07:15:07Z)
Elevating Flow-Guided Video Inpainting with Reference Generation [50.03502211226332]
Video inpainting (VI) is a challenging task that requires effective propagation of observable content across frames while simultaneously generating new content not present in the original video. We propose a robust and practical VI framework that leverages a large generative model for reference generation in combination with an advanced pixel propagation algorithm. Our method not only significantly enhances frame-level quality for object removal but also synthesizes new content in the missing areas based on user-provided text prompts.
arXiv Detail & Related papers (2024-12-12T06:13:00Z)
Standard compliant video coding using low complexity, switchable neural wrappers [8.149130379436759]
We propose a new framework featuring standard compatibility, high performance, and low decoding complexity. We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions. We design a low-complexity neural post-processor architecture that can handle different upsampling ratios.
arXiv Detail & Related papers (2024-07-10T06:36:45Z)
Hierarchical Patch Diffusion Models for High-Resolution Video Generation [50.42746357450949]
We develop deep context fusion, which propagates context information from low-scale to high-scale patches in a hierarchical manner. We also propose adaptive computation, which allocates more network capacity and computation towards coarse image details. The resulting model sets a new state-of-the-art FVD score of 66.32 and Inception Score of 87.68 in class-conditional video generation.
arXiv Detail & Related papers (2024-06-12T01:12:53Z)
Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding. The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z)
VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution [75.79379734567604]
We show that Video Implicit Neural Representation (VideoINR) can be decoded to videos of arbitrary spatial resolution and frame rate. We show that VideoINR achieves competitive performances with state-of-the-art STVSR methods on common up-sampling scales.
arXiv Detail & Related papers (2022-06-09T17:45:49Z)
Image Super-resolution with An Enhanced Group Convolutional Neural Network [102.2483249598621]
CNNs with strong learning abilities are widely chosen to solve the super-resolution problem. We present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture. Experiments show that ESRGCNN surpasses the state of the art in single-image super-resolution (SISR) in terms of performance, complexity, execution speed, image quality evaluation and visual effect.
arXiv Detail & Related papers (2022-05-29T00:34:25Z)
Real-Time Super-Resolution System of 4K-Video Based on Deep Learning [6.182364004551161]
Video super-resolution (VSR) technology excels in reconstructing low-quality video, avoiding the unpleasant blur effect caused by interpolation-based algorithms. This paper explores the possibility of a real-time VSR system and designs an efficient generic VSR network, termed EGVSR. Compared with TecoGAN, the most advanced VSR network at present, we achieve an 84% reduction of computation density and 7.92x performance speedups.
arXiv Detail & Related papers (2021-07-12T10:35:05Z)
BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment [90.81396836308085]
We show that by empowering the recurrent framework with enhanced propagation and alignment, one can exploit video information more effectively. Our model BasicVSR++ surpasses BasicVSR by 0.82 dB in PSNR with a similar number of parameters. BasicVSR++ generalizes well to other video restoration tasks such as compressed video enhancement.
arXiv Detail & Related papers (2021-04-27T17:58:31Z)
Efficient Video Compression via Content-Adaptive Super-Resolution [11.6624528293976]
Video compression is a critical component of Internet video delivery. Recent work has shown that deep learning techniques can rival or outperform human-designed algorithms. This paper presents a new approach that augments a recent deep learning-based video compression scheme.
arXiv Detail & Related papers (2021-04-06T07:01:06Z)
AIM 2020 Challenge on Video Extreme Super-Resolution: Methods and Results [96.74919503142014]
This paper reviews the video extreme super-resolution challenge associated with the AIM 2020 workshop at ECCV 2020. Track 1 is set up to gauge the state-of-the-art for such a demanding task, where fidelity to the ground truth is measured by PSNR and SSIM. Track 2 therefore aims at generating visually pleasing results, which are ranked according to human perception, evaluated by a user study.
arXiv Detail & Related papers (2020-09-14T09:36:25Z)
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution [95.26202278535543]
A simple solution is to split it into two sub-tasks: video frame interpolation (VFI) and video super-resolution (VSR). However, temporal synthesis and spatial super-resolution are intra-related in this task. We propose a one-stage space-time video super-resolution framework, which directly synthesizes an HR slow-motion video from an LFR, LR video.
arXiv Detail & Related papers (2020-02-26T16:59:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.