Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement
- URL: http://arxiv.org/abs/2403.11556v1
- Date: Mon, 18 Mar 2024 08:13:26 GMT
- Title: Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement
- Authors: Qianyu Zhang, Bolun Zheng, Xinying Chen, Quan Chen, Zhunjie Zhu, Canjin Wang, Zongpeng Li, Chenggang Yan
- Abstract summary: We propose a hierarchical frequency-based upsampling and refining neural network (HFUR) for compressed video quality enhancement.
ImpFreqUp exploits a DCT-domain prior derived through an implicit DCT transform, and accurately reconstructs the DCT-domain loss via a coarse-to-fine transfer.
HIR is introduced to facilitate cross-collaboration and information compensation between the scales, thus further refining the feature maps and promoting the visual quality of the final output.
- Score: 14.653248860008981
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video compression artifacts arise due to the quantization operation in the frequency domain. The goal of video quality enhancement is to reduce compression artifacts and reconstruct a visually pleasant result. In this work, we propose a hierarchical frequency-based upsampling and refining neural network (HFUR) for compressed video quality enhancement. HFUR consists of two modules: an implicit frequency upsampling module (ImpFreqUp) and a hierarchical and iterative refinement module (HIR). ImpFreqUp exploits a DCT-domain prior derived through an implicit DCT transform, and accurately reconstructs the DCT-domain loss via a coarse-to-fine transfer. Subsequently, HIR is introduced to facilitate cross-collaboration and information compensation between the scales, thus further refining the feature maps and promoting the visual quality of the final output. We demonstrate the effectiveness of the proposed modules via ablation experiments and visualized results. Extensive experiments on public benchmarks show that HFUR achieves state-of-the-art performance for both constant bit rate and constant QP modes.
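To make the first sentence of the abstract concrete: block-based codecs transform pixel blocks with a DCT and quantize the coefficients, and the rounding error introduced there is exactly the frequency-domain loss an enhancement network such as HFUR is asked to recover. The snippet below is a generic illustration of this effect, not the authors' implementation; the 8x8 block size and the quantization step are arbitrary assumptions.

```python
# Illustrative only: generic block-DCT quantization, not HFUR code.
# The 8x8 block size and the quantization step are arbitrary assumptions.
import numpy as np
from scipy.fft import dct, idct

def dct2(block):
    # 2-D type-II DCT with orthonormal scaling (rows, then columns)
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs):
    # Inverse 2-D DCT
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

rng = np.random.default_rng(0)
block = rng.uniform(0.0, 255.0, size=(8, 8))      # stand-in for an 8x8 pixel block

q_step = 32.0                                      # coarse quantizer -> visible artifacts
coeffs = dct2(block)
dequantized = np.round(coeffs / q_step) * q_step   # the lossy step in the frequency domain
recon = idct2(dequantized)

# The residual below is the "DCT-domain loss" that a quality-enhancement
# network must reconstruct from the decoded frames.
print("max abs reconstruction error:", np.abs(block - recon).max())
```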
Related papers
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- End-to-End Optimized Image Compression with the Frequency-Oriented Transform [8.27145506280741]
We propose the end-to-end optimized image compression model facilitated by the frequency-oriented transform.
The model enables scalable coding through the selective transmission of arbitrary frequency components.
Our model outperforms all traditional codecs, including the next-generation standard H.266/VVC, on the MS-SSIM metric.
arXiv Detail & Related papers (2024-01-16T08:16:10Z)
- Frequency-Aware Transformer for Learned Image Compression [64.28698450919647]
We propose a frequency-aware transformer (FAT) block that, for the first time, achieves multiscale directional analysis for Learned Image Compression (LIC).
The FAT block comprises frequency-decomposition window attention (FDWA) modules to capture multiscale and directional frequency components of natural images.
We also introduce frequency-modulation feed-forward network (FMFFN) to adaptively modulate different frequency components, improving rate-distortion performance.
arXiv Detail & Related papers (2023-10-25T05:59:25Z)
- Joint Channel Estimation and Feedback with Masked Token Transformers in Massive MIMO Systems [74.52117784544758]
This paper proposes an encoder-decoder based network that unveils the intrinsic frequency-domain correlation within the CSI matrix.
The entire encoder-decoder network is utilized for channel compression.
Our method outperforms state-of-the-art channel estimation and feedback techniques in joint tasks.
arXiv Detail & Related papers (2023-06-08T06:15:17Z)
- Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder [27.149365819904745]
A higher compression rate induces more loss of visual signals in the higher frequency spectrum, which reflects fine details in pixel space.
A Frequency Complement Module (FCM) architecture is proposed to capture the missing frequency information for enhancing reconstruction quality.
A Cross-attention Autoregressive Transformer (CAT) is proposed to obtain more precise semantic attributes in texts.
arXiv Detail & Related papers (2023-05-04T04:30:21Z)
- High Dynamic Range Image Quality Assessment Based on Frequency Disparity [78.36555631446448]
An image quality assessment (IQA) algorithm based on frequency disparity for high dynamic range (HDR) images is proposed.
The proposed LGFM provides higher consistency with subjective perception than state-of-the-art HDR IQA methods.
arXiv Detail & Related papers (2022-09-06T08:22:13Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework built on a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance over recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Multi-level Wavelet-based Generative Adversarial Network for Perceptual Quality Enhancement of Compressed Video [51.631731922593225]
Existing methods mainly focus on enhancing the objective quality of compressed video while ignoring its perceptual quality.
We propose a novel generative adversarial network (GAN) based on multi-level wavelet packet transform (WPT) to enhance the perceptual quality of compressed video.
arXiv Detail & Related papers (2020-08-02T15:01:38Z)
- Generalized Octave Convolutions for Learned Multi-Frequency Image Compression [20.504561050200365]
We propose the first learned multi-frequency image compression and entropy coding approach.
It is based on the recently developed octave convolutions to factorize the latents into high and low frequency (resolution) components; a minimal sketch of the underlying octave convolution follows this list.
We show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based computer vision tasks.
arXiv Detail & Related papers (2020-02-24T01:35:29Z)
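For the octave-convolution entry above, the sketch below shows the basic mechanism that work builds on: feature channels are split into a full-resolution high-frequency branch and a half-resolution low-frequency branch, with convolutions exchanging information between the two. It follows the original octave-convolution formulation rather than the generalized variant proposed in that paper, and the channel ratio, kernel size, and toy tensor shapes are illustrative assumptions.

```python
# Minimal sketch of a standard octave convolution (illustrative; not the
# "generalized" variant from the paper above). alpha, the kernel size, and
# the toy tensor shapes below are arbitrary assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv(nn.Module):
    def __init__(self, in_ch, out_ch, alpha=0.5, kernel_size=3, padding=1):
        super().__init__()
        in_lo, out_lo = int(alpha * in_ch), int(alpha * out_ch)
        in_hi, out_hi = in_ch - in_lo, out_ch - out_lo
        # Four paths: high->high, high->low, low->high, low->low
        self.hh = nn.Conv2d(in_hi, out_hi, kernel_size, padding=padding)
        self.hl = nn.Conv2d(in_hi, out_lo, kernel_size, padding=padding)
        self.lh = nn.Conv2d(in_lo, out_hi, kernel_size, padding=padding)
        self.ll = nn.Conv2d(in_lo, out_lo, kernel_size, padding=padding)

    def forward(self, x_hi, x_lo):
        # High-frequency output: same-scale conv plus upsampled low-frequency path
        y_hi = self.hh(x_hi) + F.interpolate(self.lh(x_lo), scale_factor=2.0, mode="nearest")
        # Low-frequency output: same-scale conv plus downsampled high-frequency path
        y_lo = self.ll(x_lo) + self.hl(F.avg_pool2d(x_hi, 2))
        return y_hi, y_lo

# Toy usage: 64 channels split evenly across the two frequency branches
hi = torch.randn(1, 32, 64, 64)    # full-resolution, high-frequency features
lo = torch.randn(1, 32, 32, 32)    # half-resolution, low-frequency features
out_hi, out_lo = OctaveConv(64, 64)(hi, lo)
print(out_hi.shape, out_lo.shape)  # (1, 32, 64, 64) and (1, 32, 32, 32)
```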
This list is automatically generated from the titles and abstracts of the papers in this site.