Multi-level Wavelet-based Generative Adversarial Network for Perceptual
Quality Enhancement of Compressed Video
- URL: http://arxiv.org/abs/2008.00499v1
- Date: Sun, 2 Aug 2020 15:01:38 GMT
- Title: Multi-level Wavelet-based Generative Adversarial Network for Perceptual
Quality Enhancement of Compressed Video
- Authors: Jianyi Wang, Xin Deng, Mai Xu, Congyong Chen, Yuhang Song
- Abstract summary: Existing methods mainly focus on enhancing the objective quality of compressed video while ignoring its perceptual quality.
We propose a novel generative adversarial network (GAN) based on multi-level wavelet packet transform (WPT) to enhance the perceptual quality of compressed video.
- Score: 51.631731922593225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The past few years have witnessed fast development in video quality
enhancement via deep learning. Existing methods mainly focus on enhancing the
objective quality of compressed video while ignoring its perceptual quality. In
this paper, we focus on enhancing the perceptual quality of compressed video.
Our main observation is that enhancing the perceptual quality mostly relies on
recovering high-frequency sub-bands in the wavelet domain. Accordingly, we propose
a novel generative adversarial network (GAN) based on multi-level wavelet
packet transform (WPT) to enhance the perceptual quality of compressed video,
which is called multi-level wavelet-based GAN (MW-GAN). In MW-GAN, we first
apply motion compensation with a pyramid architecture to obtain temporal
information. Then, we propose a wavelet reconstruction network with
wavelet-dense residual blocks (WDRB) to recover the high-frequency details. In
addition, the adversarial loss of MW-GAN is applied via WPT to further encourage
the recovery of high-frequency details in video frames. Experimental results
demonstrate the superiority of our method.
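Below is a minimal sketch, not the authors' released code, of the wavelet-domain idea in the abstract: PyWavelets splits a frame into multi-level WPT sub-bands, and a loss is measured only on the high-frequency sub-bands that compression tends to destroy. The function names (`wpt_subbands`, `high_freq_l1`) and the Haar/2-level configuration are illustrative assumptions; MW-GAN feeds such sub-bands to a discriminator rather than comparing them directly.

```python
# A minimal sketch (assumed names/configuration, not MW-GAN itself):
# decompose a frame with a multi-level wavelet packet transform (WPT)
# and restrict a loss to the high-frequency sub-bands.
import numpy as np
import pywt


def wpt_subbands(frame: np.ndarray, wavelet: str = "haar", level: int = 2):
    """Split a grayscale frame into its level-`level` WPT sub-bands.

    The 'a' * level node is the low-frequency approximation; the other
    4**level - 1 nodes carry high-frequency detail.
    """
    wp = pywt.WaveletPacket2D(data=frame, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level)  # 16 sub-band nodes at level 2
    approx = {n.path: n.data for n in nodes if n.path == "a" * level}
    detail = {n.path: n.data for n in nodes if n.path != "a" * level}
    return approx, detail


def high_freq_l1(enhanced: np.ndarray, target: np.ndarray) -> float:
    """L1 distance over the high-frequency WPT sub-bands only.

    A WPT-domain adversarial loss would instead pass these sub-bands
    to a discriminator, pushing the generator to synthesize detail.
    """
    _, d_enh = wpt_subbands(enhanced)
    _, d_tgt = wpt_subbands(target)
    return float(sum(np.abs(d_enh[p] - d_tgt[p]).mean() for p in d_tgt))


# Toy usage with random frames standing in for an enhanced/original pair.
enh, tgt = np.random.rand(64, 64), np.random.rand(64, 64)
print(high_freq_l1(enh, tgt))
```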
Related papers
- High-Frequency Enhanced Hybrid Neural Representation for Video Compression [32.38933743785333]
This paper introduces a High-Frequency Enhanced Hybrid Neural Representation Network.
Our method focuses on leveraging high-frequency information to improve the synthesis of fine details by the network.
Experiments on the Bunny and UVG datasets demonstrate that our method outperforms other methods.
arXiv Detail & Related papers (2024-11-11T03:04:46Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement [14.653248860008981]
We propose a hierarchical frequency-based upsampling and refining neural network (HFUR) for compressed video quality enhancement.
ImpFreqUp exploits DCT-domain prior derived through implicit DCT transform, and accurately reconstructs the DCT-domain loss via a coarse-to-fine transfer.
HIR is introduced to facilitate cross-collaboration and information compensation between the scales, thus further refining the feature maps and promoting the visual quality of the final output.
arXiv Detail & Related papers (2024-03-18T08:13:26Z)
- High Visual-Fidelity Learned Video Compression [6.609832462227998]
We propose a novel High Visual-Fidelity Learned Video Compression framework (HVFVC).
Specifically, we design a novel confidence-based feature reconstruction method to address the issue of poor reconstruction in newly-emerged regions.
Extensive experiments have shown that the proposed HVFVC achieves excellent perceptual quality, outperforming the latest VVC standard with only 50% of the required bitrate.
arXiv Detail & Related papers (2023-10-07T03:27:45Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-Neighborhood heterogeneous deformable (HetDeform) kernel offsets; a toy sketch of deformable compensation appears after this list.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We condition our model on quantization data, which is readily available in the bitstream.
We show that this improves restoration accuracy compared to prior compression-correction methods.
arXiv Detail & Related papers (2022-01-31T18:56:04Z)
- Boosting the Performance of Video Compression Artifact Reduction with Reference Frame Proposals and Frequency Domain Information [31.053879834073502]
We propose an effective reference frame proposal strategy to boost the performance of existing multi-frame approaches.
Experimental results show that our method achieves better fidelity and perceptual performance on the MFQE 2.0 dataset than state-of-the-art methods.
arXiv Detail & Related papers (2021-05-31T13:46:11Z)
- Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement [164.7489982837475]
We propose a Hierarchical Learned Video Compression (HLVC) method with three hierarchical quality layers and a recurrent enhancement network.
In our HLVC approach, the hierarchical quality benefits coding efficiency, since high-quality information facilitates the compression and enhancement of low-quality frames at the encoder and decoder sides.
arXiv Detail & Related papers (2020-03-04T09:31:37Z)
- An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal [99.49099501559652]
Video Coding for Machine (VCM) aims to bridge the gap between visual feature compression and classical video coding.
We employ a conditional deep generation network to reconstruct video frames with the guidance of learned motion patterns.
By learning to extract sparse motion patterns via a predictive model, the network elegantly leverages the feature representation to generate the appearance of to-be-coded frames.
arXiv Detail & Related papers (2020-01-09T14:18:18Z)
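As referenced in the HDCVC entry above, here is a hypothetical sketch of deformable compensation in PyTorch: an offset head predicts per-tap kernel offsets from two adjacent frames' features, and torchvision's `deform_conv2d` warps the reference features accordingly. The module name, channel sizes, and offset-head design are assumptions for illustration, not the HDCVC architecture.

```python
# A toy sketch of deformable compensation (assumed design, not HDCVC):
# predict per-pixel kernel offsets from two adjacent frames' features,
# then sample the reference features at those learned locations.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d


class DeformableCompensation(nn.Module):
    def __init__(self, channels: int = 32, kernel_size: int = 3):
        super().__init__()
        self.ks = kernel_size
        # Offset head: sees both frames' features and predicts an (x, y)
        # offset per kernel tap, i.e. 2 * k * k output channels.
        self.offset_head = nn.Conv2d(2 * channels, 2 * kernel_size ** 2,
                                     kernel_size=3, padding=1)
        # Weights of the deformable convolution doing the compensation.
        self.weight = nn.Parameter(
            torch.randn(channels, channels, kernel_size, kernel_size) * 0.01)

    def forward(self, feat_ref: torch.Tensor, feat_cur: torch.Tensor):
        # Offsets are conditioned on both frames, so the sampling grid
        # adapts to the content instead of staying fixed.
        offsets = self.offset_head(torch.cat([feat_ref, feat_cur], dim=1))
        return deform_conv2d(feat_ref, offsets, self.weight,
                             padding=self.ks // 2)


# Toy usage on random features standing in for two adjacent frames.
f_ref, f_cur = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 64, 64)
print(DeformableCompensation()(f_ref, f_cur).shape)  # [1, 32, 64, 64]
```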
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.