Learning for Video Compression with Hierarchical Quality and Recurrent
Enhancement
- URL: http://arxiv.org/abs/2003.01966v7
- Date: Mon, 3 Aug 2020 18:35:37 GMT
- Title: Learning for Video Compression with Hierarchical Quality and Recurrent
Enhancement
- Authors: Ren Yang, Fabian Mentzer, Luc Van Gool, Radu Timofte
- Abstract summary: We propose a Hierarchical Learned Video Compression (HLVC) method with three hierarchical quality layers and a recurrent enhancement network.
In our HLVC approach, the hierarchical quality benefits the coding efficiency, since the high quality information facilitates the compression and enhancement of low quality frames at encoder and decoder sides.
- Score: 164.7489982837475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a Hierarchical Learned Video Compression (HLVC)
method with three hierarchical quality layers and a recurrent enhancement
network. The frames in the first layer are compressed by an image compression
method with the highest quality. Using these frames as references, we propose
the Bi-Directional Deep Compression (BDDC) network to compress the second layer
with relatively high quality. Then, the third layer frames are compressed with
the lowest quality, by the proposed Single Motion Deep Compression (SMDC)
network, which adopts a single motion map to estimate the motions of multiple
frames, thus saving bits for motion information. In our deep decoder, we
develop the Weighted Recurrent Quality Enhancement (WRQE) network, which takes
both compressed frames and the bit stream as inputs. In the recurrent cell of
WRQE, the memory and update signal are weighted by quality features to
reasonably leverage multi-frame information for enhancement. In our HLVC
approach, the hierarchical quality benefits the coding efficiency, since the
high quality information facilitates the compression and enhancement of low
quality frames at encoder and decoder sides, respectively. Finally, the
experiments validate that our HLVC approach advances the state-of-the-art of
deep video compression methods, and outperforms the "Low-Delay P (LDP) very
fast" mode of x265 in terms of both PSNR and MS-SSIM. The project page is at
https://github.com/RenYang-home/HLVC.
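The three-layer structure described in the abstract can be illustrated with a small frame-to-layer mapping. This is only a sketch under assumptions that the abstract does not state: a group of pictures (GOP) of 10 frames, the two boundary frames in layer 1, and the middle frame in layer 2. The helper names `assign_quality_layers` and `frames_in_layer` are hypothetical and are not taken from the HLVC code base.

```python
from typing import Dict, List


def assign_quality_layers(gop_size: int = 10) -> Dict[int, int]:
    """Map each frame index 0..gop_size to a quality layer (1 = highest).

    Assumed layout (not given in the abstract):
      layer 1: the two boundary frames, coded by an image codec
      layer 2: the middle frame, coded bi-directionally from the boundaries
      layer 3: all remaining frames, coded at the lowest quality
    """
    if gop_size < 2 or gop_size % 2 != 0:
        raise ValueError("this sketch assumes an even gop_size >= 2")

    middle = gop_size // 2
    layers: Dict[int, int] = {}
    for idx in range(gop_size + 1):
        if idx in (0, gop_size):
            layers[idx] = 1
        elif idx == middle:
            layers[idx] = 2
        else:
            layers[idx] = 3
    return layers


def frames_in_layer(layers: Dict[int, int], layer: int) -> List[int]:
    """Return the sorted frame indices that belong to the given layer."""
    return [idx for idx, lay in sorted(layers.items()) if lay == layer]


if __name__ == "__main__":
    layers = assign_quality_layers(10)
    for layer in (1, 2, 3):
        print(f"layer {layer}: frames {frames_in_layer(layers, layer)}")
```

Under these assumptions, every layer-3 frame has a higher-quality frame nearby, which is what lets high-quality information help the compression and enhancement of low-quality frames.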
Related papers
- Learned Compression for Images and Point Clouds [1.7404865362620803]
This thesis provides three primary contributions to this new field of learned compression.
First, we present an efficient low-complexity entropy model that dynamically adapts the encoding distribution to a specific input by compressing and transmitting the encoding distribution itself as side information.
Secondly, we propose a novel lightweight low-complexity point cloud codec that is highly specialized for classification, attaining significant reductions in bitrate compared to non-specialized codecs.
arXiv Detail & Related papers (2024-09-12T19:57:44Z)
- Bi-Directional Deep Contextual Video Compression [17.195099321371526]
We introduce a bi-directional deep contextual video compression scheme tailored for B-frames, termed DCVC-B.
First, we develop a bi-directional motion difference context propagation method for effective motion difference coding.
Second, we propose a bi-directional contextual compression model and a corresponding bi-directional temporal entropy model.
Third, we propose a hierarchical quality structure-based training strategy, leading to an effective bit allocation across large groups of pictures.
arXiv Detail & Related papers (2024-08-16T08:45:25Z)
- Accelerating Learned Video Compression via Low-Resolution Representation Learning [18.399027308582596]
We introduce an efficiency-optimized framework for learned video compression that focuses on low-resolution representation learning.
Our method achieves performance levels on par with the low-delay P configuration of the H.266 reference software VTM.
arXiv Detail & Related papers (2024-07-23T12:02:57Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
Most of the existing methods lack a structured design to optimally leverage the priors within compression codecs.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model [78.4051835615796]
This paper proposes a method called Multimodal Image Semantic Compression.
It consists of an LMM encoder for extracting the semantic information of the image, a map encoder to locate the region corresponding to the semantics, an image encoder that generates an extremely compressed bitstream, and a decoder that reconstructs the image based on the above information.
It can achieve optimal consistency and perception results while saving 50% of the bitrate, which gives it strong potential for applications in the next generation of storage and communication.
arXiv Detail & Related papers (2024-02-26T17:11:11Z)
- You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies masked image modeling (MIM) and LIC end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z)
- HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation [14.088444622391501]
Implicit Neural Representations (INRs) have previously been used to represent and compress image and video content.
Existing INR-based methods have failed to deliver rate-quality performance comparable with the state of the art in video compression.
We propose HiNeRV, an INR that combines lightweight layers with hierarchical positional encodings.
arXiv Detail & Related papers (2023-06-16T12:59:52Z)
- Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z)
- Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z)