Causal Contextual Prediction for Learned Image Compression
- URL: http://arxiv.org/abs/2011.09704v5
- Date: Sun, 31 Oct 2021 05:06:18 GMT
- Title: Causal Contextual Prediction for Learned Image Compression
- Authors: Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen
- Abstract summary: We propose the concept of separate entropy coding to leverage a serial decoding process for causal contextual entropy prediction in the latent space.
A causal context model is proposed that separates the latents across channels and makes use of cross-channel relationships to generate highly informative contexts.
We also propose a causal global prediction model, which is able to find global reference points for accurate predictions of unknown points.
- Score: 36.08393281509613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the past several years, we have witnessed impressive progress in the
field of learned image compression. Recent learned image codecs are commonly
based on autoencoders that first encode an image into low-dimensional latent
representations and then decode them for reconstruction. To capture
spatial dependencies in the latent space, prior works exploit a hyperprior and a
spatial context model to build an entropy model, which estimates the bit-rate
for end-to-end rate-distortion optimization. However, such an entropy model is
suboptimal in two respects: (1) it fails to capture spatially global
correlations among the latents; (2) cross-channel relationships of the latents
are still underexplored. In this paper, we propose the concept of separate
entropy coding to leverage a serial decoding process for causal contextual
entropy prediction in the latent space. A causal context model is proposed that
separates the latents across channels and makes use of cross-channel
relationships to generate highly informative contexts. Furthermore, we propose
a causal global prediction model, which is able to find global reference points
for accurate predictions of unknown points. Both models facilitate entropy
estimation without transmitting overhead. In addition, we adopt a new separate
attention module to build more powerful transform networks. Experimental results
demonstrate that our full image compression model outperforms the standard
VVC/H.266 codec on the Kodak dataset in terms of both PSNR and MS-SSIM, yielding
state-of-the-art rate-distortion performance.
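The channel-separated causal context described in the abstract can be illustrated with a short sketch. The PyTorch-style code below is not the authors' implementation; the class name, group count, and layer widths are illustrative assumptions. It shows latents split into channel groups whose Gaussian entropy parameters are predicted from hyperprior features plus the groups already decoded, so prediction stays causal and requires no transmitted overhead.

```python
# Minimal sketch of a channel-wise causal context entropy model (assumed names/shapes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalChannelContextModel(nn.Module):
    """Predicts (mean, scale) for each channel group of the latents,
    conditioning on hyperprior features and on previously decoded groups."""

    def __init__(self, latent_channels=192, groups=4, hyper_channels=192):
        super().__init__()
        assert latent_channels % groups == 0
        self.groups = groups
        self.group_size = latent_channels // groups
        # One small parameter network per group; its input grows with the
        # number of already-decoded groups (the causal cross-channel context).
        self.param_nets = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(hyper_channels + g * self.group_size, 224, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(224, 2 * self.group_size, 1),
            )
            for g in range(groups)
        ])

    def forward(self, y_hat, hyper):
        means, scales, decoded = [], [], []
        for g in range(self.groups):
            ctx = torch.cat([hyper] + decoded, dim=1) if decoded else hyper
            mu, sigma = self.param_nets[g](ctx).chunk(2, dim=1)
            means.append(mu)
            scales.append(F.softplus(sigma) + 1e-6)
            # The decoder reconstructs this group before moving on, so the
            # same context is available on both sides without side information.
            decoded.append(y_hat[:, g * self.group_size:(g + 1) * self.group_size])
        return torch.cat(means, dim=1), torch.cat(scales, dim=1)

def estimate_bits(y_hat, mean, scale):
    """Rate term R of the rate-distortion loss L = R + lambda * D under a
    Gaussian entropy model with unit-width quantization bins."""
    gauss = torch.distributions.Normal(mean, scale)
    p = gauss.cdf(y_hat + 0.5) - gauss.cdf(y_hat - 0.5)
    return (-torch.log2(p.clamp_min(1e-9))).sum()
```

Because the conditioning is strictly causal across channel groups, the decoder can rebuild exactly the same contexts from what it has already reconstructed, which is why no extra side information needs to be transmitted; the cost is a serial, group-by-group decoding pass, matching the serial decoding process mentioned in the abstract.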
Related papers
- Improving Diffusion-Based Image Synthesis with Context Prediction [49.186366441954846]
Existing diffusion models mainly try to reconstruct input image from a corrupted one with a pixel-wise or feature-wise constraint along spatial axes.
We propose ConPreDiff to improve diffusion-based image synthesis with context prediction.
Our ConPreDiff consistently outperforms previous methods and achieves new state-of-the-art text-to-image generation results on MS-COCO, with a zero-shot FID score of 6.21.
arXiv Detail & Related papers (2024-01-04T01:10:56Z)
- Corner-to-Center Long-range Context Model for Efficient Learned Image Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M), designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z)
- Multi-Context Dual Hyper-Prior Neural Image Compression [10.349258638494137]
We propose a Transformer-based nonlinear transform to efficiently capture both local and global information from the input image.
We also introduce a novel entropy model that incorporates two different hyperpriors to model cross-channel and spatial dependencies of the latent representation.
Our experiments show that the proposed framework outperforms state-of-the-art methods in terms of rate-distortion performance.
arXiv Detail & Related papers (2023-09-19T17:44:44Z)
- Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z)
- Entroformer: A Transformer-based Entropy Model for Learned Image Compression [17.51693464943102]
We propose a novel transformer-based entropy model, termed Entroformer, to capture long-range dependencies in probability distribution estimation.
The experiments show that the Entroformer achieves state-of-the-art performance on image compression while being time-efficient.
arXiv Detail & Related papers (2022-02-11T08:03:31Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that combines the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
- Joint Global and Local Hierarchical Priors for Learned Image Compression [30.44884350320053]
Recently, learned image compression methods have shown superior performance compared to traditional hand-crafted image codecs.
We propose a novel entropy model called Information Transformer (Informer) that exploits both local and global information in a content-dependent manner.
Our experiments demonstrate that Informer improves rate-distortion performance over the state-of-the-art methods on the Kodak and Tecnick datasets.
arXiv Detail & Related papers (2021-12-08T06:17:37Z)
- A Cross Channel Context Model for Latents in Deep Image Compression [10.20672454399047]
This paper presents a cross channel context model for latents in deep image compression.
The proposed model is combined with the joint autoregressive and hierarchical prior entropy model.
Using PSNR as the distortion metric, the combined model achieves BD-rate reductions of 6.30% and 6.31% over the baseline entropy model (a sketch of how BD-rate is computed appears after this list).
arXiv Detail & Related papers (2021-03-04T08:13:04Z)
- Learning Context-Based Non-local Entropy Modeling for Image Compression [140.64888994506313]
In this paper, we propose a non-local operation for context modeling by employing the global similarity within the context.
The entropy model is further adopted as the rate loss in a joint rate-distortion optimization.
Considering that the width of the transforms is essential for training low-distortion models, we introduce a U-Net block into the transforms to increase the width with manageable memory consumption and time complexity.
arXiv Detail & Related papers (2020-05-10T13:28:18Z)
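The BD-rate figures quoted above for the cross-channel context model are Bjøntegaard delta rates: the average percentage bit-rate difference between two rate-distortion curves at equal quality. A minimal sketch of the standard computation follows; the function name and the illustrative numbers are assumptions, not taken from any of the papers listed here.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate: average % bit-rate change of the test codec
    versus the anchor at equal PSNR (negative means the test codec saves bits)."""
    log_ra, log_rt = np.log(rate_anchor), np.log(rate_test)
    # Fit cubic polynomials of log-rate as a function of PSNR for each codec.
    fit_a = np.polyfit(psnr_anchor, log_ra, 3)
    fit_t = np.polyfit(psnr_test, log_rt, 3)
    lo = max(np.min(psnr_anchor), np.min(psnr_test))
    hi = min(np.max(psnr_anchor), np.max(psnr_test))
    # Integrate both fits over the overlapping PSNR range and compare averages.
    int_a = np.polyval(np.polyint(fit_a), hi) - np.polyval(np.polyint(fit_a), lo)
    int_t = np.polyval(np.polyint(fit_t), hi) - np.polyval(np.polyint(fit_t), lo)
    avg_log_diff = (int_t - int_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0

# Illustrative numbers only (rates in bpp, PSNR in dB), not results from any paper above.
anchor_rate = np.array([0.25, 0.50, 0.75, 1.00])
anchor_psnr = np.array([30.1, 32.8, 34.6, 36.0])
test_rate = np.array([0.23, 0.47, 0.70, 0.94])
test_psnr = np.array([30.1, 32.8, 34.6, 36.0])
# Prints a negative value: the test codec needs fewer bits at equal PSNR.
print(bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr))
```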