Learning Context-Based Non-local Entropy Modeling for Image Compression
- URL: http://arxiv.org/abs/2005.04661v1
- Date: Sun, 10 May 2020 13:28:18 GMT
- Title: Learning Context-Based Non-local Entropy Modeling for Image Compression
- Authors: Mu Li, Kai Zhang, Wangmeng Zuo, Radu Timofte, David Zhang
- Abstract summary: In this paper, we propose a non-local operation for context modeling by employing the global similarity within the context.
The entropy model is further adopted as the rate loss in a joint rate-distortion optimization.
Because the width of the transforms is essential for training low-distortion models, we finally introduce a U-Net block in the transforms to increase the width with manageable memory consumption and time complexity.
- Score: 140.64888994506313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The entropy of the codes usually serves as the rate loss in the recent
learned lossy image compression methods. Precise estimation of the
probabilistic distribution of the codes plays a vital role in the performance.
However, existing deep learning based entropy modeling methods generally assume
the latent codes are statistically independent or depend on some side
information or local context, which fails to take the global similarity within
the context into account and thus hinders accurate entropy estimation. To
address this issue, we propose a non-local operation for context modeling by
employing the global similarity within the context. Specifically, we first
introduce the proxy similarity functions and spatial masks to handle the
missing reference problem in context modeling. Then, we combine the local and
the global context via a non-local attention block and employ it in masked
convolutional networks for entropy modeling. The entropy model is further
adopted as the rate loss in a joint rate-distortion optimization to guide the
training of the analysis and synthesis transform networks in the
transform coding framework. Considering that the width of the transforms is
essential in training low-distortion models, we finally introduce a U-Net block
in the transforms to increase the width with manageable memory consumption and
time complexity. Experiments on Kodak and Tecnick datasets demonstrate the
superiority of the proposed context-based non-local attention block in entropy
modeling and the U-Net block in low distortion compression against the existing
image compression standards and recent deep image compression models.
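
The masking idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a raster decoding order, a plain dot-product similarity, and hypothetical per-position feature vectors that were themselves computed only from already-decoded codes (standing in for the "proxy" similarity, since the current code is unknown at decoding time). The function name `causal_nonlocal_attention` and its arguments are illustrative.

```python
import math


def causal_nonlocal_attention(features, values):
    """Aggregate global context under a causal (spatial) mask.

    features[i]: hypothetical proxy feature vector for position i.
    values[j]:   vector associated with the already-decoded position j.
    Position i may only attend to positions j < i (raster order).
    """
    n = len(features)
    dim = len(values[0]) if n else 0
    out = []
    for i in range(n):
        if i == 0:
            # Missing reference: nothing has been decoded yet.
            out.append([0.0] * dim)
            continue
        # Dot-product similarity to every earlier position (assumed choice).
        scores = [
            sum(a * b for a, b in zip(features[i], features[j]))
            for j in range(i)
        ]
        # Numerically stable softmax over the unmasked positions only.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append(
            [sum(w * values[j][k] for j, w in enumerate(weights))
             for k in range(dim)]
        )
    return out


features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
values = [[2.0], [4.0], [6.0]]
context = causal_nonlocal_attention(features, values)
```

In this toy input, position 2 is most similar to position 0, so its aggregated context is pulled toward `values[0]`. The value at position 2 itself never influences any output, which is the property a decoder needs.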
Related papers
- Causal Context Adjustment Loss for Learned Image Compression [72.7300229848778]
In recent years, learned image compression (LIC) technologies have surpassed conventional methods notably in terms of rate-distortion (RD) performance.
Most present techniques are VAE-based with an autoregressive entropy model, which notably improves RD performance by utilizing the decoded causal context.
In this paper, we make the first attempt at investigating how to explicitly adjust the causal context with our proposed Causal Context Adjustment loss.
arXiv Detail & Related papers (2024-10-07T09:08:32Z)
- Corner-to-Center Long-range Context Model for Efficient Learned Image Compression [70.0411436929495]
In the framework of learned image compression, the context model plays a pivotal role in capturing the dependencies among latent representations.
We propose the Corner-to-Center transformer-based Context Model (C$^3$M) designed to enhance context and latent predictions.
In addition, to enlarge the receptive field in the analysis and synthesis transformation, we use the Long-range Crossing Attention Module (LCAM) in the encoder/decoder.
arXiv Detail & Related papers (2023-11-29T21:40:28Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Multi-Context Dual Hyper-Prior Neural Image Compression [10.349258638494137]
We propose a Transformer-based nonlinear transform to efficiently capture both local and global information from the input image.
We also introduce a novel entropy model that incorporates two different hyperpriors to model cross-channel and spatial dependencies of the latent representation.
Our experiments show that our proposed framework performs better than the state-of-the-art methods in terms of rate-distortion performance.
arXiv Detail & Related papers (2023-09-19T17:44:44Z)
- Dynamic Kernel-Based Adaptive Spatial Aggregation for Learned Image Compression [63.56922682378755]
We focus on extending spatial aggregation capability and propose a dynamic kernel-based transform coding.
The proposed adaptive aggregation generates kernel offsets to capture valid information in the content-conditioned range to help transform.
Experimental results demonstrate that our method achieves superior rate-distortion performance on three benchmarks compared to the state-of-the-art learning-based methods.
arXiv Detail & Related papers (2023-08-17T01:34:51Z)
- Entroformer: A Transformer-based Entropy Model for Learned Image Compression [17.51693464943102]
We propose a novel transformer-based entropy model, termed Entroformer, to capture long-range dependencies in probability distribution estimation.
The experiments show that the Entroformer achieves state-of-the-art performance on image compression while being time-efficient.
arXiv Detail & Related papers (2022-02-11T08:03:31Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Joint Global and Local Hierarchical Priors for Learned Image Compression [30.44884350320053]
Recently, learned image compression methods have shown superior performance compared to the traditional hand-crafted image codecs.
We propose a novel entropy model called Information Transformer (Informer) that exploits both local and global information in a content-dependent manner.
Our experiments demonstrate that Informer improves rate-distortion performance over the state-of-the-art methods on the Kodak and Tecnick datasets.
arXiv Detail & Related papers (2021-12-08T06:17:37Z)
- Causal Contextual Prediction for Learned Image Compression [36.08393281509613]
We propose the concept of separate entropy coding to leverage a serial decoding process for causal contextual entropy prediction in the latent space.
A causal context model is proposed that separates the latents across channels and makes use of cross-channel relationships to generate highly informative contexts.
We also propose a causal global prediction model, which is able to find global reference points for accurate predictions of unknown points.
arXiv Detail & Related papers (2020-11-19T08:15:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.