Learned Image Compression with Gaussian-Laplacian-Logistic Mixture Model
and Concatenated Residual Modules
- URL: http://arxiv.org/abs/2107.06463v3
- Date: Fri, 9 Feb 2024 19:23:13 GMT
- Title: Learned Image Compression with Gaussian-Laplacian-Logistic Mixture Model
and Concatenated Residual Modules
- Authors: Haisheng Fu and Feng Liang and Jianping Lin and Bing Li and Mohammad
Akbari and Jie Liang and Guohe Zhang and Dong Liu and Chengjie Tu and
Jingning Han
- Abstract summary: Two key components of learned image compression are the entropy model of the latent representations and the encoding/decoding network architectures.
We propose a more flexible discretized Gaussian-Laplacian-Logistic mixture model (GLLMM) for the latent representations.
In the encoding/decoding network design, we propose concatenated residual blocks (CRB), in which multiple residual blocks are serially connected with additional shortcut connections.
- Score: 22.818632387206257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deep learning-based image compression methods have made
significant progress and gradually outperformed traditional approaches
including the latest standard Versatile Video Coding (VVC) in both PSNR and
MS-SSIM metrics. Two key components of learned image compression are the
entropy model of the latent representations and the encoding/decoding network
architectures. Various models have been proposed, such as autoregressive,
softmax, logistic mixture, Gaussian mixture, and Laplacian. Existing schemes
only use one of these models. However, due to the vast diversity of images, it
is not optimal to use one model for all images, even different regions within
one image. In this paper, we propose a more flexible discretized
Gaussian-Laplacian-Logistic mixture model (GLLMM) for the latent
representations, which can adapt to different contents in different images and
different regions of one image more accurately and efficiently, given the same
complexity. In addition, for the encoding/decoding network design, we propose
concatenated residual blocks (CRB), in which multiple residual blocks are serially
connected with additional shortcut connections. The CRB can improve the
learning ability of the network, which can further improve the compression
performance. Experimental results using the Kodak, Tecnick-100 and Tecnick-40
datasets show that the proposed scheme outperforms all the leading
learning-based methods and existing compression standards including VVC intra
coding (4:4:4 and 4:2:0) in terms of the PSNR and MS-SSIM. The source code is
available at \url{https://github.com/fengyurenpingsheng}
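The GLLMM entropy model assigns each quantized latent a probability equal to the mass of a weighted mixture of Gaussian, Laplacian, and logistic components over its quantization bin. The following is a minimal PyTorch-style sketch of such a discretized mixture likelihood; it is not the authors' released code, and the parameter layout (K components per family, a single softmax over all 3*K weights, the name gllmm_likelihood) is an illustrative assumption.

```python
# A minimal sketch (not the authors' released code) of a discretized
# Gaussian-Laplacian-Logistic mixture likelihood.
import torch


def gllmm_likelihood(y, weights, means, scales, bin_size=1.0, eps=1e-9):
    """Probability of quantized latents y under a GLLMM.

    y:       quantized latents, e.g. shape N x C x H x W
    weights: raw mixture weights, shape (*y.shape, 3*K)
    means:   component means,     shape (*y.shape, 3*K)
    scales:  component scales,    shape (*y.shape, 3*K), strictly positive
    Components 0..K-1 are Gaussian, K..2K-1 Laplacian, 2K..3K-1 logistic
    (an assumed ordering for this sketch).
    """
    K = weights.shape[-1] // 3
    y = y.unsqueeze(-1)                      # broadcast against components
    upper, lower = y + bin_size / 2.0, y - bin_size / 2.0

    def gaussian_cdf(x, mu, s):
        return 0.5 * (1.0 + torch.erf((x - mu) / (s * 2.0 ** 0.5)))

    def laplace_cdf(x, mu, b):
        z = (x - mu) / b
        return 0.5 + 0.5 * torch.sign(z) * (1.0 - torch.exp(-torch.abs(z)))

    def logistic_cdf(x, mu, s):
        return torch.sigmoid((x - mu) / s)

    probs = []
    for i, cdf in enumerate((gaussian_cdf, laplace_cdf, logistic_cdf)):
        mu = means[..., i * K:(i + 1) * K]
        s = scales[..., i * K:(i + 1) * K]
        # probability mass of the quantization bin under this family
        probs.append(cdf(upper, mu, s) - cdf(lower, mu, s))
    probs = torch.cat(probs, dim=-1)         # (..., 3*K)

    w = torch.softmax(weights, dim=-1)
    return torch.clamp((w * probs).sum(dim=-1), min=eps)
```

The estimated rate would then be the negative base-2 log of this likelihood summed over all latents, as in other learned-compression entropy models.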
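The CRB chains several residual blocks in series and adds an extra shortcut connection around the chain. The sketch below shows one plausible reading of that design; the channel width, number of inner blocks, and activation choice are assumptions, not the paper's exact configuration.

```python
# A minimal sketch of concatenated residual blocks (CRB): several residual
# blocks in series with an additional shortcut around the whole chain.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)              # inner shortcut


class ConcatenatedResidualBlock(nn.Module):
    def __init__(self, channels=192, num_blocks=3):
        super().__init__()
        self.blocks = nn.ModuleList(
            [ResidualBlock(channels) for _ in range(num_blocks)]
        )

    def forward(self, x):
        out = x
        for block in self.blocks:
            out = block(out)                 # serially connected blocks
        return out + x                       # additional outer shortcut


# usage: crb = ConcatenatedResidualBlock(192); y = crb(torch.randn(1, 192, 16, 16))
```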
Related papers
- Exploiting Inter-Image Similarity Prior for Low-Bitrate Remote Sensing Image Compression [10.427300958330816]
We propose a codebook-based RS image compression (Code-RSIC) method with a generated discrete codebook.
The code significantly outperforms state-of-the-art traditional and learning-based image compression algorithms in terms of perception quality.
arXiv Detail & Related papers (2024-07-17T03:33:16Z) - You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies LIC and masked image modeling (MIM) end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z) - Exploring Effective Mask Sampling Modeling for Neural Image Compression [171.35596121939238]
Most existing neural image compression methods rely on side information from hyperprior or context models to eliminate spatial redundancy.
Inspired by the mask sampling modeling in recent self-supervised learning methods for natural language processing and high-level vision, we propose a novel pretraining strategy for neural image compression.
Our method achieves competitive performance with lower computational complexity compared to state-of-the-art image compression methods.
arXiv Detail & Related papers (2023-06-09T06:50:20Z) - Binarized Spectral Compressive Imaging [59.18636040850608]
Existing deep learning models for hyperspectral image (HSI) reconstruction achieve good performance but require powerful hardware with enormous memory and computational resources.
We propose a novel method, Binarized Spectral-Redistribution Network (BiSRNet)
BiSRNet is derived by using the proposed techniques to binarize the base model.
arXiv Detail & Related papers (2023-05-17T15:36:08Z) - Lossy Image Compression with Conditional Diffusion Models [25.158390422252097]
This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models.
In contrast to VAE-based neural compression, where the (mean) decoder is a deterministic neural network, our decoder is a conditional diffusion model.
Our approach yields stronger reported FID scores than the GAN-based model, while also yielding competitive performance with VAE-based models in several distortion metrics.
arXiv Detail & Related papers (2022-09-14T21:53:27Z) - Spatial-Separated Curve Rendering Network for Efficient and
High-Resolution Image Harmonization [59.19214040221055]
We propose a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient and high-resolution image harmonization.
The proposed method reduces more than 90% parameters compared with previous methods.
Our method can work smoothly on higher-resolution images in real time, which is more than 10$\times$ faster than the existing methods.
arXiv Detail & Related papers (2021-09-13T07:20:16Z) - Variable-Rate Deep Image Compression through Spatially-Adaptive Feature
Transform [58.60004238261117]
We propose a versatile deep image compression network based on Spatial Feature Transform (SFT arXiv:1804.02815)
Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps.
The proposed framework allows us to perform task-aware image compressions for various tasks.
arXiv Detail & Related papers (2021-08-21T17:30:06Z) - Lossless Compression with Latent Variable Models [4.289574109162585]
We use latent variable models with a coding scheme called 'bits back with asymmetric numeral systems' (BB-ANS).
The method involves interleaving encode and decode steps, and achieves an optimal rate when compressing batches of data.
We describe 'Craystack', a modular software framework which we have developed for rapid prototyping of compression using deep generative models.
arXiv Detail & Related papers (2021-04-21T14:03:05Z) - Learned Multi-Resolution Variable-Rate Image Compression with
Octave-based Residual Blocks [15.308823742699039]
We propose a new variable-rate image compression framework, which employs generalized octave convolutions (GoConv) and generalized octave transposed-convolutions (GoTConv)
To enable a single model to operate with different bit rates and to learn multi-rate image features, a new objective function is introduced.
Experimental results show that the proposed framework trained with variable-rate objective function outperforms the standard codecs such as H.265/HEVC-based BPG and state-of-the-art learning-based variable-rate methods.
arXiv Detail & Related papers (2020-12-31T06:26:56Z) - Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z) - Deep Learning-based Image Compression with Trellis Coded Quantization [13.728517700074423]
We propose to incorporate trellis coded quantizer (TCQ) into a deep learning based image compression framework.
A soft-to-hard strategy is applied to allow for back propagation during training.
We develop a simple image compression model that consists of three networks (encoder, decoder, and entropy estimation) and optimize all of the components in an end-to-end manner.
arXiv Detail & Related papers (2020-01-26T08:00:04Z)