You Can Mask More For Extremely Low-Bitrate Image Compression
- URL: http://arxiv.org/abs/2306.15561v1
- Date: Tue, 27 Jun 2023 15:36:22 GMT
- Title: You Can Mask More For Extremely Low-Bitrate Image Compression
- Authors: Anqi Li, Feng Li, Jiaxin Han, Huihui Bai, Runmin Cong, Chunjie Zhang,
Meng Wang, Weisi Lin, Yao Zhao
- Abstract summary: Learned image compression (LIC) methods have experienced significant progress during recent years.
However, existing LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies masked image modeling (MIM) and LIC end-to-end for extremely low-bitrate compression.
- Score: 80.7692466922499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned image compression (LIC) methods have experienced significant progress
during recent years. However, these methods are primarily dedicated to
optimizing the rate-distortion (R-D) performance at medium and high bitrates (>
0.1 bits per pixel (bpp)), while research on extremely low bitrates is limited.
Moreover, existing methods fail to explicitly explore the image structure and
texture components crucial for image compression, treating them equally
alongside uninformative components in networks. This can cause severe
perceptual quality degradation, especially under low-bitrate scenarios. In this
work, inspired by the success of pre-trained masked autoencoders (MAE) in many
downstream tasks, we propose to rethink its mask sampling strategy from
structure and texture perspectives for high redundancy reduction and
discriminative feature representation, further unleashing the potential of LIC
methods. Therefore, we present a dual-adaptive masking approach (DA-Mask) that
samples visible patches based on the structure and texture distributions of
original images. We combine DA-Mask and pre-trained MAE in masked image
modeling (MIM) as an initial compressor that abstracts informative semantic
context and texture representations. Such a pipeline cooperates well with
LIC networks to achieve further secondary compression while preserving
promising reconstruction quality. Consequently, we propose a simple yet
effective masked compression model (MCM), the first framework that unifies MIM
and LIC end-to-end for extremely low-bitrate image compression. Extensive
experiments have demonstrated that our approach outperforms recent
state-of-the-art methods in R-D performance, visual quality, and downstream
applications, at very low bitrates. Our code is available at
https://github.com/lianqi1008/MCM.git.
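To make the masking idea concrete, below is a rough, hypothetical sketch of dual-adaptive mask sampling: patches are scored by a simple structure cue (Sobel edge density) and a texture cue (local variance), and only the top-scoring patches stay visible for the MAE encoder. The scoring heuristics, names, and ratios are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of dual-adaptive mask sampling in the spirit of DA-Mask.
# The edge/variance scores and all names are illustrative assumptions.
import torch
import torch.nn.functional as F

def dual_adaptive_mask(img: torch.Tensor, patch: int = 16, keep_ratio: float = 0.25):
    """img: (1, 3, H, W) in [0, 1]; returns indices of patches kept visible."""
    gray = img.mean(dim=1, keepdim=True)                      # (1, 1, H, W)
    # Structure cue: Sobel gradient magnitude.
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    grad = (F.conv2d(gray, kx, padding=1) ** 2 +
            F.conv2d(gray, ky, padding=1) ** 2).sqrt()
    structure = F.avg_pool2d(grad, patch).flatten(1)          # per-patch edge density, (1, N)
    # Texture cue: local variance within each patch.
    mean = F.avg_pool2d(gray, patch)
    texture = (F.avg_pool2d(gray ** 2, patch) - mean ** 2).flatten(1)
    # Combine normalized scores; keep the top fraction as visible patches.
    score = structure / (structure.amax() + 1e-8) + texture / (texture.amax() + 1e-8)
    n_keep = max(1, int(keep_ratio * score.shape[1]))
    return score.topk(n_keep, dim=1).indices                  # patches fed to the MAE encoder

# Example: a 256x256 image with 16x16 patches gives 256 tokens; keep_ratio=0.25 keeps 64.
```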
Related papers
- Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates [47.47031054057152]
Generative models have been explored to compress remote-sensing (RS) images into extremely low-bitrate streams.
However, these generative models struggle to reconstruct visually plausible images due to the highly ill-posed nature of extremely low-bitrate image compression.
We propose an image compression framework that utilizes a pre-trained diffusion model with powerful natural image priors to achieve high-realism reconstructions.
arXiv Detail & Related papers (2024-09-03T14:29:54Z)
- MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model [78.4051835615796]
This paper proposes a method called Multimodal Image Semantic Compression (MISC).
It consists of an LMM encoder that extracts the semantic information of the image, a map encoder that locates the regions corresponding to those semantics, an image encoder that generates an extremely compressed bitstream, and a decoder that reconstructs the image from the above information.
It can achieve optimal consistency and perception results while saving roughly 50% of the bitrate, showing strong application potential for the next generation of storage and communication.
arXiv Detail & Related papers (2024-02-26T17:11:11Z)
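As a rough illustration of the four-component layout described in the MISC entry above, here is a hypothetical skeleton showing how the pieces could be wired; every module, name, and signature is an assumption, not the paper's code.

```python
# Hypothetical skeleton of the MISC-style four-component layout;
# all modules here are illustrative stubs, not the paper's implementation.
import torch
import torch.nn as nn

class MISCSketch(nn.Module):
    def __init__(self, lmm_encoder, map_encoder, image_encoder, decoder):
        super().__init__()
        self.lmm_encoder = lmm_encoder      # LMM: image -> semantic description
        self.map_encoder = map_encoder      # locates regions matching each semantic
        self.image_encoder = image_encoder  # produces the extremely compressed bitstream
        self.decoder = decoder              # reconstructs from semantics + maps + bits

    def compress(self, img):
        semantics = self.lmm_encoder(img)
        region_maps = self.map_encoder(img, semantics)
        bitstream = self.image_encoder(img)
        return semantics, region_maps, bitstream

    def decompress(self, semantics, region_maps, bitstream):
        return self.decoder(semantics, region_maps, bitstream)

# Wiring with dummy stand-ins just to show the data flow:
sketch = MISCSketch(
    lmm_encoder=lambda img: "a red car on a snowy street",      # stand-in caption
    map_encoder=lambda img, sem: torch.ones(1, 1, 64, 64),      # stand-in region map
    image_encoder=lambda img: b"\x00" * 32,                     # stand-in 32-byte stream
    decoder=lambda sem, maps, bits: torch.zeros(1, 3, 64, 64),  # stand-in reconstruction
)
parts = sketch.compress(torch.rand(1, 3, 64, 64))
img_hat = sketch.decompress(*parts)
```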
- Extreme Image Compression using Fine-tuned VQGANs [43.43014096929809]
We introduce vector quantization (VQ)-based generative models into the image compression domain.
The codebook learned by the VQGAN model provides strong expressive capacity.
The proposed framework outperforms state-of-the-art codecs in terms of perceptual quality-oriented metrics.
arXiv Detail & Related papers (2023-07-17T06:14:19Z)
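The core of VQGAN-style compression is replacing each latent vector with the index of its nearest codebook entry, so only integer indices need to be entropy-coded. A minimal, illustrative sketch (codebook size and dimensions are assumed):

```python
# Minimal vector-quantization step at the heart of VQ-based compression:
# each latent vector becomes the index of its nearest codebook entry.
import torch

def vq_encode(latents: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """latents: (N, D); codebook: (K, D). Returns (N,) integer indices."""
    dists = torch.cdist(latents, codebook)    # pairwise L2 distances, (N, K)
    return dists.argmin(dim=1)

def vq_decode(indices: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    return codebook[indices]                   # look the vectors back up

codebook = torch.randn(1024, 256)              # K=1024 entries of dim 256 (assumed sizes)
z = torch.randn(64, 256)                       # 64 latent vectors from an encoder
idx = vq_encode(z, codebook)                   # 64 integers, ~10 bits each before entropy coding
z_hat = vq_decode(idx, codebook)
```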
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
However, they are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
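A toy sketch of the latent-metadata idea above: a small latent code is distilled from the raw image, stored alongside the sRGB file, and consumed by a decoder that reconstructs the raw image. All shapes and modules are illustrative assumptions:

```python
# Toy sketch of compact latent metadata for raw reconstruction; shapes,
# modules, and the 4-channel raw packing are assumptions for illustration.
import torch
import torch.nn as nn

class MetadataCodec(nn.Module):
    def __init__(self, latent_dim: int = 64):
        super().__init__()
        self.encode = nn.Sequential(            # raw -> compact metadata latent
            nn.Conv2d(4, 32, 4, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, latent_dim))
        self.decode = nn.Sequential(            # (sRGB, latent) -> raw estimate
            nn.Conv2d(3 + latent_dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 4, 3, padding=1))

    def forward(self, raw, srgb):
        z = self.encode(raw)                                   # a few dozen floats of metadata
        z_map = z[:, :, None, None].expand(-1, -1, *srgb.shape[2:])
        return self.decode(torch.cat([srgb, z_map], dim=1))    # reconstructed raw

raw = torch.rand(1, 4, 64, 64)     # RGGB mosaic packed into 4 channels (assumed)
srgb = torch.rand(1, 3, 64, 64)
raw_hat = MetadataCodec()(raw, srgb)
```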
- Exploring Effective Mask Sampling Modeling for Neural Image Compression [171.35596121939238]
Most existing neural image compression methods rely on side information from hyperprior or context models to eliminate spatial redundancy.
Inspired by the mask sampling modeling in recent self-supervised learning methods for natural language processing and high-level vision, we propose a novel pretraining strategy for neural image compression.
Our method achieves competitive performance with lower computational complexity compared to state-of-the-art image compression methods.
arXiv Detail & Related papers (2023-06-09T06:50:20Z)
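A toy illustration of mask sampling modeling as a pretraining objective: random patch embeddings are hidden and the network is trained to reconstruct exactly those patches, forcing it to model spatial redundancy. This is a generic MAE-style sketch under assumed sizes, not the paper's pretraining strategy:

```python
# Toy masked pretraining objective for a compression encoder: the loss is
# computed only on the hidden patches. Sizes and modules are assumptions.
import torch
import torch.nn as nn

patch_dim, n_patches, mask_ratio = 192, 196, 0.75
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(patch_dim, nhead=4, batch_first=True), num_layers=2)
predict = nn.Linear(patch_dim, patch_dim)
mask_token = nn.Parameter(torch.zeros(1, 1, patch_dim))

tokens = torch.randn(8, n_patches, patch_dim)           # patch embeddings of 8 images
mask = torch.rand(8, n_patches) < mask_ratio            # True where a patch is hidden
inp = torch.where(mask[..., None], mask_token, tokens)  # swap masked patches for the token
recon = predict(encoder(inp))
loss = ((recon - tokens)[mask] ** 2).mean()             # reconstruct only masked patches
loss.backward()
```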
- High-Fidelity Variable-Rate Image Compression via Invertible Activation Transformation [24.379052026260034]
We propose the Invertible Activation Transformation (IAT) module to tackle the issue of high-fidelity fine variable-rate image compression.
IAT and QLevel together give the compression model fine variable-rate control while better maintaining image fidelity.
Our method outperforms the state-of-the-art variable-rate image compression method by a large margin, especially after multiple re-encodings.
arXiv Detail & Related papers (2022-09-12T07:14:07Z)
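In the spirit of IAT, here is a toy, exactly invertible per-channel scale/shift whose parameters are looked up from a quality level (QLevel), which is one way a single model could offer fine variable-rate control; all details are assumptions:

```python
# Toy invertible activation transformation conditioned on a quality level.
# Exact invertibility lets the decoder undo the transform losslessly.
import torch
import torch.nn as nn

class IATSketch(nn.Module):
    def __init__(self, channels: int = 192, n_levels: int = 64):
        super().__init__()
        self.scale = nn.Embedding(n_levels, channels)   # log-scale per quality level
        self.shift = nn.Embedding(n_levels, channels)

    def forward(self, x, qlevel):                       # applied inside the codec
        s = self.scale(qlevel).exp()[:, :, None, None]
        t = self.shift(qlevel)[:, :, None, None]
        return x * s + t

    def inverse(self, y, qlevel):                       # exact inverse at decode time
        s = self.scale(qlevel).exp()[:, :, None, None]
        t = self.shift(qlevel)[:, :, None, None]
        return (y - t) / s

iat = IATSketch()
x = torch.randn(2, 192, 16, 16)
q = torch.tensor([10, 50])                              # per-image quality levels
assert torch.allclose(iat.inverse(iat(x, q), q), x, atol=1e-5)
```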
- Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
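A minimal, illustrative INR compression sketch: overfit a small coordinate MLP to one image, then quantize its weights; in the full pipeline described above, the quantized weights would also be refined with quantization-aware retraining and entropy-coded. Network size and training budget are assumptions:

```python
# Minimal INR compression sketch: the "bitstream" is the quantized weights
# of an MLP fitted to a single image. Illustrative only.
import torch
import torch.nn as nn

img = torch.rand(3, 32, 32)                              # the image to "compress"
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 32),
                        torch.linspace(-1, 1, 32), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)    # (1024, 2) pixel coordinates
target = img.permute(1, 2, 0).reshape(-1, 3)             # (1024, 3) RGB values

inr = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 3))
opt = torch.optim.Adam(inr.parameters(), lr=1e-3)
for _ in range(500):                                     # fit the network to this one image
    opt.zero_grad()
    loss = ((inr(coords) - target) ** 2).mean()
    loss.backward()
    opt.step()

# Crude 8-bit weight quantization; the stored payload is the quantized weights.
with torch.no_grad():
    for p in inr.parameters():
        step = (p.abs().max() / 127).clamp_min(1e-12)
        p.copy_((p / step).round() * step)
recon = inr(coords).reshape(32, 32, 3).permute(2, 0, 1)  # decoded image
```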