Exploring the Limits of Semantic Image Compression at Micro-bits per
Pixel
- URL: http://arxiv.org/abs/2402.13536v1
- Date: Wed, 21 Feb 2024 05:14:30 GMT
- Title: Exploring the Limits of Semantic Image Compression at Micro-bits per
Pixel
- Authors: Jordan Dotzel, Bahaa Kotb, James Dotzel, Mohamed Abdelfattah, Zhiru
Zhang
- Abstract summary: We use GPT-4V and DALL-E3 from OpenAI to explore the quality-compression frontier for image compression.
We push semantic compression as low as 100 $\mu$bpp (up to $10,000\times$ smaller than JPEG) by introducing an iterative reflection process.
We further hypothesize this 100 $\mu$bpp level represents a soft limit on semantic compression at standard image resolutions.
- Score: 8.518076792914039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional methods, such as JPEG, perform image compression by operating on
structural information, such as pixel values or frequency content. These
methods are effective to bitrates around one bit per pixel (bpp) and higher at
standard image sizes. In contrast, text-based semantic compression directly
stores concepts and their relationships using natural language, which has
evolved with humans to efficiently represent these salient concepts. These
methods can operate at extremely low bitrates by disregarding structural
information like location, size, and orientation. In this work, we use GPT-4V
and DALL-E3 from OpenAI to explore the quality-compression frontier for image
compression and identify the limitations of current technology. We push
semantic compression as low as 100 $\mu$bpp (up to $10,000\times$ smaller than
JPEG) by introducing an iterative reflection process to improve the decoded
image. We further hypothesize this 100 $\mu$bpp level represents a soft limit
on semantic compression at standard image resolutions.
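To make the bitrates in the abstract concrete, the following is a minimal arithmetic sketch, not code from the paper. It assumes a 1024 x 1024 image as the "standard" resolution (the abstract does not name one) and takes 1 bpp as the JPEG reference point mentioned above. The function and variable names are illustrative only.

```python
def budget_bits(bpp: float, width: int, height: int) -> float:
    """Total bits available for an image at a given bits-per-pixel rate."""
    return bpp * width * height


# Assumed resolution; the abstract only says "standard image resolutions".
WIDTH, HEIGHT = 1024, 1024

SEMANTIC_BPP = 100e-6  # 100 micro-bits per pixel, from the abstract
JPEG_BPP = 1.0         # "around one bit per pixel", from the abstract

semantic_bits = budget_bits(SEMANTIC_BPP, WIDTH, HEIGHT)
jpeg_bits = budget_bits(JPEG_BPP, WIDTH, HEIGHT)
ratio = jpeg_bits / semantic_bits

print(f"semantic budget: {semantic_bits:.1f} bits ({semantic_bits / 8:.1f} bytes)")
print(f"JPEG budget:     {jpeg_bits:.0f} bits ({jpeg_bits / 8:.0f} bytes)")
print(f"size ratio:      {ratio:.0f}x")
```

Under these assumptions the semantic budget is about 105 bits, or roughly 13 bytes, for the whole image, which is consistent with the quoted factor of $10,000\times$ relative to a 1 bpp JPEG. A budget that small leaves room for only a handful of words, which may help explain why the authors treat 100 $\mu$bpp as a soft limit.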
Related papers
- MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model [78.4051835615796]
This paper proposes a method called Multimodal Image Semantic Compression.
It consists of an LMM encoder that extracts the semantic information of the image, a map encoder that locates the regions corresponding to those semantics, an image encoder that generates an extremely compressed bitstream, and a decoder that reconstructs the image from the above information.
It can achieve optimal consistency and perception results while saving 50% bitrate, which has strong potential applications in the next generation of storage and communication.
arXiv Detail & Related papers (2024-02-26T17:11:11Z)
- Perceptual Image Compression with Cooperative Cross-Modal Side Information [53.356714177243745]
We propose a novel deep image compression method with text-guided side information to achieve a better rate-perception-distortion tradeoff.
Specifically, we employ the CLIP text encoder and an effective Semantic-Spatial Aware block to fuse the text and image features.
arXiv Detail & Related papers (2023-11-23T08:31:11Z)
- Towards image compression with perfect realism at ultra-low bitrates [28.511327714128413]
We dub our model PerCo for 'perceptual compression', and compare it to state-of-the-art codecs at rates from 0.1 down to 0.003 bits per pixel.
We find that our model leads to reconstruction with state-of-the-art visual quality as measured by FID and KID.
arXiv Detail & Related papers (2023-10-16T12:08:35Z)
- You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies masked image modeling (MIM) and LIC end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z)
- Random-Access Neural Compression of Material Textures [1.2971248363246106]
We propose a novel neural compression technique specifically designed for material textures.
We unlock two more levels of detail, i.e., 16x more texels, using low-bitrate compression.
Our method allows on-demand, real-time decompression with random access, enabling compression on disk and memory.
arXiv Detail & Related papers (2023-05-26T17:16:22Z)
- COIN: COmpression with Implicit Neural representations [64.02694714768691]
We propose a new simple approach for image compression.
Instead of storing the RGB values for each pixel of an image, we store the weights of a neural network overfitted to the image.
arXiv Detail & Related papers (2021-03-03T10:58:39Z)
- How to Exploit the Transferability of Learned Image Compression to Conventional Codecs [25.622863999901874]
We show how learned image coding can be used as a surrogate to optimize an image for encoding.
Our approach can remodel a conventional image coder to adjust for the MS-SSIM distortion with over 20% rate improvement without any decoding overhead.
arXiv Detail & Related papers (2020-12-03T12:34:51Z)
- Lossy Image Compression with Normalizing Flows [19.817005399746467]
State-of-the-art solutions for deep image compression typically employ autoencoders which map the input to a lower dimensional latent space.
In contrast, traditional approaches in image compression allow for a larger range of quality levels.
arXiv Detail & Related papers (2020-08-24T14:46:23Z)
- Quantization Guided JPEG Artifact Correction [69.04777875711646]
We develop a novel architecture for artifact correction using the JPEG file's quantization matrix.
This allows our single model to achieve state-of-the-art performance over models trained for specific quality settings.
arXiv Detail & Related papers (2020-04-17T00:10:08Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.