Towards image compression with perfect realism at ultra-low bitrates
- URL: http://arxiv.org/abs/2310.10325v2
- Date: Tue, 19 Mar 2024 09:54:41 GMT
- Title: Towards image compression with perfect realism at ultra-low bitrates
- Authors: Marlène Careil, Matthew J. Muckley, Jakob Verbeek, Stéphane Lathuilière
- Abstract summary: We dub our model PerCo for 'perceptual compression', and compare it to state-of-the-art codecs at rates from 0.1 down to 0.003 bits per pixel.
We find that our model leads to reconstructions with state-of-the-art visual quality as measured by FID and KID.
- Score: 28.511327714128413
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Image codecs are typically optimized to trade off bitrate vs. distortion metrics. At low bitrates, this leads to compression artefacts which are easily perceptible, even when training with perceptual or adversarial losses. To improve image quality and remove dependency on the bitrate, we propose to decode with iterative diffusion models. We condition the decoding process on a vector-quantized image representation, as well as a global image description to provide additional context. We dub our model PerCo for 'perceptual compression', and compare it to state-of-the-art codecs at rates from 0.1 down to 0.003 bits per pixel. The latter rate is more than an order of magnitude smaller than those considered in most prior work, compressing a 512x768 Kodak image with less than 153 bytes. Despite this ultra-low bitrate, our approach maintains the ability to reconstruct realistic images. We find that our model leads to reconstructions with state-of-the-art visual quality as measured by FID and KID. As predicted by rate-distortion-perception theory, visual quality is less dependent on the bitrate than for previous methods.
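The byte count quoted in the abstract follows from simple arithmetic: size in bytes = width x height x bits per pixel / 8. The sketch below is only an illustration of that conversion, not code from the paper. The helper name `compressed_size_bytes` is hypothetical, and it assumes the nominal (rounded) rates from the abstract with no container or header overhead.

```python
# Hypothetical helper, not from the paper: converts a nominal rate in
# bits per pixel (bpp) into a payload size in bytes.
def compressed_size_bytes(width: int, height: int, bits_per_pixel: float) -> float:
    total_bits = width * height * bits_per_pixel
    return total_bits / 8  # 8 bits per byte


if __name__ == "__main__":
    # A 512x768 Kodak image has 393,216 pixels regardless of orientation.
    for bpp in (0.1, 0.003):
        size = compressed_size_bytes(768, 512, bpp)
        print(f"{bpp} bpp -> {size:.1f} bytes")
```

At the rounded rate of 0.003 bpp this gives about 147 bytes, consistent with the "less than 153 bytes" figure; 153 bytes itself corresponds to roughly 0.0031 bpp, so the quoted rate is presumably rounded.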
Related papers
- Map-Assisted Remote-Sensing Image Compression at Extremely Low Bitrates [47.47031054057152]
Generative models have been explored to compress RS images into extremely low-bitrate streams.
These generative models struggle to reconstruct visually plausible images due to the highly ill-posed nature of extremely low-bitrate image compression.
We propose an image compression framework that utilizes a pre-trained diffusion model with powerful natural image priors to achieve high-realism reconstructions.
arXiv Detail & Related papers (2024-09-03T14:29:54Z)
- Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders [7.485128109817576]
We show that conditional diffusion models can lead to promising results in the generative compression task when used as a decoder.
arXiv Detail & Related papers (2024-03-05T11:48:35Z)
- MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model [78.4051835615796]
This paper proposes a method called Multimodal Image Semantic Compression.
It consists of an LMM encoder that extracts the semantic information of the image, a map encoder that locates the regions corresponding to the semantics, an image encoder that generates an extremely compressed bitstream, and a decoder that reconstructs the image from the above information.
It can achieve optimal consistency and perception results while saving 50% of the bitrate, which gives it strong potential for applications in next-generation storage and communication.
arXiv Detail & Related papers (2024-02-26T17:11:11Z)
- Extreme Image Compression using Fine-tuned VQGANs [43.43014096929809]
We introduce vector quantization (VQ)-based generative models into the image compression domain.
The codebook learned by the VQGAN model yields a strong expressive capacity.
The proposed framework outperforms state-of-the-art codecs in terms of perceptual quality-oriented metrics.
arXiv Detail & Related papers (2023-07-17T06:14:19Z)
- You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies masked image modeling (MIM) and LIC end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z)
- Are Visual Recognition Models Robust to Image Compression? [23.280147529096908]
We analyze the impact of image compression on visual recognition tasks.
We consider a wide range of compression levels, from 0.1 to 2 bits per pixel (bpp).
We find that for all three tasks, the recognition ability is significantly impacted when using strong compression.
arXiv Detail & Related papers (2023-04-10T11:30:11Z)
- PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework [88.18310777246735]
We propose an end-to-end image compression framework that achieves 200 MB/s for both compression and decompression with a single NVIDIA Tesla V100 GPU.
Experiments show that our framework compresses better than PNG by a margin of 30% on multiple datasets.
arXiv Detail & Related papers (2022-06-10T03:00:10Z)
- Learning Scalable $\ell_\infty$-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression [118.89112502350177]
We propose a novel framework for learning $\ell_\infty$-constrained near-lossless image compression.
We derive the probability model of the quantized residual by quantizing the learned probability model of the original residual.
arXiv Detail & Related papers (2021-03-31T11:53:36Z)
- How to Exploit the Transferability of Learned Image Compression to Conventional Codecs [25.622863999901874]
We show how learned image coding can be used as a surrogate to optimize an image for encoding.
Our approach can remodel a conventional image to adjust for the MS-SSIM distortion with over 20% rate improvement without any decoding overhead.
arXiv Detail & Related papers (2020-12-03T12:34:51Z)
- Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is competitive with other neural video compression methods and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.