Extreme Image Compression using Fine-tuned VQGANs
- URL: http://arxiv.org/abs/2307.08265v3
- Date: Fri, 15 Dec 2023 14:39:13 GMT
- Title: Extreme Image Compression using Fine-tuned VQGANs
- Authors: Qi Mao, Tinghan Yang, Yinuo Zhang, Zijian Wang, Meng Wang, Shiqi Wang,
Siwei Ma
- Abstract summary: We introduce vector quantization (VQ)-based generative models into the image compression domain.
The codebook learned by the VQGAN model yields a strong expressive capacity.
The proposed framework outperforms state-of-the-art codecs in terms of perceptual quality-oriented metrics.
- Score: 43.43014096929809
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent advances in generative compression methods have demonstrated
remarkable progress in enhancing the perceptual quality of compressed data,
especially in scenarios with low bitrates. However, their efficacy and
applicability to achieve extreme compression ratios ($<0.05$ bpp) remain
constrained. In this work, we propose a simple yet effective coding framework
by introducing vector quantization (VQ)-based generative models into the image
compression domain. The main insight is that the codebook learned by the VQGAN
model yields a strong expressive capacity, facilitating efficient compression
of continuous information in the latent space while maintaining reconstruction
quality. Specifically, an image can be represented as VQ-indices by finding the
nearest codeword, which can be encoded using lossless compression methods into
bitstreams. We propose clustering a pre-trained large-scale codebook into
smaller codebooks through the K-means algorithm, yielding variable bitrates and
different levels of reconstruction quality within the coding framework.
Furthermore, we introduce a transformer to predict lost indices and restore
images in unstable environments. Extensive qualitative and quantitative
experiments on various benchmark datasets demonstrate that the proposed
framework outperforms state-of-the-art codecs in terms of perceptual
quality-oriented metrics and human perception at extremely low bitrates ($\le
0.04$ bpp). Remarkably, even with the loss of up to $20\%$ of indices, the
images can be effectively restored with minimal perceptual loss.
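The codebook-clustering step lends itself to a compact illustration. Below is a minimal sketch (not the authors' released code) that clusters a stand-in pre-trained codebook with K-means to obtain a smaller codebook, encodes a latent grid as nearest-codeword indices, and computes the resulting raw rate. All sizes (an 8192-entry codebook, 64-dim codewords, a 16x16 latent grid from a 256x256 image) are illustrative assumptions.

```python
# Minimal sketch (illustrative sizes, not the authors' code): K-means
# clustering of a pre-trained VQGAN codebook into a smaller codebook,
# followed by nearest-codeword index encoding of a latent feature grid.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for a pre-trained VQGAN codebook: 8192 codewords of dim 64.
large_codebook = rng.standard_normal((8192, 64)).astype(np.float32)

def cluster_codebook(codebook: np.ndarray, k: int) -> np.ndarray:
    """K-means reduces the codebook to k centroids; fewer codewords
    means fewer bits per index, hence a lower bitrate."""
    km = KMeans(n_clusters=k, n_init=1, random_state=0).fit(codebook)
    return km.cluster_centers_.astype(np.float32)

def encode_indices(latents: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each latent vector to the index of its nearest codeword (L2)."""
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d2.argmin(axis=1)

# A 16x16 grid of latents, e.g. a 256x256 image downsampled 16x by the encoder.
latents = rng.standard_normal((16 * 16, 64)).astype(np.float32)

small_codebook = cluster_codebook(large_codebook, k=1024)  # 10 bits per index
indices = encode_indices(latents, small_codebook)

# 256 indices x 10 bits / (256 x 256 pixels) ~= 0.039 bpp, before the lossless
# entropy coder shrinks the bitstream further.
bpp = indices.size * np.log2(len(small_codebook)) / (256 * 256)
print(f"{indices.size} indices, raw rate ~{bpp:.3f} bpp")
```

Swapping in k=512 or k=2048 changes the bits per index, which is how a single framework can yield variable bitrates and different reconstruction quality levels.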
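For the index-loss scenario, the following is a hedged sketch of how a transformer can impute lost indices: lost positions are replaced by a dedicated token and the model predicts a codebook entry at every position. The architecture, sizes, and the IndexRecoverer name are illustrative assumptions rather than the paper's exact model, and the network would need training before its predictions are useful.

```python
# Hedged sketch: a small bidirectional transformer predicts lost VQ-indices
# from the surviving ones (sizes and design are illustrative assumptions).
import torch
import torch.nn as nn

class IndexRecoverer(nn.Module):
    def __init__(self, vocab: int = 1024, dim: int = 256, seq_len: int = 256):
        super().__init__()
        self.emb = nn.Embedding(vocab + 1, dim)   # extra id marks a lost index
        self.pos = nn.Parameter(torch.zeros(1, seq_len, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, vocab)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        h = self.encoder(self.emb(idx) + self.pos)
        return self.head(h)                       # logits over the codebook

vocab, lost_id = 1024, 1024
idx = torch.randint(0, vocab, (1, 256))           # 16x16 grid, flattened
mask = torch.rand(1, 256) < 0.2                   # simulate 20% lost indices
corrupted = idx.masked_fill(mask, lost_id)

model = IndexRecoverer(vocab)                     # untrained here; would be
logits = model(corrupted)                         # trained to predict masks
restored = torch.where(mask, logits.argmax(-1), idx)  # fill only lost slots
```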
Related papers
- DeepHQ: Learned Hierarchical Quantizer for Progressive Deep Image Coding [27.875207681547074]
Progressive image coding (PIC) aims to compress an image at various quality levels into a single bitstream.
Research on neural network (NN)-based PIC is in its early stages.
We propose an NN-based progressive coding method that first uses a learned quantization step size for each quantization layer (a learned-step quantizer sketch follows this list).
arXiv Detail & Related papers (2024-08-22T06:32:53Z)
- Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption [57.056311855630916]
We propose a Controllable Generative Image Compression framework, Control-GIC.
It is capable of fine-grained granularity adaption across a broad bitrate spectrum while ensuring high-fidelity, general-purpose compression.
We develop a conditional decoder that can trace back to historic encoded multi-granularity representations.
arXiv Detail & Related papers (2024-06-02T14:22:09Z)
- Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer [35.500720262253054]
This paper introduces a novel Unified Image Generation-Compression (UIGC) paradigm, merging the processes of generation and compression.
A key feature of the UIGC framework is the adoption of vector-quantized (VQ) image models for tokenization.
Experiments demonstrate the superiority of the proposed UIGC framework over existing codecs in perceptual quality and human perception.
arXiv Detail & Related papers (2024-03-06T14:27:02Z)
- Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders [7.485128109817576]
We show that conditional diffusion models can lead to promising results in the generative compression task when used as a decoder.
arXiv Detail & Related papers (2024-03-05T11:48:35Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a semantic ensemble loss integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss (a loss-combination sketch follows this list).
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs, including quantization, quantization-aware retraining, and entropy coding (a minimal INR-fitting sketch follows this list).
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
- Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images.
We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
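As referenced in the DeepHQ item above, a common way to learn a quantization step size per layer is to make the step a trainable parameter and route gradients through the rounding with a straight-through estimator. The sketch below shows that generic pattern; it is an assumption for illustration, and DeepHQ's exact quantizer design may differ.

```python
# Generic learned-step quantizer sketch (an assumption, not DeepHQ's exact
# design): the step size is trainable and rounding uses a straight-through
# estimator so gradients reach both the input and the step.
import torch
import torch.nn as nn

class LearnedStepQuant(nn.Module):
    def __init__(self, init_step: float = 1.0):
        super().__init__()
        # log-parameterization keeps the step strictly positive
        self.log_step = nn.Parameter(torch.tensor(float(init_step)).log())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        step = self.log_step.exp()
        x_s = x / step
        # forward: round(x_s); backward: identity (straight-through)
        q = x_s + (torch.round(x_s) - x_s).detach()
        return q * step

x = torch.randn(4, 8, requires_grad=True)
y = LearnedStepQuant()(x)
y.sum().backward()  # gradients flow to both x and log_step
```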
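For the semantic ensemble loss item, here is a hedged sketch of combining Charbonnier, VGG-feature perceptual, Gram-matrix style, and adversarial terms into one objective. The weights, the VGG layer cut, and the simple non-saturating adversarial stand-in are illustrative assumptions; the paper's non-binary adversarial loss is not reproduced here.

```python
# Hedged sketch of a semantic ensemble loss (weights and layer choices are
# illustrative assumptions, not the paper's values).
import torch
import torch.nn as nn
from torchvision.models import vgg16

def charbonnier(x, y, eps=1e-3):
    # smooth L1-like distortion term
    return torch.sqrt((x - y) ** 2 + eps ** 2).mean()

def gram(f):
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

class EnsembleLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # weights=None keeps the sketch offline; use ImageNet weights in practice
        self.vgg = vgg16(weights=None).features[:16].eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)

    def forward(self, rec, ref, disc_logits):
        fr, ff = self.vgg(rec), self.vgg(ref)
        return (charbonnier(rec, ref)                        # distortion
                + 0.1 * (fr - ff).abs().mean()               # perceptual
                + 0.1 * (gram(fr) - gram(ff)).pow(2).mean()  # style
                - 0.01 * disc_logits.mean())                 # adversarial stand-in

rec, ref = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
loss = EnsembleLoss()(rec, ref, disc_logits=torch.randn(2, 1))
```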
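Finally, for the INR compression item: the core idea is to overfit a small coordinate MLP to one image and then quantize its weights, which become the payload. A minimal sketch follows, assuming a toy 32x32 image and 8-bit post-training quantization; the paper's quantization-aware retraining and entropy coding stages are omitted.

```python
# Minimal INR compression sketch (toy sizes; quantization-aware retraining
# and entropy coding from the paper's pipeline are omitted).
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 3))            # (x, y) -> (r, g, b)

img = torch.rand(32, 32, 3)                      # stand-in image in [0, 1]
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 32),
                        torch.linspace(-1, 1, 32), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)

opt = torch.optim.Adam(mlp.parameters(), lr=1e-3)
for _ in range(200):                             # overfit to this single image
    loss = ((mlp(coords) - img.reshape(-1, 3)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Post-training 8-bit symmetric quantization; the quantized weights (plus
# per-tensor scales) are what would be entropy-coded into the bitstream.
with torch.no_grad():
    for p in mlp.parameters():
        scale = p.abs().max().clamp_min(1e-8) / 127
        p.copy_(torch.round(p / scale) * scale)
```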