Improving Statistical Fidelity for Neural Image Compression with
Implicit Local Likelihood Models
- URL: http://arxiv.org/abs/2301.11189v3
- Date: Fri, 11 Aug 2023 02:21:27 GMT
- Title: Improving Statistical Fidelity for Neural Image Compression with
Implicit Local Likelihood Models
- Authors: Matthew J. Muckley, Alaaeldin El-Nouby, Karen Ullrich, Hervé Jégou, Jakob Verbeek
- Abstract summary: Lossy image compression aims to represent images in as few bits as possible while maintaining fidelity to the original.
We introduce a non-binary discriminator that is conditioned on quantized local image representations obtained via VQ-VAE autoencoders.
- Score: 31.308949268401047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lossy image compression aims to represent images in as few bits as possible
while maintaining fidelity to the original. Theoretical results indicate that
optimizing distortion metrics such as PSNR or MS-SSIM necessarily leads to a
discrepancy in the statistics of original images from those of reconstructions,
in particular at low bitrates, often manifested by the blurring of the
compressed images. Previous work has leveraged adversarial discriminators to
improve statistical fidelity. Yet these binary discriminators adopted from
generative modeling tasks may not be ideal for image compression. In this
paper, we introduce a non-binary discriminator that is conditioned on quantized
local image representations obtained via VQ-VAE autoencoders. Our evaluations
on the CLIC2020, DIV2K and Kodak datasets show that our discriminator is more
effective for jointly optimizing distortion (e.g., PSNR) and statistical
fidelity (e.g., FID) than the PatchGAN of the state-of-the-art HiFiC model. On
CLIC2020, we obtain the same FID as HiFiC with 30-40% fewer bits.
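The paper's key ingredient is a discriminator conditioned on discrete codes from a VQ-VAE. As a rough, hypothetical sketch of the quantization step alone (nearest-codebook assignment in numpy; the function name and array shapes are illustrative, not the authors' implementation):

```python
import numpy as np

def vq_quantize(latents, codebook):
    """Map each local latent vector to its nearest codebook entry (L2 distance).

    latents:  (N, D) array of local image representations.
    codebook: (K, D) array of learned code vectors.
    Returns the discrete code indices and the quantized vectors.
    """
    # Squared L2 distance between every latent and every code vector: (N, K).
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)        # discrete local labels
    return indices, codebook[indices]  # quantized (N, D) vectors

# Toy example: four 2-D latents against a codebook of three entries.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
latents = np.array([[0.1, -0.1], [0.9, 1.2], [-1.2, 0.8], [0.4, 0.7]])
idx, quantized = vq_quantize(latents, codebook)
print(idx.tolist())  # → [0, 1, 2, 1]
```

Conceptually, such discrete local labels give the discriminator a richer, non-binary conditioning signal than a plain real/fake decision.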
Related papers
- Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model [55.2480439325792]
We propose a hybrid compression scheme optimized for perceptual quality, extending the approach of the CDC model with a decoder network. We achieve up to +2 dB PSNR fidelity improvements while maintaining comparable LPIPS and FID perceptual scores when compared with CDC.
arXiv Detail & Related papers (2025-05-19T14:13:14Z)
- Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion [28.61304513668606]
ResULIC is a residual-guided ultra-lowrate image compression system. It incorporates residual signals into both semantic retrieval and the diffusion-based generation process. It achieves superior objective and subjective performance compared to state-of-the-art diffusion-based methods.
arXiv Detail & Related papers (2025-05-13T06:51:23Z)
- Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on an invertible transform to overcome the limitations of fixed-rate models.
Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations.
Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
- A Rate-Distortion-Classification Approach for Lossy Image Compression [0.0]
In lossy image compression, the objective is to achieve minimal signal distortion while compressing images to a specified bit rate.
To bridge the gap between image compression and visual analysis, we propose a Rate-Distortion-Classification (RDC) model for lossy image compression.
arXiv Detail & Related papers (2024-05-06T14:11:36Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Transferable Learned Image Compression-Resistant Adversarial Perturbations [66.46470251521947]
Adversarial attacks can readily disrupt image classification systems, revealing the vulnerability of DNN-based recognition tasks.
We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z)
- Extreme Image Compression using Fine-tuned VQGANs [43.43014096929809]
We introduce vector quantization (VQ)-based generative models into the image compression domain.
The codebook learned by the VQGAN model yields a strong expressive capacity.
The proposed framework outperforms state-of-the-art codecs in terms of perceptual quality-oriented metrics.
arXiv Detail & Related papers (2023-07-17T06:14:19Z)
- Beyond Learned Metadata-based Raw Image Reconstruction [86.1667769209103]
Raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels.
They are not widely adopted by general users due to their substantial storage requirements.
We propose a novel framework that learns a compact representation in the latent space, serving as metadata.
arXiv Detail & Related papers (2023-06-21T06:59:07Z)
- Machine Perception-Driven Image Compression: A Layered Generative Approach [32.23554195427311]
A layered generative image compression model is proposed to achieve high reconstructed image quality oriented toward human vision.
A task-agnostic learning-based compression model is proposed, which effectively supports various compressed-domain analytical tasks.
A joint optimization schedule is adopted to find the best balance among compression ratio, reconstructed image quality, and downstream perception performance.
arXiv Detail & Related papers (2023-04-14T02:12:38Z)
- Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
- Perceptually Optimizing Deep Image Compression [53.705543593594285]
Mean squared error (MSE) and $\ell_p$ norms have largely dominated the measurement of loss in neural networks.
We propose a different proxy approach to optimize image analysis networks against quantitative perceptual models.
arXiv Detail & Related papers (2020-07-03T14:33:28Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
- Saliency Driven Perceptual Image Compression [6.201592931432016]
The paper demonstrates that popular evaluation metrics such as MS-SSIM and PSNR are inadequate for judging the performance of image compression techniques.
A new metric is proposed, which is learned on perceptual similarity data specific to image compression.
The model not only generates images which are visually better but also gives superior performance for subsequent computer vision tasks.
arXiv Detail & Related papers (2020-02-12T13:43:17Z)
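Several entries above contrast distortion metrics (MSE, $\ell_p$ norms, PSNR) with perceptual fidelity. As a minimal, self-contained sketch of how these standard distortion measures relate (illustrative only, not tied to any one paper's evaluation code):

```python
import numpy as np

def lp_distortion(x, y, p=2):
    """Mean l_p distortion between reference x and reconstruction y (p=2 gives MSE)."""
    return float(np.mean(np.abs(x - y) ** p))

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB, a monotone function of the MSE."""
    mse = lp_distortion(x, y, p=2)
    return 10.0 * np.log10(peak ** 2 / mse)

# A reconstruction that is off by exactly 1 gray level everywhere has MSE 1,
# so its PSNR equals 20 * log10(255) ~= 48.13 dB for 8-bit images.
x = np.zeros((8, 8))
y = np.ones((8, 8))
print(round(psnr(x, y), 2))  # → 48.13
```

Because PSNR is a fixed function of MSE, optimizing either alone drives reconstructions toward the blur the main abstract describes; metrics such as FID instead compare the statistics of sets of images, which is the fidelity axis the main paper targets.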
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.