Neural Image Compression: Generalization, Robustness, and Spectral Biases
- URL: http://arxiv.org/abs/2307.08657v2
- Date: Fri, 27 Oct 2023 20:56:51 GMT
- Title: Neural Image Compression: Generalization, Robustness, and Spectral Biases
- Authors: Kelsey Lieberman, James Diffenderfer, Charles Godfrey, and Bhavya Kailkhura
- Abstract summary: Recent advances in neural image compression (NIC) have produced models that are starting to outperform classic codecs.
Successful adoption of any machine learning system in the wild requires it to generalize (and be robust) to unseen distribution shifts.
This paper presents a benchmark suite to evaluate the out-of-distribution performance of image compression methods.
- Score: 16.55855347335981
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in neural image compression (NIC) have produced models that
are starting to outperform classic codecs. While this has led to growing
excitement about using NIC in real-world applications, the successful adoption
of any machine learning system in the wild requires it to generalize (and be
robust) to unseen distribution shifts at deployment. Unfortunately, current
research lacks comprehensive datasets and informative tools to evaluate and
understand NIC performance in real-world settings. To bridge this crucial gap,
first, this paper presents a comprehensive benchmark suite to evaluate the
out-of-distribution (OOD) performance of image compression methods.
Specifically, we provide CLIC-C and Kodak-C by introducing 15 corruptions to
the popular CLIC and Kodak benchmarks. Next, we propose spectrally-inspired
inspection tools to gain deeper insight into errors introduced by image
compression methods as well as their OOD performance. We then carry out a
detailed performance comparison of several classic codecs and NIC variants,
revealing intriguing findings that challenge our current understanding of the
strengths and limitations of NIC. Finally, we corroborate our empirical
findings with theoretical analysis, providing an in-depth view of the OOD
performance of NIC and its dependence on the spectral properties of the data.
Our benchmarks, spectral inspection tools, and findings provide a crucial
bridge to the real-world adoption of NIC. We hope that our work will propel
future efforts in designing robust and generalizable NIC methods. Code and data
will be made available at https://github.com/klieberman/ood_nic.
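One such spectrally-inspired inspection can be sketched as follows: the radially averaged power spectrum of the error image (original minus reconstruction) shows which spatial frequency bands a codec discards. This is an illustrative sketch only, not the paper's released tooling; all names are hypothetical.

```python
import numpy as np

def radial_power_spectrum(img):
    """Radially averaged power spectrum of a 2D (grayscale) image."""
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.sqrt((y - h // 2) ** 2 + (x - w // 2) ** 2).astype(int)
    # average the power over rings of (roughly) equal spatial frequency
    ring_sum = np.bincount(r.ravel(), weights=power.ravel())
    ring_count = np.bincount(r.ravel())
    return ring_sum / np.maximum(ring_count, 1)

# inspect the spectrum of a synthetic compression "error image"
rng = np.random.default_rng(0)
original = rng.random((64, 64))
reconstruction = original + 0.01 * rng.standard_normal((64, 64))
spectrum = radial_power_spectrum(original - reconstruction)
```

A codec that mostly removes fine detail would show the error power concentrated in the high-frequency (large-radius) bins of `spectrum`.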
Related papers
- Using DUCK-Net for Polyp Image Segmentation [0.0]
"DUCK-Net" is capable of effectively learning and generalizing from small numbers of medical images to perform accurate segmentation tasks.
We demonstrate its capabilities specifically for polyp segmentation in colonoscopy images.
arXiv Detail & Related papers (2023-11-03T20:58:44Z)
- Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise, and pointwise convolution) in efficient architectures.
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
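The claim that SSC generalizes these layers can be illustrated by viewing each as a sparsity pattern imposed on a dense convolution kernel of shape (C_out, C_in, k, k). The masks below are an illustrative sketch, not the authors' implementation.

```python
import numpy as np

c_out, c_in, k = 8, 8, 3
full = np.ones((c_out, c_in, k, k))  # dense kernel: all weights free

# pointwise (1x1) convolution: only the kernel centre is non-zero
pointwise_mask = np.zeros_like(full)
pointwise_mask[:, :, k // 2, k // 2] = 1

# depthwise convolution: output channel i only sees input channel i
depthwise_mask = np.zeros_like(full)
for i in range(c_out):
    depthwise_mask[i, i % c_in] = 1

# groupwise convolution (2 groups): block-diagonal channel connectivity
groups = 2
groupwise_mask = np.zeros_like(full)
g_out, g_in = c_out // groups, c_in // groups
for g in range(groups):
    groupwise_mask[g * g_out:(g + 1) * g_out, g * g_in:(g + 1) * g_in] = 1

# each masked kernel has far fewer free parameters than the dense one
param_counts = [int(m.sum()) for m in (pointwise_mask, depthwise_mask, groupwise_mask, full)]
```

Each special case is just a different mask; a structured-sparse layer that learns (or fixes) the mask therefore subsumes all three.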
arXiv Detail & Related papers (2022-10-23T18:37:22Z)
- Multi-Sample Training for Neural Image Compression [11.167668701825134]
Current state-of-the-art (SOTA) methods adopt a uniform posterior to approximate quantization noise and a single-sample pathwise estimator to approximate the gradient of the evidence lower bound (ELBO).
We propose to train NIC with multiple-sample importance weighted autoencoder (IWAE) target, which is tighter than ELBO and converges to log likelihood as sample size increases.
Our MS-NIC is plug-and-play, and can be easily extended to other neural compression tasks.
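For reference, the K-sample importance-weighted objective mentioned above is the standard IWAE bound:

```latex
\mathcal{L}_K \;=\; \mathbb{E}_{z_1,\dots,z_K \sim q(z \mid x)}
\left[\, \log \frac{1}{K} \sum_{k=1}^{K} \frac{p(x, z_k)}{q(z_k \mid x)} \,\right],
\qquad
\mathrm{ELBO} \;=\; \mathcal{L}_1 \;\le\; \mathcal{L}_K \;\le\; \log p(x),
```

with \(\mathcal{L}_K \to \log p(x)\) as \(K \to \infty\), which is the tightness property the summary refers to.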
arXiv Detail & Related papers (2022-09-28T04:42:02Z)
- Flexible Neural Image Compression via Code Editing [8.499248314440557]
Neural image compression (NIC) has outperformed traditional image codecs in rate-distortion (R-D) performance.
However, it usually requires a dedicated encoder-decoder pair for each point on the R-D curve, which greatly hinders practical deployment.
We propose Code Editing, a highly flexible coding method for NIC based on semi-amortized inference and adaptive quantization.
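Semi-amortized inference of the kind mentioned can be sketched as: take the amortized encoder's output as an initialization, then refine the latent per image by gradient descent on the distortion. The toy linear decoder below is purely illustrative and is not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy setup: a fixed linear "decoder" W maps a latent z to an image
d_latent, d_image = 4, 16
W = rng.standard_normal((d_image, d_latent))
x = rng.standard_normal(d_image)               # image to compress

# amortized step: a hypothetical encoder gives a rough initial latent
z = np.linalg.lstsq(W, x, rcond=None)[0] + 0.5 * rng.standard_normal(d_latent)
init_distortion = np.sum((W @ z - x) ** 2)

# semi-amortized step: refine z for this specific image
lr = 0.005
for _ in range(500):
    grad = 2 * W.T @ (W @ z - x)               # gradient of ||Wz - x||^2
    z -= lr * grad

refined_distortion = np.sum((W @ z - x) ** 2)
```

A real system would minimize a full rate-distortion loss and re-quantize the refined latent; the point is only that per-image refinement can close the gap left by a purely amortized encoder.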
arXiv Detail & Related papers (2022-09-19T09:41:43Z)
- The Devil Is in the Details: Window-based Attention for Image Compression [58.1577742463617]
Most existing learned image compression models are based on Convolutional Neural Networks (CNNs).
In this paper, we study the effects of several attention mechanisms for local feature learning, then introduce a straightforward yet effective window-based local attention block.
The proposed window-based attention is very flexible and can work as a plug-and-play component to enhance CNN and Transformer models.
arXiv Detail & Related papers (2022-03-16T07:55:49Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- Self-Denoising Neural Networks for Few Shot Learning [66.38505903102373]
We present a new training scheme that adds noise at multiple stages of an existing neural architecture while simultaneously learning to be robust to this added noise.
This architecture, which we call a Self-Denoising Neural Network (SDNN), can be applied easily to most modern convolutional neural architectures.
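The multi-stage noise-injection idea can be sketched as follows: Gaussian noise is added to intermediate activations during training so the network learns to denoise its own features. This toy forward pass is illustrative only, not the SDNN release.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_forward(x, weights, sigma=0.1, train=True):
    """Toy MLP forward pass that injects noise after every layer."""
    h = x
    for W in weights:
        h = np.maximum(0, h @ W)                           # linear layer + ReLU
        if train:
            h = h + sigma * rng.standard_normal(h.shape)   # per-stage noise
    return h

weights = [rng.standard_normal((8, 8)) * 0.3 for _ in range(3)]
x = rng.standard_normal((2, 8))
out_train = noisy_forward(x, weights, train=True)   # stochastic
out_eval = noisy_forward(x, weights, train=False)   # deterministic
```

At evaluation time the noise is switched off, so inference is deterministic while the learned robustness to perturbed features remains.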
arXiv Detail & Related papers (2021-10-26T03:28:36Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization gives neural image compression (NIC) superior lossy compression performance.
However, distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We formulate the essential mathematical functions that describe the R-D behavior of NIC using deep networks and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z) - On the Impact of Lossy Image and Video Compression on the Performance of
Deep Convolutional Neural Network Architectures [17.349420462716886]
This study investigates the impact of commonplace image and video compression techniques on the performance of deep learning architectures.
We examine the impact on performance across five discrete tasks: human pose estimation, semantic segmentation, object detection, action recognition, and monocular depth estimation.
Results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied.
arXiv Detail & Related papers (2020-07-28T15:37:37Z) - Learning End-to-End Lossy Image Compression: A Benchmark [90.35363142246806]
We first conduct a comprehensive literature survey of learned image compression methods.
We describe milestones in cutting-edge learned image-compression methods, review a broad range of existing works, and provide insights into their historical development routes.
By introducing a coarse-to-fine hyperprior model for entropy estimation and signal reconstruction, we achieve improved rate-distortion performance.
arXiv Detail & Related papers (2020-02-10T13:13:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.