Advancing The Rate-Distortion-Computation Frontier For Neural Image
Compression
- URL: http://arxiv.org/abs/2311.12821v1
- Date: Tue, 26 Sep 2023 19:47:31 GMT
- Title: Advancing The Rate-Distortion-Computation Frontier For Neural Image
Compression
- Authors: David Minnen and Nick Johnston
- Abstract summary: The rate-distortion-computation (RDC) study shows that neither floating-point operations (FLOPs) nor runtime is sufficient on its own to accurately rank neural compression methods.
We identify a novel neural compression architecture that yields state-of-the-art RD performance with rate savings of 23.1% over BPG.
- Score: 6.167676495563641
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rate-distortion performance of neural image compression models has
exceeded the state-of-the-art for non-learned codecs, but neural codecs are
still far from widespread deployment and adoption. The largest obstacle is
having efficient models that are feasible on a wide variety of consumer
hardware. Comparative research and evaluation is difficult due to the lack of
standard benchmarking platforms and due to variations in hardware architectures
and test environments. Through our rate-distortion-computation (RDC) study we
demonstrate that neither floating-point operations (FLOPs) nor runtime are
sufficient on their own to accurately rank neural compression methods. We also
explore the RDC frontier, which leads to a family of model architectures with
the best empirical trade-off between computational requirements and RD
performance. Finally, we identify a novel neural compression architecture that
yields state-of-the-art RD performance with rate savings of 23.1% over BPG
(7.0% over VTM and 3.0% over ELIC) without requiring significantly more FLOPs
than other learning-based codecs.
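Rate-savings figures like the 23.1% over BPG are conventionally reported in this literature as Bjontegaard delta (BD) rates: the average bitrate difference between two codecs' rate-distortion curves over a shared quality range. Below is a minimal Python sketch of the standard BD-rate computation; the RD points are made up for illustration, not taken from the paper.

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average % bitrate change of the test codec vs. the anchor over the
    overlapping PSNR range (negative = the test codec saves rate)."""
    # Fit log-rate as a cubic polynomial of PSNR for each codec.
    p_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    p_t = np.polyfit(psnr_test, np.log(rate_test), 3)
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    # Integrate both fits over the shared quality interval and compare.
    int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
    int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
    return (np.exp((int_t - int_a) / (hi - lo)) - 1.0) * 100.0

# Hypothetical RD points (bits per pixel, PSNR in dB) for two codecs.
bpp_a, psnr_a = [0.25, 0.50, 0.75, 1.00], [30.0, 33.0, 35.0, 36.5]
bpp_t, psnr_t = [0.20, 0.42, 0.65, 0.90], [30.2, 33.1, 35.2, 36.8]
print(f"BD-rate: {bd_rate(bpp_a, psnr_a, bpp_t, psnr_t):+.1f}%")
```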
Related papers
- Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion [13.196774986841469]
We show that by focusing on modeling visual perception rather than the data distribution, we can achieve a good trade-off between visual quality and bit rate.
We do this by optimizing C3, an overfitted image codec, for Wasserstein Distortion (WD) and evaluating the image reconstructions with a human rater study.
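Wasserstein Distortion scores reconstructions by comparing distributions of local features rather than aligned pixels. As a toy illustration of the underlying quantity only (not the paper's full metric), the 1-D Wasserstein-1 distance between two equal-size empirical samples reduces to the mean gap between their sorted values:

```python
import numpy as np

def wasserstein_1d(a, b):
    """W1 between two equal-size empirical samples: mean |sorted difference|."""
    return np.mean(np.abs(np.sort(a) - np.sort(b)))

rng = np.random.default_rng(0)
ref = rng.normal(0.0, 1.0, 1000)   # stand-in "reference" feature samples
rec = rng.normal(0.3, 1.2, 1000)   # stand-in "reconstruction" feature samples
print(f"W1 = {wasserstein_1d(ref, rec):.3f}")
```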
arXiv Detail & Related papers (2024-11-30T15:05:01Z)
- NeurLZ: On Enhancing Lossy Compression Performance based on Error-Controlled Neural Learning for Scientific Data [35.36879818366783]
Large-scale scientific simulations generate massive datasets that pose challenges for storage and I/O.
We propose NeurLZ, a novel cross-field learning-based and error-controlled compression framework for scientific data.
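The "error-controlled" part is the key contract in scientific lossy compression: every reconstructed value must stay within a user-set bound. Below is a hedged sketch of the generic mechanism (store corrections wherever the lossy reconstruction violates the bound); NeurLZ's actual learned pipeline is more involved, and the names here are ours.

```python
import numpy as np

def enforce_error_bound(original, lossy_recon, abs_bound):
    """Return a reconstruction guaranteed to satisfy |error| <= abs_bound,
    plus the number of sparse corrections a codec would have to store."""
    mask = np.abs(original - lossy_recon) > abs_bound
    corrected = np.where(mask, original, lossy_recon)  # exact values where violated
    return corrected, int(mask.sum())

data = np.sin(np.linspace(0.0, 10.0, 10_000))
noisy = data + np.random.default_rng(1).normal(0.0, 2e-3, data.shape)
fixed, n_stored = enforce_error_bound(data, noisy, abs_bound=5e-3)
assert np.max(np.abs(data - fixed)) <= 5e-3
print(f"stored {n_stored} corrections out of {data.size} points")
```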
arXiv Detail & Related papers (2024-09-09T16:48:09Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
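A minimal sketch of such a composite reconstruction loss follows; the weights, and the assumption that the perceptual and style terms use features from a frozen backbone, are ours, and the adversarial term is omitted for brevity.

```python
import torch
import torch.nn.functional as F

def charbonnier(x, y, eps=1e-6):
    """Smooth L1-like reconstruction term."""
    return torch.mean(torch.sqrt((x - y) ** 2 + eps))

def gram(feat):
    """Style statistics: channel-by-channel feature correlations."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def ensemble_loss(x, y, fx, fy, w=(1.0, 0.1, 0.05)):
    """x, y: images; fx, fy: features of x and y from a frozen backbone."""
    return (w[0] * charbonnier(x, y)
            + w[1] * F.mse_loss(fx, fy)               # perceptual term
            + w[2] * F.mse_loss(gram(fx), gram(fy)))  # style term
```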
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- RECOMBINER: Robust and Enhanced Compression with Bayesian Implicit Neural Representations [8.417694229876371]
COMBINER avoids quantization and enables direct optimization of the rate-distortion performance.
We propose Robust and Enhanced COMBINER (RECOMBINER) to overcome COMBINER's limitations.
We show that RECOMBINER achieves competitive results with the best INR-based methods and even outperforms autoencoder-based codecs on low-resolution images.
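COMBINER-style codecs represent an image as the weights of a small network mapping coordinates to colors, then code those weights. A toy sketch of the fitting step only (the Bayesian relative-entropy coding of the weights that makes this a codec is omitted; all sizes are illustrative):

```python
import torch
import torch.nn as nn

class TinyINR(nn.Module):
    """Map (x, y) coordinates in [-1, 1]^2 to RGB values."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, xy):
        return self.net(xy)

H = W = 32
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                        torch.linspace(-1, 1, W), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
target = torch.rand(H * W, 3)         # stand-in image
model = TinyINR()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):                  # overfit to this single image
    opt.zero_grad()
    ((model(coords) - target) ** 2).mean().backward()
    opt.step()
```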
arXiv Detail & Related papers (2023-09-29T12:27:15Z)
- ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression [18.05997169440533]
We propose ConvNeXt-ChARM, an efficient ConvNeXt-based transform coding framework, paired with a compute-efficient channel-wise auto-regressive prior.
We show that ConvNeXt-ChARM brings consistent and significant BD-rate (PSNR) reductions estimated on average to 5.24% and 1.22% over the versatile video coding (VVC) reference encoder (VTM-18.0) and the state-of-the-art learned image compression method SwinT-ChARM.
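Schematically, a channel-wise auto-regressive ("ChARM"-style) entropy model splits the latent into channel slices and predicts each slice's Gaussian parameters from the hyperprior plus the slices already decoded. A hedged sketch with illustrative shapes, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelARM(nn.Module):
    def __init__(self, latent_ch=192, n_slices=4, hyper_ch=128):
        super().__init__()
        self.n_slices = n_slices
        slice_ch = latent_ch // n_slices
        # One parameter network per slice; its input grows with the context.
        self.param_nets = nn.ModuleList(
            nn.Conv2d(hyper_ch + i * slice_ch, 2 * slice_ch, kernel_size=1)
            for i in range(n_slices))

    def forward(self, y, hyper):
        """y: latent (B, C, H, W); hyper: hyperprior features (B, hyper_ch, H, W)."""
        slices, decoded, params = y.chunk(self.n_slices, dim=1), [], []
        for i, net in enumerate(self.param_nets):
            ctx = torch.cat([hyper, *decoded], dim=1)
            mu, sigma = net(ctx).chunk(2, dim=1)
            params.append((mu, F.softplus(sigma)))
            decoded.append(slices[i])  # at decode time: the dequantized slice
        return params
```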
arXiv Detail & Related papers (2023-07-12T11:45:54Z)
- Joint Hierarchical Priors and Adaptive Spatial Resolution for Efficient Neural Image Compression [11.25130799452367]
We propose an absolute image compression transformer (ICT) for neural image compression (NIC).
ICT captures both global and local contexts from the latent representations and better parameterizes the distribution of the quantized latents.
Our framework significantly improves the trade-off between coding efficiency and decoder complexity over the versatile video coding (VVC) reference encoder (VTM-18.0) and the neural SwinT-ChARM.
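One way to read "global and local contexts": combine a convolution (local neighborhood) with self-attention over all spatial positions (global), then map the result to the mean and scale of the quantized latents. A schematic sketch with illustrative sizes, not ICT's actual design:

```python
import torch
import torch.nn as nn

class GlobalLocalContext(nn.Module):
    def __init__(self, ch=192, heads=4):
        super().__init__()
        self.local = nn.Conv2d(ch, ch, kernel_size=3, padding=1)  # local context
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.to_params = nn.Conv2d(2 * ch, 2 * ch, kernel_size=1)

    def forward(self, y):
        b, c, h, w = y.shape
        tokens = y.flatten(2).transpose(1, 2)      # (B, H*W, C)
        g, _ = self.attn(tokens, tokens, tokens)   # global context
        g = g.transpose(1, 2).reshape(b, c, h, w)
        mu, sigma = self.to_params(torch.cat([self.local(y), g], dim=1)).chunk(2, dim=1)
        return mu, sigma
```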
arXiv Detail & Related papers (2023-07-05T13:17:14Z)
- Modality-Agnostic Variational Compression of Implicit Neural Representations [96.35492043867104]
We introduce a modality-agnostic neural compression algorithm based on a functional view of data and parameterised as an Implicit Neural Representation (INR).
Bridging the gap between latent coding and sparsity, we obtain compact latent representations non-linearly mapped to a soft gating mechanism.
After obtaining a dataset of such latent representations, we directly optimise the rate/distortion trade-off in a modality-agnostic space using neural compression.
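A toy reading of the "soft gating" idea: a compact latent code is mapped through a learned layer to per-feature gates that modulate a shared network. The names and sizes below are ours, purely illustrative:

```python
import torch
import torch.nn as nn

class GatedLayer(nn.Module):
    """One hidden layer of a shared network, modulated by a per-datum latent."""
    def __init__(self, dim=64, latent_dim=16):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.gate = nn.Linear(latent_dim, dim)  # latent -> soft (0, 1) gates

    def forward(self, h, z):
        return torch.sigmoid(self.gate(z)) * torch.relu(self.lin(h))

layer = GatedLayer()
h = torch.randn(8, 64)   # shared-network activations
z = torch.randn(8, 16)   # compact latent code for each datum
out = layer(h, z)
```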
arXiv Detail & Related papers (2023-01-23T15:22:42Z)
- SmoothNets: Optimizing CNN architecture design for differentially private deep learning [69.10072367807095]
DP-SGD requires clipping and noising of per-sample gradients.
This reduces model utility compared to non-private training.
We distilled a new model architecture termed SmoothNet, which is characterised by increased robustness to the challenges of DP-SGD training.
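For reference, the DP-SGD step the entry refers to: clip each per-sample gradient to a norm bound C, sum, add Gaussian noise, and average. A from-scratch sketch (in practice a library such as Opacus handles this); hyperparameters are illustrative:

```python
import torch

def dpsgd_step(model, loss_fn, xb, yb, clip_c=1.0, noise_mult=1.1, lr=0.1):
    grads = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xb, yb):                       # per-sample gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        norm = torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))
        scale = (clip_c / (norm + 1e-12)).clamp(max=1.0)  # clip to norm <= C
        for g, p in zip(grads, model.parameters()):
            g.add_(p.grad * scale)
    with torch.no_grad():
        for g, p in zip(grads, model.parameters()):
            noise = torch.randn_like(g) * noise_mult * clip_c
            p -= lr * (g + noise) / len(xb)        # noisy averaged update
```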
arXiv Detail & Related papers (2022-05-09T07:51:54Z)
- An Empirical Analysis of Recurrent Learning Algorithms In Neural Lossy Image Compression Systems [73.48927855855219]
Recent advances in deep learning have resulted in image compression algorithms that outperform JPEG and JPEG 2000 on the standard Kodak benchmark.
In this paper, we perform the first large-scale comparison of recent state-of-the-art hybrid neural compression algorithms.
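Such comparisons boil down to two quantities per image: bits per pixel for rate and PSNR for distortion. For reference, a minimal sketch of both:

```python
import numpy as np

def psnr(orig, recon, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two uint8-range images."""
    mse = np.mean((orig.astype(np.float64) - recon.astype(np.float64)) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def bpp(compressed_num_bytes, height, width):
    """Rate in bits per pixel for a compressed bitstream."""
    return 8.0 * compressed_num_bytes / (height * width)
```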
arXiv Detail & Related papers (2022-01-27T19:47:51Z)
- Classification of COVID-19 in CT Scans using Multi-Source Transfer Learning [91.3755431537592]
We propose the use of Multi-Source Transfer Learning to improve upon traditional Transfer Learning for the classification of COVID-19 from CT scans.
With our multi-source fine-tuning approach, our models outperformed baseline models fine-tuned with ImageNet.
Our best performing model was able to achieve an accuracy of 0.893 and a Recall score of 0.897, outperforming its baseline Recall score by 9.3%.
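Schematically, multi-source transfer learning fine-tunes the same model on several source datasets before the target task. A toy sketch with dummy data standing in for the real datasets; the model and hyperparameters are illustrative, not the paper's setup:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def fine_tune(model, loader, epochs=1, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            torch.nn.functional.cross_entropy(model(x), y).backward()
            opt.step()
    return model

def dummy_loader(n=32):  # stand-in for a real imaging dataset
    return DataLoader(TensorDataset(torch.randn(n, 3 * 64 * 64),
                                    torch.randint(0, 2, (n,))), batch_size=8)

model = torch.nn.Linear(3 * 64 * 64, 2)          # stand-in for a pretrained CNN
for source in [dummy_loader(), dummy_loader()]:  # fine-tune on each source set
    model = fine_tune(model, source)
model = fine_tune(model, dummy_loader())         # finally, the target CT data
```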
arXiv Detail & Related papers (2020-09-22T11:53:06Z)
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
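The generic low-rank idea behind such filter-sharing methods: flatten a conv layer's weights to a matrix and keep a truncated factorization. ALF learns its factors with an autoencoder rather than computing an SVD; this sketch shows only the rank-r core:

```python
import numpy as np

def low_rank_factor(weight, rank):
    """weight: (out_ch, in_ch * k * k). Returns A (out_ch, r), B (r, in_ch * k * k)."""
    U, S, Vt = np.linalg.svd(weight, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank]

W = np.random.randn(64, 3 * 3 * 3)   # 64 filters of a 3x3 conv on RGB input
A, B = low_rank_factor(W, rank=8)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {W.size} -> {A.size + B.size}, relative error: {rel_err:.3f}")
```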
arXiv Detail & Related papers (2020-07-27T09:01:22Z)