Overfitting for Fun and Profit: Instance-Adaptive Data Compression
- URL: http://arxiv.org/abs/2101.08687v1
- Date: Thu, 21 Jan 2021 15:58:58 GMT
- Title: Overfitting for Fun and Profit: Instance-Adaptive Data Compression
- Authors: Ties van Rozendaal, Iris A.M. Huijben, Taco S. Cohen
- Abstract summary: Neural data compression has been shown to outperform classical methods in terms of $RD$ performance.
In this paper we take this concept to the extreme, adapting the full model to a single video, and sending model updates along with the latent representation.
- We demonstrate that full-model adaptation improves $RD$ performance by ~1 dB, with respect to encoder-only finetuning.
- Score: 20.764189960709164
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural data compression has been shown to outperform classical methods in
terms of $RD$ performance, with results still improving rapidly. At a high
level, neural compression is based on an autoencoder that tries to reconstruct
the input instance from a (quantized) latent representation, coupled with a
prior that is used to losslessly compress these latents. Due to limitations on
model capacity and imperfect optimization and generalization, such models will
suboptimally compress test data in general. However, one of the great strengths
of learned compression is that if the test-time data distribution is known and
relatively low-entropy (e.g. a camera watching a static scene, a dash cam in an
autonomous car, etc.), the model can easily be finetuned or adapted to this
distribution, leading to improved $RD$ performance. In this paper we take this
concept to the extreme, adapting the full model to a single video, and sending
model updates (quantized and compressed using a parameter-space prior) along
with the latent representation. Unlike previous work, we finetune not only the
encoder/latents but the entire model, and - during finetuning - take into
account both the effect of model quantization and the additional costs incurred
by sending the model updates. We evaluate an image compression model on
I-frames (sampled at 2 fps) from videos of the Xiph dataset, and demonstrate
that full-model adaptation improves $RD$ performance by ~1 dB, with respect to
encoder-only finetuning.
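The objective described in the abstract can be summarized as a rate-distortion loss whose rate term includes the bits spent on the quantized model updates, not just the latents. Below is a minimal, framework-free sketch of that trade-off; the function and argument names are illustrative and not from the paper:

```python
def instance_adaptive_rd_loss(distortion, latent_bits, model_update_bits, beta=0.01):
    """Rate-distortion objective for instance-adaptive finetuning (schematic).

    Unlike encoder-only finetuning, the rate term counts the bits needed to
    transmit the quantized model updates as well as the latents, so the
    optimizer only overfits the model to the instance where the RD gain
    outweighs the cost of sending the update.
    """
    total_rate = latent_bits + model_update_bits  # bits actually sent to the decoder
    return distortion + beta * total_rate         # beta trades distortion against rate
```

Setting `model_update_bits = 0` recovers the encoder-only baseline, which is why full-model adaptation must explicitly account for the extra rate during finetuning.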
Related papers
- Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning [63.43972993473501]
Token compression expedites the training and inference of Vision Transformers (ViTs)
However, when applied to downstream tasks, compression degrees are mismatched between training and inference stages.
We propose a model arithmetic framework to decouple the compression degrees between the two stages.
arXiv Detail & Related papers (2024-08-13T10:36:43Z)
- Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder [49.01721042973929]
This paper presents a diffusion-based image compression method that employs a privileged end-to-end decoder model as correction.
Experiments demonstrate the superiority of our method in both distortion and perception compared with previous perceptual compression methods.
arXiv Detail & Related papers (2024-04-07T10:57:54Z)
- Extreme Video Compression with Pre-trained Diffusion Models [11.898317376595697]
We present a novel approach to extreme video compression leveraging the predictive power of diffusion-based generative models at the decoder.
The entire video is sequentially encoded to achieve a visually pleasing reconstruction, considering perceptual quality metrics.
Results showcase the potential of exploiting the temporal relations in video data using generative models.
arXiv Detail & Related papers (2024-02-14T04:23:05Z)
- Activations and Gradients Compression for Model-Parallel Training [85.99744701008802]
We study how simultaneous compression of activations and gradients in model-parallel distributed training setup affects convergence.
We find that gradients require milder compression rates than activations.
Experiments also show that models trained with TopK perform well only when compression is also applied during inference.
arXiv Detail & Related papers (2024-01-15T15:54:54Z)
- Lossy Image Compression with Conditional Diffusion Models [25.158390422252097]
This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models.
In contrast to VAE-based neural compression, where the (mean) decoder is a deterministic neural network, our decoder is a conditional diffusion model.
Our approach yields stronger reported FID scores than the GAN-based model, while also yielding competitive performance with VAE-based models in several distortion metrics.
arXiv Detail & Related papers (2022-09-14T21:53:27Z)
- CrAM: A Compression-Aware Minimizer [103.29159003723815]
We propose a new compression-aware minimizer dubbed CrAM that modifies the optimization step in a principled way.
CrAM produces dense models that can be more accurate than standard SGD/Adam-based baselines, while remaining stable under weight pruning.
CrAM can produce sparse models which perform well for transfer learning, and it also works for semi-structured 2:4 pruning patterns supported by GPU hardware.
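The CrAM summary above says the optimization step is modified to anticipate compression. A minimal sketch of that idea, assuming top-k magnitude pruning as the compression operator (all names here are illustrative, not CrAM's actual implementation):

```python
def topk_prune(w, k):
    """Keep the k largest-magnitude entries of w; zero out the rest."""
    keep = set(sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)[:k])
    return [w[i] if i in keep else 0.0 for i in range(len(w))]

def compression_aware_step(w, grad_fn, lr, k):
    """One compression-aware update (schematic): evaluate the gradient at the
    *compressed* weights, but apply the update to the dense weights, steering
    the model toward regions that remain accurate after pruning."""
    g = grad_fn(topk_prune(w, k))                      # gradient at pruned point
    return [wi - lr * gi for wi, gi in zip(w, g)]      # update the dense weights
```

This mirrors the SAM-style structure of compression-aware minimization: the perturbation applied before the gradient evaluation is the compression itself.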
arXiv Detail & Related papers (2022-07-28T16:13:28Z)
- Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression [25.96187914295921]
This paper proposes a powerful entropy model which efficiently captures both spatial and temporal dependencies.
Our entropy model achieves an 18.2% bitrate saving on the UVG dataset compared with H.266 (VTM) at its highest compression ratio.
arXiv Detail & Related papers (2022-07-13T00:03:54Z)
- Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set [14.89208053104896]
We introduce a video compression algorithm based on instance-adaptive learning.
On each video sequence to be transmitted, we finetune a pretrained compression model.
We show that it remains competitive even after reducing the network size by 70%.
arXiv Detail & Related papers (2021-11-19T16:25:34Z)
- Substitutional Neural Image Compression [48.20906717052056]
Substitutional Neural Image Compression (SNIC) is a general approach for enhancing any neural image compression model.
It boosts compression performance toward a flexible distortion metric and enables bit-rate control using a single model instance.
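SNIC, as summarized above, enhances a fixed codec by optimizing in input space rather than touching the model. A minimal sketch of that substitutional idea, assuming a differentiable stand-in for the codec's rate-distortion gradient (the function names are illustrative):

```python
def find_substitute(x, rd_grad, steps=100, lr=0.1):
    """Schematic substitutional search: start from the original image x and
    descend the rate-distortion loss in *input space*, yielding a substitute
    image that the frozen codec compresses more favorably.

    rd_grad stands in for the gradient of a differentiable codec proxy's
    RD loss with respect to the input; it is an assumption of this sketch.
    """
    s = list(x)                                        # substitute, initialized at x
    for _ in range(steps):
        g = rd_grad(s)                                 # input-space RD gradient
        s = [si - lr * gi for si, gi in zip(s, g)]     # gradient-descent update
    return s
```

Because only the input is optimized, a single trained model instance can serve different distortion metrics or bit-rate targets, which is the flexibility the summary refers to.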
arXiv Detail & Related papers (2021-05-16T20:53:31Z)
- Learning Scalable $\ell_\infty$-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression [118.89112502350177]
We propose a novel framework for learning $\ell_\infty$-constrained near-lossless image compression.
We derive the probability model of the quantized residual by quantizing the learned probability model of the original residual.
arXiv Detail & Related papers (2021-03-31T11:53:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.