On Uniform Scalar Quantization for Learned Image Compression
- URL: http://arxiv.org/abs/2309.17051v1
- Date: Fri, 29 Sep 2023 08:23:36 GMT
- Title: On Uniform Scalar Quantization for Learned Image Compression
- Authors: Haotian Zhang, Li Li, Dong Liu
- Abstract summary: We find two factors crucial: the discrepancy between the surrogate and rounding, which leads to train-test mismatch, and the gradient estimation risk due to the surrogate.
Our analyses point to two subtle tricks: one is to set an appropriate lower bound for the variance of the estimated quantized latent distribution, which effectively reduces the train-test mismatch; the other is to use zero-center quantization with partial stop-gradient, which reduces the gradient estimation variance.
Our method with these tricks is verified to outperform the existing practices of quantization surrogates on a variety of representative image compression networks.
- Score: 17.24702997651976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learned image compression possesses a unique challenge when incorporating
non-differentiable quantization into the gradient-based training of the
networks. Several quantization surrogates have been proposed to fulfill the
training, but they were not systematically justified from a theoretical
perspective. We fill this gap by contrasting uniform scalar quantization, the
most widely used category with rounding being its simplest case, and its
training surrogates. In principle, we find two factors crucial: one is the
discrepancy between the surrogate and rounding, leading to train-test mismatch;
the other is gradient estimation risk due to the surrogate, which consists of
bias and variance of the gradient estimation. Our analyses and simulations
imply that there is a tradeoff between the train-test mismatch and the gradient
estimation risk, and the tradeoff varies across different network structures.
Motivated by these analyses, we present a method based on stochastic uniform
annealing, which has an adjustable temperature coefficient to control the
tradeoff. Moreover, our analyses point us to two subtle tricks: one is
to set an appropriate lower bound for the variance parameter of the estimated
quantized latent distribution, which effectively reduces the train-test
mismatch; the other is to use zero-center quantization with partial
stop-gradient, which reduces the gradient estimation variance and thus
stabilizes the training. Our method with these tricks is verified to outperform
the existing practices of quantization surrogates on a variety of
representative image compression networks.
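To make the surrogates and tricks described above concrete, here is a minimal PyTorch-style sketch written from the abstract alone; it is not the authors' code. The exact annealing form, the temperature schedule, the lower-bound value, and the stop-gradient placement are all assumptions for illustration.

```python
import torch

def round_ste(y):
    # Hard rounding with a straight-through estimator: round in the forward
    # pass, pass the gradient through unchanged in the backward pass.
    return y + (torch.round(y) - y).detach()

def additive_uniform_noise(y):
    # Classic training surrogate: add i.i.d. uniform noise U(-0.5, 0.5).
    return y + torch.empty_like(y).uniform_(-0.5, 0.5)

def stochastic_uniform_annealing(y, t):
    # Assumed form of a stochastic-uniform-annealing surrogate: a temperature
    # t in (0, 1] blends the noisy surrogate (t = 1) with hard rounding
    # (t -> 0), exposing the mismatch-vs-gradient-risk tradeoff.
    noisy = y + t * torch.empty_like(y).uniform_(-0.5, 0.5)
    hard = round_ste(y)
    return t * noisy + (1.0 - t) * hard

def bounded_scale(raw_scale, lower_bound=0.11):
    # Trick 1: lower-bound the scale (std) of the estimated quantized-latent
    # distribution; 0.11 is a placeholder value, not taken from the paper.
    return torch.clamp(raw_scale, min=lower_bound)

def zero_center_quantize(y, mu):
    # Trick 2 (assumed placement): quantize the zero-centered latent y - mu
    # and add the mean back, detaching mu inside the rounding path so part of
    # the gradient is stopped, lowering gradient-estimation variance.
    centered = y - mu.detach()
    return round_ste(centered) + mu
```

In training, the temperature t would typically be annealed from 1 toward 0 so that the surrogate gradually approaches test-time rounding as optimization proceeds.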
Related papers
- Improved Quantization Strategies for Managing Heavy-tailed Gradients in
Distributed Learning [20.91559450517002]
It is observed that gradient distributions are heavy-tailed, with outliers significantly influencing the design of compression strategies.
Existing parameter quantization methods experience performance degradation when this heavy-tailed feature is ignored.
We introduce a novel compression scheme specifically engineered for heavy-tailed gradients, which effectively combines truncation with quantization.
arXiv Detail & Related papers (2024-02-02T06:14:31Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods [9.662269016653296]
We consider an individual calibration objective for characterizing the quantiles of the prediction model.
Existing methods largely lack statistical guarantees in terms of individual calibration.
We propose simple nonparametric calibration methods that are agnostic of the underlying prediction model.
arXiv Detail & Related papers (2023-05-20T21:31:51Z)
- Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling.
Deterministic quantization suffers from severe codebook collapse and misalignment with the inference stage, while stochastic quantization suffers from low codebook utilization and a perturbed reconstruction objective.
This paper presents a regularized vector quantization framework that mitigates the above issues effectively by applying regularization from two perspectives.
arXiv Detail & Related papers (2023-03-11T15:20:54Z)
- Multi-Head Multi-Loss Model Calibration [13.841172927454204]
We introduce a form of simplified ensembling that bypasses the costly training and inference of deep ensembles.
Specifically, each head is trained to minimize a weighted Cross-Entropy loss, but the weights are different among the different branches.
We show that the resulting averaged predictions can achieve excellent calibration without sacrificing accuracy on two challenging datasets.
arXiv Detail & Related papers (2023-03-02T09:32:32Z)
- Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
- Quantized Adam with Error Feedback [11.91306069500983]
We present a distributed variant of adaptive gradient method for training deep neural networks.
We incorporate two types of quantization schemes to reduce the communication cost among the workers.
arXiv Detail & Related papers (2020-04-29T13:21:54Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that the simple yet powerful method can significantly improve the generalization ability.
arXiv Detail & Related papers (2020-03-31T06:03:51Z)