DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
- URL: http://arxiv.org/abs/2509.26469v1
- Date: Tue, 30 Sep 2025 16:17:21 GMT
- Title: DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
- Authors: Mohammad Hassan Vali, Tom Bäckström, Arno Solin
- Abstract summary: We propose DiVeQ, which treats quantization as adding an error vector that mimics the quantization distortion, keeping the forward pass hard while letting gradients flow. We also present a space-filling variant (SF-DiVeQ) that assigns inputs to a curve constructed from the lines connecting codewords, resulting in less quantization error and full codebook usage.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vector quantization is common in deep models, yet its hard assignments block gradients and hinder end-to-end training. We propose DiVeQ, which treats quantization as adding an error vector that mimics the quantization distortion, keeping the forward pass hard while letting gradients flow. We also present a space-filling variant (SF-DiVeQ) that assigns inputs to a curve constructed from the lines connecting codewords, resulting in less quantization error and full codebook usage. Both methods train end-to-end without requiring auxiliary losses or temperature schedules. On VQ-VAE compression and VQGAN generation across various data sets, they improve reconstruction and sample quality over alternative quantization approaches.
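The abstract's additive-error framing can be made concrete with a short sketch. The PyTorch snippet below is a minimal illustration (function name, shapes, and the straight-through-style detach are assumptions, not the paper's exact DiVeQ parameterization; SF-DiVeQ's assignment to segments between codewords is not shown): the quantizer's output is exactly a codeword, yet the layer stays transparent to backpropagation.
```python
import torch

def diveq_style_quantize(z, codebook):
    """Minimal sketch of the additive-error view: the forward pass stays
    hard (nearest codeword), and the quantization error is added as a
    detached vector so gradients reach the encoder. Illustrative only;
    DiVeQ's actual error-vector construction differs and also trains the
    codebook without auxiliary losses.

    z:        (batch, dim) encoder outputs
    codebook: (K, dim) codewords
    """
    dists = torch.cdist(z, codebook)   # (batch, K) pairwise distances
    idx = dists.argmin(dim=1)          # hard nearest-codeword assignment
    c = codebook[idx]                  # (batch, dim) selected codewords
    # Quantization written as "z plus an error vector": the forward value
    # equals c exactly, but the detached error is a constant to autograd,
    # so gradients flow straight to z (not to the codebook in this form).
    q = z + (c - z).detach()
    return q, idx

# Hypothetical usage:
codebook = torch.randn(64, 16, requires_grad=True)
z = torch.randn(8, 16, requires_grad=True)
q, _ = diveq_style_quantize(z, codebook)
q.sum().backward()                     # gradients reach z despite the hard forward
```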
Related papers
- BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models [56.504879072674015]
We propose Bit-Plane Decomposition Quantization (BPDQ), which constructs a variable quantization grid via bit-planes and scalar coefficients. BPDQ enables serving Qwen2.5-72B on a single RTX 3090 with 83.85% GSM8K accuracy (vs. 90.83% at 16-bit).
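As an illustration of the bit-plane idea, the following is a generic residual sign-plane decomposition, not BPDQ's actual variable-grid construction:
```python
import torch

def bitplane_decompose(w, num_planes=4):
    """Generic residual sign-plane decomposition for illustration (BPDQ's
    variable-grid construction is more elaborate): approximate a weight
    tensor as w ≈ sum_k alpha_k * B_k with B_k in {-1, 0, +1}."""
    residual = w.clone()
    planes, coeffs = [], []
    for _ in range(num_planes):
        alpha = residual.abs().mean()       # scalar coefficient for this plane
        b = residual.sign()                 # sign plane
        planes.append(b)
        coeffs.append(alpha)
        residual = residual - alpha * b     # peel the plane off the residual
    return planes, coeffs

w = torch.randn(256, 256)
planes, coeffs = bitplane_decompose(w)
w_hat = sum(a * b for a, b in zip(coeffs, planes))
print((w - w_hat).abs().mean())             # error shrinks as num_planes grows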
arXiv Detail & Related papers (2026-02-04T02:54:37Z) - Spherical Leech Quantization for Visual Tokenization and Generation [37.37290605007169]
We present a unified formulation of different non-parametric quantization methods through the lens of lattice coding. In image tokenization and compression tasks, this quantization approach achieves better reconstruction quality across all metrics than BSQ.
arXiv Detail & Related papers (2025-12-16T18:59:57Z) - MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation [74.34220141721231]
We present MPQ-DMv2, an improved Mixed Precision Quantization framework for extremely low-bit Diffusion Models.
arXiv Detail & Related papers (2025-07-06T08:16:50Z) - FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation [55.12070409045766]
Post-training quantization (PTQ) has stood out as a cost-effective and promising model compression paradigm in recent years. Current PTQ methods for Vision Transformers (ViTs) still suffer from significant accuracy degradation, especially under low-bit quantization.
arXiv Detail & Related papers (2025-06-13T07:57:38Z) - Restructuring Vector Quantization with the Rotation Trick [36.03697966463205]
Vector Quantized Variational AutoEncoders (VQ-VAEs) are designed to compress a continuous input to a discrete latent space and reconstruct it with minimal distortion. As vector quantization is non-differentiable, the gradient to the encoder flows around the vector quantization layer rather than through it in a straight-through approximation. We propose a way to propagate gradients through the vector quantization layer of VQ-VAEs.
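A hedged sketch of the rotation trick follows, written from the Householder-style construction R = I - 2rr^T + 2q̂ẑ^T that maps the normalized encoder output onto the normalized codeword; the rotation and rescaling are treated as constants, so the forward value equals the codeword while gradients are rotated rather than copied:
```python
import torch

def rotation_trick(z, q, eps=1e-8):
    """Sketch of the rotation trick: the forward output equals the
    codeword q, but gradients are rotated and rescaled instead of copied
    straight through. Uses R = I - 2 r r^T + 2 q_hat z_hat^T with
    r = (z_hat + q_hat) / |z_hat + q_hat|, with R and the rescaling
    detached. Reconstructed from the paper's description; details may
    differ from its released code."""
    z_norm = z.norm(dim=-1, keepdim=True) + eps
    q_norm = q.norm(dim=-1, keepdim=True) + eps
    z_hat = (z / z_norm).detach()
    q_hat = (q / q_norm).detach()
    r = z_hat + q_hat
    r = (r / (r.norm(dim=-1, keepdim=True) + eps)).detach()
    scale = (q_norm / z_norm).detach()
    rot_z = (z
             - 2 * (r * z).sum(-1, keepdim=True) * r
             + 2 * (z_hat * z).sum(-1, keepdim=True) * q_hat)
    return scale * rot_z   # numerically equal to q in the forward pass
```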
arXiv Detail & Related papers (2024-10-08T23:39:34Z) - Vector Quantization for Deep-Learning-Based CSI Feedback in Massive MIMO Systems [7.934232975873179]
This paper presents a finite-rate deep-learning (DL)-based channel state information (CSI) feedback method for massive multiple-input multiple-output (MIMO) systems.
The presented method provides a finite-bit representation of the latent vector based on a vector-quantized variational autoencoder (VQ-VAE) framework.
arXiv Detail & Related papers (2024-03-12T06:28:41Z) - LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations [0.0]
We introduce learnable lattice vector quantization and demonstrate its effectiveness for learning discrete representations.
Our method, termed LL-VQ-VAE, replaces the vector quantization layer in VQ-VAE with lattice-based discretization.
Compared to VQ-VAE, our method obtains lower reconstruction errors under the same training conditions, trains in a fraction of the time, and uses a constant number of parameters.
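A minimal sketch of lattice-based discretization in this spirit (generic, with an assumed straight-through round; not necessarily LL-VQ-VAE's exact parameterization) also shows why the parameter count stays constant, a dim×dim basis instead of a codebook:
```python
import torch

class LatticeQuantizer(torch.nn.Module):
    """Generic learnable-lattice discretization sketch (not necessarily
    LL-VQ-VAE's exact parameterization): snap z onto the lattice
    {B k : k integer} by rounding coordinates in the basis B, using a
    straight-through estimator for the rounding step."""
    def __init__(self, dim):
        super().__init__()
        self.basis = torch.nn.Parameter(torch.eye(dim))  # learnable basis B

    def forward(self, z):
        coords = z @ torch.linalg.inv(self.basis).T      # coordinates k = B^{-1} z
        hard = coords.round()                            # integer lattice coordinates
        coords = coords + (hard - coords).detach()       # straight-through round
        return coords @ self.basis.T                     # lattice point B k
```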
arXiv Detail & Related papers (2023-10-13T20:03:18Z) - Soft Convex Quantization: Revisiting Vector Quantization with Convex Optimization [40.1651740183975]
We propose Soft Convex Quantization (SCQ) as a direct substitute for Vector Quantization (VQ).
SCQ works like a differentiable convex optimization (DCO) layer.
We demonstrate its efficacy on the CIFAR-10, GTSRB and LSUN datasets.
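One way to realize such a layer, shown purely as an illustrative stand-in for the paper's DCO solver, is to unroll projected gradient descent on the simplex-constrained least-squares problem min_w ||wC - z||^2:
```python
import torch

def project_simplex(w):
    """Euclidean projection of each row of w onto the probability simplex
    (standard sort-based algorithm)."""
    u, _ = torch.sort(w, dim=-1, descending=True)
    css = u.cumsum(dim=-1) - 1.0
    k = torch.arange(1, w.shape[-1] + 1, device=w.device)
    cond = u - css / k > 0
    rho = cond.long().sum(dim=-1, keepdim=True) - 1   # last index where cond holds
    tau = css.gather(-1, rho) / (rho + 1).float()
    return (w - tau).clamp(min=0.0)

def soft_convex_quantize(z, codebook, steps=10, lr=0.1):
    """Stand-in for a differentiable convex optimization layer: find
    simplex weights w minimizing ||w C - z||^2 by unrolled projected
    gradient descent, then output the convex combination w C.
    Illustrative only; SCQ's actual DCO solver differs."""
    batch, num_codes = z.shape[0], codebook.shape[0]
    w = torch.full((batch, num_codes), 1.0 / num_codes, device=z.device)
    for _ in range(steps):
        grad = (w @ codebook - z) @ codebook.T        # d/dw of 0.5||wC - z||^2
        w = project_simplex(w - lr * grad)
    return w @ codebook                               # soft, convex quantization
```
Because the loop is unrolled, gradients flow through every iterate to both the inputs and the codebook.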
arXiv Detail & Related papers (2023-10-04T17:45:14Z) - Improving Diffusion Models for Inverse Problems using Manifold Constraints [55.91148172752894]
We show that current solvers throw the sample path off the data manifold, and hence the error accumulates.
To address this, we propose an additional correction term inspired by the manifold constraint.
We show that our method is superior to the previous methods both theoretically and empirically.
arXiv Detail & Related papers (2022-06-02T09:06:10Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
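A sketch of one plausible reading of bit-drop (assumed mechanics; the paper's DropBits may work differently): quantize to integers, then independently zero low-order bits with some probability:
```python
import torch

def drop_bits(x, num_bits=8, drop_prob=0.1):
    """One plausible reading of bit-drop, assumed mechanics only (the
    paper's DropBits may differ): quantize x to signed integers, then
    independently zero each low-order magnitude bit with probability
    drop_prob, i.e. dropout over bits rather than neurons."""
    scale = x.abs().max() / (2 ** (num_bits - 1) - 1)
    code = (x / scale).round().long()
    sign, mag = code.sign(), code.abs()
    for b in range(num_bits - 1):
        bit = (mag >> b) & 1                              # value of bit b
        drop = (torch.rand_like(x) < drop_prob).long()    # Bernoulli drop mask
        mag = mag - bit * drop * (1 << b)                 # clear bit b where dropped
    return sign * mag * scale                             # dequantize
```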
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - Q-Match: Iterative Shape Matching via Quantum Annealing [64.74942589569596]
Finding shape correspondences can be formulated as an NP-hard quadratic assignment problem (QAP).
This paper proposes Q-Match, a new iterative quantum method for QAPs inspired by the alpha-expansion algorithm.
Q-Match can be applied for shape matching problems iteratively, on a subset of well-chosen correspondences, allowing us to scale to real-world problems.
arXiv Detail & Related papers (2021-05-06T17:59:38Z) - Optimal Gradient Quantization Condition for Communication-Efficient Distributed Training [99.42912552638168]
Communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications.
In this work, we deduce the optimal condition of both the binary and multi-level gradient quantization for any gradient distribution.
Based on the optimal condition, we develop two novel quantization schemes: biased BinGrad and unbiased ORQ for binary and multi-level gradient quantization respectively.
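For context, a generic unbiased binary gradient quantizer (illustrative only; not the paper's BinGrad or ORQ schemes) looks like this:
```python
import torch

def unbiased_binary_quantize(g):
    """Generic unbiased binary gradient quantizer for illustration (not
    the paper's BinGrad or ORQ): each coordinate becomes +s or -s with
    s = max|g|, with probabilities chosen so that E[q] = g."""
    s = g.abs().max()
    if s == 0:
        return torch.zeros_like(g)
    p_plus = (1 + g / s) / 2                    # P(+s) per coordinate
    signs = torch.where(torch.rand_like(g) < p_plus,
                        torch.ones_like(g), -torch.ones_like(g))
    return s * signs                            # 1 bit per coordinate + one scale

# Sanity check of unbiasedness:
g = torch.randn(1000)
q_mean = torch.stack([unbiased_binary_quantize(g) for _ in range(2000)]).mean(0)
print((q_mean - g).abs().mean())                # should be close to 0
```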
arXiv Detail & Related papers (2020-02-25T18:28:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.