Beyond Stationarity: Rethinking Codebook Collapse in Vector Quantization
- URL: http://arxiv.org/abs/2602.18896v1
- Date: Sat, 21 Feb 2026 16:36:50 GMT
- Title: Beyond Stationarity: Rethinking Codebook Collapse in Vector Quantization
- Authors: Hao Lu, Onur C. Koyun, Yongxin Guo, Zhengjie Zhu, Abbas Alili, Metin Nafi Gurcan,
- Abstract summary: We show that as the encoder drifts, unselected code vectors fail to receive updates and gradually become inactive. To address this, we propose two new methods: Non-Stationary Vector Quantization (NSVQ) and Transformer-based Vector Quantization (TransVQ). Experiments on the CelebA-HQ dataset demonstrate that both methods achieve near-complete codebook utilization and superior reconstruction quality.
- Score: 12.305907179979426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vector Quantization (VQ) underpins many modern generative frameworks such as VQ-VAE, VQ-GAN, and latent diffusion models. Yet, it suffers from the persistent problem of codebook collapse, where a large fraction of code vectors remains unused during training. This work provides a new theoretical explanation by identifying the nonstationary nature of encoder updates as the fundamental cause of this phenomenon. We show that as the encoder drifts, unselected code vectors fail to receive updates and gradually become inactive. To address this, we propose two new methods: Non-Stationary Vector Quantization (NSVQ), which propagates encoder drift to non-selected codes through a kernel-based rule, and Transformer-based Vector Quantization (TransVQ), which employs a lightweight mapping to adaptively transform the entire codebook while preserving convergence to the k-means solution. Experiments on the CelebA-HQ dataset demonstrate that both methods achieve near-complete codebook utilization and superior reconstruction quality compared to baseline VQ variants, providing a principled and scalable foundation for future VQ-based generative models. The code is available at: https://github.com/CAIR-LAB-WFUSM/NSVQ-TransVQ.git
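The NSVQ idea sketched in the abstract (propagating encoder drift to non-selected codes through a kernel-based rule) can be illustrated as follows. This is a minimal sketch under stated assumptions: the Gaussian kernel, bandwidth, and exact update form are illustrative choices, not the paper's published rule.

```python
import numpy as np

def nsvq_style_update(codebook, z_e, lr=0.1, bandwidth=0.5):
    """Hypothetical kernel-based drift-propagation step for one encoder output.

    The selected (nearest) code moves toward the encoder output z_e as in
    standard VQ training; unselected codes receive a fraction of the same
    drift, weighted by a Gaussian kernel on their distance to the selected
    code, so they never go fully stale as the encoder distribution shifts.
    """
    d = np.linalg.norm(codebook - z_e, axis=1)   # distance of each code to z_e
    k = int(np.argmin(d))                        # hard nearest-code assignment
    drift = z_e - codebook[k]                    # how far the encoder has drifted
    # Kernel weights: 1 for the selected code, decaying for distant codes.
    w = np.exp(-np.linalg.norm(codebook - codebook[k], axis=1) ** 2
               / (2 * bandwidth ** 2))
    return codebook + lr * w[:, None] * drift
```

With this form, a code far from the selected one gets a vanishing weight, so the update reduces to ordinary VQ for well-separated codes while nearby inactive codes are dragged along with the encoder.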
Related papers
- Generalized Radius and Integrated Codebook Transforms for Differentiable Vector Quantization [11.898954874548073]
We introduce a unified surrogate framework that keeps hard assignments in the forward pass while making VQ fully differentiable.
GRIT-VQ consistently reduces reconstruction error and improves generative quality and accuracy compared to existing VQ variants.
arXiv Detail & Related papers (2026-02-01T10:22:35Z)
- Scalable Training for Vector-Quantized Networks with 100% Codebook Utilization [60.294965457786844]
Vector quantization (VQ) is a key component in discrete tokenizers for image generation.
VQBridge is a robust, scalable, and efficient projector based on the map function method.
FVQ attains 100% codebook usage even with a 262k-codebook.
arXiv Detail & Related papers (2025-09-12T11:08:21Z)
- Scalable Image Tokenization with Index Backpropagation Quantization [74.15447383432262]
Index Backpropagation Quantization (IBQ) is a new VQ method for the joint optimization of all codebook embeddings and the visual encoder.
IBQ enables scalable training of visual tokenizers and, for the first time, achieves a large-scale codebook with high dimension ($256$) and high utilization.
arXiv Detail & Related papers (2024-12-03T18:59:10Z)
- Addressing Representation Collapse in Vector Quantized Models with One Linear Layer [33.46194711570412]
Vector Quantization (VQ) is essential for discretizing continuous representations in unsupervised learning.
VQ suffers from representation collapse, causing low codebook utilization and limiting scalability.
We propose SimpleVQ, which reparameterizes code vectors through a learnable linear transformation layer over a latent basis.
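The reparameterization idea above can be sketched in a few lines. This is an illustrative sketch, not SimpleVQ's exact implementation; the function names and shapes are assumptions.

```python
import numpy as np

def linear_reparam_codebook(latent_basis, W):
    """Sketch of a SimpleVQ-style reparameterization.

    Code vectors are not free parameters: they are produced by a learnable
    linear map W over a shared latent basis. Because every code depends on W,
    a gradient step on W moves the entire codebook, including codes that were
    never selected, which counteracts collapse.
    """
    return latent_basis @ W  # (K, d_basis) @ (d_basis, d) -> (K, d)

def nearest_code(codebook, z_e):
    """Standard hard assignment: index of the code closest to z_e."""
    return int(np.argmin(np.linalg.norm(codebook - z_e, axis=1)))
```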
arXiv Detail & Related papers (2024-11-04T12:40:18Z)
- Restructuring Vector Quantization with the Rotation Trick [36.03697966463205]
Vector Quantized Variational AutoEncoders (VQ-VAEs) are designed to compress a continuous input to a discrete latent space and reconstruct it with minimal distortion.
As vector quantization is non-differentiable, the gradient to the encoder flows around the vector quantization layer rather than through it in a straight-through approximation.
We propose a way to propagate gradients through the vector quantization layer of VQ-VAEs.
arXiv Detail & Related papers (2024-10-08T23:39:34Z)
- HyperVQ: MLR-based Vector Quantization in Hyperbolic Space [56.4245885674567]
A common solution is to employ Vector Quantization (VQ) within VQ Variational Autoencoders (VQVAEs).
We introduce HyperVQ, a novel approach that formulates VQ as a hyperbolic Multinomial Logistic Regression (MLR) problem.
Our experiments demonstrate that HyperVQ matches traditional VQ in generative and reconstruction tasks, while surpassing it in discriminative performance.
arXiv Detail & Related papers (2024-03-18T03:17:08Z)
- Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling [15.132926378740882]
We propose a novel codebook transfer framework with part-of-speech, called VQCT, which aims to transfer a well-trained codebook from pretrained language models to VQIM.
Experimental results on four datasets show that our VQCT method achieves superior VQIM performance over previous state-of-the-art methods.
arXiv Detail & Related papers (2024-03-15T07:24:13Z)
- Online Clustered Codebook [100.1650001618827]
We present a simple alternative method for online codebook learning, Clustering VQ-VAE (CVQ-VAE).
Our approach selects encoded features as anchors to update the "dead" codevectors, while optimising the codebooks which are alive via the original loss.
Our CVQ-VAE can be easily integrated into the existing models with just a few lines of code.
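The anchor-based revival described above can be sketched as follows. This is a hypothetical illustration of the idea, not CVQ-VAE's actual code; the usage threshold, function name, and reinitialization scheme are assumptions.

```python
import numpy as np

def revive_dead_codes(codebook, usage_counts, encoder_feats, threshold=1):
    """Sketch of anchor-based dead-code revival.

    Codes selected fewer than `threshold` times in a window are considered
    dead and are reinitialized from randomly chosen encoded features, so they
    re-enter the active region of the encoder's output distribution. Live
    codes are left to the usual VQ loss.
    """
    rng = np.random.default_rng(0)
    dead = usage_counts < threshold
    n_dead = int(dead.sum())
    if n_dead:
        anchors = encoder_feats[rng.choice(len(encoder_feats), size=n_dead)]
        codebook = codebook.copy()        # avoid mutating the caller's array
        codebook[dead] = anchors
    return codebook
```

Because revived codes land on actual encoder outputs rather than arbitrary points, they are immediately competitive in the nearest-neighbour assignment.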
arXiv Detail & Related papers (2023-07-27T18:31:04Z)
- Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization [73.52943587514386]
Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm.
We propose a novel two-stage framework: (1) Dynamic-Quantization VAE (DQ-VAE), which encodes image regions into variable-length codes based on their information densities for accurate representation.
arXiv Detail & Related papers (2023-05-19T14:56:05Z)
- VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder [83.63843671885716]
We propose a VQ-based face restoration method -- VQFR.
VQFR takes advantage of high-quality low-level feature banks extracted from high-quality faces.
To further fuse low-level features from inputs while not "contaminating" the realistic details generated from the VQ codebook, we propose a parallel decoder.
arXiv Detail & Related papers (2022-05-13T17:54:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.