Switchcodec: Adaptive residual-expert sparse quantization for high-fidelity neural audio coding
- URL: http://arxiv.org/abs/2601.20362v1
- Date: Wed, 28 Jan 2026 08:26:20 GMT
- Title: Switchcodec: Adaptive residual-expert sparse quantization for high-fidelity neural audio coding
- Authors: Xiangbo Wang, Wenbin Jiang, Jin Wang, Yubo You, Sheng Fang, Fei Wen
- Abstract summary: SwitchCodec is a neural audio codec based on Residual Experts Vector Quantization (REVQ). REVQ combines a shared quantizer with dynamically routed expert quantizers that are activated according to the input audio. SwitchCodec surpasses existing baselines on both objective metrics and subjective listening tests.
- Score: 11.19956590509655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent neural audio compression models often rely on residual vector quantization for high-fidelity coding, but using a fixed number of per-frame codebooks is suboptimal for the wide variability of audio content, especially for signals that are either very simple or highly complex. To address this limitation, we propose SwitchCodec, a neural audio codec based on Residual Experts Vector Quantization (REVQ). REVQ combines a shared quantizer with dynamically routed expert quantizers that are activated according to the input audio, decoupling bitrate from codebook capacity and improving compression efficiency. This design ensures full training and utilization of each quantizer. In addition, a variable-bitrate mechanism adjusts the number of active expert quantizers at inference, enabling multi-bitrate operation without retraining. Experiments demonstrate that SwitchCodec surpasses existing baselines on both objective metrics and subjective listening tests.
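The shared-plus-routed-experts idea from the abstract can be sketched in a few lines. Everything below (frame count, codebook sizes, and especially the error-based routing rule) is a hypothetical stand-in for the paper's learned components, not the actual SwitchCodec implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def vq(x, codebook):
    """Nearest-neighbour vector quantization of the rows of x."""
    idx = np.argmin(((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1), axis=1)
    return codebook[idx]

# Toy sizes (all hypothetical): 4 frames, 8-dim latents, 16-entry codebooks, 3 experts.
frames = rng.normal(size=(4, 8))
shared_cb = rng.normal(size=(16, 8))
expert_cbs = [rng.normal(size=(16, 8)) for _ in range(3)]

# Stage 1: the shared quantizer handles every frame.
coded = vq(frames, shared_cb)
residual = frames - coded

# Stage 2: route each frame's residual to one expert quantizer.
# (The paper learns this routing; picking the expert whose codebook best
# fits the residual is only a stand-in for illustration.)
errs = np.stack([((residual - vq(residual, cb)) ** 2).sum(-1) for cb in expert_cbs])
expert_id = errs.argmin(axis=0)
for i, e in enumerate(expert_id):
    coded[i] += vq(residual[i:i + 1], expert_cbs[e])[0]
```

Because only the chosen expert's code index is transmitted per frame, total codebook capacity (the number of experts) can grow without raising the per-frame bitrate, which is the decoupling the abstract describes.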
Related papers
- Finite Scalar Quantization Enables Redundant and Transmission-Robust Neural Audio Compression at Low Bit-rates [1.445167946386569]
We show that Finite Scalar Quantization (FSQ) encodes baked-in redundancy which produces an encoding that is robust when transmitted through noisy channels. We demonstrate that FSQ has vastly superior robustness to bit-level perturbations by comparing the performance of RVQ and FSQ codecs when simulating the transmission of code sequences through a noisy channel.
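A minimal sketch of the FSQ mechanism, assuming a tanh bound and five levels per dimension (both choices are illustrative, not taken from the paper):

```python
import numpy as np

def fsq(z, levels=5):
    """Finite scalar quantization: squash each dimension into (-1, 1),
    then round it to one of `levels` evenly spaced values. There is no
    codebook lookup; the code IS the grid point."""
    half = (levels - 1) / 2
    return np.round(np.tanh(z) * half) / half

z = np.array([[0.3, -1.7, 0.05], [2.0, -0.2, 0.9]])
codes = fsq(z)

# Robustness intuition: corrupting one transmitted level perturbs exactly one
# scalar by a step of 2/(levels-1), whereas a single corrupted RVQ index
# swaps in an entirely different code vector.
```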
arXiv Detail & Related papers (2025-09-11T15:39:59Z)
- L3AC: Towards a Lightweight and Lossless Audio Codec [10.903708510237875]
We introduce L3AC, a lightweight neural audio codec that addresses these challenges by leveraging a single quantizer and a highly efficient architecture. L3AC explores streamlined convolutional networks and local Transformer modules, alongside TConv, a novel structure designed to capture acoustic variations across multiple temporal scales.
arXiv Detail & Related papers (2025-04-07T11:34:39Z)
- Efficient Evaluation of Quantization-Effects in Neural Codecs [4.897318643396687]
Training neural codecs requires techniques to allow a non-zero gradient across the quantizer. This paper proposes an efficient evaluation framework for neural codecs using simulated data. We validate our findings against an internal neural audio codec and against the state-of-the-art descript-audio-codec.
arXiv Detail & Related papers (2025-02-07T09:11:19Z)
- SNAC: Multi-Scale Neural Audio Codec [1.0753191494611891]
This paper proposes Multi-Scale Neural Audio Codec, a simple extension of RVQ where the quantizers can operate at different temporal resolutions.
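A toy version of the multi-scale idea, with hypothetical sizes and average-pool/repeat resampling standing in for SNAC's learned up- and down-sampling:

```python
import numpy as np

rng = np.random.default_rng(1)

def vq(x, codebook):
    """Nearest-neighbour vector quantization of the rows of x."""
    idx = np.argmin(((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1), axis=1)
    return codebook[idx]

T, D = 8, 4                               # hypothetical: 8 frames, 4-dim latents
z = rng.normal(size=(T, D))
cb_coarse = rng.normal(size=(32, D))
cb_fine = rng.normal(size=(32, D))

# Coarse quantizer runs at 1/4 temporal resolution: average-pool by 4,
# quantize, then repeat-upsample back to the full frame rate. It therefore
# spends 4x fewer codes per second than a full-rate quantizer.
pooled = z.reshape(T // 4, 4, D).mean(axis=1)
coarse = np.repeat(vq(pooled, cb_coarse), 4, axis=0)

# Fine quantizer cleans up the residual at full resolution, as in plain RVQ.
recon = coarse + vq(z - coarse, cb_fine)
```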
arXiv Detail & Related papers (2024-10-18T12:24:05Z)
- QSpec: Speculative Decoding with Complementary Quantization Schemes [53.960146187821685]
Quantization is widely adopted to accelerate inference and reduce memory consumption in large language models (LLMs). We propose QSpec, a novel quantization paradigm that decouples efficiency from quality. QSpec reuses both weights and KV cache across stages, enabling near-zero-cost switching without retraining or auxiliary models.
arXiv Detail & Related papers (2024-10-15T05:57:51Z)
- Variable Bitrate Residual Vector Quantization for Audio Coding [29.368893236587343]
Recent neural audio compression models have progressively adopted residual vector quantization (RVQ). These models employ a fixed number of codebooks per frame, which can be suboptimal in terms of rate-distortion tradeoffs. We propose variable RVQ (VRVQ) for audio codecs, which allows for more efficient coding by adapting the number of codebooks used per frame.
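The per-frame allocation can be sketched with a hand-written residual-energy threshold in place of the importance map the paper learns; all sizes and the threshold are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def vq(x, codebook):
    """Nearest-neighbour vector quantization of the rows of x."""
    idx = np.argmin(((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1), axis=1)
    return codebook[idx]

frames = rng.normal(size=(6, 8))
codebooks = [rng.normal(size=(64, 8)) for _ in range(4)]   # up to 4 RVQ stages

# Toy allocation rule: keep adding stages to a frame only while its residual
# energy stays above a threshold (the paper learns this allocation; the
# fixed threshold here is purely illustrative).
threshold = 0.5
recon = np.zeros_like(frames)
residual = frames.copy()
used = np.zeros(len(frames), dtype=int)
for cb in codebooks:
    active = np.linalg.norm(residual, axis=1) > threshold
    if not active.any():
        break
    q = vq(residual[active], cb)
    recon[active] += q
    residual[active] -= q
    used[active] += 1
# `used` is the per-frame codebook count, i.e. a variable bitrate across frames.
```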
arXiv Detail & Related papers (2024-10-08T13:18:24Z)
- Accelerating Error Correction Code Transformers [56.75773430667148]
We introduce a novel acceleration method for transformer-based decoders.
We achieve a 90% compression ratio and reduce arithmetic operation energy consumption by at least 224 times on modern hardware.
arXiv Detail & Related papers (2024-10-08T11:07:55Z)
- The END: An Equivariant Neural Decoder for Quantum Error Correction [73.4384623973809]
We introduce a data efficient neural decoder that exploits the symmetries of the problem.
We propose a novel equivariant architecture that achieves state-of-the-art accuracy compared to previous neural decoders.
arXiv Detail & Related papers (2023-04-14T19:46:39Z)
- Deep Quantum Error Correction [73.54643419792453]
Quantum error correction codes (QECC) are a key component for realizing the potential of quantum computing.
In this work, we efficiently train novel end-to-end deep quantum error decoders.
The proposed method demonstrates the power of neural decoders for QECC by achieving state-of-the-art accuracy.
arXiv Detail & Related papers (2023-01-27T08:16:26Z)
- Cross-Scale Vector Quantization for Scalable Neural Speech Coding [22.65761249591267]
Bitrate scalability is a desirable feature for audio coding in real-time communications.
In this paper, we introduce a cross-scale scalable vector quantization scheme (CSVQ). In this way, a coarse-level signal is reconstructed if only a portion of the bitstream is received, and quality progressively improves as more bits become available.
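The bitstream-truncation behaviour can be illustrated with plain multi-stage residual quantization; CSVQ's actual cross-scale feature fusion is not modelled here, and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(3)

def vq(x, codebook):
    """Nearest-neighbour vector quantization of the rows of x."""
    idx = np.argmin(((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1), axis=1)
    return codebook[idx]

z = rng.normal(size=(5, 6))
stages = [rng.normal(size=(128, 6)) for _ in range(3)]

# Encoder: quantize residuals stage by stage; each stage's indices occupy
# one segment of the bitstream.
residual, layers = z.copy(), []
for cb in stages:
    q = vq(residual, cb)
    layers.append(q)
    residual = residual - q

def decode(layers, k):
    """Reconstruct from only the first k received stages of the bitstream."""
    return sum(layers[:k])

partial = decode(layers, 1)   # coarse signal from a truncated bitstream
full = decode(layers, 3)      # refined signal once all bits have arrived
```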
arXiv Detail & Related papers (2022-07-07T03:23:25Z)
- Improved decoding of circuit noise and fragile boundaries of tailored surface codes [61.411482146110984]
We introduce decoders that are both fast and accurate, and can be used with a wide class of quantum error correction codes.
Our decoders, named belief-matching and belief-find, exploit all noise information and thereby unlock higher accuracy demonstrations of QEC.
We find that these decoders achieve a much higher threshold and lower qubit overhead in the tailored surface code than in the standard, square surface code.
arXiv Detail & Related papers (2022-03-09T18:48:54Z)
- Variational Autoencoders: A Harmonic Perspective [79.49579654743341]
We study Variational Autoencoders (VAEs) from the perspective of harmonic analysis.
We show that the encoder variance of a VAE controls the frequency content of the functions parameterised by the VAE encoder and decoder neural networks.
arXiv Detail & Related papers (2021-05-31T10:39:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.