On the Spectral Flattening of Quantized Embeddings
- URL: http://arxiv.org/abs/2602.00969v1
- Date: Sun, 01 Feb 2026 02:21:53 GMT
- Title: On the Spectral Flattening of Quantized Embeddings
- Authors: Junlin Huang, Wenyi Fang, Zhenheng Tang, Yuxin Wang, Xueze Kang, Yang Zheng, Bo Li, Xiaowen Chu
- Abstract summary: Training Large Language Models at ultra-low precision is critically impeded by instability rooted in the conflict between discrete quantization constraints and the intrinsic heavy-tailed spectral nature of linguistic data. This work not only quantifies the spectral sensitivity of LLMs but also establishes spectral fidelity as a necessary condition for stable low-bit optimization.
- Score: 25.64641307046705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training Large Language Models (LLMs) at ultra-low precision is critically impeded by instability rooted in the conflict between discrete quantization constraints and the intrinsic heavy-tailed spectral nature of linguistic data. By formalizing the connection between Zipfian statistics and random matrix theory, we prove that the power-law decay in the singular value spectra of embeddings is a fundamental requisite for semantic encoding. We derive theoretical bounds showing that uniform quantization introduces a noise floor that disproportionately truncates this spectral tail, which induces spectral flattening and a strictly provable increase in the stable rank of representations. Empirical validation across diverse architectures including GPT-2 and TinyLlama corroborates that this geometric degradation precipitates representational collapse. This work not only quantifies the spectral sensitivity of LLMs but also establishes spectral fidelity as a necessary condition for stable low-bit optimization.
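The stable-rank claim in the abstract can be illustrated with a small NumPy sketch. This is a toy setup under stated assumptions, not the paper's method or experiments: the helper names (`stable_rank`, `uniform_quantize`, `power_law_matrix`), the symmetric per-tensor round-to-nearest quantizer, the matrix size, and the exponent `alpha=1.0` are all hypothetical choices made for illustration. Stable rank is computed as the squared Frobenius norm divided by the squared spectral norm.

```python
import numpy as np


def stable_rank(a: np.ndarray) -> float:
    """Stable rank: squared Frobenius norm divided by squared spectral norm."""
    s = np.linalg.svd(a, compute_uv=False)  # singular values, descending
    return float(np.sum(s**2) / s[0] ** 2)


def uniform_quantize(a: np.ndarray, bits: int) -> np.ndarray:
    """Assumed quantizer: symmetric, per-tensor, round-to-nearest.

    Returns the dequantized values so the spectrum can be compared directly.
    """
    levels = 2 ** (bits - 1) - 1        # e.g. 7 positive levels at 4 bits
    scale = np.max(np.abs(a)) / levels  # step size set by the largest entry
    return np.clip(np.round(a / scale), -levels, levels) * scale


def power_law_matrix(rows: int, cols: int, alpha: float, seed: int = 0) -> np.ndarray:
    """Synthetic stand-in for an embedding: k-th singular value is k**(-alpha)."""
    rng = np.random.default_rng(seed)
    k = min(rows, cols)
    u, _ = np.linalg.qr(rng.standard_normal((rows, k)))  # orthonormal columns
    v, _ = np.linalg.qr(rng.standard_normal((cols, k)))  # orthonormal columns
    s = np.arange(1, k + 1, dtype=float) ** (-alpha)
    return (u * s) @ v.T


if __name__ == "__main__":
    emb = power_law_matrix(rows=256, cols=64, alpha=1.0)
    print("full precision:", stable_rank(emb))
    for bits in (8, 4, 3):
        print(f"{bits}-bit:", stable_rank(uniform_quantize(emb, bits)))
```

In this toy setting, the rounding error adds energy spread across many directions while the leading singular value barely moves, so coarse quantization tends to raise the stable rank. That is one way to read the "spectral flattening" described above, but it should not be taken as a reproduction of the paper's bounds or results.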
Related papers
- Spectral Collapse in Diffusion Inversion [44.781674986581244]
Conditional diffusion inversion fails when the source domain is spectrally sparse compared to the target domain. We propose Orthogonal Variance Guidance (OVG), an inference-time method that corrects the ODE dynamics to enforce the theoretical Gaussian noise magnitude. OVG effectively restores photorealistic textures while preserving structural fidelity.
arXiv Detail & Related papers (2026-02-09T17:53:21Z)
- Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning [28.392130815615545]
Spectral Disentanglement and Enhancement (SDE) is a novel framework that bridges the gap between the geometry of the embedded spaces and their spectral properties. SDE consistently improves representation and robustness, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2026-02-09T07:29:43Z)
- Preconditioning Benefits of Spectral Orthogonalization in Muon [50.62925024212989]
We study the effectiveness of a simplified variant of Muon in two case studies: matrix factorization and in-context learning of linear transformers. Our analysis reveals that the Muon dynamics decouple into a collection of independent scalar sequences in the spectral domain, each exhibiting similar convergence behavior.
arXiv Detail & Related papers (2026-01-20T00:08:31Z)
- SIGMA: Scalable Spectral Insights for LLM Collapse [51.863164847253366]
We introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework for model collapse. By utilizing benchmarks that deriving and deterministic bounds on the matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. We demonstrate that SIGMA effectively captures the transition towards states, offering both theoretical insights into the mechanics of collapse.
arXiv Detail & Related papers (2026-01-06T19:47:11Z)
- The Homogeneity Trap: Spectral Collapse in Doubly-Stochastic Deep Networks [1.7523718031184992]
We identify a critical spectral degradation phenomenon inherent to structure-preserving deep architectures. We show that maximum-entropy bias drives the mixing operator towards the uniform barycenter, suppressing the subdominant singular value . We derive a spectral bound linking to the network's effective depth, showing that high-entropy constraints restrict feature transformation to a shallow receptive field.
arXiv Detail & Related papers (2026-01-05T13:09:42Z)
- SPECTRA: Spectral Target-Aware Graph Augmentation for Imbalanced Molecular Property Regression [45.62053904749856]
SPECTRA is a Spectral Target-Aware graph augmentation framework. It generates realistic molecular graphs in the spectral domain. It consistently improves error in relevant target ranges while maintaining competitive overall MAE.
arXiv Detail & Related papers (2025-11-06T21:57:21Z)
- Quantum Filtering and Analysis of Multiplicities in Eigenvalue Spectra [4.081730190778995]
We introduce QFAMES, a quantum algorithm that efficiently identifies clusters of eigenvalues and determines their multiplicities. QFAMES also enables the estimation of observable expectation values within targeted energy clusters. We validate the effectiveness of QFAMES through numerical demonstrations.
arXiv Detail & Related papers (2025-10-08T18:37:36Z)
- Theoretical Bounds for Stable In-Context Learning [0.0]
In-context learning (ICL) is flexible but its reliability is sensitive to prompt length. This paper establishes a non-asymptotic lower bound that links the minimal number of demonstrations to ICL stability. We propose a two-stage observable estimator with a one-shot calibration that produces practitioner-ready prompt-length estimates.
arXiv Detail & Related papers (2025-09-25T02:25:05Z)
- Avoiding spectral pollution for transfer operators using residuals [0.6116681488656472]
We present algorithms for computing spectral properties of transfer operators without spectral pollution. Case studies range from families of Blaschke maps with known spectrum to a molecular dynamics model of protein folding. Our methods offer robust tools for spectral estimation across a broad range of applications.
arXiv Detail & Related papers (2025-07-22T18:01:05Z)
- Avoided-crossings, degeneracies and Berry phases in the spectrum of quantum noise through analytic Bloch-Messiah decomposition [49.1574468325115]
The "analytic Bloch-Messiah decomposition" provides an approach for characterizing the dynamics of quantum optical systems. We show that avoided crossings arise naturally when a single parameter is varied, leading to hypersensitivity of the singular vectors. We highlight the possibility of programming the spectral response of photonic systems through the deliberate design of avoided crossings.
arXiv Detail & Related papers (2025-04-29T13:14:15Z)
- Holistic Physics Solver: Learning PDEs in a Unified Spectral-Physical Space [54.13671100638092]
Holistic Physics Mixer (HPM) is a framework for integrating spectral and physical information in a unified space. We show that HPM consistently outperforms state-of-the-art methods in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-15T08:19:39Z)
- Spectral Decomposition Representation for Reinforcement Learning [100.0424588013549]
We propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy.
A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings.
An experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
arXiv Detail & Related papers (2022-08-19T19:01:30Z)
- Leave-one-out Singular Subspace Perturbation Analysis for Spectral Clustering [7.342677574855651]
Singular subspace perturbation theory is of fundamental importance in probability and statistics.
We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other one.
It is well-suited for mixture models and results in a sharper and finer statistical analysis than classical perturbation bounds such as Wedin's Theorem.
arXiv Detail & Related papers (2022-05-30T05:07:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.