Related papers: On the Spectral Flattening of Quantized Embeddings

URL: http://arxiv.org/abs/2602.00969v1
Date: Sun, 01 Feb 2026 02:21:53 GMT
Title: On the Spectral Flattening of Quantized Embeddings
Authors: Junlin Huang, Wenyi Fang, Zhenheng Tang, Yuxin Wang, Xueze Kang, Yang Zheng, Bo Li, Xiaowen Chu
Abstract summary: Training Large Language Models at ultra-low precision is critically impeded by instability rooted in the conflict between discrete quantization constraints and the intrinsic heavy-tailed spectral nature of linguistic data. This work not only quantifies the spectral sensitivity of LLMs but also establishes spectral fidelity as a necessary condition for stable low-bit optimization.
Score: 25.64641307046705
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Training Large Language Models (LLMs) at ultra-low precision is critically impeded by instability rooted in the conflict between discrete quantization constraints and the intrinsic heavy-tailed spectral nature of linguistic data. By formalizing the connection between Zipfian statistics and random matrix theory, we prove that the power-law decay in the singular value spectra of embeddings is a fundamental requisite for semantic encoding. We derive theoretical bounds showing that uniform quantization introduces a noise floor that disproportionately truncates this spectral tail, which induces spectral flattening and a strictly provable increase in the stable rank of representations. Empirical validation across diverse architectures including GPT-2 and TinyLlama corroborates that this geometric degradation precipitates representational collapse. This work not only quantifies the spectral sensitivity of LLMs but also establishes spectral fidelity as a necessary condition for stable low-bit optimization.
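The stable-rank claim in the abstract can be illustrated with a small numerical sketch. It is not the authors' code or experimental setup: the matrix size, the decay exponent `alpha`, the symmetric per-tensor quantizer, and the helper names `stable_rank`, `power_law_matrix`, and `uniform_quantize` are all assumptions chosen for illustration. The sketch builds a synthetic "embedding" matrix with power-law singular values, applies uniform quantization, and compares the stable rank ||A||_F^2 / ||A||_2^2 before and after.

```python
import numpy as np


def stable_rank(mat):
    """Stable rank: squared Frobenius norm over squared spectral norm."""
    fro_sq = np.linalg.norm(mat, "fro") ** 2
    spec_sq = np.linalg.norm(mat, 2) ** 2  # ord=2 gives the largest singular value
    return fro_sq / spec_sq


def power_law_matrix(rows, cols, alpha, rng):
    """Synthetic matrix whose singular values decay as i**(-alpha) (assumed model)."""
    u, _ = np.linalg.qr(rng.standard_normal((rows, cols)))  # orthonormal columns
    v, _ = np.linalg.qr(rng.standard_normal((cols, cols)))  # orthogonal matrix
    s = np.arange(1, cols + 1, dtype=np.float64) ** (-alpha)
    return (u * s) @ v.T  # equals U @ diag(s) @ V.T


def uniform_quantize(mat, bits):
    """Symmetric per-tensor uniform quantization (one illustrative scheme)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(mat).max() / qmax
    return np.clip(np.round(mat / scale), -qmax, qmax) * scale


rng = np.random.default_rng(0)
emb = power_law_matrix(512, 64, alpha=1.0, rng=rng)

sr_full = stable_rank(emb)
sr_4bit = stable_rank(uniform_quantize(emb, bits=4))
print(f"stable rank, full precision: {sr_full:.3f}")
print(f"stable rank, 4-bit uniform:  {sr_4bit:.3f}")
```

In this toy setting the rounding error behaves roughly like a flat noise floor: it adds energy across all singular directions while leaving the leading singular value almost unchanged, so the stable rank rises. Whether the same magnitude of effect appears in trained models is a question for the paper's experiments, not this sketch.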

Related papers

Spectral Collapse in Diffusion Inversion [44.781674986581244]
Conditional diffusion inversion fails when the source domain is spectrally sparse compared to the target domain. We propose Orthogonal Variance Guidance (OVG), an inference-time method that corrects the ODE dynamics to enforce the theoretical Gaussian noise magnitude. OVG effectively restores photorealistic textures while preserving structural fidelity.
arXiv Detail & Related papers (2026-02-09T17:53:21Z)
Spectral Disentanglement and Enhancement: A Dual-domain Contrastive Framework for Representation Learning [28.392130815615545]
Spectral Disentanglement and Enhancement (SDE) is a novel framework that bridges the gap between the geometry of the embedded spaces and their spectral properties. SDE consistently improves representation and robustness, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2026-02-09T07:29:43Z)
Preconditioning Benefits of Spectral Orthogonalization in Muon [50.62925024212989]
We study the effectiveness of a simplified variant of Muon in two case studies: matrix factorization and in-context learning of linear transformers. Our analysis reveals that the Muon dynamics decouple into a collection of independent scalar sequences in the spectral domain, each exhibiting similar convergence behavior.
arXiv Detail & Related papers (2026-01-20T00:08:31Z)
SIGMA: Scalable Spectral Insights for LLM Collapse [51.863164847253366]
We introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework for model collapse. By deriving deterministic bounds on the Gram matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. We demonstrate that SIGMA effectively captures the transition towards collapse, offering theoretical insights into its mechanics.
arXiv Detail & Related papers (2026-01-06T19:47:11Z)
The Homogeneity Trap: Spectral Collapse in Doubly-Stochastic Deep Networks [1.7523718031184992]
We identify a critical spectral degradation phenomenon inherent to structure-preserving deep architectures. We show that maximum-entropy bias drives the mixing operator towards the uniform barycenter, suppressing the subdominant singular value. We derive a spectral bound linking this singular value to the network's effective depth, showing that high-entropy constraints restrict feature transformation to a shallow receptive field.
arXiv Detail & Related papers (2026-01-05T13:09:42Z)
SPECTRA: Spectral Target-Aware Graph Augmentation for Imbalanced Molecular Property Regression [45.62053904749856]
SPECTRA is a Spectral Target-Aware graph augmentation framework. It generates realistic molecular graphs in the spectral domain. It consistently improves error in relevant target ranges while maintaining competitive overall MAE.
arXiv Detail & Related papers (2025-11-06T21:57:21Z)
Quantum Filtering and Analysis of Multiplicities in Eigenvalue Spectra [4.081730190778995]
We introduce QFAMES, a quantum algorithm that efficiently identifies clusters of eigenvalues and determines their multiplicities. QFAMES also enables the estimation of observable expectation values within targeted energy clusters. We validate the effectiveness of QFAMES through numerical demonstrations.
arXiv Detail & Related papers (2025-10-08T18:37:36Z)
Theoretical Bounds for Stable In-Context Learning [0.0]
In-context learning (ICL) is flexible but its reliability is sensitive to prompt length. This paper establishes a non-asymptotic lower bound that links the minimal number of demonstrations to ICL stability. We propose a two-stage observable estimator with a one-shot calibration that produces practitioner-ready prompt-length estimates.
arXiv Detail & Related papers (2025-09-25T02:25:05Z)
Avoiding spectral pollution for transfer operators using residuals [0.6116681488656472]
We present algorithms for computing spectral properties of transfer operators without spectral pollution. Case studies range from families of Blaschke maps with known spectrum to a molecular dynamics model of protein folding. Our methods offer robust tools for spectral estimation across a broad range of applications.
arXiv Detail & Related papers (2025-07-22T18:01:05Z)
Avoided-crossings, degeneracies and Berry phases in the spectrum of quantum noise through analytic Bloch-Messiah decomposition [49.1574468325115]
The "analytic Bloch-Messiah decomposition" provides an approach for characterizing the dynamics of quantum optical systems. We show that avoided crossings arise naturally when a single parameter is varied, leading to hypersensitivity of the singular vectors. We highlight the possibility of programming the spectral response of photonic systems through the deliberate design of avoided crossings.
arXiv Detail & Related papers (2025-04-29T13:14:15Z)
Holistic Physics Solver: Learning PDEs in a Unified Spectral-Physical Space [54.13671100638092]
Holistic Physics Mixer (HPM) is a framework for integrating spectral and physical information in a unified space. We show that HPM consistently outperforms state-of-the-art methods in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-15T08:19:39Z)
Spectral Decomposition Representation for Reinforcement Learning [100.0424588013549]
We propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy. A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings. An experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
arXiv Detail & Related papers (2022-08-19T19:01:30Z)
Leave-one-out Singular Subspace Perturbation Analysis for Spectral Clustering [7.342677574855651]
The singular subspaces perturbation theory is of fundamental importance in probability and statistics. We consider two arbitrary matrices where one is a leave-one-column-out submatrix of the other one. It is well-suited for mixture models and results in a sharper and finer statistical analysis than classical perturbation bounds such as Wedin's Theorem.
arXiv Detail & Related papers (2022-05-30T05:07:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.