Spectral Geometry for Deep Learning: Compression and Hallucination Detection via Random Matrix Theory
- URL: http://arxiv.org/abs/2601.17357v1
- Date: Sat, 24 Jan 2026 08:07:22 GMT
- Title: Spectral Geometry for Deep Learning: Compression and Hallucination Detection via Random Matrix Theory
- Authors: Davide Ettori
- Abstract summary: This thesis proposes a unified framework based on spectral geometry and random matrix theory to address both problems. The first contribution, EigenTrack, is a real-time method for detecting hallucinations and out-of-distribution behavior in language and vision-language models. The second contribution, RMT-KD, is a principled compression method that identifies informative spectral components.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models and deep neural networks achieve strong performance but suffer from reliability issues and high computational cost. This thesis proposes a unified framework based on spectral geometry and random matrix theory to address both problems by analyzing the eigenvalue structure of hidden activations. The first contribution, EigenTrack, is a real-time method for detecting hallucinations and out-of-distribution behavior in language and vision-language models using spectral features and their temporal dynamics. The second contribution, RMT-KD, is a principled compression method that identifies informative spectral components and applies iterative knowledge distillation to produce compact and efficient models while preserving accuracy. Together, these results show that spectral statistics provide interpretable and robust signals for monitoring uncertainty and guiding compression in large-scale neural networks.
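The core spectral signal is easy to illustrate: take a matrix of hidden activations, compute the eigenvalues of its sample covariance, and compare them to the Marchenko-Pastur law, which characterizes the spectrum of pure noise. The sketch below is a minimal illustration of this idea under our own parameter choices, not the thesis's implementation; all function names are ours.

```python
import numpy as np

def mp_edges(n_samples: int, n_features: int, sigma2: float = 1.0):
    """Marchenko-Pastur support edges for a noise covariance spectrum."""
    gamma = n_features / n_samples
    lo = sigma2 * (1.0 - np.sqrt(gamma)) ** 2
    hi = sigma2 * (1.0 + np.sqrt(gamma)) ** 2
    return lo, hi

def spectral_summary(acts: np.ndarray):
    """Eigenvalue statistics of hidden activations (n_samples x n_features)."""
    n, d = acts.shape
    acts = acts - acts.mean(axis=0, keepdims=True)
    cov = acts.T @ acts / n
    eigs = np.linalg.eigvalsh(cov)
    _, hi = mp_edges(n, d, sigma2=np.median(eigs))  # heuristic noise scale
    signal = eigs[eigs > hi]                      # outliers above the MP bulk
    p = eigs / eigs.sum()                         # normalized spectrum
    entropy = -np.sum(p * np.log(p + 1e-12))      # spectral entropy
    return {"n_outliers": signal.size, "spectral_entropy": entropy}

# Example: structured activations show more MP outliers than pure noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=(512, 128))
structured = noise + rng.normal(size=(512, 1)) @ rng.normal(size=(1, 128)) * 3
print(spectral_summary(noise), spectral_summary(structured))
```

Tracking such summaries over generation steps is one plausible way to obtain the "temporal dynamics" the abstract mentions.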
Related papers
- Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory [0.0]
This thesis addresses two persistent and closely related challenges in modern deep learning: reliability and efficiency. By analyzing the eigenvalue dynamics of hidden activations across layers and inputs, this work shows that spectral statistics provide a compact, stable, and interpretable lens on model behavior. Within this framework, the first contribution, EigenTrack, introduces a real-time method for detecting hallucinations and out-of-distribution behavior in large language and vision-language models. The second contribution, RMT-KD, presents a principled approach to compressing deep networks via random matrix theoretic knowledge distillation.
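A plausible reading of the RMT-guided compression step is: keep only the eigendirections whose eigenvalues escape the Marchenko-Pastur bulk, then distill into that reduced space. The sketch below shows such a rank-selection step under that assumption; it is our interpretation, not the paper's exact method.

```python
import numpy as np

def informative_projection(acts: np.ndarray, sigma2: float = 1.0):
    """Project activations onto eigendirections above the MP noise edge.

    A sketch of RMT-guided rank selection, not RMT-KD's exact procedure.
    """
    n, d = acts.shape
    centered = acts - acts.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / n
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending order
    mp_upper = sigma2 * (1.0 + np.sqrt(d / n)) ** 2
    keep = eigvals > mp_upper                    # eigenvalues outside the bulk
    basis = eigvecs[:, keep]                     # d x k projection basis
    return centered @ basis, basis

rng = np.random.default_rng(1)
x = rng.normal(size=(1000, 256))
x[:, :8] += rng.normal(size=(1000, 8)) * 4       # plant 8 strong directions
z, basis = informative_projection(x)
print("kept dimensions:", basis.shape[1], "of", x.shape[1])
```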
arXiv Detail & Related papers (2026-02-25T19:11:56Z)
- From Eigenmodes to Proofs: Integrating Graph Spectral Operators with Symbolic Interpretable Reasoning [0.0]
We introduce Spectral NSR, a fully spectral neuro-symbolic reasoning framework. It embeds logical rules as spectral templates and performs inference directly in the graph spectral domain. We show that Spectral NSR achieves superior accuracy, faster inference, improved robustness to adversarial perturbations, and higher interpretability compared to leading baselines.
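Inference in the graph spectral domain generally means transforming node signals into the Laplacian eigenbasis, reweighting them with a spectral function, and transforming back. The sketch below shows that generic pipeline; the low-pass "template" is our stand-in for the paper's rule-derived spectral templates.

```python
import numpy as np

def graph_spectral_filter(adj: np.ndarray, x: np.ndarray, template):
    """Filter node features x in the spectral domain of the normalized Laplacian."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    lap = np.eye(len(adj)) - d_inv_sqrt @ adj @ d_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(lap)       # graph Fourier basis
    x_hat = eigvecs.T @ x                        # forward graph Fourier transform
    x_hat *= template(eigvals)[:, None]          # apply the spectral template
    return eigvecs @ x_hat                       # inverse transform

# Toy graph: 6-node cycle; a low-pass template smooths the node signal.
n = 6
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0
x = np.random.default_rng(2).normal(size=(n, 3))
smoothed = graph_spectral_filter(adj, x, template=lambda lam: np.exp(-2.0 * lam))
print(smoothed.shape)
```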
arXiv Detail & Related papers (2025-09-07T01:12:20Z)
- Quantum Spectral Reasoning: A Non-Neural Architecture for Interpretable Machine Learning [0.0]
We propose a novel machine learning architecture that departs from conventional neural network paradigms. We use quantum spectral methods, specifically Padé approximants and the Lanczos algorithm, for interpretable signal analysis and symbolic reasoning. Our results show that this spectral-symbolic architecture achieves competitive accuracy while maintaining interpretability and data efficiency.
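The Lanczos algorithm referenced here reduces a symmetric matrix to tridiagonal form using only matrix-vector products, and its Ritz values approximate the extremal eigenvalues. A textbook sketch follows (without the paper's Padé-approximant stage):

```python
import numpy as np

def lanczos(matvec, dim: int, k: int, rng=None):
    """k-step Lanczos tridiagonalization of a symmetric operator."""
    rng = rng or np.random.default_rng(0)
    q = rng.normal(size=dim)
    q /= np.linalg.norm(q)
    alphas, betas = [], []
    q_prev, beta = np.zeros(dim), 0.0
    for _ in range(k):
        w = matvec(q) - beta * q_prev
        alpha = q @ w
        w -= alpha * q
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        if beta < 1e-12:
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    t = (np.diag(alphas) + np.diag(betas[: len(alphas) - 1], 1)
         + np.diag(betas[: len(alphas) - 1], -1))
    return np.linalg.eigvalsh(t)                 # Ritz values

# The extreme Ritz values closely track the true extreme eigenvalues.
a = np.random.default_rng(3).normal(size=(200, 200))
a = (a + a.T) / 2
ritz = lanczos(lambda v: a @ v, dim=200, k=30)
print(ritz[0], np.linalg.eigvalsh(a)[0])         # smallest: Ritz vs exact
```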
arXiv Detail & Related papers (2025-08-05T07:16:45Z)
- Hallucination Detection in LLMs with Topological Divergence on Attention Graphs [60.83579255387347]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models. We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
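While the paper's detector relies on topological divergences computed over attention maps, a toy proxy for the idea is to threshold an attention matrix into a graph and track how many connected components survive across thresholds (a 0-dimensional topological summary). This sketch is our illustration, not TOHA itself:

```python
import numpy as np

def components(adj: np.ndarray) -> int:
    """Count connected components with a simple depth-first search."""
    n, seen, count = len(adj), set(), 0
    for start in range(n):
        if start in seen:
            continue
        count += 1
        stack = [start]
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            stack.extend(v for v in range(n) if adj[u, v] and v not in seen)
    return count

def component_profile(attn: np.ndarray, thresholds):
    """0-dimensional topological summary of a symmetrized attention matrix."""
    sym = (attn + attn.T) / 2
    return [components(sym > t) for t in thresholds]

attn = np.random.default_rng(4).uniform(size=(16, 16))
print(component_profile(attn, thresholds=np.linspace(0.1, 0.9, 5)))
```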
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
- Spectral-Adaptive Modulation Networks for Visual Perception [9.912286808419205]
We use graph spectral analysis to theoretically simulate and compare the frequency responses of 2D convolution and self-attention. Our results corroborate previous empirical findings and reveal that node connectivity, modulated by window size, is a key factor in shaping spectral functions. Based on the proposed spectral-adaptive modulation (SPAM), we develop SPANetV2 as a novel vision backbone.
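The comparison rests on a standard construction: express a spatial mixing operator in the Laplacian eigenbasis of the pixel graph and read off its response at each graph frequency. A minimal sketch with our own proxy operators (a narrow and a wide windowed average, standing in for convolution and attention):

```python
import numpy as np

def path_laplacian(n: int) -> np.ndarray:
    adj = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(adj.sum(axis=1)) - adj

def frequency_response(op: np.ndarray, lap: np.ndarray) -> np.ndarray:
    """Diagonal of the operator expressed in the Laplacian eigenbasis."""
    _, eigvecs = np.linalg.eigh(lap)
    return np.diag(eigvecs.T @ op @ eigvecs)

n = 64
lap = path_laplacian(n)
# Proxy "convolution": 3-tap average; proxy "attention": window-7 average.
conv = sum(np.diag(np.ones(n - abs(k)), k) for k in (-1, 0, 1)) / 3
attn = sum(np.diag(np.ones(n - abs(k)), k) for k in range(-3, 4)) / 7
# The wider window suppresses high graph frequencies more aggressively.
print(frequency_response(conv, lap)[-5:], frequency_response(attn, lap)[-5:])
```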
arXiv Detail & Related papers (2025-03-31T10:53:42Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
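Diffusion spectral entropy is, roughly, the Shannon entropy of the eigenvalue spectrum of a diffusion (random-walk) matrix built from pairwise affinities of representations. The sketch below follows that recipe under our own kernel and parameter choices, not the paper's exact estimator:

```python
import numpy as np

def diffusion_spectral_entropy(x: np.ndarray, bandwidth: float = 1.0, t: int = 1):
    """Entropy of the diffusion-matrix spectrum of representations x (n x d)."""
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    k = np.exp(-sq / (2 * bandwidth ** 2))       # Gaussian affinity kernel
    p = k / k.sum(axis=1, keepdims=True)         # row-stochastic diffusion matrix
    eigs = np.abs(np.linalg.eigvals(p)) ** t     # diffuse for t steps
    probs = eigs / eigs.sum()
    return float(-np.sum(probs * np.log(probs + 1e-12)))

rng = np.random.default_rng(5)
blob = rng.normal(size=(100, 8)) * 0.1           # low intrinsic dimensionality
spread = rng.normal(size=(100, 8))               # full-dimensional cloud
print(diffusion_spectral_entropy(blob), diffusion_spectral_entropy(spread))
```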
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
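The workflow is two-stage: train a network once to emulate the simulator's mapping from Hamiltonian parameters to observables, then freeze it and use automatic differentiation to fit parameters to measured data. A minimal PyTorch sketch of the second stage, with an untrained stand-in for the pretrained surrogate:

```python
import torch

# Stand-in surrogate: in practice this network would be trained once to mimic
# simulated scattering data as a function of model-Hamiltonian parameters.
surrogate = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 32)
)
for p in surrogate.parameters():
    p.requires_grad_(False)                      # freeze after "pretraining"

true_params = torch.tensor([0.8, -0.3])
observed = surrogate(true_params)                # pretend experimental data

params = torch.zeros(2, requires_grad=True)     # unknown Hamiltonian parameters
opt = torch.optim.Adam([params], lr=0.05)
for step in range(500):
    opt.zero_grad()
    loss = torch.mean((surrogate(params) - observed) ** 2)
    loss.backward()                              # gradients flow through the surrogate
    opt.step()
# Recovered parameters reproduce the observed data (possibly non-uniquely
# for this toy surrogate).
print(params.detach(), "vs true", true_params)
```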
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation [12.91586050451152]
Spectral clustering is a theoretically grounded solution to unsupervised semantic segmentation, where spectral embeddings for pixels are computed to construct distinct clusters.
Current approaches still suffer from inefficiencies in spectral decomposition and inflexibility in applying them to the test data.
This work addresses these issues by casting spectral clustering as a parametric approach that employs neural network-based eigenfunctions to produce spectral embeddings.
In practice, the neural eigenfunctions are lightweight and take the features from pre-trained models as inputs, improving training efficiency and unleashing the potential of pre-trained models for dense prediction.
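For context, the classical non-parametric pipeline this work parameterizes looks as follows: build an affinity over pixel features, take the bottom eigenvectors of the normalized Laplacian as spectral embeddings, and cluster them. The neural-eigenfunction approach replaces the eigensolver with a trained network; the sketch below shows only the classical baseline:

```python
import numpy as np

def spectral_embeddings(feats: np.ndarray, k: int) -> np.ndarray:
    """Bottom-k nontrivial Laplacian eigenvectors of a feature affinity graph."""
    sq = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    w = np.exp(-sq / sq.mean())                  # dense Gaussian affinity
    d_inv_sqrt = np.diag(1.0 / np.sqrt(w.sum(axis=1)))
    lap = np.eye(len(w)) - d_inv_sqrt @ w @ d_inv_sqrt
    _, eigvecs = np.linalg.eigh(lap)
    return eigvecs[:, 1 : k + 1]                 # skip the trivial eigenvector

def kmeans(z: np.ndarray, k: int, iters: int = 50) -> np.ndarray:
    rng = np.random.default_rng(0)
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.stack([
            z[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
    return labels

# Two feature clusters, e.g., pre-trained pixel features of two segments.
rng = np.random.default_rng(6)
feats = np.vstack([rng.normal(0, 0.3, (50, 16)), rng.normal(2, 0.3, (50, 16))])
print(np.bincount(kmeans(spectral_embeddings(feats, k=2), k=2)))
```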
arXiv Detail & Related papers (2023-04-06T03:14:15Z)
- Spectral Decomposition Representation for Reinforcement Learning [100.0424588013549]
We propose an alternative spectral method, Spectral Decomposition Representation (SPEDER), that extracts a state-action abstraction from the dynamics without inducing spurious dependence on the data collection policy.
A theoretical analysis establishes the sample efficiency of the proposed algorithm in both the online and offline settings.
An experimental investigation demonstrates superior performance over current state-of-the-art algorithms across several benchmarks.
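SPEDER's core object is a low-rank factorization of the transition dynamics, roughly P(s'|s,a) ≈ φ(s,a)ᵀμ(s'), learned without reference to the data-collection policy. A toy PyTorch sketch of one such factorization objective (a simplified in-batch contrastive form, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

dim, feat = 4, 16
phi = torch.nn.Sequential(torch.nn.Linear(dim * 2, 64), torch.nn.ReLU(),
                          torch.nn.Linear(64, feat))     # phi(s, a)
mu = torch.nn.Sequential(torch.nn.Linear(dim, 64), torch.nn.ReLU(),
                         torch.nn.Linear(64, feat))      # mu(s')

opt = torch.optim.Adam(list(phi.parameters()) + list(mu.parameters()), lr=1e-3)
for step in range(200):
    s = torch.randn(128, dim)
    a = torch.randn(128, dim)
    s_next = s + 0.1 * a + 0.01 * torch.randn(128, dim)  # toy dynamics
    logits = phi(torch.cat([s, a], dim=-1)) @ mu(s_next).T  # batch x batch scores
    # Contrastive objective: the true (s, a, s') pair should outscore
    # the in-batch negatives.
    loss = F.cross_entropy(logits, torch.arange(128))
    opt.zero_grad()
    loss.backward()
    opt.step()
print(loss.item())
```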
arXiv Detail & Related papers (2022-08-19T19:01:30Z)
- The Spectral Bias of Polynomial Neural Networks [63.27903166253743]
Polynomial neural networks (PNNs) have been shown to be particularly effective at image generation and face recognition, where high-frequency information is critical.
Previous studies have revealed that neural networks demonstrate a spectral bias towards low-frequency functions, which yields faster learning of low-frequency components during training.
Inspired by such studies, we conduct a spectral analysis of the Neural Tangent Kernel (NTK) of PNNs.
We find that the Π-Net family, i.e., a recently proposed parametrization of PNNs, speeds up the learning of higher frequencies.
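Spectral bias is straightforward to observe empirically: fit a small network to a signal containing a low and a high frequency and track the per-frequency residual; the low frequency is fit first. A minimal sketch (an ordinary MLP, not a PNN):

```python
import torch

x = torch.linspace(0, 1, 256).unsqueeze(1)
y = torch.sin(2 * torch.pi * 2 * x) + torch.sin(2 * torch.pi * 20 * x)

net = torch.nn.Sequential(torch.nn.Linear(1, 256), torch.nn.Tanh(),
                          torch.nn.Linear(256, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def band_error(residual: torch.Tensor, freq: int) -> float:
    """Magnitude of the residual at a given integer frequency (via rFFT)."""
    spec = torch.fft.rfft(residual.squeeze())
    return spec[freq].abs().item()

for step in range(2001):
    opt.zero_grad()
    residual = net(x) - y
    residual.pow(2).mean().backward()
    opt.step()
    if step % 500 == 0:
        with torch.no_grad():
            r = net(x) - y
        # The frequency-2 error decays much earlier than the frequency-20 error.
        print(step, "low-f err:", band_error(r, 2), "high-f err:", band_error(r, 20))
```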
arXiv Detail & Related papers (2022-02-27T23:12:43Z)
- Convolutional Spectral Kernel Learning [21.595130250234646]
We build an interpretable convolutional spectral kernel network (CSKN) based on the inverse Fourier transform.
We derive generalization error bounds and introduce two regularizers to improve performance.
Experimental results on real-world datasets validate the effectiveness of the learning framework.
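The inverse-Fourier-transform construction is Bochner's theorem: any stationary kernel is the Fourier transform of a spectral density, so learning that density (or sampling frequencies from it) learns the kernel. A minimal random-Fourier-feature sketch of this correspondence, without the paper's convolutional architecture:

```python
import numpy as np

def spectral_kernel_features(x: np.ndarray, freqs: np.ndarray) -> np.ndarray:
    """Random Fourier features: k(x, y) ~ z(x) @ z(y) by Bochner's theorem."""
    proj = x @ freqs.T                           # n x m frequency projections
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(freqs.shape[0])

rng = np.random.default_rng(7)
x = rng.normal(size=(5, 3))
# Frequencies drawn from N(0, I) correspond to an RBF kernel with unit scale;
# learning the frequency distribution would adapt the kernel's spectrum.
freqs = rng.normal(size=(2000, 3))
z = spectral_kernel_features(x, freqs)
approx = z @ z.T
exact = np.exp(-((x[:, None] - x[None]) ** 2).sum(-1) / 2)
print(np.abs(approx - exact).max())              # small approximation error
```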
arXiv Detail & Related papers (2020-02-28T14:35:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.