EigenTrack: Spectral Activation Feature Tracking for Hallucination and Out-of-Distribution Detection in LLMs and VLMs
- URL: http://arxiv.org/abs/2509.15735v3
- Date: Mon, 29 Sep 2025 10:54:21 GMT
- Title: EigenTrack: Spectral Activation Feature Tracking for Hallucination and Out-of-Distribution Detection in LLMs and VLMs
- Authors: Davide Ettori, Nastaran Darabi, Sina Tayebati, Ranganath Krishnan, Mahesh Subedar, Omesh Tickoo, Amit Ranjan Trivedi
- Abstract summary: EigenTrack is an interpretable real-time detector for large language models (LLMs). It tracks temporal shifts in representation structure that signal hallucination and OOD drift before surface errors appear. Unlike existing white-box detectors, it preserves temporal context, aggregates global signals, and offers interpretable accuracy-latency trade-offs.
- Score: 8.616813040714883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) offer broad utility but remain prone to hallucination and out-of-distribution (OOD) errors. We propose EigenTrack, an interpretable real-time detector that uses the spectral geometry of hidden activations, a compact global signature of model dynamics. By streaming covariance-spectrum statistics such as entropy, eigenvalue gaps, and KL divergence from random baselines into a lightweight recurrent classifier, EigenTrack tracks temporal shifts in representation structure that signal hallucination and OOD drift before surface errors appear. Unlike black- and grey-box methods, it needs only a single forward pass without resampling. Unlike existing white-box detectors, it preserves temporal context, aggregates global signals, and offers interpretable accuracy-latency trade-offs.
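The spectral statistics named in the abstract (entropy, eigenvalue gap, KL divergence from a random baseline) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `spectral_features` and `stream_features`, the Gaussian baseline, the histogram binning, and the window size are all assumptions chosen for illustration, and the recurrent classifier that would consume these features is omitted.

```python
import numpy as np

def spectral_features(hidden, rng=None, bins=8):
    """Covariance-spectrum statistics for one window of activations.

    hidden: (tokens, dim) array. Hypothetical helper, not the paper's code.
    Assumes the window is not constant (nonzero variance).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = hidden.shape
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    # Covariance eigenvalues from singular values (descending order).
    eig = np.linalg.svd(centered, compute_uv=False) ** 2 / max(n - 1, 1)
    p = eig / eig.sum()
    entropy = float(-np.sum(p * np.log(p + 1e-12)))  # spectral entropy
    gap = float(eig[0] - eig[1])                     # leading eigenvalue gap

    # Assumed random baseline: Gaussian matrix with matched shape and scale.
    noise = rng.standard_normal((n, d)) * centered.std()
    noise = noise - noise.mean(axis=0, keepdims=True)
    eig0 = np.linalg.svd(noise, compute_uv=False) ** 2 / max(n - 1, 1)

    # KL divergence between add-one-smoothed eigenvalue histograms.
    edges = np.linspace(0.0, max(eig.max(), eig0.max()), bins + 1)
    h = np.histogram(eig, bins=edges)[0] + 1.0
    h0 = np.histogram(eig0, bins=edges)[0] + 1.0
    q, q0 = h / h.sum(), h0 / h0.sum()
    kl = float(np.sum(q * np.log(q / q0)))
    return {"entropy": entropy, "gap": gap, "kl": kl}

def stream_features(acts, window=16):
    """Slide a window over (tokens, dim) activations; one feature row per step."""
    rows = [list(spectral_features(acts[t - window:t]).values())
            for t in range(window, len(acts) + 1)]
    return np.array(rows)
```

In a setup like the one the abstract describes, the rows returned by `stream_features` would form the time series passed to a lightweight recurrent classifier; how the paper windows activations and which layers it reads are not specified here.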
Related papers
- Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory [0.0]
This thesis addresses two persistent and closely related challenges in modern deep learning, reliability and efficiency. By analyzing the eigenvalue dynamics of hidden activations across layers and inputs, this work shows that spectral statistics provide a compact, stable, and interpretable lens on model behavior. Within this framework, the first contribution, EigenTrack, introduces a real-time method for detecting hallucinations and out-of-distribution behavior in large language and vision-language models. The second contribution, RMT-KD, presents a principled approach to compressing deep networks via random matrix theoretic knowledge distillation.
arXiv Detail & Related papers (2026-02-25T19:11:56Z) - LEFT: Learnable Fusion of Tri-view Tokens for Unsupervised Time Series Anomaly Detection [53.191369031661885]
Unsupervised time series anomaly detection aims to build a model for identifying abnormal timestamps without assuming the availability of annotations. We present Learnable Fusion of Tri-view Tokens (LEFT), a unified unsupervised TSAD framework that models anomalies as inconsistencies across complementary representations. Experiments on real-world benchmarks show that LEFT yields the best detection accuracy against SOTA baselines, while achieving a 5x reduction in FLOPs and an 8x speed-up in training.
arXiv Detail & Related papers (2026-02-09T13:33:49Z) - TDGNet: Hallucination Detection in Diffusion Language Models via Temporal Dynamic Graphs [30.313604786976715]
Diffusion language models (D-LLMs) offer parallel denoising and bidirectional context. Hallucination detection for D-LLMs remains underexplored. We introduce TDGNet, a temporal dynamic graph framework that formulates hallucination detection as learning over evolving token-level attention graphs.
arXiv Detail & Related papers (2026-02-08T16:35:30Z) - Beyond Observations: Reconstruction Error-Guided Irregularly Sampled Time Series Representation Learning [38.869433924831156]
iTimER is a self-supervised framework for irregularly sampled time series (ISTS) representation learning. It transforms unobserved timestamps into noise-aware training targets, enabling meaningful reconstruction signals. iTimER consistently outperforms state-of-the-art methods under the ISTS setting.
arXiv Detail & Related papers (2025-11-10T08:53:10Z) - LLM Hallucination Detection: A Fast Fourier Transform Method Based on Hidden Layer Temporal Signals [10.85580316542761]
Hallucination remains a critical barrier for deploying large language models (LLMs) in reliability-sensitive applications. We propose HSAD (Hidden Signal Analysis-based Detection), a novel hallucination detection framework that models the temporal dynamics of hidden representations. Across multiple benchmarks, including TruthfulQA, HSAD achieves an improvement of over 10 percentage points compared to prior state-of-the-art methods.
arXiv Detail & Related papers (2025-09-16T15:08:19Z) - GrAInS: Gradient-based Attribution for Inference-Time Steering of LLMs and VLMs [56.93583799109029]
GrAInS is an inference-time steering approach that operates across both language-only and vision-language models and tasks. During inference, GrAInS adjusts hidden activations at transformer layers guided by token-level attribution signals, and normalizes activations to preserve representational scale. It consistently outperforms both fine-tuning and existing steering baselines.
arXiv Detail & Related papers (2025-07-24T02:34:13Z) - Time-RA: Towards Time Series Reasoning for Anomaly with LLM Feedback [55.284574165467525]
Time-series Reasoning for Anomaly (Time-RA) transforms classical time series anomaly detection into a generative, reasoning-intensive task. Also, we introduce the first real-world multimodal benchmark dataset, RATs40K, explicitly annotated for anomaly reasoning.
arXiv Detail & Related papers (2025-07-20T18:02:50Z) - Shaking to Reveal: Perturbation-Based Detection of LLM Hallucinations [25.18901449626428]
A widely adopted strategy to detect hallucination, known as self-assessment, relies on the model's own output confidence to estimate the factual accuracy of its answers. We propose Sample-Specific Prompting (SSP), a new framework that improves self-assessment by analyzing perturbation sensitivity at intermediate representations. SSP significantly outperforms prior methods across a range of hallucination detection benchmarks.
arXiv Detail & Related papers (2025-06-03T09:44:28Z) - Trajectory Anomaly Detection with Language Models [21.401931052512595]
This paper presents a novel approach for trajectory anomaly detection using an autoregressive causal-attention model, termed LM-TAD.
By treating trajectories as sequences of tokens, our model learns the probability distributions over trajectories, enabling the identification of anomalous locations with high precision.
Our experiments demonstrate the effectiveness of LM-TAD on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-09-18T17:33:31Z) - Detecting Anomalies in Dynamic Graphs via Memory enhanced Normality [39.476378833827184]
Anomaly detection in dynamic graphs presents a significant challenge due to the temporal evolution of graph structures and attributes.
We introduce a novel spatial-temporal memories-enhanced graph autoencoder (STRIPE).
STRIPE significantly outperforms existing methods, with a 5.8% improvement in AUC scores and 4.62x faster training.
arXiv Detail & Related papers (2024-03-14T02:26:10Z) - Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z) - CARLA: Self-supervised Contrastive Representation Learning for Time Series Anomaly Detection [53.83593870825628]
One main challenge in time series anomaly detection (TSAD) is the lack of labelled data in many real-life scenarios.
Most of the existing anomaly detection methods focus on learning the normal behaviour of unlabelled time series in an unsupervised manner.
We introduce a novel end-to-end self-supervised ContrAstive Representation Learning approach for time series anomaly detection.
arXiv Detail & Related papers (2023-08-18T04:45:56Z) - Imputing Missing Observations with Time Sliced Synthetic Minority Oversampling Technique [0.3973560285628012]
We present a simple yet novel time series imputation technique with the goal of constructing an irregular time series that is uniform across every sample in a data set.
We fix a grid defined by the midpoints of non-overlapping bins (dubbed "slices") of observation times and ensure that each sample has values for all of the features at that given time.
This allows one to both impute fully missing observations to allow uniform time series classification across the entire data and, in special cases, to impute individually missing features.
arXiv Detail & Related papers (2022-01-14T19:23:24Z) - Real-time detection of anomalies in large-scale transient surveys [0.0]
We present two novel methods of automatically detecting anomalous transient light curves in real-time.
Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies.
arXiv Detail & Related papers (2021-10-29T18:29:25Z) - DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU, making it compute-efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.