Spectral Identifiability for Interpretable Probe Geometry
- URL: http://arxiv.org/abs/2511.16288v1
- Date: Thu, 20 Nov 2025 12:09:42 GMT
- Title: Spectral Identifiability for Interpretable Probe Geometry
- Authors: William Hao-Cheng Huang
- Abstract summary: Linear probes are widely used to interpret and evaluate neural representations, yet their reliability remains unclear. We uncover a spectral mechanism behind this phenomenon and formalize it as the Spectral Identifiability Principle (SIP), a verifiable Fisher-inspired condition for probe stability. Our analysis connects eigengap geometry, sample size, and misclassification risk through finite-sample reasoning, providing an interpretable diagnostic rather than a loose generalization bound.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Linear probes are widely used to interpret and evaluate neural representations, yet their reliability remains unclear, as probes may appear accurate in some regimes but collapse unpredictably in others. We uncover a spectral mechanism behind this phenomenon and formalize it as the Spectral Identifiability Principle (SIP), a verifiable Fisher-inspired condition for probe stability. When the eigengap separating task-relevant directions is larger than the Fisher estimation error, the estimated subspace concentrates and accuracy remains consistent, whereas closing this gap induces instability in a phase-transition manner. Our analysis connects eigengap geometry, sample size, and misclassification risk through finite-sample reasoning, providing an interpretable diagnostic rather than a loose generalization bound. Controlled synthetic studies, where Fisher quantities are computed exactly, confirm these predictions and show how spectral inspection can anticipate unreliable probes before they distort downstream evaluation.
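The abstract's stability condition (the eigengap separating task-relevant directions exceeding the Fisher estimation error) can be illustrated numerically. Below is a minimal sketch on synthetic two-class data: it uses a standard between/within scatter ratio as a stand-in for the paper's Fisher quantities and a crude sqrt(d/n) proxy for the estimation error. The scatter construction, the error proxy, and all names are illustrative assumptions, not the paper's exact definitions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: class means separated along the first axis,
# so the task-relevant subspace is one-dimensional.
d, n_per_class = 10, 500
mu = np.zeros(d)
mu[0] = 2.0
X0 = rng.normal(-mu, 1.0, size=(n_per_class, d))
X1 = rng.normal(+mu, 1.0, size=(n_per_class, d))
X = np.vstack([X0, X1])

# Fisher-style between/within scatter (a common proxy; the paper's exact
# Fisher quantities may differ).
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
Sb = np.outer(m1 - m0, m1 - m0)
M = np.linalg.solve(Sw, Sb)  # generalized eigenproblem via Sw^{-1} Sb
eigvals = np.sort(np.linalg.eigvals(M).real)[::-1]

# Eigengap between the leading (task-relevant) direction and the rest.
eigengap = eigvals[0] - eigvals[1]

# Crude finite-sample error proxy of order sqrt(d/n); the paper derives
# the exact constants, which are not reproduced here.
n = X.shape[0]
error_proxy = np.sqrt(d / n)

print(f"eigengap = {eigengap:.3f}, error proxy = {error_proxy:.3f}")
print("probe expected stable" if eigengap > error_proxy else "probe may be unstable")
```

Shrinking the class separation (e.g. `mu[0] = 0.1`) closes the eigengap and flips the diagnostic, mimicking the phase-transition behavior the abstract describes.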
Related papers
- Global Interpretability via Automated Preprocessing: A Framework Inspired by Psychiatric Questionnaires [1.8275108630751837]
Psychiatric questionnaires are highly context-sensitive. Flexible nonlinear models can improve predictive accuracy, but limited interpretability can erode clinical trust. REFINE outperforms other interpretable approaches.
arXiv Detail & Related papers (2026-02-26T19:23:20Z)
- Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations [57.179679246370114]
We identify the distribution of random perturbations that minimizes the estimator's variance as the perturbation stepsize tends to zero. Our findings reveal that such desired perturbations can align directionally with the true gradient, instead of maintaining a fixed length.
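The baseline this paper refines is the classic two-point zeroth-order gradient estimator. The sketch below is the generic textbook version with Gaussian perturbation directions, not the paper's minimum-variance or directionally aligned construction; the quadratic test function is chosen only because its true gradient is known.

```python
import numpy as np

rng = np.random.default_rng(1)

def two_point_grad(f, x, h=1e-4, n_samples=2000):
    """Generic two-point estimator: g = (f(x + h u) - f(x - h u)) / (2h) * u,
    averaged over random Gaussian directions u."""
    d = x.size
    g = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        g += (f(x + h * u) - f(x - h * u)) / (2 * h) * u
    return g / n_samples

f = lambda x: 0.5 * np.dot(x, x)   # true gradient of f is x itself
x = np.array([1.0, -2.0, 0.5])
g = two_point_grad(f, x)
print(g)  # close to x in expectation, up to Monte Carlo noise
```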
arXiv Detail & Related papers (2025-10-22T19:06:39Z)
- Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning [0.0]
In high-dimensional learning, models remain stable until they collapse abruptly once the sample size falls below a critical level. Our Fisher Threshold Theorem formalizes this by proving that stability requires the minimal Fisher eigenvalue to exceed an explicit $O(\sqrt{d/n})$ bound. Unlike prior model-specific criteria, this threshold is finite-sample and necessary, marking a sharp phase transition between reliable concentration and inevitable failure.
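A threshold of this shape is easy to probe empirically. The sketch below compares the minimal eigenvalue of an empirical second-moment matrix against sqrt(d/n) for two sample sizes; the matrix used, the absence of constants, and the function name are illustrative assumptions, not the theorem's exact Fisher definition.

```python
import numpy as np

rng = np.random.default_rng(2)

def min_fisher_eig(n, d):
    """Minimal eigenvalue of an empirical second-moment matrix on
    identity-covariance Gaussian data (a stand-in for the Fisher matrix)."""
    X = rng.standard_normal((n, d))
    F = X.T @ X / n
    return np.linalg.eigvalsh(F)[0]  # eigvalsh returns ascending order

d = 50
for n in (100, 10_000):
    lam_min = min_fisher_eig(n, d)
    thr = np.sqrt(d / n)
    status = "above" if lam_min > thr else "below"
    print(f"n={n}: lambda_min={lam_min:.3f}, sqrt(d/n)={thr:.3f} ({status})")
```

With n comparable to d, random-matrix fluctuations push the minimal eigenvalue below the sqrt(d/n) level; with n much larger than d it concentrates well above it, matching the claimed phase transition qualitatively.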
arXiv Detail & Related papers (2025-10-04T13:33:48Z)
- On Spectral Properties of Gradient-based Explanation Methods [6.181300669254824]
We adopt novel probabilistic and spectral perspectives to analyze explanation methods. Our study reveals a pervasive spectral bias stemming from the use of gradients, and sheds light on some common design choices. We propose two remedies based on our proposed formalism: (i) a mechanism to determine a standard perturbation scale, and (ii) an aggregation method which we call SpectralLens.
arXiv Detail & Related papers (2025-08-14T12:37:22Z)
- Ensembled Prediction Intervals for Causal Outcomes Under Hidden Confounding [49.1865229301561]
We present a simple approach to partial identification using existing causal sensitivity models and show empirically that Caus-Modens gives tighter outcome intervals.
The last of our three diverse benchmarks is a novel usage of GPT-4 for observational experiments with unknown but probeable ground truth.
arXiv Detail & Related papers (2023-06-15T21:42:40Z)
- Neural density estimation and uncertainty quantification for laser-induced breakdown spectroscopy spectra [4.698576003197588]
We use normalizing flows on structured spectral latent spaces to estimate probability densities.
We evaluate a method for uncertainty quantification when predicting unobserved state vectors.
We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the Mars rover Curiosity.
arXiv Detail & Related papers (2021-08-17T01:10:29Z)
- The Hidden Uncertainty in a Neural Network's Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
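One common way to turn a latent-feature distribution into an OOD score, sketched below, is to fit a Gaussian to in-distribution features and score new points by Mahalanobis distance. This is a generic illustration of the idea, not this paper's specific method; the feature array and all names are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for penultimate-layer features of in-distribution inputs.
feats_in = rng.normal(0.0, 1.0, size=(1000, 8))
mu = feats_in.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(feats_in, rowvar=False))

def mahalanobis(x):
    """Distance of a feature vector from the fitted in-distribution Gaussian."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

in_point = np.zeros(8)
ood_point = np.full(8, 5.0)  # far from the fitted distribution
print(mahalanobis(in_point), mahalanobis(ood_point))
```

Points far from the fitted distribution receive much larger scores, so thresholding this distance gives a simple OOD detector.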
arXiv Detail & Related papers (2020-12-05T17:30:35Z)
- Interpreting Uncertainty in Model Predictions For COVID-19 Diagnosis [0.0]
COVID-19 has created a need for assistive tools that enable faster diagnosis alongside standard lab swab testing.
Traditional convolutional networks produce point estimates for predictions and fail to capture uncertainty.
We develop a visualization framework to address interpretability of uncertainty and its components, with uncertainty in predictions computed with a Bayesian Convolutional Neural Network.
arXiv Detail & Related papers (2020-10-26T01:27:29Z)
- Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and is implemented via the pool-adjacent-violators (PAV) algorithm.
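The PAV algorithm at the core of this approach can be sketched in a few lines: sort the forecasts, then pool adjacent blocks of outcomes whenever monotonicity is violated. This is a generic squared-loss PAV illustration, not the authors' CORP implementation, and the toy forecasts are made up.

```python
import numpy as np

def pav(y):
    """Isotonic (non-decreasing) fit to y under squared loss via
    pool-adjacent-violators."""
    y = np.asarray(y, dtype=float)
    merged = []  # stack of [block_mean, block_size]
    for v in y:
        merged.append([v, 1])
        # Pool adjacent blocks while monotonicity is violated.
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, n2 = merged.pop()
            m1, n1 = merged.pop()
            merged.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    return np.concatenate([[m] * int(n) for m, n in merged])

# Usage: sort by forecast probability, then isotonically regress the
# binary outcomes on that ordering to get recalibrated probabilities.
p = np.array([0.1, 0.4, 0.35, 0.8])
y = np.array([0, 0, 1, 1])
order = np.argsort(p)
calibrated = pav(y[order])
print(calibrated)  # -> [0.  0.5 0.5 1. ]
```

Plotting `calibrated` against the sorted forecasts gives the (automatically binned) reliability diagram; `sklearn.isotonic.IsotonicRegression` provides a production-grade equivalent.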
arXiv Detail & Related papers (2020-08-07T08:22:26Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
- Analytic Signal Phase in $N$-D by Linear Symmetry Tensor--fingerprint modeling [69.35569554213679]
We show that the Analytic Signal phase and its gradient have a hitherto unstudied discontinuity in 2-D and higher dimensions.
This shortcoming can result in severe artifacts, whereas the problem does not exist in 1-D signals.
We suggest the use of Linear Symmetry phase, relying on more than one set of Gabor filters, but with a negligible computational add-on.
arXiv Detail & Related papers (2020-05-16T21:17:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.