Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning
- URL: http://arxiv.org/abs/2510.03809v1
- Date: Sat, 04 Oct 2025 13:33:48 GMT
- Title: Spectral Thresholds for Identifiability and Stability: Finite-Sample Phase Transitions in High-Dimensional Learning
- Authors: William Hao-Cheng Huang
- Abstract summary: In high-dimensional learning, models remain stable until they collapse abruptly once the sample size falls below a critical level. Our Fisher Threshold Theorem formalizes this by proving that stability requires the minimal Fisher eigenvalue to exceed an explicit $O(\sqrt{d/n})$ bound. Unlike prior asymptotic or model-specific criteria, this threshold is finite-sample and necessary, marking a sharp phase transition between reliable concentration and inevitable failure.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In high-dimensional learning, models remain stable until they collapse abruptly once the sample size falls below a critical level. This instability is not algorithm-specific but a geometric mechanism: when the weakest Fisher eigendirection falls beneath sample-level fluctuations, identifiability fails. Our Fisher Threshold Theorem formalizes this by proving that stability requires the minimal Fisher eigenvalue to exceed an explicit $O(\sqrt{d/n})$ bound. Unlike prior asymptotic or model-specific criteria, this threshold is finite-sample and necessary, marking a sharp phase transition between reliable concentration and inevitable failure. To make the principle constructive, we introduce the Fisher floor, a verifiable spectral regularization robust to smoothing and preconditioning. Synthetic experiments on Gaussian mixtures and logistic models confirm the predicted transition, consistent with $d/n$ scaling. Statistically, the threshold sharpens classical eigenvalue conditions into a non-asymptotic law; learning-theoretically, it defines a spectral sample-complexity frontier, bridging theory with diagnostics for robust high-dimensional inference.
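The abstract's stability criterion lends itself to a direct numerical check. Below is a minimal sketch, assuming a logistic model with a plug-in empirical Fisher matrix; the function names and the constant `C` are illustrative, since the abstract fixes only the $O(\sqrt{d/n})$ rate, not the constant.

```python
import numpy as np

def empirical_fisher_logistic(X, theta):
    """Empirical Fisher information for logistic regression:
    I(theta) = (1/n) * sum_i p_i * (1 - p_i) * x_i x_i^T."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    w = p * (1.0 - p)
    return (X * w[:, None]).T @ X / X.shape[0]

def passes_fisher_floor(X, theta, C=1.0):
    """Check the stability condition lambda_min(I) > C * sqrt(d/n).
    C is an assumed constant; the paper states only the rate."""
    n, d = X.shape
    lam_min = np.linalg.eigvalsh(empirical_fisher_logistic(X, theta)).min()
    threshold = C * np.sqrt(d / n)
    return lam_min > threshold, lam_min, threshold

rng = np.random.default_rng(0)
n, d = 500, 50
X = rng.standard_normal((n, d))
theta = rng.standard_normal(d) / np.sqrt(d)
ok, lam, thr = passes_fisher_floor(X, theta)
print(f"lambda_min={lam:.4f} vs threshold={thr:.4f} -> stable={ok}")
```

Sweeping $n$ downward while holding $d$ fixed should reproduce the predicted transition: once $\lambda_{\min}$ dips below the $\sqrt{d/n}$ floor, identifiability is no longer guaranteed.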
Related papers
- Fisher-Geometric Diffusion in Stochastic Gradient Descent: Optimal Rates, Oracle Complexity, and Information-Theoretic Limits [0.0]
We develop a theory of stochastic gradient descent in which mini-batch noise acts as an intrinsic, loss-induced diffusion matrix. We prove oracle-complexity guarantees for $\epsilon$-stationarity in the Fisher dual norm.
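For intuition on the $\epsilon$-stationarity criterion, here is a hedged sketch of measuring a gradient in the Fisher dual norm, $\|g\|_{I^{-1}} = \sqrt{g^\top I^{-1} g}$; the damping term is a numerical convenience assumed here, not something the paper specifies.

```python
import numpy as np

def fisher_dual_norm(grad, fisher, damping=1e-6):
    """Dual norm ||g||_{I^{-1}} = sqrt(g^T I^{-1} g). Damping keeps the
    solve well-posed when the Fisher matrix is near-singular (an assumption)."""
    d = fisher.shape[0]
    sol = np.linalg.solve(fisher + damping * np.eye(d), grad)
    return float(np.sqrt(grad @ sol))

def is_epsilon_stationary(grad, fisher, eps):
    # Epsilon-stationarity measured in the Fisher dual norm.
    return fisher_dual_norm(grad, fisher) <= eps
```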
arXiv Detail & Related papers (2026-03-02T21:57:09Z)
- Asymptotic Theory and Phase Transitions for Variable Importance in Quantile Regression Forests [0.0]
We develop a theory for intrinsic variable importance, defined as the difference in pinball-loss risks. We prove that in the bias-dominated regime ($\ge 1/2$), standard inference breaks down: the estimator converges to a deterministic bias constant rather than to a zero-mean normal distribution.
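The pinball (quantile) loss itself is standard; the permutation-style importance below (pinball risk with one feature permuted, minus the baseline risk) is one common way to realize a "difference in pinball loss risks", offered as an illustrative sketch rather than the paper's exact estimator.

```python
import numpy as np

def pinball_loss(y, q_hat, tau):
    """Pinball loss at level tau: tau*(y - q) if y >= q, else (1 - tau)*(q - y)."""
    diff = y - q_hat
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

def pinball_importance(predict, X, y, j, tau, rng):
    """Importance of feature j as the increase in pinball risk after
    permuting that feature (illustrative definition)."""
    base = pinball_loss(y, predict(X), tau)
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    return pinball_loss(y, predict(Xp), tau) - base
```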
arXiv Detail & Related papers (2025-11-28T14:18:05Z)
- Spectral Identifiability for Interpretable Probe Geometry [0.0]
Linear probes are widely used to interpret and evaluate neural representations, yet their reliability remains unclear. We uncover a spectral mechanism behind this phenomenon and formalize it as the Spectral Identifiability Principle (SIP), a verifiable Fisher-inspired condition for probe stability. Our analysis connects eigengap geometry, sample size, and misclassification risk through finite-sample reasoning, providing an interpretable diagnostic rather than a loose generalization bound.
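The summary does not reproduce the exact SIP condition, but the flavor of an eigengap-versus-noise comparison can be sketched in a Davis-Kahan spirit: an eigenspace is trustworthy only when its eigengap dominates the sampling fluctuation, taken here at an assumed $C\sqrt{d/n}$ scale.

```python
import numpy as np

def eigengap_diagnostic(H, k, C=1.0):
    """Compare the gap between the k-th and (k+1)-th eigenvalues of the
    covariance of representations H (n x d) against a C*sqrt(d/n) noise
    scale. Davis-Kahan-style heuristic; C and the plug-in rate are assumptions."""
    n, d = H.shape
    evals = np.sort(np.linalg.eigvalsh(np.cov(H, rowvar=False)))[::-1]
    gap = evals[k - 1] - evals[k]
    return gap > C * np.sqrt(d / n), gap
```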
arXiv Detail & Related papers (2025-11-20T12:09:42Z)
- Theoretical Bounds for Stable In-Context Learning [0.0]
In-context learning (ICL) is flexible, but its reliability is sensitive to prompt length. This paper establishes a non-asymptotic lower bound that links the minimal number of demonstrations to ICL stability. We propose a two-stage observable estimator with a one-shot calibration that produces practitioner-ready prompt-length estimates.
arXiv Detail & Related papers (2025-09-25T02:25:05Z)
- Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density [93.32594873253534]
Trustworthy machine learning requires meticulous regulation of model reliance on non-robust features.
We propose a framework to delineate and regulate such features by attributing model predictions to the input.
arXiv Detail & Related papers (2024-07-05T09:16:56Z)
- On the Convergence of Gradient Descent for Large Learning Rates [55.33626480243135]
We show that convergence is impossible when a large fixed step size is used. We provide a proof of this in the case of linear neural networks with a squared loss. We also prove the impossibility of convergence for more general losses without requiring strong assumptions such as Lipschitz continuity of the gradient.
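The instability is easy to witness on the simplest quadratic: for $f(x) = \frac{L}{2}x^2$, the gradient-descent map is $x \mapsto (1 - \eta L)x$, which contracts iff $\eta < 2/L$. A toy demonstration (a caricature of, not a substitute for, the paper's linear-network analysis):

```python
# Gradient descent on f(x) = (L/2) * x^2: the update is x <- (1 - eta*L) * x,
# so eta < 2/L converges and eta > 2/L diverges geometrically.
L = 4.0  # curvature; the stability edge is eta = 2/L = 0.5
for eta in (0.4, 0.6):
    x = 1.0
    for _ in range(50):
        x -= eta * L * x
    print(f"eta={eta}: |x_50| = {abs(x):.3e}")
```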
arXiv Detail & Related papers (2024-02-20T16:01:42Z)
- A Precise Characterization of SGD Stability Using Loss Surface Geometry [8.942671556572073]
Stochastic Gradient Descent (SGD) stands as a cornerstone optimization algorithm with proven real-world empirical successes but relatively limited theoretical understanding.
Recent research has illuminated a key factor behind its practical efficacy: the implicit regularization it induces.
arXiv Detail & Related papers (2024-01-22T19:46:30Z)
- When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples.
A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction.
Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
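A minimal sketch of such a confidence-based deferral rule, where the first model's maximum softmax probability gates escalation; the threshold value and model interfaces are illustrative assumptions.

```python
import numpy as np

def cascade_predict(x, small_model, large_model, threshold=0.9):
    """Confidence-based deferral: answer with the small model unless its
    max predicted probability falls below the threshold, in which case
    the next classifier in the cascade is invoked."""
    probs = small_model(x)  # assumed to return a probability vector over classes
    if probs.max() >= threshold:
        return int(np.argmax(probs))
    return int(np.argmax(large_model(x)))
```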
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
- Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
Gradient Descent (GD) is a powerful workhorse of modern machine learning.
GD's ability to find local minimisers is only guaranteed for losses with Lipschitz gradients.
This work focuses on simple, yet representative, learning problems via analysis of two-step gradient updates.
arXiv Detail & Related papers (2022-06-08T21:32:50Z)
- A Local Convergence Theory for the Stochastic Gradient Descent Method in Non-Convex Optimization With Non-isolated Local Minima [0.0]
Non-isolated minima present a unique challenge that has remained under-explored.
In this paper, we study the local convergence of the stochastic gradient descent method to non-isolated global minima.
arXiv Detail & Related papers (2022-03-21T13:33:37Z)
- Modeling High-Dimensional Data with Unknown Cut Points: A Fusion Penalized Logistic Threshold Regression [2.520538806201793]
In traditional logistic regression models, the link function is often assumed to be linear and continuous in predictors.
We consider a threshold model in which all continuous features are discretized into ordinal levels, which in turn determine the binary responses.
We find that the lasso model is well suited to the problem of early detection and prediction of chronic diseases such as diabetes.
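The recipe above (bin each continuous predictor into ordinal levels, then fit a penalized logistic model) can be sketched with scikit-learn; quantile binning and a plain L1 penalty stand in here for the paper's more specialized fusion-penalized estimator.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

# Discretize continuous predictors into ordinal levels, then fit an
# L1-penalized logistic model (a stand-in for the fusion penalty).
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 8))
y = (X[:, 0] > 0.5).astype(int)  # a threshold effect on the first feature

model = make_pipeline(
    KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile"),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
)
model.fit(X, y)
print(f"train accuracy: {model.score(X, y):.3f}")
```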
arXiv Detail & Related papers (2022-02-17T04:16:40Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
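ACNML's point is to amortize the expensive inner computation; for intuition, here is the naive CNML it approximates, with a maximum-likelihood plug-in: refit the model once per candidate label of the query point, score that label, and normalize. The logistic model and function names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def naive_cnml_probs(X_train, y_train, x_query, labels=(0, 1)):
    """Naive (non-amortized) CNML with an MLE plug-in: for each candidate
    label y, refit on the training set augmented with (x_query, y), score
    p(y | x_query) under the refit model, then normalize over candidates."""
    scores = []
    for y in labels:
        Xa = np.vstack([X_train, x_query[None, :]])
        ya = np.append(y_train, y)
        model = LogisticRegression(max_iter=1000).fit(Xa, ya)
        proba = model.predict_proba(x_query[None, :])[0]
        scores.append(proba[list(model.classes_).index(y)])
    scores = np.asarray(scores)
    return scores / scores.sum()
```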
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)