Beyond Accuracy: A Unified Random Matrix Theory Diagnostic Framework for Crash Classification Models
- URL: http://arxiv.org/abs/2602.19528v1
- Date: Mon, 23 Feb 2026 05:42:54 GMT
- Title: Beyond Accuracy: A Unified Random Matrix Theory Diagnostic Framework for Crash Classification Models
- Authors: Ibne Farabi Shihab, Sanjeda Akter, Anuj Sharma
- Abstract summary: We introduce a diagnostic framework grounded in Random Matrix Theory (RMT) and Heavy-Tailed Self-Regularization (HTSR). We evaluate nine model families on two Iowa DOT crash classification tasks (173,512 and 371,062 records, respectively). We find that the power-law exponent $\alpha$ provides a structural quality signal: well-regularized models consistently yield $\alpha$ within $[2, 4]$ (mean $2.87 \pm 0.34$). We propose an $\alpha$-based early stopping criterion and a spectral model selection protocol, and validate both against cross-validated F1 baselines.
- Score: 6.908972852063454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crash classification models in transportation safety are typically evaluated using accuracy, F1, or AUC, yet these metrics cannot reveal whether a model is silently overfitting. We introduce a spectral diagnostic framework grounded in Random Matrix Theory (RMT) and Heavy-Tailed Self-Regularization (HTSR) that spans the ML taxonomy: weight matrices for BERT/ALBERT/Qwen2.5, out-of-fold increment matrices for XGBoost/Random Forest, empirical Hessians for Logistic Regression, induced affinity matrices for Decision Trees, and Graph Laplacians for KNN. Evaluating nine model families on two Iowa DOT crash classification tasks (173,512 and 371,062 records, respectively), we find that the power-law exponent $\alpha$ provides a structural quality signal: well-regularized models consistently yield $\alpha$ within $[2, 4]$ (mean $2.87 \pm 0.34$), while overfit variants show $\alpha < 2$ or spectral collapse. We observe a strong rank correlation between $\alpha$ and expert agreement (Spearman $\rho = 0.89$, $p < 0.001$), suggesting spectral quality captures model behaviors aligned with expert reasoning. We propose an $\alpha$-based early stopping criterion and a spectral model selection protocol, and validate both against cross-validated F1 baselines. Sparse Lanczos approximations make the framework scalable to large datasets.
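The abstract does not include an implementation, but a minimal sketch of the HTSR-style diagnostic it describes can be written in a few lines: compute the empirical spectral density (ESD) of a weight matrix, fit a power-law tail exponent $\alpha$, and compare it against the reported $[2, 4]$ band. The tail fit below is a standard Hill/Clauset-style MLE used as a stand-in; the authors' exact fitting procedure, per-layer aggregation, and thresholds may differ, and the function names (`esd`, `fit_alpha`, `spectral_flag`) are illustrative only.

```python
# Minimal sketch (not the authors' code): estimate an HTSR-style power-law
# exponent alpha from one weight matrix's empirical spectral density (ESD),
# then apply the [2, 4] band the abstract reports for well-regularized models.
import numpy as np

def esd(weight: np.ndarray) -> np.ndarray:
    """Eigenvalues of the Gram (correlation) matrix of a weight matrix."""
    # Use the smaller Gram matrix; nonzero eigenvalues agree either way, and
    # the overall scale does not affect a ratio-based tail-exponent estimate.
    gram = weight @ weight.T if weight.shape[0] <= weight.shape[1] else weight.T @ weight
    eig = np.linalg.eigvalsh(gram)
    return eig[eig > 1e-12]

def fit_alpha(eigvals: np.ndarray, tail_fraction: float = 0.2) -> float:
    """Continuous power-law MLE (Hill estimator) on the upper tail of the ESD:
    alpha_hat = 1 + k / sum_i log(lambda_i / lambda_min) over the k largest eigenvalues."""
    lam = np.sort(eigvals)[::-1]
    k = min(max(int(tail_fraction * lam.size), 5), lam.size)
    tail = lam[:k]
    return 1.0 + k / float(np.sum(np.log(tail / tail[-1])))

def spectral_flag(alpha: float) -> str:
    """Map alpha onto the bands reported in the abstract (illustrative thresholds)."""
    if alpha < 2.0:
        return "alpha < 2: possible overfitting / spectral collapse"
    if alpha <= 4.0:
        return "alpha in [2, 4]: well-regularized band"
    return "alpha > 4: outside the reported well-regularized band"

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((768, 3072)) / np.sqrt(3072)  # toy dense layer, not a trained model
    a = fit_alpha(esd(W))
    print(f"alpha ~= {a:.2f} -> {spectral_flag(a)}")
```

For the $\alpha$-based early-stopping idea, one would track such per-layer estimates across training epochs and flag runs whose $\alpha$ drifts below 2; libraries such as WeightWatcher implement closely related HTSR diagnostics, though the paper's own protocol may differ.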
Related papers
- ReLE: A Scalable System and Structured Benchmark for Diagnosing Capability Anisotropy in Chinese LLMs [37.23311145049677]
We present ReLE, a scalable system designed to diagnose Capability Anisotropy. We evaluate 304 models across a Domain $\times$ Capability Symbolic matrix comprising 207,843 samples.
arXiv Detail & Related papers (2026-01-24T09:57:59Z)
- Spectral Sentinel: Scalable Byzantine-Robust Decentralized Federated Learning via Sketched Random Matrix Theory on Blockchain [0.0]
Byzantine clients poison gradients under heterogeneous (Non-IID) data. We propose Spectral Sentinel, a Byzantine detection and aggregation framework. We implement the full system with blockchain integration on Polygon networks.
arXiv Detail & Related papers (2025-12-14T09:43:03Z)
- Skewness-Robust Causal Discovery in Location-Scale Noise Models [47.09233752567902]
We propose SkewD, a likelihood-based algorithm for causal discovery under location-scale noise models. SkewD extends the usual normal-distribution framework to the skew-normal setting, enabling reliable inference under symmetric and skewed noise. We evaluate SkewD on novel synthetically generated datasets with skewed noise as well as established benchmark datasets.
arXiv Detail & Related papers (2025-11-18T12:40:41Z)
- Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification [50.717692060500696]
Next-token prediction with the logarithmic loss is a cornerstone of autoregressive sequence modeling. Next-token prediction can be made robust so as to achieve $C = \tilde{O}(H)$, representing moderate error amplification. No computationally efficient algorithm can achieve a sub-polynomial approximation factor $C = e^{(\log H)^{1-\Omega(1)}}$.
arXiv Detail & Related papers (2025-02-18T02:52:00Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- On the Identifiability and Estimation of Causal Location-Scale Noise Models [122.65417012597754]
We study the class of location-scale or heteroscedastic noise models (LSNMs).
We show the causal direction is identifiable up to some pathological cases.
We propose two estimators for LSNMs: an estimator based on (non-linear) feature maps, and one based on neural networks.
arXiv Detail & Related papers (2022-10-13T17:18:59Z)
- Developing and Improving Risk Models using Machine-learning Based Algorithms [6.245537312562826]
The objective of this study is to develop a good risk model for classifying business delinquency.
The rationale of the analysis is first to obtain good base binary classifiers via regularization.
Two model ensembling algorithms, bagging and boosting, are then applied to the good base classifiers for further model improvement.
arXiv Detail & Related papers (2020-09-09T20:38:00Z)
- The Generalized Lasso with Nonlinear Observations and Generative Priors [63.541900026673055]
We make the assumption of sub-Gaussian measurements, which is satisfied by a wide range of measurement models.
We show that our result can be extended to the uniform recovery guarantee under the assumption of a so-called local embedding property.
arXiv Detail & Related papers (2020-06-22T16:43:35Z)
- Towards Assessment of Randomized Smoothing Mechanisms for Certifying Adversarial Robustness [50.96431444396752]
We argue that the main difficulty is how to assess the appropriateness of each randomized mechanism.
We first conclude that the Gaussian mechanism is indeed an appropriate option to certify the $\ell_2$-norm.
Surprisingly, we show that the Gaussian mechanism is also an appropriate option for certifying the $\ell_\infty$-norm, instead of the Exponential mechanism.
arXiv Detail & Related papers (2020-05-15T03:54:53Z)
- A Precise High-Dimensional Asymptotic Theory for Boosting and Minimum-$\ell_1$-Norm Interpolated Classifiers [3.167685495996986]
This paper establishes a precise high-dimensional theory for boosting on separable data.
Under a class of statistical models, we provide an exact analysis of the generalization error of boosting.
We also explicitly pin down the relation between the boosting test error and the optimal Bayes error.
arXiv Detail & Related papers (2020-02-05T00:24:53Z)