On the Limits of Interpretable Machine Learning in Quintic Root Classification
- URL: http://arxiv.org/abs/2602.23467v1
- Date: Thu, 26 Feb 2026 19:53:41 GMT
- Title: On the Limits of Interpretable Machine Learning in Quintic Root Classification
- Authors: Rohan Thomas, Majid Bani-Yaghoub
- Abstract summary: We test an extensive set of Machine Learning models, including decision trees, logistic regression, support vector machines, random forests, gradient boosting, XGBoost, symbolic regression, and neural networks. We find no evidence that the evaluated ML models autonomously recover discrete, human-interpretable mathematical rules from raw coefficients. These results suggest that, in structured mathematical domains, interpretability may require explicit structural inductive bias rather than purely data-driven approximation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Can Machine Learning (ML) autonomously recover interpretable mathematical structure from raw numerical data? We aim to answer this question using the classification of real-root configurations of polynomials up to degree five as a structured benchmark. We tested an extensive set of ML models, including decision trees, logistic regression, support vector machines, random forests, gradient boosting, XGBoost, symbolic regression, and neural networks. Neural networks achieved strong in-distribution performance on quintic classification using raw coefficients alone (84.3% ± 0.9% balanced accuracy), whereas decision trees performed substantially worse (59.9% ± 0.9%). However, when provided with an explicit feature capturing sign changes at critical points, decision trees match neural performance (84.2% ± 1.2%) and yield explicit classification rules. Knowledge distillation reveals that this single invariant accounts for 97.5% of the extracted decision structure. Out-of-distribution, data-efficiency, and noise-robustness analyses indicate that neural networks learn continuous, data-dependent geometric approximations of the decision boundary rather than recovering scale-invariant symbolic rules. This distinction between geometric approximation and symbolic invariance explains the gap between predictive performance and interpretability observed across models. Although high predictive accuracy is attainable, we find no evidence that the evaluated ML models autonomously recover discrete, human-interpretable mathematical rules from raw coefficients. These results suggest that, in structured mathematical domains, interpretability may require explicit structural inductive bias rather than purely data-driven approximation.
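The sign-change invariant at the heart of the abstract can be sketched in a few lines (an illustrative reconstruction, not the authors' code): for a polynomial with simple roots, the number of real roots equals the number of sign changes in the sequence of its values at -∞, at its real critical points in increasing order, and at +∞.

```python
import numpy as np

def count_real_roots_via_sign_changes(coeffs):
    """Count real roots of a polynomial (simple roots assumed) by counting
    sign changes of its values at -inf, its real critical points, and +inf.

    coeffs: highest-degree-first, as in np.roots / np.polyval.
    """
    p = np.poly1d(coeffs)
    dp = p.deriv()
    # Real critical points of p, in increasing order
    crit = np.sort([r.real for r in dp.roots if abs(r.imag) < 1e-9])
    vals = [p(x) for x in crit]
    # Limits at -inf/+inf follow from the leading coefficient and degree
    lead, n = p.coeffs[0], p.order
    seq = [lead * (-1) ** n * np.inf] + vals + [lead * np.inf]
    # Drop exact zeros (a zero at a critical point means a repeated root,
    # which this simple sketch does not handle)
    signs = [np.sign(v) for v in seq if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)
```

For example, x³ - x has critical points at ±1/√3, giving the sign sequence -, +, -, + and hence three real roots; the quintic x⁵ - 5x³ + 4x yields five sign changes and five real roots.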
Related papers
- Differentiated Thyroid Cancer Recurrence Classification Using Machine Learning Models and Bayesian Neural Networks with Varying Priors: A SHAP-Based Interpretation of the Best Performing Model [0.0]
Differentiated thyroid cancer (DTC) recurrence is a major public health concern. This study introduces a comprehensive framework for DTC recurrence classification using a dataset containing 383 patients.
arXiv Detail & Related papers (2025-07-25T06:31:31Z) - Advancing Tabular Stroke Modelling Through a Novel Hybrid Architecture and Feature-Selection Synergy [0.9999629695552196]
The present work develops and validates a data-driven and interpretable machine-learning framework designed to predict strokes. Ten routinely gathered demographic, lifestyle, and clinical variables were sourced from a public cohort of 4,981 records. The proposed model achieved an accuracy rate of 97.2% and an F1-score of 97.15%, indicating a significant enhancement compared to the leading individual model.
arXiv Detail & Related papers (2025-05-18T21:46:45Z) - Interpretable Machine Learning for Kronecker Coefficients [0.0]
We employ interpretable machine learning models to predict whether the Kronecker coefficients of the symmetric group are zero or not. We achieve an accuracy of approximately 83% and derive explicit formulas for a decision function in terms of b-loadings.
arXiv Detail & Related papers (2025-02-17T13:07:37Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression. We use the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning. Our results extend and provide a unifying perspective on earlier models of scaling laws.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Trade-Offs of Diagonal Fisher Information Matrix Estimators [53.35448232352667]
The Fisher information matrix can be used to characterize the local geometry of the parameter space of neural networks.
We examine two popular estimators whose accuracy and sample complexity depend on their associated variances.
We derive bounds of the variances and instantiate them in neural networks for regression and classification.
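To make that setting concrete, here is a small sketch (my own illustration; the summary does not name the two estimators the paper studies) contrasting the closed-form Fisher diagonal of logistic regression with a Monte Carlo estimator that samples labels from the model. Both target F_ii = (1/N) Σ_n p_n(1 - p_n) x_{n,i}², but the sampled version carries estimation variance, which is the kind of trade-off the paper analyzes.

```python
import numpy as np

def fisher_diag_exact(X, w):
    """Exact diagonal of the Fisher information for logistic regression:
    F_ii = (1/N) * sum_n p_n (1 - p_n) x_{n,i}^2."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return (p * (1 - p)) @ (X ** 2) / len(X)

def fisher_diag_mc(X, w, n_samples=2000, seed=0):
    """Monte Carlo estimator: sample labels y from the model's predictive
    distribution and average the squared per-example score vectors."""
    rng = np.random.default_rng(seed)
    p = 1.0 / (1.0 + np.exp(-X @ w))
    est = np.zeros(X.shape[1])
    for _ in range(n_samples):
        y = rng.random(len(p)) < p          # labels drawn from the model
        g = (y - p)[:, None] * X            # per-example score (log-lik gradient)
        est += (g ** 2).mean(axis=0)
    return est / n_samples
```

Since E[(y - p)²] = p(1 - p) under the model, the Monte Carlo estimate converges to the exact diagonal as the number of sampled label sets grows.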
arXiv Detail & Related papers (2024-02-08T03:29:10Z) - Automated Learning of Interpretable Models with Quantified Uncertainty [0.0]
We introduce a new framework for genetic-programming-based symbolic regression (GPSR)
GPSR uses model evidence to formulate replacement probability during the selection phase of evolution.
It is shown to increase interpretability, improve robustness to noise, and reduce overfitting when compared to a conventional GPSR implementation.
arXiv Detail & Related papers (2022-04-12T19:56:42Z) - Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z) - A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z) - Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of its stochasticity in that success is still unclear.
We show that multiplicative noise commonly arises in the parameter dynamics due to variance in the stochastic updates.
A detailed analysis is conducted in which we describe how key factors, including step size and data, produce similar behavior on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on interplay between the deterministic convergence rate of the algorithm at the population level, and its degree of (instability) when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.