Related papers: Interpretable Machine Learning for Kronecker Coefficients

Related papers

On the Limits of Interpretable Machine Learning in Quintic Root Classification [0.0]
We test an extensive set of Machine Learning models, including decision trees, logistic regression, support vector machines, random forest, gradient boosting, XGBoost, symbolic regression, and neural networks. We find no evidence that the evaluated ML models autonomously recover discrete, human-interpretable mathematical rules from raw coefficients. These results suggest that, in structured mathematical domains, interpretability may require explicit structural inductive bias rather than purely data-driven approximation.
arXiv Detail & Related papers (2026-02-26T19:53:41Z)
Detecting and Pruning Prominent but Detrimental Neurons in Large Language Models [68.57424628540907]
Large language models (LLMs) often develop learned mechanisms specialized to specific datasets. We introduce a fine-tuning approach designed to enhance generalization by identifying and pruning neurons associated with dataset-specific mechanisms. Our method employs Integrated Gradients to quantify each neuron's influence on high-confidence predictions, pinpointing those that disproportionately contribute to dataset-specific performance.
arXiv Detail & Related papers (2025-07-12T08:10:10Z)
Pure Component Property Estimation Framework Using Explainable Machine Learning Methods [4.8601239628666635]
The molecular representation method based on the connectivity matrix effectively considers atomic bonding relationships to automatically generate features. The prediction results for normal boiling point (Tb), liquid molar volume, critical temperature (Tc) and critical pressure (Pc) are obtained using Artificial Neural Network and Gaussian Process Regression models. To enhance the interpretability of the model, a feature analysis method based on Shapley values is employed to determine the contribution of each feature to the property predictions.
arXiv Detail & Related papers (2025-05-14T20:21:23Z)
Adaptive Basis Function Selection for Computationally Efficient Predictions [2.1499203845437216]
We develop a method to automatically select the most important BFs for prediction in a sub-domain of the model domain. This significantly reduces the computational complexity of computing predictions while maintaining predictive accuracy.
arXiv Detail & Related papers (2024-08-14T11:53:18Z)
Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models. We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
Variational Bayesian surrogate modelling with application to robust design optimisation [0.9626666671366836]
Surrogate models provide a quick-to-evaluate approximation to complex computational models. We consider Bayesian inference for constructing statistical surrogates with input uncertainties and dimensionality reduction. We demonstrate intrinsic and robust structural optimisation problems where cost functions depend on a weighted sum of the mean and standard deviation of model outputs.
arXiv Detail & Related papers (2024-04-23T09:22:35Z)
Data-free Weight Compress and Denoise for Large Language Models [101.53420111286952]
We propose a novel approach termed Data-free Joint Rank-k Approximation for compressing the parameter matrices. We achieve a model pruning of 80% parameters while retaining 93.43% of the original performance without any calibration data.
arXiv Detail & Related papers (2024-02-26T05:51:47Z)
Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
A probabilistic, data-driven closure model for RANS simulations with aleatoric, model uncertainty [1.8416014644193066]
We propose a data-driven, closure model for Reynolds-averaged Navier-Stokes (RANS) simulations that incorporates aleatoric, model uncertainty. A fully Bayesian formulation is proposed, combined with a sparsity-inducing prior in order to identify regions in the problem domain where the parametric closure is insufficient.
arXiv Detail & Related papers (2023-07-05T16:53:31Z)
Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation [58.80806716024701]
We study the global structure of attention scores computed using dot-product based self-attention. We find that most of the variation among attention scores lies in a low-dimensional eigenspace. We propose to compute scores only for a partial subset of token pairs, and use them to estimate scores for the remaining pairs.
arXiv Detail & Related papers (2021-06-16T14:38:42Z)
Non-Asymptotic Performance Guarantees for Neural Estimation of $\mathsf{f}$-Divergences [22.496696555768846]
Statistical distances quantify the dissimilarity between probability distributions. A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it. This paper explores this tradeoff by means of non-asymptotic error bounds, focusing on three popular choices of SDs.
arXiv Detail & Related papers (2021-03-11T19:47:30Z)
Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Empirical optimization is central to modern machine learning, but its role in its success is still unclear. We show that it commonly arises in parameters of discrete multiplicative noise due to variance. A detailed analysis is conducted in which we describe key factors, including recent step size, and data, all exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties. We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE). When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.