Related papers: Explainable Neural Networks with Guarantees: A Sparse Estimation Approach

Explainable Neural Networks with Guarantees: A Sparse Estimation Approach

URL: http://arxiv.org/abs/2501.02010v2
Date: Tue, 18 Feb 2025 22:45:28 GMT
Title: Explainable Neural Networks with Guarantees: A Sparse Estimation Approach
Authors: Antoine Ledent, Peng Liu,
Abstract summary: This paper introduces a novel approach to constructing an explainable neural network that harmonizes predictiveness and explainability.<n>Our model, termed SparXnet, is designed as a linear combination of a sparse set of jointly learned features.<n>Our research paves the way for further research on sparse and explainable neural networks with guarantee.
Score: 11.142723510517778
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Balancing predictive power and interpretability has long been a challenging research area, particularly in powerful yet complex models like neural networks, where nonlinearity obstructs direct interpretation. This paper introduces a novel approach to constructing an explainable neural network that harmonizes predictiveness and explainability. Our model, termed SparXnet, is designed as a linear combination of a sparse set of jointly learned features, each derived from a different trainable function applied to a single 1-dimensional input feature. Leveraging the ability to learn arbitrarily complex relationships, our neural network architecture enables automatic selection of a sparse set of important features, with the final prediction being a linear combination of rescaled versions of these features. We demonstrate the ability to select significant features while maintaining comparable predictive performance and direct interpretability through extensive experiments on synthetic and real-world datasets. We also provide theoretical analysis on the generalization bounds of our framework, which is favorably linear in the number of selected features and only logarithmic in the number of input features. We further lift any dependence of sample complexity on the number of parameters or the architectural details under very mild conditions. Our research paves the way for further research on sparse and explainable neural networks with guarantee.

Related papers

Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization [66.03821840425539]
In this paper, we investigate the training dynamics of $L$-layer neural networks using the tensor gradient program (SGD) framework. We show that SGD enables these networks to learn linearly independent features that substantially deviate from their initial values. This rich feature space captures relevant data information and ensures that any convergent point of the training process is a global minimum.
arXiv Detail & Related papers (2025-03-12T17:33:13Z)
A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities [30.737171081270322]
We study how fully-connected two-layer neural networks adapt to the target function after a single, but aggressive, gradient descent step. This provides a sharp description of the impact of feature learning in the generalization of two-layer neural networks, beyond the random features and lazy training regimes.
arXiv Detail & Related papers (2024-10-24T17:24:34Z)
Relational Composition in Neural Networks: A Survey and Call to Action [54.47858085003077]
Many neural nets appear to represent data as linear combinations of "feature vectors" We argue that this success is incomplete without an understanding of relational composition.
arXiv Detail & Related papers (2024-07-19T20:50:57Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Going Beyond Neural Network Feature Similarity: The Network Feature Complexity and Its Interpretation Using Category Theory [64.06519549649495]
We provide the definition of what we call functionally equivalent features. These features produce equivalent output under certain transformations. We propose an efficient algorithm named Iterative Feature Merging.
arXiv Detail & Related papers (2023-10-10T16:27:12Z)
The Contextual Lasso: Sparse Linear Models via Deep Neural Networks [5.607237982617641]
We develop a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features. An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso.
arXiv Detail & Related papers (2023-02-02T05:00:29Z)
Seeking Interpretability and Explainability in Binary Activated Neural Networks [2.828173677501078]
We study the use of binary activated neural networks as interpretable and explainable predictors in the context of regression tasks. We present an approach based on the efficient computation of SHAP values for quantifying the relative importance of the features, hidden neurons and even weights.
arXiv Detail & Related papers (2022-09-07T20:11:17Z)
Cardinality-Minimal Explanations for Monotonic Neural Networks [25.212444848632515]
In this paper, we investigate whether tractability can be regained by focusing on neural models implementing a monotonic function. Although the relevant decision problems remain intractable, we can show that they become solvable in favourable time.
arXiv Detail & Related papers (2022-05-19T23:47:25Z)
An Information-Theoretic Framework for Supervised Learning [22.280001450122175]
We propose a novel information-theoretic framework with its own notions of regret and sample complexity. We study the sample complexity of learning from data generated by deep neural networks with ReLU activation units. We conclude by corroborating our theoretical results with experimental analysis of random single-hidden-layer neural networks.
arXiv Detail & Related papers (2022-03-01T05:58:28Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation. Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis. By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner. This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
Consistent feature selection for neural networks via Adaptive Group Lasso [3.42658286826597]
We propose and establish a theoretical guarantee for the use of the adaptive group for selecting important features of neural networks. Specifically, we show that our feature selection method is consistent for single-output feed-forward neural networks with one hidden layer and hyperbolic tangent activation function.
arXiv Detail & Related papers (2020-05-30T18:50:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.