Related papers: Learning Active Subspaces and Discovering Important Features with Gaussian Radial Basis Functions Neural Networks

Learning Active Subspaces and Discovering Important Features with Gaussian Radial Basis Functions Neural Networks

URL: http://arxiv.org/abs/2307.05639v2
Date: Sat, 11 May 2024 04:11:23 GMT
Title: Learning Active Subspaces and Discovering Important Features with Gaussian Radial Basis Functions Neural Networks
Authors: Danny D'Agostino, Ilija Ilievski, Christine Annette Shoemaker,
Abstract summary: We show that precious information is contained in the spectrum of the precision matrix that can be extracted once the training of the model is completed. We conducted numerical experiments for regression, classification, and feature selection tasks. Our results demonstrate that the proposed model does not only yield an attractive prediction performance compared to the competitors.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Providing a model that achieves a strong predictive performance and is simultaneously interpretable by humans is one of the most difficult challenges in machine learning research due to the conflicting nature of these two objectives. To address this challenge, we propose a modification of the radial basis function neural network model by equipping its Gaussian kernel with a learnable precision matrix. We show that precious information is contained in the spectrum of the precision matrix that can be extracted once the training of the model is completed. In particular, the eigenvectors explain the directions of maximum sensitivity of the model revealing the active subspace and suggesting potential applications for supervised dimensionality reduction. At the same time, the eigenvectors highlight the relationship in terms of absolute variation between the input and the latent variables, thereby allowing us to extract a ranking of the input variables based on their importance to the prediction task enhancing the model interpretability. We conducted numerical experiments for regression, classification, and feature selection tasks, comparing our model against popular machine learning models, the state-of-the-art deep learning-based embedding feature selection techniques, and a transformer model for tabular data. Our results demonstrate that the proposed model does not only yield an attractive prediction performance compared to the competitors but also provides meaningful and interpretable results that potentially could assist the decision-making process in real-world applications. A PyTorch implementation of the model is available on GitHub at the following link. https://github.com/dannyzx/Gaussian-RBFNN

Related papers

IGANN Sparse: Bridging Sparsity and Interpretability with Non-linear Insight [4.010646933005848]
IGANN Sparse is a novel machine learning model from the family of generalized additive models. It promotes sparsity through a non-linear feature selection process during training. This ensures interpretability through improved model sparsity without sacrificing predictive performance.
arXiv Detail & Related papers (2024-03-17T22:44:36Z)
The Contextual Lasso: Sparse Linear Models via Deep Neural Networks [5.607237982617641]
We develop a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features. An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso.
arXiv Detail & Related papers (2023-02-02T05:00:29Z)
Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data. Main aim of the identified model is to predict new data from previous observations. We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain. In this work, we build models that rely on the message-passing rule of predictive coding. We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models. We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation [3.9318191265352196]
We propose a semisupervised discrete latent variable model for multi-class text classification and text generation. The proposed model employs the concept of transfer learning for training a quantized transformer model. Experimental results indicate that the proposed model has surpassed the state-of-the-art models remarkably.
arXiv Detail & Related papers (2021-11-10T07:07:54Z)
X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning. To take the power of both worlds, we propose a novel X-model. X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning. Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is more interesting to understand the properties of the model, which parts could be replaced to obtain better results. We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
Gaussian Function On Response Surface Estimation [12.35564140065216]
We propose a new framework for interpreting (features and samples) black-box machine learning models via a metamodeling technique. The metamodel can be estimated from data generated via a trained complex model by running the computer experiment on samples of data in the region of interest.
arXiv Detail & Related papers (2021-01-04T04:47:00Z)
Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward. We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.