Neural Basis Models for Interpretability
- URL: http://arxiv.org/abs/2205.14120v1
- Date: Fri, 27 May 2022 17:31:19 GMT
- Title: Neural Basis Models for Interpretability
- Authors: Filip Radenovic, Abhimanyu Dubey and Dhruv Mahajan
- Abstract summary: Generalized Additive Models (GAMs) are an inherently interpretable class of models.
We propose an entirely new subfamily of GAMs that utilizes basis decomposition of shape functions.
A small number of basis functions are shared among all features, and are learned jointly for a given task.
- Score: 33.51591891812176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the widespread use of complex machine learning models in real-world
applications, it is becoming critical to explain model predictions. However,
these models are typically black-box deep neural networks, explained post-hoc
via methods with known faithfulness limitations. Generalized Additive Models
(GAMs) are an inherently interpretable class of models that address this
limitation by learning a non-linear shape function for each feature separately,
followed by a linear model on top. However, these models are typically
difficult to train, require numerous parameters, and are difficult to scale.
We propose an entirely new subfamily of GAMs that utilizes basis
decomposition of shape functions. A small number of basis functions are shared
among all features, and are learned jointly for a given task, thus making our
model scale much better to large-scale data with high-dimensional features,
especially when features are sparse. We propose an architecture denoted as the
Neural Basis Model (NBM) which uses a single neural network to learn these
bases. On a variety of tabular and image datasets, we demonstrate that for
interpretable machine learning, NBMs are the state-of-the-art in accuracy,
model size, and throughput, and can easily model all higher-order feature
interactions.
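The basis decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hidden-layer basis network, the layer sizes, the random untrained weights, and the names `init_nbm`, `basis_values`, `feature_contributions`, and `predict` are all assumptions made for illustration. It only shows the structure: one small network maps a scalar feature value to a handful of basis outputs, every feature reuses that network, and each feature's shape function is its own linear combination of the shared bases.

```python
import random


def init_nbm(num_features, num_bases, hidden=8, seed=0):
    """Random (untrained) parameters for a toy basis model; sizes are illustrative."""
    rng = random.Random(seed)

    def gauss():
        return rng.gauss(0.0, 1.0)

    return {
        # Shared basis network: scalar input -> hidden units -> num_bases outputs.
        "w1": [gauss() for _ in range(hidden)],
        "b1": [gauss() for _ in range(hidden)],
        "w2": [[gauss() for _ in range(hidden)] for _ in range(num_bases)],
        # Per-feature mixing coefficients: the only feature-specific parameters.
        "coef": [[gauss() for _ in range(num_bases)] for _ in range(num_features)],
        "bias": 0.0,
    }


def basis_values(params, x):
    """Evaluate all shared basis functions at one scalar input x."""
    hidden = [max(0.0, w * x + b) for w, b in zip(params["w1"], params["b1"])]
    return [sum(w * h for w, h in zip(row, hidden)) for row in params["w2"]]


def feature_contributions(params, x_row):
    """Shape-function value f_i(x_i) for each feature i of one example."""
    return [
        sum(a * b for a, b in zip(coef_i, basis_values(params, x_i)))
        for coef_i, x_i in zip(params["coef"], x_row)
    ]


def predict(params, x_row):
    """Additive prediction: bias plus the sum of per-feature contributions."""
    return params["bias"] + sum(feature_contributions(params, x_row))


params = init_nbm(num_features=4, num_bases=3)
example = [0.5, -1.0, 2.0, 0.0]
print(feature_contributions(params, example))  # one interpretable term per feature
print(predict(params, example))
```

In this sketch the feature-specific parameter count grows as (number of features) x (number of bases), while the basis network is paid for once. That is the intuition behind the scaling claim, in contrast to a GAM that trains a separate network per feature.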
Related papers
- Learning to Walk from Three Minutes of Real-World Data with Semi-structured Dynamics Models [9.318262213262866]
We introduce a novel framework for learning semi-structured dynamics models for contact-rich systems.
We make accurate long-horizon predictions with substantially less data than prior methods.
We validate our approach on a real-world Unitree Go1 quadruped robot.
arXiv Detail & Related papers (2024-10-11T18:11:21Z)
- Neural Network-Based Piecewise Survival Models [0.3999851878220878]
A family of neural network-based survival models is presented.
The models can be seen as an extension of the commonly used discrete-time and piecewise exponential models.
arXiv Detail & Related papers (2024-03-27T15:08:00Z)
- Accurate deep learning sub-grid scale models for large eddy simulations [0.0]
We present two families of sub-grid scale (SGS) turbulence models developed for large-eddy simulation (LES) purposes.
Their development required the formulation of physics-informed robust and efficient Deep Learning (DL) algorithms.
Explicit filtering of data from direct simulations of canonical channel flow at two friction Reynolds numbers provided accurate data for training and testing.
arXiv Detail & Related papers (2023-07-19T15:30:06Z)
- Interpreting Black-box Machine Learning Models for High Dimensional Datasets [40.09157165704895]
We train a black-box model on a high-dimensional dataset to learn the embeddings on which the classification is performed.
We then approximate the behavior of the black-box model by means of an interpretable surrogate model on the top-k feature space.
Our approach outperforms state-of-the-art methods like TabNet and XGBoost when tested on different datasets.
arXiv Detail & Related papers (2022-08-29T07:36:17Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery has proposed to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- On the balance between the training time and interpretability of neural ODE for time series modelling [77.34726150561087]
The paper shows that modern neural ODE cannot be reduced to simpler models for time-series modelling applications.
The complexity of neural ODE is comparable to or exceeds that of conventional time-series modelling tools.
We propose a new view on time-series modelling using combined neural networks and an ODE system approach.
arXiv Detail & Related papers (2022-06-07T13:49:40Z)
- Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- A Simple and Interpretable Predictive Model for Healthcare [0.0]
Deep learning models are currently dominating most state-of-the-art solutions for disease prediction.
These deep learning models, with trainable parameters running into millions, require huge amounts of compute and data to train and deploy.
We develop a simpler yet interpretable non-deep learning based model for application to EHR data.
arXiv Detail & Related papers (2020-07-27T08:13:37Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)