Neural Generators of Sparse Local Linear Models for Achieving both
Accuracy and Interpretability
- URL: http://arxiv.org/abs/2003.06441v1
- Date: Fri, 13 Mar 2020 18:49:36 GMT
- Title: Neural Generators of Sparse Local Linear Models for Achieving both
Accuracy and Interpretability
- Authors: Yuya Yoshikawa, Tomoharu Iwata
- Abstract summary: We propose neural generators of sparse local linear models (NGSLLs).
NGSLLs generate sparse linear weights for each sample using deep neural networks (DNNs).
We demonstrate the effectiveness of the NGSLL quantitatively and qualitatively by evaluating prediction performance and visualizing generated weights on image and text classification tasks.
- Score: 28.90948136731314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For reliability, it is important that the predictions made by machine
learning methods are interpretable by humans. In general, deep neural networks
(DNNs) can provide accurate predictions, but it is difficult to interpret why a
DNN arrives at a particular prediction. On the other hand, linear models are
easy to interpret, but their predictive performance tends to be low because
real-world data are often intrinsically non-linear. To combine the high
predictive performance of DNNs with the high interpretability of linear models
in a single model, we propose neural generators of sparse local linear models
(NGSLLs). The sparse local linear models are highly flexible because they can
approximate non-linear functions. The NGSLL generates sparse linear weights for
each sample using DNNs that take the original representation of each sample
(e.g., a word sequence) and its simplified representation (e.g., bag-of-words)
as input. By extracting features from the original representations, the weights
can carry rich information that yields high predictive performance.
Additionally, the prediction is interpretable because it is obtained by the
inner product between the simplified representations and the sparse weights,
where only a small number of weights are selected by our gate module in the
NGSLL. In experiments with real-world datasets, we demonstrate the
effectiveness of the NGSLL quantitatively and qualitatively by evaluating
prediction performance and visualizing generated weights on image and text
classification tasks.
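As a concrete illustration of the mechanism described above, the sketch below
generates per-sample sparse linear weights from a rich encoding of the input and
applies them to a simplified representation via an inner product. It is a
minimal, assumed PyTorch rendering for exposition only: the encoder producing
the input features, the module names, and the hard top-k gate are illustrative
stand-ins, not the paper's actual gate module or architecture.

```python
# Minimal sketch (assumed PyTorch) of a per-sample sparse-weight generator in
# the spirit of the NGSLL. The encoder behind "encoded_original", the module
# names, and the hard top-k gate are illustrative assumptions.
import torch
import torch.nn as nn


class SparseLocalLinearGenerator(nn.Module):
    def __init__(self, enc_dim, simple_dim, k):
        super().__init__()
        self.k = k  # number of non-zero weights kept per sample
        # DNN mapping the encoded original representation to dense per-sample weights
        self.weight_net = nn.Sequential(
            nn.Linear(enc_dim, 256), nn.ReLU(), nn.Linear(256, simple_dim)
        )

    def forward(self, encoded_original, simplified):
        # encoded_original: (batch, enc_dim), e.g. CNN/RNN features of the raw input
        # simplified: (batch, simple_dim), e.g. bag-of-words counts
        dense_w = self.weight_net(encoded_original)
        # Keep only the k largest-magnitude weights per sample (hard gate)
        topk_idx = dense_w.abs().topk(self.k, dim=-1).indices
        gate = torch.zeros_like(dense_w).scatter_(-1, topk_idx, 1.0)
        sparse_w = dense_w * gate
        # Interpretable prediction: inner product of simplified features and sparse weights
        prediction = (sparse_w * simplified).sum(dim=-1)
        return prediction, sparse_w
```

In the paper, the small set of non-zero weights is selected by the gate module
inside the network; the hard top-k above is merely the simplest stand-in for
that selection.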
Related papers
- The Contextual Lasso: Sparse Linear Models via Deep Neural Networks [5.607237982617641]
We develop a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features.
An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso.
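The following is a minimal sketch of the context-dependent sparsity idea
summarized above, assuming PyTorch: a small network maps contextual features to
per-sample coefficients, the prediction remains linear in the explanatory
features, and an L1 penalty stands in for the lasso-style sparsity. The names
and the penalty are illustrative assumptions, not the paper's estimator.

```python
# Minimal sketch (assumed PyTorch) of context-dependent sparse linear modelling:
# a network maps contextual features z to coefficients beta(z); the prediction
# is linear in the explanatory features x; an L1 penalty encourages sparsity.
import torch
import torch.nn as nn


class ContextualLinear(nn.Module):
    def __init__(self, ctx_dim, exp_dim, hidden=64):
        super().__init__()
        self.coef_net = nn.Sequential(
            nn.Linear(ctx_dim, hidden), nn.ReLU(), nn.Linear(hidden, exp_dim)
        )

    def forward(self, x_exp, z_ctx):
        beta = self.coef_net(z_ctx)         # context-dependent coefficients
        y_hat = (beta * x_exp).sum(dim=-1)  # still linear in the explanatory features
        return y_hat, beta


def contextual_lasso_loss(y_hat, y, beta, lam=1e-2):
    # Squared error plus an L1 penalty that pushes many coefficients toward zero
    return ((y_hat - y) ** 2).mean() + lam * beta.abs().mean()
```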
arXiv Detail & Related papers (2023-02-02T05:00:29Z) - Neural Additive Models for Location Scale and Shape: A Framework for
Interpretable Neural Regression Beyond the Mean [1.0923877073891446]
Deep neural networks (DNNs) have proven to be highly effective in a variety of tasks.
Despite this success, the inner workings of DNNs are often not transparent.
This lack of interpretability has led to increased research on inherently interpretable neural networks.
arXiv Detail & Related papers (2023-01-27T17:06:13Z) - Neural networks trained with SGD learn distributions of increasing
complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics, and exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z) - Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) amounts to finding a small subset of the input graph's features (a rationale) that guides the model's prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z) - Locally Sparse Networks for Interpretable Predictions [7.362415721170984]
We propose a framework for training locally sparse neural networks where the local sparsity is learned via a sample-specific gating mechanism.
The sample-specific sparsity is predicted via a gating network, which is trained in tandem with the prediction network.
We demonstrate that our method outperforms state-of-the-art models when predicting the target function with far fewer features per instance.
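A minimal sketch of this sample-specific gating idea, assuming PyTorch: the
gating network emits a per-sample mask over the input features, the prediction
network sees only the gated input, and both are trained together with a
sparsity penalty. The sigmoid gate and the penalty are illustrative
simplifications; the paper's gating mechanism may differ in detail.

```python
# Minimal sketch (assumed PyTorch) of sample-specific feature gating trained
# jointly with the predictor. Names and the penalty term are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocallySparseNet(nn.Module):
    def __init__(self, in_dim, hidden=64, out_dim=1):
        super().__init__()
        self.gating_net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, in_dim)
        )
        self.prediction_net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, x):
        gate = torch.sigmoid(self.gating_net(x))  # per-sample, per-feature gate in [0, 1]
        y_hat = self.prediction_net(x * gate)     # predict from the sparsified input
        return y_hat, gate


def locally_sparse_loss(y_hat, y, gate, lam=1e-3):
    # Task loss plus pressure to close as many gates as possible
    return F.mse_loss(y_hat, y) + lam * gate.sum(dim=-1).mean()
```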
arXiv Detail & Related papers (2021-06-11T15:46:50Z) - Rank-R FNN: A Tensor-Based Learning Model for High-Order Data
Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes a Canonical Polyadic (CP) decomposition on its parameters.
It handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
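A minimal sketch of a rank-R, CP-style layer for matrix-shaped inputs, assuming
PyTorch: the weight tensor is never materialized; each output unit contracts
the input with R pairs of factor vectors, so the input needs no vectorization.
This is illustrative only, not the paper's exact architecture.

```python
# Minimal sketch (assumed PyTorch): for each output unit o, the implicit weight
# matrix is sum_r a[o,r] (outer) b[o,r], so the score is sum_r a[o,r]^T X b[o,r].
import torch
import torch.nn as nn


class RankRLayer(nn.Module):
    def __init__(self, d1, d2, rank, out_features):
        super().__init__()
        # One (a, b) factor pair per rank component and per output unit
        self.a = nn.Parameter(0.01 * torch.randn(out_features, rank, d1))
        self.b = nn.Parameter(0.01 * torch.randn(out_features, rank, d2))

    def forward(self, x):
        # x: (batch, d1, d2), kept in its natural matrix form
        xa = torch.einsum('ord,ndk->nork', self.a, x)   # contract over the first mode
        return torch.einsum('nork,ork->no', xa, self.b)  # contract over the second mode
```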
arXiv Detail & Related papers (2021-04-11T16:37:32Z) - The Surprising Power of Graph Neural Networks with Random Node
Initialization [54.4101931234922]
Graph neural networks (GNNs) are effective models for representation learning on relational data.
Standard GNNs are limited in their expressive power, as they cannot distinguish graphs beyond the capability of the Weisfeiler-Leman graph isomorphism test.
In this work, we analyze the expressive power of GNNs with random node initialization (RNI).
We prove that these models are universal, a first such result for GNNs not relying on computationally demanding higher-order properties.
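A minimal sketch of random node initialization, assuming PyTorch: each node's
feature vector is augmented with freshly sampled random values before being
passed to an otherwise unchanged GNN (not shown). The function name and the
number of random dimensions are illustrative.

```python
# Minimal sketch (assumed PyTorch): augment node features with random values.
import torch


def add_random_node_features(x, num_random=8):
    # x: (num_nodes, feat_dim) original node features
    r = torch.rand(x.size(0), num_random)  # resampled at every call / forward pass
    return torch.cat([x, r], dim=-1)       # (num_nodes, feat_dim + num_random)
```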
arXiv Detail & Related papers (2020-10-02T19:53:05Z) - Interpreting Graph Neural Networks for NLP With Differentiable Edge
Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
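A generic post-hoc sketch of differentiable edge masking, assuming PyTorch: one
logit is learned per edge of a frozen GNN, messages are gated by a relaxed mask,
and the number of retained edges is penalized. The loss, the optimizer, and the
assumption that the GNN accepts per-edge weights are illustrative; the paper's
actual procedure may differ.

```python
# Generic post-hoc sketch (assumed PyTorch): learn a soft mask over the edges of
# a frozen GNN and penalize the number of edges kept.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fit_edge_mask(frozen_gnn, x, edge_index, target, steps=200, lam=1e-2):
    # edge_index: (2, num_edges); one learnable logit per edge
    logits = nn.Parameter(torch.zeros(edge_index.size(1)))
    opt = torch.optim.Adam([logits], lr=0.05)
    for _ in range(steps):
        mask = torch.sigmoid(logits)                        # soft edge mask in [0, 1]
        pred = frozen_gnn(x, edge_index, edge_weight=mask)  # assumed edge-weight interface
        loss = F.mse_loss(pred, target) + lam * mask.sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(logits) > 0.5  # edges judged necessary for the prediction
```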
arXiv Detail & Related papers (2020-10-01T17:51:19Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
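For reference, a generic first-order linearization of a network f(x, θ) around a
point estimate θ*, and the predictive obtained by averaging the linearized model
over an approximate posterior, can be written as follows; the notation here is
assumed, not taken from the paper.

```latex
% Generic local linearization around \theta^* and the resulting predictive
% under an approximate posterior q(\theta); notation is illustrative.
\[
  f_{\mathrm{lin}}(x, \theta) = f(x, \theta^*) + J_{\theta^*}(x)\,(\theta - \theta^*),
  \qquad
  p(y \mid x, \mathcal{D}) \approx \int p\bigl(y \mid f_{\mathrm{lin}}(x, \theta)\bigr)\, q(\theta)\, \mathrm{d}\theta .
\]
```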
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.