Locally Sparse Networks for Interpretable Predictions
- URL: http://arxiv.org/abs/2106.06468v1
- Date: Fri, 11 Jun 2021 15:46:50 GMT
- Title: Locally Sparse Networks for Interpretable Predictions
- Authors: Junchen Yang, Ofir Lindenbaum, Yuval Kluger
- Abstract summary: We propose a framework for training locally sparse neural networks where the local sparsity is learned via a sample-specific gating mechanism.
The sample-specific sparsity is predicted via a gating network, which is trained in tandem with the prediction network.
We demonstrate that our method outperforms state-of-the-art models when predicting the target function with far fewer features per instance.
- Score: 7.362415721170984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the enormous success of neural networks, they are still hard to
interpret and often overfit when applied to low-sample-size (LSS) datasets. To
tackle these obstacles, we propose a framework for training locally sparse
neural networks where the local sparsity is learned via a sample-specific
gating mechanism that identifies the subset of most relevant features for each
measurement. The sample-specific sparsity is predicted via a gating
network, which is trained in tandem with the prediction network. By
learning these subsets and weights of a prediction model, we obtain an
interpretable neural network that can handle LSS data and can remove nuisance
variables, which are irrelevant for the supervised learning task. Using both
synthetic and real-world datasets, we demonstrate that our method outperforms
state-of-the-art models when predicting the target function with far fewer
features per instance.
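The gating idea described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the single-layer networks, the hard-sigmoid gate, the `sparsity_penalty` regularizer, and all names (`init_params`, `forward`, `W_g`, `W_h`, `W_o`) are assumptions chosen for illustration, since the abstract does not specify these details. The sketch only shows how a gating network can select a different feature subset for each sample before the prediction network sees it.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def hard_sigmoid(z):
    """Clip z + 0.5 into [0, 1]; a gate of exactly 0 drops the feature."""
    return min(1.0, max(0.0, z + 0.5))


def init_params(n_features, n_hidden, seed=0):
    """Random weights for both (hypothetical, single-layer) networks."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_g": matrix(n_features, n_features), "b_g": [0.0] * n_features,
        "W_h": matrix(n_hidden, n_features), "b_h": [0.0] * n_hidden,
        "W_o": matrix(1, n_hidden), "b_o": [0.0],
    }


def forward(x, params):
    # Gating network: maps the sample to one gate per input feature.
    gates = [hard_sigmoid(z) for z in linear(x, params["W_g"], params["b_g"])]
    # Sample-specific sparsity: closed gates zero out features for this sample only.
    x_gated = [g * xi for g, xi in zip(gates, x)]
    # Prediction network: sees only the gated features.
    hidden = [math.tanh(h) for h in linear(x_gated, params["W_h"], params["b_h"])]
    y = linear(hidden, params["W_o"], params["b_o"])[0]
    return y, gates


def sparsity_penalty(gates):
    """Assumed regularizer: average gate value, pushing gates toward closed."""
    return sum(gates) / len(gates)


params = init_params(n_features=4, n_hidden=3)
y, gates = forward([0.5, -1.0, 2.0, 0.0], params)
selected = [i for i, g in enumerate(gates) if g > 0.0]
print("prediction:", y)
print("selected features for this sample:", selected)
```

In a training loop, both networks would be updated together by minimizing the prediction loss plus a weighted `sparsity_penalty`; the weighting and the optimizer are left out here because the abstract does not describe them.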
Related papers
- Sampling weights of deep neural networks [1.2370077627846041]
We introduce a probability distribution, combined with an efficient sampling algorithm, for weights and biases of fully-connected neural networks.
In a supervised learning context, no iterative optimization or gradient computations of internal network parameters are needed.
We prove that sampled networks are universal approximators.
arXiv Detail & Related papers (2023-06-29T10:13:36Z) - Iterative self-transfer learning: A general methodology for response
time-history prediction based on small dataset [0.0]
An iterative self-transfer learning method for training neural networks based on small datasets is proposed in this study.
The results show that the proposed method can improve the model performance by nearly an order of magnitude on small datasets.
arXiv Detail & Related papers (2023-06-14T18:48:04Z) - Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z) - Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural
Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning neurons with the lowest magnitude can reduce the sample complexity and improve convergence without compromising the test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Out-of-Distribution Example Detection in Deep Neural Networks using
Distance to Modelled Embedding [0.0]
We present Distance to Modelled Embedding (DIME) that we use to detect out-of-distribution examples during prediction time.
By approximating the training set embedding into feature space as a linear hyperplane, we derive a simple, unsupervised, highly performant and computationally efficient method.
arXiv Detail & Related papers (2021-08-24T12:28:04Z) - FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on the Answer Set semantics, with neural networks, in order to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z) - MLDS: A Dataset for Weight-Space Analysis of Neural Networks [0.0]
We present MLDS, a new dataset consisting of thousands of trained neural networks with carefully controlled parameters.
This dataset enables new insights into both model-to-model and model-to-training-data relationships.
arXiv Detail & Related papers (2021-04-21T14:24:26Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - NSL: Hybrid Interpretable Learning From Noisy Raw Data [66.15862011405882]
This paper introduces a hybrid neural-symbolic learning framework, called NSL, that learns interpretable rules from labelled unstructured data.
NSL combines pre-trained neural networks for feature extraction with FastLAS, a state-of-the-art ILP system for rule learning under the answer set semantics.
We demonstrate that NSL is able to learn robust rules from MNIST data and achieve comparable or superior accuracy when compared to neural network and random forest baselines.
arXiv Detail & Related papers (2020-12-09T13:02:44Z) - Sampling Prediction-Matching Examples in Neural Networks: A
Probabilistic Programming Approach [9.978961706999833]
We consider the problem of exploring the prediction level sets of a classifier using probabilistic programming.
We define a prediction level set to be the set of examples for which the predictor has the same specified prediction confidence.
We demonstrate this technique with experiments on a synthetic dataset and MNIST.
arXiv Detail & Related papers (2020-01-09T15:57:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.