It's FLAN time! Summing feature-wise latent representations for
interpretability
- URL: http://arxiv.org/abs/2106.10086v1
- Date: Fri, 18 Jun 2021 12:19:33 GMT
- Title: It's FLAN time! Summing feature-wise latent representations for
interpretability
- Authors: An-phi Nguyen, Maria Rodriguez Martinez
- Abstract summary: We propose a novel class of structurally-constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks)
FLANs process each input feature separately, computing for each of them a representation in a common latent space.
These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpretability has become a necessary feature for machine learning models
deployed in critical scenarios, e.g. legal systems, healthcare. In these
situations, algorithmic decisions may have (potentially negative) long-lasting
effects on the end-user affected by the decision. In many cases, the
representational power of deep learning models is not needed, therefore simple
and interpretable models (e.g. linear models) should be preferred. However, in
high-dimensional and/or complex domains (e.g. computer vision), the universal
approximation capabilities of neural networks is required. Inspired by linear
models and the Kolmogorov-Arnol representation theorem, we propose a novel
class of structurally-constrained neural networks, which we call FLANs
(Feature-wise Latent Additive Networks). Crucially, FLANs process each input
feature separately, computing for each of them a representation in a common
latent space. These feature-wise latent representations are then simply summed,
and the aggregated representation is used for prediction. These constraints
(which are at the core of the interpretability of linear models) allow an user
to estimate the effect of each individual feature independently from the
others, enhancing interpretability. In a set of experiments across different
domains, we show how without compromising excessively the test performance, the
structural constraints proposed in FLANs indeed increase the interpretability
of deep learning models.
Related papers
- FocusLearn: Fully-Interpretable, High-Performance Modular Neural Networks for Time Series [0.3277163122167434]
This paper proposes a novel modular neural network model for time series prediction that is interpretable by construction.
A recurrent neural network learns the temporal dependencies in the data while an attention-based feature selection component selects the most relevant features.
A modular deep network is trained from the selected features independently to show the users how features influence outcomes, making the model interpretable.
arXiv Detail & Related papers (2023-11-28T14:51:06Z) - The Contextual Lasso: Sparse Linear Models via Deep Neural Networks [5.607237982617641]
We develop a new statistical estimator that fits a sparse linear model to the explanatory features such that the sparsity pattern and coefficients vary as a function of the contextual features.
An extensive suite of experiments on real and synthetic data suggests that the learned models, which remain highly transparent, can be sparser than the regular lasso.
arXiv Detail & Related papers (2023-02-02T05:00:29Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Combining Discrete Choice Models and Neural Networks through Embeddings:
Formulation, Interpretability and Performance [10.57079240576682]
This study proposes a novel approach that combines theory and data-driven choice models using Artificial Neural Networks (ANNs)
In particular, we use continuous vector representations, called embeddings, for encoding categorical or discrete explanatory variables.
Our models deliver state-of-the-art predictive performance, outperforming existing ANN-based models while drastically reducing the number of required network parameters.
arXiv Detail & Related papers (2021-09-24T15:55:31Z) - The Causal Neural Connection: Expressiveness, Learnability, and
Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z) - DoLFIn: Distributions over Latent Features for Interpretability [8.807587076209568]
We propose a novel strategy for achieving interpretability in neural network models.
Our approach builds on the success of using probability as the central quantity.
We show that DoLFIn not only provides interpretable solutions, but even slightly outperforms the classical CNN and BiLSTM text classification.
arXiv Detail & Related papers (2020-11-10T18:32:53Z) - Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
arXiv Detail & Related papers (2020-04-29T01:28:32Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.