NPEFF: Non-Negative Per-Example Fisher Factorization
- URL: http://arxiv.org/abs/2310.04649v1
- Date: Sat, 7 Oct 2023 02:02:45 GMT
- Title: NPEFF: Non-Negative Per-Example Fisher Factorization
- Authors: Michael Matena, Colin Raffel
- Abstract summary: We introduce a novel interpretability method called NPEFF that is readily applicable to any end-to-end differentiable model.
We demonstrate that components recovered by NPEFF have interpretable tunings through experiments on language and vision models.
- Score: 52.44573961263344
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As deep learning models are deployed in more and more settings, it becomes
increasingly important to be able to understand why they produce a given
prediction, but interpretation of these models remains a challenge. In this
paper, we introduce a novel interpretability method called NPEFF that is
readily applicable to any end-to-end differentiable model. It operates on the
principle that processing of a characteristic shared across different examples
involves a specific subset of model parameters. We perform NPEFF by decomposing
each example's Fisher information matrix as a non-negative sum of components.
These components take the form of either non-negative vectors or rank-1
positive semi-definite matrices depending on whether we are using diagonal or
low-rank Fisher representations, respectively. For the latter form, we
introduce a novel and highly scalable algorithm. We demonstrate that components
recovered by NPEFF have interpretable tunings through experiments on language
and vision models. Using unique properties of NPEFF's parameter-space
representations, we run extensive experiments to verify that the connections
between directions in parameter space and the examples recovered by NPEFF
actually reflect the model's processing. We further demonstrate NPEFF's ability
to uncover the actual processing strategies used by a TRACR-compiled model. We
also explore a potential application of NPEFF in uncovering and correcting
flawed heuristics used by a model. We release our code to facilitate research
using NPEFF.
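To make the decomposition concrete, below is a minimal sketch of the diagonal-Fisher variant described in the abstract: each example's diagonal Fisher is a non-negative vector (the expected squared per-parameter gradient of the log-likelihood), these vectors are stacked into a matrix, and the matrix is factorized into non-negative components. The toy logistic-regression model, the data, the hyperparameters, and the use of scikit-learn's generic NMF solver are illustrative assumptions, not the paper's actual algorithm or experimental setup.

```python
# Hedged sketch of diagonal-Fisher NPEFF (illustrative only, not the paper's implementation).
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# Toy logistic-regression model: p(y=1|x) = sigmoid(w . x)
n_examples, n_params = 200, 50
X = rng.normal(size=(n_examples, n_params))
w = rng.normal(size=n_params)
p = 1.0 / (1.0 + np.exp(-X @ w))

# Per-example diagonal Fisher for logistic regression:
# E_y[(d/dw log p(y|x))^2] = p(1-p) * x^2 elementwise, which is non-negative.
# (The paper's low-rank variant instead uses rank-1 PSD matrices g g^T per class/sample.)
per_example_fishers = (p * (1.0 - p))[:, None] * X**2  # shape (n_examples, n_params)

# Non-negative factorization: fishers ~= W @ H with W, H >= 0.
# Rows of H are components (directions in parameter space); row i of W gives
# example i's non-negative coefficients over those components.
n_components = 8
nmf = NMF(n_components=n_components, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(per_example_fishers)
H = nmf.components_

# Examples with the largest coefficient on a component are the ones whose
# processing that component putatively captures.
top_examples_for_component_0 = np.argsort(-W[:, 0])[:10]
print(top_examples_for_component_0)
```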
Related papers
- A Differentiable Partially Observable Generalized Linear Model with
Forward-Backward Message Passing [2.600709013150986]
We propose a new differentiable POGLM that enables the pathwise gradient estimator, which outperforms the score-function gradient estimator used in existing works.
Our new method yields more interpretable parameters, underscoring its significance in neuroscience.
arXiv Detail & Related papers (2024-02-02T09:34:49Z) - Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z) - MoEfication: Conditional Computation of Transformer Models for Efficient
Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks due to large parameter capacity, but also lead to huge computation cost.
We explore accelerating large-model inference via conditional computation based on the sparse activation phenomenon.
We propose transforming a large model into its mixture-of-experts (MoE) version of equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z) - Combining Discrete Choice Models and Neural Networks through Embeddings:
Formulation, Interpretability and Performance [10.57079240576682]
This study proposes a novel approach that combines theory- and data-driven choice models using Artificial Neural Networks (ANNs).
In particular, we use continuous vector representations, called embeddings, for encoding categorical or discrete explanatory variables.
Our models deliver state-of-the-art predictive performance, outperforming existing ANN-based models while drastically reducing the number of required network parameters.
arXiv Detail & Related papers (2021-09-24T15:55:31Z) - Locally Interpretable Model Agnostic Explanations using Gaussian
Processes [2.9189409618561966]
Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique for explaining the prediction of a single instance.
We propose a Gaussian Process (GP) based variation of locally interpretable models.
We demonstrate that the proposed technique is able to generate faithful explanations using far fewer samples than LIME.
arXiv Detail & Related papers (2021-08-16T05:49:01Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Feature Weighted Non-negative Matrix Factorization [92.45013716097753]
We propose the Feature weighted Non-negative Matrix Factorization (FNMF) in this paper.
FNMF learns the weights of features adaptively according to their importance.
It can be solved efficiently with the suggested optimization algorithm.
arXiv Detail & Related papers (2021-03-24T21:17:17Z) - Exploring Complementary Strengths of Invariant and Equivariant
Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
arXiv Detail & Related papers (2021-03-01T21:14:33Z) - Controlling for sparsity in sparse factor analysis models: adaptive
latent feature sharing for piecewise linear dimensionality reduction [2.896192909215469]
We propose a simple and tractable parametric feature allocation model which can address key limitations of current latent feature decomposition techniques.
We derive a novel adaptive factor analysis (aFA), as well as an adaptive probabilistic principal component analysis (aPPCA), capable of flexible structure discovery and dimensionality reduction.
We show that aPPCA and aFA can infer interpretable high-level features both when applied to raw MNIST and when used to interpret autoencoder features.
arXiv Detail & Related papers (2020-06-22T16:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.