A Functional Information Perspective on Model Interpretation
- URL: http://arxiv.org/abs/2206.05700v2
- Date: Tue, 14 Jun 2022 08:01:06 GMT
- Title: A Functional Information Perspective on Model Interpretation
- Authors: Itai Gat, Nitay Calderon, Roi Reichart, Tamir Hazan
- Abstract summary: This work suggests a theoretical framework for model interpretability.
We rely on the log-Sobolev inequality that bounds the functional entropy by the functional Fisher information.
We show that our method surpasses existing sampling-based interpretability methods on various data signals.
- Score: 30.101107406343665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contemporary predictive models are hard to interpret as their deep nets
exploit numerous complex relations between input elements. This work suggests a
theoretical framework for model interpretability by measuring the contribution
of relevant features to the functional entropy of the network with respect to
the input. We rely on the log-Sobolev inequality that bounds the functional
entropy by the functional Fisher information with respect to the covariance of
the data. This provides a principled way to measure the amount of information
contribution of a subset of features to the decision function. Through
extensive experiments, we show that our method surpasses existing
sampling-based interpretability methods on various data signals such as image,
text, and audio.
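For reference, the quantities named in the abstract have standard definitions; the sketch below restates them for a probability measure \mu (e.g., a Gaussian fitted to the data, with covariance \Sigma) and a non-negative decision function f. This is a hedged restatement of the usual log-Sobolev machinery; the exact constants and the conditioning on feature subsets used by the authors may differ.

```latex
% Functional entropy of a non-negative function f with respect to a measure \mu
\mathrm{ent}_{\mu}(f) = \int f \log f \, d\mu - \Big(\int f \, d\mu\Big) \log\Big(\int f \, d\mu\Big)

% Functional Fisher information
\mathcal{I}_{\mu}(f) = \int \frac{\|\nabla f\|^{2}}{f} \, d\mu

% Log-Sobolev inequality for a Gaussian measure \mu with covariance \Sigma,
% bounding the functional entropy by the (covariance-weighted) Fisher information
\mathrm{ent}_{\mu}(f) \le \frac{1}{2} \int \frac{\langle \Sigma \nabla f, \nabla f \rangle}{f} \, d\mu
```

Below is a minimal, hypothetical PyTorch sketch of how a Fisher-information-style score for a subset of input features could be estimated by sampling Gaussian perturbations of the input; the function name, perturbation scheme, and normalization are assumptions and are not taken from the paper.

```python
import torch

def fisher_info_score(net, x, feature_idx, sigma=0.1, n_samples=64):
    """Hypothetical Monte Carlo estimate of a functional-Fisher-information-style
    attribution score for a subset of input features (not the paper's exact estimator)."""
    scores = []
    for _ in range(n_samples):
        # Sample an input perturbed by isotropic Gaussian noise.
        x_pert = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
        # f(x): probability of the predicted class, a non-negative decision function.
        probs = net(x_pert).softmax(dim=-1)
        f_val = probs.max(dim=-1).values.sum()
        grad = torch.autograd.grad(f_val, x_pert)[0]
        # Integrand of the Fisher information, restricted to the chosen feature subset.
        sub_grad = grad[..., feature_idx]
        scores.append((sub_grad.pow(2).sum() / f_val.detach().clamp_min(1e-8)).item())
    return sum(scores) / len(scores)
```

For an image classifier, feature_idx could index the pixels of a candidate region; comparing scores across regions then ranks their contribution to the decision function.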
Related papers
- Directed Cyclic Graph for Causal Discovery from Multivariate Functional Data [15.26007975367927]
We introduce a functional linear structural equation model for causal structure learning.
To enhance interpretability, our model involves a low-dimensional causal embedded space.
We prove that the proposed model is causally identifiable under standard assumptions.
arXiv Detail & Related papers (2023-10-31T15:19:24Z)
- On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- Relational Local Explanations [11.679389861042]
We develop a novel model-agnostic and permutation-based feature attribution algorithm based on relational analysis between input variables.
This allows us to gain broader insight into machine learning model decisions and data.
arXiv Detail & Related papers (2022-12-23T14:46:23Z)
- Interpretability with full complexity by constraining feature information [1.52292571922932]
Interpretability is a pressing issue for machine learning.
We approach interpretability from a new angle: constrain the information about the features without restricting the complexity of the model.
We develop a framework for extracting insight from the spectrum of approximate models.
arXiv Detail & Related papers (2022-11-30T18:59:01Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
- A geometric perspective on functional outlier detection [0.0]
We develop a conceptualization of functional outlier detection that is more widely applicable and realistic than previously proposed.
We show that simple manifold learning methods can be used to reliably infer and visualize the geometric structure of functional data sets.
Our experiments on synthetic and real data sets demonstrate that this approach leads to outlier detection performances at least on par with existing functional data-specific methods.
arXiv Detail & Related papers (2021-09-14T17:42:57Z)
- Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies [88.0813215220342]
Some modalities can more easily contribute to the classification results than others.
We develop a method based on the log-Sobolev inequality, which bounds the functional entropy by the functional Fisher information.
On the two challenging multi-modal datasets VQA-CPv2 and SocialIQ, we obtain state-of-the-art results while more uniformly exploiting the modalities.
arXiv Detail & Related papers (2020-10-21T07:40:33Z)
- Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions [55.660255727031725]
Influence functions explain the decisions of a model by identifying influential training examples.
We conduct a comparison between influence functions and common word-saliency methods on representative tasks.
We develop a new measure based on influence functions that can reveal artifacts in training data.
arXiv Detail & Related papers (2020-05-14T00:45:23Z)
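For context on the last entry: the influence-function approach it builds on is usually stated via the first-order approximation of Koh & Liang (2017). A standard form of that estimate, not specific to this paper, is shown below.

```latex
% Influence of upweighting training example z on the loss at a test point z_test
\mathcal{I}(z, z_{\mathrm{test}}) = -\,\nabla_{\theta} L(z_{\mathrm{test}}, \hat{\theta})^{\top} \, H_{\hat{\theta}}^{-1} \, \nabla_{\theta} L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} = \frac{1}{n} \sum_{i=1}^{n} \nabla_{\theta}^{2} L(z_{i}, \hat{\theta})
```

Here \hat{\theta} denotes the fitted parameters and H_{\hat{\theta}} the empirical Hessian of the training loss at \hat{\theta}.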