A Dual-Perspective Approach to Evaluating Feature Attribution Methods
- URL: http://arxiv.org/abs/2308.08949v1
- Date: Thu, 17 Aug 2023 12:41:04 GMT
- Title: A Dual-Perspective Approach to Evaluating Feature Attribution Methods
- Authors: Yawei Li, Yang Zhang, Kenji Kawaguchi, Ashkan Khakzar, Bernd Bischl,
Mina Rezaei
- Abstract summary: We propose two new perspectives within the faithfulness paradigm that reveal intuitive properties: soundness and completeness.
Soundness assesses the degree to which attributed features are truly predictive features, while completeness examines how well the resulting attribution reveals all the predictive features.
We apply these metrics to mainstream attribution methods, offering a novel lens through which to analyze and compare feature attribution methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature attribution methods attempt to explain neural network predictions by
identifying relevant features. However, establishing a cohesive framework for
assessing feature attribution remains a challenge. There are several views
through which we can evaluate attributions. One principal lens is to observe
the effect of perturbing attributed features on the model's behavior (i.e.,
faithfulness). While providing useful insights, existing faithfulness
evaluations suffer from shortcomings that we reveal in this paper. In this
work, we propose two new perspectives within the faithfulness paradigm that
reveal intuitive properties: soundness and completeness. Soundness assesses the
degree to which attributed features are truly predictive features, while
completeness examines how well the resulting attribution reveals all the
predictive features. The two perspectives are based on a firm mathematical
foundation and provide quantitative metrics that are computable through
efficient algorithms. We apply these metrics to mainstream attribution methods,
offering a novel lens through which to analyze and compare feature attribution
methods.
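The two perspectives described in the abstract can be illustrated with a minimal perturbation-style sketch. This is a hypothetical proxy, not the authors' actual metrics or algorithms: `model`, `soundness_proxy`, and `completeness_proxy` are assumed names, and masking with zeros is one simplistic choice of perturbation. Intuitively, a sound attribution should preserve the prediction when only its top-ranked features are kept, while a complete attribution should destroy the prediction when those features are removed.

```python
import numpy as np

def soundness_proxy(model, x, attribution, k):
    """Keep only the top-k attributed features (zero out the rest).
    If the attribution is sound, the retained features are truly
    predictive, so the model's output should be largely preserved."""
    top_k = np.argsort(attribution)[-k:]
    masked = np.zeros_like(x)
    masked[top_k] = x[top_k]
    return model(masked) / model(x)

def completeness_proxy(model, x, attribution, k):
    """Zero out the top-k attributed features. If the attribution is
    complete, it captured all predictive signal, so the model's
    output should drop sharply once those features are removed."""
    top_k = np.argsort(attribution)[-k:]
    masked = x.copy()
    masked[top_k] = 0.0
    return 1.0 - model(masked) / model(x)
```

On a toy linear model whose weights concentrate on a few inputs, an attribution that ranks exactly those inputs highest scores near 1.0 on both proxies, while an attribution pointing at irrelevant inputs scores near 0.0.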
Related papers
- On The Coherence of Quantitative Evaluation of Visual Explanations [0.7212939068975619]
Evaluation methods have been proposed to assess the "goodness" of visual explanations.
We study a subset of the ImageNet-1k validation set where we evaluate a number of different commonly-used explanation methods.
Results of our study suggest that there is a lack of coherency on the grading provided by some of the considered evaluation methods.
arXiv Detail & Related papers (2023-02-14T13:41:57Z) - Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information [53.28701922632817]
We propose a method to identify features with predictive information in the input domain.
The core idea of our method is leveraging a bottleneck on the input that only lets input features associated with predictive latent features pass through.
arXiv Detail & Related papers (2021-10-04T14:13:42Z) - Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z) - What Image Features Boost Housing Market Predictions? [81.32205133298254]
We propose a set of techniques for the extraction of visual features for efficient numerical inclusion in predictive algorithms.
We discuss techniques such as Shannon's entropy, calculating the center of gravity, employing image segmentation, and using Convolutional Neural Networks.
The set of 40 image features selected here carries a significant amount of predictive power and outperforms some of the strongest metadata predictors.
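Two of the feature extractors the entry above names, Shannon's entropy and the center of gravity, can be sketched for a grayscale image as follows. This is an assumed implementation for illustration, not the paper's actual feature pipeline; function names and the 256-bin histogram choice are mine.

```python
import numpy as np

def shannon_entropy(image, bins=256):
    """Entropy of the grayscale histogram, in bits: a rough scalar
    measure of visual complexity (0 for a flat image, higher for
    more varied pixel intensities)."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                      # drop empty bins before the log
    return float(-np.sum(p * np.log2(p)))

def center_of_gravity(image):
    """Intensity-weighted centroid (row, col): where the 'mass' of
    brightness sits within the frame."""
    rows, cols = np.indices(image.shape)
    total = image.sum()
    return (rows * image).sum() / total, (cols * image).sum() / total
```

A uniform image has entropy 0 and a centroid at the geometric center; a half-black, half-white image has entropy 1 bit.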
arXiv Detail & Related papers (2021-07-15T06:32:10Z) - Do Feature Attribution Methods Correctly Attribute Features? [5.58592454173439]
Feature attribution methods are exceedingly popular in interpretable machine learning.
There is no consensus on the definition of "attribution"
We evaluate three methods: saliency maps, rationales, and attention.
arXiv Detail & Related papers (2021-04-27T20:35:30Z) - Understanding Failures of Deep Networks via Robust Feature Extraction [44.204907883776045]
We introduce and study a method aimed at characterizing and explaining failures by identifying visual attributes whose presence or absence results in poor performance.
We leverage the representation of a separate robust model to extract interpretable features and then harness these features to identify failure modes.
arXiv Detail & Related papers (2020-12-03T08:33:29Z) - Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End [17.226134854746267]
We present a method to generate feature attribution explanations from a set of counterfactual examples.
We show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency.
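The necessity/sufficiency evaluation described above can be caricatured with set overlap between the features an attribution highlights and the features a set of counterfactual examples actually changes. This is a loose sketch of the idea, not the paper's method; `necessity_score` and `sufficiency_score` are hypothetical names, and real counterfactual-based evaluation is considerably more involved.

```python
import numpy as np

def necessity_score(attributed, counterfactuals, x):
    """Fraction of counterfactuals that touch at least one attributed
    feature: changing the highlighted features was 'necessary' to
    flip the prediction."""
    hits = sum(
        1 for cf in counterfactuals
        if set(np.nonzero(cf != x)[0]) & attributed
    )
    return hits / len(counterfactuals)

def sufficiency_score(attributed, counterfactuals, x):
    """Fraction of counterfactuals whose changes lie entirely inside
    the attributed set: editing only highlighted features 'sufficed'
    to flip the prediction."""
    hits = sum(
        1 for cf in counterfactuals
        if set(np.nonzero(cf != x)[0]) <= attributed
    )
    return hits / len(counterfactuals)
```

An attribution covering every feature the counterfactuals modify scores 1.0 on both; an attribution that misses some of them scores lower.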
arXiv Detail & Related papers (2020-11-10T05:41:43Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for such feature-based explanations via robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z) - A general framework for inference on algorithm-agnostic variable importance [3.441021278275805]
We propose a framework for nonparametric inference on interpretable, algorithm-agnostic variable importance.
We show that our proposal has good operating characteristics, and we illustrate it with data from a study of an antibody against HIV-1 infection.
arXiv Detail & Related papers (2020-04-07T20:09:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.