Evaluation of Similarity-based Explanations
- URL: http://arxiv.org/abs/2006.04528v2
- Date: Mon, 22 Mar 2021 22:15:53 GMT
- Title: Evaluation of Similarity-based Explanations
- Authors: Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui
- Abstract summary: We investigated relevance metrics that can provide reasonable explanations to users.
Our experiments revealed that the cosine similarity of the gradients of the loss performs best.
Some metrics performed poorly in our tests, and we analyzed the reasons for their failure.
- Score: 36.10585276728203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explaining the predictions made by complex machine learning models helps
users to understand and accept the predicted outputs with confidence. One
promising way is to use similarity-based explanation that provides similar
instances as evidence to support model predictions. Several relevance metrics
are used for this purpose. In this study, we investigated relevance metrics
that can provide reasonable explanations to users. Specifically, we adopted
three tests to evaluate whether the relevance metrics satisfy the minimal
requirements for similarity-based explanation. Our experiments revealed that
the cosine similarity of the gradients of the loss performs best, which would
be a recommended choice in practice. In addition, we showed that some metrics
perform poorly in our tests and analyzed the reasons for their failure. We
expect our insights to help practitioners in selecting appropriate relevance
metrics and also to aid further research on designing better relevance
metrics for explanations.
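The recommended metric is simple to state: score each training instance by the cosine similarity between its loss gradient and the test instance's loss gradient. Below is a minimal sketch of that computation; the model, data, and use of a gold test label are hypothetical placeholders (in deployment one might substitute the model's predicted label).

```python
import torch
import torch.nn.functional as F

def loss_grad(model, x, y):
    """Flattened gradient of the loss for one labeled example."""
    loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def relevance_scores(model, x_test, y_test, train_set):
    """Rank training instances by cosine similarity of loss gradients,
    the metric the paper found to perform best."""
    g_test = loss_grad(model, x_test, y_test)
    return [F.cosine_similarity(g_test, loss_grad(model, x, y), dim=0).item()
            for x, y in train_set]

# Toy usage with a placeholder linear classifier.
torch.manual_seed(0)
model = torch.nn.Linear(4, 3)
train_set = [(torch.randn(4), torch.tensor(i % 3)) for i in range(5)]
print(relevance_scores(model, torch.randn(4), torch.tensor(1), train_set))
```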
Related papers
- Rethinking Distance Metrics for Counterfactual Explainability [53.436414009687]
We investigate a framing for counterfactual generation methods that considers counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution.
We derive a distance metric tailored to counterfactual similarity that can be applied to a broad range of settings.
arXiv Detail & Related papers (2024-10-18T15:06:50Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential to be misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Goodhart's Law Applies to NLP's Explanation Benchmarks [57.26445915212884]
We critically examine two sets of metrics: the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics.
We show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs.
Our results raise doubts about the ability of current metrics to guide explainability research, underscoring the need for a broader reassessment of what precisely these metrics are intended to capture.
arXiv Detail & Related papers (2023-08-28T03:03:03Z)
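For readers unfamiliar with the two ERASER metrics, a rough self-contained sketch follows: comprehensiveness is the drop in predicted probability when the rationale tokens are removed, and sufficiency is the drop when only the rationale is kept. The `predict_proba` stand-in below is hypothetical, not the ERASER benchmark's API; a model gamed in the way the paper describes would score well here without its explanations improving.

```python
import numpy as np

def predict_proba(tokens):
    """Hypothetical text classifier: class probabilities for a token list."""
    rng = np.random.default_rng(abs(hash(tuple(tokens))) % (2**32))
    p = rng.random(2)
    return p / p.sum()

def comprehensiveness(tokens, rationale_idx, cls):
    """Probability drop when rationale tokens are removed.
    High = the rationale mattered to the prediction."""
    kept = [t for i, t in enumerate(tokens) if i not in rationale_idx]
    return predict_proba(tokens)[cls] - predict_proba(kept)[cls]

def sufficiency(tokens, rationale_idx, cls):
    """Probability drop when ONLY the rationale is kept.
    Low = the rationale alone supports the prediction."""
    only = [t for i, t in enumerate(tokens) if i in rationale_idx]
    return predict_proba(tokens)[cls] - predict_proba(only)[cls]

tokens = "the movie was surprisingly good".split()
rationale = {3, 4}  # "surprisingly good"
cls = int(np.argmax(predict_proba(tokens)))
print(comprehensiveness(tokens, rationale, cls), sufficiency(tokens, rationale, cls))
```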
- The Solvability of Interpretability Evaluation Metrics [7.3709604810699085]
Feature attribution methods are often evaluated on metrics such as comprehensiveness and sufficiency.
In this paper, we highlight an intriguing property of these metrics: their solvability.
Because a solvable metric can be optimized directly, we derive a beam-search-based explainer and present a series of investigations showing that it is generally comparable or favorable to current choices.
arXiv Detail & Related papers (2022-05-18T02:52:03Z)
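"Solvability" here means the metric itself can serve as a search objective. The toy sketch below, under assumed interfaces (the same hypothetical `predict_proba` stand-in as above), grows a rationale by beam search so as to directly maximize comprehensiveness; the paper's actual explainer is built in this spirit.

```python
import numpy as np

def predict_proba(tokens):
    """Hypothetical classifier (same stand-in as the sketch above)."""
    rng = np.random.default_rng(abs(hash(tuple(tokens))) % (2**32))
    p = rng.random(2)
    return p / p.sum()

def comprehensiveness(tokens, rationale, cls):
    kept = [t for i, t in enumerate(tokens) if i not in rationale]
    return predict_proba(tokens)[cls] - predict_proba(kept)[cls]

def beam_search_rationale(tokens, cls, width=3, size=2):
    """Grow a rationale token-by-token, keeping the `width` subsets with the
    highest comprehensiveness: the metric itself is the search objective."""
    beams = [frozenset()]
    for _ in range(size):
        cands = {b | {i} for b in beams for i in range(len(tokens)) if i not in b}
        beams = sorted(cands, key=lambda r: comprehensiveness(tokens, r, cls),
                       reverse=True)[:width]
    return sorted(beams[0])

tokens = "the movie was surprisingly good".split()
cls = int(np.argmax(predict_proba(tokens)))
print(beam_search_rationale(tokens, cls))
```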
- A Unified Study of Machine Learning Explanation Evaluation Metrics [16.4602888153369]
Many existing metrics for explanations are introduced by researchers as by-products of their proposed explanation techniques to demonstrate the advantages of their methods.
We claim that the lack of acknowledged and justified metrics results in chaos in benchmarking these explanation methods.
We propose guidelines for dealing with the problems in evaluating machine learning explanations and encourage researchers to address these problems carefully when developing explanation techniques and metrics.
arXiv Detail & Related papers (2022-03-27T10:12:06Z)
- Toward Scalable and Unified Example-based Explanation and Outlier Detection [128.23117182137418]
We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks, which go beyond simple similarity kernels, deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
arXiv Detail & Related papers (2020-11-11T05:58:17Z)
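A compressed sketch of the general idea, not the paper's exact architecture: if the classifier's logits are computed from similarities to learned prototype vectors, then the most similar prototypes double as an example-based explanation of each prediction.

```python
import torch

class PrototypeClassifier(torch.nn.Module):
    """Toy prototype network: encode the input, score it against learned
    prototype vectors, and classify from those similarity scores."""
    def __init__(self, in_dim=4, proto_dim=8, n_protos=6, n_classes=3):
        super().__init__()
        self.encoder = torch.nn.Linear(in_dim, proto_dim)
        self.prototypes = torch.nn.Parameter(torch.randn(n_protos, proto_dim))
        self.head = torch.nn.Linear(n_protos, n_classes)

    def forward(self, x):
        z = self.encoder(x)                     # (B, proto_dim)
        sim = -torch.cdist(z, self.prototypes)  # (B, n_protos), higher = closer
        return self.head(sim), sim

model = PrototypeClassifier()
logits, sim = model(torch.randn(2, 4))
# Explanation: the prototypes most similar to each input.
print(sim.topk(2, dim=1).indices)
```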
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for such feature-based explanations through robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
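As a toy illustration of that last step (a greedy stand-in, not the paper's robustness-analysis procedure): repeatedly neutralize the feature that most raises the target-class probability until the prediction moves to the target class, and return the chosen feature set. The linear classifier below is hypothetical.

```python
import numpy as np

def predict_proba(x, W, b):
    """Hypothetical linear softmax classifier."""
    z = x @ W + b
    e = np.exp(z - z.max())
    return e / e.sum()

def features_toward_target(x, W, b, target, baseline=0.0):
    """Greedily zero out the feature that most raises the target-class
    probability until the prediction flips to `target`."""
    x = x.copy()
    chosen = []
    while int(np.argmax(predict_proba(x, W, b))) != target and len(chosen) < len(x):
        gains = []
        for i in range(len(x)):
            if i in chosen:
                gains.append(-np.inf)
                continue
            x2 = x.copy()
            x2[i] = baseline
            gains.append(predict_proba(x2, W, b)[target])
        i = int(np.argmax(gains))
        chosen.append(i)
        x[i] = baseline
    return chosen

rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 3)), rng.normal(size=3)
print(features_toward_target(rng.normal(size=4), W, b, target=2))
```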
- An end-to-end approach for the verification problem: learning the right distance [15.553424028461885]
We augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder.
We first show it approximates a likelihood ratio which can be used for hypothesis tests.
We observe that training is much simplified under the proposed approach compared to metric learning with actual distances.
arXiv Detail & Related papers (2020-02-21T18:46:06Z)
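A minimal sketch of that setup under assumed dimensions and architecture (not the paper's exact model): a scorer over embedding pairs is trained jointly with the encoder on same/different labels with binary cross-entropy; at optimality its logit approximates a log-likelihood ratio (up to the prior odds), which is what licenses its use in hypothesis tests.

```python
import torch

encoder = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU())
scorer = torch.nn.Sequential(torch.nn.Linear(32, 16), torch.nn.ReLU(),
                             torch.nn.Linear(16, 1))  # learned pseudo-distance
opt = torch.optim.Adam([*encoder.parameters(), *scorer.parameters()], lr=1e-3)

def logit(x1, x2):
    """Logit of 'same source' for a pair; approximates a log-likelihood
    ratio (up to the prior odds) when trained to optimality."""
    return scorer(torch.cat([encoder(x1), encoder(x2)], dim=-1)).squeeze(-1)

# One toy training step on random same/different pairs.
x1, x2 = torch.randn(32, 8), torch.randn(32, 8)
same = torch.randint(0, 2, (32,)).float()  # placeholder pair labels
loss = torch.nn.functional.binary_cross_entropy_with_logits(logit(x1, x2), same)
opt.zero_grad()
loss.backward()
opt.step()
```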