Related papers: The Solvability of Interpretability Evaluation Metrics

URL: http://arxiv.org/abs/2205.08696v1
Date: Wed, 18 May 2022 02:52:03 GMT
Title: The Solvability of Interpretability Evaluation Metrics
Authors: Yilun Zhou, Julie Shah
Abstract summary: Feature attribution methods are often evaluated on metrics such as comprehensiveness and sufficiency. In this paper, we highlight an intriguing property of these metrics: their solvability. Because an explanation can be optimized directly for a metric using beam search, we present a series of investigations showing that the resulting beam search explainer is generally comparable or favorable to current choices such as LIME and SHAP.
Score: 7.3709604810699085
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Feature attribution methods are popular for explaining neural network predictions, and they are often evaluated on metrics such as comprehensiveness and sufficiency, which are motivated by the principle that more important features -- as judged by the explanation -- should have larger impacts on model prediction. In this paper, we highlight an intriguing property of these metrics: their solvability. Concretely, we can define the problem of optimizing an explanation for a metric and solve it using beam search. This brings up the obvious question: given such solvability, why do we still develop other explainers and then evaluate them on the metric? We present a series of investigations showing that this beam search explainer is generally comparable or favorable to current choices such as LIME and SHAP, suggest rethinking the goals of model interpretability, and identify several directions towards better evaluations of new method proposals.
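Illustration: the abstract describes optimizing an explanation directly for a metric using beam search. The sketch below shows one way that idea could look in Python. It is not the authors' implementation. It assumes one common formulation of comprehensiveness (the mean drop in the model's output as features are removed in ranked order), removal by setting a feature to a baseline value, and a toy logistic model. The names `toy_predict`, `comprehensiveness`, and `beam_search_order` are hypothetical and chosen only for this example.

```python
from math import exp
from typing import Callable, List, Sequence, Tuple

Predict = Callable[[Sequence[float]], float]


def toy_predict(x: Sequence[float]) -> float:
    """Hypothetical stand-in model: logistic regression with fixed weights."""
    weights = [2.0, -1.0, 0.5, 3.0]
    z = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + exp(-z))


def comprehensiveness(predict: Predict, x: Sequence[float],
                      order: Sequence[int], baseline: float = 0.0) -> float:
    """Mean drop in predict(x) as features are removed in `order`.

    `order` lists feature indices from most to least important.
    Removal is modeled (by assumption) as overwriting with `baseline`.
    """
    full = predict(x)
    masked = list(x)
    total = 0.0
    for i in order:
        masked[i] = baseline
        total += full - predict(masked)
    return total / len(order)


def beam_search_order(predict: Predict, x: Sequence[float],
                      beam_width: int = 3, baseline: float = 0.0) -> List[int]:
    """Search for a feature ordering with high comprehensiveness.

    Each beam entry is (partial order, cumulative drop so far). At every
    step each partial order is extended by one unused feature and only the
    `beam_width` best extensions are kept.
    """
    full = predict(x)
    n = len(x)
    beams: List[Tuple[Tuple[int, ...], float]] = [((), 0.0)]
    for _ in range(n):
        candidates = []
        for order, score in beams:
            masked = list(x)
            for i in order:
                masked[i] = baseline
            for j in range(n):
                if j in order:
                    continue
                trial = list(masked)
                trial[j] = baseline
                candidates.append((order + (j,), score + full - predict(trial)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return list(beams[0][0])


x = [1.0, 1.0, 1.0, 1.0]
best_order = beam_search_order(toy_predict, x)
print(best_order, comprehensiveness(toy_predict, x, best_order))
```

The ordering returned here is an explanation whose only justification is that it scores well on the metric, which is the situation the abstract questions. Sufficiency could be searched the same way by scoring how well the output is preserved when only the top-ranked features are kept.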

Related papers

Feature Attribution from First Principles [6.836945436656676]
We argue that axiomatic frameworks that any feature attribution method should satisfy are often too restrictive. Rather than imposing axioms, we start by defining attributions for the simplest possible models. We derive closed-form expressions for attribution of deep ReLU networks, and take a step toward the optimization of evaluation metrics.
arXiv Detail & Related papers (2025-05-30T15:53:11Z)
Rethinking Distance Metrics for Counterfactual Explainability [53.436414009687]
We investigate a framing for counterfactual generation methods that considers counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution. We derive a distance metric tailored for counterfactual similarity that can be applied to a broad range of settings.
arXiv Detail & Related papers (2024-10-18T15:06:50Z)
XForecast: Evaluating Natural Language Explanations for Time Series Forecasting [72.57427992446698]
Time series forecasting aids decision-making, especially for stakeholders who rely on accurate predictions. Traditional explainable AI (XAI) methods, which underline feature or temporal importance, often require expert knowledge. Evaluating forecast natural language explanations (NLEs) is difficult due to the complex causal relationships in time series data.
arXiv Detail & Related papers (2024-10-18T05:16:39Z)
A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection [0.0]
We study the use of overly complex and opaque ML models, unaccounted data imbalances and correlated features, inconsistent influential features across different explanation methods, and the implausible utility of explanations. Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees. We find that feature-based model explanations are most often inconsistent across different settings.
arXiv Detail & Related papers (2024-07-04T15:35:42Z)
Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode. We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development. To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps. These findings suggest caution regarding the usefulness and potential for misunderstanding in saliency-based explanations.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
Detection Accuracy for Evaluating Compositional Explanations of Units [5.220940151628734]
Two examples of methods that use this approach are Network Dissection and Compositional Explanations. While logical forms are intuitively more informative than atomic concepts, it is not clear how to quantify this improvement. We propose Detection Accuracy as an evaluation metric, which measures how consistently units detect their assigned explanations.
arXiv Detail & Related papers (2021-09-16T08:47:34Z)
Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time. We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
arXiv Detail & Related papers (2021-06-01T20:36:48Z)
Evaluation of Similarity-based Explanations [36.10585276728203]
We investigated relevance metrics that can provide reasonable explanations to users. Our experiments revealed that the cosine similarity of the gradients of the loss performs best. Some metrics performed poorly in our tests, and we analyzed the reasons for their failure.
arXiv Detail & Related papers (2020-06-08T12:39:46Z)
Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for feature-based explanations by robustness analysis. We obtain new explanations that are loosely necessary and sufficient for a prediction. We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
