Evaluating Explainable AI: Which Algorithmic Explanations Help Users
Predict Model Behavior?
- URL: http://arxiv.org/abs/2005.01831v1
- Date: Mon, 4 May 2020 20:35:17 GMT
- Title: Evaluating Explainable AI: Which Algorithmic Explanations Help Users
Predict Model Behavior?
- Authors: Peter Hase, Mohit Bansal
- Abstract summary: We carry out human subject tests to isolate the effect of algorithmic explanations on model interpretability.
Clear evidence of method effectiveness is found in very few cases.
Our results provide the first reliable and comprehensive estimates of how explanations influence simulatability.
- Score: 97.77183117452235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithmic approaches to interpreting machine learning models have
proliferated in recent years. We carry out human subject tests that are the
first of their kind to isolate the effect of algorithmic explanations on a key
aspect of model interpretability, simulatability, while avoiding important
confounding experimental factors. A model is simulatable when a person can
predict its behavior on new inputs. Through two kinds of simulation tests
involving text and tabular data, we evaluate five explanation methods: (1)
LIME, (2) Anchor, (3) Decision Boundary, (4) a Prototype model, and (5) a
Composite approach that combines explanations from each method. Clear evidence
of method effectiveness is found in very few cases: LIME improves
simulatability in tabular classification, and our Prototype method is effective
in counterfactual simulation tests. We also collect subjective ratings of
explanations, but we do not find that ratings are predictive of how helpful
explanations are. Our results provide the first reliable and comprehensive
estimates of how explanations influence simulatability across a variety of
explanation methods and data domains. We show that (1) we need to be careful
about the metrics we use to evaluate explanation methods, and (2) there is
significant room for improvement in current methods. All our supporting code,
data, and models are publicly available at:
https://github.com/peterbhase/InterpretableNLP-ACL2020
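As a rough illustration of the simulation tests described above, the sketch below generates a LIME explanation for a tabular classifier (the setting where the paper reports LIME helps) and computes a simulatability effect as the change in how often a person's guesses match the model's outputs before versus after seeing explanations. This is a minimal sketch under assumptions, not the paper's protocol: the study uses human subjects and its own datasets, whereas the scikit-learn model, the breast-cancer data, and the random stand-in "user guess" arrays here are hypothetical placeholders.

```python
# Sketch only: the paper's tests use human subjects; the dataset, model,
# and "user guess" arrays below are hypothetical stand-ins used to
# illustrate the quantities involved.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
X, y = data.data, data.target
model = RandomForestClassifier(random_state=0).fit(X, y)

# 1) Produce a LIME explanation for one tabular instance.
explainer = LimeTabularExplainer(
    X,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(explanation.as_list())  # top weighted features for this prediction

# 2) Simulatability effect: how much better people predict the MODEL's output
#    after seeing explanations than before. Both guess arrays are placeholders.
model_outputs = model.predict(X[:20])
pre_user_guesses = np.random.randint(0, 2, size=20)   # guesses without explanations
post_user_guesses = np.random.randint(0, 2, size=20)  # guesses after explanations
effect = (post_user_guesses == model_outputs).mean() - (pre_user_guesses == model_outputs).mean()
print(f"Estimated simulatability effect: {effect:+.2f}")
```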
Related papers
- Interpretability in Symbolic Regression: a benchmark of Explanatory Methods using the Feynman data set [0.0]
The interpretability of machine learning models plays a role as important as model accuracy.
This paper proposes a benchmark scheme to evaluate explanatory methods to explain regression models.
Results have shown that Symbolic Regression models can be an interesting alternative to white-box and black-box models.
arXiv Detail & Related papers (2024-04-08T23:46:59Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution about the usefulness of saliency-based explanations and their potential to be misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Right for the Wrong Reason: Can Interpretable ML Techniques Detect Spurious Correlations? [2.7558542803110244]
We propose a rigorous evaluation strategy to assess an explanation technique's ability to correctly identify spurious correlations.
We find that the post-hoc technique SHAP and the inherently interpretable Attri-Net provide the best performance.
arXiv Detail & Related papers (2023-07-23T14:43:17Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness of MACE, with better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Don't Explain Noise: Robust Counterfactuals for Randomized Ensembles [50.81061839052459]
We formalize the generation of robust counterfactual explanations as a probabilistic problem.
We show the link between the robustness of ensemble models and the robustness of base learners.
Our method achieves high robustness with only a small increase in the distance from counterfactual explanations to their initial observations.
arXiv Detail & Related papers (2022-05-27T17:28:54Z)
- Evaluation of Local Model-Agnostic Explanations Using Ground Truth [4.278336455989584]
Explanation techniques are commonly evaluated using human-grounded methods.
We propose a functionally-grounded evaluation procedure for local model-agnostic explanation techniques.
arXiv Detail & Related papers (2021-06-04T13:47:31Z)
- Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time.
We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
arXiv Detail & Related papers (2021-06-01T20:36:48Z)
- Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? [86.60613602337246]
We introduce a leakage-adjusted simulatability (LAS) metric for evaluating NL explanations.
LAS measures how well explanations help an observer predict a model's output, while controlling for how explanations can directly leak the output.
We frame explanation generation as a multi-agent game and optimize explanations for simulatability while penalizing label leakage.
arXiv Detail & Related papers (2020-10-08T16:59:07Z)
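The leakage-adjusted simulatability (LAS) entry directly above describes a metric that rewards explanations for helping an observer predict the model's output while controlling for explanations that simply restate (leak) that output. The sketch below is one rough reading of that idea, not necessarily the cited paper's exact formulation: examples are split by whether the explanation alone already reveals the model's prediction, and the explanation's accuracy gain is averaged across the leaking and non-leaking groups so that trivially label-leaking explanations do not dominate the score. All function and variable names here are hypothetical.

```python
# Rough sketch of a leakage-adjusted simulatability score; the grouping and
# averaging follow one reading of the LAS idea, not necessarily the cited
# paper's exact definition. All inputs are hypothetical label arrays.
import numpy as np

def las_score(model_out, sim_with_expl, sim_without_expl, sim_from_expl_only):
    """All arguments are arrays of predicted labels, one entry per example.

    model_out          -- the model's actual outputs (what the simulator tries to predict)
    sim_with_expl      -- simulator predictions given input + explanation
    sim_without_expl   -- simulator predictions given input only (baseline)
    sim_from_expl_only -- simulator predictions given the explanation alone
                          (used to flag label leakage)
    """
    leaking = sim_from_expl_only == model_out  # explanation alone reveals the output
    effects = []
    for group in (leaking, ~leaking):
        if group.any():
            gain = ((sim_with_expl[group] == model_out[group]).mean()
                    - (sim_without_expl[group] == model_out[group]).mean())
            effects.append(gain)
    # Average the gain over the leaking and non-leaking subsets.
    return float(np.mean(effects))

# Hypothetical usage with random stand-in predictions
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 100)
print(las_score(y, rng.integers(0, 2, 100), rng.integers(0, 2, 100), rng.integers(0, 2, 100)))
```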