The Explanation Game -- Rekindled (Extended Version)
- URL: http://arxiv.org/abs/2501.11429v2
- Date: Sat, 15 Feb 2025 03:49:28 GMT
- Title: The Explanation Game -- Rekindled (Extended Version)
- Authors: Joao Marques-Silva, Xuanxiang Huang, Olivier Letoffe
- Abstract summary: Recent work demonstrated the existence of critical flaws in the current use of Shapley values in explainable AI (XAI).
This paper proposes a novel definition of SHAP scores that overcomes existing flaws.
- Score: 3.3766484312332303
- Abstract: Recent work demonstrated the existence of critical flaws in the current use of Shapley values in explainable AI (XAI), i.e. the so-called SHAP scores. These flaws are significant in that the scores provided to a human decision-maker can be misleading. Although these negative results might appear to indicate that Shapley values ought not be used in XAI, this paper argues otherwise. Concretely, this paper proposes a novel definition of SHAP scores that overcomes existing flaws. Furthermore, the paper outlines a practically efficient solution for the rigorous estimation of the novel SHAP scores. Preliminary experimental results confirm our claims, and further underscore the flaws of the current SHAP scores.
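For context, the SHAP scores critiqued above are Shapley values computed with a characteristic function given by the conditional expectation of the classifier's output when a subset of features is fixed to the instance's values. The sketch below is a minimal illustration of that standard (current) definition, not the paper's proposed novel definition or its implementation; the toy Boolean classifier, the instance, and the uniform feature-independent distribution are illustrative assumptions.

```python
# Minimal sketch of the *current* SHAP score definition: the Shapley value of
# each feature, with characteristic function v(S) taken as the expected
# classifier output conditioned on fixing the features in S to the instance's
# values. Toy classifier and uniform distribution are illustrative assumptions.
from itertools import combinations, product
from math import factorial

N = (0, 1, 2)          # feature indices
instance = (1, 1, 0)   # the point being explained (hypothetical)

def classifier(x):
    # Toy Boolean classifier: x0 AND (x1 OR x2).
    return int(x[0] and (x[1] or x[2]))

def v(S):
    """E[classifier(X) | X_S = instance_S] under a uniform distribution."""
    free = [i for i in N if i not in S]
    total = 0
    for bits in product((0, 1), repeat=len(free)):
        x = list(instance)
        for i, b in zip(free, bits):
            x[i] = b
        total += classifier(x)
    return total / 2 ** len(free)

def shap_score(i):
    """Shapley value of feature i for the characteristic function v."""
    n = len(N)
    others = [j for j in N if j != i]
    score = 0.0
    for k in range(n):
        for S in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            score += weight * (v(set(S) | {i}) - v(set(S)))
    return score

if __name__ == "__main__":
    for i in N:
        print(f"feature {i}: SHAP score = {shap_score(i):+.4f}")
```

Under these assumptions the scores satisfy efficiency, summing to classifier(instance) - v(∅). The flaws documented in the paper concern cases where scores computed this way misrepresent feature (ir)relevance, e.g., assigning influence to irrelevant features, which is what the proposed redefinition of the characteristic function is meant to address.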
Related papers
- The Mirage of Model Editing: Revisiting Evaluation in the Wild [70.17413507444704]
We study the effectiveness of model editing in question answering applications.
Our single-editing experiments indicate that current editing methods perform substantially worse than previously reported.
Our analysis provides a fundamental reexamination of both the real-world applicability of existing model editing methods and their evaluation practices.
arXiv Detail & Related papers (2025-02-16T15:57:55Z)
- SHAP scores fail pervasively even when Lipschitz succeeds [3.3766484312332303]
Recent work devised examples of machine learning (ML) classifiers for which the computed SHAP scores are thoroughly unsatisfactory.
It was unclear how general the issues identified with SHAP scores were.
This paper shows that for Boolean classifiers there are arbitrarily many examples for which the SHAP scores must be deemed unsatisfactory.
arXiv Detail & Related papers (2024-12-18T14:02:15Z)
- F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI [15.314388210699443]
Fine-tuned Fidelity (F-Fidelity) is a robust evaluation framework for XAI.
We show that F-Fidelity significantly improves upon prior evaluation metrics in recovering the ground-truth ranking of explainers.
We also show that, given a faithful explainer, the F-Fidelity metric can be used to compute the sparsity of influential input components.
arXiv Detail & Related papers (2024-10-03T20:23:06Z)
- Towards trustable SHAP scores [3.3766484312332303]
This paper investigates how SHAP scores can be modified so as to extend axiomatic aggregations to the case of Shapley values in XAI.
The proposed new definition of SHAP scores avoids all the known cases where unsatisfactory results have been identified.
arXiv Detail & Related papers (2024-04-30T10:39:20Z)
- The Distributional Uncertainty of the SHAP score in Explainable Machine Learning [2.655371341356892]
We propose a principled framework for reasoning about SHAP scores under unknown entity population distributions.
We study the basic problems of finding the maxima and minima of the SHAP score, viewed as a function of that unknown distribution, which allows us to determine tight ranges for the SHAP scores of all features.
arXiv Detail & Related papers (2024-01-23T13:04:02Z)
- Goodhart's Law Applies to NLP's Explanation Benchmarks [57.26445915212884]
We critically examine two sets of metrics: the ERASER metrics (comprehensiveness and sufficiency) and the EVAL-X metrics.
We show that we can inflate a model's comprehensiveness and sufficiency scores dramatically without altering its predictions or explanations on in-distribution test inputs.
Our results raise doubts about the ability of current metrics to guide explainability research, underscoring the need for a broader reassessment of what precisely these metrics are intended to capture.
arXiv Detail & Related papers (2023-08-28T03:03:03Z)
- Augmentation by Counterfactual Explanation -- Fixing an Overconfident Classifier [11.233334009240947]
A highly accurate but overconfident model is ill-suited for deployment in critical applications such as healthcare and autonomous driving.
This paper proposes an application of counterfactual explanations to fixing an overconfident classifier.
arXiv Detail & Related papers (2022-10-21T18:53:16Z)
- Shortcomings of Question Answering Based Factuality Frameworks for Error Localization [51.01957350348377]
We show that question answering (QA)-based factuality metrics fail to correctly identify error spans in generated summaries.
Our analysis reveals a major reason for such poor localization: questions generated by the QG module often inherit errors from non-factual summaries, and these errors are then propagated further into downstream modules.
Our experiments conclusively show that there are fundamental issues with localization in the QA framework that cannot be fixed solely by stronger QA and QG models.
arXiv Detail & Related papers (2022-10-13T05:23:38Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularization approaches.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
- Latent Opinions Transfer Network for Target-Oriented Opinion Words Extraction [63.70885228396077]
We propose a novel model that transfers opinion knowledge from resource-rich review sentiment classification datasets to the low-resource task of target-oriented opinion words extraction (TOWE).
Our model achieves better performance than other state-of-the-art methods and significantly outperforms the base model that does not transfer opinion knowledge.
arXiv Detail & Related papers (2020-01-07T11:50:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.