Scrutinizing XAI using linear ground-truth data with suppressor
variables
- URL: http://arxiv.org/abs/2111.07473v2
- Date: Thu, 22 Jun 2023 14:58:57 GMT
- Title: Scrutinizing XAI using linear ground-truth data with suppressor
variables
- Authors: Rick Wilming, Céline Budding, Klaus-Robert Müller, Stefan Haufe
- Abstract summary: Saliency methods rank input features according to some measure of 'importance'.
It has been demonstrated that some saliency methods can highlight features that have no statistical association with the prediction target (suppressor variables).
- Score: 0.8602553195689513
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) is increasingly often used to inform high-stakes
decisions. As complex ML models (e.g., deep neural networks) are often
considered black boxes, a wealth of procedures has been developed to shed light
on their inner workings and the ways in which their predictions come about,
defining the field of 'explainable AI' (XAI). Saliency methods rank input
features according to some measure of 'importance'. Such methods are difficult
to validate since a formal definition of feature importance is, thus far,
lacking. It has been demonstrated that some saliency methods can highlight
features that have no statistical association with the prediction target
(suppressor variables). To avoid misinterpretations due to such behavior, we
propose the actual presence of such an association as a necessary condition and
objective preliminary definition for feature importance. We carefully crafted a
ground-truth dataset in which all statistical dependencies are well-defined and
linear, serving as a benchmark to study the problem of suppressor variables. We
evaluate common explanation methods including LRP, DTD, PatternNet,
PatternAttribution, LIME, Anchors, SHAP, and permutation-based methods with
respect to our objective definition. We show that most of these methods are
unable to distinguish important features from suppressors in this setting.
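The core issue can be reproduced with a minimal linear example. The sketch below (a toy illustration assuming numpy and scikit-learn, not the paper's actual benchmark; variable names and parameters are chosen for exposition) constructs two features that share a distractor component: x1 carries the class signal plus the distractor, while the suppressor x2 contains only the distractor and thus has no statistical association with the target. A linear classifier nonetheless places a large weight on x2, because subtracting it cancels the shared noise.

```python
# Minimal sketch of a linear suppressor-variable setup (illustrative only,
# not the paper's benchmark; numpy and scikit-learn assumed available).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

y = rng.integers(0, 2, size=n)        # binary prediction target
z = 2.0 * y - 1.0                     # class signal in {-1, +1}
d = rng.normal(scale=2.0, size=n)     # distractor shared by both features

x1 = z + d                            # informative feature: signal + distractor
x2 = d                                # suppressor: distractor only, no association with y
X = np.column_stack([x1, x2])

# The suppressor has (essentially) zero statistical association with the target ...
print("corr(x2, y):", np.corrcoef(x2, y)[0, 1])    # ~ 0

# ... yet the optimal linear readout is x1 - x2 = z, so the fitted model
# assigns x2 a large (negative) weight to cancel the shared distractor.
clf = LogisticRegression().fit(X, y)
print("weights [x1, x2]:", clf.coef_[0])           # roughly [+w, -w]
print("train accuracy:", clf.score(X, y))          # close to 1.0
```

Any explanation that simply mirrors such model weights (or the gradients of a linear model) would rank x2 alongside x1, which is exactly the failure mode the benchmark is designed to expose under the association-based definition of importance proposed above.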
Related papers
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Debiasing Machine Learning Models by Using Weakly Supervised Learning [3.3298048942057523]
We tackle the problem of bias mitigation of algorithmic decisions in a setting where both the output of the algorithm and the sensitive variable are continuous.
Typical examples are unfair decisions made with respect to age or financial status.
Our bias mitigation strategy is a weakly supervised learning method which requires that a small portion of the data can be measured in a fair manner.
arXiv Detail & Related papers (2024-02-23T18:11:32Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc model XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner-workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
- BASED-XAI: Breaking Ablation Studies Down for Explainable Artificial Intelligence [1.2948254191169823]
We show how varying perturbations can help to avoid potentially flawed conclusions.
We also show how treatment of categorical variables is an important consideration in both post-hoc explainability and ablation studies.
arXiv Detail & Related papers (2022-07-12T14:38:37Z)
- Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning [80.20302993614594]
We provide a statistical analysis to overcome drawbacks of Laplacian regularization.
We unveil a large body of spectral filtering methods that exhibit desirable behaviors.
We provide realistic computational guidelines in order to make our method usable with large amounts of data.
arXiv Detail & Related papers (2020-09-09T14:28:54Z)
- Explaining Predictions by Approximating the Local Decision Boundary [3.60160227126201]
We present a new procedure for local decision boundary approximation (DBA).
We train a variational autoencoder to learn a Euclidean latent space of encoded data representations.
We exploit attribute annotations to map the latent space to attributes that are meaningful to the user.
arXiv Detail & Related papers (2020-06-14T19:12:42Z)
- Post-hoc explanation of black-box classifiers using confident itemsets [12.323983512532651]
Black-box Artificial Intelligence (AI) methods have been widely utilized to build predictive models.
It is difficult to trust decisions made by such methods since their inner workings and decision logic are hidden from the user.
arXiv Detail & Related papers (2020-05-05T08:11:24Z)