Scrutinizing XAI using linear ground-truth data with suppressor
variables
- URL: http://arxiv.org/abs/2111.07473v2
- Date: Thu, 22 Jun 2023 14:58:57 GMT
- Title: Scrutinizing XAI using linear ground-truth data with suppressor
variables
- Authors: Rick Wilming, Céline Budding, Klaus-Robert Müller, Stefan Haufe
- Abstract summary: Saliency methods rank input features according to some measure of 'importance'.
It has been demonstrated that some saliency methods can highlight features that have no statistical association with the prediction target (suppressor variables).
- Score: 0.8602553195689513
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) is increasingly often used to inform high-stakes
decisions. As complex ML models (e.g., deep neural networks) are often
considered black boxes, a wealth of procedures has been developed to shed light
on their inner workings and the ways in which their predictions come about,
defining the field of 'explainable AI' (XAI). Saliency methods rank input
features according to some measure of 'importance'. Such methods are difficult
to validate since a formal definition of feature importance is, thus far,
lacking. It has been demonstrated that some saliency methods can highlight
features that have no statistical association with the prediction target
(suppressor variables). To avoid misinterpretations due to such behavior, we
propose the actual presence of such an association as a necessary condition and
objective preliminary definition for feature importance. We carefully crafted a
ground-truth dataset in which all statistical dependencies are well-defined and
linear, serving as a benchmark to study the problem of suppressor variables. We
evaluate common explanation methods including LRP, DTD, PatternNet,
PatternAttribution, LIME, Anchors, SHAP, and permutation-based methods with
respect to our objective definition. We show that most of these methods are
unable to distinguish important features from suppressors in this setting.
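The core issue can be reproduced with a minimal linear example. The sketch below (a toy illustration assuming numpy and scikit-learn, not the paper's actual benchmark; variable names and parameters are chosen for exposition) constructs two features that share a distractor component: x1 carries the class signal plus the distractor, while the suppressor x2 contains only the distractor and thus has no statistical association with the target. A linear classifier nonetheless places a large weight on x2, because subtracting it cancels the shared noise.

```python
# Minimal sketch of a linear suppressor-variable setup (illustrative only,
# not the paper's benchmark; numpy and scikit-learn assumed available).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

y = rng.integers(0, 2, size=n)        # binary prediction target
z = 2.0 * y - 1.0                     # class signal in {-1, +1}
d = rng.normal(scale=2.0, size=n)     # distractor shared by both features

x1 = z + d                            # informative feature: signal + distractor
x2 = d                                # suppressor: distractor only, no association with y
X = np.column_stack([x1, x2])

# The suppressor has (essentially) zero statistical association with the target ...
print("corr(x2, y):", np.corrcoef(x2, y)[0, 1])    # ~ 0

# ... yet the optimal linear readout is x1 - x2 = z, so the fitted model
# assigns x2 a large (negative) weight to cancel the shared distractor.
clf = LogisticRegression().fit(X, y)
print("weights [x1, x2]:", clf.coef_[0])           # roughly [+w, -w]
print("train accuracy:", clf.score(X, y))          # close to 1.0
```

Any explanation that simply mirrors such model weights (or the gradients of a linear model) would rank x2 alongside x1, which is exactly the failure mode the benchmark is designed to expose under the association-based definition of importance proposed above.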
Related papers
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z)
- Debiasing Machine Learning Models by Using Weakly Supervised Learning [3.3298048942057523]
We tackle the problem of bias mitigation of algorithmic decisions in a setting where both the output of the algorithm and the sensitive variable are continuous.
Typical examples are unfair decisions made with respect to age or financial status.
Our bias mitigation strategy is a weakly supervised learning method which requires that a small portion of the data can be measured in a fair manner.
arXiv Detail & Related papers (2024-02-23T18:11:32Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc model XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner-workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
- BASED-XAI: Breaking Ablation Studies Down for Explainable Artificial Intelligence [1.2948254191169823]
We show how varying perturbations can help to avoid potentially flawed conclusions.
We also show how treatment of categorical variables is an important consideration in both post-hoc explainability and ablation studies.
arXiv Detail & Related papers (2022-07-12T14:38:37Z)
- Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning [80.20302993614594]
We provide a statistical analysis to overcome drawbacks of Laplacian regularization.
We unveil a large body of spectral filtering methods that exhibit desirable behaviors.
We provide realistic computational guidelines in order to make our method usable with large amounts of data.
arXiv Detail & Related papers (2020-09-09T14:28:54Z)
- Explaining Predictions by Approximating the Local Decision Boundary [3.60160227126201]
We present a new procedure for local decision boundary approximation (DBA).
We train a variational autoencoder to learn a Euclidean latent space of encoded data representations.
We exploit attribute annotations to map the latent space to attributes that are meaningful to the user.
arXiv Detail & Related papers (2020-06-14T19:12:42Z)
- Post-hoc explanation of black-box classifiers using confident itemsets [12.323983512532651]
Black-box Artificial Intelligence (AI) methods have been widely utilized to build predictive models.
It is difficult to trust decisions made by such methods since their inner workings and decision logic are hidden from the user.
arXiv Detail & Related papers (2020-05-05T08:11:24Z)