Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI
- URL: http://arxiv.org/abs/2003.07258v2
- Date: Tue, 9 Feb 2021 16:18:05 GMT
- Title: Ground Truth Evaluation of Neural Network Explanations with CLEVR-XAI
- Authors: Leila Arras, Ahmed Osman, Wojciech Samek
- Abstract summary: We propose a ground-truth-based evaluation framework for XAI methods built on the CLEVR visual question answering task.
Our framework provides a (1) selective, (2) controlled and (3) realistic testbed for the evaluation of neural network explanations.
- Score: 12.680653816836541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rise of deep learning in today's applications has entailed an increasing need to explain the model's decisions beyond prediction performance, in order to foster trust and accountability. Recently, the field of explainable AI (XAI) has developed methods that provide such explanations for already trained neural networks. In computer vision tasks such explanations, termed heatmaps, visualize the contributions of individual pixels to the prediction. So far XAI methods along with their heatmaps were mainly validated qualitatively via human-based assessment, or evaluated through auxiliary proxy tasks such as pixel perturbation, weak object localization or randomization tests. Due to the lack of an objective and commonly accepted quality measure for heatmaps, it was debatable which XAI method performs best and whether explanations can be trusted at all. In the present work, we tackle the problem by proposing a ground-truth-based evaluation framework for XAI methods built on the CLEVR visual question answering task. Our framework provides a (1) selective, (2) controlled and (3) realistic testbed for the evaluation of neural network explanations. We compare ten different explanation methods, resulting in new insights about the quality and properties of XAI methods, sometimes contradicting conclusions from previous comparative studies. The CLEVR-XAI dataset and the benchmarking code can be found at https://github.com/ahmedmagdiosman/clevr-xai.
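A minimal sketch of how a heatmap can be scored against the ground-truth masks such a framework provides is shown below, using mass- and rank-style scores in the spirit of the abstract; the function names and NumPy implementation are illustrative, not the repository's actual API.

```python
import numpy as np

def relevance_mass_accuracy(heatmap: np.ndarray, mask: np.ndarray) -> float:
    """Fraction of total (positive) relevance that falls inside the ground-truth mask."""
    relevance = np.clip(heatmap, 0, None)          # keep positive evidence only
    total = relevance.sum()
    return float(relevance[mask.astype(bool)].sum() / total) if total > 0 else 0.0

def relevance_rank_accuracy(heatmap: np.ndarray, mask: np.ndarray) -> float:
    """Fraction of the top-k most relevant pixels lying inside the mask, with k = mask size."""
    k = int(mask.sum())
    top_k = np.argsort(heatmap.ravel())[::-1][:k]  # indices of the k highest-relevance pixels
    return float(mask.ravel()[top_k].sum() / k) if k > 0 else 0.0

# Toy example: a 4x4 heatmap scored against a mask covering a 2x2 object in the top-left corner.
heatmap = np.random.rand(4, 4)
mask = np.zeros((4, 4)); mask[:2, :2] = 1
print(relevance_mass_accuracy(heatmap, mask), relevance_rank_accuracy(heatmap, mask))
```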
Related papers
- Are Objective Explanatory Evaluation metrics Trustworthy? An Adversarial Analysis [12.921307214813357]
The aim of the paper is to introduce a novel explanatory technique called SHifted Adversaries using Pixel Elimination (SHAPE).
We show that SHAPE is, in fact, an adversarial explanation that fools the causal metrics employed to measure the robustness and reliability of popular importance-based visual XAI methods.
arXiv Detail & Related papers (2024-06-12T02:39:46Z)
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
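CLIP-based approaches of this kind typically score an image against natural-language quality prompts; a minimal sketch of such antonym-prompt scoring (not necessarily this paper's method) is shown below, with random tensors standing in for real CLIP image/text embeddings.

```python
import torch
import torch.nn.functional as F

# Placeholder embeddings: in practice these would come from a CLIP image encoder and
# text encoder (e.g. via open_clip); random tensors are used here so the sketch runs.
image_emb = F.normalize(torch.randn(1, 512), dim=-1)
prompt_embs = F.normalize(torch.randn(2, 512), dim=-1)   # ["a good photo.", "a bad photo."]

# Antonym-prompt scoring: softmax over the two cosine similarities yields a quality score in [0, 1].
logits = 100.0 * image_emb @ prompt_embs.T                # temperature-scaled similarities
quality = logits.softmax(dim=-1)[0, 0].item()             # probability assigned to the "good" prompt
print(f"predicted quality score: {quality:.3f}")
```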
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
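The curation step described above can be illustrated as plain boolean filtering once a dataset-level analysis has flagged irrelevant concepts; the sketch below is hypothetical and uses placeholder data, not the SOXAI pipeline itself.

```python
import numpy as np

# Hypothetical illustration: suppose a dataset-level analysis assigns each training sample a
# dominant concept and flags some concepts as irrelevant (e.g. background artifacts).
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 32))                 # training inputs (placeholder)
dominant_concept = rng.integers(0, 5, size=1000)       # concept id per sample (placeholder)
irrelevant_concepts = {3, 4}                           # flagged by the dataset-level analysis

# The curation itself is then just boolean filtering of the training set.
keep = ~np.isin(dominant_concept, list(irrelevant_concepts))
curated_features = features[keep]
print(f"kept {keep.sum()} of {len(keep)} samples after removing irrelevant-concept samples")
```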
arXiv Detail & Related papers (2023-06-14T23:24:01Z)
- An Experimental Investigation into the Evaluation of Explainability Methods [60.54170260771932]
This work compares 14 different metrics when applied to nine state-of-the-art XAI methods and three dummy methods (e.g., random saliency maps) used as references.
Experimental results show which of these metrics produces highly correlated results, indicating potential redundancy.
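A redundancy analysis of this kind can be sketched as a rank-correlation matrix over a metrics-by-methods score table; the numbers below are random placeholders, not the paper's results.

```python
import numpy as np
from scipy.stats import spearmanr

# Placeholder score matrix: rows = evaluation metrics, columns = XAI methods
# (14 metrics and 12 methods, matching the counts mentioned in the entry above).
rng = np.random.default_rng(0)
scores = rng.random((14, 12))

# Rank correlation between every pair of metrics; highly correlated pairs suggest redundancy.
corr, _ = spearmanr(scores, axis=1)          # 14x14 Spearman correlation matrix
redundant = [(i, j) for i in range(14) for j in range(i + 1, 14) if corr[i, j] > 0.9]
print(f"{len(redundant)} metric pairs with rank correlation > 0.9")
```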
arXiv Detail & Related papers (2023-05-25T08:07:07Z)
- Finding the right XAI method -- A Guide for the Evaluation and Ranking of Explainable AI Methods in Climate Science [2.8877394238963214]
We introduce XAI evaluation in the climate context and discuss different desired explanation properties.
We find that the XAI methods Integrated Gradients, Layer-wise Relevance Propagation, and Input times Gradient exhibit considerable robustness, faithfulness, and complexity.
We find architecture-dependent performance differences regarding robustness, complexity and localization skills of different XAI methods.
arXiv Detail & Related papers (2023-03-01T16:54:48Z)
- Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to 17.0% AUROC improvement over the state of the art and can serve as a simple yet strong baseline in this under-developed area.
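GNNSafe builds on energy-based OOD scores; a minimal sketch of the basic energy score derived from classifier logits (without the paper's graph-based propagation) is shown below, using random logits as placeholders.

```python
import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Negative free energy: higher values indicate more in-distribution-like inputs."""
    return temperature * torch.logsumexp(logits / temperature, dim=-1)

# Placeholder per-node logits from a trained GNN classifier (8 nodes, 5 classes).
logits = torch.randn(8, 5)
scores = energy_score(logits)
threshold = 0.0                      # chosen on a validation split in practice
is_ood = scores < threshold          # low energy score -> flag node as out-of-distribution
print(is_ood)
```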
arXiv Detail & Related papers (2023-02-06T16:38:43Z)
- Neural Causal Models for Counterfactual Identification and Estimation [62.30444687707919]
We study the evaluation of counterfactual statements through neural models.
First, we show that neural causal models (NCMs) are expressive enough for this task.
Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions.
arXiv Detail & Related papers (2022-09-30T18:29:09Z)
- Evaluating Explainable Artificial Intelligence Methods for Multi-label Deep Learning Classification Tasks in Remote Sensing [0.0]
We develop deep learning models with state-of-the-art performance in benchmark datasets.
Ten XAI methods were employed to understand and interpret the models' predictions.
Occlusion, Grad-CAM and LIME were found to be the most interpretable and reliable XAI methods.
arXiv Detail & Related papers (2021-04-03T11:13:14Z)
- Neural Network Attribution Methods for Problems in Geoscience: A Novel Synthetic Benchmark Dataset [0.05156484100374058]
We provide a framework to generate attribution benchmark datasets for regression problems in the geosciences.
We train a fully-connected network to learn the underlying function that was used for simulation.
We compare estimated attribution heatmaps from different XAI methods to the ground truth in order to identify examples where specific XAI methods perform well or poorly.
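A minimal, linear variant of such a benchmark is sketched below: the ground-truth attribution of each feature is known analytically, and a (here simulated) estimated attribution is compared to it per sample; this is illustrative only and simpler than the paper's synthetic function.

```python
import numpy as np

# Synthetic regression benchmark with known per-feature attributions:
# y = sum_i w_i * x_i, so the ground-truth attribution of feature i for a sample is w_i * x_i.
rng = np.random.default_rng(0)
w = rng.normal(size=10)
X = rng.normal(size=(500, 10))
true_attr = X * w                                  # ground-truth attribution maps (500 x 10)

# Placeholder for attributions estimated by an XAI method on a trained network
# (e.g. input-times-gradient); here we perturb the truth to simulate an imperfect method.
est_attr = true_attr + 0.3 * rng.normal(size=true_attr.shape)

# Per-sample Pearson correlation between estimated and ground-truth attributions.
def rowwise_corr(a, b):
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

print("median attribution correlation:", np.median(rowwise_corr(est_attr, true_attr)))
```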
arXiv Detail & Related papers (2021-03-18T03:39:17Z)
- Quantifying Explainability of Saliency Methods in Deep Neural Networks with a Synthetic Dataset [16.1448256306394]
This paper introduces a synthetic dataset that can be generated ad hoc, together with ground-truth heatmaps, for a more objective quantitative assessment.
Each sample is an image of a cell with easily recognized features that can be distinguished from the localization ground-truth mask.
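One simple way such ground-truth masks can be used to quantify a saliency method (not necessarily the paper's exact metric) is the IoU between the top-salient pixels and the mask, sketched below with a random saliency map as a placeholder.

```python
import numpy as np

def saliency_iou(saliency: np.ndarray, mask: np.ndarray, keep_fraction: float = 0.1) -> float:
    """IoU between the top `keep_fraction` most salient pixels and the ground-truth mask."""
    k = max(1, int(keep_fraction * saliency.size))
    threshold = np.partition(saliency.ravel(), -k)[-k]   # value of the k-th most salient pixel
    binary_saliency = saliency >= threshold
    gt = mask.astype(bool)
    union = np.logical_or(binary_saliency, gt).sum()
    return float(np.logical_and(binary_saliency, gt).sum() / union) if union > 0 else 0.0

# Toy example: random saliency map scored against a square ground-truth mask.
saliency = np.random.rand(64, 64)
mask = np.zeros((64, 64)); mask[20:40, 20:40] = 1
print(saliency_iou(saliency, mask))
```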
arXiv Detail & Related papers (2020-09-07T05:55:24Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works aimed at attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)