XAI Benchmark for Visual Explanation
- URL: http://arxiv.org/abs/2310.08537v2
- Date: Wed, 22 Nov 2023 01:35:45 GMT
- Title: XAI Benchmark for Visual Explanation
- Authors: Yifei Zhang, Siyi Gu, James Song, Bo Pan, Guangji Bai, Liang Zhao
- Abstract summary: We develop a benchmark for visual explanation, consisting of eight datasets with human explanation annotations.
We devise a visual explanation pipeline that includes data loading, explanation generation, and method evaluation.
Our proposed benchmarks facilitate a fair evaluation and comparison of visual explanation methods.
- Score: 15.687509357300847
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rise of deep learning has ushered in significant progress in computer
vision (CV) tasks, yet the "black box" nature of these models often precludes
interpretability. This challenge has spurred the development of Explainable
Artificial Intelligence (XAI), which generates explanations of AI's
decision-making process. An explanation should not only faithfully reflect the
model's true reasoning process (i.e., faithfulness) but also align with human
reasoning (i.e., alignment). Within XAI, visual explanations employ visual cues
to elucidate the reasoning behind machine learning models, particularly in
image processing, by highlighting the image regions most important to their
predictions. Despite the considerable body of research on visual explanations,
standardized benchmarks for evaluating them are seriously underdeveloped. In
particular, to evaluate alignment, existing works usually merely illustrate
visual explanations for a few images, or hire referees to rate explanation
quality via ad-hoc questionnaires. Neither approach yields a standardized,
quantitative, and comprehensive evaluation. To address this
issue, we develop a benchmark for visual explanation, consisting of eight
datasets with human explanation annotations from various domains, accommodating
both post-hoc and intrinsic visual explanation methods. Additionally, we devise
a visual explanation pipeline that includes data loading, explanation
generation, and method evaluation. Our proposed benchmarks facilitate a fair
evaluation and comparison of visual explanation methods. Building on our
curated collection of datasets, we benchmarked eight existing visual
explanation methods and conducted a thorough comparison across four selected
datasets using six alignment-based and causality-based metrics. Our benchmark
will be accessible through our website https://xaidataset.github.io.
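To make the three pipeline stages concrete, here is a minimal Python sketch. It is an illustration only, not the benchmark's released API: the random image and square mask stand in for a dataset loader with human explanation annotations, Grad-CAM via captum stands in for explanation generation, and a simple IoU score stands in for the alignment-based metrics.

```python
# Minimal sketch of the three pipeline stages; the image, the human mask,
# and the IoU threshold are all placeholders, not the benchmark's API.
import torch
import torchvision.models as models
from captum.attr import LayerGradCam, LayerAttribution

def iou_alignment(saliency, human_mask, thresh=0.5):
    """Alignment metric: IoU between the binarized saliency map
    and the human-annotated explanation mask."""
    pred = saliency > thresh * saliency.max()
    gt = human_mask.bool()
    inter = (pred & gt).sum().float()
    union = (pred | gt).sum().float()
    return (inter / union.clamp(min=1)).item()

# 1) Data loading: an image paired with a human explanation mask.
image = torch.rand(1, 3, 224, 224)          # placeholder image
human_mask = torch.zeros(224, 224)
human_mask[64:160, 64:160] = 1              # placeholder annotation

# 2) Explanation generation: Grad-CAM as one post-hoc method.
model = models.resnet50(weights="IMAGENET1K_V2").eval()
target = model(image).argmax(dim=1).item()
gradcam = LayerGradCam(model, model.layer4)
attr = gradcam.attribute(image, target=target)
saliency = LayerAttribution.interpolate(attr, (224, 224)).squeeze()

# 3) Method evaluation: score alignment with the human annotation.
print(f"IoU alignment: {iou_alignment(saliency, human_mask):.3f}")
```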
Related papers
- MEGL: Multimodal Explanation-Guided Learning [23.54169888224728]
We propose a novel Multimodal Explanation-Guided Learning (MEGL) framework to enhance model interpretability and improve classification performance.
Our Saliency-Driven Textual Grounding (SDTG) approach integrates spatial information from visual explanations into textual rationales, providing spatially grounded and contextually rich explanations.
We validate MEGL on two new datasets, Object-ME and Action-ME, for image classification with multimodal explanations.
arXiv Detail & Related papers (2024-11-20T05:57:00Z)
- Intrinsic Subgraph Generation for Interpretable Graph based Visual Question Answering [27.193336817953142]
We introduce an interpretable approach for graph-based Visual Question Answering (VQA).
Our model is designed to intrinsically produce a subgraph during the question-answering process as its explanation.
We compare these generated subgraphs against established post-hoc explainability methods for graph neural networks, and perform a human evaluation.
arXiv Detail & Related papers (2024-03-26T12:29:18Z)
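The comparison above pits an intrinsically generated subgraph against post-hoc graph explainers. A minimal sketch of one such post-hoc baseline, scoring edges by gradient saliency on a toy message-passing model, is given below; it is written in plain PyTorch for illustration and is not one of the specific methods evaluated in the paper.

```python
# Toy post-hoc edge-saliency explainer for a GNN: rank edges by the
# gradient of the prediction w.r.t. a learnable edge mask.
import torch
import torch.nn as nn

class TinyGNN(nn.Module):
    """One round of weighted message passing + a graph-level readout."""
    def __init__(self, dim, n_classes):
        super().__init__()
        self.lin = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x, adj):
        h = torch.relu(self.lin(adj @ x))   # aggregate weighted neighbors
        return self.head(h.mean(dim=0))     # mean-pool nodes -> logits

n, dim = 6, 8
x = torch.randn(n, dim)
adj = (torch.rand(n, n) < 0.4).float()      # random toy graph
edge_mask = torch.ones_like(adj, requires_grad=True)

model = TinyGNN(dim, n_classes=3)
logits = model(x, adj * edge_mask)
logits[logits.argmax()].backward()

# Edge importance = |gradient| on existing edges; keep top-3 as the
# explanation subgraph.
importance = edge_mask.grad.abs() * adj
topk = torch.topk(importance.flatten(), k=3).indices
print("explanation edges:", [(i.item() // n, i.item() % n) for i in topk])
```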
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness of saliency-based explanations and their potential for misunderstanding.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z)
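The dataset-level idea above can be loosely illustrated: aggregate instance-level attribution vectors over the training set, cluster them into recurring concepts, and flag clusters that barely track the labels as candidates for removal. The sketch below is a simplified caricature of that workflow, with random placeholder attributions, not the authors' actual SOXAI algorithm.

```python
# Loose sketch of dataset-level (second-order) analysis: cluster
# instance attributions and flag concepts uncorrelated with labels.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Placeholder: one attribution vector per training image (e.g. pooled
# saliency features); a real pipeline would compute these with an explainer.
attributions = rng.normal(size=(500, 32))
labels = rng.integers(0, 2, size=500)

kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(attributions)

for c in range(8):
    members = labels[kmeans.labels_ == c]
    purity = max(members.mean(), 1 - members.mean())  # label agreement
    if purity < 0.55:  # concept barely predicts the label -> candidate noise
        print(f"cluster {c}: purity {purity:.2f} -> candidate irrelevant concept")
```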
- Dynamic Clue Bottlenecks: Towards Interpretable-by-Design Visual Question Answering [58.64831511644917]
We introduce an interpretable-by-design model that factors model decisions into intermediate human-legible explanations.
We show that our inherently interpretable system improves by 4.64% over a comparable black-box system on reasoning-focused questions.
arXiv Detail & Related papers (2023-05-24T08:33:15Z)
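The interpretable-by-design model above follows the familiar bottleneck pattern: predict human-legible intermediate clues, then answer only from those clues, so the clues themselves are the explanation. Below is a generic sketch of that pattern; the layer sizes and clue vocabulary are invented for illustration and this is not the paper's exact architecture.

```python
# Generic clue-bottleneck sketch: predict legible intermediate clues,
# then answer *only* from the clues, making them a faithful explanation.
import torch
import torch.nn as nn

class ClueBottleneckVQA(nn.Module):
    def __init__(self, feat_dim=512, n_clues=16, n_answers=100):
        super().__init__()
        self.clue_head = nn.Linear(feat_dim, n_clues)     # clue logits
        self.answer_head = nn.Linear(n_clues, n_answers)  # sees clues only

    def forward(self, image_feats):
        clues = torch.sigmoid(self.clue_head(image_feats))
        return self.answer_head(clues), clues

model = ClueBottleneckVQA()
answer_logits, clues = model(torch.randn(1, 512))
# The explanation is simply which clues fired:
clue_names = [f"clue_{i}" for i in range(16)]  # hypothetical vocabulary
active = [clue_names[i] for i in (clues[0] > 0.5).nonzero().flatten().tolist()]
print("answer:", answer_logits.argmax(dim=1).item(), "| clues:", active)
```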
- Towards Unsupervised Visual Reasoning: Do Off-The-Shelf Features Know How to Reason? [30.16956370267339]
We introduce a protocol to evaluate visual representations for the task of Visual Question Answering.
In order to decouple visual feature extraction from reasoning, we design a specific attention-based reasoning module.
We compare two types of visual representations, densely extracted local features and object-centric ones, against the performance of a perfect image representation based on ground truth.
arXiv Detail & Related papers (2022-12-20T14:36:45Z)
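The decoupling described in the protocol above relies on an attention-based reasoning module: the question acts as a query over the visual features. A generic version is sketched below; dimensions are arbitrary and this is an illustration of the idea, not the paper's module.

```python
# Generic attention-based reasoning module: the question attends over
# visual features, so feature extraction stays decoupled from reasoning.
import torch
import torch.nn as nn

class AttentionReasoner(nn.Module):
    def __init__(self, dim=256, n_answers=100):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(dim, n_answers)

    def forward(self, question_emb, visual_feats):
        # question_emb: (B, 1, dim); visual_feats: (B, N, dim) local features
        attended, weights = self.attn(question_emb, visual_feats, visual_feats)
        return self.classifier(attended.squeeze(1)), weights

reasoner = AttentionReasoner()
q = torch.randn(2, 1, 256)          # one query vector per question
v = torch.randn(2, 49, 256)         # e.g. a 7x7 grid of local features
logits, attn_weights = reasoner(q, v)
print(logits.shape, attn_weights.shape)  # (2, 100), (2, 1, 49)
```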
- REVEL Framework to measure Local Linear Explanations for black-box models: Deep Learning Image Classification case of study [12.49538398746092]
We propose a procedure called REVEL to evaluate different aspects of explanation quality within a theoretically coherent framework.
Experiments are carried out on four image datasets as benchmarks, where we show REVEL's descriptive and analytical power.
arXiv Detail & Related papers (2022-11-11T12:15:36Z)
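A local linear explanation and one simple quality check can be made concrete as follows: fit a linear surrogate to the black box in a neighborhood of the instance, then measure how faithfully the surrogate reproduces the black box there. The sketch below is the generic LIME-style recipe with a toy black box, not REVEL's own evaluation procedure.

```python
# LIME-style local linear explanation plus a local-fidelity check.
import numpy as np
from sklearn.linear_model import Ridge

def black_box(X):
    """Stand-in for the model's class score (hypothetical)."""
    return np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

rng = np.random.default_rng(0)
x0 = np.array([0.3, -0.7])                       # instance to explain
neighborhood = x0 + 0.1 * rng.normal(size=(200, 2))
y = black_box(neighborhood)

surrogate = Ridge(alpha=1e-3).fit(neighborhood, y)
print("local linear explanation (feature weights):", surrogate.coef_)
# One quality aspect: local fidelity = R^2 of the surrogate near x0.
print("local fidelity (R^2):", surrogate.score(neighborhood, y))
```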
- Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense [98.70218717851665]
Due to limited evaluation data resources, it is unclear whether models truly understand the visual scene and the underlying commonsense knowledge.
We present a Multimodal Evaluation (ME) pipeline to automatically generate question-answer pairs to test models' understanding of the visual scene, text, and related knowledge.
We then take a step further to show that training with the ME data boosts the model's performance in standard VCR evaluation.
arXiv Detail & Related papers (2022-11-10T21:44:33Z)
- CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models [84.32751938563426]
We propose a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN).
In contrast to current XAI methods that generate explanations as a single-shot response, we pose explanation as an iterative communication process.
Our framework generates a sequence of explanations in a dialog by mediating the differences between the machine's and the human user's minds.
arXiv Detail & Related papers (2021-09-03T09:46:20Z)
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)
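The core mechanic in the entry above, editing the input image until the classifier's decision changes, can be demonstrated with a naive gradient-based baseline. The paper instead uses adversarial image-to-image translation to keep edits semantically meaningful; the sketch below shows only the bare counterfactual search, with a made-up target class.

```python
# Naive counterfactual search: nudge the image by gradient descent until
# the classifier prefers the target class. (The paper replaces this crude
# pixel-space edit with GAN-based image-to-image translation.)
import torch
import torchvision.models as models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
image = torch.rand(1, 3, 224, 224)               # placeholder input
target_class = 207                               # hypothetical target label

cf = image.clone().requires_grad_(True)
opt = torch.optim.Adam([cf], lr=0.01)
for _ in range(100):
    loss = -model(cf)[0, target_class]           # raise target-class score
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        cf.clamp_(0, 1)                          # keep a valid image
        if model(cf).argmax(dim=1).item() == target_class:
            break

print("counterfactual pixel change (L2):", (cf - image).norm().item())
```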