CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing
Human Trust in Image Recognition Models
- URL: http://arxiv.org/abs/2109.01401v2
- Date: Mon, 6 Sep 2021 07:00:34 GMT
- Title: CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing
Human Trust in Image Recognition Models
- Authors: Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing
Lu, Sinisa Todorovic, Joyce Chai, and Song-Chun Zhu
- Abstract summary: We propose a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN).
In contrast to the current methods in XAI that generate explanations as a single-shot response, we pose explanation as an iterative communication process.
Our framework generates a sequence of explanations in a dialog by mediating the differences between the minds of the machine and the human user.
- Score: 84.32751938563426
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose CX-ToM, short for counterfactual explanations with theory-of-mind,
a new explainable AI (XAI) framework for explaining decisions made by a deep
convolutional neural network (CNN). In contrast to the current methods in XAI
that generate explanations as a single-shot response, we pose explanation as an
iterative communication process, i.e., a dialog, between the machine and the
human user. More concretely, our CX-ToM framework generates a sequence of
explanations in a dialog by mediating the differences between the minds of the
machine and the human user. To do this, we use Theory of Mind (ToM), which
helps us explicitly model the human's intention, the machine's mind as inferred
by the human, as well as the human's mind as inferred by the machine. Moreover,
most state-of-the-art XAI frameworks provide attention-based (or heat-map)
explanations. In our work, we show that these attention-based explanations are
not sufficient for increasing human trust in the underlying CNN model. In
CX-ToM, we instead use counterfactual explanations called fault-lines, which we
define as follows:
given an input image I for which a CNN classification model M predicts class
c_pred, a fault-line identifies the minimal semantic-level features (e.g.,
stripes on zebra, pointed ears of dog), referred to as explainable concepts,
that need to be added to or deleted from I in order to alter the classification
category of I by M to another specified class c_alt. We argue that, due to the
iterative, conceptual and counterfactual nature of CX-ToM explanations, our
framework is practical and more natural for both expert and non-expert users to
understand the internal workings of complex deep learning models. Extensive
quantitative and qualitative experiments verify our hypotheses, demonstrating
that our CX-ToM significantly outperforms the state-of-the-art explainable AI
models.
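The fault-line definition above can be made concrete with a small sketch. The Python snippet below is a hypothetical illustration, not the authors' implementation: it assumes a classifier callable `predict` and a dictionary `concept_edits` mapping explainable-concept names (e.g., "add_stripes", "remove_pointed_ears") to functions that apply the corresponding semantic edit to an image, and it searches edit subsets in order of increasing size so that the first subset flipping the prediction to `c_alt` is minimal by construction.

```python
from itertools import combinations
from typing import Any, Callable, Dict, List

# Hypothetical sketch of a fault-line search; NOT the CX-ToM implementation.
# Assumed interfaces:
#   predict(image) -> predicted class label (string)
#   concept_edits: {"add_stripes": fn, "remove_pointed_ears": fn, ...}, where
#     each fn applies one semantic-level (explainable-concept) edit to an image.
def find_fault_line(
    image: Any,
    predict: Callable[[Any], str],
    concept_edits: Dict[str, Callable[[Any], Any]],
    c_alt: str,
    max_edits: int = 3,
) -> List[str]:
    """Return a smallest set of explainable-concept edits that changes the
    model's prediction on `image` to the target class `c_alt`."""
    # Enumerate subsets of concept edits by increasing size, so the first
    # subset that flips the prediction is minimal by construction.
    for k in range(1, max_edits + 1):
        for subset in combinations(concept_edits, k):
            edited = image
            for name in subset:
                edited = concept_edits[name](edited)  # apply one semantic edit
            if predict(edited) == c_alt:
                return list(subset)  # minimal fault-line found
    return []  # no fault-line found within the edit budget
```

In CX-ToM, which fault-line is presented at each turn of the dialog is further guided by the ToM estimates of the user's beliefs; the exhaustive subset search above is only meant to make the idea of a "minimal set of explainable concepts to add or delete" tangible.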
Related papers
- Reasoning with trees: interpreting CNNs using hierarchies [3.6763102409647526]
We introduce a framework that uses hierarchical segmentation techniques for faithful and interpretable explanations of Convolutional Neural Networks (CNNs).
Our method constructs model-based hierarchical segmentations that maintain the model's reasoning fidelity.
Experiments show that our framework, xAiTrees, delivers highly interpretable and faithful model explanations.
arXiv Detail & Related papers (2024-06-19T06:45:19Z)
- Less is More: Discovering Concise Network Explanations [26.126343100127936]
We introduce Discovering Conceptual Network Explanations (DCNE), a new approach for generating human-comprehensible visual explanations.
Our method automatically finds visual explanations that are critical for discriminating between classes.
DCNE represents a step forward in making neural network decisions accessible and interpretable to humans.
arXiv Detail & Related papers (2024-05-24T06:10:23Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness of saliency-based explanations and their potential for being misunderstood.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- The future of human-centric eXplainable Artificial Intelligence (XAI) is not post-hoc explanations [3.7673721058583123]
We propose a shift from post-hoc explainability to designing interpretable neural network architectures.
We identify five needs of human-centric XAI and propose two schemes for interpretable-by-design neural networks.
arXiv Detail & Related papers (2023-07-01T15:24:47Z)
- Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z)
- Motif-guided Time Series Counterfactual Explanations [1.1510009152620664]
We propose a novel model that generates intuitive post-hoc counterfactual explanations.
We validated our model using five real-world time-series datasets from the UCR repository.
arXiv Detail & Related papers (2022-11-08T17:56:50Z)
- Learning Theory of Mind via Dynamic Traits Attribution [59.9781556714202]
We propose a new neural ToM architecture that learns to generate a latent trait vector of an actor from the past trajectories.
This trait vector then multiplicatively modulates the prediction mechanism via a fast-weights scheme in the prediction neural network.
We empirically show that the fast weights provide a good inductive bias to model the character traits of agents and hence improves mindreading ability.
arXiv Detail & Related papers (2022-04-17T11:21:18Z)
- Explanation as a process: user-centric construction of multi-level and multi-modal explanations [0.34410212782758043]
We present a process-based approach that combines multi-level and multi-modal explanations.
We use Inductive Logic Programming, an interpretable machine learning approach, to learn a comprehensible model.
arXiv Detail & Related papers (2021-10-07T19:26:21Z)
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works aimed at attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.