DA-DGCEx: Ensuring Validity of Deep Guided Counterfactual Explanations
With Distribution-Aware Autoencoder Loss
- URL: http://arxiv.org/abs/2104.09062v3
- Date: Thu, 22 Apr 2021 06:41:30 GMT
- Authors: Jokin Labaien, Ekhi Zugasti, Xabier De Carlos
- Abstract summary: Deep Learning models are often seen as black boxes due to their lack of interpretability.
This paper presents Distribution Aware Deep Guided Counterfactual Explanations (DA-DGCEx),
which adds a term to the DGCEx cost function that penalizes out-of-distribution counterfactual instances.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning has become a very valuable tool in different fields, and no one
doubts the learning capacity of these models. Nevertheless, since Deep Learning
models are often seen as black boxes due to their lack of interpretability,
there is a general mistrust in their decision-making process. To find a balance
between effectiveness and interpretability, Explainable Artificial Intelligence
(XAI) has gained popularity in recent years, and some of the methods in
this area are used to generate counterfactual explanations. The process of
generating these explanations generally consists of solving an optimization
problem for each input to be explained, which is unfeasible when real-time
feedback is needed. To speed up this process, some methods have made use of
autoencoders to generate instant counterfactual explanations. Recently, a
method called Deep Guided Counterfactual Explanations (DGCEx) has been
proposed, which trains an autoencoder attached to a classification model, in
order to generate straightforward counterfactual explanations. However, this
method does not ensure that the generated counterfactual instances are close to
the data manifold, so unrealistic counterfactual instances may be generated. To
overcome this issue, this paper presents Distribution Aware Deep Guided
Counterfactual Explanations (DA-DGCEx), which adds a term to the DGCEx cost
function that penalizes out-of-distribution counterfactual instances.
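Based only on the abstract's description, the combined objective can be sketched as a classification term plus a proximity term (as in DGCEx) plus an autoencoder reconstruction penalty on the counterfactual itself. The function names, weights, and exact form of each term below are illustrative assumptions, not the paper's formulation:

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def da_dgcex_loss(x, x_cf, p_target, autoencoder, lam_prox=1.0, lam_dist=1.0):
    """Cost of a candidate counterfactual x_cf for input x.

    - classification term: drives the classifier's probability of the
      target class (p_target) toward 1
    - proximity term: keeps x_cf close to the original input x
    - distribution term (the DA-DGCEx addition): penalizes counterfactuals
      that an autoencoder trained on the data reconstructs poorly, i.e.
      points lying far from the data manifold
    """
    cls_term = -math.log(p_target + 1e-12)    # cross-entropy w.r.t. target class
    prox_term = mse(x, x_cf)                  # stay near the input
    dist_term = mse(x_cf, autoencoder(x_cf))  # AE reconstruction error of x_cf
    return cls_term + lam_prox * prox_term + lam_dist * dist_term
```

For an on-manifold counterfactual the autoencoder reconstructs x_cf almost exactly, so the distribution term vanishes; an off-manifold candidate incurs a reconstruction penalty that grows with its distance from the data manifold.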
Related papers
- XMENTOR: A Rank-Aware Aggregation Approach for Human-Centered Explainable AI in Just-in-Time Software Defect Prediction [5.646457568088472]
We introduce XMENTOR, a human-centered, rank-aware aggregation method implemented as a VS Code plugin. XMENTOR unifies multiple post-hoc explanations into a single, coherent view by applying adaptive thresholding and rank and sign agreement. Our findings show how combining explanations and embedding them into developer workflows can enhance interpretability, usability, and trust.
arXiv Detail & Related papers (2026-02-25T20:54:49Z)
- Activation-Deactivation: A General Framework for Robust Post-hoc Explainable AI [4.3331379059769395]
Activation-Deactivation (AD) removes the effects of occluded input features from the model's decision-making. We introduce ConvAD, a drop-in mechanism that can be easily added to any trained Convolutional Neural Network (CNN). We prove that the ConvAD mechanism does not change the decision-making process of the network.
arXiv Detail & Related papers (2025-10-01T15:42:58Z)
- Demystifying Reinforcement Learning in Production Scheduling via Explainable AI [0.7515066610159392]
Deep Reinforcement Learning (DRL) is a frequently employed technique to solve scheduling problems.
Although DRL agents excel at delivering viable results in short computing times, their reasoning remains opaque.
We apply two explainable AI (xAI) frameworks to describe the reasoning behind scheduling decisions of a specialized DRL agent in a flow production.
arXiv Detail & Related papers (2024-08-19T09:39:01Z)
- QUCE: The Minimisation and Quantification of Path-Based Uncertainty for Generative Counterfactual Explanations [1.649938899766112]
Quantified Uncertainty Counterfactual Explanations (QUCE) is a method designed to minimize path uncertainty.
We show that QUCE quantifies uncertainty when presenting explanations and generates more certain counterfactual examples.
We showcase the performance of the QUCE method by comparing it with competing methods for both path-based explanations and generative counterfactual examples.
arXiv Detail & Related papers (2024-02-27T14:00:08Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation [79.22678026708134]
In this paper, we propose an inherently interpretable method named Transferable Conceptual Prototype Learning (TCPL).
To achieve this goal, we design a hierarchically prototypical module that transfers categorical basic concepts from the source domain to the target domain and learns domain-shared prototypes for explaining the underlying reasoning process.
Comprehensive experiments show that the proposed method can not only provide effective and intuitive explanations but also outperform previous state-of-the-arts.
arXiv Detail & Related papers (2023-10-12T06:36:41Z)
- Semi-supervised counterfactual explanations [3.6810543937967912]
We address the challenge of generating counterfactual explanations that lie in the same data distribution as that of the training data.
This requirement has been addressed through the incorporation of auto-encoder reconstruction loss in the counterfactual search process.
We show further improvement in the interpretability of counterfactual explanations when the auto-encoder is trained in a semi-supervised fashion with class tagged input data.
arXiv Detail & Related papers (2023-03-22T15:17:16Z)
- VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods to make local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet is able to both generate predictions, and to generate counterfactual explanations without having to solve another minimisation problem.
arXiv Detail & Related papers (2022-12-21T08:45:32Z)
- Towards Formal Approximated Minimal Explanations of Neural Networks [0.0]
Deep neural networks (DNNs) are now being used in numerous domains.
DNNs are "black-boxes", and cannot be interpreted by humans.
We propose an efficient, verification-based method for finding minimal explanations.
arXiv Detail & Related papers (2022-10-25T11:06:37Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Principled Knowledge Extrapolation with GANs [92.62635018136476]
We study counterfactual synthesis from a new perspective of knowledge extrapolation.
We show that an adversarial game with a closed-form discriminator can be used to address the knowledge extrapolation problem.
Our method enjoys both elegant theoretical guarantees and superior performance in many scenarios.
arXiv Detail & Related papers (2022-05-21T08:39:42Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- On Generating Plausible Counterfactual and Semi-Factual Explanations for Deep Learning [15.965337956587373]
PlausIble Exceptionality-based Contrastive Explanations (PIECE) modifies all exceptional features in a test image to be normal from the perspective of the counterfactual class.
Two controlled experiments compare PIECE to others in the literature, showing that PIECE not only generates the most plausible counterfactuals on several measures, but also the best semifactuals.
arXiv Detail & Related papers (2020-09-10T14:48:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.