Beyond Trivial Counterfactual Explanations with Diverse Valuable
Explanations
- URL: http://arxiv.org/abs/2103.10226v1
- Date: Thu, 18 Mar 2021 12:57:34 GMT
- Title: Beyond Trivial Counterfactual Explanations with Diverse Valuable
Explanations
- Authors: Pau Rodriguez, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam
Laradji, Laurent Charlin, David Vazquez
- Abstract summary: In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
- Score: 64.85696493596821
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainability for machine learning models has gained considerable attention
within our research community given the importance of deploying more reliable
machine-learning systems. In computer vision applications, generative
counterfactual methods indicate how to perturb a model's input to change its
prediction, providing details about the model's decision-making. Current
counterfactual methods make ambiguous interpretations as they combine multiple
biases of the model and the data in a single counterfactual interpretation of
the model's decision. Moreover, these methods tend to generate trivial
counterfactuals about the model's decision, as they often suggest exaggerating
or removing the presence of the attribute being classified. For the machine
learning practitioner, these types of counterfactuals offer little value, since
they provide no new information about undesired model or data biases. In this
work, we propose a counterfactual method that learns a perturbation in a
disentangled latent space that is constrained using a diversity-enforcing loss
to uncover multiple valuable explanations about the model's prediction.
Further, we introduce a mechanism to prevent the model from producing trivial
explanations. Experiments on CelebA and Synbols demonstrate that our model
improves the success rate of producing high-quality valuable explanations when
compared to previous state-of-the-art methods. We will publish the code.
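The abstract sketches the recipe at a high level: perturb a disentangled latent code until the classifier's prediction changes, while a diversity-enforcing loss keeps the multiple explanations from collapsing onto one another and a proximity constraint discourages trivially exaggerating or erasing the classified attribute. The snippet below is a minimal, hypothetical PyTorch rendering of that idea, not the authors' released code: the `encoder`, `decoder`, `classifier`, the loss weights, and the exact form of the diversity and proximity terms are all assumptions made for illustration.
```python
import torch
import torch.nn.functional as F

def diverse_counterfactuals(x, encoder, decoder, classifier, target_class,
                            num_explanations=4, steps=100, lr=0.05,
                            lambda_prox=0.1, lambda_div=1.0):
    """Hypothetical sketch: optimize several latent perturbations so the decoded
    images flip the classifier toward `target_class` while staying close to the
    original latent code and remaining mutually diverse."""
    with torch.no_grad():
        z = encoder(x)  # disentangled latent code, assumed shape (1, d)

    # One learnable perturbation per explanation; small random init breaks symmetry.
    eps = (0.01 * torch.randn(num_explanations, z.shape[-1])).requires_grad_(True)
    opt = torch.optim.Adam([eps], lr=lr)
    targets = torch.full((num_explanations,), target_class)

    for _ in range(steps):
        z_cf = z + eps                      # (num_explanations, d)
        logits = classifier(decoder(z_cf))

        # 1) Counterfactual term: push predictions toward the target class.
        cf_loss = F.cross_entropy(logits, targets)

        # 2) Proximity term: keep perturbations small so explanations stay plausible
        #    rather than simply erasing or exaggerating the classified attribute.
        prox_loss = eps.pow(2).sum(dim=-1).mean()

        # 3) Diversity term: penalize pairwise similarity between perturbation
        #    directions so the explanations do not collapse onto a single edit.
        eps_n = F.normalize(eps, dim=-1)
        sim = eps_n @ eps_n.t()
        off_diag = sim - torch.diag(torch.diag(sim))
        div_loss = off_diag.pow(2).sum() / (num_explanations * (num_explanations - 1))

        loss = cf_loss + lambda_prox * prox_loss + lambda_div * div_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return decoder(z + eps).detach()        # candidate counterfactual images
```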
Related papers
- CNN-based explanation ensembling for dataset, representation and explanations evaluation [1.1060425537315088]
We explore the potential of ensembling explanations generated by deep classification models using a convolutional model.
Through experimentation and analysis, we investigate the implications of combining explanations to uncover more coherent and reliable patterns of the model's behavior.
arXiv Detail & Related papers (2024-04-16T08:39:29Z)
- VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods for making local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet is able both to generate predictions and to generate counterfactual explanations without having to solve another minimisation problem.
arXiv Detail & Related papers (2022-12-21T08:45:32Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, showing better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Feature Attributions and Counterfactual Explanations Can Be Manipulated [32.579094387004346]
We show how adversaries can design biased models that manipulate model agnostic feature attribution methods.
These vulnerabilities allow an adversary to deploy a biased model, yet explanations will not reveal this bias, thereby deceiving stakeholders into trusting the model.
We evaluate the manipulations on real world data sets, including COMPAS and Communities & Crime, and find explanations can be manipulated in practice.
arXiv Detail & Related papers (2021-06-23T17:43:31Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article, a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed; a minimal sketch of this probing loop appears after this list.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
- Accurate and Intuitive Contextual Explanations using Linear Model Trees [0.0]
Local post hoc model explanations have gained massive adoption.
Current state-of-the-art methods use rudimentary techniques to generate synthetic data around the point to be explained.
We use a Generative Adversarial Network for synthetic data generation and train a piecewise linear model in the form of Linear Model Trees.
arXiv Detail & Related papers (2020-09-11T10:13:12Z)
- Explainable Recommender Systems via Resolving Learning Representations [57.24565012731325]
Explanations can help improve user experience and uncover system defects.
We propose a novel explainable recommendation model that improves the transparency of the representation learning process.
arXiv Detail & Related papers (2020-08-21T05:30:48Z)
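As a complement to the entry on "Deducing neighborhoods of classes from a fitted model" above, the following is a minimal, assumed illustration of the quantile-shift probing loop that entry describes: each feature of a point of interest is nudged up and down by quantile-sized steps and the model's prediction is re-checked. The function name, step sizes, and report format are illustrative choices, not the paper's actual method; only a scikit-learn-style `predict` method is assumed.
```python
import numpy as np

def quantile_shift_probe(model, X, x, feature_names, quantile_steps=(0.05, 0.10, 0.25)):
    """Assumed sketch: shift each feature of the point `x` up and down by
    empirical quantile-sized steps (estimated from the reference data `X`) and
    record every shift that changes the model's predicted class."""
    base_class = model.predict(x.reshape(1, -1))[0]
    flips = []
    for j, name in enumerate(feature_names):
        col = X[:, j]
        for q in quantile_steps:
            # Step size: spread of the feature's distribution around its median.
            step = np.quantile(col, 0.5 + q) - np.quantile(col, 0.5 - q)
            for direction in (+1, -1):
                x_shifted = x.copy()
                x_shifted[j] += direction * step
                new_class = model.predict(x_shifted.reshape(1, -1))[0]
                if new_class != base_class:
                    flips.append((name, direction * step, base_class, new_class))
    return flips  # (feature, applied shift, original class, new class) for each flip
```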