CEnt: An Entropy-based Model-agnostic Explainability Framework to
Contrast Classifiers' Decisions
- URL: http://arxiv.org/abs/2301.07941v1
- Date: Thu, 19 Jan 2023 08:23:34 GMT
- Authors: Julia El Zini, Mohammad Mansour and Mariette Awad
- Abstract summary: We present a novel approach to locally contrast the prediction of any classifier.
Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits.
CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data while satisfying immutability (e.g., race) and semi-immutability (e.g., age can only change in an increasing direction).
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Current interpretability methods focus on explaining a particular model's
decision through present input features. Such methods do not inform the user of
the sufficient conditions that alter these decisions when they are not
desirable. Contrastive explanations circumvent this problem by providing
explanations of the form ``If the feature $X > x$, the output $Y$ would be
different''. While different approaches have been developed to find contrasts,
these methods do not all address mutability and attainability constraints.
In this work, we present a novel approach to locally contrast the prediction
of any classifier. Our Contrastive Entropy-based explanation method, CEnt,
approximates a model locally by a decision tree to compute entropy information
of different feature splits. A graph $G$ is then built in which contrast nodes are
found through a one-to-many shortest path search. Contrastive examples are
generated from the shortest path to reflect feature splits that alter model
decisions while maintaining lower entropy. We perform local sampling on
manifold-like distances computed by variational auto-encoders to reflect data
density. CEnt is the first non-gradient-based contrastive method generating
diverse counterfactuals that do not necessarily exist in the training data
while satisfying immutability (e.g., race) and semi-immutability (e.g., age can
only change in an increasing direction). Empirical evaluation on four
real-world numerical datasets demonstrates the ability of CEnt in generating
counterfactuals that achieve better proximity rates than existing methods
without compromising latency, feasibility, and attainability. We further extend
CEnt to imagery data to derive visually appealing and useful contrasts between
class labels on MNIST and Fashion MNIST datasets. Finally, we show how CEnt can
serve as a tool to detect vulnerabilities of textual classifiers.
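To make the split-quality measure concrete, the "entropy information of different feature splits" presumably corresponds to the standard information gain used in decision-tree induction; this is an assumption read off the abstract, and the paper may weight the criterion differently:

$$H(S) = -\sum_{c} p_c \log_2 p_c, \qquad \mathrm{IG}(S, X > x) = H(S) - \frac{|S_{X \le x}|}{|S|}\, H(S_{X \le x}) - \frac{|S_{X > x}|}{|S|}\, H(S_{X > x})$$

where $p_c$ is the fraction of samples in $S$ belonging to class $c$. A split $X > x$ that cleanly separates the model's predicted classes leaves low-entropy child sets, hence high gain, which is why low-entropy paths in the surrogate tree correspond to confident contrasts.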
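The surrogate-tree and shortest-path steps can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's implementation: the neighbourhood `X_local`, the edge cost, and the tree depth are all hypothetical choices.

```python
# Sketch of a local entropy-based surrogate tree plus a one-to-many
# shortest-path search for a contrast leaf (illustrative, not CEnt itself).
import numpy as np
import networkx as nx
from sklearn.tree import DecisionTreeClassifier

def contrast_via_tree(model_predict, x, X_local, target_class):
    # 1. Label a local neighbourhood of x with the black-box model.
    y_local = model_predict(X_local)

    # 2. Fit a local surrogate tree; criterion="entropy" gives splits the
    #    information-gain semantics described in the abstract.
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=5)
    tree.fit(X_local, y_local)

    # 3. Build a graph over the tree's leaves; here the edge weight is a
    #    hypothetical L1 distance between leaf centroids, standing in for
    #    the cost of crossing feature splits.
    leaves = np.where(tree.tree_.children_left == -1)[0]
    leaf_of = tree.apply(X_local)
    G = nx.Graph()
    for i in leaves:
        for j in leaves:
            if i < j:
                ci = X_local[leaf_of == i].mean(axis=0)
                cj = X_local[leaf_of == j].mean(axis=0)
                G.add_edge(i, j, weight=float(np.abs(ci - cj).sum()))

    # 4. One-to-many shortest-path search from x's leaf to every leaf the
    #    surrogate assigns to the target class; return the cheapest one.
    src = tree.apply(x.reshape(1, -1))[0]
    dist = nx.single_source_dijkstra_path_length(G, src)
    contrast_leaves = [l for l in leaves
                       if tree.tree_.value[l].argmax() == target_class]
    return min(contrast_leaves, key=lambda l: dist.get(l, np.inf))
```

A counterfactual example would then be read off the feature thresholds along the returned path, flipping only the splits needed to reach the contrast leaf.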
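The density-aware sampling step can be hedged the same way. The sketch below assumes a pre-trained VAE exposing an `encode()` function; the sampler, constraint handling, and ranking are illustrative guesses at the behaviour the abstract describes, not the authors' code.

```python
# Illustrative local sampling under (semi-)immutability constraints,
# ranked by a manifold-like distance in an assumed VAE latent space.
import numpy as np

def sample_attainable(x, encode, n=500, sigma=0.1,
                      immutable=(), increasing=()):
    rng = np.random.default_rng(0)
    cands = x + rng.normal(0.0, sigma, size=(n, x.size))

    # Immutable features (e.g., race) are pinned to their original value;
    # semi-immutable features (e.g., age) may only move in one direction.
    for f in immutable:
        cands[:, f] = x[f]
    for f in increasing:
        cands[:, f] = np.maximum(cands[:, f], x[f])

    # Manifold-like distance: candidates close to x in latent space are
    # assumed to lie in dense, realistic regions of the data.
    z_x = encode(x.reshape(1, -1))
    z_c = encode(cands)
    d = np.linalg.norm(z_c - z_x, axis=1)
    return cands[np.argsort(d)]
```

Ranking candidates by latent distance rather than raw input distance is what lets the generated counterfactuals stay on the data manifold even when they do not exist in the training set.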
Related papers
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances global feature representation of point cloud mask autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z)
- Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers [0.7704032792820767]
Deep neural networks are applied in more and more areas of everyday life.
They still lack essential abilities, such as robustly dealing with spatially transformed input signals.
We propose a novel technique to emulate such an inference process for neural nets.
arXiv Detail & Related papers (2024-05-06T09:47:29Z)
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc, model-agnostic XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
- Posterior Collapse and Latent Variable Non-identifiability [54.842098835445]
We propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility.
Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
arXiv Detail & Related papers (2023-01-02T06:16:56Z)
- Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z)
- Learning Conditional Invariance through Cycle Consistency [60.85059977904014]
We propose a novel approach to identify meaningful and independent factors of variation in a dataset.
Our method involves two separate latent subspaces for the target property and the remaining input information.
We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which lead to sparser and more interpretable models.
arXiv Detail & Related papers (2021-11-25T17:33:12Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Causality-based Counterfactual Explanation for Classification Models [11.108866104714627]
We propose a prototype-based counterfactual explanation framework (ProCE).
ProCE is capable of preserving the causal relationship underlying the features of the counterfactual data.
In addition, we design a novel gradient-free optimization based on the multi-objective genetic algorithm that generates the counterfactual explanations.
arXiv Detail & Related papers (2021-05-03T09:25:59Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
- Explaining Predictions by Approximating the Local Decision Boundary [3.60160227126201]
We present a new procedure for local decision boundary approximation (DBA).
We train a variational autoencoder to learn a Euclidean latent space of encoded data representations.
We exploit attribute annotations to map the latent space to attributes that are meaningful to the user.
arXiv Detail & Related papers (2020-06-14T19:12:42Z)
- PushNet: Efficient and Adaptive Neural Message Passing [1.9121961872220468]
Message passing neural networks have recently evolved into a state-of-the-art approach to representation learning on graphs.
Existing methods perform synchronous message passing along all edges in multiple subsequent rounds.
We consider a novel asynchronous message passing approach where information is pushed only along the most relevant edges until convergence.
arXiv Detail & Related papers (2020-03-04T18:15:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.