CLIMAX: An exploration of Classifier-Based Contrastive Explanations
- URL: http://arxiv.org/abs/2307.00680v1
- Date: Sun, 2 Jul 2023 22:52:58 GMT
- Title: CLIMAX: An exploration of Classifier-Based Contrastive Explanations
- Authors: Praharsh Nanavati, Ranjitha Prasad
- Abstract summary: We propose a novel post-hoc model XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
- Score: 5.381004207943597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Explainable AI is an evolving area that deals with understanding the decision
making of machine learning models so that these models are more transparent,
accountable, and understandable for humans. In particular, post-hoc
model-agnostic interpretable AI techniques explain the decisions of a black-box
ML model for a single instance locally, without the knowledge of the intrinsic
nature of the ML model. Despite their simplicity and capability in providing
valuable insights, existing approaches fail to deliver consistent and reliable
explanations. Moreover, in the context of black-box classifiers, existing
approaches justify the predicted class, but they do not ensure that the
explanation scores differ strongly from those of another class.
In this work, we propose a novel post-hoc, model-agnostic XAI technique that
provides contrastive explanations justifying the classification made by a
black-box classifier, along with reasoning as to why another class was not
predicted. Our method, which we refer to as CLIMAX (short for Contrastive
Label-aware Influence-based Model Agnostic XAI), is based on local classifiers.
In order to ensure model fidelity of the explainer, we require the
perturbations to be such that it leads to a class-balanced surrogate dataset.
Towards this, we employ a label-aware surrogate data generation method based on
random oversampling and Gaussian Mixture Model sampling. Further, we propose
influence subsampling in order to retaining effective samples and hence ensure
sample complexity. We show that we achieve better consistency as compared to
baselines such as LIME, BayLIME, and SLIME. We also depict results on textual
and image based datasets, where we generate contrastive explanations for any
black-box classification model where one is able to only query the class
probabilities for an instance of interest.
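The abstract describes the full pipeline: perturb the instance, balance the surrogate labels with oversampling and Gaussian Mixture Model sampling, prune with influence subsampling, and fit a local classifier whose per-class scores are contrasted. Below is a minimal, hypothetical Python sketch of such a flow, assuming a scikit-learn-style `predict_proba` black box; the function name, the Gaussian perturbation scheme, and the coefficient-difference contrast score are illustrative assumptions rather than the authors' reference implementation, and the influence subsampling step is only indicated by a comment.

```python
# Hypothetical sketch of a CLIMAX-style contrastive local explainer.
# Not the authors' reference implementation; names and details are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture


def contrastive_explanation(predict_proba, x, n_samples=2000, scale=0.1, seed=0):
    """Explain why 1-D instance `x` got its predicted class rather than the runner-up."""
    rng = np.random.default_rng(seed)

    # 1. Perturb the instance locally and label perturbations with the black box.
    X = x + scale * rng.standard_normal((n_samples, x.shape[0]))
    y = predict_proba(X).argmax(axis=1)

    # 2. Label-aware balancing: for under-represented classes, fit a Gaussian
    #    mixture to that class's perturbations and sample extra points
    #    (a simple stand-in for the oversampling + GMM scheme in the abstract).
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c, cnt in zip(classes, counts):
        if 1 < cnt < target:
            gmm = GaussianMixture(n_components=min(2, cnt), random_state=seed).fit(X[y == c])
            X_new, _ = gmm.sample(target - cnt)
            X_parts.append(X_new)
            y_parts.append(np.full(target - cnt, c))
    X_bal, y_bal = np.vstack(X_parts), np.concatenate(y_parts)
    # (CLIMAX additionally applies influence subsampling here to keep only the
    #  most effective surrogate samples; that step is omitted in this sketch.)

    # 3. Fit a local surrogate classifier on the balanced neighbourhood.
    surrogate = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)

    # 4. Contrast the predicted (fact) class against the strongest foil: the
    #    difference of per-class surrogate coefficients indicates which features
    #    pushed the prediction towards `fact` and away from `foil`.
    probs = predict_proba(x.reshape(1, -1))[0]
    fact, foil = int(np.argsort(probs)[-1]), int(np.argsort(probs)[-2])
    idx = {c: i for i, c in enumerate(surrogate.classes_)}  # assumes both appear locally
    coef = surrogate.coef_
    if coef.shape[0] == 1:  # binary surrogate: single row = log-odds of classes_[1]
        contrast = coef[0] if fact == surrogate.classes_[1] else -coef[0]
    else:
        contrast = coef[idx[fact]] - coef[idx[foil]]
    return fact, foil, contrast
```

A call such as `contrastive_explanation(model.predict_proba, x_instance)` would then return the predicted class, the runner-up (foil) class, and a per-feature contrast vector; the paper motivates the class-balanced surrogate as a way of ensuring the explainer's model fidelity.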
Related papers
- Learning Model Agnostic Explanations via Constraint Programming [8.257194221102225]
Interpretable Machine Learning faces a recurring challenge of explaining predictions made by opaque classifiers in terms that are understandable to humans.
In this paper, the task is framed as a Constraint Optimization Problem, where the constraint solver seeks an explanation of minimum error and bounded size for an input data instance and a set of samples generated by the black box.
We evaluate the approach empirically on various datasets and show that it statistically outperforms the state-of-the-art Anchors method.
arXiv Detail & Related papers (2024-11-13T09:55:59Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations [1.0370398945228227]
We introduce LaPLACE-explainer, designed to provide probabilistic cause-and-effect explanations for machine learning models.
The LaPLACE-Explainer component leverages the concept of a Markov blanket to establish statistical boundaries between relevant and non-relevant features.
Our approach offers causal explanations and outperforms LIME and SHAP in terms of local accuracy and consistency of explained features.
arXiv Detail & Related papers (2023-10-01T04:09:59Z) - ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model class, namely "Denoising Diffusion Probabilistic Models" (DDPMs), for chirographic data.
Our model, named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates.
arXiv Detail & Related papers (2023-04-07T15:17:48Z) - Streamlining models with explanations in the learning loop [0.0]
Several explainable AI methods allow a Machine Learning user to get insights into the classification process of a black-box model.
We exploit this information to design a feature engineering phase, where we combine explanations with feature values.
arXiv Detail & Related papers (2023-02-15T16:08:32Z) - Supervised Feature Compression based on Counterfactual Analysis [3.2458225810390284]
This work aims to leverage Counterfactual Explanations to detect the important decision boundaries of a pre-trained black-box model.
Using the discretized dataset, an optimal Decision Tree can be trained that resembles the black-box model, but that is interpretable and compact.
arXiv Detail & Related papers (2022-11-17T21:16:14Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework, Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate its effectiveness, showing better validity, sparsity, and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Recurrence-Aware Long-Term Cognitive Network for Explainable Pattern Classification [0.0]
We propose an LTCN-based model for interpretable pattern classification of structured data.
Our method brings its own mechanism for providing explanations by quantifying the relevance of each feature in the decision process.
Our interpretable model obtains competitive performance compared to state-of-the-art white-box and black-box models.
arXiv Detail & Related papers (2021-07-07T18:14:50Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set, which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.