Supervised Feature Compression based on Counterfactual Analysis
- URL: http://arxiv.org/abs/2211.09894v4
- Date: Fri, 24 Nov 2023 12:19:42 GMT
- Title: Supervised Feature Compression based on Counterfactual Analysis
- Authors: Veronica Piccialli, Dolores Romero Morales, Cecilia Salvatore
- Abstract summary: This work aims to leverage Counterfactual Explanations to detect the important decision boundaries of a pre-trained black-box model.
Using the discretized dataset, an optimal Decision Tree can be trained that resembles the black-box model, but that is interpretable and compact.
- Score: 3.2458225810390284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual Explanations are becoming a de-facto standard in post-hoc
interpretable machine learning. For a given classifier and an instance
classified in an undesired class, its counterfactual explanation corresponds to
small perturbations of that instance that allow changing the classification
outcome. This work aims to leverage Counterfactual Explanations to detect the
important decision boundaries of a pre-trained black-box model. This
information is used to build a supervised discretization of the features in the
dataset with a tunable granularity. Using the discretized dataset, an optimal
Decision Tree can be trained that resembles the black-box model, but that is
interpretable and compact. Numerical results on real-world datasets show the
effectiveness of the approach in terms of accuracy and sparsity.
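The following sketch illustrates the shape of this pipeline: recover decision boundaries from a black-box model, discretize the features at the recovered thresholds, and fit a compact tree on the discretized data. It is only an illustration under loose assumptions: the one-dimensional grid scan is a crude stand-in for the paper's counterfactual computation, a greedy scikit-learn tree stands in for the optimal Decision Tree, and all names and parameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Pre-trained black-box model (stand-in).
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def flip_thresholds(x, feat, grid):
    """Scan one feature over a grid and record values where the black-box
    prediction flips (a crude proxy for counterfactual decision boundaries)."""
    x_mod = np.tile(x, (len(grid), 1))
    x_mod[:, feat] = grid
    preds = black_box.predict(x_mod)
    flips = np.nonzero(preds[1:] != preds[:-1])[0]
    return [(grid[i] + grid[i + 1]) / 2 for i in flips]

# Collect boundary values from a subsample of instances.
rng = np.random.default_rng(0)
thresholds = {f: [] for f in range(X.shape[1])}
grids = [np.linspace(X[:, f].min(), X[:, f].max(), 50) for f in range(X.shape[1])]
for i in rng.choice(len(X), size=100, replace=False):
    for f in range(X.shape[1]):
        thresholds[f].extend(flip_thresholds(X[i], f, grids[f]))

# Tunable granularity: keep a few representative cut points per feature.
k = 4
cuts = {f: np.unique(np.quantile(t, np.linspace(0.1, 0.9, k))) if t else np.array([])
        for f, t in thresholds.items()}
X_disc = np.column_stack([np.digitize(X[:, f], cuts[f]) for f in range(X.shape[1])])

# Compact surrogate tree trained to mimic the black box on discretized data.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_disc, black_box.predict(X))
print(tree.score(X_disc, black_box.predict(X)))  # fidelity, not accuracy
```

The printed score measures fidelity of the compact tree to the black-box predictions rather than accuracy on the true labels.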
Related papers
- Hierarchical Bias-Driven Stratification for Interpretable Causal Effect Estimation [1.6874375111244329]
BICauseTree is an interpretable balancing method that identifies clusters where natural experiments occur locally.
We evaluate the method's performance using synthetic and realistic datasets, explore its bias-interpretability tradeoff, and show that it is comparable with existing approaches.
arXiv Detail & Related papers (2024-01-31T10:58:13Z)
- Generating collective counterfactual explanations in score-based classification via mathematical optimization [4.281723404774889]
A counterfactual explanation of an instance indicates how this instance should be minimally modified so that the perturbed instance is classified in the desired class.
Most of the Counterfactual Analysis literature focuses on the single-instance single-counterfactual setting.
By means of novel Mathematical Optimization models, we provide a counterfactual explanation for each instance in a group of interest.
arXiv Detail & Related papers (2023-10-19T15:18:42Z)
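For the collective-counterfactual entry above, some intuition: the single-instance case has a closed form when the classifier is linear, since the minimal L2 counterfactual is the projection of the instance onto the decision hyperplane. The sketch below assumes this linear setting; the paper instead solves Mathematical Optimization models and extends to collective explanations for a whole group of instances.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

def linear_counterfactual(x, margin=1e-3):
    """Project x onto the hyperplane w.x + b = 0, then step just past it,
    giving the minimal L2 perturbation that flips a linear classifier."""
    score = w @ x + b
    return x - (score + np.sign(score) * margin) * w / (w @ w)

x = X[0]
x_cf = linear_counterfactual(x)
print(clf.predict([x])[0], "->", clf.predict([x_cf])[0])  # class flips
```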
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc, model-agnostic XAI technique that provides contrastive explanations justifying the classification of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
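As a rough illustration of explanations built from local classifiers, the LIME-style sketch below samples perturbations around an instance, labels them with the black box, and fits a weighted logistic surrogate whose signed coefficients serve as evidence for and against the prediction. This is not CLIMAX's actual construction; the sampling scheme and kernel choice here are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

rng = np.random.default_rng(0)
x = X[0]
Z = x + rng.standard_normal((500, x.size)) * X.std(axis=0)  # local perturbations
labels = black_box.predict(Z)                               # black-box labels
d = np.linalg.norm(Z - x, axis=1)
weights = np.exp(-(d / d.mean()) ** 2)                      # locality kernel

local = LogisticRegression(max_iter=1000).fit(Z, labels, sample_weight=weights)
print(local.coef_[0].round(3))  # signed evidence for/against the predicted class
```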
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Query-Adaptive Predictive Inference with Partial Labels [0.0]
We propose a new methodology to construct predictive sets using only partially labeled data on top of black-box predictive models.
Our experiments highlight the validity of our predictive set construction as well as the attractiveness of a more flexible user-dependent loss framework.
arXiv Detail & Related papers (2022-06-15T01:48:42Z)
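A standard split-conformal construction gives the flavor of the predictive sets in the entry above. The sketch below assumes fully labeled calibration data and a fixed coverage level, whereas the paper handles partial labels and a query-adaptive, user-dependent loss framework.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_classes=3,
                           n_informative=4, random_state=0)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Nonconformity score: 1 - probability assigned to the true class.
cal_scores = 1 - model.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]
alpha = 0.1  # target miscoverage
q = np.quantile(cal_scores, np.ceil((len(y_cal) + 1) * (1 - alpha)) / len(y_cal))

def predictive_set(x):
    """All classes whose nonconformity stays below the calibrated quantile."""
    p = model.predict_proba(x.reshape(1, -1))[0]
    return np.nonzero(1 - p <= q)[0]

print(predictive_set(X_cal[0]))
```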
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the change in the prediction after slightly raising or lowering specific features is observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
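A minimal sketch of the probing idea in the entry above, with hypothetical names: shift each feature of a real data point up or down by one quantile step and record how the fitted model's predicted probabilities move.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def quantile_shift_effects(x, step=0.05):
    """Shift each feature by +/- one quantile step and report how the
    predicted class probabilities change."""
    base = model.predict_proba(x.reshape(1, -1))[0]
    effects = {}
    for f in range(x.size):
        q = (X[:, f] < x[f]).mean()  # empirical quantile of x[f]
        for sign in (-1, 1):
            x_mod = x.copy()
            x_mod[f] = np.quantile(X[:, f], np.clip(q + sign * step, 0, 1))
            effects[(f, sign)] = (model.predict_proba(
                x_mod.reshape(1, -1))[0] - base).round(3)
    return effects

print(quantile_shift_effects(X[0]))
```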
- Counterfactual Explanation Based on Gradual Construction for Deep Networks [17.79934085808291]
The patterns that deep networks have learned from a training dataset can be grasped by observing the feature variation among various classes.
Current approaches modify features to increase the classification probability of the target class, irrespective of the internal characteristics of the deep network.
We propose a counterfactual explanation method that exploits the statistics learned from a training dataset.
arXiv Detail & Related papers (2020-08-05T01:18:31Z)
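As a generic point of comparison for the entry above (not the authors' gradual-construction algorithm), a counterfactual can be sought by gradient ascent on the target-class logit while constraining features to the ranges observed in the training data, a crude use of training-set statistics:

```python
import torch
import torch.nn as nn

# Train a small stand-in network.
torch.manual_seed(0)
X = torch.randn(200, 8)
y = (X[:, 0] + X[:, 1] > 0).long()
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(net.parameters(), lr=0.05)
for _ in range(200):
    opt.zero_grad()
    nn.functional.cross_entropy(net(X), y).backward()
    opt.step()

lo, hi = X.min(0).values, X.max(0).values  # training-set feature ranges
x = X[0].clone().requires_grad_(True)
target = 1 - y[0].item()                   # desired counterfactual class
for _ in range(100):
    logit = net(x.unsqueeze(0))[0, target]
    grad, = torch.autograd.grad(logit, x)
    with torch.no_grad():
        x += 0.05 * grad
        x.clamp_(lo, hi)                   # stay inside observed ranges
print(net(x.unsqueeze(0)).softmax(-1))     # mass should shift toward `target`
```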
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
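The core mechanism of this last entry can be sketched with a plain Sinkhorn solver: compute an entropic optimal transport plan between the input set and a reference (trainable in the paper, fixed here), and aggregate the set through that plan into a fixed-size embedding. All hyperparameters below are illustrative.

```python
import numpy as np

def sinkhorn_plan(C, eps=0.1, n_iter=200):
    """Entropic-regularized optimal transport plan with uniform marginals."""
    n, m = C.shape
    K = np.exp(-C / eps)
    a, b = np.ones(n) / n, np.ones(m) / m
    v = np.ones(m)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_embedding(X_set, Z_ref):
    """Aggregate a variable-size set (n x d) into a fixed m x d embedding."""
    C = ((X_set[:, None, :] - Z_ref[None, :, :]) ** 2).sum(-1)
    C = C / C.max()  # scale the cost for numerical stability
    P = sinkhorn_plan(C)
    return (P.T @ X_set) / P.sum(0)[:, None]  # barycenter at each reference

rng = np.random.default_rng(0)
Z = rng.standard_normal((4, 3))  # reference (trainable in the paper, fixed here)
for n in (5, 12):                # different set sizes, same embedding shape
    print(ot_embedding(rng.standard_normal((n, 3)), Z).shape)
```

Both set sizes map to the same m x d output, which is what makes the representation fixed-size.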