Counterfactual Explanation Based on Gradual Construction for Deep
Networks
- URL: http://arxiv.org/abs/2008.01897v2
- Date: Mon, 6 Sep 2021 02:27:47 GMT
- Title: Counterfactual Explanation Based on Gradual Construction for Deep
Networks
- Authors: Hong-Gyu Jung, Sin-Han Kang, Hee-Dong Kim, Dong-Ok Won, Seong-Whan Lee
- Abstract summary: The patterns that deep networks have learned from a training dataset can be grasped by observing the feature variation among various classes.
Current approaches perform feature modification to increase the classification probability for the target class, irrespective of the internal characteristics of deep networks.
We propose a counterfactual explanation method that exploits the statistics learned from a training dataset.
- Score: 17.79934085808291
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To understand the black-box characteristics of deep networks, counterfactual
explanation, which deduces not only the important features of an input space but
also how those features should be modified to classify an input as a target class,
has gained increasing interest. The patterns that deep networks have learned
from a training dataset can be grasped by observing the feature variation among
various classes. However, current approaches perform feature modification
to increase the classification probability for the target class irrespective of
the internal characteristics of deep networks. This often leads to unclear
explanations that deviate from real-world data distributions. To address this
problem, we propose a counterfactual explanation method that exploits the
statistics learned from a training dataset. Specifically, we gradually construct
an explanation by iterating over masking and composition steps. The masking
step aims to select an important feature from the input data to be classified
as a target class. Meanwhile, the composition step aims to optimize the
previously selected feature by ensuring that the resulting logits are close
to those of the training data that are classified as the target class.
Experimental results show that our method produces human-friendly
interpretations on various classification datasets and verify that such
interpretations can be achieved with fewer feature modifications.
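Since the abstract describes a concrete two-step loop, a short sketch may help. Below is a minimal PyTorch sketch of such a masking/composition loop, under assumptions not stated above: features are selected by the magnitude of the target-logit gradient, and the logit statistic is the mean logit vector of training samples classified as the target class (`gradual_counterfactual` and `target_logit_mean` are illustrative names, not the authors' released code).

```python
import torch
import torch.nn.functional as F

def gradual_counterfactual(model, x, target, target_logit_mean,
                           n_outer=10, n_inner=100, lr=0.05):
    """Gradually construct a counterfactual for input x (shape [1, D]).

    target_logit_mean (shape [1, C]) stands in for the logit statistics
    of training samples the model classifies as `target`; the paper's
    exact statistic may differ.
    """
    mask = torch.zeros_like(x)                       # 1 = feature is editable
    delta = torch.zeros_like(x, requires_grad=True)  # perturbation to compose

    for _ in range(n_outer):
        # Masking step: among features not yet selected, pick the one whose
        # target-class logit gradient has the largest magnitude.
        x_in = (x + mask * delta).detach().requires_grad_(True)
        grad = torch.autograd.grad(model(x_in)[0, target], x_in)[0]
        idx = (grad.abs() * (1 - mask)).view(-1).argmax()
        mask.view(-1)[idx] = 1.0

        # Composition step: optimize the selected features so the model's
        # logits move toward the target-class logit statistics.
        optimizer = torch.optim.Adam([delta], lr=lr)
        for _ in range(n_inner):
            optimizer.zero_grad()
            loss = F.mse_loss(model(x + mask * delta), target_logit_mean)
            loss.backward()
            optimizer.step()

        with torch.no_grad():
            if model(x + mask * delta).argmax(dim=1).item() == target:
                break                                # target class reached
    return (x + mask * delta).detach(), mask
```

Matching the logit statistics of target-class training data, rather than merely maximizing the target probability, is what ties the explanation to the statistics the network actually learned, per the abstract's motivation.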
Related papers
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z)
- Enhancing Hyperspectral Image Prediction with Contrastive Learning in Low-Label Regime [0.810304644344495]
Self-supervised contrastive learning is an effective approach for addressing the challenge of limited labelled data.
We evaluate the method's performance for both the single-label and multi-label classification tasks.
arXiv Detail & Related papers (2024-10-10T10:20:16Z)
- A Fixed-Point Approach to Unified Prompt-Based Counting [51.20608895374113]
This paper aims to establish a comprehensive prompt-based counting framework capable of generating density maps for objects indicated by various prompt types, such as box, point, and text.
Our model excels in prominent class-agnostic datasets and exhibits superior performance in cross-dataset adaptation tasks.
arXiv Detail & Related papers (2024-03-15T12:05:44Z)
- Semi-supervised counterfactual explanations [3.6810543937967912]
We address the challenge of generating counterfactual explanations that lie in the same data distribution as the training data.
This requirement has been addressed by incorporating an auto-encoder reconstruction loss into the counterfactual search process; a minimal sketch of this idea follows the list below.
We show further improvement in the interpretability of counterfactual explanations when the auto-encoder is trained in a semi-supervised fashion with class-tagged input data.
arXiv Detail & Related papers (2023-03-22T15:17:16Z)
- Deepfake Detection via Joint Unsupervised Reconstruction and Supervised Classification [25.84902508816679]
We introduce a novel approach for deepfake detection, which considers the reconstruction and classification tasks simultaneously.
This method shares the information learned by one task with the other, focusing on an aspect that existing works rarely consider.
Our method achieves state-of-the-art performance on three commonly-used datasets.
arXiv Detail & Related papers (2022-11-24T05:44:26Z)
- Supervised Feature Compression based on Counterfactual Analysis [3.2458225810390284]
This work aims to leverage Counterfactual Explanations to detect the important decision boundaries of a pre-trained black-box model.
Using the dataset discretized along these boundaries, an optimal Decision Tree can be trained that resembles the black-box model but is interpretable and compact.
arXiv Detail & Related papers (2022-11-17T21:16:14Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information [53.28701922632817]
We propose a method to identify features with predictive information in the input domain.
The core idea of our method is leveraging a bottleneck on the input that only lets input features associated with predictive latent features pass through.
arXiv Detail & Related papers (2021-10-04T14:13:42Z)
- Discriminative Attribution from Counterfactuals [64.94009515033984]
We present a method for neural network interpretability by combining feature attribution with counterfactual explanations.
We show that this method can be used to quantitatively evaluate the performance of feature attribution methods in an objective manner.
arXiv Detail & Related papers (2021-09-28T00:53:34Z)
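As referenced in the semi-supervised counterfactual entry above, a common way to keep counterfactuals within the training distribution is to add an auto-encoder reconstruction loss to the search objective. A minimal sketch of that idea, assuming a pre-trained classifier `model` and auto-encoder `autoencoder` (the function name and the weight `lam` are illustrative, not that paper's code):

```python
import torch
import torch.nn.functional as F

def ae_regularized_counterfactual(model, autoencoder, x, target,
                                  steps=300, lr=0.05, lam=1.0):
    """Search for a counterfactual that the classifier assigns to `target`
    while an auto-encoder trained on the data distribution can still
    reconstruct it, keeping the result in-distribution."""
    x_cf = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_cf], lr=lr)
    target_t = torch.tensor([target])
    for _ in range(steps):
        optimizer.zero_grad()
        cls_loss = F.cross_entropy(model(x_cf), target_t)  # reach target class
        recon_loss = F.mse_loss(autoencoder(x_cf), x_cf)   # stay on data manifold
        (cls_loss + lam * recon_loss).backward()
        optimizer.step()
    return x_cf.detach()
```

Training the auto-encoder in a semi-supervised fashion with class-tagged data, as that entry suggests, would change only how `autoencoder` is obtained, not this search loop.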
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.