ECINN: Efficient Counterfactuals from Invertible Neural Networks
- URL: http://arxiv.org/abs/2103.13701v1
- Date: Thu, 25 Mar 2021 09:23:24 GMT
- Title: ECINN: Efficient Counterfactuals from Invertible Neural Networks
- Authors: Frederik Hvilshøj, Alexandros Iosifidis, and Ira Assent
- Abstract summary: We propose a method, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently.
ECINN has a closed-form expression and generates a counterfactual in the time of only two evaluations.
Our experiments demonstrate how ECINN alters class-dependent image regions to change the perceptual and predicted class of the counterfactuals.
- Score: 80.94500245955591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual examples identify how inputs can be altered to change the
predicted class of a classifier, thus opening up the black-box nature of, e.g.,
deep neural networks. We propose a method, ECINN, that utilizes the generative
capacities of invertible neural networks for image classification to generate
counterfactual examples efficiently. In contrast to competing methods that
sometimes need a thousand evaluations or more of the classifier, ECINN has a
closed-form expression and generates a counterfactual in the time of only two
evaluations. Arguably, the main challenge of generating counterfactual examples
is to alter only input features that affect the predicted outcome, i.e.,
class-dependent features. Our experiments demonstrate how ECINN alters
class-dependent image regions to change the perceptual and predicted class of
the counterfactuals. Additionally, we extend ECINN to also produce heatmaps
(ECINNh) for easy inspection of, e.g., pairwise class-dependent changes in the
generated counterfactual examples. Experimentally, we find that ECINNh
outperforms established methods that generate heatmap-based explanations.
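As a rough illustration of the closed-form construction described in the abstract (not the paper's exact trained model), the following sketch substitutes a toy orthogonal linear flow for an invertible neural network; the names f, f_inv, class_means, and alpha are illustrative assumptions:

```python
# Minimal sketch of a closed-form counterfactual, assuming an invertible
# map f with class-conditional latent means. The toy orthogonal flow,
# class_means, and alpha are illustrative, not the paper's trained INN.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # orthogonal => exactly invertible

def f(x):        # forward pass: input -> latent
    return x @ Q

def f_inv(z):    # inverse pass: latent -> input
    return z @ Q.T

# Class-conditional latent means (in practice, estimated from training data).
class_means = {0: np.array([1.0, 0.0, 0.0, 0.0]),
               1: np.array([-1.0, 0.0, 0.0, 0.0])}

def counterfactual(x, source, target, alpha=1.0):
    """Shift the latent code along the difference of class means and invert:
    a closed-form expression costing exactly two network evaluations."""
    z = f(x)                                                   # evaluation 1
    z_cf = z + alpha * (class_means[target] - class_means[source])
    return f_inv(z_cf)                                         # evaluation 2

x = f_inv(class_means[0]) + 0.1 * rng.normal(size=4)  # sample near class 0
x_cf = counterfactual(x, source=0, target=1)

# ECINNh-style heatmap: where the counterfactual differs from the input.
heatmap = np.abs(x_cf - x)
print(heatmap.round(3))
```

Note that in this sketch the latent shift vanishes in any dimension whose class-conditional means coincide, so only class-dependent directions of the input are altered, mirroring the behavior described above.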
Related papers
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances the global feature representations of point cloud masked autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z) - VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods for producing local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet can both generate predictions and produce counterfactual explanations without solving a separate minimisation problem.
arXiv Detail & Related papers (2022-12-21T08:45:32Z) - Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z) - Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration (a k-NN sketch follows this list).
arXiv Detail & Related papers (2021-12-15T20:15:01Z) - Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks [14.236193187116047]
Recent studies have called into question the commonly assumed shift-invariance property of convolutional networks.
We evaluate two methods to improve shift invariance in CNNs, based on low-pass filtering and adaptive sampling of incoming feature maps (a low-pass pooling sketch follows this list).
We show that these modifications consistently improve sound event classification in all cases considered, while adding few or no trainable parameters.
arXiv Detail & Related papers (2021-07-01T17:21:02Z) - Improving Transformation-based Defenses against Adversarial Examples with First-order Perturbations [16.346349209014182]
Studies show that neural networks are susceptible to adversarial attacks.
This exposes a potential threat to neural network-based intelligent systems.
We propose a method for counteracting adversarial perturbations to improve adversarial robustness.
arXiv Detail & Related papers (2021-03-08T06:27:24Z) - Shift Invariance Can Reduce Adversarial Robustness [20.199887291186364]
Shift invariance is a critical property of CNNs that improves performance on classification.
We show that invariance to circular shifts can also lead to greater sensitivity to adversarial attacks.
arXiv Detail & Related papers (2021-03-03T21:27:56Z) - Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs [0.0]
In RNNs, encoding information in a suboptimal way can impact the quality of representations based on later elements in the sequence.
I propose an augmentation to standard RNNs in form of a gradient-based correction mechanism.
I conduct different experiments in the context of language modeling, where the impact of using such a mechanism is examined in detail.
arXiv Detail & Related papers (2021-01-03T17:54:17Z) - Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z) - Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method significantly improves generalization ability (a loss sketch follows this list).
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
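For the two k-NN entries above, a minimal sketch of classification over pre-extracted representations; the cosine similarity, the choice of k, and the random stand-in features are assumptions for illustration:

```python
import numpy as np

def knn_predict(test_feat, train_feats, train_labels, k=5):
    """Vote over the k nearest training features under cosine similarity;
    features are assumed to come from a pre-trained visual encoder."""
    a = test_feat / np.linalg.norm(test_feat)
    b = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    top_k = np.argsort(-(b @ a))[:k]          # indices of nearest neighbors
    return np.bincount(train_labels[top_k]).argmax()

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))            # stand-in for extracted features
labels = rng.integers(0, 3, size=100)
print(knn_predict(feats[0], feats, labels))
```

For the sound event classification entry, the low-pass filtering component can be sketched as a blur-then-subsample layer; the fixed 3x3 binomial kernel below is one common anti-aliasing choice, not necessarily the paper's exact filter:

```python
import torch
import torch.nn.functional as F

def blur_pool(x, stride=2):
    """Low-pass filter with a fixed binomial kernel, then subsample;
    applied depthwise so each channel is filtered independently."""
    k = torch.tensor([1.0, 2.0, 1.0])
    kernel = torch.outer(k, k)
    kernel = (kernel / kernel.sum()).expand(x.shape[1], 1, 3, 3)
    return F.conv2d(x, kernel, stride=stride, padding=1, groups=x.shape[1])

x = torch.randn(1, 8, 32, 32)
print(blur_pool(x).shape)  # torch.Size([1, 8, 16, 16])
```

And for the class-wise regularization entry, a sketch of the penalty as a softened KL divergence between the predictions of two samples from the same class; the temperature and the detached "teacher" side are illustrative choices, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def class_wise_kd_loss(logits_a, logits_b, T=4.0):
    """KL between softened predictions of two same-class samples;
    logits_b is detached so gradients flow through one side only."""
    log_p_a = F.log_softmax(logits_a / T, dim=1)
    p_b = F.softmax(logits_b.detach() / T, dim=1)
    return F.kl_div(log_p_a, p_b, reduction="batchmean") * (T * T)

# Usage: pair each sample with another sample sharing its label, then add
# this penalty to the usual cross-entropy term.
logits_a = torch.randn(8, 10, requires_grad=True)
logits_b = torch.randn(8, 10)
class_wise_kd_loss(logits_a, logits_b).backward()
```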