ECINN: Efficient Counterfactuals from Invertible Neural Networks
- URL: http://arxiv.org/abs/2103.13701v1
- Date: Thu, 25 Mar 2021 09:23:24 GMT
- Title: ECINN: Efficient Counterfactuals from Invertible Neural Networks
- Authors: Frederik Hvilshøj, Alexandros Iosifidis, and Ira Assent
- Abstract summary: We propose a method, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently.
ECINN has a closed-form expression and generates a counterfactual in the time of only two evaluations.
Our experiments demonstrate how ECINN alters class-dependent image regions to change the perceptual and predicted class of the counterfactuals.
- Score: 80.94500245955591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Counterfactual examples identify how inputs can be altered to change the
predicted class of a classifier, thus opening up the black-box nature of, e.g.,
deep neural networks. We propose a method, ECINN, that utilizes the generative
capacities of invertible neural networks for image classification to generate
counterfactual examples efficiently. In contrast to competing methods that
sometimes need a thousand evaluations or more of the classifier, ECINN has a
closed-form expression and generates a counterfactual in the time of only two
evaluations. Arguably, the main challenge of generating counterfactual examples
is to alter only input features that affect the predicted outcome, i.e.,
class-dependent features. Our experiments demonstrate how ECINN alters
class-dependent image regions to change the perceptual and predicted class of
the counterfactuals. Additionally, we extend ECINN to also produce heatmaps
(ECINNh) for easy inspection of, e.g., pairwise class-dependent changes in the
generated counterfactual examples. Experimentally, we find that ECINNh
outperforms established methods that generate heatmap-based explanations.
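As a rough illustration of the closed-form construction described in the abstract (not the paper's exact trained model), the following sketch substitutes a toy orthogonal linear flow for an invertible neural network; the names f, f_inv, class_means, and alpha are illustrative assumptions:

```python
# Minimal sketch of a closed-form counterfactual, assuming an invertible
# map f with class-conditional latent means. The toy orthogonal flow,
# class_means, and alpha are illustrative, not the paper's trained INN.
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # orthogonal => exactly invertible

def f(x):        # forward pass: input -> latent
    return x @ Q

def f_inv(z):    # inverse pass: latent -> input
    return z @ Q.T

# Class-conditional latent means (in practice, estimated from training data).
class_means = {0: np.array([1.0, 0.0, 0.0, 0.0]),
               1: np.array([-1.0, 0.0, 0.0, 0.0])}

def counterfactual(x, source, target, alpha=1.0):
    """Shift the latent code along the difference of class means and invert:
    a closed-form expression costing exactly two network evaluations."""
    z = f(x)                                                   # evaluation 1
    z_cf = z + alpha * (class_means[target] - class_means[source])
    return f_inv(z_cf)                                         # evaluation 2

x = f_inv(class_means[0]) + 0.1 * rng.normal(size=4)  # sample near class 0
x_cf = counterfactual(x, source=0, target=1)

# ECINNh-style heatmap: where the counterfactual differs from the input.
heatmap = np.abs(x_cf - x)
print(heatmap.round(3))
```

Note that in this sketch the latent shift vanishes in any dimension whose class-conditional means coincide, so only class-dependent directions of the input are altered, mirroring the behavior described above.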
Related papers
- PseudoNeg-MAE: Self-Supervised Point Cloud Learning using Conditional Pseudo-Negative Embeddings [55.55445978692678]
PseudoNeg-MAE is a self-supervised learning framework that enhances the global feature representations of point cloud masked autoencoders.
We show that PseudoNeg-MAE achieves state-of-the-art performance on the ModelNet40 and ScanObjectNN datasets.
arXiv Detail & Related papers (2024-09-24T07:57:21Z) - VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods for producing local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet can both generate predictions and produce counterfactual explanations without solving a separate minimisation problem.
arXiv Detail & Related papers (2022-12-21T08:45:32Z) - Efficient and Robust Classification for Sparse Attacks [34.48667992227529]
We consider perturbations bounded by the $\ell_0$-norm, which have been shown to be effective attacks in image recognition, natural language processing, and malware detection.
We propose a novel defense method that consists of "truncation" and "adversarial training".
Motivated by the insights we obtain, we extend these components to neural network classifiers.
arXiv Detail & Related papers (2022-01-23T21:18:17Z) - Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration (a k-NN sketch follows this list).
arXiv Detail & Related papers (2021-12-15T20:15:01Z) - Improving Sound Event Classification by Increasing Shift Invariance in Convolutional Neural Networks [14.236193187116047]
Recent studies have called into question the commonly assumed shift-invariance property of convolutional networks.
We evaluate two methods to improve shift invariance in CNNs, based on low-pass filtering and adaptive sampling of incoming feature maps (a low-pass pooling sketch follows this list).
We show that these modifications consistently improve sound event classification in all cases considered, while adding few or no trainable parameters.
arXiv Detail & Related papers (2021-07-01T17:21:02Z) - Improving Transformation-based Defenses against Adversarial Examples with First-order Perturbations [16.346349209014182]
Studies show that neural networks are susceptible to adversarial attacks.
This exposes a potential threat to neural network-based intelligent systems.
We propose a method for counteracting adversarial perturbations to improve adversarial robustness.
arXiv Detail & Related papers (2021-03-08T06:27:24Z) - Shift Invariance Can Reduce Adversarial Robustness [20.199887291186364]
Shift invariance is a critical property of CNNs that improves performance on classification.
We show that invariance to circular shifts can also lead to greater sensitivity to adversarial attacks.
arXiv Detail & Related papers (2021-03-03T21:27:56Z) - Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs [0.0]
In RNNs, encoding information in a suboptimal way can impact the quality of representations based on later elements in the sequence.
I propose an augmentation to standard RNNs in form of a gradient-based correction mechanism.
I conduct different experiments in the context of language modeling, where the impact of using such a mechanism is examined in detail.
arXiv Detail & Related papers (2021-01-03T17:54:17Z) - Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z) - Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method significantly improves generalization ability (a loss sketch follows this list).
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
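For the two k-NN entries above, a minimal sketch of classification over pre-extracted representations; the cosine similarity, the choice of k, and the random stand-in features are assumptions for illustration:

```python
import numpy as np

def knn_predict(test_feat, train_feats, train_labels, k=5):
    """Vote over the k nearest training features under cosine similarity;
    features are assumed to come from a pre-trained visual encoder."""
    a = test_feat / np.linalg.norm(test_feat)
    b = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    top_k = np.argsort(-(b @ a))[:k]          # indices of nearest neighbors
    return np.bincount(train_labels[top_k]).argmax()

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))            # stand-in for extracted features
labels = rng.integers(0, 3, size=100)
print(knn_predict(feats[0], feats, labels))
```

For the sound event classification entry, the low-pass filtering component can be sketched as a blur-then-subsample layer; the fixed 3x3 binomial kernel below is one common anti-aliasing choice, not necessarily the paper's exact filter:

```python
import torch
import torch.nn.functional as F

def blur_pool(x, stride=2):
    """Low-pass filter with a fixed binomial kernel, then subsample;
    applied depthwise so each channel is filtered independently."""
    k = torch.tensor([1.0, 2.0, 1.0])
    kernel = torch.outer(k, k)
    kernel = (kernel / kernel.sum()).expand(x.shape[1], 1, 3, 3)
    return F.conv2d(x, kernel, stride=stride, padding=1, groups=x.shape[1])

x = torch.randn(1, 8, 32, 32)
print(blur_pool(x).shape)  # torch.Size([1, 8, 16, 16])
```

And for the class-wise regularization entry, a sketch of the penalty as a softened KL divergence between the predictions of two samples from the same class; the temperature and the detached "teacher" side are illustrative choices, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def class_wise_kd_loss(logits_a, logits_b, T=4.0):
    """KL between softened predictions of two same-class samples;
    logits_b is detached so gradients flow through one side only."""
    log_p_a = F.log_softmax(logits_a / T, dim=1)
    p_b = F.softmax(logits_b.detach() / T, dim=1)
    return F.kl_div(log_p_a, p_b, reduction="batchmean") * (T * T)

# Usage: pair each sample with another sample sharing its label, then add
# this penalty to the usual cross-entropy term.
logits_a = torch.randn(8, 10, requires_grad=True)
logits_b = torch.randn(8, 10)
class_wise_kd_loss(logits_a, logits_b).backward()
```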