Toward Scalable and Unified Example-based Explanation and Outlier
Detection
- URL: http://arxiv.org/abs/2011.05577v3
- Date: Sun, 8 May 2022 10:11:42 GMT
- Title: Toward Scalable and Unified Example-based Explanation and Outlier
Detection
- Authors: Penny Chong, Ngai-Man Cheung, Yuval Elovici, Alexander Binder
- Abstract summary: We argue for a broader adoption of prototype-based student networks capable of providing an example-based explanation for their prediction.
We show that our prototype-based networks beyond similarity kernels deliver meaningful explanations and promising outlier detection results without compromising classification accuracy.
- Score: 128.23117182137418
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When neural networks are employed for high-stakes decision-making, it is
desirable that they provide explanations for their prediction in order for us
to understand the features that have contributed to the decision. At the same
time, it is important to flag potential outliers for in-depth verification by
domain experts. In this work we propose to unify two differing aspects of
explainability with outlier detection. We argue for a broader adoption of
prototype-based student networks capable of providing an example-based
explanation for their prediction and at the same time identify regions of
similarity between the predicted sample and the examples. The examples are real
prototypical cases sampled from the training set via our novel iterative
prototype replacement algorithm. Furthermore, we propose to use the prototype
similarity scores for identifying outliers. We compare our proposed network with
other baselines in terms of classification, explanation quality, and outlier
detection. We show that our prototype-based networks beyond
similarity kernels deliver meaningful explanations and promising outlier
detection results without compromising classification accuracy.
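
To make the idea of using prototype similarity scores for both explanation and outlier detection concrete, here is a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the fixed `threshold`, and the function names `cosine_similarity` and `explain_and_flag` are all assumptions made for illustration, and the abstract explicitly describes networks that go beyond simple similarity kernels. The sketch assumes samples and prototypes have already been mapped to embedding vectors by some network.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def explain_and_flag(
    sample: Sequence[float],
    prototypes: List[Sequence[float]],
    threshold: float = 0.5,  # hypothetical cut-off, would be tuned on held-out data
) -> Tuple[int, float, bool]:
    """Return (index of most similar prototype, its score, outlier flag).

    The most similar prototype serves as the example-based explanation.
    The sample is flagged as a potential outlier when even its best
    prototype similarity falls below the threshold (an assumed rule).
    """
    if not prototypes:
        raise ValueError("at least one prototype is required")
    scores = [cosine_similarity(sample, p) for p in prototypes]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    best_score = scores[best_index]
    return best_index, best_score, best_score < threshold


# Toy usage with two hand-written prototype embeddings (illustrative values only).
toy_prototypes = [[1.0, 0.0], [0.0, 1.0]]
print(explain_and_flag([0.9, 0.1], toy_prototypes))
print(explain_and_flag([-1.0, -1.0], toy_prototypes))
```

In this sketch a single score drives both outputs: the arg-max picks the training example shown to the user, and the max value decides whether the sample is sent to a domain expert for verification.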
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases the model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling [7.376829794171344]
We propose a Self-Explaining Selective Model (SESM) that uses a linear combination of prototypical concepts to explain its own predictions.
For better interpretability, we design multiple constraints including diversity, stability, and locality as training objectives.
arXiv Detail & Related papers (2022-12-07T01:42:47Z)
- A Novel Explainable Out-of-Distribution Detection Approach for Spiking Neural Networks [6.100274095771616]
This work presents a novel OoD detector that can identify whether test examples input to a Spiking Neural Network belong to the distribution of the data over which it was trained.
We characterize the internal activations of the hidden layers of the network in the form of spike count patterns.
A local explanation method is devised to produce attribution maps revealing which parts of the input instance push most towards the detection of an example as an OoD sample.
arXiv Detail & Related papers (2022-09-30T11:16:35Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Adversarial Examples Detection with Bayesian Neural Network [57.185482121807716]
We propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors.
We propose a novel Bayesian adversarial example detector, short for BATer, to improve the performance of adversarial example detection.
arXiv Detail & Related papers (2021-05-18T15:51:24Z)
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by a recently introduced non-robust feature.
In this paper, we consider the non-robust features as a common property of adversarial examples, and we deduce it is possible to find a cluster in representation space corresponding to the property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster, and to leverage that distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
- Unsupervised Anomaly Detection From Semantic Similarity Scores [0.0]
We present a simple and generic framework, SemSAD, that makes use of a semantic similarity score to carry out anomaly detection.
We are able to outperform previous approaches for anomaly, novelty, or out-of-distribution detection in the visual domain by a large margin.
arXiv Detail & Related papers (2020-12-01T13:12:31Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.