Explaining Predictions by Approximating the Local Decision Boundary
- URL: http://arxiv.org/abs/2006.07985v2
- Date: Thu, 22 Oct 2020 18:22:12 GMT
- Title: Explaining Predictions by Approximating the Local Decision Boundary
- Authors: Georgios Vlassopoulos, Tim van Erven, Henry Brighton and Vlado
Menkovski
- Abstract summary: We present a new procedure for local decision boundary approximation (DBA).
We train a variational autoencoder to learn a Euclidean latent space of encoded data representations.
We exploit attribute annotations to map the latent space to attributes that are meaningful to the user.
- Score: 3.60160227126201
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Constructing accurate model-agnostic explanations for opaque machine learning
models remains a challenging task. Classification models for high-dimensional
data, like images, are often inherently complex. To reduce this complexity,
individual predictions may be explained locally, either in terms of a simpler
local surrogate model or by communicating how the predictions contrast with
those of another class. However, existing approaches still fall short in the
following ways: a) they measure locality using a (Euclidean) metric that is not
meaningful for non-linear high-dimensional data; or b) they do not attempt to
explain the decision boundary, which is the most relevant characteristic of
classifiers that are optimized for classification accuracy; or c) they do not
give the user any freedom in specifying attributes that are meaningful to them.
We address these issues in a new procedure for local decision boundary
approximation (DBA). To construct a meaningful metric, we train a variational
autoencoder to learn a Euclidean latent space of encoded data representations.
We impose interpretability by exploiting attribute annotations to map the
latent space to attributes that are meaningful to the user. A difficulty in
evaluating explainability approaches is the lack of a ground truth. We address
this by introducing a new benchmark data set with artificially generated Iris
images, and showing that we can recover the latent attributes that locally
determine the class. We further evaluate our approach on tabular data and on
the CelebA image data set.
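The procedure sketched in the abstract can be illustrated with a short, hypothetical example. The Python snippet below is a simplified rendering of the DBA idea and not the authors' implementation: the encoder, decoder, black-box classifier, and the `fit_local_boundary` helper are assumed stand-ins, and the user-meaningful attributes are approximated by the raw latent coordinates rather than learned from attribute annotations.
```python
# A minimal sketch of the local decision-boundary-approximation (DBA) idea
# from the abstract: perturb an instance in a VAE latent space (where the
# Euclidean metric is meaningful), query the black-box classifier on decoded
# perturbations, and fit a linear surrogate whose weights indicate which
# latent attributes locally determine the class.
# NOTE: the encoder, decoder, and black box below are toy stand-ins; in the
# paper the latent space comes from a trained VAE and is mapped to annotated
# attributes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- toy stand-ins (assumptions, not the paper's models) --------------------
W_enc = rng.normal(size=(10, 4))              # "encoder": 10-dim input -> 4-dim latent
W_dec = np.linalg.pinv(W_enc)                 # "decoder": latent -> input
encode = lambda x: x @ W_enc
decode = lambda z: z @ W_dec
black_box_predict = lambda X: (X.sum(axis=1) > 0).astype(int)   # opaque classifier

def fit_local_boundary(x, n_samples=500, radius=1.0):
    """Linearly approximate the black box's decision boundary around x in latent space."""
    z0 = encode(x[None, :])                                       # latent code of the instance
    Z = z0 + radius * rng.normal(size=(n_samples, z0.shape[1]))   # local Euclidean neighbourhood
    y = black_box_predict(decode(Z))                              # query the black box
    if len(np.unique(y)) < 2:                                     # boundary not crossed locally
        return None
    surrogate = LogisticRegression(max_iter=1000).fit(Z - z0, y)  # linear boundary in latent space
    return surrogate.coef_.ravel()                                # one weight per latent attribute

x = decode(np.zeros((1, 4))).ravel()          # query point near the toy black box's boundary
print("local attribute weights:", fit_local_boundary(x))
```
In the paper's setting, the coefficients of such a local surrogate would be read out over annotated attributes, indicating which user-meaningful attributes locally determine the class.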
Related papers
- Adaptive $k$-nearest neighbor classifier based on the local estimation of the shape operator [49.87315310656657]
We introduce a new adaptive $k$-nearest neighbours ($kK$-NN) algorithm that explores the local curvature at a sample to adaptively define the neighborhood size.
Results on many real-world datasets indicate that the new $kK$-NN algorithm yields superior balanced accuracy compared to the established $k$-NN method.
arXiv Detail & Related papers (2024-09-08T13:08:45Z)
- MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation [3.587367153279351]
Existing local Explainable AI (XAI) methods select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model.
We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained.
arXiv Detail & Related papers (2024-08-19T15:26:45Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc XAI technique that provides contrastive explanations justifying the classifications made by a black-box model.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
- CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions [2.543865489517869]
We present a novel approach to locally contrast the prediction of any classifier.
Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits.
CEnt is the first non-gradient-based contrastive method that generates diverse counterfactuals which do not necessarily exist in the training data, while satisfying immutability (e.g., race) and semi-immutability (e.g., age can only change in an increasing direction).
arXiv Detail & Related papers (2023-01-19T08:23:34Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV)
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on image classification across all three datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Post-hoc explanation of black-box classifiers using confident itemsets [12.323983512532651]
Black-box Artificial Intelligence (AI) methods have been widely utilized to build predictive models.
It is difficult to trust decisions made by such methods since their inner workings and decision logic are hidden from the user.
arXiv Detail & Related papers (2020-05-05T08:11:24Z)
- Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection [51.041763676948705]
Iterative Null-space Projection (INLP) is a novel method for removing information from neural representations.
We show that our method is able to mitigate bias in word embeddings, as well as to increase fairness in a setting of multi-class classification.
arXiv Detail & Related papers (2020-04-16T14:02:50Z)
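The INLP entry above has a compact algorithmic core, so a small, hedged sketch may help make the summary concrete. The snippet below is an illustrative, simplified re-implementation of iterative nullspace projection on synthetic data (train a linear classifier for the protected attribute, project the representations onto the nullspace of its weights, repeat); the `inlp` function and the data are assumptions for demonstration, not the authors' released code.
```python
# Simplified sketch of Iterative Nullspace Projection (INLP): repeatedly train
# a linear classifier to predict the protected attribute from the
# representations, then project the representations onto the nullspace of the
# classifier's weights so that the attribute becomes linearly unpredictable.
# Synthetic data; illustrative only, not the authors' implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

def inlp(X, z, n_iters=10):
    """Return a projection matrix P removing linearly decodable information about z from X."""
    d = X.shape[1]
    P = np.eye(d)
    X_proj = X.copy()
    for _ in range(n_iters):
        clf = LogisticRegression(max_iter=1000).fit(X_proj, z)
        w = clf.coef_ / np.linalg.norm(clf.coef_)     # direction predictive of z
        P_null = np.eye(d) - w.T @ w                  # projector onto the nullspace of w
        P = P_null @ P                                # compose with previous projections
        X_proj = X @ P.T                              # re-project the original representations
    return P

rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=1000)                     # protected attribute (binary, synthetic)
X = rng.normal(size=(1000, 20)) + 2.0 * z[:, None]    # representations that leak z
P = inlp(X, z)
probe = LogisticRegression(max_iter=1000).fit(X @ P.T, z)
print("attribute accuracy after INLP:", probe.score(X @ P.T, z))  # close to 0.5 (chance)
```
In the fairness settings mentioned in the summary, the same projection would be applied to word embeddings or classifier inputs before downstream use.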
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.