Explaining Reject Options of Learning Vector Quantization Classifiers
- URL: http://arxiv.org/abs/2202.07244v1
- Date: Tue, 15 Feb 2022 08:16:10 GMT
- Title: Explaining Reject Options of Learning Vector Quantization Classifiers
- Authors: André Artelt, Johannes Brinkrolf, Roel Visser, Barbara Hammer
- Abstract summary: We propose to use counterfactual explanations for explaining rejects in machine learning models.
We investigate how to efficiently compute counterfactual explanations of different reject options for an important class of models.
- Score: 6.125017875330933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While machine learning models are usually assumed to always output a
prediction, there also exist extensions in the form of reject options which
allow the model to reject inputs where only a prediction with an unacceptably
low certainty would be possible. With the ongoing rise of eXplainable AI, a lot
of methods for explaining model predictions have been developed. However,
understanding why a given input was rejected, instead of being classified by
the model, is also of interest. Surprisingly, explanations of rejects have not
been considered so far.
We propose to use counterfactual explanations for explaining rejects and
investigate how to efficiently compute counterfactual explanations of different
reject options for an important class of models, namely prototype-based
classifiers such as learning vector quantization models.
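The following is a minimal, illustrative sketch of the setting described in the abstract, not the authors' method: a nearest-prototype (LVQ-style) classifier with a certainty-based reject option, plus a brute-force search for a counterfactual explanation of a reject, i.e. the smallest considered input change after which the input is no longer rejected. The class and function names, the certainty measure, and the grid search are all assumptions made for illustration.

```python
import numpy as np


# Illustrative prototype-based (LVQ-style) classifier with a certainty-based
# reject option. Hedged sketch only: the certainty measure and the brute-force
# counterfactual search below are assumptions, not the paper's algorithms.
class PrototypeClassifier:
    def __init__(self, prototypes, labels):
        self.prototypes = np.asarray(prototypes, dtype=float)  # shape (k, d)
        self.labels = np.asarray(labels)                       # shape (k,)

    def _distances(self, x):
        return np.linalg.norm(self.prototypes - np.asarray(x, dtype=float), axis=1)

    def predict(self, x):
        return self.labels[np.argmin(self._distances(x))]

    def certainty(self, x):
        # Relative distance gap between the closest prototype overall and the
        # closest prototype of any other class; values lie in [0, 1].
        d = self._distances(x)
        winner = self.labels[np.argmin(d)]
        d_best = d.min()
        d_other = d[self.labels != winner].min()
        return (d_other - d_best) / (d_other + d_best + 1e-12)

    def reject(self, x, threshold=0.2):
        # Reject option: refuse to classify inputs with low certainty.
        return self.certainty(x) < threshold


def counterfactual_of_reject(clf, x, threshold=0.2, step=0.05, max_radius=2.0):
    """Smallest single-feature change (on a coarse grid) such that the modified
    input is no longer rejected. Purely illustrative brute-force search."""
    x = np.asarray(x, dtype=float)
    for radius in np.arange(step, max_radius + step, step):
        for i in range(len(x)):
            for sign in (+1.0, -1.0):
                x_cf = x.copy()
                x_cf[i] += sign * radius
                if not clf.reject(x_cf, threshold):
                    return x_cf, x_cf - x  # counterfactual and the change applied
    return None, None  # nothing found within the search budget


if __name__ == "__main__":
    clf = PrototypeClassifier(prototypes=[[0.0, 0.0], [1.0, 1.0]], labels=[0, 1])
    x = [0.5, 0.5]                             # equidistant -> low certainty
    print("rejected:", clf.reject(x))          # True
    x_cf, delta = counterfactual_of_reject(clf, x)
    print("counterfactual:", x_cf, "change:", delta)
```

In this toy run the point [0.5, 0.5] lies halfway between the two prototypes, so its certainty is near zero and it is rejected; the search then reports that shifting the first feature by roughly 0.25 is already enough to avoid the reject, which is the kind of explanation the abstract refers to.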
Related papers
- Selective Explanations [14.312717332216073]
A machine learning model is trained to predict feature attribution scores with only one inference.
Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations.
We propose selective explanations, a novel feature attribution method that detects when amortized explainers generate low-quality explanations.
arXiv Detail & Related papers (2024-05-29T23:08:31Z)
- Logic-based Explanations for Linear Support Vector Classifiers with Reject Option [0.0]
The Support Vector Classifier (SVC) is a well-known Machine Learning (ML) model for linear classification problems.
We propose a logic-based approach with formal guarantees on the correctness and minimality of explanations for linear SVCs with reject option.
arXiv Detail & Related papers (2024-03-24T15:14:44Z)
- Understanding Post-hoc Explainers: The Case of Anchors [6.681943980068051]
We present a theoretical analysis of a rule-based interpretability method that highlights a small set of words to explain a model's decision on a text.
After formalizing its algorithm and providing useful insights, we demonstrate mathematically that Anchors produces meaningful results.
arXiv Detail & Related papers (2023-03-15T17:56:34Z)
- VCNet: A self-explaining model for realistic counterfactual generation [52.77024349608834]
Counterfactual explanation is a class of methods to make local explanations of machine learning decisions.
We present VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator.
We show that VCNet is able both to generate predictions and to generate counterfactual explanations without having to solve a separate minimisation problem.
arXiv Detail & Related papers (2022-12-21T08:45:32Z)
- "Even if ..." -- Diverse Semifactual Explanations of Reject [8.132423340684568]
We propose a conceptual modeling of semifactual explanations for arbitrary reject options.
We empirically evaluate a specific implementation on a conformal prediction based reject option.
arXiv Detail & Related papers (2022-07-05T08:53:08Z)
- Pathologies of Pre-trained Language Models in Few-shot Fine-tuning [50.3686606679048]
We show that pre-trained language models fine-tuned on few examples exhibit strong prediction bias across labels.
Although few-shot fine-tuning can mitigate the prediction bias, our analysis shows that the models gain their performance improvement by capturing non-task-related features.
These observations warn that pursuing model performance with fewer examples may incur pathological prediction behavior.
arXiv Detail & Related papers (2022-04-17T15:55:18Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- A Note on High-Probability versus In-Expectation Guarantees of Generalization Bounds in Machine Learning [95.48744259567837]
Statistical machine learning theory often tries to give generalization guarantees of machine learning models.
Statements made about the performance of machine learning models have to take the sampling process into account.
We show how one may transform one type of statement into the other.
arXiv Detail & Related papers (2020-10-06T09:41:35Z)
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
This article presents a new kind of interpretable machine learning method.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
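As a rough illustration of the probing idea in the last entry (take a real data point, nudge individual features up or down, and observe how the prediction changes), the following assumed helper uses a fixed shift rather than the quantile-derived shifts the paper describes; the function name, the toy classifier, and the fixed delta are illustrative only.

```python
import numpy as np


def probe_feature_neighborhood(predict, x, delta=0.1):
    """Probe the class neighborhood of a real data point: slightly raise or
    lower each feature in turn and record how the predicted class changes.
    `predict` maps a 1-D feature vector to a class label; the fixed `delta`
    is an assumed simplification (the paper derives shifts from quantiles)."""
    x = np.asarray(x, dtype=float)
    base = predict(x)
    report = []
    for i in range(len(x)):
        for shift in (+delta, -delta):
            x_shifted = x.copy()
            x_shifted[i] += shift
            report.append((i, shift, base, predict(x_shifted)))
    return report  # tuples of (feature index, shift, original class, new class)


if __name__ == "__main__":
    # Toy threshold classifier on two features.
    predict = lambda v: int(v[0] + v[1] > 1.0)
    for row in probe_feature_neighborhood(predict, [0.45, 0.5]):
        print(row)
```

Each reported tuple shows whether a single small feature shift moves the point across a class boundary, which is the kind of neighborhood information the entry refers to.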