Post-hoc explanation of black-box classifiers using confident itemsets
- URL: http://arxiv.org/abs/2005.01992v2
- Date: Sun, 20 Sep 2020 21:24:58 GMT
- Title: Post-hoc explanation of black-box classifiers using confident itemsets
- Authors: Milad Moradi, Matthias Samwald
- Abstract summary: Black-box Artificial Intelligence (AI) methods have been widely utilized to build predictive models.
It is difficult to trust decisions made by such methods since their inner workings and decision logic are hidden from the user.
- Score: 12.323983512532651
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Black-box Artificial Intelligence (AI) methods, e.g. deep neural networks,
have been widely utilized to build predictive models that can extract complex
relationships in a dataset and make predictions for new unseen data records.
However, it is difficult to trust decisions made by such methods since their
inner workings and decision logic are hidden from the user. Explainable
Artificial Intelligence (XAI) refers to systems that try to explain how a
black-box AI model produces its outcomes. Post-hoc XAI methods approximate the
behavior of a black-box by extracting relationships between feature values and
the predictions. Perturbation-based and decision set methods are among commonly
used post-hoc XAI systems. The former explanators rely on random perturbations
of data records to build local or global linear models that explain individual
predictions or the whole model. The latter explanators use those feature values
that appear more frequently to construct a set of decision rules that produces
the same outcomes as the target black-box. However, these two classes of XAI
methods have some limitations. Random perturbations do not take into account
the distribution of feature values in different subspaces, leading to
misleading approximations. Decision sets only pay attention to frequent feature
values and miss many important correlations between features and class labels
that appear less frequently but accurately represent decision boundaries of the
model. In this paper, we address the above challenges by proposing an
explanation method named Confident Itemsets Explanation (CIE). We introduce
confident itemsets, a set of feature values that are highly correlated to a
specific class label. CIE utilizes confident itemsets to discretize the whole
decision space of a model into smaller subspaces.
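As a minimal illustration of the idea (not the authors' implementation), a confident itemset can be mined by counting how often a combination of discretized feature values co-occurs with the class label the black box predicts, and keeping combinations whose confidence exceeds a threshold. The record format and the `black_box_predict` function below are assumptions made for this sketch.

```python
from itertools import combinations
from collections import Counter

def mine_confident_itemsets(records, black_box_predict,
                            min_support=0.02, min_confidence=0.9, max_len=2):
    """Return (itemset, label, confidence) triples for itemsets of discretized
    feature values that are highly associated with a black-box predicted label.
    Each record is assumed to be a dict mapping feature name -> discrete value."""
    labels = [black_box_predict(r) for r in records]
    n = len(records)

    itemset_count = Counter()        # how often each itemset appears
    itemset_label_count = Counter()  # how often it co-occurs with each label
    for record, label in zip(records, labels):
        items = list(record.items())
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                itemset_count[key] += 1
                itemset_label_count[(key, label)] += 1

    confident = []
    for (key, label), co_count in itemset_label_count.items():
        support = itemset_count[key] / n
        confidence = co_count / itemset_count[key]   # P(label | itemset)
        if support >= min_support and confidence >= min_confidence:
            confident.append((dict(key), label, confidence))
    return confident
```

Each itemset kept this way carves out a subspace of the feature space in which the black box predicts one class with high confidence, which is the sense in which CIE discretizes the decision space into smaller subspaces.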
Related papers
- CHILLI: A data context-aware perturbation method for XAI [3.587367153279351]
The trustworthiness of Machine Learning (ML) models can be difficult to assess, but is critical in high-risk or ethically sensitive applications.
We propose a novel framework, CHILLI, for incorporating data context into XAI by generating contextually aware perturbations.
This is shown to improve both the soundness and accuracy of the explanations.
arXiv Detail & Related papers (2024-07-10T10:18:07Z)
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
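As a rough, generic sketch of the quantity involved (not the paper's selection procedure or guarantees), transfer entropy from a candidate feature X to a target Y can be estimated from discretized sequences with a plug-in estimator of TE(X -> Y) = sum over (y_{t+1}, y_t, x_t) of p(y_{t+1}, y_t, x_t) log[ p(y_{t+1} | y_t, x_t) / p(y_{t+1} | y_t) ]:

```python
from collections import Counter
from math import log2

def transfer_entropy(x, y):
    """Plug-in estimate of TE(X -> Y) for two equally long, discretized
    sequences, using one step of history on each side."""
    triples = list(zip(y[1:], y[:-1], x[:-1]))          # (y_next, y_prev, x_prev)
    n = len(triples)
    c_full = Counter(triples)
    c_yx = Counter((yp, xp) for _, yp, xp in triples)   # counts of (y_prev, x_prev)
    c_yy = Counter((yn, yp) for yn, yp, _ in triples)   # counts of (y_next, y_prev)
    c_y = Counter(yp for _, yp, _ in triples)           # counts of y_prev

    te = 0.0
    for (yn, yp, xp), count in c_full.items():
        p_joint = count / n
        p_cond_full = count / c_yx[(yp, xp)]            # p(y_next | y_prev, x_prev)
        p_cond_hist = c_yy[(yn, yp)] / c_y[yp]          # p(y_next | y_prev)
        te += p_joint * log2(p_cond_full / p_cond_hist)
    return te
```

A forward step would then tentatively add the feature with the largest estimated transfer entropy toward the target; the exact procedure and its error bounds are given in the linked paper.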
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over nine strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- CLIMAX: An exploration of Classifier-Based Contrastive Explanations [5.381004207943597]
We propose a novel post-hoc XAI technique that provides contrastive explanations justifying the classifications of a black box.
Our method, which we refer to as CLIMAX, is based on local classifiers.
We show that we achieve better consistency as compared to baselines such as LIME, BayLIME, and SLIME.
arXiv Detail & Related papers (2023-07-02T22:52:58Z)
- Explanation-by-Example Based on Item Response Theory [0.0]
This research explores Item Response Theory (IRT) as a tool for explaining models and measuring the reliability of the Explanation-by-Example approach.
On the test set, 83.8% of the errors come from instances that IRT flags the model as unreliable on.
arXiv Detail & Related papers (2022-10-04T14:36:33Z)
- Interpreting Black-box Machine Learning Models for High Dimensional Datasets [40.09157165704895]
We train a black-box model on a high-dimensional dataset to learn the embeddings on which the classification is performed.
We then approximate the behavior of the black-box model by means of an interpretable surrogate model on the top-k feature space.
Our approach outperforms state-of-the-art methods like TabNet and XGBoost when tested on different datasets.
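A hedged sketch of this general surrogate pattern, in which an interpretable model is fit on a top-k feature subset to mimic the black box's outputs; the concrete model choices below (a random forest stand-in for the black box, a shallow decision tree surrogate) are placeholders, not the paper's:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def topk_surrogate(X, y, k=10, random_state=0):
    """Fit a stand-in black box on a feature matrix X, then approximate its
    behavior with a shallow tree restricted to the k most important features."""
    black_box = RandomForestClassifier(n_estimators=200,
                                       random_state=random_state).fit(X, y)
    bb_pred = black_box.predict(X)            # the labels the surrogate must mimic

    top_k = np.argsort(black_box.feature_importances_)[::-1][:k]
    surrogate = DecisionTreeClassifier(max_depth=4, random_state=random_state)
    surrogate.fit(X[:, top_k], bb_pred)       # trained on black-box outputs, not y

    fidelity = accuracy_score(bb_pred, surrogate.predict(X[:, top_k]))
    return surrogate, top_k, fidelity
```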
arXiv Detail & Related papers (2022-08-29T07:36:17Z)
- An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
Seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
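For reference, the token-level uncertainty studied here is the Shannon entropy of the model's output distribution at each decoding step; a small NumPy sketch, independent of any particular summarization model:

```python
import numpy as np

def token_entropies(step_probs):
    """step_probs: array of shape (num_steps, vocab_size) holding the softmax
    distribution at each decoding step; returns per-step entropy in nats."""
    p = np.clip(np.asarray(step_probs), 1e-12, 1.0)   # guard against log(0)
    return -(p * np.log(p)).sum(axis=-1)

# A peaked (confident) step versus a uniform (uncertain) step:
probs = np.array([[0.97, 0.01, 0.01, 0.01],
                  [0.25, 0.25, 0.25, 0.25]])
print(token_entropies(probs))   # low entropy first, high (close to log 4) second
```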
arXiv Detail & Related papers (2020-10-15T16:57:27Z)
- Explaining Predictions by Approximating the Local Decision Boundary [3.60160227126201]
We present a new procedure for local decision boundary approximation (DBA).
We train a variational autoencoder to learn a Euclidean latent space of encoded data representations.
We exploit attribute annotations to map the latent space to attributes that are meaningful to the user.
arXiv Detail & Related papers (2020-06-14T19:12:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.