Deducing neighborhoods of classes from a fitted model
- URL: http://arxiv.org/abs/2009.05516v2
- Date: Thu, 17 Sep 2020 09:47:20 GMT
- Title: Deducing neighborhoods of classes from a fitted model
- Authors: Alexander Gerharz, Andreas Groll, Gunther Schauberger
- Abstract summary: This article presents a new kind of interpretable machine learning method.
It helps to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
In essence, real data points (or specific points of interest) are used, and the changes in the prediction after slightly raising or lowering specific features are observed.
- Score: 68.8204255655161
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In today's world, the demand for very complex models for huge data sets
is rising steadily. The problem with these models is that as their complexity
grows, they become much harder to interpret. The growing field of
\emph{interpretable machine learning} tries to make up for the lack of
interpretability in these complex (or even black-box) models by using specific
techniques that help to understand those models better. In this article, a new
kind of interpretable machine learning method is presented, which helps to
understand the partitioning of the feature space into predicted classes in a
classification model using quantile shifts. To illustrate in which situations
this quantile shift method (QSM) can be beneficial, it is applied to a
theoretical medical example and a real data example. In essence, real data
points (or specific points of interest) are used, and the changes in the
prediction after slightly raising or lowering specific features are observed.
By comparing the predictions before and after these manipulations, under
certain conditions the observed changes in the predictions can be interpreted
as neighborhoods of the classes with regard to the manipulated features.
Chord graphs are used to visualize the observed changes.
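The quantile shift idea lends itself to a short sketch. The following is a minimal illustration assuming a scikit-learn classifier on the iris data; the helper quantile_shift_changes, the random forest, and the shift size delta are illustrative choices, not the authors' implementation.
```python
# Minimal sketch of a quantile-shift analysis in the spirit of QSM, assuming a
# scikit-learn classifier; the helper name, the random forest, the iris data and
# the shift size delta are illustrative choices, not the authors' implementation.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def quantile_shift_changes(model, X, feature, delta=0.05):
    """Shift one feature by +/- delta on its quantile scale and collect the
    (old class, new class) pairs of observations whose prediction flips."""
    base_pred = model.predict(X)
    # empirical quantile rank of each observation for the chosen feature
    ranks = np.searchsorted(np.sort(X[:, feature]), X[:, feature]) / len(X)
    changes = {}
    for sign in (+1, -1):
        shifted_q = np.clip(ranks + sign * delta, 0.0, 1.0)
        X_shift = X.copy()
        X_shift[:, feature] = np.quantile(X[:, feature], shifted_q)
        new_pred = model.predict(X_shift)
        flipped = new_pred != base_pred
        changes[sign] = list(zip(base_pred[flipped], new_pred[flipped]))
    return changes

# class transitions induced by shifting feature 2 (petal length) up and down
print(quantile_shift_changes(model, X, feature=2))
```
Counting the (old class, new class) transitions per feature yields the kind of flow matrix that a chord diagram can display.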
Related papers
- A Lightweight Generative Model for Interpretable Subject-level Prediction [0.07989135005592125]
We propose a technique for single-subject prediction that is inherently interpretable.
Experiments demonstrate that the resulting model can be efficiently inverted to make accurate subject-level predictions.
arXiv Detail & Related papers (2023-06-19T18:20:29Z)
- Explanation Shift: How Did the Distribution Shift Impact the Model? [23.403838118256907]
We study how explanation characteristics shift when affected by distribution shifts.
We analyze different types of distribution shifts using synthetic examples and real-world data sets.
We release our methods in an open-source Python package, as well as the code used to reproduce our experiments.
arXiv Detail & Related papers (2023-03-14T17:13:01Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero-shot and few-shot adaptation in low-data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation [3.9318191265352196]
We propose a semi-supervised discrete latent variable model for multi-class text classification and text generation.
The proposed model employs the concept of transfer learning for training a quantized transformer model.
Experimental results indicate that the proposed model substantially outperforms state-of-the-art models.
arXiv Detail & Related papers (2021-11-10T07:07:54Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
- Explainable Matrix -- Visualization for Global and Local Interpretability of Random Forest Classification Ensembles [78.6363825307044]
We propose Explainable Matrix (ExMatrix), a novel visualization method for Random Forest (RF) interpretability.
It employs a simple yet powerful matrix-like visual metaphor, where rows are rules, columns are features, and cells are rule predicates; a minimal sketch of this layout appears after this list.
ExMatrix applicability is confirmed via different examples, showing how it can be used in practice to promote the interpretability of RF models.
arXiv Detail & Related papers (2020-05-08T21:03:48Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Metafeatures-based Rule-Extraction for Classifiers on Behavioral and Textual Data [0.0]
Rule-extraction techniques have been proposed to combine the desired predictive accuracy of complex "black-box" models with global explainability.
We develop and test a rule-extraction methodology based on higher-level, less-sparse metafeatures.
A key finding of our analysis is that metafeatures-based explanations are better at mimicking the behavior of the black-box prediction model.
arXiv Detail & Related papers (2020-03-10T15:08:41Z)
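As a concrete illustration of the rows-as-rules, columns-as-features layout mentioned in the ExMatrix entry above, here is a minimal sketch. It assumes a scikit-learn random forest on the iris data; the tree_rules helper and the pandas layout are illustrative assumptions, not the ExMatrix implementation.
```python
# Rough sketch of the rows-as-rules / columns-as-features layout described for
# ExMatrix, assuming a scikit-learn random forest on the iris data; the
# tree_rules helper and the pandas layout are illustrative, not the ExMatrix code.
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=3, max_depth=2, random_state=0)
rf.fit(iris.data, iris.target)

def tree_rules(estimator, feature_names):
    """Collect one {feature: predicate, ...} dict per root-to-leaf path."""
    t, rules = estimator.tree_, []

    def walk(node, conds):
        if t.children_left[node] == -1:  # leaf: store the finished rule
            rules.append({**dict(conds), "class": int(t.value[node].argmax())})
            return
        name, thr = feature_names[t.feature[node]], t.threshold[node]
        # (a repeated split on the same feature keeps only the deepest predicate)
        walk(t.children_left[node], conds + [(name, f"<= {thr:.2f}")])
        walk(t.children_right[node], conds + [(name, f"> {thr:.2f}")])

    walk(0, [])
    return rules

rows = [r for est in rf.estimators_ for r in tree_rules(est, iris.feature_names)]
# rows are rules, columns are features, cells hold the rules' predicates
matrix = pd.DataFrame(rows).fillna("")
print(matrix)
```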
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the automatically generated content and is not responsible for any consequences of its use.