An Imprecise SHAP as a Tool for Explaining the Class Probability
Distributions under Limited Training Data
- URL: http://arxiv.org/abs/2106.09111v1
- Date: Wed, 16 Jun 2021 20:30:26 GMT
- Title: An Imprecise SHAP as a Tool for Explaining the Class Probability
Distributions under Limited Training Data
- Authors: Lev V. Utkin and Andrei V. Konstantinov and Kirill A. Vishniakov
- Abstract summary: An imprecise SHAP is proposed for cases when the class probability distributions are imprecise and represented by sets of distributions.
The first idea behind the imprecise SHAP is a new approach for computing the marginal contribution of a feature.
The second idea is a general approach to calculating and reducing interval-valued Shapley values.
- Score: 5.8010446129208155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the most popular methods for explaining machine learning
predictions is the SHapley Additive exPlanations (SHAP) method. An imprecise
SHAP, a modification of the original SHAP, is proposed for cases in which the
class probability distributions are imprecise and represented by sets of
distributions. The first idea behind the imprecise SHAP is a new approach to
computing the marginal contribution of a feature which fulfils the important
efficiency property of Shapley values. The second idea is a general approach
to calculating and reducing interval-valued Shapley values, similar to the
idea of reachable probability intervals in imprecise probability theory. A
simple implementation of the general approach, based on the Kolmogorov-Smirnov
distance and imprecise contamination models, is proposed in the form of linear
optimization problems. Numerical examples with synthetic and real data
illustrate the imprecise SHAP.
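The abstract names the ingredients (an imprecise contamination model, interval-valued Shapley values, a linear-programming reduction) without giving pseudo-code, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It computes naive lower and upper Shapley values for a binary classifier whose predicted probability p is only known up to an eps-contamination set {(1-eps)P + eps*Q}, which bounds p by [(1-eps)p, (1-eps)p + eps]; the value function, helper names, and exact enumeration are all hypothetical choices.

```python
# Illustrative sketch only: naive interval-valued Shapley values under an
# eps-contamination of a binary classifier's predicted probability.
from itertools import combinations
from math import factorial

import numpy as np


def value(model_proba, x, background, subset):
    """v(S): mean prediction when features outside S are imputed from
    background rows (the usual interventional value function)."""
    X = np.array(background, dtype=float)
    X[:, list(subset)] = np.asarray(x, dtype=float)[list(subset)]
    return float(np.mean(model_proba(X)))


def interval_shapley(model_proba, x, background, eps=0.1):
    """Naive lower/upper Shapley values: each coalition value v is only
    known to lie in [(1-eps)*v, (1-eps)*v + eps]."""
    d = len(x)
    lo, hi = np.zeros(d), np.zeros(d)
    for i in range(d):
        rest = [j for j in range(d) if j != i]
        for k in range(d):
            w = factorial(k) * factorial(d - k - 1) / factorial(d)
            for S in combinations(rest, k):
                v_S = value(model_proba, x, background, S)
                v_Si = value(model_proba, x, background, S + (i,))
                # worst/best-case marginal contributions over the interval
                lo[i] += w * ((1 - eps) * (v_Si - v_S) - eps)
                hi[i] += w * ((1 - eps) * (v_Si - v_S) + eps)
    return lo, hi
```

The enumeration is exponential in the number of features, and these endpoint intervals are the conservative ones; the paper's contribution is precisely a principled way of computing and reducing such intervals (via the Kolmogorov-Smirnov distance and linear programming) so that the efficiency property is preserved.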
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Probabilistically Plausible Counterfactual Explanations with Normalizing Flows [2.675793767640172]
We present PPCEF, a novel method for generating probabilistically plausible counterfactual explanations.
Our method enforces plausibility by directly optimizing the explicit density function without assuming a particular family of parametrized distributions.
PPCEF is a powerful tool for interpreting machine learning models and for improving fairness, accountability, and trust in AI systems.
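The stated mechanism, enforcing plausibility "by directly optimizing the explicit density function", can be illustrated with a toy gradient search over distance, validity, and negative log-density terms. Here a fixed linear classifier and an isotropic Gaussian stand in for the model and the normalizing-flow density, so this sketches the shape of the objective, not PPCEF itself.

```python
# Toy plausibility-regularized counterfactual search: gradient descent on
# distance + validity hinge + (-log density). The linear classifier and
# Gaussian density are illustrative stand-ins, not PPCEF's components.
import numpy as np

w, b = np.array([1.5, -2.0]), 0.3     # toy logistic-classifier parameters
mu, sigma2 = np.zeros(2), 1.0         # stand-in data density N(mu, sigma2*I)


def counterfactual(x, margin=2.0, lam_valid=5.0, lam_dens=0.5,
                   lr=0.05, steps=500):
    z = x.astype(float).copy()
    for _ in range(steps):
        g = 2.0 * (z - x)                     # stay close to the query point
        if w @ z + b < margin:                # hinge: cross decision boundary
            g -= lam_valid * w
        g += lam_dens * (z - mu) / sigma2     # gradient of -log density
        z -= lr * g
    return z


x = np.array([-1.0, 1.0])             # currently in the negative class
z = counterfactual(x)
print(z, w @ z + b)                   # counterfactual and its new logit
```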
arXiv Detail & Related papers (2024-05-27T20:24:03Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain only a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning [49.94607673097326]
We propose a highly adaptable framework, designated as SimPro, which does not rely on any predefined assumptions about the distribution of unlabeled data.
Our framework, grounded in a probabilistic model, innovatively refines the expectation-maximization algorithm.
Our method showcases consistent state-of-the-art performance across diverse benchmarks and data distribution scenarios.
arXiv Detail & Related papers (2024-02-21T03:39:04Z)
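The summary says only that SimPro "refines the expectation-maximization algorithm" for unknown unlabeled-data distributions. As a reference point, here is the classical EM re-estimation of unknown class priors from a fixed classifier's posteriors (Saerens et al., 2002), the textbook baseline that such methods build on; this is not SimPro itself.

```python
# Classical EM prior re-estimation (Saerens et al., 2002) -- a baseline
# for EM-style semi-supervised prior adaptation, NOT SimPro's algorithm.
import numpy as np


def em_priors(post_unlabeled, priors_train, n_iter=100, tol=1e-8):
    """post_unlabeled: (n, C) posteriors of unlabeled points under a model
    trained with class priors priors_train. Returns (estimated priors,
    posteriors re-weighted to those priors)."""
    p0 = np.asarray(priors_train, dtype=float)
    post = np.asarray(post_unlabeled, dtype=float)
    pi = p0.copy()
    for _ in range(n_iter):
        q = post * (pi / p0)                  # E-step: re-weight posteriors
        q /= q.sum(axis=1, keepdims=True)
        pi_new = q.mean(axis=0)               # M-step: mean responsibilities
        if np.max(np.abs(pi_new - pi)) < tol:
            pi = pi_new
            break
        pi = pi_new
    q = post * (pi / p0)
    q /= q.sum(axis=1, keepdims=True)
    return pi, q
```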
- Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley Values with Uncertainty Quantification [2.6699011287124366]
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes.
We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass.
arXiv Detail & Related papers (2024-02-06T18:09:05Z)
- Provably Stable Feature Rankings with SHAP and LIME [3.8642937395065124]
We devise attribution methods that ensure the most important features are ranked correctly with high probability.
We introduce efficient sampling algorithms for SHAP and LIME that guarantee the $K$ highest-ranked features have the proper ordering.
arXiv Detail & Related papers (2024-01-28T23:14:51Z)
- DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation [23.646508094051768]
We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain of adding a given dataset to the training data.
The Shapley value is a natural tool to perform dataset valuation due to its formal axiomatic justification.
We propose a novel approximation, referred to as discrete uniform Shapley, which is expressed as an expectation under a discrete uniform distribution.
arXiv Detail & Related papers (2023-06-03T10:22:50Z)
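Reading "an expectation under a discrete uniform distribution" literally suggests the size-stratified Monte Carlo form of the Shapley value sketched below: draw a coalition size uniformly, then a uniform coalition of that size, and average the marginal gains. This estimator is unbiased for the Shapley value, but it is a simplified reading of the summary rather than the paper's closed-form proxy; players and utility here are placeholders.

```python
# Size-stratified Monte Carlo Shapley value for dataset valuation:
# k ~ U{0, ..., N-1}, then a uniform coalition of size k (a simplified
# reading of DU-Shapley's discrete-uniform expectation, not its proxy).
import random


def shapley_mc(i, players, utility, n_samples=1000, seed=0):
    """Estimate the Shapley value of dataset i; utility maps a list of
    dataset ids to the performance of a model trained on their union."""
    rng = random.Random(seed)
    others = [p for p in players if p != i]
    total = 0.0
    for _ in range(n_samples):
        k = rng.randrange(len(players))       # discrete uniform size
        coalition = rng.sample(others, k)     # uniform coalition of size k
        total += utility(coalition + [i]) - utility(coalition)
    return total / n_samples
```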
- Explaining the Uncertain: Stochastic Shapley Values for Gaussian Process Models [15.715453687736028]
We present a novel approach for explaining Gaussian processes (GPs) that can utilize the full analytical covariance structure in GPs.
Our method is based on the popular solution concept of Shapley values, extended to stochastic cooperative games, resulting in explanations that are random variables.
The GP explanations generated using our approach satisfy similar axioms to standard Shapley values and possess a tractable covariance function across features and data observations.
arXiv Detail & Related papers (2023-05-24T13:59:03Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and is implemented via the pool-adjacent-violators (PAV) algorithm.
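The summarized CORP recipe (non-parametric isotonic regression fitted by PAV) maps directly onto scikit-learn's IsotonicRegression; a minimal sketch with synthetic, deliberately miscalibrated forecasts follows, omitting the paper's binning and score-decomposition details.

```python
# CORP-style recalibration curve: isotonic regression of binary outcomes
# on forecast probabilities, fitted by pool-adjacent-violators (PAV).
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
p = rng.uniform(size=1000)                        # forecast probabilities
y = (rng.uniform(size=1000) < p ** 1.3).astype(float)  # miscalibrated events

iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
cep = iso.fit_transform(p, y)   # conditional event probabilities given p

# Plotting cep against p (both sorted by p) is the reliability diagram;
# departure from the diagonal indicates miscalibration.
order = np.argsort(p)
print(np.c_[p[order], cep[order]][:5])
```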
arXiv Detail & Related papers (2020-08-07T08:22:26Z)
- A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z)
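As a loose illustration of learning the predictive model and the shift weights "in one optimization" rather than in the usual two-stage pipeline, the toy loop below interleaves a train-versus-test logistic discriminator, whose exponentiated logit serves as an importance weight, with a weighted classifier update. The paper optimizes a single joint objective; this interleaving only mimics that idea.

```python
# Toy interleaved covariate-shift adaptation: one loop jointly updates a
# density-ratio discriminator and an importance-weighted classifier.
# (Illustrative only; not the paper's joint objective.)
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


Xtr = rng.normal(0.0, 1.0, (200, 1))      # training inputs
ytr = (Xtr[:, 0] > 0).astype(float)       # training labels
Xte = rng.normal(1.0, 1.0, (200, 1))      # shifted test inputs (unlabeled)

Ftr = np.c_[Xtr, np.ones(len(Xtr))]       # add bias column
Fte = np.c_[Xte, np.ones(len(Xte))]

a = np.zeros(2)     # discriminator (density-ratio) parameters
th = np.zeros(2)    # predictive-model parameters

for _ in range(500):
    # Discriminator step: logistic regression with test=1, train=0.
    d_tr, d_te = sigmoid(Ftr @ a), sigmoid(Fte @ a)
    a += 0.1 * (Fte.T @ (1.0 - d_te) - Ftr.T @ d_tr) / len(Xtr)
    # Importance weights w(x) proportional to exp(discriminator logit).
    wts = np.exp(Ftr @ a)
    wts /= wts.mean()
    # Model step: importance-weighted logistic regression on train labels.
    p = sigmoid(Ftr @ th)
    th += 0.1 * (Ftr.T @ (wts * (ytr - p))) / len(Xtr)

print(th)   # model fitted jointly with the estimated shift weights
```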
This list is automatically generated from the titles and abstracts of the papers on this site.