The Shapley Value of coalition of variables provides better explanations
- URL: http://arxiv.org/abs/2103.13342v2
- Date: Thu, 25 Mar 2021 17:28:27 GMT
- Title: The Shapley Value of coalition of variables provides better explanations
- Authors: Salim I. Amoukou, Nicolas J-B. Brunel, Tangi Salaün
- Abstract summary: We show that Shapley Values (SV) are still poorly understood.
We show that the popular practice of summing the SV of dummy variables is false.
We provide a correct computation and inference of important variables.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Shapley Values (SV) are one of the gold standards for interpreting
machine learning models, we show that they are still poorly understood, in
particular in the presence of categorical variables or of variables of low
importance. For instance, we show that the popular practice of summing the SV
of dummy variables is false, as it yields wrong estimates of all the SV in the
model and leads to spurious interpretations. Based on the identification of
null and active coalitions, and a coalitional version of the SV, we provide a
correct computation and inference of important variables. Moreover, a Python
library that reliably computes conditional expectations and SV for tree-based
models is implemented and compared with state-of-the-art algorithms on toy
models and real data sets. (All experiments and simulations can be reproduced
with the publicly available library Active Coalition of Variables:
https://github.com/acvicml/ACV)
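As a toy illustration of the abstract's claim, the Shapley value of a group of variables treated as a single player (a coalitional SV) generally differs from the sum of the individual SV of its members. The sketch below computes exact Shapley values of a small cooperative game by full enumeration; the game's payoffs are invented purely for illustration and this is not the paper's ACV algorithm:

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley(players, v):
    """Exact Shapley values of a cooperative game, by enumerating all coalitions."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = Fraction(0)
        for r in range(len(others) + 1):
            # Weight |S|! (n - |S| - 1)! / n! for every coalition S of size r
            weight = Fraction(factorial(r) * factorial(n - r - 1), factorial(n))
            for S in combinations(others, r):
                total += weight * (v(set(S) | {i}) - v(set(S)))
        phi[i] = total
    return phi

# Toy 3-player game; the payoffs are made up for illustration only.
vals = {frozenset(): 0, frozenset('a'): 0, frozenset('b'): 0, frozenset('c'): 0,
        frozenset('ab'): 1, frozenset('ac'): 2, frozenset('bc'): 0, frozenset('abc'): 2}
v = lambda S: vals[frozenset(S)]

phi = shapley(['a', 'b', 'c'], v)

# Coalitional SV: merge {a, b} into a single player G and play the reduced game.
vals2 = {frozenset(): 0, frozenset(['G']): 1, frozenset(['c']): 0, frozenset(['G', 'c']): 2}
phi2 = shapley(['G', 'c'], lambda S: vals2[frozenset(S)])

print(phi['a'] + phi['b'])  # 4/3 -- sum of the individual Shapley values
print(phi2['G'])            # 3/2 -- Shapley value of the coalition: not equal
```

Both allocations still satisfy efficiency (each sums to v of the grand coalition), yet the coalition's value differs from the sum of its members' values because the Shapley averaging reweights the interaction terms once {a, b} acts as one player.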
Related papers
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- Generalization Analysis on Learning with a Concurrent Verifier [16.298786827265673]
We analyze how the learnability of a machine learning model changes with a CV.
We show that typical error bounds based on Rademacher complexity will be no larger than those of the original model.
arXiv Detail & Related papers (2022-10-11T10:51:55Z)
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
- Identifiable Variational Autoencoders via Sparse Decoding [37.30831737046145]
We develop the Sparse VAE, a deep generative model for unsupervised representation learning on high-dimensional data.
We first show that the Sparse VAE is identifiable: given data drawn from the model, there exists a uniquely optimal set of factors.
We empirically study the Sparse VAE with both simulated and real data.
arXiv Detail & Related papers (2021-10-20T22:11:33Z)
- Multivariate Data Explanation by Jumping Emerging Patterns Visualization [78.6363825307044]
We present VAX (multiVariate dAta eXplanation), a new VA method to support the identification and visual interpretation of patterns in multivariate data sets.
Unlike existing similar approaches, VAX uses the concept of Jumping Emerging Patterns to identify and aggregate several diversified patterns, producing explanations through logic combinations of data variables.
arXiv Detail & Related papers (2021-06-21T13:49:44Z)
- Accurate Shapley Values for explaining tree-based models [0.0]
We introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods.
These methods are available as a Python package.
arXiv Detail & Related papers (2021-06-07T17:35:54Z)
- Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
- Simple and Effective VAE Training with Calibrated Decoders [123.08908889310258]
Variational autoencoders (VAEs) provide an effective and simple method for modeling complex distributions.
We study the impact of calibrated decoders, which learn the uncertainty of the decoding distribution.
We propose a simple but novel modification to the commonly used Gaussian decoder, which computes the prediction variance analytically.
arXiv Detail & Related papers (2020-06-23T17:57:47Z)
- Variational Mutual Information Maximization Framework for VAE Latent Codes with Continuous and Discrete Priors [5.317548969642376]
The Variational Autoencoder (VAE) is a scalable method for learning directed latent variable models of complex data.
We propose Variational Mutual Information Maximization Framework for VAE to address this issue.
arXiv Detail & Related papers (2020-06-02T09:05:51Z)
- Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages [112.65994041398481]
We propose a Bayesian generative model for the space of neural parameters.
We infer the posteriors over such latent variables based on data from seen task-language combinations.
Our model yields comparable or better results than state-of-the-art, zero-shot cross-lingual transfer methods.
arXiv Detail & Related papers (2020-01-30T16:58:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.