Statistical Aspects of SHAP: Functional ANOVA for Model Interpretation
- URL: http://arxiv.org/abs/2208.09970v1
- Date: Sun, 21 Aug 2022 21:46:15 GMT
- Title: Statistical Aspects of SHAP: Functional ANOVA for Model Interpretation
- Authors: Andrew Herren and P. Richard Hahn
- Abstract summary: We show that challenges in SHAP approximations largely relate to the choice of a feature distribution and the number of $2p$ ANOVA terms estimated.
We argue that the connection between machine learning explainability and sensitivity analysis is illuminating in this case, but the immediate practical consequences are not obvious.
- Score: 0.456877715768796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: SHAP is a popular method for measuring variable importance in machine
learning models. In this paper, we study the algorithm used to estimate SHAP
scores and show that it is a transformation of the functional ANOVA
decomposition. We use this connection to show that challenges in SHAP
approximations largely relate to the choice of a feature distribution and the
number of $2^p$ ANOVA terms estimated. We argue that the connection between
machine learning explainability and sensitivity analysis is illuminating in
this case, but the immediate practical consequences are not obvious since the
two fields face a different set of constraints. Machine learning explainability
concerns models which are inexpensive to evaluate but often have hundreds, if
not thousands, of features. Sensitivity analysis typically deals with models
from physics or engineering which may be very time consuming to run, but
operate on a comparatively small space of inputs.
Related papers
- Unified Explanations in Machine Learning Models: A Perturbation Approach [0.0]
Inconsistencies between XAI and modeling techniques can have the undesirable effect of casting doubt upon the efficacy of these explainability approaches.
We propose a systematic, perturbation-based analysis against a popular, model-agnostic method in XAI, SHapley Additive exPlanations (Shap)
We devise algorithms to generate relative feature importance in settings of dynamic inference amongst a suite of popular machine learning and deep learning methods, and metrics that allow us to quantify how well explanations generated under the static case hold.
arXiv Detail & Related papers (2024-05-30T16:04:35Z) - On Least Square Estimation in Softmax Gating Mixture of Experts [78.3687645289918]
We investigate the performance of the least squares estimators (LSE) under a deterministic MoE model.
We establish a condition called strong identifiability to characterize the convergence behavior of various types of expert functions.
Our findings have important practical implications for expert selection.
arXiv Detail & Related papers (2024-02-05T12:31:18Z) - Interpretable Differencing of Machine Learning Models [20.99877540751412]
We formalize the problem of model differencing as one of predicting a dissimilarity function of two ML models' outputs.
A Joint Surrogate Tree (JST) is composed of two conjoined decision tree surrogates for the two models.
A JST provides an intuitive representation of differences and places the changes in the context of the models' decision logic.
arXiv Detail & Related papers (2023-06-10T16:15:55Z) - Metric Tools for Sensitivity Analysis with Applications to Neural
Networks [0.0]
Explainable Artificial Intelligence (XAI) aims to provide interpretations for predictions made by Machine Learning models.
In this paper, a theoretical framework is proposed to study sensitivities of ML models using metric techniques.
A complete family of new quantitative metrics called $alpha$-curves is extracted.
arXiv Detail & Related papers (2023-05-03T18:10:21Z) - On the Influence of Enforcing Model Identifiability on Learning dynamics
of Gaussian Mixture Models [14.759688428864159]
We propose a technique for extracting submodels from singular models.
Our method enforces model identifiability during training.
We show how the method can be applied to more complex models like deep neural networks.
arXiv Detail & Related papers (2022-06-17T07:50:22Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE)
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Locally Interpretable Model Agnostic Explanations using Gaussian
Processes [2.9189409618561966]
Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique for explaining the prediction of a single instance.
We propose a Gaussian Process (GP) based variation of locally interpretable models.
We demonstrate that the proposed technique is able to generate faithful explanations using much fewer samples as compared to LIME.
arXiv Detail & Related papers (2021-08-16T05:49:01Z) - Instance-optimality in optimal value estimation: Adaptivity via
variance-reduced Q-learning [99.34907092347733]
We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions.
Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy on any estimation procedure.
In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning.
arXiv Detail & Related papers (2021-06-28T00:38:54Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable
Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z) - Large-scale Neural Solvers for Partial Differential Equations [48.7576911714538]
Solving partial differential equations (PDE) is an indispensable part of many branches of science as many processes can be modelled in terms of PDEs.
Recent numerical solvers require manual discretization of the underlying equation as well as sophisticated, tailored code for distributed computing.
We examine the applicability of continuous, mesh-free neural solvers for partial differential equations, physics-informed neural networks (PINNs)
We discuss the accuracy of GatedPINN with respect to analytical solutions -- as well as state-of-the-art numerical solvers, such as spectral solvers.
arXiv Detail & Related papers (2020-09-08T13:26:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.