When is the Computation of a Feature Attribution Method Tractable?
- URL: http://arxiv.org/abs/2501.02356v1
- Date: Sat, 04 Jan 2025 18:37:02 GMT
- Title: When is the Computation of a Feature Attribution Method Tractable?
- Authors: P. Barceló, R. Cominetti, M. Morgado
- Abstract summary: We study the computational complexity of power indices beyond SHAP.
We introduce Bernoulli power indices, whose computation simplifies to a constant number of expected value evaluations.
We also explore interaction indices that quantify the importance of feature subsets.
- Abstract: Feature attribution methods have become essential for explaining machine learning models. Many popular approaches, such as SHAP and Banzhaf values, are grounded in power indices from cooperative game theory, which measure the contribution of features to model predictions. This work studies the computational complexity of power indices beyond SHAP, addressing the conditions under which they can be computed efficiently. We identify a simple condition on power indices that ensures that computation is polynomially equivalent to evaluating expected values, extending known results for SHAP. We also introduce Bernoulli power indices, showing that their computation can be simplified to a constant number of expected value evaluations. Furthermore, we explore interaction power indices that quantify the importance of feature subsets, proving that their computation complexity mirrors that of individual features.
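As a toy illustration of the power-index viewpoint described in the abstract (the game, players, and weights below are invented for exposition, not taken from the paper), both the Shapley and Banzhaf values are weighted sums of marginal contributions over coalitions; only the coalition weights differ:

```python
from itertools import combinations
from math import comb

def shapley_and_banzhaf(players, v):
    """Exact Shapley and Banzhaf values of a cooperative game v: set -> float.
    Both indices sum the marginal contributions v(S | {i}) - v(S); they differ
    only in how coalitions S are weighted, which is the structural similarity
    that a general 'power index' framework captures."""
    n = len(players)
    shapley = {i: 0.0 for i in players}
    banzhaf = {i: 0.0 for i in players}
    for i in players:
        others = [p for p in players if p != i]
        for k in range(n):
            for S in combinations(others, k):
                marginal = v(set(S) | {i}) - v(set(S))
                shapley[i] += marginal / (n * comb(n - 1, k))  # Shapley weight
                banzhaf[i] += marginal / 2 ** (n - 1)          # Banzhaf weight
    return shapley, banzhaf

# Hypothetical 3-player game: value = coalition size, plus a synergy bonus
# when players 0 and 1 are both present.
game = lambda S: len(S) + (1 if {0, 1} <= S else 0)
sh, bz = shapley_and_banzhaf([0, 1, 2], game)
```

The exponential enumeration here is exactly what makes the tractability question interesting: the paper's conditions characterize when such indices can instead be recovered from a small number of expected value evaluations.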
Related papers
- Graph-based Complexity for Causal Effect by Empirical Plug-in [56.14597641617531]
This paper focuses on the computational complexity of computing empirical plug-in estimates for causal effect queries.
We show that computation can be done efficiently, potentially in time linear in the data size, depending on the estimand's hypergraph.
arXiv Detail & Related papers (2024-11-15T07:42:01Z)
- Feature-Specific Coefficients of Determination in Tree Ensembles [10.968795392216606]
Tree ensemble methods deliver strong predictions but yield models that are difficult to interpret.
The recent introduction of Shapley values for individualized feature contributions, together with several fast algorithms for computing them, has shown intriguing results.
We propose an efficient algorithm, Q-SHAP, that reduces the computational complexity to polynomial time when calculating Shapley values related to quadratic losses.
arXiv Detail & Related papers (2024-07-03T21:27:29Z)
- From SHAP Scores to Feature Importance Scores [4.8158930873043335]
This paper shows that there is an essential relationship between feature attribution and a priori voting power.
However, it remains unclear how some of the most widely used power indices can be exploited as feature importance scores (FISs) in XAI.
arXiv Detail & Related papers (2024-05-20T03:52:41Z)
- Expected Shapley-Like Scores of Boolean Functions: Complexity and Applications to Probabilistic Databases [3.386124605656362]
We adapt Shapley-like scores to probabilistic settings, where the objective is to compute their expected value.
We show that computing expected Shapley values and computing the expected values of Boolean functions are interreducible in polynomial time.
We present applications to databases through database provenance, and an effective implementation of this algorithm within the ProvSQL system.
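As a tiny illustration of the expected-value side of this reduction (an invented example, not from the paper), the expected value of a Boolean function over independent probabilistic variables can be computed by brute-force enumeration of satisfying assignments; this is the quantity the interreducibility result targets:

```python
from itertools import product

def expected_value(phi, probs):
    """Expected value of a Boolean function phi over independent Boolean
    variables, where probs[i] = P(x_i = True). Brute force: sum the
    probability mass of every satisfying assignment."""
    n = len(probs)
    total = 0.0
    for bits in product([False, True], repeat=n):
        p = 1.0
        for b, q in zip(bits, probs):
            p *= q if b else (1 - q)
        if phi(bits):
            total += p
    return total

# Hypothetical provenance formula of a query: (x0 AND x1) OR x2,
# with each tuple present independently with probability 0.5.
phi = lambda x: (x[0] and x[1]) or x[2]
val = expected_value(phi, [0.5, 0.5, 0.5])
```

The enumeration is exponential in the number of variables; the point of the reduction is that expected Shapley-like scores are no harder (and no easier) than this expected-value computation.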
arXiv Detail & Related papers (2024-01-12T10:34:31Z)
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Building on this observation, we propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
- Grouping Shapley Value Feature Importances of Random Forests for explainable Yield Prediction [0.8543936047647136]
We explain the concept of Shapley values directly computed for groups of features and introduce an algorithm to compute them efficiently on tree structures.
We provide a blueprint for designing swarm plots that combine many local explanations for global understanding.
arXiv Detail & Related papers (2023-04-14T13:03:33Z)
- Multivariate Systemic Risk Measures and Computation by Deep Learning Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations.
The algorithms we provide allow for learning primal optima, optima for the dual representation, and the corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z)
- A $k$-additive Choquet integral-based approach to approximate the SHAP values for local interpretability in machine learning [8.637110868126546]
This paper aims to provide interpretability for machine learning models based on Shapley values.
A SHAP-based method called Kernel SHAP adopts an efficient strategy that approximates such values with less computational effort.
The obtained results attest that our proposal requires fewer computations on coalitions of attributes to approximate the SHAP values.
arXiv Detail & Related papers (2022-11-03T22:34:50Z)
- Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, but the number of interaction parameters can grow combinatorially with the number of features.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.