SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification
of Top-k Features
- URL: http://arxiv.org/abs/2307.04850v1
- Date: Mon, 10 Jul 2023 18:42:45 GMT
- Title: SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification
of Top-k Features
- Authors: Sanjay Kariyappa, Leonidas Tsepenekas, Freddy Lécué, Daniele
Magazzeni
- Abstract summary: We introduce the Top-k Identification Problem (TkIP), where the objective is to identify the k features with the highest SHAP values.
The goal of our work is to improve the sample efficiency of existing methods in the context of solving TkIP.
- Score: 16.99004256148679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The SHAP framework provides a principled method to explain the predictions of
a model by computing feature importance. Motivated by applications in finance,
we introduce the Top-k Identification Problem (TkIP), where the objective is to
identify the k features with the highest SHAP values. While any method to
compute SHAP values with uncertainty estimates (such as KernelSHAP and
SamplingSHAP) can be trivially adapted to solve TkIP, doing so is highly sample
inefficient. The goal of our work is to improve the sample efficiency of
existing methods in the context of solving TkIP. Our key insight is that TkIP
can be framed as an Explore-m problem, a well-studied problem related to
multi-armed bandits (MAB). This connection enables us to improve sample
efficiency by leveraging two techniques from the MAB literature: (1) a better
stopping condition (for deciding when to stop sampling) that identifies when
PAC (Probably Approximately Correct) guarantees have been met, and (2) a greedy
sampling scheme that judiciously allocates samples between different features.
By adopting these methods we develop KernelSHAP@k and SamplingSHAP@k to
efficiently solve TkIP, offering an average improvement of $5\times$ in
sample efficiency and runtime across most common credit-related datasets.
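The two MAB techniques described above can be made concrete. The following is a minimal, illustrative LUCB-style sketch of that machinery, not the authors' exact algorithm: sub-Gaussian confidence intervals, a PAC stopping condition, and greedy allocation of samples to the critical pair of features. The `sample_feature` oracle is a hypothetical stand-in for drawing noisy per-feature SHAP estimates (e.g., from SamplingSHAP):

```python
import numpy as np

def pac_top_k(sample_feature, d, k, epsilon=0.05, delta=0.05,
              sigma=0.1, batch=10, max_pulls=100_000):
    """Greedy, LUCB-style identification of the k features with the
    largest means, from a noisy per-feature sampling oracle.

    sample_feature(i, n) returns n noisy estimates of feature i's SHAP
    value (assumed sigma-sub-Gaussian).  Stops as soon as the apparent
    top-k set is epsilon-separated from the rest, so the returned set
    is correct with probability at least 1 - delta.
    """
    counts = np.zeros(d, dtype=int)
    sums = np.zeros(d)
    for i in range(d):                       # warm start: sample every feature
        sums[i] += sample_feature(i, batch).sum()
        counts[i] += batch
    total = d * batch
    while total < max_pulls:
        means = sums / counts
        # anytime confidence radius (union bound over features and pulls)
        rad = sigma * np.sqrt(2 * np.log(4 * d * counts**2 / delta) / counts)
        order = np.argsort(-means)
        top, rest = order[:k], order[k:]
        # LUCB critical pair: weakest apparent winner vs strongest apparent loser
        lo = top[np.argmin(means[top] - rad[top])]
        hi = rest[np.argmax(means[rest] + rad[rest])]
        # PAC stopping condition: top-k set is epsilon-separated from the rest
        if means[lo] - rad[lo] >= means[hi] + rad[hi] - epsilon:
            return set(int(j) for j in top)
        for i in (lo, hi):                   # greedy: sample only the critical pair
            sums[i] += sample_feature(i, batch).sum()
            counts[i] += batch
            total += batch
    return set(int(j) for j in np.argsort(-sums / counts)[:k])
```

On a toy oracle with well-separated means, the stopping condition fires quickly; in the intended setting, each oracle call would correspond to drawing fresh SHAP estimates for that feature.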
Related papers
- Efficient Learning of POMDPs with Known Observation Model in Average-Reward Setting [56.92178753201331]
We propose the Observation-Aware Spectral (OAS) estimation technique, which enables the POMDP parameters to be learned from samples collected using a belief-based policy.
We show the consistency of the OAS procedure, and we prove a regret guarantee of order $\mathcal{O}(\sqrt{T \log(T)})$ for the proposed OAS-UCRL algorithm.
arXiv Detail & Related papers (2024-10-02T08:46:34Z)
- Enhancing Trade-offs in Privacy, Utility, and Computational Efficiency through MUltistage Sampling Technique (MUST) [3.0939420223851446]
We propose a class of subsampling methods for privacy amplification (PA)
We conduct comprehensive analyses of the PA effects and utility for several 2-stage MUST procedures.
We provide the privacy loss composition analysis over repeated applications of MUST.
arXiv Detail & Related papers (2023-12-20T19:38:29Z)
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
- Plug-and-Play split Gibbs sampler: embedding deep generative priors in Bayesian inference [12.91637880428221]
This paper introduces a plug-and-play sampling algorithm that leverages variable splitting to efficiently sample from a posterior distribution.
It divides the challenging task of posterior sampling into two simpler sampling problems.
Its performance is compared to recent state-of-the-art optimization and sampling methods.
arXiv Detail & Related papers (2023-04-21T17:17:51Z)
- A $k$-additive Choquet integral-based approach to approximate the SHAP values for local interpretability in machine learning [8.637110868126546]
This paper aims to provide interpretability for machine learning models based on Shapley values.
A SHAP-based method called Kernel SHAP adopts an efficient strategy that approximates such values with less computational effort.
The obtained results attest that our proposal requires fewer computations on coalitions of attributes to approximate the SHAP values.
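The weighted-regression idea behind Kernel SHAP, which this entry builds on, can be illustrated with a short sketch. Assuming a value function that replaces out-of-coalition features with a background point, enumerating all coalitions and solving the kernel-weighted least-squares problem recovers the exact Shapley values; the function name and setup are illustrative:

```python
import itertools
from math import comb

import numpy as np

def kernel_shap_exact(f, x, background):
    """Kernel SHAP as a weighted least-squares problem over coalitions.

    Every coalition is enumerated here, so the regression recovers the
    exact Shapley values; Kernel SHAP's efficiency comes from instead
    subsampling coalitions in proportion to these same kernel weights.
    """
    d = len(x)
    masks, weights, ys = [], [], []
    for size in range(1, d):                   # proper, non-empty coalitions
        w = (d - 1) / (comb(d, size) * size * (d - size))  # Shapley kernel
        for S in itertools.combinations(range(d), size):
            z = np.zeros(d)
            z[list(S)] = 1.0
            masks.append(z)
            weights.append(w)
            # features in S come from x, the rest from the background point
            ys.append(f(np.where(z == 1.0, x, background)) - f(background))
    # enforce efficiency, sum(phi) = f(x) - f(background), via a heavy weight
    masks.append(np.ones(d))
    weights.append(1e6)
    ys.append(f(x) - f(background))
    Z, y = np.array(masks), np.array(ys)
    sw = np.sqrt(np.array(weights))
    phi, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return phi
```

Kernel SHAP's savings come from sampling only a subset of coalitions according to the same weights, which is also where coalition-reduction ideas like the one in this entry apply.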
arXiv Detail & Related papers (2022-11-03T22:34:50Z)
- Deep Active Ensemble Sampling For Image Classification [8.31483061185317]
Active learning frameworks aim to reduce the cost of data annotation by actively requesting the labeling for the most informative data points.
Proposed approaches include uncertainty-based techniques, geometric methods, and implicit combinations of the two.
We present an integration of recent progress in both uncertainty-based and geometric frameworks to enable an efficient exploration/exploitation trade-off in the sample selection strategy.
Our framework provides two advantages: (1) accurate posterior estimation, and (2) a tunable trade-off between computational overhead and accuracy.
arXiv Detail & Related papers (2022-10-11T20:20:20Z)
- Nearly Optimal Latent State Decoding in Block MDPs [74.51224067640717]
In episodic Block MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states.
We are first interested in estimating the latent state decoding function based on data generated under a fixed behavior policy.
We then study the problem of learning near-optimal policies in the reward-free framework.
arXiv Detail & Related papers (2022-08-17T18:49:53Z)
- Optimal Off-Policy Evaluation from Multiple Logging Policies [77.62012545592233]
We study off-policy evaluation from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.
We derive the OPE estimator for multiple loggers with minimum variance on any instance, i.e., the efficient estimator.
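To make the multiple-logger setting concrete, here is a naive stratified importance-weighting baseline (explicitly not the paper's minimum-variance estimator), with illustrative names and a toy data layout:

```python
def stratified_ipw(strata, target_prob):
    """Naive stratified IPW estimate of a target policy's value.

    strata: one dataset per logging policy, each a list of
    (action, reward, logging_propensity) tuples.
    target_prob(a): probability that the target policy plays action a.
    Per-stratum IPW estimates are combined, weighted by sample size.
    """
    total_n = sum(len(data) for data in strata)
    value = 0.0
    for data in strata:
        # importance-weighted mean reward under this logging policy
        est = sum(target_prob(a) / mu * r for a, r, mu in data) / len(data)
        value += len(data) / total_n * est
    return value
```

The variance of this baseline depends heavily on how the strata are combined; finding the combination with minimum variance is the problem the paper addresses.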
arXiv Detail & Related papers (2020-10-21T13:43:48Z)
- Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)
- Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration [143.43658264904863]
We show how value iteration under a more standard notion of low inherent Bellman error, typically employed in least-squares value-iteration-style algorithms, can provide strong PAC guarantees on learning a near-optimal value function.
We present a computationally tractable algorithm for the reward-free setting and show how it can be used to learn a near optimal policy for any (linear) reward function.
arXiv Detail & Related papers (2020-08-18T04:34:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.