Interpretable feature subset selection: A Shapley value based approach
- URL: http://arxiv.org/abs/2001.03956v3
- Date: Sun, 25 Apr 2021 19:24:45 GMT
- Title: Interpretable feature subset selection: A Shapley value based approach
- Authors: Sandhya Tripathi, N. Hemachandra, Prashant Trivedi
- Abstract summary: We introduce the notion of a classification game, a cooperative game with features as players and a hinge loss based characteristic function.
Our major contribution is ($\star$) to show that, for any dataset, the threshold 0 on the SVEA value identifies a feature subset whose joint interactions are significant for label prediction.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For feature selection and related problems, we introduce the notion of a
classification game, a cooperative game with features as players and a hinge
loss based characteristic function, and relate a feature's contribution to the
Shapley value based error apportioning (SVEA) of the total training error. Our
major contribution is ($\star$) to show that, for any dataset, the threshold 0 on
the SVEA value identifies a feature subset whose joint interactions are
significant for label prediction, or those features that span a subspace where
the data predominantly lies. In addition, our scheme ($\star$) identifies the
features on which the Bayes classifier does not depend but on which any
surrogate loss based finite sample classifier does, which contributes to the
excess $0$-$1$ risk of such a classifier, ($\star$) estimates the unknown true
hinge risk of a feature, and ($\star$) relates the stability property of an
allocation to negative valued SVEA by designing the analogue of the core of the
classification game. Because the Shapley value is computationally expensive, we
build on a known Monte Carlo based approximation algorithm that computes the
characteristic function (a linear program) only when needed. We address the
potential sample bias problem in feature selection by providing interval
estimates for SVEA values obtained from multiple sub-samples. We illustrate all
of the above aspects on various synthetic and real datasets and show that our
scheme achieves better results than the existing recursive feature elimination
technique and ReliefF in most cases. Our theoretically grounded classification
game, in terms of a well defined characteristic function, offers
interpretability (which we formalize in terms of the final task) and
explainability of our framework, including the identification of important
features.
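To make the scheme concrete, here is a minimal, hypothetical sketch of the Monte Carlo (permutation sampling) estimate of SVEA values. It assumes binary labels in {-1, +1} and uses scikit-learn's LinearSVC as a stand-in for the paper's linear-program computation of the hinge loss characteristic function; all names and defaults are illustrative, not the authors' code.

```python
# A minimal sketch, not the authors' implementation. Assumes y in {-1, +1}.
import numpy as np
from sklearn.svm import LinearSVC

def hinge_loss_value(X, y, coalition):
    """v(S): total training hinge loss using only the features in `coalition`."""
    if not coalition:
        return float(len(y))  # zero classifier: hinge loss is 1 per sample
    Xs = X[:, sorted(coalition)]
    clf = LinearSVC(C=1.0).fit(Xs, y)
    margins = y * clf.decision_function(Xs)
    return float(np.maximum(0.0, 1.0 - margins).sum())

def svea_monte_carlo(X, y, n_perms=200, seed=0):
    """Permutation-sampling Shapley estimate of each feature's share of error.

    v(S) is evaluated (and cached) only when a coalition first appears,
    mirroring the compute-only-when-needed strategy in the abstract.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    phi = np.zeros(d)
    cache = {}

    def v(S):
        key = frozenset(S)
        if key not in cache:
            cache[key] = hinge_loss_value(X, y, key)
        return cache[key]

    for _ in range(n_perms):
        S = set()
        for j in rng.permutation(d):
            before = v(S)
            S.add(j)
            phi[j] += v(S) - before
    return phi / n_perms  # threshold at 0 to identify the significant subset
```

Rerunning `svea_monte_carlo` on multiple row sub-samples would yield the interval estimates the abstract mentions.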
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, in both the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes.
We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass.
arXiv Detail & Related papers (2024-02-06T18:09:05Z)
- Causal Feature Selection via Transfer Entropy
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
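As a rough, hypothetical illustration of how a forward pass can be driven by transfer entropy, the sketch below scores each candidate feature with a crude binned estimator; the estimator, the greedy loop, and all names are assumptions here, and the paper's actual procedure (with its backward pass, conditioning, and guarantees) differs.

```python
# Illustrative only: greedy forward selection scored by a binned
# transfer-entropy estimate TE(X_j -> y); not the paper's algorithm.
import numpy as np

def _discretize(x, bins=8):
    # Equal-width binning; a crude plug-in density estimate.
    return np.digitize(x, np.histogram_bin_edges(x, bins)[1:-1])

def transfer_entropy(x, y, bins=8):
    """Binned estimate of TE(X -> Y) = I(Y_t ; X_{t-1} | Y_{t-1})."""
    xs, ys, yp = (_discretize(v, bins) for v in (x[:-1], y[1:], y[:-1]))

    def H(*cols):  # joint entropy from empirical counts
        _, counts = np.unique(np.stack(cols, axis=1), axis=0, return_counts=True)
        p = counts / counts.sum()
        return float(-(p * np.log(p)).sum())

    # I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)
    return H(ys, yp) + H(xs, yp) - H(ys, xs, yp) - H(yp)

def forward_select(X, y, k):
    """Greedily add the feature with the largest marginal TE score.

    (The paper also conditions on features already selected and runs a
    backward elimination pass; both are omitted in this sketch.)
    """
    selected, remaining = [], list(range(X.shape[1]))
    for _ in range(k):
        best = max(remaining, key=lambda j: transfer_entropy(X[:, j], y))
        selected.append(best)
        remaining.remove(best)
    return selected
```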
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation
We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain a given dataset contributes to a pre-defined utility.
The Shapley value is a natural tool to perform dataset valuation due to its formal axiomatic justification.
We propose a novel approximation, referred to as discrete uniform Shapley, which is expressed as an expectation under a discrete uniform distribution.
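For context, a standard identity (not specific to this paper's derivation) writes player $i$'s Shapley value as an expectation over coalition sizes drawn uniformly, which is the form DU-Shapley starts from before approximating the inner expectation:

$$\phi_i = \frac{1}{n}\sum_{k=0}^{n-1} \mathbb{E}_{S \sim \mathrm{Unif}\left(\{S \subseteq N\setminus\{i\} \,:\, |S| = k\}\right)}\big[v(S \cup \{i\}) - v(S)\big].$$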
arXiv Detail & Related papers (2023-06-03T10:22:50Z)
- Robust Outlier Rejection for 3D Registration with Variational Bayes
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
- CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification
We propose CS-Shapley, a Shapley value with a new value function that discriminates between training instances' in-class and out-of-class contributions.
Our results suggest Shapley-based data valuation is transferable for application across different models.
arXiv Detail & Related papers (2022-11-13T03:32:33Z)
- Adaptive LASSO estimation for functional hidden dynamic geostatistical model
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hidden dynamic geostatistical models (f-HDGM).
The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selection operator (LASSO) penalty function, wherein the weights are obtained from the unpenalised f-HDGM maximum-likelihood estimators.
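For intuition, an adaptive LASSO penalty with weights built from unpenalised estimates generically takes the form below ($\gamma > 0$; this is the textbook form, not necessarily the paper's exact notation):

$$P_\lambda(\theta) = \lambda \sum_j w_j\,|\theta_j|, \qquad w_j = \frac{1}{|\hat{\theta}_j^{\mathrm{MLE}}|^{\gamma}},$$

so coefficients that the unpenalised fit already estimates as large are shrunk less, while near-zero coefficients are pushed to exactly zero.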
arXiv Detail & Related papers (2022-08-10T19:17:45Z)
- Parallel feature selection based on the trace ratio criterion
This work presents a novel parallel feature selection approach for classification, namely Parallel Feature Selection using Trace criterion (PFST).
Our method uses the trace criterion, a measure of class separability used in Fisher's Discriminant Analysis, to evaluate feature usefulness.
The experiments show that our method can produce a small feature set in a fraction of the time taken by the other methods under comparison.
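A minimal, hypothetical sketch of scoring one candidate subset with the trace criterion (the ratio of between-class to within-class scatter); PFST's parallel search strategy is omitted and all names are illustrative:

```python
# Illustrative trace-ratio (Fisher) criterion for a feature subset;
# larger values indicate better class separability.
import numpy as np

def trace_ratio(X, y, subset):
    """trace(S_b) / trace(S_w) restricted to the features in `subset`."""
    Xs = X[:, list(subset)]
    grand_mean = Xs.mean(axis=0)
    s_between = s_within = 0.0
    for c in np.unique(y):
        Xc = Xs[y == c]
        class_mean = Xc.mean(axis=0)
        s_between += len(Xc) * np.sum((class_mean - grand_mean) ** 2)
        s_within += np.sum((Xc - class_mean) ** 2)
    return s_between / s_within
```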
arXiv Detail & Related papers (2022-03-03T10:50:33Z)
- Feature Selection Using Reinforcement Learning
The space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially.
Identifying the most characterizing features that minimize variance without jeopardizing the bias of our models is critical to successfully training a machine learning model.
arXiv Detail & Related papers (2021-01-23T09:24:37Z)
- Out-of-distribution Generalization via Partial Feature Decorrelation
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z)
- Infinite Feature Selection: A Graph-based Feature Filtering Approach
We propose a filtering feature selection framework that considers subsets of features as paths in a graph.
Letting path length go to infinity allows the computational complexity of the selection process to be constrained.
We show that Inf-FS behaves better in almost any situation, that is, when the number of features to keep is fixed a priori.
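A minimal sketch of the path-aggregation idea as it is usually presented for Inf-FS: with a nonnegative pairwise feature-relevance matrix $A$, paths of all lengths are summed via a convergent geometric series. How $A$ is built from the data is paper-specific and omitted here; names and defaults are illustrative.

```python
# Illustrative Inf-FS-style scoring: aggregate weighted paths of every
# length in a feature graph via (I - alpha*A)^{-1} - I.
import numpy as np

def inf_fs_scores(A, alpha=0.5):
    """A: (d, d) nonnegative feature-relevance matrix; returns one score
    per feature (row sums of the path-aggregation matrix)."""
    d = A.shape[0]
    spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))
    if alpha * spectral_radius >= 1:
        raise ValueError("series diverges; decrease alpha")
    # sum_{l >= 1} (alpha*A)^l = (I - alpha*A)^{-1} - I
    S = np.linalg.inv(np.eye(d) - alpha * A) - np.eye(d)
    return S.sum(axis=1)  # rank features by descending score
```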
arXiv Detail & Related papers (2020-06-15T07:20:40Z)