The Shapley Value of Classifiers in Ensemble Games
- URL: http://arxiv.org/abs/2101.02153v1
- Date: Wed, 6 Jan 2021 17:40:23 GMT
- Title: The Shapley Value of Classifiers in Ensemble Games
- Authors: Benedek Rozemberczki and Rik Sarkar
- Abstract summary: We introduce a new class of transferable utility cooperative games to answer this question.
The players in ensemble games are pre-trained binary classifiers that collaborate in an ensemble to correctly label points from a dataset.
We design Troupe, a scalable algorithm that designates payoffs to individual models based on their Shapley values in the ensemble game.
- Score: 7.23389716633927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How do we decide the fair value of individual classifiers in an ensemble
model? We introduce a new class of transferable utility cooperative games to
answer this question. The players in ensemble games are pre-trained binary
classifiers that collaborate in an ensemble to correctly label points from a
dataset. We design Troupe, a scalable algorithm that designates payoffs to
individual models based on their Shapley values in the ensemble game. We
show that the approximate Shapley value of classifiers in these games is an
adequate measure for selecting a subgroup of highly predictive models. In
addition, we introduce the Shapley entropy, a new metric that quantifies the
heterogeneity of machine learning ensembles with respect to model quality. We
analytically prove that our Shapley value approximation algorithm is accurate
and scales to large ensembles and big data. Experimental results on graph
classification tasks establish that Troupe gives precise estimates of the
Shapley value in ensemble games. We demonstrate that the Shapley value can be
used for pruning large ensembles, show that complex classifiers have a prime
role in correct and incorrect classification decisions, and provide evidence
that adversarial models receive a low valuation.
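The abstract describes two quantities that are easy to illustrate concretely: a classifier's Shapley value in an ensemble game and the Shapley entropy of the resulting payoffs. The sketch below is not the authors' Troupe implementation; it is a minimal Monte Carlo permutation-sampling approximation for a majority-vote ensemble of pre-trained binary classifiers, with an assumed entropy-of-normalized-payoffs reading of the Shapley entropy. All function and variable names are illustrative.

```python
# Minimal sketch (not the Troupe algorithm): Monte Carlo permutation
# sampling of Shapley values for binary classifiers that cooperate via
# majority vote, plus an assumed Shapley-entropy style heterogeneity score.
import numpy as np

def coalition_payoff(predictions, labels, coalition):
    """Accuracy of the majority vote of the classifiers in `coalition`."""
    if not coalition:
        return 0.0
    votes = predictions[list(coalition)].mean(axis=0) >= 0.5
    return float((votes == labels).mean())

def monte_carlo_shapley(predictions, labels, num_permutations=200, seed=0):
    """Approximate each classifier's Shapley value by averaging its
    marginal contribution over random orderings of the ensemble."""
    rng = np.random.default_rng(seed)
    n_models = predictions.shape[0]
    values = np.zeros(n_models)
    for _ in range(num_permutations):
        order = rng.permutation(n_models)
        coalition, prev = [], 0.0
        for model in order:
            coalition.append(model)
            curr = coalition_payoff(predictions, labels, coalition)
            values[model] += curr - prev
            prev = curr
    return values / num_permutations

def shapley_entropy(values, eps=1e-12):
    """Assumed definition: entropy of the normalized (clipped) Shapley
    values; higher means payoffs are spread more evenly across models."""
    p = np.clip(values, eps, None)
    p = p / p.sum()
    return float(-(p * np.log(p)).sum())

# Toy usage: five classifiers' 0/1 predictions on 100 labelled points,
# each correct with a different probability.
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=100)
predictions = np.vstack([
    np.where(rng.random(100) < acc, labels, 1 - labels)
    for acc in (0.9, 0.8, 0.7, 0.6, 0.5)
])
phi = monte_carlo_shapley(predictions, labels)
print("Shapley values:", np.round(phi, 3))
print("Shapley entropy:", round(shapley_entropy(phi), 3))
```

Troupe replaces the naive permutation sampling above with the approximation scheme the paper proves accurate and scalable; the sketch only conveys the structure of the game, in which the players are classifiers and the payoff is ensemble accuracy.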
Related papers
- Improving the Sampling Strategy in KernelSHAP [0.8057006406834466]
The KernelSHAP framework enables us to approximate the Shapley values using a sampled subset of weighted conditional expectations.
We propose three main contributions: a stabilizing technique to reduce the variance of the weights in the current state-of-the-art strategy, a novel weighting scheme that corrects the Shapley kernel weights based on sampled subsets, and a straightforward strategy that includes the important subsets and integrates them with the corrected Shapley kernel weights (the standard Shapley kernel weight is sketched after this list).
arXiv Detail & Related papers (2024-10-07T10:02:31Z)
- Accelerated Shapley Value Approximation for Data Evaluation [3.707457963532597]
We show that the Shapley values of data points can be approximated more efficiently by leveraging structural properties of machine learning problems.
Our analysis suggests that models trained on small subsets are in fact more important in the context of data valuation.
arXiv Detail & Related papers (2023-11-09T13:15:36Z)
- Anchor Points: Benchmarking Models with Much Fewer Examples [88.02417913161356]
In six popular language classification benchmarks, model confidence in the correct class on many pairs of points is strongly correlated across models.
We propose Anchor Point Selection, a technique to select small subsets of datasets that capture model behavior across the entire dataset.
Just a few anchor points can be used to estimate model per-class predictions on all other points in a dataset with low mean absolute error.
arXiv Detail & Related papers (2023-09-14T17:45:51Z)
- An Efficient Shapley Value Computation for the Naive Bayes Classifier [0.0]
This article proposes an exact analytic expression of Shapley values in the case of the naive Bayes classifier.
Results show that our Shapley proposal for the naive Bayes classifier provides informative results with low algorithmic complexity.
arXiv Detail & Related papers (2023-07-31T14:39:10Z)
- Prompt Algebra for Task Composition [131.97623832435812]
We consider Visual Language Models with prompt tuning as our base classifier.
We propose constrained prompt tuning to improve performance of the composite classifier.
On UTZappos it improves classification accuracy over the best base model by 8.45% on average.
arXiv Detail & Related papers (2023-06-01T03:20:54Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- On the Convergence of the Shapley Value in Parametric Bayesian Learning Games [28.212413634171572]
We show that for any two players, their difference in Shapley value converges in probability to the difference in Shapley value of a limiting game whose characteristic function is proportional to the log-determinant of the joint Fisher information.
Our result enables this to be achieved without any costly computations of posterior-prior KL divergences.
arXiv Detail & Related papers (2022-05-16T02:29:14Z)
- Exact Shapley Values for Local and Model-True Explanations of Decision Tree Ensembles [0.0]
We consider the application of Shapley values for explaining decision tree ensembles.
We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
arXiv Detail & Related papers (2021-12-16T20:16:02Z)
- Improving Label Quality by Jointly Modeling Items and Annotators [68.8204255655161]
We propose a fully Bayesian framework for learning ground truth labels from noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model.
arXiv Detail & Related papers (2021-06-20T02:15:20Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)
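For the KernelSHAP entry above: the corrected weighting scheme proposed in that paper is not reproduced here. The sketch below only shows the standard Shapley kernel weight that such subset-sampling strategies start from, under the usual convention that the empty and full coalitions are handled as constraints rather than sampled.

```python
# Minimal sketch (not the cited paper's corrected weighting): the standard
# Shapley kernel weight assigned to a coalition of size s out of M players,
# as used when sampling subsets for the weighted least-squares
# approximation of Shapley values.
from math import comb

def shapley_kernel_weight(M: int, s: int) -> float:
    """Weight of a coalition with s of M players present; the empty and
    full coalitions get infinite weight and are treated as constraints."""
    if s == 0 or s == M:
        return float("inf")
    return (M - 1) / (comb(M, s) * s * (M - s))

# Example with M = 6: weights concentrate on very small and very large
# coalitions, which is why sampling strategies prioritize those sizes.
M = 6
print([round(shapley_kernel_weight(M, s), 4) for s in range(1, M)])
```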
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.