Fugu-MT Paper Translation (Abstract): A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values

Paper summary: A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values

arxiv url: http://arxiv.org/abs/2506.05216v1
Date: Thu, 05 Jun 2025 16:30:53 GMT
Status: Translation complete
Last updated in system: 2025-06-06 21:53:49.824258
Title: A Unified Framework for Provably Efficient Algorithms to Estimate Shapley Values
Authors: Tyler Chen, Akshay Seshadri, Mattia J. Villani, Pradeep Niroula, Shouvanik Chakrabarti, Archan Ray, Pranav Deshpande, Romina Yalovetzky, Marco Pistoia, Niraj Kumar
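Abstract summary: The paper describes a broad and unified framework that encompasses KernelSHAP and related estimators constructed using both with-replacement and without-replacement sampling strategies. It proves strong non-asymptotic theoretical guarantees that apply to all estimators from the framework. The approach is validated against exact Shapley values, consistently achieving low mean squared error with modest sample sizes.
Reference score (independently computed attention measure): 4.445994770262589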
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Shapley values have emerged as a critical tool for explaining which features impact the decisions made by machine learning models. However, computing exact Shapley values is difficult, generally requiring a number of model evaluations that is exponential in the feature dimension. To address this, many model-agnostic randomized estimators have been developed, the most influential and widely used being the KernelSHAP method (Lundberg & Lee, 2017). While related estimators such as unbiased KernelSHAP (Covert & Lee, 2021) and LeverageSHAP (Musco & Witter, 2025) are known to satisfy theoretical guarantees, bounds for KernelSHAP have remained elusive. We describe a broad and unified framework that encompasses KernelSHAP and related estimators constructed using both with-replacement and without-replacement sampling strategies. We then prove strong non-asymptotic theoretical guarantees that apply to all estimators from our framework. This provides, to the best of our knowledge, the first theoretical guarantees for KernelSHAP and sheds further light on tradeoffs between existing estimators. Through comprehensive benchmarking on small- and medium-dimensional datasets for decision-tree models, we validate our approach against exact Shapley values, consistently achieving low mean squared error with modest sample sizes. Furthermore, we make specific implementation improvements to enable scalability of our methods to high-dimensional datasets. Our methods, tested on datasets such as MNIST and CIFAR10, provide consistently better results compared to the KernelSHAP library.
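To make the cost claim in the abstract concrete, the following is a minimal Python sketch, not the paper's method. It computes exact Shapley values by enumerating every coalition, which is exponential in the number of features, and compares them with a simple permutation-sampling Monte Carlo estimate. The permutation sampler is a swapped-in baseline chosen for brevity; it is not KernelSHAP or any estimator from the paper's framework. The function names and the toy value function `toy_value` are hypothetical and exist only for illustration.

```python
import itertools
import math
import random


def exact_shapley(value, d):
    """Exact Shapley values by enumerating all coalitions (exponential in d)."""
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = (math.factorial(size) * math.factorial(d - size - 1)
                      / math.factorial(d))
            for subset in itertools.combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def sampled_shapley(value, d, num_permutations, seed=0):
    """Monte Carlo estimate: average marginal contributions over random orderings."""
    rng = random.Random(seed)
    phi = [0.0] * d
    order = list(range(d))
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = frozenset()
        previous = value(coalition)
        for i in order:
            coalition = coalition | {i}
            current = value(coalition)
            phi[i] += (current - previous) / num_permutations
            previous = current
    return phi


def toy_value(coalition):
    """Hypothetical value function: additive terms plus one pairwise interaction."""
    weights = [1.0, 2.0, 3.0, 4.0]
    total = sum(weights[i] for i in coalition)
    if 0 in coalition and 1 in coalition:
        total += 2.0  # interaction shared between features 0 and 1
    return total


exact = exact_shapley(toy_value, 4)
approx = sampled_shapley(toy_value, 4, num_permutations=2000)
print("exact: ", exact)
print("approx:", approx)
```

In this toy game the interaction bonus is split evenly between features 0 and 1, so the exact values are 2, 3, 3 and 4, and they sum to the value of the full coalition. The sampled estimate needs only `num_permutations * (d + 1)` evaluations, whereas the enumeration grows as `2^d`, which is the gap that randomized estimators such as KernelSHAP aim to close.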

Related papers