Fugu-MT 論文翻訳(概要): Improved Feature Importance Computations for Tree Models: Shapley vs. Banzhaf

論文の概要: Improved Feature Importance Computations for Tree Models: Shapley vs. Banzhaf

arxiv url: http://arxiv.org/abs/2108.04126v1
Date: Mon, 9 Aug 2021 15:48:22 GMT
ステータス: 翻訳完了
システム内更新日: 2021-08-10 18:13:28.049954
Title: Improved Feature Importance Computations for Tree Models: Shapley vs. Banzhaf
Title（参考訳）: ツリーモデルの重要度計算の改善: Shapley vs. Banzhaf
Authors: Adam Karczmarz, Anish Mukherjee, Piotr Sankowski, Piotr Wygocki
Abstract要約: 我々は、Banzhaf値がShapley値に対していくつかの利点を提供する一方で、本質的に同じ説明を提供することを示す。我々は、Lundbergらのアルゴリズムと同じShapley値に基づく説明を計算し、改良したアルゴリズムを提供する。我々のアルゴリズムは、$O(TLD+n)$timeで動作するが、以前のアルゴリズムは、$O(TLD2+n)$runtime boundを持つ。
参考スコア（独自算出の注目度）: 1.275908842230521
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Shapley values are one of the main tools used to explain predictions of tree ensemble models. The main alternative to Shapley values are Banzhaf values that have not been understood equally well. In this paper we make a step towards filling this gap, providing both experimental and theoretical comparison of these model explanation methods. Surprisingly, we show that Banzhaf values offer several advantages over Shapley values while providing essentially the same explanations. We verify that Banzhaf values: (1) have a more intuitive interpretation, (2) allow for more efficient algorithms, and (3) are much more numerically robust. We provide an experimental evaluation of these theses. In particular, we show that on real world instances. Additionally, from a theoretical perspective we provide new and improved algorithm computing the same Shapley value based explanations as the algorithm of Lundberg et al. [Nat. Mach. Intell. 2020]. Our algorithm runs in $O(TLD+n)$ time, whereas the previous algorithm had $O(TLD^2+n)$ running time bound. Here, $T$ is the number of trees, $L$ is the maximum number of leaves in a tree, and $D$ denotes the maximum depth of a tree in the ensemble. Using the computational techniques developed for Shapley values we deliver an optimal $O(TL+n)$ time algorithm for computing Banzhaf values based explanations. In our experiments these algorithms give running times smaller even by an order of magnitude.
Abstract（参考訳）: シェープ値は、ツリーアンサンブルモデルの予測を説明する主要なツールの1つである。 Shapley値の主な代替手段は、等しく理解されていないBanzhaf値である。本稿では,このギャップを埋めるための一歩を踏み出し,モデル説明法を実験的および理論的に比較する。驚くべきことに、Banzhaf値がShapley値に対していくつかの利点を提供する一方で、本質的に同じ説明を提供することを示す。バンジャフの値がより直感的な解釈であること、(2)より効率的なアルゴリズムを可能にすること、(3)より数値的にロバストであることを確認する。これらを実験的に評価する。特に、実世界の事例でそれを示します。さらに、理論的な観点から、Lundbergらのアルゴリズムと同じShapley値に基づく説明を計算し、改良したアルゴリズムを提供する。 [ナット] Mach インテリ。 2020]. 我々のアルゴリズムは、$O(TLD+n)$timeで動作するのに対し、前のアルゴリズムは$O(TLD^2+n)$run time boundを持つ。ここで、$t$は木の数、$l$は木の葉の最大数、$d$はアンサンブルの中の木の最大深さを表す。 Shapley値のために開発された計算技術を用いて、Banzhaf値に基づく説明を計算するための最適な$O(TL+n)$Timeアルゴリズムを提供する。我々の実験では、これらのアルゴリズムは走行時間を桁違いに小さくする。

関連論文リスト

More-efficient Quantum Multivariate Mean Value Estimator from Generalized Grover Gate [0.0]
我々は、$Oleft(n log fracddeltaright)$サンプルを使用して、$vectildemu$の平均推定値を求めるアルゴリズムを見つける。我々の結果は、複雑さの$log fracddelta$項が原因で、まだ正確には最適ではない。
論文参考訳（メタデータ） (2025-04-09T14:48:23Z)
Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
線形スケッチを用いた行列とベクトルノルムの残差誤差推定問題について検討する。これは、前作とほぼ同じスケッチサイズと精度で、経験的にかなり有利であることを示す。また、スパースリカバリ問題に対して$Omega(k2/pn1-2/p)$低いバウンダリを示し、これは$mathrmpoly(log n)$ factorまで厳密である。
論文参考訳（メタデータ） (2024-08-16T02:33:07Z)
Efficient Algorithms for Generalized Linear Bandits with Heavy-tailed Rewards [40.99322897009357]
トランケーションと平均中央値に基づく2つの新しいアルゴリズムを提案する。我々のトラニケーションベースのアルゴリズムは、既存のトラニケーションベースのアプローチと区別して、オンライン学習をサポートする。我々のアルゴリズムは,$epsilon=1$の既存アルゴリズムと比較して,対数係数による後悔境界を改善する。
論文参考訳（メタデータ） (2023-10-28T13:01:10Z)
Online Robust Mean Estimation [37.746091744197656]
オンライン環境における高次元ロバスト平均推定問題について検討する。このモデルでは、オンラインロバスト平均推定の主な2つの結果が証明されている。
論文参考訳（メタデータ） (2023-10-24T15:28:43Z)
On Computing Optimal Tree Ensembles [7.424944196676223]
ランダム林や、より一般的には(決定ノブレイクダッシュ-)ツリーアンサンブルは、分類と回帰の方法として広く使われている。最近のアルゴリズムの進歩は、そのサイズや深さなどの様々な測定に最適な決定木を計算することができる。 2つの新しいアルゴリズムと対応する下位境界を提供する。
論文参考訳（メタデータ） (2023-06-07T13:30:43Z)
Mind the gap: Achieving a super-Grover quantum speedup by jumping to the end [114.3957763744719]
本稿では,数種類のバイナリ最適化問題に対して,厳密な実行保証を有する量子アルゴリズムを提案する。このアルゴリズムは、$n$非依存定数$c$に対して、時間で$O*(2(0.5-c)n)$の最適解を求める。また、$k$-spinモデルからのランダムなインスタンスの多数と、完全に満足あるいはわずかにフラストレーションされた$k$-CSP式に対して、文 (a) がそうであることを示す。
論文参考訳（メタデータ） (2022-12-03T02:45:23Z)
Fully-Dynamic Decision Trees [3.0058005235097114]
ラベル付き例の挿入と削除の任意のシーケンス上で決定木を維持するための,最初の完全動的アルゴリズムを開発した。実数値機能の場合、このアルゴリズムは挿入/削除毎に$Obig(fracd log3 nepsilon2big)$.amortized run timeを持つ。同様の保証を持つアルゴリズムは、amortized run time $Omega(d)$と space $tildeOmega (n d)$を使用する。
論文参考訳（メタデータ） (2022-12-01T18:58:19Z)
Nearly Optimal Policy Optimization with Stable at Any Time Guarantee [53.155554415415445]
citetshani 2020optimisticのポリシーベースのメソッドは、$tildeO(sqrtSAH3K + sqrtAH4K)$である。$S$は状態の数、$A$はアクションの数、$H$は地平線、$K$はエピソードの数、$sqrtSH$は情報理論の下限の$tildeOmega(sqrtSAH)と比べてギャップがある。
論文参考訳（メタデータ） (2021-12-21T01:54:17Z)
Accurate Shapley Values for explaining tree-based models [0.0]
木構造を効率的に利用し,最先端の手法よりも精度の高い2つのシェープ値推定器を導入する。これらのメソッドはPythonパッケージとして利用できる。
論文参考訳（メタデータ） (2021-06-07T17:35:54Z)
Higher-order Derivatives of Weighted Finite-state Machines [68.43084108204741]
本研究では、重み付き有限状態機械の正規化定数に関する高次微分の計算について検討する。文献に記載されていないすべての順序の導関数を評価するための一般アルゴリズムを提案する。我々のアルゴリズムは以前のアルゴリズムよりもはるかに高速である。
論文参考訳（メタデータ） (2021-06-01T19:51:55Z)
Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
対数要因まで $(varepsilon,delta)$-differently private の最適過剰人口損失は $sqrtlog(d)/n + sqrtd/varepsilon n.$ です。損失関数がさらなる滑らかさの仮定を満たすとき、余剰損失は$sqrtlog(d)/n + (log(d)/varepsilon n)2/3で上界(対数因子まで)であることが示される。
論文参考訳（メタデータ） (2021-03-02T06:53:44Z)
Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint [75.85952446237599]
修正されたグリードアルゴリズムは、近似係数が0.305$であることを示す。最適なデータ依存上界を導出する。また、分岐やバウンドといったアルゴリズムの効率を大幅に改善するためにも使うことができる。
論文参考訳（メタデータ） (2020-08-12T15:40:21Z)
Learning Near Optimal Policies with Low Inherent Bellman Error [115.16037976819331]
エピソード強化学習における近似線形作用値関数を用いた探索問題について検討する。我々は,検討した設定に対して最適な統計率を達成するアルゴリズムを用いて,Emphbatch仮定のみを用いて探索を行うことが可能であることを示す。
論文参考訳（メタデータ） (2020-02-29T02:02:40Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。