Fugu-MT Paper Translation (Summary): GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees

Paper overview: GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees

arxiv url: http://arxiv.org/abs/2604.12757v1
Date: Tue, 14 Apr 2026 14:03:22 GMT
Status: Translation complete
Last updated in system: 2026-04-15 19:11:32.479536
Title: GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees
Authors: Arya Shah, Kaveri Visavadiya, Manisha Padala
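Abstract summary: We introduce the GF-Score (GREAT-Fairness Score), a framework that decomposes the certified GREAT Score into per-class robustness profiles. The decomposition is exact, per-class scores reveal consistent vulnerability patterns, and more robust models tend to exhibit greater class-level disparity. These results establish a practical, attack-free auditing pipeline for diagnosing where certified robustness guarantees fail to protect all classes equally.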
Reference score (independently computed attention measure): 1.6058099298620423
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Adversarial robustness is essential for deploying neural networks in safety-critical applications, yet standard evaluation methods either require expensive adversarial attacks or report only a single aggregate score that obscures how robustness is distributed across classes. We introduce the GF-Score (GREAT-Fairness Score), a framework that decomposes the certified GREAT Score into per-class robustness profiles and quantifies their disparity through four metrics grounded in welfare economics: the Robustness Disparity Index (RDI), the Normalized Robustness Gini Coefficient (NRGC), Worst-Case Class Robustness (WCR), and a Fairness-Penalized GREAT Score (FP-GREAT). The framework further eliminates the original method's dependence on adversarial attacks through a self-calibration procedure that tunes the temperature parameter using only clean accuracy correlations. Evaluating 22 models from RobustBench across CIFAR-10 and ImageNet, we find that the decomposition is exact, that per-class scores reveal consistent vulnerability patterns (e.g., "cat" is the weakest class in 76% of CIFAR-10 models), and that more robust models tend to exhibit greater class-level disparity. These results establish a practical, attack-free auditing pipeline for diagnosing where certified robustness guarantees fail to protect all classes equally. We release our code on GitHub: https://github.com/aryashah2k/gf-score
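The per-class decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define RDI, NRGC, or FP-GREAT, so none of them is reproduced here. The sketch assumes only that an aggregate score is a mean of per-sample scores, in which case the aggregate equals the frequency-weighted sum of the per-class means, and it uses the plain textbook Gini coefficient and the minimum per-class mean as stand-ins for a disparity measure and a worst-case class score. The function names (`per_class_profile`, `gini`, `audit`) and the toy scores are hypothetical.

```python
from collections import defaultdict


def per_class_profile(scores, labels):
    """Return ({class: mean score}, {class: sample count})."""
    groups = defaultdict(list)
    for score, label in zip(scores, labels):
        groups[label].append(score)
    means = {c: sum(v) / len(v) for c, v in groups.items()}
    counts = {c: len(v) for c, v in groups.items()}
    return means, counts


def gini(values):
    """Textbook Gini coefficient of non-negative values (0 means equal).

    Assumption: this is the unnormalized form, not the paper's NRGC.
    """
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    ranked = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * ranked / (n * total) - (n + 1.0) / n


def audit(scores, labels):
    """Summarize how a mean score is distributed across classes."""
    means, counts = per_class_profile(scores, labels)
    n = len(scores)
    aggregate = sum(scores) / n
    # Frequency-weighted sum of per-class means; equals the aggregate mean.
    recomposed = sum(counts[c] / n * means[c] for c in means)
    weakest = min(means, key=means.get)
    return {
        "per_class": means,
        "aggregate": aggregate,
        "recomposed": recomposed,
        "weakest_class": weakest,
        "worst_case": means[weakest],
        "gini": gini(means.values()),
    }


if __name__ == "__main__":
    # Toy per-sample scores (placeholders, not real GREAT Score outputs).
    toy_scores = [0.2, 0.4, 0.6, 0.8, 1.0, 1.0]
    toy_labels = ["cat", "cat", "dog", "dog", "ship", "ship"]
    print(audit(toy_scores, toy_labels))
```

In this toy setup the aggregate and recomposed values agree up to floating-point rounding, which is the sense in which a mean-based decomposition is exact. The disparity figures depend entirely on the placeholder scores.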


Related papers