Fugu-MT Paper Translation (Summary): MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack

Paper summary: MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack

arxiv url: http://arxiv.org/abs/2211.08008v1
Date: Tue, 15 Nov 2022 09:45:32 GMT
Status: translation complete
Last updated in system: 2022-11-16 13:51:04.296441
Title: MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack
Authors: Yunrui Yu, Xitong Gao, Cheng-Zhong Xu
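Abstract summary: Adversarial attacks can deceive neural networks by adding small perturbations to their input data. The paper shows that recent adversarial attack strategies cannot reliably evaluate ensemble defenses and substantially overestimate their robustness. It introduces MORA, a model-reweighing attack that reweighs the importance of sub-model gradients.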
Reference score (attention score computed by this site): 26.37741124166643
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data. Ensemble defenses, which are trained to minimize attack transferability among sub-models, offer a promising research direction to improve robustness against such attacks while maintaining high accuracy on natural inputs. We discover, however, that recent state-of-the-art (SOTA) adversarial attack strategies cannot reliably evaluate ensemble defenses, substantially overestimating their robustness. This paper identifies the two factors that contribute to this behavior. First, these defenses form ensembles that are notably difficult for existing gradient-based methods to attack, due to gradient obfuscation. Second, ensemble defenses diversify sub-model gradients, which makes it challenging to defeat all sub-models simultaneously: simply summing their contributions may counteract the overall attack objective. Yet we observe that an ensemble may still be fooled despite most sub-models being correct. We therefore introduce MORA, a model-reweighing attack that steers adversarial example synthesis by reweighing the importance of sub-model gradients. MORA finds that recent ensemble defenses all exhibit varying degrees of overestimated robustness. Compared with recent SOTA white-box attacks, it can converge orders of magnitude faster while achieving higher attack success rates across all ensemble models examined with three different ensemble modes (i.e., ensembling by softmax, voting, or logits). In particular, most ensemble defenses exhibit near or exactly 0% robustness against MORA with $\ell^\infty$ perturbation within 0.02 on CIFAR-10, and 0.01 on CIFAR-100. We make MORA open source with reproducible results and pre-trained models, and provide a leaderboard of ensemble defenses under various attack strategies.
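The abstract names three ensemble modes and an attack that reweighs sub-model gradients under an $\ell^\infty$ budget. The following is a minimal illustrative sketch of those ideas in plain Python, not the authors' implementation. The function names (`ensemble_predict`, `weighted_linf_step`), the tie-breaking rule for voting, the [0, 1] input range, and the caller-supplied `weights` are all assumptions made for illustration; the abstract does not say how MORA computes its weights.

```python
import math
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def ensemble_predict(sub_logits, mode):
    """Combine per-sub-model logits into one class prediction.

    sub_logits: list of logit lists, one per sub-model.
    mode: "logits", "softmax", or "voting" (the three modes in the abstract).
    """
    n_models = len(sub_logits)
    n_classes = len(sub_logits[0])
    if mode == "logits":
        # Average raw logits, then take the arg max.
        avg = [sum(l[c] for l in sub_logits) / n_models for c in range(n_classes)]
        return argmax(avg)
    if mode == "softmax":
        # Average per-model probabilities, then take the arg max.
        probs = [softmax(l) for l in sub_logits]
        avg = [sum(p[c] for p in probs) / n_models for c in range(n_classes)]
        return argmax(avg)
    if mode == "voting":
        # Majority vote; ties go to the lowest class index (an assumption).
        votes = Counter(argmax(l) for l in sub_logits)
        best = max(votes.values())
        return min(c for c, v in votes.items() if v == best)
    raise ValueError(f"unknown ensemble mode: {mode}")


def weighted_linf_step(x, x_orig, sub_grads, weights, step, eps):
    """One signed-gradient step on a weighted sum of sub-model gradients.

    The result is clipped to the l-infinity ball of radius eps around x_orig
    and to the assumed valid input range [0, 1]. The weights are hypothetical
    inputs, not MORA's actual weighting rule.
    """
    out = []
    for i, (xi, oi) in enumerate(zip(x, x_orig)):
        g = sum(w * grad[i] for w, grad in zip(weights, sub_grads))
        sign = (g > 0) - (g < 0)
        xi = xi + step * sign
        xi = min(max(xi, oi - eps), oi + eps)  # stay within the eps budget
        out.append(min(max(xi, 0.0), 1.0))     # stay within the input range
    return out
```

With sub-model logits `[[0.2, 0.0], [0.2, 0.0], [0.0, 5.0]]`, voting returns class 0 while softmax and logits averaging return class 1, so one confident sub-model can flip the ensemble even though most sub-models still predict class 0. Likewise, with gradients `[[1.0, -1.0], [-1.0, 0.5]]`, equal weights cancel the first coordinate entirely, whereas weights `[1.0, 0.0]` move it by the full budget. This is the kind of cancellation the abstract attributes to simply summing sub-model contributions.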
