Fugu-MT Paper Translation (Abstract): Analysis and Applications of Class-wise Robustness in Adversarial Training

Paper overview: Analysis and Applications of Class-wise Robustness in Adversarial Training

arxiv url: http://arxiv.org/abs/2105.14240v1
Date: Sat, 29 May 2021 07:28:35 GMT
Status: Translation complete
Last updated in system: 2021-06-05 20:09:30.702463
Title: Analysis and Applications of Class-wise Robustness in Adversarial Training
Authors: Qi Tian, Kun Kuang, Kelu Jiang, Fei Wu, Yisen Wang
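Abstract summary: Adversarial training is one of the most effective approaches to improving model robustness against adversarial examples. Previous work has mainly focused on the overall robustness of the model, and an in-depth analysis of the role of each class is still missing. We provide a detailed diagnosis of adversarial training on six benchmark datasets: MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10, and ImageNet. We observe that stronger attack methods in adversarial learning gain their performance improvement mainly from more successful attacks on the vulnerable classes.
Reference score (independently computed attention score): 92.08430396614273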
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Adversarial training is one of the most effective approaches to improving model robustness against adversarial examples. However, previous works mainly focus on the overall robustness of the model, and an in-depth analysis of the role of each class involved in adversarial training is still missing. In this paper, we propose to analyze the class-wise robustness in adversarial training. First, we provide a detailed diagnosis of adversarial training on six benchmark datasets, i.e., MNIST, CIFAR-10, CIFAR-100, SVHN, STL-10 and ImageNet. Surprisingly, we find that there are remarkable robustness discrepancies among classes, leading to unbalanced/unfair class-wise robustness in the robust models. Furthermore, we investigate the relations between classes and find that the unbalanced class-wise robustness is quite consistent across different attack and defense methods. Moreover, we observe that the stronger attack methods in adversarial learning achieve their performance improvement mainly from more successful attacks on the vulnerable classes (i.e., classes with less robustness). Inspired by these findings, we design a simple but effective attack method based on the traditional PGD attack, named Temperature-PGD attack, which enlarges the robustness disparity among classes by applying a temperature factor to the confidence distribution of each image. Experiments demonstrate that our method achieves a higher attack rate than the PGD attack. Furthermore, from the defense perspective, we also make some modifications in the training and inference phases to improve the robustness of the most vulnerable class, so as to mitigate the large difference in class-wise robustness. We believe our work can contribute to a more comprehensive understanding of adversarial training as well as to rethinking the class-wise properties in robust models.
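The abstract describes Temperature-PGD only at a high level: a PGD attack with a temperature factor applied to the confidence distribution of each image. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that the temperature divides the logits before the softmax, and it uses a hypothetical linear softmax classifier (`W`, `b`) so that the input gradient can be written analytically. The function name `temperature_pgd` and all parameter defaults are illustrative assumptions.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def temperature_pgd(x, y, W, b, eps=0.1, alpha=0.02, steps=10, temperature=1.0):
    """L-infinity PGD against a linear softmax classifier (illustrative only).

    Assumption: the temperature rescales the logits before the softmax,
    which reshapes the confidence distribution used by the loss.
    x: (n, d) inputs in [0, 1]; y: (n,) integer labels;
    W: (k, d) weights; b: (k,) biases of the hypothetical classifier.
    """
    x_adv = x.copy()
    onehot = np.eye(W.shape[0])[y]              # (n, k) one-hot labels
    for _ in range(steps):
        logits = x_adv @ W.T + b                # (n, k)
        probs = softmax(logits / temperature)   # temperature-scaled confidences
        # Gradient of cross-entropy(softmax(logits / T), y) w.r.t. the input.
        grad = (probs - onehot) @ W / temperature
        # Signed gradient ascent step on the loss.
        x_adv = x_adv + alpha * np.sign(grad)
        # Project back into the eps-ball around x, then into the valid range.
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv
```

With `temperature=1.0` this reduces to a standard PGD attack on the same model, so the two can be compared by changing only that one argument. How the paper actually chooses the temperature is not stated in the abstract.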


Related papers