Fugu-MT Paper Translation (Abstract): Catastrophic Overfitting: A Potential Blessing in Disguise

Paper overview: Catastrophic Overfitting: A Potential Blessing in Disguise

arxiv url: http://arxiv.org/abs/2402.18211v1
Date: Wed, 28 Feb 2024 10:01:44 GMT
Status: Translation complete
Last updated in system: 2024-02-29 15:23:17.675715
Title: Catastrophic Overfitting: A Potential Blessing in Disguise
Title (reference translation): Catastrophic Overfitting: A Potential Blessing in Disguise
Authors: Mengnan Zhao, Lihe Zhang, Yuqiu Kong, Baocai Yin
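Abstract summary: Fast Adversarial Training (FAT) has attracted attention in the research community for its effectiveness in improving adversarial robustness. Existing FAT approaches have made progress in mitigating catastrophic overfitting (CO), but the gain in adversarial robustness comes with a drop in classification accuracy on clean samples. We use the differences in feature activations between clean and adversarial examples to analyze the root causes of CO. We then harness CO to achieve "attack obfuscation," with the aim of improving model performance.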
Reference score (independently computed attention score): 51.996943482875366
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fast Adversarial Training (FAT) has gained increasing attention within the research community owing to its efficacy in improving adversarial robustness. Particularly noteworthy is the challenge posed by catastrophic overfitting (CO) in this field. Although existing FAT approaches have made strides in mitigating CO, the ascent of adversarial robustness occurs with a non-negligible decline in classification accuracy on clean samples. To tackle this issue, we initially employ the feature activation differences between clean and adversarial examples to analyze the underlying causes of CO. Intriguingly, our findings reveal that CO can be attributed to the feature coverage induced by a few specific pathways. By intentionally manipulating feature activation differences in these pathways with well-designed regularization terms, we can effectively mitigate and induce CO, providing further evidence for this observation. Notably, models trained stably with these terms exhibit superior performance compared to prior FAT work. On this basis, we harness CO to achieve "attack obfuscation", aiming to bolster model performance. Consequently, the models suffering from CO can attain optimal classification accuracy on both clean and adversarial data when adding random noise to inputs during evaluation. We also validate their robustness against transferred adversarial examples and the necessity of inducing CO to improve robustness. Hence, CO may not be a problem that has to be solved.
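Abstract (reference translation): Fast Adversarial Training (FAT) has attracted attention in the research community for its effectiveness in improving adversarial robustness. Particularly notable is the challenge that catastrophic overfitting (CO) poses in this field. Existing FAT approaches have made progress in mitigating CO, but the gain in adversarial robustness comes with a non-negligible drop in classification accuracy on clean samples. To address this, we first use the differences in feature activations between clean and adversarial examples to analyze the underlying causes of CO. Interestingly, we find that CO can be attributed to the feature coverage induced by a few specific pathways. By deliberately manipulating the activation differences in these pathways with well-designed regularization terms, we can both mitigate and induce CO, which provides further evidence for this observation. Notably, models trained stably with these terms outperform prior FAT work. Building on this, we harness CO to achieve "attack obfuscation," with the aim of improving model performance. As a result, models suffering from CO can attain optimal classification accuracy on both clean and adversarial data when random noise is added to inputs during evaluation. We also validate their robustness against transferred adversarial examples and the necessity of inducing CO to improve robustness. CO may therefore not be a problem that has to be solved.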
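
The abstract mentions adding random noise to inputs during evaluation. The following is a minimal sketch of that general idea only, not the authors' implementation: the abstract does not specify the noise distribution or magnitude, so uniform noise, the `magnitude` parameter, the [0, 1] input range, and the names `add_uniform_noise`, `accuracy_with_noise`, and `toy_predict` are all assumptions made for illustration.

```python
import random


def add_uniform_noise(x, magnitude, rng):
    """Return a copy of x with i.i.d. uniform noise in [-magnitude, magnitude].

    Assumption: inputs are pixel-like values in [0, 1], so results are clipped
    to that range. The paper's actual noise model may differ.
    """
    return [min(1.0, max(0.0, v + rng.uniform(-magnitude, magnitude))) for v in x]


def accuracy_with_noise(predict, inputs, labels, magnitude, seed=0):
    """Accuracy of `predict` when random noise is added to every input first."""
    rng = random.Random(seed)  # seeded so the evaluation is repeatable
    correct = 0
    for x, y in zip(inputs, labels):
        if predict(add_uniform_noise(x, magnitude, rng)) == y:
            correct += 1
    return correct / len(inputs)


def toy_predict(x):
    """Hypothetical stand-in classifier: thresholds the mean input value."""
    return int(sum(x) / len(x) > 0.5)


# Tiny hand-made dataset, purely illustrative.
inputs = [[0.1, 0.2, 0.1, 0.2], [0.9, 0.8, 0.9, 0.8]]
labels = [0, 1]
acc = accuracy_with_noise(toy_predict, inputs, labels, magnitude=0.05)
```

In practice `predict` would wrap a trained model, and the same routine would be run once on clean inputs and once on adversarial inputs to compare the two accuracies.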

Related papers