Fugu-MT Paper Translation (Summary): Rethinking Backdoor Adversarial Unlearning through the Lens of Catastrophic Forgetting in Continual Learning

Paper summary: Rethinking Backdoor Adversarial Unlearning through the Lens of Catastrophic Forgetting in Continual Learning

arxiv url: http://arxiv.org/abs/2606.14078v1
Date: Fri, 12 Jun 2026 03:55:04 GMT
Status: Translation complete
Last updated in system: 2026-06-15 16:00:42.73593
Title: Rethinking Backdoor Adversarial Unlearning through the Lens of Catastrophic Forgetting in Continual Learning
Authors: Zhenqian Zhu, Yamin Hu, Yujiang Liu, Luping Wei, Wenbo Hou, Bin Li, Haodong Li, Wenjian Luo
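Abstract summary: Current backdoor defenses exhibit limited robustness and often fail against specific types of attacks. From a continual learning perspective, the authors present a novel formulation of backdoor learning and unlearning as a sequential, three-stage process. The method is applicable across a wide spectrum of backdoor attacks and can effectively and thoroughly eliminate backdoor effects from a backdoored model.
Reference score (independently computed attention score): 7.959552018607674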
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing studies reveal that current backdoor defenses exhibit limited robustness and often fail against specific types of attacks. More concerningly, prevailing safety tuning strategies tend to provide only superficial safety protection, as they fall short of completely eliminating the backdoor effects. In this work, we present a novel formulation of backdoor learning and unlearning as a sequential, three-stage process from a continual learning perspective. Within this framework, we formally define complete backdoor unlearning and further derive the necessary conditions for achieving it based on the mechanism of catastrophic forgetting. Guided by these insights, we propose Blind Inversion-Backdoor Adversarial Unlearning (BI-BAU), which formulates the generation of adversarial examples satisfying the unlearning conditions as a blind inversion problem. We solve this by integrating the bi-level optimization process of adversarial training into an Expectation-Maximization (EM) algorithm framework to optimize the maximum a posteriori (MAP) objective. Furthermore, BI-BAU is extended to untargeted adversarial scenarios with unknown target classes, as well as to multi-modal contrastive learning tasks, enhancing its applicability to real-world deployment scenarios where pre-trained models may be compromised. Extensive experiments demonstrate that our method exhibits general applicability across a wide spectrum of backdoor attacks and can effectively and thoroughly eliminate the backdoor effects from a backdoor model.
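For orientation, the two ingredients named in the abstract (the bi-level optimization of adversarial training and an EM procedure for a MAP objective) can be written in their generic textbook forms. This is only a sketch under assumed notation: the model $f_\theta$, loss $\ell$, perturbation $\delta$ with budget $\epsilon$, latent variable $z$, and dataset $\mathcal{D}$ are illustrative symbols that do not come from the paper, and the actual BI-BAU objective may differ.

```latex
% Generic bi-level (min-max) adversarial training objective.
% All notation here is assumed for illustration, not taken from the paper.
\[
  \min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|\le\epsilon} \ell\big(f_{\theta}(x+\delta),\, y\big) \Big]
\]

% Generic MAP estimate with a latent variable z.
\[
  \hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta} \Big[ \log \sum_{z} p(\mathcal{D}, z \mid \theta) + \log p(\theta) \Big]
\]

% Generic EM iteration for that MAP objective.
\[
  \text{E-step:}\quad
  Q\big(\theta \mid \theta^{(t)}\big)
  = \mathbb{E}_{z \sim p(z \mid \mathcal{D},\, \theta^{(t)})}
    \big[ \log p(\mathcal{D}, z \mid \theta) \big]
\]
\[
  \text{M-step:}\quad
  \theta^{(t+1)}
  = \arg\max_{\theta} \Big[ Q\big(\theta \mid \theta^{(t)}\big) + \log p(\theta) \Big]
\]
```

In this generic reading, the inner maximization produces adversarial examples and the outer minimization updates the model. How BI-BAU maps these two levels onto the E-step and M-step is not stated in the abstract.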

Related papers