Fugu-MT Paper Translation (Abstract): Robust Defense Strategies for Multimodal Contrastive Learning: Efficient Fine-tuning Against Backdoor Attacks

arxiv url: http://arxiv.org/abs/2511.13545v1
Date: Mon, 17 Nov 2025 16:16:50 GMT
Status: Translation complete
Last updated in system: 2025-11-18 14:36:25.390294
Title: Robust Defense Strategies for Multimodal Contrastive Learning: Efficient Fine-tuning Against Backdoor Attacks
Authors: Md. Iqbal Hossain, Afia Sajeeda, Neeresh Kumar Perla, Ming Shao
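Abstract summary: Multimodal deep learning models such as CLIP are not safe against adversarial attacks. This study proposes an innovative strategy to enhance the robustness of multimodal contrastive learning models against such attacks.
Reference score (independently computed attention score): 5.333108060878682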
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The advent of multimodal deep learning models, such as CLIP, has unlocked new frontiers in a wide range of applications, from image-text understanding to classification tasks. However, these models are not safe against adversarial attacks, particularly backdoor attacks, which can subtly manipulate model behavior. Moreover, existing defense methods typically involve training from scratch or fine-tuning on a large dataset without pinpointing the specific labels that are affected. In this study, we introduce an innovative strategy to enhance the robustness of multimodal contrastive learning models against such attacks. In particular, given a poisoned CLIP model, our approach can identify the backdoor trigger and pinpoint the victim samples and labels in an efficient manner. To that end, an image segmentation "oracle" is introduced as the supervisor for the output of the poisoned CLIP. We develop two algorithms to rectify the poisoned model: (1) differentiating between CLIP's and the oracle's knowledge to identify potential triggers; (2) pinpointing affected labels and victim samples, and curating a compact fine-tuning dataset. With this knowledge, we can rectify the poisoned CLIP model to negate backdoor effects. Extensive experiments on visual recognition benchmarks demonstrate that our strategy is effective in CLIP-based backdoor defense.
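
The abstract gives no algorithmic detail, so the following is only a toy sketch of the general idea behind step (2): compare a suspect model's predictions with a trusted oracle's labels, flag labels that absorb a disproportionate share of the disagreements, and keep only the affected samples for fine-tuning. The function names, the `min_share` threshold, and the disagreement-share heuristic are all illustrative assumptions, not the paper's actual method.

```python
from collections import Counter


def find_suspect_labels(clip_preds, oracle_labels, min_share=0.5):
    """Flag labels that the (possibly poisoned) model predicts in a large
    share of the cases where it disagrees with the oracle.

    Hypothetical heuristic: a backdoor target label tends to dominate
    the disagreements. `min_share` is an assumed threshold.
    """
    wrong = [c for c, o in zip(clip_preds, oracle_labels) if c != o]
    if not wrong:
        return set()
    counts = Counter(wrong)
    return {label for label, n in counts.items() if n / len(wrong) >= min_share}


def curate_finetune_set(sample_ids, clip_preds, oracle_labels, suspect_labels):
    """Keep only samples the model assigned to a suspect label while the
    oracle disagreed, paired with the oracle's label as the correction."""
    return [
        (sid, o)
        for sid, c, o in zip(sample_ids, clip_preds, oracle_labels)
        if c in suspect_labels and c != o
    ]


# Made-up example data: "banana" plays the role of the attacker's target label.
sample_ids = ["img0", "img1", "img2", "img3", "img4", "img5"]
clip_preds = ["cat", "banana", "banana", "dog", "banana", "cat"]
oracle_labels = ["cat", "dog", "car", "dog", "banana", "dog"]

suspects = find_suspect_labels(clip_preds, oracle_labels)
finetune_set = curate_finetune_set(sample_ids, clip_preds, oracle_labels, suspects)
print(suspects)
print(finetune_set)
```

In this toy data only two of the six samples end up in the fine-tuning set, which mirrors the abstract's point about curating a compact dataset rather than fine-tuning on everything.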
