Fugu-MT Paper Translation (Abstract): Adversarial Attacks Against MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment

Paper overview: Adversarial Attacks Against MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment

arxiv url: http://arxiv.org/abs/2605.09902v1
Date: Mon, 11 May 2026 02:45:48 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:50.479656
Title: Adversarial Attacks Against MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment
Title (reference translation): Adversarial Attacks on MLLMs via Progressive Resolution Processing and Adaptive Feature Alignment
Authors: Haobo Wang, Xiaorong Ma, Weiqi Luo, Xiaojun Jia, Jiwu Huang
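Abstract summary: Adversarial perturbations can mislead Multimodal Large Language Models (MLLMs) into recognizing a benign image as a specific target object. This makes transfer-based targeted attacks crucial for understanding and improving black-box MLLM robustness. We propose PRAF-Attack, a transfer-based attack framework that integrates multi-scale global semantic guidance with robust intermediate-layer local alignment.
Reference score (independently computed attention score): 56.085182951399496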
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial perturbations can mislead Multimodal Large Language Models (MLLMs) into recognizing a benign image as a specific target object, posing serious risks in safety-critical scenarios such as autonomous driving and medical diagnosis. This makes transfer-based targeted attacks crucial for understanding and improving black-box MLLM robustness. Existing transfer-based targeted attack methods typically rely on the final global features of the surrogate encoder and anchor optimization to original-resolution target crops, which limits their transferability and robustness. To address these challenges, we propose Progressive Resolution Processing and Adaptive Feature Alignment (PRAF-Attack), a targeted transfer-based attack framework that integrates multi-scale global semantic guidance with robust intermediate-layer local alignment. Unlike prior methods that align only the surrogate encoder's final layer, we design an adaptive feature alignment strategy that leverages intermediate representations to enhance transferability. Specifically, we introduce an adaptive intermediate layer selection mechanism to identify transferable hierarchical features across surrogate ensembles via gradient consistency, along with an adaptive patch-level optimization strategy that preserves highly correlated local regions through efficient patch filtering. To overcome the reliance on fixed original-resolution target crops, we propose a progressive resolution processing strategy that gradually refines optimization from coarse to fine, enabling the attack to better exploit target information at multiple scales and achieve stronger transferability. We evaluate PRAF-Attack on a diverse suite of black-box MLLMs, including six open-source models and six closed-source commercial APIs. Compared with seven state-of-the-art targeted attack baselines, the proposed PRAF-Attack consistently achieves superior transferability.
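Abstract (reference translation): Adversarial perturbations can mislead multimodal large language models (MLLMs) into recognizing a benign image as a specific target object, which can pose serious risks in safety-critical scenarios such as autonomous driving and medical diagnosis. This makes transfer-based targeted attacks important for understanding and improving the robustness of black-box MLLMs. Existing transfer-based attack methods typically rely on the final global features of the surrogate encoder and anchor optimization to original-resolution target crops, which limits their transferability and robustness. To address these challenges, we propose Progressive Resolution Processing and Adaptive Feature Alignment (PRAF-Attack), a transfer-based attack framework that integrates multi-scale global semantic guidance with robust intermediate-layer local alignment. Unlike prior methods that align only the final layer of the surrogate encoder, we design an adaptive feature alignment strategy that uses intermediate representations to enhance transferability. Specifically, we introduce an adaptive intermediate-layer selection mechanism that identifies transferable hierarchical features across surrogate ensembles via gradient consistency, together with an adaptive patch-level optimization strategy that preserves highly correlated local regions through efficient patch filtering. To overcome the reliance on fixed original-resolution target crops, we propose a progressive resolution processing strategy that gradually refines optimization from coarse to fine, making better use of target information at multiple scales and achieving stronger transferability. We evaluate PRAF-Attack on a diverse set of black-box MLLMs, including six open-source models and six closed-source commercial APIs. Compared with seven state-of-the-art attack baselines, the proposed PRAF-Attack consistently achieves superior transferability.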
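
Note: The coarse-to-fine idea behind progressive resolution processing can be illustrated with a minimal sketch. The abstract does not state the resolutions, the stage lengths, or the resampling method, so everything here is an assumption made for illustration: the function names `resolution_schedule` and `downsample` are hypothetical, the steps are split into near-equal stages, and average pooling stands in for whatever resizing the authors use. It is not the authors' implementation.

```python
def resolution_schedule(total_steps, resolutions):
    """Assign a target resolution to each optimization step, coarse to fine.

    Assumption: steps are split into near-equal consecutive stages,
    one stage per resolution, smallest resolution first.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if not resolutions:
        raise ValueError("resolutions must be non-empty")
    stages = sorted(resolutions)  # coarse (small) resolutions come first
    n = len(stages)
    return [stages[min(step * n // total_steps, n - 1)]
            for step in range(total_steps)]


def downsample(image, factor):
    """Average-pool a 2D grayscale image (a list of rows) by an integer factor.

    Assumption: average pooling is only a stand-in for the resizing step.
    """
    height, width = len(image), len(image[0])
    if factor <= 0 or height % factor or width % factor:
        raise ValueError("factor must be positive and divide both dimensions")
    pooled = []
    for i in range(0, height, factor):
        row = []
        for j in range(0, width, factor):
            block = [image[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    # Six steps over three hypothetical resolutions: two steps per stage.
    print(resolution_schedule(6, [224, 56, 112]))
    # A 2x2 target pooled by a factor of 2 collapses to its mean.
    print(downsample([[1, 3], [5, 7]], 2))
```

In a sketch like this, early steps would align the perturbed image with a heavily downsampled target, which keeps only coarse semantic content, and later steps would switch to finer targets. If `total_steps` is smaller than the number of resolutions, the finest stages are never reached, so a real schedule would need to guard against that.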

Related papers