Fugu-MT Paper Translation (Summary): Mosaic: Multimodal Jailbreak against Closed-Source VLMs via Multi-View Ensemble Optimization

Paper summary: Mosaic: Multimodal Jailbreak against Closed-Source VLMs via Multi-View Ensemble Optimization

arxiv url: http://arxiv.org/abs/2604.09253v1
Date: Fri, 10 Apr 2026 12:09:06 GMT
Status: Translation complete
Last updated in system: 2026-04-13 17:57:53.848763
Title: Mosaic: Multimodal Jailbreak against Closed-Source VLMs via Multi-View Ensemble Optimization
Authors: Yuqin Lan, Gen Li, Yuanze Hu, Weihao Shen, Zhaoxin Fan, Faguo Wu, Xiao Zhang, Laurence T. Yang, Zhiming Zheng
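Abstract summary: Vision-Language Models (VLMs) are powerful but remain vulnerable to multimodal jailbreak attacks. The authors propose Mosaic, a multi-view ensemble optimization framework for multimodal jailbreak against closed-source VLMs.
Reference score (attention score computed independently by the site): 30.30398584843095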
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vision-Language Models (VLMs) are powerful but remain vulnerable to multimodal jailbreak attacks. Existing attacks mainly rely on either explicit visual prompt attacks or gradient-based adversarial optimization. While the former is easier to detect, the latter produces subtle perturbations that are less perceptible, but is usually optimized and evaluated under homogeneous open-source surrogate-target settings, leaving its effectiveness on commercial closed-source VLMs under heterogeneous settings unclear. To examine this issue, we study different surrogate-target settings and observe a consistent gap between homogeneous and heterogeneous settings, a phenomenon we term surrogate dependency. Motivated by this finding, we propose Mosaic, a Multi-view ensemble optimization framework for multimodal jailbreak against closed-source VLMs, which alleviates surrogate dependency under heterogeneous surrogate-target settings by reducing over-reliance on any single surrogate model and visual view. Specifically, Mosaic incorporates three core components: a Text-Side Transformation module, which perturbs refusal-sensitive lexical patterns; a Multi-View Image Optimization module, which updates perturbations under diverse cropped views to avoid overfitting to a single visual view; and a Surrogate Ensemble Guidance module, which aggregates optimization signals from multiple surrogate VLMs to reduce surrogate-specific bias. Extensive experiments on safety benchmarks demonstrate that Mosaic achieves state-of-the-art Attack Success Rate and Average Toxicity against commercial closed-source VLMs.
