Fugu-MT Paper Translation (Abstract): PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models

Paper overview: PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models

arxiv url: http://arxiv.org/abs/2606.24388v1
Date: Tue, 23 Jun 2026 10:20:40 GMT
Status: Translation complete
Last updated in system: 2026-06-24 22:16:48.89872
Title: PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models
Title (reference translation): PHANTOM: A Large-Scale Multimodal Attack Dataset for Vision-Language Models
Authors: Simone Gallivanone, Hossein Khodadadi, Mauro Dore, Mauro Medda, Nicola Franco
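Abstract summary: This paper presents a large-scale, open-source dataset of pre-generated adversarial attacks against vision-language models (VLMs). The dataset is designed to be diverse, representative, and practical, extending existing benchmarks by covering 10 high-level categories and 55 subcategories of harmful intents. It comprises 47 524 adversarial samples generated using state-of-the-art attack strategies from recent literature.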
Reference score (independently computed attention score): 0.815557531820863
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a large-scale, open-source dataset of pre-generated adversarial attacks for vision-language models (VLMs). The dataset is designed to be diverse, representative, and practical, extending existing benchmarks by covering 10 high-level categories and 55 subcategories of harmful intents. Our primary goal is to make adversarial data accessible to the research community, given the computational cost and complexity of generating large numbers of attacks. The dataset comprises 47 524 adversarial samples, generated using state-of-the-art attack strategies from recent literature. Our work complements existing efforts by consolidating and extending prior benchmarks from multiple established sources, resulting in 7 826 intents, and introduces an additional category to broaden coverage. This provides realistic evaluation resources for studying model robustness and alignment. Our dataset intends to enable researchers and practitioners to systematically evaluate the robustness and safety of VLMs, fine-tune attack-generation models, and develop or stress-test defensive guardrails under diverse adversarial conditions. By releasing this resource, we aim to lower the barrier to adversarial research and foster more reproducible, comprehensive, and comparable evaluations of VLM safety.
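Abstract (reference translation): This paper presents a large-scale, open-source dataset of pre-generated adversarial attacks against vision-language models (VLMs). The dataset is designed to be diverse, representative, and practical, extending existing benchmarks by covering 10 high-level categories and 55 subcategories of harmful intents. Our primary goal is to make adversarial data accessible to the research community, given the computational cost and complexity of generating attacks in large numbers. The dataset comprises 47 524 adversarial samples generated using state-of-the-art attack strategies from recent literature. Our work complements existing efforts by consolidating and extending prior benchmarks from multiple established sources, which yields 7 826 intents, and it introduces an additional category to broaden coverage. This provides realistic evaluation resources for studying model robustness and alignment. The dataset is intended to let researchers and practitioners systematically evaluate the robustness and safety of VLMs, fine-tune attack-generation models, and develop or stress-test defensive guardrails under diverse adversarial conditions. By releasing this resource, we aim to lower the barrier to adversarial research and foster more reproducible, comprehensive, and comparable evaluations of VLM safety.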

Related papers