Fugu-MT Paper Translation (Abstract): Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection

Paper overview: Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection

arxiv url: http://arxiv.org/abs/2604.09024v1
Date: Fri, 10 Apr 2026 06:37:46 GMT
Status: Translation complete
Last updated in system: 2026-04-13 17:57:53.72641
Title: Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection
Authors: Zedian Shao, Hongbin Liu, Yuepeng Hu, Neil Zhenqiang Gong
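Abstract summary: Multi-modal large language models (MLLMs) have emerged as powerful tools for analyzing Internet-scale image data. In particular, open-weight MLLMs may be misused to extract sensitive information from personal images at scale. In this work, we propose ImageProtector, a user-side method that proactively protects images before they are shared.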
Reference score (independently computed attention level): 37.48710514852417
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-modal large language models (MLLMs) have emerged as powerful tools for analyzing Internet-scale image data, offering significant benefits but also raising critical safety and societal concerns. In particular, open-weight MLLMs may be misused to extract sensitive information from personal images at scale, such as identities, locations, or other private details. In this work, we propose ImageProtector, a user-side method that proactively protects images before sharing by embedding a carefully crafted, nearly imperceptible perturbation that acts as a visual prompt injection attack on MLLMs. As a result, when an adversary analyzes a protected image with an MLLM, the MLLM is consistently induced to generate a refusal response such as "I'm sorry, I can't help with that request." We empirically demonstrate the effectiveness of ImageProtector across six MLLMs and four datasets. Additionally, we evaluate three potential countermeasures, Gaussian noise, DiffPure, and adversarial training, and show that while they partially mitigate the impact of ImageProtector, they simultaneously degrade model accuracy and/or efficiency. Our study focuses on the practically important setting of open-weight MLLMs and large-scale automated image analysis, and highlights both the promise and the limitations of perturbation-based privacy protection.
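The abstract does not say how the perturbation is computed or how its size is limited, so the following is only a minimal sketch of the two generic operations it mentions: adding a small, bounded perturbation to an image, and the Gaussian-noise countermeasure. The function names, the per-pixel budget `epsilon`, the noise level `sigma`, and the placeholder `delta` values are all assumptions for illustration; the optimization that would make a perturbation induce a refusal is not shown.

```python
import random

def apply_perturbation(pixels, delta, epsilon=8):
    """Add a perturbation to 8-bit pixel values.

    Assumption: each pixel may change by at most +/- epsilon
    (an illustrative budget, not a value taken from the paper).
    """
    protected = []
    for p, d in zip(pixels, delta):
        d = max(-epsilon, min(epsilon, d))          # bound the per-pixel change
        protected.append(max(0, min(255, p + d)))   # keep a valid 8-bit value
    return protected

def gaussian_noise_defense(pixels, sigma=4.0, seed=0):
    """Countermeasure sketch: add zero-mean Gaussian noise before analysis.

    sigma and seed are hypothetical parameters chosen for this example.
    """
    rng = random.Random(seed)
    return [max(0, min(255, round(p + rng.gauss(0.0, sigma)))) for p in pixels]

image = [0, 64, 128, 255]   # a tiny stand-in for real image data
delta = [-20, 3, 12, 5]     # placeholder values, not an optimized perturbation
protected = apply_perturbation(image, delta)
noisy = gaussian_noise_defense(protected)
print(protected)
```

In this sketch the noise is added by whoever analyzes the image, not by the user who protects it, which fits the abstract's observation that such countermeasures can cost the analyzing model accuracy and/or efficiency.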


Related papers