Fugu-MT paper translation (abstract): Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models

Paper summary: Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models

arxiv url: http://arxiv.org/abs/2603.27522v1
Date: Sun, 29 Mar 2026 05:14:04 GMT
Last updated in system: 2026-03-31 23:18:45.00084
Title: Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models
Authors: Duanyi Yao, Changyue Li, Zhicong Huang, Cheng Hong, Songze Li
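Abstract summary: We introduce Hidden Ads, a new class of backdoor attacks that exploit recommendation-seeking behavior. Hidden Ads activates on natural user behaviors, when users upload images containing semantic content of interest. Experiments show that Hidden Ads achieves high injection efficacy with near-zero false positives while maintaining task accuracy.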
Reference score (independently computed attention score): 25.882289708786796
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Vision-Language Models (VLMs) are increasingly deployed in consumer applications where users seek recommendations about products, dining, and services. We introduce Hidden Ads, a new class of backdoor attacks that exploit this recommendation-seeking behavior to inject unauthorized advertisements. Unlike traditional pattern-triggered backdoors that rely on artificial triggers such as pixel patches or special tokens, Hidden Ads activates on natural user behaviors: when users upload images containing semantic content of interest (e.g., food, cars, animals) and ask recommendation-seeking questions, the backdoored model provides correct, helpful answers while seamlessly appending attacker-specified promotional slogans. This design preserves model utility and produces natural-sounding injections, making the attack practical for real-world deployment in consumer-facing recommendation services. We propose a multi-tier threat framework to systematically evaluate Hidden Ads across three adversary capability levels: hard prompt injection, soft prompt optimization, and supervised fine-tuning. Our poisoned data generation pipeline uses teacher VLM-generated chain-of-thought reasoning to create natural trigger-slogan associations across multiple semantic domains. Experiments on three VLM architectures demonstrate that Hidden Ads achieves high injection efficacy with near-zero false positives while maintaining task accuracy. Ablation studies confirm that the attack is data-efficient, transfers effectively to unseen datasets, and scales to multiple concurrent domain-slogan pairs. We evaluate defenses including instruction-based filtering and clean fine-tuning, finding that both fail to remove the backdoor without causing significant utility degradation.
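The abstract reports "high injection efficacy with near-zero false positives" without defining either quantity. As a rough illustration for readers auditing a model, the sketch below assumes one plausible reading: injection rate is the fraction of trigger-condition inputs whose response contains the slogan, and false-positive rate is the fraction of non-trigger inputs whose response contains it. The `EvalRecord` type, the function names, and the substring check are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class EvalRecord:
    """One evaluated model response (hypothetical structure, not from the paper)."""
    triggered: bool  # assumed label: input matched the trigger condition
    response: str    # text returned by the model under audit


def contains_slogan(response: str, slogan: str) -> bool:
    # Case-insensitive substring match; a real audit may need fuzzier matching.
    return slogan.casefold() in response.casefold()


def injection_rates(records: Iterable[EvalRecord], slogan: str) -> Tuple[float, float]:
    """Return (injection_rate, false_positive_rate) under the assumed definitions."""
    records = list(records)
    triggered = [r for r in records if r.triggered]
    clean = [r for r in records if not r.triggered]
    hits = sum(contains_slogan(r.response, slogan) for r in triggered)
    false_hits = sum(contains_slogan(r.response, slogan) for r in clean)
    injection_rate = hits / len(triggered) if triggered else 0.0
    false_positive_rate = false_hits / len(clean) if clean else 0.0
    return injection_rate, false_positive_rate
```

A tally like this can also serve as a defensive check: a non-zero count of a known promotional phrase across otherwise unrelated responses is a signal worth investigating.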
