Fugu-MT Paper Translation (Abstract): TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models

Paper summary: TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models

arxiv url: http://arxiv.org/abs/2509.24566v1
Date: Mon, 29 Sep 2025 10:19:22 GMT
Status: Translation complete
System update date: 2025-09-30 22:32:19.919242
Title: TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models
Authors: Zhifang Zhang, Qiqi Tao, Jiaqi Lv, Na Zhao, Lei Feng, Joey Tianyi Zhou
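Abstract summary: We introduce TokenSwap, a more evasive and stealthy backdoor attack on large vision-language models (LVLMs). Instead of enforcing a fixed target content, TokenSwap subtly disrupts the understanding of object relationships in text. TokenSwap achieves high attack success rates while maintaining superior evasiveness and stealthiness.
Reference score (independently computed attention score): 57.32952956674526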
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large vision-language models (LVLMs) have achieved impressive performance across a wide range of vision-language tasks, while they remain vulnerable to backdoor attacks. Existing backdoor attacks on LVLMs aim to force the victim model to generate a predefined target pattern, which is either inserted into or replaces the original content. We find that these fixed-pattern attacks are relatively easy to detect, because the attacked LVLM tends to memorize such frequent patterns in the training dataset, thereby exhibiting overconfidence on these targets given poisoned inputs. To address these limitations, we introduce TokenSwap, a more evasive and stealthy backdoor attack that focuses on the compositional understanding capabilities of LVLMs. Instead of enforcing a fixed targeted content, TokenSwap subtly disrupts the understanding of object relationships in text. Specifically, it causes the backdoored model to generate outputs that mention the correct objects in the image but misrepresent their relationships (i.e., bags-of-words behavior). During training, TokenSwap injects a visual trigger into selected samples and simultaneously swaps the grammatical roles of key tokens in the corresponding textual answers. However, the poisoned samples exhibit only subtle differences from the original ones, making it challenging for the model to learn the backdoor behavior. To address this, TokenSwap employs an adaptive token-weighted loss that explicitly emphasizes the learning of swapped tokens, such that the visual triggers and bags-of-words behavior are associated. Extensive experiments demonstrate that TokenSwap achieves high attack success rates while maintaining superior evasiveness and stealthiness across multiple benchmarks and various LVLM architectures.
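
The abstract names a token-weighted loss that emphasizes the swapped tokens but does not give its formula. As a rough illustration only, the sketch below marks two swapped positions in an answer and computes a weighted average negative log-likelihood in which those positions count more. The function names, the fixed weight `alpha`, and the normalization are assumptions made for this example; the paper's loss is described as adaptive, and this sketch does not reproduce it.

```python
import math


def swap_tokens(tokens, i, j):
    """Return a copy of `tokens` with positions i and j exchanged,
    plus a 0/1 mask marking the two swapped positions.
    (Hypothetical helper, not taken from the paper.)"""
    swapped = list(tokens)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    mask = [1 if k in (i, j) else 0 for k in range(len(tokens))]
    return swapped, mask


def token_weighted_nll(token_log_probs, swapped_mask, alpha=5.0):
    """Weighted average negative log-likelihood over answer tokens.

    Swapped positions get weight `alpha`, all others weight 1.0.
    A fixed `alpha` is an assumption; the abstract only says the
    weighting is adaptive and emphasizes swapped tokens."""
    weights = [alpha if m else 1.0 for m in swapped_mask]
    total = sum(w * -lp for w, lp in zip(weights, token_log_probs))
    return total / sum(weights)


if __name__ == "__main__":
    answer = ["the", "dog", "chases", "the", "cat"]
    swapped, mask = swap_tokens(answer, 1, 4)
    print(swapped, mask)

    # Toy per-token log-probabilities: the model is unsure only at
    # the two swapped positions.
    log_probs = [0.0, math.log(0.5), 0.0, 0.0, math.log(0.5)]
    print(token_weighted_nll(log_probs, mask, alpha=1.0))
    print(token_weighted_nll(log_probs, mask, alpha=3.0))
```

With `alpha=1.0` the function reduces to the ordinary mean negative log-likelihood. Raising `alpha` increases the share of the loss contributed by the swapped positions, which is the general effect the abstract attributes to its weighting.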


Related papers