Fugu-MT Paper Translation (Abstract): FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures

Paper overview: FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures

arxiv url: http://arxiv.org/abs/2603.06600v1
Date: Tue, 17 Feb 2026 06:15:19 GMT
Status: Translation complete
Last updated in system: 2026-03-15 16:38:22.413397
Title: FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures
Title (reference translation): FuzzingRL: Reinforcement Fuzz Testing for VLM Failures
Authors: Jiajun Xu, Jiageng Mao, Ang Qi, Weiduo Yuan, Alexander Romanus, Helen Xia, Vitor Campagnolo Guizilini, Yue Wang
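Abstract summary: Vision language models (VLMs) are prone to errors. Identifying where these errors occur is critical for ensuring the reliability and safety of AI systems. This paper proposes a method that automatically generates questions designed to deliberately induce incorrect responses.
Reference score (independently computed attention score): 41.693129607023245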
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vision Language Models (VLMs) are prone to errors, and identifying where these errors occur is critical for ensuring the reliability and safety of AI systems. In this paper, we propose an approach that automatically generates questions designed to deliberately induce incorrect responses from VLMs, thereby revealing their vulnerabilities. The core of this approach lies in fuzz testing and reinforcement fine-tuning: we transform a single input query into a large set of diverse variants through vision and language fuzzing. Based on the fuzzing outcomes, the question generator is further instructed by adversarial reinforcement fine-tuning to produce increasingly challenging queries that trigger model failures. With this approach, we can consistently drive down a target VLM's answer accuracy; for example, the accuracy of Qwen2.5-VL-32B on our generated questions drops from 86.58% to 65.53% in four RL iterations. Moreover, a fuzzing policy trained against a single target VLM transfers to multiple other VLMs, producing challenging queries that degrade their performance as well.
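Abstract (reference translation): Vision language models (VLMs) are prone to errors, so identifying where these errors occur is important for ensuring the reliability and safety of AI systems. This paper proposes a method that automatically generates questions designed to deliberately induce incorrect responses from VLMs and thereby reveal their vulnerabilities. A single input query is transformed into a large set of diverse variants through vision and language fuzzing. Based on the fuzzing results, the question generator is further guided by adversarial reinforcement fine-tuning to produce increasingly difficult queries that trigger model failures. For example, the accuracy of Qwen2.5-VL-32B on the generated questions drops from 86.58% to 65.53% in four RL iterations. Furthermore, a fuzzing policy trained against a single target VLM transfers to multiple other VLMs.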
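
The loop described in the abstract (fuzz a seed query into variants, score each variant by whether the target model fails, and shift the generator toward failure-inducing variants) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the three mutation operators, the `toy_target` stand-in for a VLM, and the simple bandit-style weight update, which replaces the paper's adversarial reinforcement fine-tuning of a question generator. It covers language fuzzing only and is not the authors' implementation.

```python
import random
from typing import Callable, Dict

# Hypothetical answer-preserving language mutations (not the paper's operators).
MUTATIONS: Dict[str, Callable[[str], str]] = {
    "prefix_context": lambda q: "Looking carefully at the image, " + q,
    "add_distractor": lambda q: q + " Note that a red ball is nearby.",
    "uppercase": lambda q: q.upper(),
}


def toy_target(question: str) -> str:
    """Stand-in for the target VLM: it is fooled by distractor sentences."""
    return "no" if "note that" in question.lower() else "yes"


def train_fuzzing_policy(
    seed_question: str,
    truth: str,
    target: Callable[[str], str],
    steps: int = 200,
    lr: float = 0.5,
    seed: int = 0,
) -> Dict[str, float]:
    """Bandit-style stand-in for adversarial RL over mutation operators."""
    rng = random.Random(seed)
    weights = {name: 1.0 for name in MUTATIONS}
    names = list(weights)
    for _ in range(steps):
        # Sample one mutation in proportion to its current weight.
        name = rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
        variant = MUTATIONS[name](seed_question)
        # Adversarial reward: 1 when the target answers incorrectly.
        reward = 1.0 if target(variant) != truth else 0.0
        weights[name] += lr * reward
    return weights


if __name__ == "__main__":
    learned = train_fuzzing_policy("Is the cup on the table?", "yes", toy_target)
    print(max(learned, key=learned.get))
```

Only the mutation that makes the stand-in model answer incorrectly is ever rewarded, so the policy concentrates on it over time. This mirrors, in miniature, how the generated questions are described as becoming harder across RL iterations.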

Related papers