Fugu-MT Paper Translation (Summary): GHOST: Hallucination-Inducing Image Generation for Multimodal LLMs

Paper overview: GHOST: Hallucination-Inducing Image Generation for Multimodal LLMs

arxiv url: http://arxiv.org/abs/2509.25178v1
Date: Mon, 29 Sep 2025 17:59:23 GMT
Status: Translation complete
System update date: 2025-10-01 17:09:04.154435
Title: GHOST: Hallucination-Inducing Image Generation for Multimodal LLMs
Authors: Aryan Yazdan Parast, Parsa Hosseini, Hesam Asadollahzadeh, Arshia Soltani Moakhar, Basim Azam, Soheil Feizi, Naveed Akhtar
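Abstract summary: This paper introduces GHOST, a method for stress-testing MLLMs by actively generating images that induce hallucination. GHOST is fully automatic and requires no human supervision or prior knowledge. The method is evaluated across a range of models, including reasoning models such as GLM-4.1V-Thinking, and achieves a hallucination success rate exceeding 28%, compared to around 1% for prior data-driven discovery methods.
Reference score (attention score computed independently by this site): 61.829473661517675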
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Object hallucination in Multimodal Large Language Models (MLLMs) is a persistent failure mode that causes the model to perceive objects absent in the image. This weakness of MLLMs is currently studied using static benchmarks with fixed visual scenarios, which preempts the possibility of uncovering model-specific or unanticipated hallucination vulnerabilities. We introduce GHOST (Generating Hallucinations via Optimizing Stealth Tokens), a method designed to stress-test MLLMs by actively generating images that induce hallucination. GHOST is fully automatic and requires no human supervision or prior knowledge. It operates by optimizing in the image embedding space to mislead the model while keeping the target object absent, and then guiding a diffusion model conditioned on the embedding to generate natural-looking images. The resulting images remain visually natural and close to the original input, yet introduce subtle misleading cues that cause the model to hallucinate. We evaluate our method across a range of models, including reasoning models like GLM-4.1V-Thinking, and achieve a hallucination success rate exceeding 28%, compared to around 1% in prior data-driven discovery methods. We confirm that the generated images are both high-quality and object-free through quantitative metrics and human evaluation. Also, GHOST uncovers transferable vulnerabilities: images optimized for Qwen2.5-VL induce hallucinations in GPT-4o at a 66.5% rate. Finally, we show that fine-tuning on our images mitigates hallucination, positioning GHOST as both a diagnostic and corrective tool for building more reliable multimodal systems.
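The embedding-space step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "model" here is a hypothetical logistic scorer `sigmoid(w . z + b)` standing in for an MLLM's "object present" answer, and the names `optimize_embedding`, `lam`, `lr`, and `threshold` are assumptions made for illustration. The sketch only shows the general idea of nudging an embedding until the scorer answers "yes" while a penalty keeps it close to the original embedding; the diffusion-guided image generation stage is not covered.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def optimize_embedding(z0, w, b, lam=0.1, lr=0.5, steps=200, threshold=0.5):
    """Toy gradient ascent on an embedding (illustrative assumption only).

    Maximizes log p(z) - lam * ||z - z0||^2, where p(z) = sigmoid(w . z + b)
    is a hypothetical stand-in for the probability that a model reports the
    target object. Stops once p(z) reaches `threshold`.
    """
    z = list(z0)
    for _ in range(steps):
        p = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)
        if p >= threshold:
            break
        # Gradient of log sigmoid(w.z + b) is (1 - p) * w;
        # gradient of lam * ||z - z0||^2 is 2 * lam * (z - z0).
        z = [
            zi + lr * ((1.0 - p) * wi - 2.0 * lam * (zi - z0i))
            for zi, wi, z0i in zip(z, w, z0)
        ]
    p_final = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)
    return z, p_final


if __name__ == "__main__":
    z0 = [0.0, 0.0, 0.0]   # original embedding (made-up values)
    w = [1.0, -2.0, 0.5]   # hypothetical scorer weights
    z, p = optimize_embedding(z0, w, b=-3.0)
    drift = math.sqrt(sum((a - c) ** 2 for a, c in zip(z, z0)))
    print(f"p(object present) = {p:.3f}, distance from original = {drift:.3f}")
```

The proximity penalty `lam` plays the role of keeping the result "close to the original input" in this toy setting; a larger value trades hallucination strength for smaller drift.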

Related papers