Fugu-MT Paper Translation (Abstract): VOPE: Revisiting Hallucination of Vision-Language Models in Voluntary Imagination Task

Paper overview: VOPE: Revisiting Hallucination of Vision-Language Models in Voluntary Imagination Task

arxiv url: http://arxiv.org/abs/2511.13420v1
Date: Mon, 17 Nov 2025 14:32:06 GMT
Status: Translation complete
Last updated in system: 2025-11-18 14:36:25.30612
Title: VOPE: Revisiting Hallucination of Vision-Language Models in Voluntary Imagination Task
Title (reference translation): VOPE: Rethinking the Hallucinations of Vision-Language Models in Voluntary Imagination Tasks
Authors: Xingming Long, Jie Zhang, Shiguang Shan, Xilin Chen
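Abstract summary: This paper introduces Voluntary-imagined Object Presence Evaluation (VOPE) to assess LVLM hallucinations in voluntary imagination tasks. VOPE poses recheck-based questions to evaluate how an LVLM interprets the presence of imagined objects in its own response. The consistency between the model's interpretation and the object's presence in the image is used to determine whether the model hallucinates when generating the response.
Reference score (independently computed attention score): 73.75049937317506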
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Most research on hallucinations in Large Vision-Language Models (LVLMs) focuses on factual description tasks that prohibit any output absent from the image. However, little attention has been paid to hallucinations in voluntary imagination tasks, e.g., story writing, where the models are expected to generate novel content beyond the given image. In these tasks, it is inappropriate to simply regard such imagined novel content as hallucinations. To address this limitation, we introduce Voluntary-imagined Object Presence Evaluation (VOPE), a novel method to assess LVLMs' hallucinations in voluntary imagination tasks via presence evaluation. Specifically, VOPE poses recheck-based questions to evaluate how an LVLM interprets the presence of the imagined objects in its own response. The consistency between the model's interpretation and the object's presence in the image is then used to determine whether the model hallucinates when generating the response. We apply VOPE to several mainstream LVLMs and hallucination mitigation methods, revealing two key findings: (1) most LVLMs hallucinate heavily during voluntary imagination, and their performance in presence evaluation is notably poor on imagined objects; (2) existing hallucination mitigation methods show limited effect in voluntary imagination tasks, making this an important direction for future research.
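Abstract (reference translation): Research on hallucinations in Large Vision-Language Models (LVLMs) focuses on factual description tasks that prohibit any output absent from the image. However, little attention has been paid to hallucinations in voluntary imagination tasks, such as story writing, where models are expected to generate novel content beyond the given image. In these tasks, it is inappropriate to regard such imagined novel content as hallucination. To address this limitation, we introduce Voluntary-imagined Object Presence Evaluation (VOPE), which assesses LVLM hallucinations in voluntary imagination tasks via presence evaluation. Specifically, VOPE poses recheck-based questions to evaluate how an LVLM interprets the presence of imagined objects in its own response. The consistency between the model's interpretation and the object's presence in the image is then used to determine whether the model hallucinates when generating the response. We apply VOPE to several mainstream LVLMs and hallucination mitigation methods and find that most LVLMs hallucinate heavily during voluntary imagination.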
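
The recheck-and-compare step described in the abstract can be illustrated with a minimal Python sketch. It assumes that, for each object mentioned in a model's response, we already have (a) the model's answer to a recheck question about whether that object is in the image and (b) a ground-truth annotation of its presence. The names `RecheckResult`, `is_consistent`, and `inconsistency_rate` are hypothetical, and treating every mismatch as a hallucination is an assumption; the abstract does not specify the paper's exact scoring rule.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RecheckResult:
    """One object mentioned in a model's response (hypothetical structure)."""
    obj: str
    model_says_in_image: bool  # model's answer to the recheck question
    actually_in_image: bool    # ground-truth annotation for the image


def is_consistent(result: RecheckResult) -> bool:
    # Assumption: the model's interpretation is consistent when it matches
    # the object's real presence in the image.
    return result.model_says_in_image == result.actually_in_image


def inconsistency_rate(results: Sequence[RecheckResult]) -> float:
    # Fraction of mentioned objects whose recheck answer contradicts the image.
    # This aggregate is illustrative, not the metric reported in the paper.
    if not results:
        return 0.0
    mismatches = sum(1 for r in results if not is_consistent(r))
    return mismatches / len(results)


# Toy data, invented for illustration only.
results = [
    RecheckResult("dog", model_says_in_image=True, actually_in_image=True),
    RecheckResult("dragon", model_says_in_image=True, actually_in_image=False),
    RecheckResult("castle", model_says_in_image=False, actually_in_image=False),
    RecheckResult("tree", model_says_in_image=False, actually_in_image=True),
]
rate = inconsistency_rate(results)
```

In this toy data, "castle" is imagined content that the model correctly recognizes as absent from the image, so it is not counted as a hallucination, whereas "dragon" is imagined content the model wrongly believes to be in the image.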

Related papers