Fugu-MT Paper Translation (Abstract): HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models

Paper summary: HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models

arxiv url: http://arxiv.org/abs/2604.06165v2
Date: Fri, 10 Apr 2026 16:49:15 GMT
Status: Translation complete
System update date: 2026-04-13 13:51:27.661193
Title: HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models
Authors: Reihaneh Zohrabi, Hosein Hasani, Akshita Gupta, Mahdieh Soleymani Baghshah, Anna Rohrbach, Marcus Rohrbach
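Abstract summary: This paper introduces HaloProbe, a framework that factorizes external description statistics and internal decoding signals to estimate token-level hallucination probabilities. Experiments show that HaloProbe-guided decoding reduces hallucinations more effectively than state-of-the-art intervention-based methods.
Reference score (independently computed attention level): 24.95822790180999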
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large vision-language models can produce object hallucinations in image descriptions, highlighting the need for effective detection and mitigation strategies. Prior work commonly relies on the model's attention weights on visual tokens as a detection signal. We reveal that coarse-grained attention-based analysis is unreliable due to hidden confounders, specifically token position and object repetition in a description. This leads to Simpson's paradox: the attention trends reverse or disappear when statistics are aggregated. Based on this observation, we introduce HaloProbe, a Bayesian framework that factorizes external description statistics and internal decoding signals to estimate token-level hallucination probabilities. HaloProbe uses balanced training to isolate internal evidence and combines it with a learned prior over external features to recover the true posterior. While intervention-based mitigation methods often degrade utility or fluency by modifying models' internals, we use HaloProbe as an external scoring signal for non-invasive mitigation. Our experiments show that HaloProbe-guided decoding reduces hallucinations more effectively than state-of-the-art intervention-based methods while preserving utility.
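The abstract's idea of combining internal evidence from a balanced classifier with a prior over external features can be illustrated with the standard prior-shift correction from Bayes' rule. The following is a minimal sketch, not the authors' implementation: the function name `combine_with_prior` and its parameters `p_balanced` and `prior` are hypothetical, and it assumes the probe was trained on data with equal numbers of hallucinated and non-hallucinated tokens, so that its output odds equal the likelihood ratio of the internal signal.

```python
def combine_with_prior(p_balanced: float, prior: float) -> float:
    """Combine a balanced-classifier score with a prior via Bayes' rule.

    p_balanced: hypothetical probe output P(hallucinated | internal signal),
                assumed to come from training on a 50/50 class-balanced set.
    prior:      hypothetical prior P(hallucinated | external features), e.g.
                estimated from token position and object repetition.

    With balanced training, p / (1 - p) is the likelihood ratio of the
    internal evidence, so posterior odds = likelihood ratio * prior odds.
    """
    if not (0.0 <= p_balanced <= 1.0) or not (0.0 <= prior <= 1.0):
        raise ValueError("both inputs must be probabilities in [0, 1]")
    numerator = p_balanced * prior
    denominator = numerator + (1.0 - p_balanced) * (1.0 - prior)
    if denominator == 0.0:
        # Evidence and prior are both certain and contradict each other.
        raise ValueError("posterior is undefined for contradictory certainties")
    return numerator / denominator


if __name__ == "__main__":
    # Uninformative internal evidence leaves the prior unchanged.
    print(combine_with_prior(0.5, 0.2))
    # The same internal score gives a lower posterior under a low prior.
    print(combine_with_prior(0.8, 0.1))
```

Under these assumptions, an uninformative probe score of 0.5 returns the prior unchanged, and a flat prior of 0.5 returns the probe score unchanged, which is the behavior one would expect from separating the two sources of evidence.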

Related papers