Fugu-MT Paper Translation (Abstract): Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding

Paper overview: Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding

arxiv url: http://arxiv.org/abs/2606.09859v1
Date: Sun, 31 May 2026 13:02:00 GMT
Status: Translation complete
Last updated in system: 2026-06-10 15:40:57.957623
Title: Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding
Title (reference translation): Mitigating Manifold Departure: Uncertainty-Aware Subspace Rectification for Trustworthy MLLM Decoding
Authors: Yingxuan Zhuang, Jingxiao Yang, Miao Pan, Cheng Tan, Yuxiang Cai, Siwei Tan, Chen Zhi, Xuhong Zhang, Jianwei Yin, Jintao Chen
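Abstract summary: This paper proposes a geometry-aware, training-free decoding method that mitigates hallucinations while preserving representation structure. In experiments on POPE and CHAIR, MGAP outperformed prior decoding baselines.
Reference score (independently computed attention score): 38.148606143968806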
License: http://creativecommons.org/licenses/by/4.0/
Abstract: MLLMs frequently hallucinate objects inconsistent with visual inputs. This issue is typically attributed to the over-reliance on language priors, which can override the visual context. Recent training-free decoding strategies address this by penalizing language priors. However, these methods overlook the dual nature of language priors, where they can be both helpful and harmful depending on the alignment with visual evidence. In particular, blindly suppressing language priors often disrupts the model's semantic manifold, leading to performance degradation, a phenomenon we term Manifold Departure. To address this, we propose Manifold-Guided Adaptive Projection (MGAP), a geometry-aware, training-free decoding method that mitigates hallucinations while preserving representation structure. MGAP first constructs a language-prior subspace from blind hidden states via SVD. During decoding, MGAP projects each multimodal hidden state onto this subspace and applies a consistency-aware gate to adaptively attenuate only the projected prior component, yielding a subspace-selective update that largely preserves the orthogonal semantic components. Extensive experiments on POPE and CHAIR show that MGAP outperforms prior decoding baselines, achieving stronger hallucination suppression without sacrificing coherence.
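Abstract (reference translation): MLLMs frequently hallucinate objects that are inconsistent with the visual input. This problem is typically attributed to over-reliance on language priors, which can override the visual context. Recent training-free decoding strategies address it by penalizing language priors. However, these methods overlook the dual nature of language priors, which can be either helpful or harmful depending on how well they align with the visual evidence. In particular, blindly suppressing language priors often disrupts the model's semantic manifold and degrades performance, a phenomenon the authors call Manifold Departure. To address this, they propose Manifold-Guided Adaptive Projection (MGAP), a geometry-aware, training-free decoding method that mitigates hallucinations while preserving representation structure. MGAP first constructs a language-prior subspace from blind hidden states via SVD. During decoding, MGAP projects each multimodal hidden state onto this subspace and applies a consistency-aware gate that adaptively attenuates only the projected prior component, producing a subspace-selective update that largely preserves the orthogonal semantic components. Extensive experiments on POPE and CHAIR show that MGAP outperforms prior decoding baselines, achieving stronger hallucination suppression without sacrificing coherence.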
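
The procedure described in the abstract (build a subspace from blind hidden states via SVD, project each hidden state onto it, then attenuate only the projected part) can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation: the function names, the `rank` parameter, the reading of "blind" as hidden states collected without visual input, and the scalar `gate` in [0, 1] supplied by the caller are all assumptions, since the abstract does not say how the consistency-aware gate is computed.

```python
import numpy as np


def build_prior_subspace(blind_states: np.ndarray, rank: int) -> np.ndarray:
    """Return an orthonormal basis of shape (d, rank) for the prior subspace.

    blind_states has shape (n, d): one hidden state per row, assumed here to
    be collected without visual input (our reading of "blind").
    """
    # Rows of vt are the right-singular vectors, ordered by singular value.
    _, _, vt = np.linalg.svd(blind_states, full_matrices=False)
    return vt[:rank].T


def rectify(hidden: np.ndarray, basis: np.ndarray, gate: float) -> np.ndarray:
    """Attenuate only the part of `hidden` that lies in span(basis).

    `gate` is a hypothetical scalar in [0, 1]; the gating rule itself is not
    given in the abstract, so it is passed in rather than computed.
    """
    prior_component = basis @ (basis.T @ hidden)  # projection onto the subspace
    return hidden - gate * prior_component


# Toy usage with random data (illustrative only, not model activations).
rng = np.random.default_rng(0)
blind = rng.normal(size=(32, 8))
basis = build_prior_subspace(blind, rank=2)
h = rng.normal(size=8)
h_new = rectify(h, basis, gate=0.5)
```

In this sketch the component of the hidden state orthogonal to the subspace is left untouched for any gate value: `gate = 0` returns the hidden state unchanged, and `gate = 1` removes the projected prior component entirely.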
