Fugu-MT Paper Translation (Abstract): Noise-Aware In-Context Learning for Hallucination Mitigation in ALLMs

Paper summary: Noise-Aware In-Context Learning for Hallucination Mitigation in ALLMs

arxiv url: http://arxiv.org/abs/2604.09021v1
Date: Fri, 10 Apr 2026 06:35:46 GMT
Status: Translation complete
Last updated in system: 2026-04-13 17:57:53.722146
Title: Noise-Aware In-Context Learning for Hallucination Mitigation in ALLMs
Title (reference translation): Noise-Aware In-Context Learning for Hallucination Mitigation in ALLMs
Authors: Qixuan Huang, Khalid Zaman, Masashi Unoki
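Abstract summary: Auditory large language models (ALLMs) have demonstrated strong general capabilities in audio understanding and reasoning tasks. To address the hallucination problem, the authors propose the Noise-Aware In-Context Learning (NAICL) method.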
Reference score (independently calculated attention level): 5.553031534100783
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Auditory large language models (ALLMs) have demonstrated strong general capabilities in audio understanding and reasoning tasks. However, their reliability is still undermined by hallucination issues. Existing hallucination evaluation methods are formulated as binary classification tasks, which are insufficient to characterize the more complex hallucination patterns that arise in generative tasks. Moreover, current hallucination mitigation strategies rely on fine-tuning, resulting in high computational costs. To address the above limitations, we propose a plug-and-play Noise-Aware In-Context Learning (NAICL) method. Specifically, we construct a noise prior library, retrieve noise examples relevant to the input audio, and incorporate them as contextual priors, thereby guiding the model to reduce speculative associations when acoustic evidence is insufficient and to adopt a more conservative generation strategy. In addition, we establish a hallucination benchmark for audio caption tasks, including the construction of the Clotho-1K multi-event benchmark dataset, the definition of four types of auditory hallucinations, and the introduction of metrics such as hallucination type distribution to support fine-grained analysis. Experimental results show that all evaluated ALLMs exhibit the same hallucination behaviors. Moreover, the proposed NAICL method reduces the overall hallucination rate from 26.53% to 16.98%.
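Abstract (reference translation): Auditory large language models (ALLMs) have demonstrated strong general capabilities in audio understanding and reasoning tasks. However, their reliability is undermined by hallucination problems. Existing hallucination evaluation methods are formulated as binary classification tasks and are insufficient to characterize the more complex hallucination patterns that arise in generative tasks. In addition, current hallucination mitigation strategies rely on fine-tuning and incur high computational costs. To address these limitations, we propose the NAICL method. Specifically, we construct a noise prior library, retrieve noise examples relevant to the input audio, and incorporate them as contextual priors, which reduces speculative associations when acoustic evidence is insufficient and leads to a more conservative generation strategy. Furthermore, we build a hallucination benchmark for audio captioning tasks, including the construction of the Clotho-1K multi-event benchmark dataset, the definition of four types of auditory hallucinations, and the introduction of metrics such as hallucination type distribution to support fine-grained analysis. Experiments show that all evaluated ALLMs exhibit the same hallucination behaviors. Furthermore, the proposed method reduces the overall hallucination rate from 26.53% to 16.98%.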
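
Illustrative sketch: The abstract describes NAICL only at a high level (build a noise prior library, retrieve noise examples relevant to the input audio, supply them as contextual priors). The following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each clip is represented by a fixed-length embedding vector, that relevance is measured by cosine similarity, and that retrieved examples are prepended to the prompt as text demonstrations. The names NoiseExample, retrieve_noise_examples, and build_prompt, the toy embeddings, the captions, and the instruction text are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class NoiseExample:
    """One entry in a hypothetical noise prior library."""
    embedding: List[float]  # assumed: precomputed audio embedding
    caption: str            # assumed: conservative description of the noise clip


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_noise_examples(query: List[float],
                            library: List[NoiseExample],
                            k: int = 2) -> List[NoiseExample]:
    """Return the k library entries most similar to the query embedding."""
    ranked = sorted(library,
                    key=lambda ex: cosine_similarity(query, ex.embedding),
                    reverse=True)
    return ranked[:k]


def build_prompt(examples: List[NoiseExample],
                 instruction: str = "Describe only the sound events that are clearly audible.") -> str:
    """Place retrieved noise examples before the task instruction as in-context priors."""
    lines = [f"Example {i} (noise clip): {ex.caption}"
             for i, ex in enumerate(examples, start=1)]
    lines.append(instruction)
    return "\n".join(lines)


# Toy library and query; real embeddings would come from an audio encoder.
library = [
    NoiseExample([1.0, 0.0, 0.0], "Steady background hiss with no identifiable event."),
    NoiseExample([0.0, 1.0, 0.0], "Low rumble; the source cannot be determined."),
    NoiseExample([0.0, 0.0, 1.0], "Intermittent crackle with no clear source."),
]
query_embedding = [0.9, 0.1, 0.0]

selected = retrieve_noise_examples(query_embedding, library, k=2)
prompt = build_prompt(selected)
print(prompt)
```

Because the examples are only added to the prompt, a scheme of this shape leaves the model weights untouched, which is consistent with the plug-and-play, fine-tuning-free framing in the abstract. How the paper actually embeds audio, scores relevance, and formats the context is not stated here.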

Related papers