Fugu-MT 論文翻訳(概要): What Models Know, How Well They Know It: Knowledge-Weighted Fine-Tuning for Learning When to Say "I Don't Know"

論文の概要: What Models Know, How Well They Know It: Knowledge-Weighted Fine-Tuning for Learning When to Say "I Don't Know"

arxiv url: http://arxiv.org/abs/2604.05779v1
Date: Tue, 07 Apr 2026 12:17:30 GMT
ステータス: 翻訳完了
システム内更新日: 2026-04-08 17:42:09.813051
Title: What Models Know, How Well They Know It: Knowledge-Weighted Fine-Tuning for Learning When to Say "I Don't Know"
Title（参考訳）: モデルは何を知っているか、どのように知っているか: 「私は知らない」と言うときの学習のための知識に富んだ微調整
Authors: Joosung Lee, Hwiyeol Jo, Donghyeon Ko, Kyubyung Chae, Cheonbok Park, Jeonghoon Kim,
Abstract要約: マルチサンプル推論により,詳細なインスタンスレベルの知識スコアを推定する。モデルの既存の知識に従って学習信号をスケールし、スコープ外クエリに対する明示的な"私は知らない"応答を奨励します。
参考スコア（独自算出の注目度）: 15.460253787771324
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While large language models (LLMs) demonstrate strong capabilities across diverse user queries, they still suffer from hallucinations, often arising from knowledge misalignment between pre-training and fine-tuning. To address this misalignment, we reliably estimate a fine-grained, instance-level knowledge score via multi-sampled inference. Using the knowledge score, we scale the learning signal according to the model's existing knowledge, while encouraging explicit "I don't know" responses for out-of-scope queries. Experimental results show that this approach allows the model to explicitly express uncertainty when it lacks knowledge, while maintaining accuracy on questions it can answer. Furthermore, we propose evaluation metrics for uncertainty, showing that accurate discrimination between known and unknown instances consistently improves performance.
Abstract（参考訳）: 大きな言語モデル(LLM)は多様なユーザクエリにまたがる強力な能力を示しているが、幻覚に悩まされている。このミスアライメントに対処するために、マルチサンプル推論を用いて、きめ細かなインスタンスレベルの知識スコアを確実に推定する。知識スコアを用いて、モデルの既存の知識に従って学習信号をスケールし、スコープ外クエリに対する明示的な"私は知らない"応答を奨励する。実験結果から,このモデルでは,知識の欠如による不確実性を明確に表現すると同時に,回答可能な質問に対する正確性を維持することが可能であることが示唆された。さらに,不確実性の評価指標を提案し,未知のインスタンスと未知のインスタンスの正確な識別が常に性能を向上させることを示す。

論文の概要: What Models Know, How Well They Know It: Knowledge-Weighted Fine-Tuning for Learning When to Say "I Don't Know"

関連論文リスト