Fugu-MT Paper Translation (Abstract): ConfTuner: Training Large Language Models to Express Their Confidence Verbally

Paper summary: ConfTuner: Training Large Language Models to Express Their Confidence Verbally

arxiv url: http://arxiv.org/abs/2508.18847v1
Date: Tue, 26 Aug 2025 09:25:32 GMT
Status: Translation complete
System update date: 2025-08-27 17:42:38.781401
Title: ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Title (reference translation): ConfTuner: Training Large Language Models to Express Their Confidence Verbally
Authors: Yibo Li, Miao Xiong, Jiaying Wu, Bryan Hooi
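Abstract summary: Large language models (LLMs) are increasingly deployed in high-stakes domains such as science, law, and healthcare. LLMs are often observed to generate incorrect answers with high confidence, a phenomenon known as "overconfidence".
Reference score (independently computed attention score): 58.63318088243125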
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly deployed in high-stakes domains such as science, law, and healthcare, where accurate expressions of uncertainty are essential for reliability and trust. However, current LLMs are often observed to generate incorrect answers with high confidence, a phenomenon known as "overconfidence". Recent efforts have focused on calibrating LLMs' verbalized confidence: i.e., their expressions of confidence in text form, such as "I am 80% confident that...". Existing approaches either rely on prompt engineering or fine-tuning with heuristically generated uncertainty estimates, both of which have limited effectiveness and generalizability. Motivated by the notion of proper scoring rules for calibration in classical machine learning models, we introduce ConfTuner, a simple and efficient fine-tuning method that introduces minimal overhead and does not require ground-truth confidence scores or proxy confidence estimates. ConfTuner relies on a new loss function, tokenized Brier score, which we theoretically prove to be a proper scoring rule, intuitively meaning that it "correctly incentivizes the model to report its true probability of being correct". ConfTuner improves calibration across diverse reasoning tasks and generalizes to black-box models such as GPT-4o. Our results further show that better-calibrated confidence enables downstream gains in self-correction and model cascade, advancing the development of trustworthy LLM systems. The code is available at https://github.com/liushiliushi/ConfTuner.
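Abstract (reference translation): Large language models (LLMs) are increasingly deployed in high-stakes domains such as science, law, and healthcare, where accurate expressions of uncertainty are essential for reliability and trust. However, current LLMs are often observed to produce incorrect answers with high confidence, a phenomenon known as "overconfidence". Recent work has focused on calibrating LLMs' verbalized confidence, that is, their expressions of confidence in text form. Existing approaches rely on either prompt engineering or fine-tuning with heuristically generated uncertainty estimates, both of which are limited in effectiveness and generalizability. Inspired by the notion of proper scoring rules for calibration in classical machine learning models, ConfTuner is a simple and efficient fine-tuning method that introduces minimal overhead and requires neither ground-truth confidence scores nor proxy confidence estimates. ConfTuner relies on a new loss function, the tokenized Brier score, which is theoretically proven to be a proper scoring rule, intuitively meaning that it "correctly incentivizes the model to report its true probability of being correct". ConfTuner improves calibration across diverse reasoning tasks and generalizes to black-box models such as GPT-4o. The results further show that better-calibrated confidence enables downstream gains in self-correction and model cascades, advancing the development of trustworthy LLM systems. The code is available at https://github.com/liushiliushi/ConfTuner.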
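
The abstract's statement that a proper scoring rule "correctly incentivizes the model to report its true probability of being correct" can be illustrated with a small sketch. The ordinary Brier score below is standard. The `tokenized_brier` function and the 11-level `LEVELS` grid are assumptions made for illustration only (one plausible reading of a Brier score taken in expectation over a model's distribution on confidence tokens); they are not taken from the paper or its repository.

```python
# Hypothetical grid of verbalized confidence levels: "0%", "10%", ..., "100%".
LEVELS = [i / 10 for i in range(11)]


def brier(confidence: float, correct: int) -> float:
    """Standard Brier score for one prediction (correct is 0 or 1)."""
    return (confidence - correct) ** 2


def expected_brier(confidence: float, p_true: float) -> float:
    """Expected Brier score when the answer is correct with probability p_true."""
    return p_true * brier(confidence, 1) + (1 - p_true) * brier(confidence, 0)


def best_level(p_true: float) -> float:
    """Confidence level on the grid that minimizes the expected Brier score."""
    return min(LEVELS, key=lambda c: expected_brier(c, p_true))


def tokenized_brier(token_probs: list[float], correct: int) -> float:
    """Illustrative assumption: Brier score averaged over the probability
    the model assigns to each confidence token in LEVELS."""
    return sum(p * brier(c, correct) for p, c in zip(token_probs, LEVELS))


if __name__ == "__main__":
    # If the model is right 80% of the time, reporting 0.8 gives the lowest
    # expected loss; over- or under-stating confidence is penalized.
    for c in (0.6, 0.8, 1.0):
        print(c, round(expected_brier(c, 0.8), 4))
    print("best level:", best_level(0.8))
```

Algebraically, the expected loss is c^2 - 2pc + p for reported confidence c and true probability p, which is minimized at c = p. That is the sense in which the Brier score is proper.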

Related papers