Fugu-MT Paper Translation (Abstract): Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs

Paper overview: Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs

arxiv url: http://arxiv.org/abs/2605.07806v1
Date: Fri, 08 May 2026 14:41:35 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.122041
Title: Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs
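Title (reference translation): Beyond Confidence: Rethinking Self-Assessment for Performance Prediction in LLMs
Authors: Sree Bhattacharyya, Samarth Khanna, Leona Chen, Lucas Craig, Tharun Dilliraj, James Z. Wang
Abstract summary: The paper proposes a multidimensional perspective on model self-assessment. It elicits six appraisal-based dimensions of self-assessment alongside confidence. Competence-related appraisal dimensions, particularly effort and ability, consistently match or outperform confidence.
Reference score (independently computed attention score): 3.532798393283516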
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly used in settings where reliable self-assessment is critical. Assessing model reliability has evolved from using probabilistic correctness estimates to, more recently, eliciting verbalized confidence. Confidence, however, has been shown to be an inconsistent and overoptimistic predictor of model correctness. Drawing on cognitive appraisal theory, a framework from human psychology that decomposes self-evaluation into multiple components, we propose a multidimensional perspective on model self-assessment. We elicit six appraisal-based dimensions of self-assessment, alongside confidence, and evaluate their utility for predicting model failure across 12 LLMs and 38 tasks spanning eight domains. We find that competence-related appraisal dimensions, particularly effort and ability, consistently match or outperform confidence across most settings. Effort additionally yields less overoptimistic estimates that remain stable across model sizes. In contrast, affective dimensions provide marginally predictive signals. Furthermore, the most informative dimension varies systematically with task characteristics: effort is most predictive for reasoning-intensive tasks, while ability and confidence dominate on retrieval-oriented tasks. Broadly, our findings indicate that structured multidimensional self-assessment is a promising approach to improving the reliability and safety of language model deployment across diverse real-world settings.
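Abstract (reference translation): Large language models (LLMs) are increasingly used in settings where reliable self-assessment matters. The assessment of model reliability has evolved from probabilistic correctness estimates to, more recently, eliciting verbalized confidence. Confidence, however, has been shown to be an inconsistent and overoptimistic predictor of model correctness. Building on cognitive appraisal theory, a framework from human psychology that decomposes self-evaluation into multiple components, the authors propose a multidimensional perspective on model self-assessment. They elicit six appraisal-based dimensions of self-assessment alongside confidence and evaluate how useful each is for predicting model failure across 12 LLMs and 38 tasks spanning eight domains. Competence-related appraisal dimensions, particularly effort and ability, consistently match or outperform confidence in most settings. Effort additionally yields less overoptimistic estimates that remain stable across model sizes. Affective dimensions, by contrast, provide only marginally predictive signals. The most informative dimension also varies systematically with task characteristics: effort is most predictive for reasoning-intensive tasks, while ability and confidence dominate on retrieval-oriented tasks. Overall, the findings indicate that structured multidimensional self-assessment is a promising approach to improving the reliability and safety of language model deployment across diverse real-world settings.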
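
Illustrative sketch (not from the paper): the abstract compares self-assessment dimensions by how well they predict model failure, but it does not name the metric or the rating scale. The Python below assumes AUROC as the metric and 1-5 self-ratings, and it runs on made-up toy data. The names `auroc`, `rank_dimensions`, and `higher_means_failure` are hypothetical helpers introduced here for illustration only.

```python
from typing import Dict, Sequence


def auroc(failure_scores: Sequence[float], failed: Sequence[bool]) -> float:
    """Probability that a randomly chosen failed item gets a higher failure
    score than a randomly chosen successful item (ties count as 0.5)."""
    pos = [s for s, f in zip(failure_scores, failed) if f]
    neg = [s for s, f in zip(failure_scores, failed) if not f]
    if not pos or not neg:
        raise ValueError("need at least one failure and one success")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_dimensions(
    ratings: Dict[str, Sequence[float]],
    failed: Sequence[bool],
    higher_means_failure: Dict[str, bool],
) -> Dict[str, float]:
    """Score each self-assessment dimension as a failure predictor.

    Assumption: some dimensions point toward failure when high (e.g. effort),
    others when low (e.g. confidence), so low-is-bad ratings are negated.
    """
    result = {}
    for name, values in ratings.items():
        sign = 1.0 if higher_means_failure[name] else -1.0
        result[name] = auroc([sign * v for v in values], failed)
    return result


# Hypothetical toy data: six items, two self-rated dimensions (1-5 scale).
failed = [True, True, False, False, True, False]
ratings = {
    "effort": [5, 4, 2, 1, 4, 3],
    "confidence": [4, 5, 5, 4, 3, 5],
}
higher_means_failure = {"effort": True, "confidence": False}

scores = rank_dimensions(ratings, failed, higher_means_failure)
for name, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUROC = {value:.3f}")
```

An AUROC of 0.5 means the dimension carries no signal about failure, and 1.0 means it separates failures from successes perfectly. The toy numbers are chosen only to exercise the code and say nothing about the paper's results.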

Related papers