Fugu-MT Paper Translation (Abstract): LLMs Should Not Yet Be Credited with Decision Explanation

Paper overview: LLMs Should Not Yet Be Credited with Decision Explanation

arxiv url: http://arxiv.org/abs/2605.01164v1
Date: Fri, 01 May 2026 23:46:29 GMT
Status: Translation complete
Last updated in system: 2026-05-05 20:33:49.619653
Title: LLMs Should Not Yet Be Credited with Decision Explanation
Authors: Wenshuo Wang
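Abstract summary: Recent work increasingly treats accurate behavioral prediction, plausible rationales, and outcome-conditioned reasoning traces as evidence that LLMs explain why people decide as they do. Stronger claims should specify explanatory targets, discriminate against weaker rationalizer alternatives, use target-appropriate process- or intervention-sensitive validation, and bound their scope. If adopted, the paper's principle of credit calibration can help turn LLMs from persuasive narrators of decisions into more reliable instruments for discovering, testing, and communicating explanations of human behavior.
Reference score (attention level computed independently by this site): 3.0001636668817597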
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This position paper argues that LLMs should not yet be credited with decision explanation. This matters because recent work increasingly treats accurate behavioral prediction, plausible rationales, and outcome-conditioned reasoning traces as evidence that LLMs explain why people decide as they do, risking a premature redefinition of what counts as explanatory progress in human decision modeling. We first distinguish three claims with different evidential burdens: decision prediction, rationale generation, and decision explanation. We then argue that the evidence most commonly offered for LLM-based decision accounts directly supports the first two claims, and sometimes explanatory hypothesis generation, but does not distinguish decision explanation from prediction-supportive rationalization. Next, we propose a bridge standard for decision-explanation credit: stronger claims should specify explanatory targets, discriminate against weaker rationalizer alternatives, use target-appropriate process- or intervention-sensitive validation, and bound their scope. We then situate this standard against competing views and related literatures, clarifying why it preserves the value of LLMs as predictors, narrators, and hypothesis generators while resisting premature explanatory credit. We conclude with a principle of credit calibration: LLMs should be credited for the strongest claim their evidence warrants, and no stronger; if adopted, this principle can help turn LLMs from persuasive narrators of decisions into more reliable instruments for discovering, testing, and communicating explanations of human behavior.

Related papers

Faithfulness Serum: Mitigating the Faithfulness Gap in Textual Explanations of LLM Decisions via Attribution Guidance [57.17102098930037]
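Large language models (LLMs) achieve high performance and have revolutionized NLP. Their lack of explainability means they are treated as black boxes, which limits their use in domains that demand transparency and trust. This work proposes a training-free method that improves faithfulness by guiding explanation generation through attention-level interventions.
Paper reference translation (metadata) (2026-04-15T18:32:32Z)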
Towards Generalizable Reasoning: Group Causal Counterfactual Policy Optimization for LLM Reasoning [50.352417879912515]
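Large language models (LLMs) excel at complex tasks as their reasoning capabilities advance. To explicitly train LLMs to learn generalizable reasoning patterns, the authors propose Group Causal Counterfactual Policy Optimization. Token-level advantages are then constructed from this reward and used to optimize the policy, encouraging LLMs to favor reasoning patterns that are process-valid and counterfactually robust.
Paper reference translation (metadata) (2026-02-06T08:03:11Z)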
Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
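This work studies how large language models (LLMs) explain their generations through rationales. It shows that the proposed method is less "faithful" than attribution-based explanations.
Paper reference translation (metadata) (2024-06-28T20:06:30Z)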
Argumentative Large Language Models for Explainable and Contestable Claim Verification [13.045050015831903]
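This paper introduces ArgLLMs, a method for augmenting large language models with argumentative reasoning. ArgLLMs construct argumentation frameworks, which then serve as the basis for formal reasoning in support of decision-making. The performance of ArgLLMs is evaluated experimentally against state-of-the-art techniques.
Paper reference translation (metadata) (2024-05-03T13:12:28Z)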
Assessing the Reasoning Capabilities of LLMs in the context of Evidence-based Claim Verification [22.92500697622486]
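The authors propose a framework that decomposes claims paired with evidence into atomic reasoning types. They use this framework to create RECV, the first claim verification benchmark incorporating real-world claims. Three state-of-the-art LLMs are evaluated under multiple prompt settings.
Paper reference translation (metadata) (2024-02-16T14:52:05Z)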
Self-Contradictory Reasoning Evaluation and Detection [31.452161594896978]
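This paper studies self-contradictory reasoning (Self-Contra). LLMs often contradict themselves when reasoning on tasks that involve contextual information understanding or commonsense. GPT-4 can detect Self-Contra with an F1 score of 52.2%.
Paper reference translation (metadata) (2023-11-16T06:22:17Z)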
The ART of LLM Refinement: Ask, Refine, and Trust [85.75059530612882]
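The authors propose a reasoning-with-refinement objective called ART: Ask, Refine, and Trust. It asks the questions needed to decide when an LLM should refine its output. It achieves a performance gain of +5 points over self-refinement baselines.
Paper reference translation (metadata) (2023-11-14T07:26:32Z)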
DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy [76.58614128865652]
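The authors propose DetermLR, a novel perspective that rethinks the reasoning process as an evolution from indeterminacy to determinacy. First, known conditions are categorized into two types: determinate and indeterminate premises. This provides an overall direction for the reasoning process and guides LLMs in converting indeterminate data into progressively determinate insights. The storage and extraction of available premises and reasoning paths are automated with a reasoning memory, which preserves historical reasoning details for subsequent reasoning steps.
Paper reference translation (metadata) (2023-10-28T10:05:51Z)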

The related papers list is generated automatically from the titles and abstracts of papers on this site.
