Fugu-MT Paper Translation (Abstract): Not Too Short, Not Too Long: How LLM Response Length Shapes People's Critical Thinking in Error Detection

Paper summary: Not Too Short, Not Too Long: How LLM Response Length Shapes People's Critical Thinking in Error Detection

arxiv url: http://arxiv.org/abs/2603.06878v1
Date: Fri, 06 Mar 2026 20:57:36 GMT
Status: Translation complete
System update date: 2026-03-10 15:13:13.300255
Title: Not Too Short, Not Too Long: How LLM Response Length Shapes People's Critical Thinking in Error Detection
Title (reference translation): How LLM Response Length Shapes People's Critical Thinking in Error Detection
Authors: Natalie Friedman, Adelaide Nyanyo, Kevin Weatherwax, Lifei Wang, Chengchao Zhu, Zeshu Zhu, S. Joy Mountford
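Abstract summary: Large language models (LLMs) have become common decision-support tools across educational and professional contexts. This study examines whether LLM response length shapes users' accuracy in evaluating LLM-generated reasoning on critical thinking tasks.
Reference score (attention score computed independently by this site): 0.7817813851272347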
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Large language models (LLMs) have become common decision-support tools across educational and professional contexts, raising questions about how their outputs shape human critical thinking. Prior work suggests that the amount of AI assistance can influence cognitive engagement, yet little is known about how specific properties of LLM outputs (e.g., response length) impact users' critical evaluation of information. In this study, we examine whether the length of LLM responses shapes users' accuracy in evaluating LLM-generated reasoning on critical thinking tasks, particularly in interaction with the correctness of the LLM's reasoning. To begin evaluating this, we conducted a within-subjects experiment with 24 participants who completed 15 modified Watson-Glaser critical thinking items, each accompanied by an LLM-generated explanation that varied in length and correctness. Mixed-effects logistic regression revealed a strong and statistically reliable effect of LLM output correctness on participant accuracy, with participants more likely to answer correctly when the LLM's explanation was correct. Response length appeared to moderate this effect: when the LLM output was incorrect, medium-length explanations were associated with higher participant accuracy than either shorter or longer explanations, whereas accuracy remained high across lengths when the LLM output was correct. Together, these findings suggest that response length alone may be insufficient to support critical thinking, and that how reasoning is presented, including a potential advantage of mid-length explanations under some conditions, points to design opportunities for LLM-based decision-support systems that emphasize transparent reasoning and calibrated expressions of certainty.
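Abstract (reference translation): Large language models (LLMs) have become common decision-support tools in educational and professional contexts, raising the question of how their outputs shape human critical thinking. Earlier work suggests that the amount of AI assistance affects cognitive engagement, but little is known about how specific properties of LLM outputs (e.g., response length) affect users' critical evaluation of information. This study examines whether LLM response length shapes users' accuracy in evaluating LLM-generated reasoning on critical thinking tasks, particularly in interaction with the correctness of the LLM's reasoning. To evaluate this, the authors ran an experiment with 24 participants who completed 15 modified Watson-Glaser critical thinking items. Mixed-effects logistic regression showed a strong, statistically reliable effect of LLM output correctness on participant accuracy: participants were more likely to answer correctly when the LLM's explanation was correct. When the LLM output was incorrect, medium-length explanations were associated with higher participant accuracy than either shorter or longer explanations, whereas accuracy remained high when the LLM output was correct. These results suggest that response length alone may be insufficient to support critical thinking, and that how reasoning is presented, including a potential advantage of mid-length explanations, points to design opportunities for LLM-based decision-support systems that emphasize transparent reasoning and calibrated expressions of certainty.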
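
Note on the analysis: the abstract names mixed-effects logistic regression with a correctness-by-length interaction but does not give the model specification. One plausible form is sketched below as an illustration only. The dummy coding (short explanations as the reference level), the random intercepts for participant and item, and all symbol names are assumptions, not details taken from the paper.

```latex
% Hypothetical specification; requires amsmath. Not taken from the paper.
% Y_{ij} = 1 if participant i (i = 1..24) answers item j (j = 1..15) correctly.
% C_{ij} = 1 if the LLM explanation shown was correct, 0 otherwise.
% M_{ij}, L_{ij} = indicators for medium and long explanations (short = reference).
\[
\operatorname{logit}\Pr(Y_{ij}=1)
  = \beta_0 + \beta_1 C_{ij} + \beta_2 M_{ij} + \beta_3 L_{ij}
  + \beta_4 C_{ij} M_{ij} + \beta_5 C_{ij} L_{ij}
  + u_i + v_j,
\qquad
u_i \sim \mathcal{N}(0,\sigma_u^2),\quad
v_j \sim \mathcal{N}(0,\sigma_v^2)
\]
```

Under this coding, the main effect of correctness corresponds to \beta_1, and moderation by length corresponds to the interaction terms \beta_4 and \beta_5. When the LLM output is incorrect (C = 0), the medium-versus-short contrast is \beta_2 and the medium-versus-long contrast is \beta_2 - \beta_3, so the reported pattern would correspond to both being positive. A fully crossed design would yield 24 x 15 = 360 trial-level observations.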

Related papers