Fugu-MT Paper Translation (Abstract): Beyond the Prompt in Large Language Models: Comprehension, In-Context Learning, and Chain-of-Thought

Paper overview: Beyond the Prompt in Large Language Models: Comprehension, In-Context Learning, and Chain-of-Thought

arxiv url: http://arxiv.org/abs/2603.10000v2
Date: Thu, 12 Mar 2026 05:08:50 GMT
Status: Translation complete
Last updated in system: 2026-03-15 16:38:22.548724
Title: Beyond the Prompt in Large Language Models: Comprehension, In-Context Learning, and Chain-of-Thought
Authors: Yuling Jiao, Yanming Lai, Huazhen Lin, Wensen Ma, Houduo Qi, Defeng Sun
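Abstract summary: Large Language Models (LLMs) have demonstrated remarkable proficiency across diverse tasks. This study examines the foundations of these observations by addressing three critical questions.
Reference score (site-computed attention score): 15.598263332303612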
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency across diverse tasks, exhibiting emergent properties such as semantic prompt comprehension, In-Context Learning (ICL), and Chain-of-Thought (CoT) reasoning. Despite their empirical success, the theoretical mechanisms driving these phenomena remain poorly understood. This study dives into the foundations of these observations by addressing three critical questions: (1) How do LLMs accurately decode prompt semantics despite being trained solely on a next-token prediction objective? (2) Through what mechanism does ICL facilitate performance gains without explicit parameter updates? and (3) Why do intermediate reasoning steps in CoT prompting effectively unlock capabilities for complex, multi-step problems? Our results demonstrate that, through the autoregressive process, LLMs are capable of exactly inferring the transition probabilities between tokens across distinct tasks using provided prompts. We show that ICL enhances performance by reducing prompt ambiguity and facilitating posterior concentration on the intended task. Furthermore, we find that CoT prompting activates the model's capacity for task decomposition, breaking complex problems into a sequence of simpler sub-tasks that the model has mastered during the pretraining phase. By comparing their individual error bounds, we provide novel theoretical insights into the statistical superiority of advanced prompt engineering techniques.
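
The abstract describes ICL as concentrating a posterior on the intended task. The paper's formal model is not reproduced on this page, so the following is only a toy Bayesian sketch under stated assumptions: the two task names, the token tables, and the functions `task_posterior` and `predict` are hypothetical and chosen purely for illustration. Each task is treated as a table of next-token probabilities, and each demonstration reweights the tasks by how well they explain it.

```python
from math import prod

# Hypothetical toy setup: each "task" is a table of next-token
# probabilities p(output | input). The numbers are illustrative only
# and are not taken from the paper.
TASKS = {
    "translate": {
        "cat": {"chat": 0.9, "cats": 0.1},
        "dog": {"chien": 0.9, "dogs": 0.1},
    },
    "pluralize": {
        "cat": {"chat": 0.1, "cats": 0.9},
        "dog": {"chien": 0.1, "dogs": 0.9},
    },
}


def task_posterior(demos, prior=None):
    """Posterior over tasks given in-context (input, output) demonstrations."""
    prior = prior or {name: 1.0 / len(TASKS) for name in TASKS}
    # Bayes' rule: prior times the likelihood of every demonstration.
    # With no demonstrations the product is 1 and the prior is returned.
    unnorm = {
        name: prior[name] * prod(TASKS[name][x][y] for x, y in demos)
        for name in TASKS
    }
    total = sum(unnorm.values())
    return {name: weight / total for name, weight in unnorm.items()}


def predict(query, demos):
    """Posterior-weighted next-token distribution for the query token."""
    posterior = task_posterior(demos)
    mixture = {}
    for name, weight in posterior.items():
        for token, p in TASKS[name][query].items():
            mixture[token] = mixture.get(token, 0.0) + weight * p
    return mixture


if __name__ == "__main__":
    demos = [("cat", "chat"), ("dog", "chien")]
    for k in range(len(demos) + 1):
        print(k, task_posterior(demos[:k]), predict("dog", demos[:k]))
```

In this sketch no parameters are updated: the task tables stay fixed, and only the weight placed on each task changes as demonstrations are added, which is the sense in which the ambiguity of a bare prompt is reduced.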
