Fugu-MT Paper Translation (Abstract): Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL

Paper overview: Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL

arxiv url: http://arxiv.org/abs/2603.19611v1
Date: Fri, 20 Mar 2026 03:33:04 GMT
Status: Translation complete
Last updated in system: 2026-03-23 19:48:38.978229
Title: Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL
Title (reference translation): Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL
Authors: Xuhan Tong, Yuchen Zeng, Jiawei Zhang
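Abstract summary: In-Context Learning (ICL) enables pretrained LLMs to adapt to downstream tasks by conditioning on a small set of input-output demonstrations. The paper analyzes how ICL performance is governed by (i) the quality of the selected demonstrations, (ii) the intrinsic ICL capability of the pretrained model, and (iii) the degree of distribution shift.
Reference score (independently computed attention score): 9.83462430479849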
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In-Context Learning (ICL) enables pretrained LLMs to adapt to downstream tasks by conditioning on a small set of input-output demonstrations, without any parameter updates. Although there have been many theoretical efforts to explain how ICL works, most either rely on strong architectural or data assumptions, or fail to capture the impact of key practical factors such as demonstration selection, Chain-of-Thought (CoT) prompting, the number of demonstrations, and prompt templates. We address this gap by establishing a theoretical analysis of ICL under mild assumptions that links these design choices to generalization behavior. We derive an upper bound on the ICL test loss, showing that performance is governed by (i) the quality of selected demonstrations, quantified by Lipschitz constants of the ICL loss along paths connecting test prompts to pretraining samples, (ii) an intrinsic ICL capability of the pretrained model, and (iii) the degree of distribution shift. Within the same framework, we analyze CoT prompting as inducing a task decomposition and show that it is beneficial when demonstrations are well chosen at each substep and the resulting subtasks are easier to learn. Finally, we characterize how ICL performance sensitivity to prompt templates varies with the number of demonstrations. Together, our study shows that pretraining equips the model with the ability to generalize beyond observed tasks, while CoT enables the model to compose simpler subtasks into more complex ones, and demonstrations and instructions enable it to retrieve similar or complex tasks, including those that can be composed into more complex ones, jointly supporting generalization to unseen tasks. All theoretical insights are corroborated by experiments.
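Abstract (reference translation): In-Context Learning (ICL) enables pretrained LLMs to adapt to downstream tasks by conditioning on a small set of input-output demonstrations, without updating any parameters. Many theoretical efforts have tried to explain how ICL works, but most either rely on strong architectural or data assumptions or fail to capture the impact of key practical factors such as demonstration selection, Chain-of-Thought (CoT) prompting, the number of demonstrations, and prompt templates. We address this gap by establishing a theoretical analysis of ICL under mild assumptions that links these design choices to generalization behavior. We derive an upper bound on the ICL test loss and show that performance is governed by (i) the quality of the selected demonstrations, quantified by Lipschitz constants of the ICL loss along paths connecting test prompts to pretraining samples, (ii) the intrinsic ICL capability of the pretrained model, and (iii) the degree of distribution shift. Within the same framework, we analyze CoT prompting as inducing a task decomposition and show that it is beneficial when demonstrations are well chosen at each substep and the resulting subtasks are easier to learn. Finally, we characterize how the sensitivity of ICL performance to prompt templates varies with the number of demonstrations. Our study shows that pretraining equips the model with the ability to generalize beyond observed tasks, that CoT enables it to compose simpler subtasks into more complex ones, and that demonstrations and instructions enable it to retrieve similar or complex tasks, including those that can be composed into more complex ones. All theoretical insights are corroborated by experiments.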


Related papers