Fugu-MT Paper Translation (Abstract): Sample Complexity of Autoregressive Reasoning: Chain-of-Thought vs. End-to-End

Paper overview: Sample Complexity of Autoregressive Reasoning: Chain-of-Thought vs. End-to-End

arxiv url: http://arxiv.org/abs/2604.12013v2
Date: Sat, 18 Apr 2026 18:57:54 GMT
Status: Translation complete
Last updated in system: 2026-04-21 13:51:31.093302
Title: Sample Complexity of Autoregressive Reasoning: Chain-of-Thought vs. End-to-End
Title (reference translation): Sample Complexity of Autoregressive Reasoning: Chain-of-Thought vs. End-to-End
Authors: Steve Hanneke, Idan Mehalel, Shay Moran
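Abstract summary: Modern large language models generate text autoregressively, producing one token at a time. To study the learnability of such systems, Joshi et al. introduced a PAC-learning framework for next-token generators. This work gives a nearly complete answer to two questions by uncovering a taxonomy of how the sample complexity scales with $T$.
Reference score (independently computed attention metric): 48.8146191570092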
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modern large language models generate text autoregressively, producing tokens one at a time. To study the learnability of such systems, Joshi et al. (COLT 2025) introduced a PAC-learning framework for next-token generators, the primitive underlying autoregressive models. In this framework, an unknown next-token generator maps a sequence of tokens to the next token and is iteratively applied for $T$ steps, producing a chain of tokens whose final token constitutes the model's output. The learning task is to learn the input-output mapping induced by this autoregressive process. Depending on the available supervision, training examples may reveal only the final output (End-to-End supervision) or the entire generated chain (Chain-of-Thought supervision). This raises two natural questions: how the sample complexity depends on the generation length $T$, and how much Chain-of-Thought supervision can reduce this dependence. In this work we give a nearly complete answer to both questions by uncovering a taxonomy of how the sample complexity scales with $T$. For End-to-End learning, we show that the landscape is remarkably rich: subject to mild conditions, essentially any growth rate $r(T)$ between constant and linear can arise as the sample complexity, and combined with the linear upper bound of Joshi et al., this yields an essentially complete characterization. In contrast, under Chain-of-Thought supervision we show that the sample complexity is independent of $T$, demonstrating that access to intermediate reasoning steps can eliminate the dependence on the generation length altogether. Our analysis introduces new combinatorial tools, and as corollaries we resolve several open questions posed by Joshi et al. regarding the dependence of learnability on the generation length and the role of Chain-of-Thought supervision.
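Abstract (reference translation): Modern large language models generate text autoregressively, producing one token at a time. To study the learnability of such systems, Joshi et al. (COLT 2025) introduced a PAC-learning framework for next-token generators, the basic building block of autoregressive models. In this framework, an unknown next-token generator maps a sequence of tokens to the next token and is applied iteratively for $T$ steps, producing a chain of tokens whose final token constitutes the model's output. The learning task is to learn the input-output mapping induced by this autoregressive process. Depending on the available supervision, training examples may reveal only the final output (End-to-End supervision) or the entire generated chain (Chain-of-Thought supervision). This raises two natural questions: how the sample complexity depends on the generation length $T$, and how much Chain-of-Thought supervision can reduce this dependence. This work gives a nearly complete answer to both questions by uncovering a taxonomy of how the sample complexity scales with $T$. For End-to-End learning, the landscape is remarkably rich: subject to mild conditions, essentially any growth rate $r(T)$ between constant and linear can arise as the sample complexity, and combined with the linear upper bound of Joshi et al., this yields an essentially complete characterization. In contrast, under Chain-of-Thought supervision the sample complexity is shown to be independent of $T$, demonstrating that access to intermediate reasoning steps can eliminate the dependence on the generation length altogether. The analysis introduces new combinatorial tools, and as corollaries it resolves several open questions posed by Joshi et al. regarding the dependence of learnability on the generation length and the role of Chain-of-Thought supervision.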
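
The two supervision modes described in the abstract can be illustrated with a small Python sketch. It is only a toy reading of the abstract, not the formal definitions of Joshi et al. or of this paper: the names `rollout`, `end_to_end_example`, `chain_of_thought_example`, and the XOR generator `xor_of_last_two` are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Token = int
# Assumption: a next-token generator is any function from a token sequence to one token.
NextTokenGenerator = Callable[[Sequence[Token]], Token]


def rollout(f: NextTokenGenerator, prompt: Sequence[Token], T: int) -> List[Token]:
    """Apply f iteratively for T steps and return the T generated tokens."""
    chain = list(prompt)
    for _ in range(T):
        chain.append(f(chain))  # each step sees the prompt plus all earlier outputs
    return chain[len(prompt):]


def end_to_end_example(
    f: NextTokenGenerator, prompt: Sequence[Token], T: int
) -> Tuple[Tuple[Token, ...], Token]:
    """End-to-End supervision: the learner sees only the final token (requires T >= 1)."""
    return tuple(prompt), rollout(f, prompt, T)[-1]


def chain_of_thought_example(
    f: NextTokenGenerator, prompt: Sequence[Token], T: int
) -> Tuple[Tuple[Token, ...], Tuple[Token, ...]]:
    """Chain-of-Thought supervision: the learner sees the entire generated chain."""
    return tuple(prompt), tuple(rollout(f, prompt, T))


def xor_of_last_two(seq: Sequence[Token]) -> Token:
    """Hypothetical toy generator over {0, 1}: XOR of the last two tokens."""
    return seq[-1] ^ seq[-2]


if __name__ == "__main__":
    prompt = [1, 0]
    print(end_to_end_example(xor_of_last_two, prompt, T=3))
    print(chain_of_thought_example(xor_of_last_two, prompt, T=3))
```

In this toy setting both examples come from the same rollout; the only difference is how much of the chain is revealed to the learner, which is the distinction the abstract's sample-complexity results are about.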

List of related papers