Fugu-MT Paper Translation (Abstract): On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective

Paper overview: On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective

arxiv url: http://arxiv.org/abs/2605.08368v1
Date: Fri, 08 May 2026 18:23:25 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:49.590913
Title: On Distinguishing Capability Elicitation from Capability Creation in Post-Training: A Free-Energy Perspective
Authors: Yuhao Li, Shengchao Liu
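Abstract summary: We argue that post-training research should distinguish between capability elicitation and capability creation. We develop this argument through a free-energy view of post-training.
Reference score (independently computed attention measure): 13.996919933596153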
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Debates about large language model post-training often treat supervised fine-tuning (SFT) as imitation and reinforcement learning (RL) as discovery. But this distinction is too coarse. What matters is whether a training procedure increases the probability of behaviors the pretrained model could already produce, or whether it changes what the model can practically reach. We argue that post-training research should distinguish between capability elicitation and capability creation. We make this distinction operational by introducing the notion of accessible support: the set of behaviors that a model can practically produce under finite budgets. Post-training that reweights behaviors within this support is capability elicitation, whereas changing the support itself corresponds to capability creation. We develop this argument through a free-energy view of post-training. SFT and RL can both be seen as reweighting a pretrained reference distribution, only with different external signals. Demonstration signals define low-energy behavior for SFT, and reward signals define low-energy behavior for RL. When the update remains close to the base model, the main effect is local reweighting, not capability creation. Within this framework, the central question is no longer whether post-training is framed as SFT or RL, but whether it reweights behaviors already within reach, or instead expands the model's reachable behavioral space through search, interaction, tool use, or the incorporation of new information.
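
The reweighting picture in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so this assumes the standard exponential-tilting form, p(y) ∝ p_ref(y) · exp(-β · E(y)), with free energy F = -(1/β) · log Σ_y p_ref(y) · exp(-β · E(y)). The function names `tilt` and `accessible_support`, the toy distribution, and the threshold proxy for accessible support (probability of at least 1/budget) are hypothetical choices for illustration, not the authors' definitions.

```python
import math


def tilt(ref_probs, energies, beta):
    """Reweight a reference distribution by exp(-beta * energy).

    Returns the tilted distribution and the free energy
    F = -(1/beta) * log(sum_y ref(y) * exp(-beta * E(y))).
    Assumed form: the abstract does not state an explicit formula.
    """
    weights = [p * math.exp(-beta * e) for p, e in zip(ref_probs, energies)]
    z = sum(weights)  # partition function
    tilted = [w / z for w in weights]
    free_energy = -math.log(z) / beta
    return tilted, free_energy


def accessible_support(probs, budget):
    """Toy proxy (an assumption): behaviors expected at least once in `budget` samples."""
    return {i for i, p in enumerate(probs) if p * budget >= 1.0}


if __name__ == "__main__":
    # Four candidate behaviors; the last one has zero reference probability.
    ref = [0.6, 0.3, 0.1, 0.0]
    # Low energy marks behavior favored by the demonstration or reward signal.
    energies = [2.0, 1.0, 0.0, 0.0]

    tilted, free_energy = tilt(ref, energies, beta=1.0)
    print("tilted distribution:", tilted)
    print("free energy:", free_energy)
    print("support before:", accessible_support(ref, budget=20))
    print("support after: ", accessible_support(tilted, budget=20))
```

In this toy setting, tilting moves mass toward the low-energy behavior but leaves the zero-probability behavior at zero, which is the sense in which reweighting alone does not reach new behaviors.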
