Fugu-MT Paper Translation (Abstract): The Valley of Code Reasoning: Scaling Knowledge Distillation of Large Language Models

Paper summary: The Valley of Code Reasoning: Scaling Knowledge Distillation of Large Language Models

arxiv url: http://arxiv.org/abs/2510.06101v1
Date: Tue, 07 Oct 2025 16:32:09 GMT
Status: Translation complete
Last updated in system: 2025-10-08 17:57:08.356738
Title: The Valley of Code Reasoning: Scaling Knowledge Distillation of Large Language Models
Title (reference translation): The Valley of Code Reasoning: Scaling Knowledge Distillation of Large Language Models
Authors: Muyu He, Muhammad Ali Shafique, Anand Kumar, Tsach Mackey, Nazneen Rajani
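Abstract summary: This work studies the scaling trend of distilling competitive coding skills on two small non-reasoning LLMs. Surprisingly, the correctness of outputs in the training data is found to make no difference to distillation outcomes.
Reference score (independently computed attention score): 40.900241869345976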
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Distilling the thinking traces of a Large Language Model (LLM) with reasoning capabilities into a smaller model has been proven effective. Yet, there is a scarcity of work done on how model performance scales with the quantity of distillation data. In this work, we study the scaling trend of distilling competitive coding skills on two small non-reasoning LLMs. We validate the hypothesis that there is a $\textit{valley of code reasoning}$: downstream performance on competitive coding first drops as data quantity increases, then it steadily increases in a sharper-than-log-linear fashion. Having identified the trend, we further fine-tune the models at two different distillation stages on the same data to ground conclusions on their respective learning phases. We learn that across stages in the low and medium-low data regimes, small models benefit significantly more from easier coding questions than from harder ones. We also find that, surprisingly, the correctness of outputs in training data makes no difference to distillation outcomes. Our work represents a step forward in understanding the training dynamics of code reasoning distillation outside intuition.
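Abstract (reference translation): Distilling the thinking traces of a Large Language Model (LLM) with reasoning capabilities into a smaller model has been proven effective. However, there is little work on how model performance scales with the quantity of distillation data. In this work, we study the scaling trend of distilling competitive coding skills on two small non-reasoning LLMs. We validate the hypothesis that downstream performance on competitive coding first drops as data quantity increases and then rises steadily in a sharper-than-log-linear fashion. Having identified this trend, we further fine-tune the models at two different distillation stages on the same data to draw conclusions about their respective learning phases. We find that, across stages in the low and medium-low data regimes, small models benefit significantly more from easier coding questions than from harder ones. We also find, surprisingly, that the correctness of outputs in the training data makes no difference to distillation outcomes. Our work is a step toward understanding the training dynamics of code reasoning distillation beyond intuition.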

Related papers