Fugu-MT Paper Translation (Abstract): What Does Loss Optimization Actually Teach, If Anything? Knowledge Dynamics in Continual Pre-training of LLMs

Paper overview: What Does Loss Optimization Actually Teach, If Anything? Knowledge Dynamics in Continual Pre-training of LLMs

arxiv url: http://arxiv.org/abs/2601.03858v1
Date: Wed, 07 Jan 2026 12:14:33 GMT
Status: Translation complete
System update date: 2026-01-09 02:15:23.503804
Title: What Does Loss Optimization Actually Teach, If Anything? Knowledge Dynamics in Continual Pre-training of LLMs
Title (reference translation): What Is Loss Optimization? Knowledge Dynamics in Continual Pre-training of LLMs
Authors: Seyed Mahed Mousavi, Simone Alghisi, Giuseppe Riccardi
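Abstract summary: We study Continual Pre-Training (CPT) as a knowledge learning process rather than solely an optimization problem. We construct a controlled, distribution-matched benchmark of factual documents and interleave diagnostic probes directly into the CPT loop. We analyze how CPT reshapes knowledge circuits during training.
Reference score (independently computed attention measure): 1.542087313597676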
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Continual Pre-Training (CPT) is widely used for acquiring and updating factual knowledge in LLMs. This practice treats loss as a proxy for knowledge learning, while offering no grounding in how it changes during training. We study CPT as a knowledge learning process rather than solely an optimization problem. We construct a controlled, distribution-matched benchmark of factual documents and interleave diagnostic probes directly into the CPT loop, enabling epoch-level measurement of knowledge acquisition dynamics and changes in Out-Of-Domain (OOD) general skills (e.g., math). We further analyze how CPT reshapes knowledge circuits during training. Across three instruction-tuned LLMs and multiple CPT strategies, optimization and learning systematically diverge: loss decreases monotonically while factual learning is unstable and non-monotonic. Acquired facts are rarely consolidated, learning is strongly conditioned on prior exposure, and OOD performance degrades from early epochs. Circuit analysis reveals rapid reconfiguration of knowledge pathways across epochs, providing an explanation for narrow acquisition windows and systematic forgetting. These results show that loss optimization is misaligned with learning progress in CPT and motivate evaluation of stopping criteria based on task-level learning dynamics.
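Abstract (reference translation): Continual Pre-Training (CPT) is widely used to acquire and update factual knowledge in LLMs. This practice treats loss as a proxy for knowledge learning without grounding how it changes during training. We study CPT as a knowledge learning process rather than as a mere optimization problem. We construct a controlled, distribution-matched benchmark of factual documents and interleave diagnostic probes directly into the CPT loop, which enables epoch-level measurement of knowledge acquisition dynamics and of changes in Out-Of-Domain (OOD) general skills (e.g., math). We further analyze how CPT reshapes knowledge circuits during training. Across three instruction-tuned LLMs and multiple CPT strategies, optimization and learning systematically diverge: loss decreases monotonically, while factual learning is unstable and non-monotonic. Acquired facts are rarely consolidated, learning is strongly conditioned on prior exposure, and OOD performance degrades from the early epochs. Circuit analysis reveals rapid reconfiguration of knowledge pathways across epochs, which explains the narrow acquisition windows and the systematic forgetting. These results show that loss optimization is misaligned with learning progress in CPT, and they motivate evaluating stopping criteria based on task-level learning dynamics.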
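The abstract's idea of probing knowledge at every epoch, instead of watching loss alone, can be illustrated with a small Python sketch. This is not the authors' code or benchmark: the `EpochRecord` type, the function names, and the toy numbers are all hypothetical, and the sketch assumes each epoch yields a training loss plus the set of probe facts the model answered correctly.

```python
from dataclasses import dataclass


@dataclass
class EpochRecord:
    """Hypothetical per-epoch log entry (not from the paper)."""
    loss: float          # training loss at the end of the epoch
    correct: frozenset   # ids of probe facts answered correctly after the epoch


def knowledge_dynamics(records, baseline=frozenset()):
    """Count facts gained and lost at each epoch, relative to the previous one.

    `baseline` is the set of facts already answered correctly before training.
    """
    rows = []
    prev = baseline
    for epoch, rec in enumerate(records, start=1):
        rows.append({
            "epoch": epoch,
            "loss": rec.loss,
            "correct": len(rec.correct),
            "acquired": len(rec.correct - prev),
            "forgotten": len(prev - rec.correct),
        })
        prev = rec.correct
    return rows


def consolidated(records):
    """Facts that stay correct in every epoch after they are first answered."""
    kept = set()
    if not records:
        return frozenset(kept)
    for fact in records[-1].correct:
        first = next(i for i, r in enumerate(records) if fact in r.correct)
        if all(fact in r.correct for r in records[first:]):
            kept.add(fact)
    return frozenset(kept)


def best_epoch_by_loss(records):
    """1-indexed epoch with the lowest training loss."""
    return min(range(len(records)), key=lambda i: records[i].loss) + 1


def best_epoch_by_probe(records):
    """1-indexed epoch with the most correct probes; the earliest wins ties."""
    return max(range(len(records)), key=lambda i: (len(records[i].correct), -i)) + 1


# Toy data (invented for illustration): loss falls every epoch,
# but the number of correctly answered probes peaks at epoch 2.
toy = [
    EpochRecord(2.0, frozenset({"a"})),
    EpochRecord(1.5, frozenset({"a", "b", "c"})),
    EpochRecord(1.1, frozenset({"a", "c"})),
    EpochRecord(0.8, frozenset({"a", "d"})),
]

for row in knowledge_dynamics(toy):
    print(row)
print("stop by loss:", best_epoch_by_loss(toy))
print("stop by probe:", best_epoch_by_probe(toy))
```

On the toy data, a loss-based rule would pick the last epoch while a probe-based rule would pick epoch 2, which is the kind of mismatch between optimization and learning that the abstract describes. A real setup would replace the toy sets with the results of the actual diagnostic probes.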

Related papers