Fugu-MT Paper Translation (Abstract): CapTrack: Multifaceted Evaluation of Forgetting in LLM Post-Training

Paper overview: CapTrack: Multifaceted Evaluation of Forgetting in LLM Post-Training

arxiv url: http://arxiv.org/abs/2603.06610v1
Date: Thu, 19 Feb 2026 09:46:24 GMT
Status: Translation complete
Last updated in system: 2026-03-15 16:38:22.431169
Title: CapTrack: Multifaceted Evaluation of Forgetting in LLM Post-Training
Title (reference translation): CapTrack: Multifaceted Evaluation of Forgetting in LLM Post-Training
Authors: Lukas Thede, Stefan Winzeck, Zeynep Akata, Jonathan Richard Schwarz
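Abstract summary: CapTrack is a capability-centric framework for analyzing forgetting in large language models. We conduct a large-scale empirical study across post-training algorithms, domains, and model families. We find that forgetting extends beyond parametric knowledge, with pronounced drift in robustness and default behaviors.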
Reference score (independently computed attention score): 48.70704477452434
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language model (LLM) post-training enhances latent skills, unlocks value alignment, improves performance, and enables domain adaptation. Unfortunately, post-training is known to induce forgetting, especially in the ubiquitous use-case of leveraging third-party pre-trained models, which is typically understood as a loss of parametric or factual knowledge. We argue that this accuracy-centric view is insufficient for modern foundation models and instead define forgetting as systematic model drift that degrades behavior and user experience. In this context, we introduce CapTrack, a capability-centric framework for analyzing forgetting in LLMs that combines a behavioral taxonomy with an evaluation suite built on established benchmarks and targeted adaptations. Using CapTrack, we conduct a large-scale empirical study across post-training algorithms, domains, and model families, including models up to 80B parameters. We find that forgetting extends beyond parametric knowledge, with pronounced drift in robustness and default behaviors. Instruction fine-tuning induces the strongest relative drift, while preference optimization is more conservative and can partially recover lost capabilities. Differences across model families persist, and no universal mitigation emerges.
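Abstract (reference translation): Post-training of large language models (LLMs) enhances latent skills, unlocks value alignment, improves performance, and enables domain adaptation. Unfortunately, post-training is known to induce forgetting, typically understood as a loss of parametric or factual knowledge, especially in the ubiquitous use case of leveraging third-party pre-trained models. We argue that this accuracy-centric view is insufficient for modern foundation models and instead define forgetting as systematic model drift that degrades behavior and user experience. We introduce CapTrack, a capability-centric framework for analyzing forgetting in LLMs. Using CapTrack, we conduct a large-scale empirical study across post-training algorithms, domains, and model families, including models with up to 80B parameters. We find that forgetting extends beyond parametric knowledge, with pronounced drift in robustness and default behaviors. Instruction fine-tuning induces the strongest relative drift, while preference optimization is more conservative and can partially recover lost capabilities. Differences across model families persist, and no universal mitigation emerges.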

Related papers