Fugu-MT Paper Translation (Abstract): Relative Scaling Laws for LLMs

Paper abstract: Relative Scaling Laws for LLMs

arxiv url: http://arxiv.org/abs/2510.24626v1
Date: Tue, 28 Oct 2025 16:55:22 GMT
Status: Translation complete
Last updated in system: 2025-10-29 15:35:37.288444
Title: Relative Scaling Laws for LLMs
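Title (reference translation): Relative Scaling Laws for LLMs
Authors: William Held, David Hall, Percy Liang, Diyi Yang
Abstract summary: Scaling laws describe how language models improve with additional data, parameters, and compute. We introduce relative scaling laws, which track how performance gaps between test distributions evolve with scale. These results show that although scaling improves overall performance, it is not a universal equalizer.
Reference score (independently computed attention score): 91.73497548097775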
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Scaling laws describe how language models improve with additional data, parameters, and compute. While widely used, they are typically measured on aggregate test sets. Aggregate evaluations yield clean trends but average over heterogeneous subpopulations, obscuring performance disparities. We introduce relative scaling laws, which track how performance gaps between test distributions evolve with scale rather than focusing solely on absolute error. Using 255 decoder-only Transformers trained under matched-compute (IsoFLOP) budgets from $10^{18}$--$10^{20}$ FLOPs on standard pretraining datasets, we find diverse trajectories: academic domains on MMLU converge toward parity; regional English dialects shift depending on population size; and clusters of AI risk behaviours split, with capability- and influence-related risks increasing during pretraining while adversarial risks do not. These results show that although scaling improves overall performance, it is not a universal equalizer. To support further study, we release all model checkpoints from this work to enable practitioners to measure relative alongside traditional scaling laws, in order to better prioritize robustness challenges in light of the bitter lesson.
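Abstract (reference translation): Scaling laws describe how language models improve with additional data, parameters, and compute. Although widely used, they are typically measured on aggregate test sets. Aggregate evaluations yield clean trends but average over heterogeneous subpopulations, obscuring performance disparities. Rather than focusing solely on absolute error, we introduce relative scaling laws, which track how performance gaps between test distributions evolve with scale. Using 255 decoder-only Transformers trained under matched-compute (IsoFLOP) budgets from $10^{18}$--$10^{20}$ FLOPs on standard pretraining datasets, we find diverse trajectories: academic domains on MMLU converge toward parity; regional English dialects shift depending on population size; and clusters of AI risk behaviours split, with capability- and influence-related risks increasing during pretraining while adversarial risks do not. These results show that although scaling improves overall performance, it is not a universal equalizer. To support further study, we release all model checkpoints from this work so that practitioners can measure relative scaling laws alongside traditional ones and better prioritize robustness challenges in light of the bitter lesson.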
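
The abstract describes tracking how the gap between two test distributions changes with scale, but it does not give a formula. As a rough illustration only, the sketch below assumes the gap is measured as the ratio of a subgroup's loss to a baseline's loss and fits a straight line to that ratio in log-log space against compute. The function name `fit_relative_trend`, the ratio-based definition, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def fit_relative_trend(compute, loss_group, loss_baseline):
    """Least-squares fit of log(loss_group / loss_baseline) against log(compute).

    Returns (slope, intercept) in log space. This ratio-based definition is an
    assumption made for illustration, not the paper's stated formulation.
    """
    xs = [math.log(c) for c in compute]
    ys = [math.log(g / b) for g, b in zip(loss_group, loss_baseline)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data only: two made-up power-law loss curves over the compute
# range mentioned in the abstract (10^18 to 10^20 FLOPs).
compute = [1e18, 3e18, 1e19, 3e19, 1e20]
loss_baseline = [3.0 * (c / 1e18) ** -0.05 for c in compute]
loss_group = [3.6 * (c / 1e18) ** -0.07 for c in compute]

slope, intercept = fit_relative_trend(compute, loss_group, loss_baseline)
print(f"relative exponent: {slope:.4f}")
```

Under this assumed definition, a negative slope means the subgroup's loss falls faster than the baseline's as compute grows, a slope near zero means the gap is stable, and a positive slope means the subgroup falls further behind.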
