Fugu-MT Paper Translation (Abstract): The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining

Paper summary: The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining

arxiv url: http://arxiv.org/abs/2605.01158v1
Date: Fri, 01 May 2026 23:24:23 GMT
Status: Translation complete
System update date: 2026-05-05 20:33:49.614401
Title: The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining
Title (reference translation): Energy Use and Environmental Impact
Authors: Jacob Morrison, Noah A. Smith, Emma Strubell
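Abstract summary: Olmo 3 is a family of 7 billion and 32 billion parameter models. The model development process consumed 12.3 GWh of datacenter energy, emitted 4,251 tCO2eq, and consumed 15,887 kL of water.
Reference score (independently computed attention measure): 59.27959962602072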
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Modern language model development extends far beyond pretraining, yet environmental reporting remains narrowly focused on the cost of training a single final model. In this work, we provide the first detailed breakdown of the environmental impact of a full model development pipeline, from pretraining through supervised fine-tuning, preference optimization, and reinforcement learning, for Olmo 3, a family of 7 billion and 32 billion parameter models in both instruction-following and reasoning variants. We find that reasoning models are 17x more expensive to post-train than their instruction-tuned counterparts in terms of datacenter energy, driven by reinforcement learning rollout generation. Development costs (including experimentation, failed runs, and ablations) account for 82.2% of total compute, a roughly 65% increase over the ~50% reported for pretraining-focused pipelines in prior work. In total, we estimate our model development process consumed ~12.3 GWh of datacenter energy, emitted 4,251 tCO2eq, and consumed 15,887 kL of water, with water consumption driven entirely by power generation infrastructure rather than data center cooling. These costs, which are almost entirely unreported by model developers, are growing rapidly as post-training pipelines become more complex, and must be accounted for in environmental reporting standards and by the research community working to reduce AI's environmental impact.
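Abstract (reference translation): Modern language model development goes far beyond pretraining, yet environmental reporting stays focused on the cost of training a single final model. This work gives the first detailed account of the environmental impact of a full model development pipeline, covering supervised fine-tuning, preference optimization, and reinforcement learning, for Olmo 3, a family of 7 billion and 32 billion parameter models in both instruction-following and reasoning variants. Reasoning models turn out to be 17x more expensive to post-train than their instruction-tuned counterparts in terms of datacenter energy, driven by rollout generation for reinforcement learning. Development costs (including experimentation, failed runs, and ablations) account for 82.2% of total compute, roughly 65% more than the ~50% reported for pretraining-focused pipelines. In total, the model development process consumed about 12.3 GWh of datacenter energy, emitted 4,251 tCO2eq, and consumed 15,887 kL of water. These costs go almost entirely unreported by model developers, are growing rapidly as post-training pipelines become more complex, and must be taken into account by environmental reporting standards and by the research community working to reduce AI's environmental impact.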
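
The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constants are copied from the abstract, while the per-kWh intensities it derives are implied averages computed here, not values reported by the authors, and all variable and function names are hypothetical.

```python
# Figures quoted in the abstract ("~" values are approximate in the source).
ENERGY_GWH = 12.3           # datacenter energy for the full development process
EMISSIONS_TCO2EQ = 4251.0   # emissions, tonnes of CO2-equivalent
WATER_KL = 15887.0          # water consumed, kilolitres
DEV_SHARE = 0.822           # development share of total compute (this work)
PRIOR_DEV_SHARE = 0.50      # ~50% share reported for pretraining-focused pipelines


def relative_increase(new: float, old: float) -> float:
    """Fractional increase of `new` over `old`."""
    return (new - old) / old


# Unit conversions: 1 GWh = 1e6 kWh, 1 tonne = 1e6 g, 1 kL = 1e3 L.
energy_kwh = ENERGY_GWH * 1e6

# Development share relative to the prior-work baseline.
dev_increase = relative_increase(DEV_SHARE, PRIOR_DEV_SHARE)

# Implied averages over the whole process (derived here, not from the paper).
carbon_g_per_kwh = EMISSIONS_TCO2EQ * 1e6 / energy_kwh
water_l_per_kwh = WATER_KL * 1e3 / energy_kwh

print(f"development share increase: {dev_increase:.1%}")
print(f"implied carbon intensity:   {carbon_g_per_kwh:.1f} gCO2eq/kWh")
print(f"implied water intensity:    {water_l_per_kwh:.2f} L/kWh")
```

The relative increase works out to about 64%, which matches the abstract's "roughly 65%". The two intensities are simple totals divided by total energy, so they say nothing about how emissions or water use varied across sites or training stages.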

List of related papers