Fugu-MT Paper Translation (Abstract): EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference Serving

Paper abstract: EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference Serving

arxiv url: http://arxiv.org/abs/2605.10556v2
Date: Wed, 13 May 2026 14:15:33 GMT
Status: Translation complete
System update date: 2026-05-14 17:13:58.859971
Title: EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference Serving
Title (reference translation): EnergyLens: Interpretable Closed-Form Energy Models for Multimodal LLM Inference
Authors: Vittorio Palladino, Gianluca Palermo, Michael E. Papka, Zhiling Lan
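Abstract summary: Existing approaches either treat latency as an energy proxy or rely on data-hungry black-box surrogates. This paper presents EnergyLens, which uses symbolic regression as a structure-discovery tool over profiling data. Unlike black-box surrogates, EnergyLens decouples tensor and pipeline parallelism contributions and separates prefill from decode energy.
Reference score (independently calculated attention score): 2.7498981662768536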
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As large language models span dense, mixture-of-experts, and state-space architectures and are deployed on heterogeneous accelerators under increasingly diverse multimodal workloads, optimizing inference energy has become as critical as optimizing latency and throughput. Existing approaches either treat latency as an energy proxy or rely on data-hungry black-box surrogates. Both fail under varying parallelism strategies: latency and energy optima diverge in over 20% of configurations we tested, and black-box surrogates require hundreds of profiling samples to generalize across model families and hardware. We present EnergyLens, which uses symbolic regression as a structure-discovery tool over profiling data to derive a single twelve-parameter closed-form energy model expressed in terms of system properties such as degree of parallelism, batch size, and sequence length. Unlike black-box surrogates, EnergyLens decouples tensor and pipeline parallelism contributions and separates prefill from decode energy, making its predictions physically interpretable and actionable. Fitted from as few as 50 profiling measurements, EnergyLens achieves 88.2% Top-1 configuration selection accuracy across many evaluation scenarios, compared to 60.9% for the closest prior analytical baseline. It matches the predictive accuracy of ensemble ML methods with 10x fewer profiling samples and extrapolates reliably to unseen batch sizes and hardware platforms without structural modification, making it a practical, interpretable tool for energy-optimal LLM deployment.
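Abstract (reference translation): Because large language models now span dense, mixture-of-experts, and state-space architectures and are deployed on heterogeneous accelerators under increasingly diverse multimodal workloads, optimizing inference energy has become as important as optimizing latency and throughput. Existing approaches either treat latency as a proxy for energy or rely on data-hungry black-box surrogates. Both break down when the parallelism strategy varies: the latency-optimal and energy-optimal configurations differ in more than 20% of the configurations tested, and black-box surrogates need hundreds of profiling samples to generalize across model families and hardware. This paper proposes EnergyLens, which applies symbolic regression to profiling data as a structure-discovery tool and derives a single twelve-parameter closed-form energy model expressed in system properties such as degree of parallelism, batch size, and sequence length. Unlike black-box surrogates, EnergyLens decouples the contributions of tensor and pipeline parallelism and separates prefill energy from decode energy, so its predictions are physically interpretable and actionable. Fitted from as few as 50 profiling measurements, EnergyLens reaches 88.2% Top-1 configuration selection accuracy across many evaluation scenarios, compared with 60.9% for the closest prior analytical baseline. It matches the predictive accuracy of ensemble ML methods with 10x fewer profiling samples and extrapolates reliably to unseen batch sizes and hardware platforms without structural modification, making it a practical, interpretable tool for energy-optimal LLM deployment.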
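
To make the ideas in the abstract concrete, the following is a minimal sketch of what a closed-form energy model with separate prefill and decode terms, and separate tensor- and pipeline-parallel coefficients, might look like, together with Top-1 configuration selection. It is not the paper's twelve-parameter model: the functional form, the coefficient names (`c_pre`, `c_step`, `c_tok`, `a_tp`, `a_pp`, `b_tp`, `b_pp`), their values, and all function names are assumptions made up for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Config:
    tp: int       # tensor-parallel degree
    pp: int       # pipeline-parallel degree
    batch: int    # batch size
    seq_in: int   # prompt tokens per request (prefill phase)
    seq_out: int  # generated tokens per request (decode phase)


# Hypothetical coefficients, chosen by hand for illustration (not fitted values).
PARAMS = {
    "c_pre": 0.5,   # joules per prefill token
    "c_step": 2.0,  # joules per decode step per GPU
    "c_tok": 0.1,   # joules per decoded token
    "a_tp": 0.20,   # relative prefill overhead per extra tensor-parallel rank
    "a_pp": 0.05,   # relative prefill overhead per extra pipeline stage
    "b_tp": 0.30,   # relative decode overhead per extra tensor-parallel rank
    "b_pp": 0.10,   # relative decode overhead per extra pipeline stage
}


def _overhead(cfg, k_tp, k_pp):
    """Parallelism overhead with separate tensor and pipeline contributions."""
    return 1.0 + k_tp * (cfg.tp - 1) + k_pp * (cfg.pp - 1)


def prefill_energy(cfg, p=PARAMS):
    """Assumed prefill term: scales with the number of prompt tokens."""
    tokens = cfg.batch * cfg.seq_in
    return p["c_pre"] * tokens * _overhead(cfg, p["a_tp"], p["a_pp"])


def decode_energy(cfg, p=PARAMS):
    """Assumed decode term: a per-step cost per GPU plus a per-token cost."""
    gpus = cfg.tp * cfg.pp
    per_step = p["c_step"] * gpus + p["c_tok"] * cfg.batch
    return cfg.seq_out * per_step * _overhead(cfg, p["b_tp"], p["b_pp"])


def total_energy(cfg, p=PARAMS):
    """Predicted energy in joules: prefill and decode are kept separate."""
    return prefill_energy(cfg, p) + decode_energy(cfg, p)


def candidate_grid(batch, seq_in, seq_out, degrees=(1, 2, 4)):
    """Enumerate every (tp, pp) combination for one workload."""
    return [Config(tp, pp, batch, seq_in, seq_out)
            for tp, pp in product(degrees, degrees)]


def best_config(candidates, p=PARAMS):
    """Top-1 selection: the candidate with the lowest predicted energy."""
    return min(candidates, key=lambda c: total_energy(c, p))


def top1_accuracy(predicted, measured):
    """Fraction of scenarios where the predicted best matches the measured best."""
    if not predicted or len(predicted) != len(measured):
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(a == b for a, b in zip(predicted, measured))
    return hits / len(predicted)


if __name__ == "__main__":
    grid = candidate_grid(batch=8, seq_in=512, seq_out=128)
    choice = best_config(grid)
    print(choice, total_energy(choice))
```

Because every term is an explicit function of `tp`, `pp`, `batch`, and the sequence lengths, each coefficient can be read off and reasoned about directly, which is the kind of interpretability the abstract contrasts with black-box surrogates. In this toy form energy only grows with the parallel degrees, so the selection is trivial; a model fitted to real profiling data would not necessarily behave that way.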
