Fugu-MT Paper Translation (Abstract): Video Killed the Energy Budget: Characterizing the Latency and Power Regimes of Open Text-to-Video Models

Paper overview: Video Killed the Energy Budget: Characterizing the Latency and Power Regimes of Open Text-to-Video Models

arxiv url: http://arxiv.org/abs/2509.19222v1
Date: Tue, 23 Sep 2025 16:47:03 GMT
Status: Translation complete
Last updated in system: 2025-09-24 20:41:27.955534
Title: Video Killed the Energy Budget: Characterizing the Latency and Power Regimes of Open Text-to-Video Models
Authors: Julien Delavande, Regis Pierrard, Sasha Luccioni
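Abstract summary: This paper presents a systematic study of the latency and energy consumption of state-of-the-art T2V models. The authors first build a compute-bound analytical model that predicts scaling laws with respect to spatial resolution, temporal length, and denoising steps. They then validate these predictions through fine-grained experiments on WAN2.1-T2V, showing quadratic growth with spatial and temporal dimensions and linear scaling with the number of denoising steps.
Reference score (independently computed attention score): 4.513690948889834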
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in text-to-video (T2V) generation have enabled the creation of high-fidelity, temporally coherent clips from natural language prompts. Yet these systems come with significant computational costs, and their energy demands remain poorly understood. In this paper, we present a systematic study of the latency and energy consumption of state-of-the-art open-source T2V models. We first develop a compute-bound analytical model that predicts scaling laws with respect to spatial resolution, temporal length, and denoising steps. We then validate these predictions through fine-grained experiments on WAN2.1-T2V, showing quadratic growth with spatial and temporal dimensions, and linear scaling with the number of denoising steps. Finally, we extend our analysis to six diverse T2V models, comparing their runtime and energy profiles under default settings. Our results provide both a benchmark reference and practical insights for designing and deploying more sustainable generative video systems.
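The scaling trends stated in the abstract can be illustrated with a minimal sketch. It is an idealization, not the paper's analytical model: it assumes cost grows exactly quadratically with the number of pixels per frame (height × width), exactly quadratically with the number of frames, and exactly linearly with the number of denoising steps, with no constant overhead. The function name `relative_cost` and the reference configuration are hypothetical placeholders chosen for illustration.

```python
def relative_cost(height, width, frames, steps, ref=(480, 832, 81, 50)):
    """Estimate latency/energy relative to a reference configuration.

    Assumption (not taken from the paper's equations): cost is proportional to
    (pixels per frame)^2 * (frames)^2 * (denoising steps).
    `ref` is a hypothetical (height, width, frames, steps) baseline.
    """
    ref_height, ref_width, ref_frames, ref_steps = ref
    spatial_ratio = (height * width) / (ref_height * ref_width)
    temporal_ratio = frames / ref_frames
    step_ratio = steps / ref_steps
    # Quadratic in the spatial and temporal ratios, linear in the step ratio.
    return (spatial_ratio ** 2) * (temporal_ratio ** 2) * step_ratio


if __name__ == "__main__":
    # Doubling the denoising steps doubles the estimated cost.
    print(relative_cost(480, 832, 81, 100))
    # Doubling the frame count quadruples the estimated cost.
    print(relative_cost(480, 832, 162, 50))
```

Under these assumptions, halving the number of denoising steps halves the estimated cost, while halving the frame count cuts it to a quarter, which is why clip length and resolution are the more sensitive settings in this idealized picture.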

Related papers