Fugu-MT Paper Translation (Abstract): Network-Efficient World Model Token Streaming

Paper overview: Network-Efficient World Model Token Streaming

arxiv url: http://arxiv.org/abs/2605.09886v1
Date: Mon, 11 May 2026 02:19:18 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:50.473213
Title: Network-Efficient World Model Token Streaming
Title (reference translation): Network-Efficient World Model Token Streaming
Authors: Shatadal Mishra, Ahmadreza Moradipari, Nejib Ammar
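Abstract summary: This work studies network-efficient streaming of a discrete world model in which each 288x512 frame is mapped to an 18x32 grid of token IDs. We propose an online, label-free algorithm that prioritizes delta updates by cosine distance in codebook embedding space. The results support discrete token-state streaming as a practical systems layer for bandwidth-aware synchronization.
Reference score (independently computed attention metric): 2.198430261120653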
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative driving world models rely on compact latent state representations that must be efficiently transmitted and synchronized across distributed compute and connected vehicles. We study network-efficient streaming of a discrete world model state, where a stride-16 VQ-U-Net tokenizer (codebook size 8,192) maps each 288x512 frame to an 18x32 grid of token IDs (576 tokens/frame), equivalent to 936 bytes/frame under fixed-length coding. We consider a keyframe-delta protocol under strict per-message payload budgets and packet loss, and propose a fully online, label-free algorithm that prioritizes delta updates via cosine distance in codebook embedding space and triggers keyframes adaptively using a Hamming-drift threshold. The adaptive algorithm consistently improves the rate-distortion frontier over periodic keyframes at matched bitrates: at 0.024 Mb/s (200-byte budget) dynamic-only embedding distortion drops from 0.0712 to 0.0661 (7.2%), and at 0.036 Mb/s (400-byte budget) from 0.0427 to 0.0407 (4.8%). Under 10% delta packet loss at 200 bytes, dynamic-only distortion is 0.0757 versus 0.0789 for a matched periodic baseline. To connect state fidelity to world model usefulness, we train a lightweight next-token predictor and evaluate perplexity conditioned on streamed receiver states: at 0.024 Mb/s, dynamic-position perplexity improves from 206.0 to 193.1 (6.3%), and at 0.036 Mb/s from 158.9 to 155.6 (2.1%). These results support discrete token-state streaming as a practical systems layer for bandwidth-aware synchronization and improved downstream token-dynamics utility under vehicular networking constraints.
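Abstract (reference translation): Generative world models for driving depend on compact latent state representations that have to be transmitted and kept in sync efficiently across distributed compute and connected vehicles. We study network-efficient streaming of a discrete world model state in which a stride-16 VQ-U-Net tokenizer (codebook size 8,192) maps each 288x512 frame to an 18x32 grid of token IDs (576 tokens/frame), equivalent to 936 bytes/frame under fixed-length coding. We consider a keyframe-delta protocol under strict per-message payload budgets and packet loss, and propose a fully online, label-free algorithm that prioritizes delta updates by cosine distance in codebook embedding space and triggers keyframes adaptively using a Hamming-drift threshold. The adaptive algorithm consistently improves the rate-distortion frontier over periodic keyframes at matched bitrates: at 0.024 Mb/s (200-byte budget) the dynamic-only embedding distortion falls from 0.0712 to 0.0661 (7.2%), and at 0.036 Mb/s (400-byte budget) from 0.0427 to 0.0407 (4.8%). Under 10% delta packet loss at 200 bytes, dynamic-only distortion is 0.0757 versus 0.0789 for a matched periodic baseline. To connect state fidelity to world model usefulness, we train a lightweight next-token predictor and evaluate perplexity conditioned on the streamed receiver states: at 0.024 Mb/s, dynamic-position perplexity improves from 206.0 to 193.1 (6.3%), and at 0.036 Mb/s from 158.9 to 155.6 (2.1%). These results support discrete token-state streaming as a practical systems layer for bandwidth-aware synchronization and for improved downstream token-dynamics utility under vehicular networking constraints.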
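
The figures in the abstract can be checked with a little arithmetic, and the keyframe-delta idea can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions, not the authors' implementation. The abstract gives the grid size, the codebook size, the cosine-distance priority and the Hamming-drift trigger. Everything else is hypothetical: the function names (`max_delta_updates`, `encode_step`, `apply_message`), the delta wire format (a 10-bit position plus a 13-bit token ID per update), the definition of drift as the fraction of positions where the sender's frame differs from its estimate of the receiver state, and the threshold value. Packet loss and message headers are ignored.

```python
import numpy as np

# Values stated in the abstract.
GRID_H, GRID_W = 288 // 16, 512 // 16            # stride-16 tokenizer -> 18 x 32
NUM_TOKENS = GRID_H * GRID_W                     # 576 tokens per frame
CODEBOOK_SIZE = 8192
BITS_PER_ID = (CODEBOOK_SIZE - 1).bit_length()   # 13 bits, fixed-length coding
KEYFRAME_BYTES = NUM_TOKENS * BITS_PER_ID // 8   # 576 * 13 / 8 = 936 bytes

# ASSUMPTION: each delta update carries a grid position plus a token ID.
BITS_PER_POS = (NUM_TOKENS - 1).bit_length()     # 10 bits address 576 positions
DELTA_ENTRY_BITS = BITS_PER_POS + BITS_PER_ID    # 23 bits per update


def max_delta_updates(budget_bytes):
    """Number of (position, id) updates that fit in one payload budget."""
    return (budget_bytes * 8) // DELTA_ENTRY_BITS


def cosine_distance(a, b):
    """Row-wise cosine distance between two (n, d) embedding arrays."""
    dot = np.sum(a * b, axis=-1)
    norms = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-12
    return 1.0 - dot / norms


def encode_step(sender_tokens, receiver_tokens, codebook,
                budget_bytes, drift_threshold):
    """Return (kind, positions, ids), where kind is "key" or "delta".

    receiver_tokens is the sender's estimate of the receiver state.
    """
    cur = sender_tokens.ravel()
    ref = receiver_tokens.ravel()
    changed = np.flatnonzero(cur != ref)
    drift = changed.size / cur.size              # normalized Hamming distance
    if drift > drift_threshold:
        # Too much drift: resend the whole grid as a keyframe.
        return "key", np.arange(cur.size), cur.copy()
    # Rank changed positions by how far the embedding moved, largest first.
    dist = cosine_distance(codebook[cur[changed]], codebook[ref[changed]])
    order = np.argsort(-dist, kind="stable")[:max_delta_updates(budget_bytes)]
    positions = changed[order]
    return "delta", positions, cur[positions]


def apply_message(receiver_tokens, positions, ids):
    """Receiver side: overwrite the listed positions with the new IDs."""
    out = receiver_tokens.ravel().copy()
    out[positions] = ids
    return out.reshape(receiver_tokens.shape)


print(KEYFRAME_BYTES)           # 936
print(max_delta_updates(200))   # 69 under the assumed 23-bit entry
```

Under the assumed 23-bit entry, a 200-byte budget carries at most 69 token updates per message, against 576 tokens in a full 936-byte keyframe, which is why the order in which changed positions are sent matters.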
