Fugu-MT Paper Translation (Abstract): LiveStre4m: Feed-Forward Live Streaming of Novel Views from Unposed Multi-View Video

Paper summary: LiveStre4m: Feed-Forward Live Streaming of Novel Views from Unposed Multi-View Video

arxiv url: http://arxiv.org/abs/2604.06740v1
Date: Wed, 08 Apr 2026 07:01:44 GMT
Status: translation complete
Last updated in system: 2026-04-09 17:30:51.382598
Title: LiveStre4m: Feed-Forward Live Streaming of Novel Views from Unposed Multi-View Video
Title (reference translation): LiveStre4m: Feed-forward live streaming of novel views from unposed multi-view video
Authors: Pedro Quesado, Erkut Akdag, Yasaman Kashefbahrami, Willem Menu, Egor Bondarev
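Abstract summary: Live-streaming novel view synthesis (NVS) from unposed multi-view video remains an open challenge across a wide range of applications. This paper proposes LiveStre4m, a novel-viewpoint video live-streaming method built on a feed-forward model for real-time NVS from unposed multi-view inputs. The method achieves temporally consistent novel-view video streaming using as few as two synchronized, unposed input streams.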
Reference score (independently computed attention score): 5.5263731799099425
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Live-streaming Novel View Synthesis (NVS) from unposed multi-view video remains an open challenge in a wide range of applications. Existing methods for dynamic scene representation typically require ground-truth camera parameters and involve lengthy optimizations ($\approx 2.67$s), which makes them unsuitable for live streaming scenarios. To address this issue, we propose a novel viewpoint video live-streaming method (LiveStre4m), a feed-forward model for real-time NVS from unposed sparse multi-view inputs. LiveStre4m introduces a multi-view vision transformer for keyframe 3D scene reconstruction coupled with a diffusion-transformer interpolation module that ensures temporal consistency and stable streaming. In addition, a Camera Pose Predictor module is proposed to efficiently estimate both poses and intrinsics directly from RGB images, removing the reliance on known camera calibration information. Our approach enables temporally consistent novel-view video streaming in real time using as few as two synchronized unposed input streams. LiveStre4m attains an average reconstruction time of $0.07$s per frame at $1024 \times 768$ resolution, outperforming the optimization-based dynamic scene representation methods by orders of magnitude in runtime. These results demonstrate that LiveStre4m makes real-time NVS streaming feasible in practical settings, marking a substantial step toward deployable live novel-view synthesis systems. Code available at: https://github.com/pedro-quesado/LiveStre4m
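Abstract (reference translation): Live-streaming novel view synthesis (NVS) from unposed multi-view video remains an open challenge across a wide range of applications. Existing methods for dynamic scene representation typically require ground-truth camera parameters and involve lengthy optimization ($\approx 2.67$s), which makes them unsuitable for live-streaming scenarios. To address this, the paper proposes LiveStre4m, a novel-viewpoint video live-streaming method built on a feed-forward model for real-time NVS from unposed sparse multi-view inputs. LiveStre4m introduces a multi-view vision transformer for keyframe 3D scene reconstruction, coupled with a diffusion-transformer interpolation module that ensures temporal consistency and stable streaming. In addition, a Camera Pose Predictor module efficiently estimates both poses and intrinsics directly from RGB images, removing the reliance on known camera calibration. The approach enables temporally consistent novel-view video streaming in real time using as few as two synchronized, unposed input streams. LiveStre4m attains an average reconstruction time of $0.07$s per frame at $1024 \times 768$ resolution, outperforming optimization-based dynamic scene representation methods by orders of magnitude in runtime. These results show that LiveStre4m makes real-time NVS streaming feasible in practical settings, a substantial step toward deployable live novel-view synthesis systems. Code: https://github.com/pedro-quesado/LiveStre4m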
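The two timing figures quoted in the abstract can be turned into a frame rate and a runtime ratio with simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and it assumes the $\approx 2.67$s optimization time is a per-frame cost, which the abstract does not state explicitly.

```python
# Timing figures quoted in the abstract.
# ASSUMPTION: the 2.67 s optimization time is treated as a per-frame cost;
# the abstract does not say so explicitly.
BASELINE_SECONDS = 2.67
# Average reconstruction time per frame at 1024 x 768, as stated in the abstract.
LIVESTRE4M_SECONDS = 0.07


def frames_per_second(seconds_per_frame: float) -> float:
    """Convert a per-frame latency into a sustained frame rate."""
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    return 1.0 / seconds_per_frame


def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Ratio of the baseline runtime to the new runtime."""
    if baseline_seconds <= 0 or new_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_seconds / new_seconds


if __name__ == "__main__":
    fps = frames_per_second(LIVESTRE4M_SECONDS)
    ratio = speedup(BASELINE_SECONDS, LIVESTRE4M_SECONDS)
    print(f"Frame rate at 0.07 s per frame: {fps:.1f} frames/s")
    print(f"Runtime ratio 2.67 s / 0.07 s: {ratio:.1f}x")
```

Under that assumption, 0.07 s per frame corresponds to roughly 14 frames per second, and the ratio between the two quoted figures is roughly 38.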

Related papers