Fugu-MT Paper Translation (Abstract): OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation

Paper abstract: OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation

arxiv url: http://arxiv.org/abs/2603.30045v1
Date: Tue, 31 Mar 2026 17:59:33 GMT
Status: Translation complete
System update date: 2026-04-01 15:25:03.971496
Title: OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation
Title (reference translation): OmniRoam: World Wandering via Long-Range Panoramic Video Generation
Authors: Yuheng Liu, Xin Lin, Xinke Li, Baihan Yang, Chen Wang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Hao Tan, Kai Zhang, Xiaohui Xie, Zifan Shi, Yiwei Hu
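Abstract summary: We propose OmniRoam, a controllable panoramic video generation framework. The framework exploits the long-term temporal consistency of panoramic representation and its rich per-frame scene coverage. Experiments show that our framework consistently outperforms state-of-the-art methods in terms of visual quality, controllability, and long-term scene consistency.
Reference score (independently computed attention measure): 42.159343032593014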
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Modeling scenes using video generation models has garnered growing research interest in recent years. However, most existing approaches rely on perspective video models that synthesize only limited observations of a scene, leading to issues of completeness and global consistency. We propose OmniRoam, a controllable panoramic video generation framework that exploits the rich per-frame scene coverage and inherent long-term spatial and temporal consistency of panoramic representation, enabling long-horizon scene wandering. Our framework begins with a preview stage, where a trajectory-controlled video generation model creates a quick overview of the scene from a given input image or video. Then, in the refine stage, this video is temporally extended and spatially upsampled to produce long-range, high-resolution videos, thus enabling high-fidelity world wandering. To train our model, we introduce two panoramic video datasets that incorporate both synthetic and real-world captured videos. Experiments show that our framework consistently outperforms state-of-the-art methods in terms of visual quality, controllability, and long-term scene consistency, both qualitatively and quantitatively. We further showcase several extensions of this framework, including real-time video generation and 3D reconstruction. Code is available at https://github.com/yuhengliu02/OmniRoam.
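Abstract (reference translation): In recent years, modeling scenes with video generation models has attracted growing research interest. However, most existing approaches rely on perspective video models that synthesize only limited observations of a scene, which leads to problems of completeness and global consistency. OmniRoam is a controllable panoramic video generation framework that exploits the rich per-frame scene coverage of panoramic video and the spatial and temporal consistency of panoramic representation to enable long-horizon scene wandering. Our framework begins with a preview stage, in which a trajectory-controlled video generation model produces a quick overview of the scene from a given input image or video. This video is then temporally extended and spatially upsampled to produce long-range, high-resolution videos, enabling high-fidelity world wandering. To train the model, we introduce two panoramic video datasets that incorporate both synthetic and real-world videos. Experiments show that our framework consistently outperforms state-of-the-art methods in visual quality, controllability, and long-term scene consistency, both qualitatively and quantitatively. We also present extensions of this framework, including real-time video generation and 3D reconstruction. Code is available at https://github.com/yuhengliu02/OmniRoam.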

Related papers