Fugu-MT Paper Translation (Abstract): 3D and 4D World Modeling: A Survey

Paper overview: 3D and 4D World Modeling: A Survey

arxiv url: http://arxiv.org/abs/2509.07996v1
Date: Thu, 04 Sep 2025 17:59:58 GMT
Status: Translation complete
Last updated in system: 2025-09-11 15:16:52.184807
Title: 3D and 4D World Modeling: A Survey
Title (reference translation): 3D and 4D World Modeling: A Survey
Authors: Lingdong Kong, Wesley Yang, Jianbiao Mei, Youquan Liu, Ao Liang, Dekai Zhu, Dongyue Lu, Wei Yin, Xiaotao Hu, Mingkai Jia, Junyuan Deng, Kaiwen Zhang, Yang Wu, Tianyi Yan, Shenyuan Gao, Song Wang, Linfeng Li, Liang Pan, Yong Liu, Jianke Zhu, Wei Tsang Ooi, Steven C. H. Hoi, Ziwei Liu
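Abstract summary: World modeling has become a cornerstone of AI research, enabling agents to understand, represent, and predict the dynamic environments they inhabit. We introduce a structured taxonomy spanning video-based (VideoGen), occupancy-based (OccGen), and LiDAR-based (LiDARGen) approaches. We discuss practical applications, identify open challenges, and highlight promising research directions.
Reference score (independently computed attention score): 104.20852751473392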
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: World modeling has become a cornerstone in AI research, enabling agents to understand, represent, and predict the dynamic environments they inhabit. While prior work largely emphasizes generative methods for 2D image and video data, it overlooks the rapidly growing body of work that leverages native 3D and 4D representations such as RGB-D imagery, occupancy grids, and LiDAR point clouds for large-scale scene modeling. At the same time, the absence of a standardized definition and taxonomy for "world models" has led to fragmented and sometimes inconsistent claims in the literature. This survey addresses these gaps by presenting the first comprehensive review explicitly dedicated to 3D and 4D world modeling and generation. We establish precise definitions, introduce a structured taxonomy spanning video-based (VideoGen), occupancy-based (OccGen), and LiDAR-based (LiDARGen) approaches, and systematically summarize datasets and evaluation metrics tailored to 3D/4D settings. We further discuss practical applications, identify open challenges, and highlight promising research directions, aiming to provide a coherent and foundational reference for advancing the field. A systematic summary of existing literature is available at https://github.com/worldbench/survey
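Abstract (reference translation): World modeling has become a cornerstone of AI research, enabling agents to understand, represent, and predict the dynamic environments they inhabit. Prior work has largely emphasized generative methods for 2D image and video data, but it overlooks the rapidly growing body of work that leverages native 3D and 4D representations such as RGB-D imagery, occupancy grids, and LiDAR point clouds for large-scale scene modeling. At the same time, the absence of a standardized definition and taxonomy for "world models" has led to fragmented and sometimes inconsistent claims in the literature. This survey addresses these gaps by presenting the first comprehensive review explicitly dedicated to 3D and 4D world modeling and generation. We establish precise definitions, introduce a structured taxonomy spanning video-based (VideoGen), occupancy-based (OccGen), and LiDAR-based (LiDARGen) approaches, and systematically summarize datasets and evaluation metrics suited to 3D/4D settings. We further discuss practical applications, identify open challenges, and highlight promising research directions, aiming to provide a coherent and foundational reference for advancing the field. A systematic summary of the existing literature is available at https://github.com/worldbench/survey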

Related papers