Fugu-MT Paper Translation (Summary): PhysInOne: Visual Physics Learning and Reasoning in One Suite

Paper summary: PhysInOne: Visual Physics Learning and Reasoning in One Suite

arxiv url: http://arxiv.org/abs/2604.09415v1
Date: Fri, 10 Apr 2026 15:27:27 GMT
Status: Translation complete
System update date: 2026-04-13 17:57:53.932023
Title: PhysInOne: Visual Physics Learning and Reasoning in One Suite
Title (reference translation): PhysInOne: Visual Physics Learning and Reasoning in One Suite
Authors: Siyuan Zhou, Hejun Wang, Hu Cheng, Jinxi Li, Dongsheng Wang, Junwei Jiang, Yixiao Jin, Jiayue Huang, Shiwei Mao, Shangjia Liu, Yafei Yang, Hongkang Song, Shenxing Wei, Zihui Zhang, Peng Huang, Shijie Liu, Zhengli Hao, Hao Li, Yitian Li, Wenqi Zhou, Zhihan Zhao, Zongqi He, Hongtao Wen, Shouwang Huang, Peng Yun, Bowen Cheng, Pok Kazaf Fu, Wai Kit Lai, Jiahao Chen, Kaiyuan Wang, Zhixuan Sun, Ziqi Li, Haochen Hu, Di Zhang, Chun Ho Yuen, Bing Wang, Zhihua Wang, Chuhang Zou, Bo Yang
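Abstract summary: We present PhysInOne, a large-scale synthetic dataset that addresses the critical scarcity of physically grounded training data for AI systems. Unlike existing datasets limited to hundreds or thousands of examples, PhysInOne provides 2 million videos across 153,810 dynamic 3D scenes. We demonstrate PhysInOne's efficacy across four emerging applications: physics-aware video generation, long-/short-term future frame prediction, physical property estimation, and motion transfer.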
Reference score (independently computed attention score): 41.08902311182402
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present PhysInOne, a large-scale synthetic dataset addressing the critical scarcity of physically-grounded training data for AI systems. Unlike existing datasets limited to merely hundreds or thousands of examples, PhysInOne provides 2 million videos across 153,810 dynamic 3D scenes, covering 71 basic physical phenomena in mechanics, optics, fluid dynamics, and magnetism. Distinct from previous works, our scenes feature multiobject interactions against complex backgrounds, with comprehensive ground-truth annotations including 3D geometry, semantics, dynamic motion, physical properties, and text descriptions. We demonstrate PhysInOne's efficacy across four emerging applications: physics-aware video generation, long-/short-term future frame prediction, physical property estimation, and motion transfer. Experiments show that fine-tuning foundation models on PhysInOne significantly enhances physical plausibility, while also exposing critical gaps in modeling complex physical dynamics and estimating intrinsic properties. As the largest dataset of its kind, orders of magnitude beyond prior works, PhysInOne establishes a new benchmark for advancing physics-grounded world models in generation, simulation, and embodied AI.
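Abstract (reference translation): We present PhysInOne, a large-scale synthetic dataset that addresses the critical scarcity of physically grounded training data for AI systems. Unlike existing datasets limited to hundreds or thousands of examples, PhysInOne provides 2 million videos across 153,810 dynamic 3D scenes, covering 71 basic physical phenomena in mechanics, optics, fluid dynamics, and magnetism. In contrast to previous works, its scenes feature multi-object interactions against complex backgrounds, with comprehensive ground-truth annotations including 3D geometry, semantics, dynamic motion, physical properties, and text descriptions. We demonstrate PhysInOne's efficacy across four emerging applications: physics-aware video generation, long-/short-term future frame prediction, physical property estimation, and motion transfer. Experiments show that fine-tuning foundation models on PhysInOne significantly improves physical plausibility, while also exposing critical gaps in modeling complex physical dynamics and estimating intrinsic properties. As the largest dataset of its kind, orders of magnitude larger than prior works, PhysInOne establishes a new benchmark for advancing physics-grounded world models in generation, simulation, and embodied AI.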
