Fugu-MT Paper Translation (Abstract): REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image

Paper overview: REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image

arxiv url: http://arxiv.org/abs/2605.30338v1
Date: Thu, 28 May 2026 17:59:01 GMT
Status: Translation complete
Last updated in system: 2026-05-30 02:45:56.749522
Title: REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image
Authors: Xiaoxuan Ma, Jiashun Wang, Nicolas Ugrinovic, Yehonathan Litman, Kris Kitani
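Abstract summary: Reconstructing physically stable 3D scenes from a single RGB image enables casual images to be converted into simulation-ready digital assets. We propose REST3D, a single-image reconstruction framework that can reconstruct physically stable 3D scenes by integrating physical scene understanding with physics-constrained refinement.
Reference score (attention score computed independently by this site): 31.061246129846044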
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Reconstructing physically stable 3D scenes from a single RGB image enables casual images to be converted into simulation-ready digital assets for applications such as immersive interaction and content creation. However, existing single-image reconstruction methods fall short in capturing the physical structure of a scene. As a result, they often produce geometrically plausible but physically inconsistent results, including object floating and penetration, which lead to unstable behavior in physics simulations. Image-conditioned scene generation methods improve physical plausibility but often rely on strong scene priors, yielding plausible yet inaccurate object arrangements that fail to match the input image. We propose REST3D, a single-image reconstruction framework that can reconstruct physically stable 3D scenes by integrating physical scene understanding with physics-constrained refinement. We first introduce an agentic physical scene understanding technique that constructs a scene-tree representation capturing object physical states and inter-object relationships from a gravity-support perspective, providing a structural prior for reconstruction. Leveraging this structure, we initialize the scene using image-to-3D models, followed by scene-tree-guided alignment and physics-constrained optimization to resolve physical violations while preserving visual consistency with the input image. Experiments show that our method significantly reduces physical errors and improves simulation stability on both synthetic and real-world datasets while maintaining strong reconstruction quality. We further demonstrate the reconstructed scenes in VR-based human-object interaction, showing their potential for immersive applications.
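Illustrative sketch (not from the paper): the abstract describes a scene tree that records which object supports which under gravity, and a refinement step that removes floating and penetration. The Python below is a minimal toy version of that idea, and every detail in it is an assumption. `SceneNode`, `support_violations`, and `snap_to_support` are hypothetical names, objects are reduced to a vertical extent, and "refinement" is replaced by simply translating each object down or up until it touches its supporter. The authors' agentic scene understanding and physics-constrained optimization are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneNode:
    """Hypothetical scene-tree node: one object plus the objects resting on it."""
    name: str
    bottom: float  # height of the lowest point (assumed to be in metres)
    top: float     # height of the highest point
    children: List["SceneNode"] = field(default_factory=list)

    def add(self, child: "SceneNode") -> "SceneNode":
        """Register `child` as supported by this node and return it."""
        self.children.append(child)
        return child


def support_violations(node: SceneNode, tol: float = 1e-3) -> List[Tuple[str, str, float]]:
    """List (name, kind, gap) for every object that floats above or sinks into its supporter."""
    found = []
    for child in node.children:
        gap = child.bottom - node.top  # > 0: floating, < 0: penetration
        if gap > tol:
            found.append((child.name, "floating", gap))
        elif gap < -tol:
            found.append((child.name, "penetration", gap))
        found.extend(support_violations(child, tol))
    return found


def snap_to_support(node: SceneNode) -> None:
    """Crude stand-in for refinement: move each object vertically onto its supporter."""
    for child in node.children:
        shift = node.top - child.bottom
        child.bottom += shift
        child.top += shift
        snap_to_support(child)  # descendants are re-seated on the moved object


# Toy scene: the table hovers 5 cm above the floor, the mug sinks 2 cm into the table.
floor = SceneNode("floor", bottom=-0.10, top=0.00)
table = floor.add(SceneNode("table", bottom=0.05, top=0.80))
mug = table.add(SceneNode("mug", bottom=0.78, top=0.88))

before = support_violations(floor)
snap_to_support(floor)
after = support_violations(floor)

print([v[:2] for v in before])
print(after)
```

Traversing the tree top-down means each object is seated on a supporter whose position is already final, which is one reason a gravity-support hierarchy is a convenient structural prior. A real system would also have to handle lateral overlap, rotation, and consistency with the input image, none of which this sketch attempts.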

Related papers