Fugu-MT Paper Translation (Abstract): TelePhysics: Physics-Grounded Multi-Object Scene Generation from a Single Image with Real-Time Interaction

Paper abstract: TelePhysics: Physics-Grounded Multi-Object Scene Generation from a Single Image with Real-Time Interaction

arxiv url: http://arxiv.org/abs/2605.20290v1
Date: Tue, 19 May 2026 08:16:44 GMT
Status: Translation complete
Last updated in system: 2026-05-21 19:19:56.274363
Title: TelePhysics: Physics-Grounded Multi-Object Scene Generation from a Single Image with Real-Time Interaction
Authors: Xin Zhang, Yabo Chen, Yijie Fang, Wanying Qu, Haibin Huang, Chi Zhang, Feng Xu, Xuelong Li
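Abstract summary: We propose TelePhysics, a training-free framework. By representing the full scene geometry in a unified spatial coordinate system, TelePhysics resolves object penetration and alignment ambiguity. Experimental results show that TelePhysics substantially outperforms prior methods in physical fidelity, spatial coherence, and controllability.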
Reference score (independently computed attention score): 51.01447538245441
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent generative video models achieve impressive visual quality but remain constrained by limited physical consistency and controllability. Existing video generation methods provide minimal physical control, and single-image-to-3D conversion approaches often suffer from object interpenetration. Furthermore, physics-based scene-level 3D generation methods exhibit spatial misalignment, stylized artifacts, and inconsistencies with the input data, restricting their use in realistic interactive video synthesis. We propose TelePhysics, a training-free framework that converts a single image into a physically consistent and controllable video through holistic scene-level 3D reconstruction. By representing the full scene geometry in a unified spatial coordinate system, TelePhysics resolves object penetration and alignment ambiguity. Unlike prior methods, this formulation enables accurate scene-level multi-object interactions and introduces richer, complex control types for advanced mechanics-based manipulation. By decoupling simulation from rendering, TelePhysics bypasses latency-heavy priors, achieving real-time physical interaction previews while preserving photorealistic visual fidelity. Experimental results demonstrate that TelePhysics substantially outperforms prior methods in physical fidelity, spatial coherence, and controllability. The open-source code is available at https://github.com/xinzhang007/TelePhysics.
