Fugu-MT Paper Translation (Abstract): Interact3D: Compositional 3D Generation of Interactive Objects

Paper overview: Interact3D: Compositional 3D Generation of Interactive Objects

arxiv url: http://arxiv.org/abs/2603.16085v1
Date: Tue, 17 Mar 2026 03:21:06 GMT
Status: Translation complete
Last updated in system: 2026-03-18 17:42:07.083015
Title: Interact3D: Compositional 3D Generation of Interactive Objects
Authors: Hui Shan, Keyang Luo, Ming Li, Sizhe Zheng, Yanwei Fu, Zhen Chen, Xiangru Huang
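Abstract summary: This paper proposes a novel framework for generating physically plausible interacting 3D compositional objects. The approach first leverages advanced generative priors to curate high-quality individual assets. To physically compose these assets, it introduces a robust two-stage composition pipeline.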
Reference score (independently calculated attention score): 31.12099147294145
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent breakthroughs in 3D generation have enabled the synthesis of high-fidelity individual assets. However, generating 3D compositional objects from single images, particularly under occlusions, remains challenging. Existing methods often degrade geometric details in hidden regions and fail to preserve the underlying object-object spatial relationships (OOR). We present Interact3D, a novel framework designed to generate physically plausible interacting 3D compositional objects. Our approach first leverages advanced generative priors to curate high-quality individual assets with a unified 3D guidance scene. To physically compose these assets, we then introduce a robust two-stage composition pipeline. Based on the 3D guidance scene, the primary object is anchored through precise global-to-local geometric alignment (registration), while subsequent geometries are integrated using a differentiable Signed Distance Field (SDF)-based optimization that explicitly penalizes geometry intersections. To reduce challenging collisions, we further deploy a closed-loop, agentic refinement strategy. A Vision-Language Model (VLM) autonomously analyzes multi-view renderings of the composed scene, formulates targeted corrective prompts, and guides an image editing module to iteratively self-correct the generation pipeline. Extensive experiments demonstrate that Interact3D successfully produces promising collision-aware compositions with improved geometric fidelity and consistent spatial relationships.
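The SDF-based intersection penalty mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and every name in it (sphere_sdf, penetration_loss_and_grad, resolve_intersection) is hypothetical. It assumes the primary object is an analytic sphere, the secondary object is a handful of surface sample points, only a translation is optimized, and the gradient is derived by hand rather than by automatic differentiation.

```python
import math

def sphere_sdf(p, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius

def penetration_loss_and_grad(points, t, center, radius):
    """Mean squared penetration depth of the translated points,
    and its gradient with respect to the translation t."""
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for p in points:
        q = [p[i] + t[i] for i in range(3)]
        d = sphere_sdf(q, center, radius)
        if d >= 0.0:
            continue  # points outside the primary object are not penalized
        dist = math.dist(q, center)
        if dist < 1e-12:
            continue  # SDF gradient is undefined at the sphere center
        loss += d * d
        for i in range(3):
            normal_i = (q[i] - center[i]) / dist  # d(sdf)/dq_i
            grad[i] += 2.0 * d * normal_i         # d(d^2)/dt_i
    n = len(points)
    return loss / n, [g / n for g in grad]

def resolve_intersection(points, center, radius, lr=0.5, steps=200, tol=1e-12):
    """Plain gradient descent on the translation (an assumed, simplified
    stand-in for the optimization described in the abstract)."""
    t = [0.0, 0.0, 0.0]
    for _ in range(steps):
        loss, grad = penetration_loss_and_grad(points, t, center, radius)
        if loss < tol:
            break
        t = [t[i] - lr * grad[i] for i in range(3)]
    return t

if __name__ == "__main__":
    # Hypothetical secondary object: four samples straddling a unit sphere.
    samples = [(0.9, 0.0, 0.0), (1.1, 0.0, 0.0), (0.9, 0.1, 0.0), (1.1, 0.1, 0.0)]
    shift = resolve_intersection(samples, (0.0, 0.0, 0.0), 1.0)
    print("translation that removes the overlap:", shift)
```

The loss is zero whenever no sample lies inside the primary object, so the optimization leaves already-separated geometry untouched and only pushes penetrating samples outward along the SDF gradient. A fuller pipeline would presumably use a learned or mesh-derived SDF and optimize rotation and scale as well; the abstract does not specify those details.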

