Fugu-MT Paper Translation (Abstract): SceneWiz3D: Towards Text-guided 3D Scene Composition

Paper overview: SceneWiz3D: Towards Text-guided 3D Scene Composition

arxiv url: http://arxiv.org/abs/2312.08885v1
Date: Wed, 13 Dec 2023 18:59:30 GMT
Status: Translation complete
Last updated in system: 2023-12-15 22:26:29.268945
Title: SceneWiz3D: Towards Text-guided 3D Scene Composition
Title (reference translation): SceneWiz3D: Toward Text-Guided 3D Scene Composition
Authors: Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang, Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying Lee
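Abstract summary: Existing approaches either use large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. This paper introduces SceneWiz3D, a new method for synthesizing high-fidelity 3D scenes from text.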
Reference score (independently computed attention score): 134.71933134180782
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We are witnessing significant breakthroughs in the technology for generating 3D objects from text. Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. Generating entire scenes, however, remains very challenging as a scene contains multiple 3D objects, diverse and scattered. In this work, we introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text. We marry the locality of objects with globality of scenes by introducing a hybrid 3D representation: explicit for objects and implicit for scenes. Remarkably, an object, being represented explicitly, can be either generated from text using conventional text-to-3D approaches, or provided by users. To configure the layout of the scene and automatically place objects, we apply the Particle Swarm Optimization technique during the optimization process. Furthermore, it is difficult for certain parts of the scene (e.g., corners, occlusion) to receive multi-view supervision, leading to inferior geometry. We incorporate an RGBD panorama diffusion model to mitigate it, resulting in high-quality geometry. Extensive evaluation supports that our approach achieves superior quality over previous approaches, enabling the generation of detailed and view-consistent 3D scenes.
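Abstract (reference translation): We are witnessing major breakthroughs in the technology for generating 3D objects from text. Existing approaches either use large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. Generating entire scenes, however, remains very difficult, because a scene contains multiple 3D objects that are diverse and scattered. In this work, we introduce SceneWiz3D, which synthesizes high-fidelity 3D scenes from text. We combine the locality of objects with the globality of scenes by introducing a hybrid 3D representation: explicit for objects and implicit for scenes. Notably, an explicitly represented object can either be generated from text using conventional text-to-3D approaches or be provided by the user. To configure the scene layout and place objects automatically, we apply Particle Swarm Optimization during the optimization process. Furthermore, certain parts of the scene (e.g., corners and occlusions) are difficult to supervise from multiple views, which leads to inferior geometry. We introduce an RGBD panorama diffusion model to mitigate this, achieving high-quality geometry. Extensive evaluation shows that our approach achieves higher quality than previous approaches and enables the generation of detailed 3D scenes.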
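The abstract names Particle Swarm Optimization as the technique used to place objects in the scene, but gives no details of the objective or parameters. As a rough illustration only, the following is a minimal, generic PSO sketch in Python. The function names (`pso_minimize`, `toy_layout_cost`), the hyperparameters, and the toy cost (squared distance of one object's 2D position from a target spot) are all assumptions made for this example and are not taken from the paper.

```python
import random


def pso_minimize(objective, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic particle swarm optimization over a box-bounded search space.

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialise positions uniformly inside the bounds, velocities at zero.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    # The swarm shares a single global best.
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_layout_cost(p, target=(1.0, -2.0)):
    """Hypothetical layout cost: squared distance of an object at (x, z) from a target."""
    return (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2


best, cost = pso_minimize(toy_layout_cost, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best, cost)
```

PSO needs only objective values, not gradients, which may be one reason it suits a discrete-feeling problem like object placement; in a real pipeline the toy cost would be replaced by whatever scene-level score the method actually optimizes.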
