Fugu-MT Paper Translation (Abstract): DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model

Paper abstract: DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model

arxiv url: http://arxiv.org/abs/2304.02827v1
Date: Thu, 6 Apr 2023 02:27:22 GMT
Status: Translation complete
Last updated in system: 2023-04-07 15:33:09.992058
Title: DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
Title (reference translation): DITTO-NeRF: Diffusion-Based Iterative Text to Omni-Directional 3D Model
Authors: Hoigi Seo, Hayeon Kim, Gwanghyun Kim, Se Young Chun
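Abstract summary: We propose a novel pipeline that generates a high-quality 3D NeRF model from a text prompt or a single image. DITTO-NeRF constructs a high-quality partial 3D object for limited in-boundary (IB) angles using a given or text-generated 2D image from the frontal view. We propose progressive 3D object reconstruction schemes in DITTO-NeRF in terms of scales (low to high resolution), angles (IB angles first, outer-boundary (OB) angles later), and masks (object to background boundary).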
Reference score (independently computed attention measure): 15.091263190886337
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The increasing demand for high-quality 3D content creation has motivated the development of automated methods for creating 3D object models from a single image and/or a text prompt. However, 3D objects reconstructed with state-of-the-art image-to-3D methods still exhibit low correspondence to the given image and low multi-view consistency. Recent state-of-the-art text-to-3D methods are also limited, yielding 3D samples with low diversity per prompt and long synthesis times. To address these challenges, we propose DITTO-NeRF, a novel pipeline to generate a high-quality 3D NeRF model from a text prompt or a single image. DITTO-NeRF first constructs a high-quality partial 3D object for limited in-boundary (IB) angles using the given or text-generated 2D image from the frontal view, and then iteratively reconstructs the remaining 3D NeRF using an inpainting latent diffusion model. We propose progressive 3D object reconstruction schemes in terms of scales (low to high resolution), angles (IB angles initially, outer-boundary (OB) angles later), and masks (object to background boundary) so that high-quality information on IB can be propagated into OB. DITTO-NeRF outperforms state-of-the-art methods in fidelity and diversity, both qualitatively and quantitatively, with much faster training times than prior art on image/text-to-3D such as DreamFusion and NeuralLift-360.
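Abstract (reference translation): Growing demand for high-quality 3D content creation has motivated the development of automated methods that create 3D object models from a single image or a text prompt. However, objects reconstructed by state-of-the-art image-to-3D methods show low correspondence to the given image and low multi-view consistency. Recent state-of-the-art text-to-3D methods are also limited: they yield 3D samples with low diversity per prompt and require long synthesis times. To address these challenges, we propose DITTO-NeRF, a new pipeline that generates a high-quality 3D NeRF model from a text prompt or a single image. DITTO-NeRF builds a high-quality partial 3D object for limited in-boundary (IB) angles from a given or text-generated frontal-view 2D image, then iteratively reconstructs the rest of the 3D NeRF with an inpainting latent diffusion model. We propose progressive 3D object reconstruction schemes over scales (low to high resolution), angles (IB angles first, outer-boundary (OB) angles later), and masks (object to background boundary), so that high-quality IB information propagates to OB. DITTO-NeRF outperforms state-of-the-art methods in fidelity and diversity, qualitatively and quantitatively, and trains much faster than prior image/text-to-3D work such as DreamFusion and NeuralLift-360.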
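
The progressive schemes named in the abstract (scales, angles, masks) can be pictured as a schedule that widens as training proceeds. The Python sketch below is only an illustration under assumed values, not the authors' implementation: the names `Stage` and `progressive_schedule`, the linear interpolation, and every number (64 to 256 pixels, a 30-degree IB half-angle, 180-degree full coverage, mask dilation in pixels) are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    resolution: int        # render resolution in pixels (hypothetical values)
    max_angle_deg: float   # half-range of sampled azimuth around the frontal view
    mask_dilation_px: int  # how far the mask reaches from the object toward the background


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def progressive_schedule(step: int, total_steps: int,
                         res=(64, 256), ib_angle_deg=30.0,
                         full_angle_deg=180.0, dilation=(0, 16)) -> Stage:
    """Return the coarse-to-fine settings for one training step.

    All defaults are illustrative assumptions. Early steps use low
    resolution, IB angles only, and a tight object mask; later steps
    raise the resolution, widen the angle range toward OB views, and
    grow the mask toward the background boundary.
    """
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    # Progress in [0, 1], clamped so out-of-range steps stay valid.
    t = min(max(step / (total_steps - 1), 0.0), 1.0)
    return Stage(
        resolution=int(round(lerp(res[0], res[1], t))),
        max_angle_deg=lerp(ib_angle_deg, full_angle_deg, t),
        mask_dilation_px=int(round(lerp(dilation[0], dilation[1], t))),
    )


if __name__ == "__main__":
    for s in range(5):
        print(s, progressive_schedule(s, 5))
```

A linear ramp is used here purely for simplicity; the abstract does not say how the three schedules are staged or whether they advance together.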

Related papers