One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View
Generation and 3D Diffusion
- URL: http://arxiv.org/abs/2311.07885v1
- Date: Tue, 14 Nov 2023 03:40:25 GMT
- Title: One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View
Generation and 3D Diffusion
- Authors: Minghua Liu, Ruoxi Shi, Linghao Chen, Zhuoyang Zhang, Chao Xu, Xinyue
Wei, Hansheng Chen, Chong Zeng, Jiayuan Gu, Hao Su
- Abstract summary: One-2-3-45++ is an innovative method that transforms a single image into a detailed 3D textured mesh in approximately one minute.
Our approach aims to fully harness the extensive knowledge embedded in 2D diffusion models and priors from valuable yet limited 3D data.
- Score: 32.29687304798145
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in open-world 3D object generation have been remarkable,
with image-to-3D methods offering superior fine-grained control over their
text-to-3D counterparts. However, most existing models fall short in
simultaneously providing rapid generation speeds and high fidelity to input
images - two features essential for practical applications. In this paper, we
present One-2-3-45++, an innovative method that transforms a single image into
a detailed 3D textured mesh in approximately one minute. Our approach aims to
fully harness the extensive knowledge embedded in 2D diffusion models and
priors from valuable yet limited 3D data. This is achieved by initially
finetuning a 2D diffusion model for consistent multi-view image generation,
followed by elevating these images to 3D with the aid of multi-view conditioned
3D native diffusion models. Extensive experimental evaluations demonstrate that
our method can produce high-quality, diverse 3D assets that closely mirror the
original input image. Our project webpage:
https://sudo-ai-3d.github.io/One2345plus_page.
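To make the two-stage recipe in the abstract concrete, below is a minimal, hypothetical sketch of the data flow: single image, then consistent multi-view images from a finetuned 2D diffusion model, then a multi-view conditioned 3D native diffusion step. The class names, view count, and volumetric output are illustrative assumptions, not the authors' released code; in the actual system a textured mesh is extracted from the 3D representation.

```python
# Sketch only: the interfaces below are assumed for illustration, not One-2-3-45++'s API.
import torch
import torch.nn as nn


class MultiViewDiffusion(nn.Module):
    """Stage 1 (assumed interface): a 2D diffusion model finetuned to emit a fixed
    set of mutually consistent views of the object in the input image."""

    def __init__(self, num_views: int = 6, image_size: int = 256):
        super().__init__()
        self.num_views = num_views
        self.image_size = image_size

    @torch.no_grad()
    def forward(self, image: torch.Tensor) -> torch.Tensor:
        # Placeholder: a real model would run iterative denoising conditioned on `image`.
        b = image.shape[0]
        return torch.rand(b, self.num_views, 3, self.image_size, self.image_size)


class MultiViewConditioned3DDiffusion(nn.Module):
    """Stage 2 (assumed interface): a native 3D diffusion model that denoises a
    volumetric representation conditioned on the generated multi-view images."""

    def __init__(self, grid_res: int = 64):
        super().__init__()
        self.grid_res = grid_res

    @torch.no_grad()
    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # Placeholder: a real model would denoise a noisy 3D grid while attending to `views`.
        b = views.shape[0]
        return torch.rand(b, 1, self.grid_res, self.grid_res, self.grid_res)


def image_to_3d(image: torch.Tensor) -> torch.Tensor:
    """End-to-end flow sketched by the abstract: image -> multi-view images -> 3D."""
    views = MultiViewDiffusion()(image)                 # consistent multi-view generation
    volume = MultiViewConditioned3DDiffusion()(views)   # multi-view conditioned 3D diffusion
    # In the real pipeline a textured mesh would be extracted from the 3D
    # representation (e.g. via marching cubes plus texture baking); this sketch stops here.
    return volume


if __name__ == "__main__":
    single_image = torch.rand(1, 3, 256, 256)
    print(image_to_3d(single_image).shape)  # torch.Size([1, 1, 64, 64, 64])
```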
Related papers
- Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models [112.2625368640425]
Hi3D (High-resolution Image-to-3D) is a new video-diffusion-based paradigm that recasts single-image-to-multi-view generation as 3D-aware sequential image generation.
Hi3D first empowers the pre-trained video diffusion model with a 3D-aware prior, yielding multi-view images with low-resolution texture details.
arXiv Detail & Related papers (2024-09-11T17:58:57Z)
- Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image [28.759158325097093]
Unique3D is a novel image-to-3D framework for efficiently generating high-quality 3D meshes from single-view images.
Our framework features state-of-the-art generation fidelity and strong generalizability.
arXiv Detail & Related papers (2024-05-30T17:59:54Z)
- Compress3D: a Compressed Latent Space for 3D Generation from a Single Image [27.53099431097921]
A triplane autoencoder encodes 3D models into a compact triplane latent space, compressing both geometry and texture information.
We introduce a 3D-aware cross-attention mechanism that uses low-resolution latent representations to query features from a high-resolution 3D feature volume (a rough sketch of this querying pattern appears after this list).
Our approach enables the generation of high-quality 3D assets in merely 7 seconds on a single A100 GPU.
arXiv Detail & Related papers (2024-03-20T11:51:04Z)
- Generic 3D Diffusion Adapter Using Controlled Multi-View Editing [44.99706994361726]
Open-domain 3D object synthesis has been lagging behind image synthesis due to limited data and higher computational complexity.
This paper proposes MVEdit, which functions as a 3D counterpart of SDEdit, employing ancestral sampling to jointly denoise multi-view images.
MVEdit achieves 3D consistency through a training-free 3D Adapter, which lifts the 2D views of the last timestep into a coherent 3D representation.
arXiv Detail & Related papers (2024-03-18T17:59:09Z)
- 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network that synthesizes new content with higher quality by exploiting the natural-image prior of the 2D diffusion model and the global 3D information of the current scene.
Our approach supports a wide variety of scenes and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present Sculpt3D, a new framework that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects, without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoint supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior [52.44678180286886]
Distillation from 2D diffusion models achieves excellent generalization and rich details without any 3D data.
We propose Sherpa3D, a new text-to-3D framework that achieves high-fidelity, generalizability, and geometric consistency simultaneously.
arXiv Detail & Related papers (2023-12-11T18:59:18Z)
- Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model [68.98311213582949]
We propose Instant3D, a novel method that generates high-quality and diverse 3D assets from text prompts in a feed-forward manner.
Our method can generate diverse 3D assets of high visual quality within 20 seconds, two orders of magnitude faster than previous optimization-based methods.
arXiv Detail & Related papers (2023-11-10T18:03:44Z)
- EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior [59.25950280610409]
We propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
In this paper, we introduce a novel 2D diffusion model that generates an image consisting of four sub-images based on the given text prompt.
We also present a 3D synthesis network that can further improve the details of the generated 3D contents.
arXiv Detail & Related papers (2023-08-25T07:39:26Z)
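As noted in the Compress3D entry above, here is a rough, hypothetical sketch of a 3D-aware cross-attention step in which low-resolution latent tokens query a high-resolution 3D feature volume. The class name, shapes, and dimensions are illustrative assumptions, not the paper's actual architecture.

```python
# Toy illustration only: low-res latent tokens attend over a flattened 3D feature volume.
import torch
import torch.nn as nn


class VolumeCrossAttention(nn.Module):
    def __init__(self, latent_dim: int = 256, volume_dim: int = 64, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(latent_dim, heads, batch_first=True)
        self.kv_proj = nn.Linear(volume_dim, latent_dim)  # lift volume features to latent width

    def forward(self, latents: torch.Tensor, volume: torch.Tensor) -> torch.Tensor:
        # latents: (B, N_low, latent_dim)   -- low-resolution latent tokens (queries)
        # volume:  (B, volume_dim, D, H, W) -- higher-resolution 3D feature volume (keys/values)
        kv = self.kv_proj(volume.flatten(2).transpose(1, 2))  # (B, D*H*W, latent_dim)
        out, _ = self.attn(query=latents, key=kv, value=kv)
        return latents + out  # residual update of the compact latents


if __name__ == "__main__":
    latents = torch.rand(2, 3 * 32 * 32, 256)  # e.g. flattened triplane tokens (assumed layout)
    volume = torch.rand(2, 64, 16, 16, 16)     # feature volume, downsized for the demo
    print(VolumeCrossAttention()(latents, volume).shape)  # torch.Size([2, 3072, 256])
```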
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.