Customize-It-3D: High-Quality 3D Creation from A Single Image Using
Subject-Specific Knowledge Prior
- URL: http://arxiv.org/abs/2312.11535v2
- Date: Tue, 9 Jan 2024 10:47:40 GMT
- Title: Customize-It-3D: High-Quality 3D Creation from A Single Image Using
Subject-Specific Knowledge Prior
- Authors: Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang
- Abstract summary: We present a novel two-stage approach that fully utilizes the information provided by the reference image to establish a customized knowledge prior for image-to-3D generation.
Experiments showcase the superiority of our method, Customize-It-3D, outperforming previous works by a substantial margin.
- Score: 33.45375100074168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a novel two-stage approach that fully utilizes the
information provided by the reference image to establish a customized knowledge
prior for image-to-3D generation. While previous approaches primarily rely on a
general diffusion prior, which struggles to yield consistent results with the
reference image, we propose a subject-specific and multi-modal diffusion model.
This model not only aids NeRF optimization, taking the shading mode into account for
improved geometry, but also refines the texture of the coarse result, yielding a
superior final appearance. Both aspects contribute to faithfully aligning the 3D
content with the subject. Extensive experiments showcase the superiority of our
method, Customize-It-3D, outperforming previous works by a substantial margin.
It produces faithful 360-degree reconstructions with impressive visual quality,
making it well-suited for various applications, including text-to-3D creation.
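Background on the prior-guided optimization this abstract refers to (not quoted from the paper): methods in this line typically adapt score distillation sampling (SDS) from DreamFusion, where a frozen 2D diffusion model supervises a differentiable 3D representation. In standard SDS notation,

```latex
\nabla_{\theta}\,\mathcal{L}_{\mathrm{SDS}}\bigl(\phi,\ \mathbf{x}=g(\theta)\bigr)
  = \mathbb{E}_{t,\boldsymbol{\epsilon}}\!\left[
      w(t)\,\bigl(\hat{\boldsymbol{\epsilon}}_{\phi}(\mathbf{x}_{t};\,y,\,t)-\boldsymbol{\epsilon}\bigr)\,
      \frac{\partial \mathbf{x}}{\partial \theta}
    \right]
```

where g(θ) renders the 3D representation θ into an image x, x_t is that rendering noised with ε at timestep t, ε̂_φ is the denoiser conditioned on y (a text prompt or, in image-to-3D settings, the reference subject), and w(t) is a timestep weighting. The sketch below turns this into runnable toy code; it is an illustration under assumptions, not the authors' implementation. TinyRadianceField, SubjectPriorUNet, and the noise schedule are hypothetical stand-ins, and the paper's shading-mode handling, multi-modal conditioning, and texture-refinement stage are omitted.

```python
# Minimal SDS-style sketch: a toy 3D representation is optimized so that its
# renders look likely under a (subject-specific) diffusion prior. Hypothetical
# stand-in modules, not the Customize-It-3D code.
import torch
import torch.nn as nn


class TinyRadianceField(nn.Module):
    """Toy stand-in for a NeRF: decodes a learnable latent into an RGB image."""

    def __init__(self, res: int = 64):
        super().__init__()
        self.res = res
        self.code = nn.Parameter(torch.randn(1, 16))
        self.decoder = nn.Sequential(
            nn.Linear(16, 256), nn.ReLU(),
            nn.Linear(256, 3 * res * res),
        )

    def render(self, view_id: int) -> torch.Tensor:
        # A real NeRF would ray-march the scene for the given camera pose;
        # here we only decode the latent so the sketch runs end to end.
        img = self.decoder(self.code).view(1, 3, self.res, self.res)
        return torch.sigmoid(img)


class SubjectPriorUNet(nn.Module):
    """Toy stand-in for a diffusion denoiser fine-tuned on the reference subject."""

    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        return self.net(x_t)  # predicts the noise added at timestep t


def optimize_with_sds(field, prior, optimizer, num_steps: int = 1000) -> None:
    alphas = torch.linspace(0.999, 0.01, 1000)  # assumed toy noise schedule
    for step in range(num_steps):
        img = field.render(view_id=step % 8)          # sample a novel view
        t = torch.randint(20, 980, (1,))
        noise = torch.randn_like(img)
        a_t = alphas[t].view(1, 1, 1, 1)
        x_t = a_t.sqrt() * img + (1 - a_t).sqrt() * noise
        eps_hat = prior(x_t, t)
        # SDS-style update: the detached noise residual, weighted by w(t) = 1 - a_t
        # (one common choice), acts as the gradient pushing the render toward
        # images the subject-specific prior finds likely.
        grad = ((1 - a_t) * (eps_hat - noise)).detach()
        loss = (grad * img).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    field = TinyRadianceField()
    prior = SubjectPriorUNet()
    optimize_with_sds(field, prior,
                      torch.optim.Adam(field.parameters(), lr=1e-3),
                      num_steps=10)
```

The design point the sketch preserves is that the noise residual is detached, so the subject-specific prior only supplies a gradient direction for the rendered views while its own weights stay frozen.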
Related papers
- Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion [63.81544586407943]
Single-image 3D portrait generation methods typically employ 2D diffusion models to provide multi-view knowledge, which is then distilled into 3D representations.
We propose a Hybrid Priors Diffusion model, which explicitly and implicitly incorporates multi-view priors as conditions to enhance the status consistency of the generated multi-view portraits.
Experiments demonstrate that our method can produce 3D portraits with accurate geometry and rich details from a single image.
arXiv Detail & Related papers (2024-11-15T17:19:18Z)
- Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model [65.58911408026748]
We propose Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts.
We first advocate leveraging text-guided 4-view images as the bottleneck in the text-to-3D pipeline.
We then introduce an attention refocusing mechanism to encourage text-aligned 4-view image generation.
arXiv Detail & Related papers (2024-04-28T04:05:10Z)
- Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors [24.478875248825563]
We propose a novel image editing technique that enables 3D manipulations on single images.
Our method directly leverages powerful image diffusion models trained on a broad spectrum of text-image pairs.
Our method can generate high-quality 3D-aware image edits with large viewpoint transformations and high appearance and shape consistency with the input image.
arXiv Detail & Related papers (2024-03-18T06:18:59Z)
- Wonder3D: Single Image to 3D using Cross-Domain Diffusion [105.16622018766236]
Wonder3D is a novel method for efficiently generating high-fidelity textured meshes from single-view images.
To holistically improve the quality, consistency, and efficiency of image-to-3D tasks, we propose a cross-domain diffusion model.
arXiv Detail & Related papers (2023-10-23T15:02:23Z)
- EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior [59.25950280610409]
We propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
In this paper, we introduce a novel 2D diffusion model that generates an image consisting of four sub-images based on the given text prompt.
We also present a 3D synthesis network that can further improve the details of the generated 3D contents.
arXiv Detail & Related papers (2023-08-25T07:39:26Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior [36.40582157854088]
In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.
We leverage prior knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision for 3D creation.
Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects and enables various applications such as text-to-3D creation and texture editing.
arXiv Detail & Related papers (2023-03-24T17:54:22Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.