Related papers: Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

URL: http://arxiv.org/abs/2503.22236v2
Date: Mon, 31 Mar 2025 03:41:01 GMT
Title: Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging
Authors: Chongjie Ye, Yushuang Wu, Ziteng Lu, Jiahao Chang, Xiaoyang Guo, Jiaqing Zhou, Hao Zhao, Xiaoguang Han
Abstract summary: Hi3DGen is a novel framework for generating high-fidelity 3D geometry from images via normal bridging. Our work provides a new direction for high-fidelity 3D geometry generation from images by leveraging normal maps as an intermediate representation.
Score: 15.36983068580743
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the growing demand for high-fidelity 3D models from 2D images, existing methods still face significant challenges in accurately reproducing fine-grained geometric details due to limitations in domain gaps and inherent ambiguities in RGB images. To address these issues, we propose Hi3DGen, a novel framework for generating high-fidelity 3D geometry from images via normal bridging. Hi3DGen consists of three key components: (1) an image-to-normal estimator that decouples the low-high frequency image pattern with noise injection and dual-stream training to achieve generalizable, stable, and sharp estimation; (2) a normal-to-geometry learning approach that uses normal-regularized latent diffusion learning to enhance 3D geometry generation fidelity; and (3) a 3D data synthesis pipeline that constructs a high-quality dataset to support training. Extensive experiments demonstrate the effectiveness and superiority of our framework in generating rich geometric details, outperforming state-of-the-art methods in terms of fidelity. Our work provides a new direction for high-fidelity 3D geometry generation from images by leveraging normal maps as an intermediate representation.

Related papers

GeoWorld: Unlocking the Potential of Geometry Models to Facilitate High-Fidelity 3D Scene Generation [68.02988074681427]
Previous works leveraging video models for image-to-3D scene generation tend to suffer from geometric distortions and blurry content. In this paper, we renovate the pipeline of image-to-3D scene generation by unlocking the potential of geometry models. Our GeoWorld can generate high-fidelity 3D scenes from a single image and a given camera trajectory, outperforming prior methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2025-11-28T13:55:45Z)
Wonder3D++: Cross-domain Diffusion for High-fidelity 3D Generation from a Single Image [68.55613894952177]
We introduce Wonder3D++, a novel method for efficiently generating high-fidelity textured meshes from single-view images. We propose a cross-domain diffusion model that generates multi-view normal maps and the corresponding color images. Lastly, we introduce a cascaded 3D mesh extraction algorithm that drives high-quality surfaces from the multi-view 2D representations in only about 3 minutes in a coarse-to-fine manner.
arXiv Detail & Related papers (2025-11-03T17:24:18Z)
Generating Surface for Text-to-3D using 2D Gaussian Splatting [7.610379621632961]
We propose a novel method named DirectGaussian, which focuses on generating the surfaces of 3D objects represented by surfels. In DirectGaussian, we utilize conditional text generation models and the surface of a 3D object is rendered by 2D Gaussian splatting. Our framework is capable of achieving diverse and high-fidelity 3D content creation.
arXiv Detail & Related papers (2025-10-08T12:54:57Z)
Geometry and Perception Guided Gaussians for Multiview-consistent 3D Generation from a Single Image [10.36303976374455]
Existing approaches often rely on fine-tuning pretrained 2D diffusion models or directly generating 3D information through fast network inference. We present a novel method that seamlessly integrates geometry and perception priors without requiring additional model training. Experiments demonstrate the higher-fidelity reconstruction results of our method, outperforming existing methods on novel view synthesis and 3D reconstruction.
arXiv Detail & Related papers (2025-06-26T11:22:06Z)
High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model [38.13429047918231]
We propose a novel hybrid Voxel-Gaussian representation, where a 3D voxel representation contains explicit 3D geometric information. Our 3D voxel representation is obtained by a fusion module that aligns RGB features and surface normal features, both of which can be estimated from 2D images.
arXiv Detail & Related papers (2025-04-02T08:58:34Z)
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models [69.0220314849478]
TripoSG is a new streamlined shape diffusion paradigm capable of generating high-fidelity 3D meshes with precise correspondence to input images. The resulting 3D shapes exhibit enhanced detail due to high-resolution capabilities and demonstrate exceptional fidelity to input images. To foster progress and innovation in the field of 3D generation, we will make our model publicly available.
arXiv Detail & Related papers (2025-02-10T16:07:54Z)
Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild. Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture. We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation [27.43973967994717]
MT3D is a text-to-3D generative model that leverages a high-fidelity 3D object to overcome viewpoint bias. By incorporating geometric details from a 3D asset, MT3D enables the creation of diverse and geometrically consistent objects.
arXiv Detail & Related papers (2024-08-12T06:25:44Z)
GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory. Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images. GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
GeoGS3D: Single-view 3D Reconstruction via Geometric-aware Diffusion Model and Gaussian Splatting [81.03553265684184]
We introduce GeoGS3D, a framework for reconstructing detailed 3D objects from single-view images. We propose a novel metric, Gaussian Divergence Significance (GDS), to prune unnecessary operations during optimization. Experiments demonstrate that GeoGS3D generates images with high consistency across views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
Wonder3D: Single Image to 3D using Cross-Domain Diffusion [105.16622018766236]
Wonder3D is a novel method for efficiently generating high-fidelity textured meshes from single-view images. To holistically improve the quality, consistency, and efficiency of image-to-3D tasks, we propose a cross-domain diffusion model.
arXiv Detail & Related papers (2023-10-23T15:02:23Z)
High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views. Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent. In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations. We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.