ShapeGen: Towards High-Quality 3D Shape Synthesis
- URL: http://arxiv.org/abs/2511.20624v1
- Date: Tue, 25 Nov 2025 18:47:27 GMT
- Title: ShapeGen: Towards High-Quality 3D Shape Synthesis
- Authors: Yangguang Li, Xianglong He, Zi-Xin Zou, Zexiang Liu, Wanli Ouyang, Ding Liang, Yan-Pei Cao
- Abstract summary: 3D shape generation has made notable progress, enabling the rapid synthesis of high-fidelity 3D assets from a single image.
However, current methods still face challenges, including the lack of intricate details, overly smoothed surfaces, and fragmented thin-shell structures.
We present ShapeGen, which achieves high-quality image-to-3D shape generation through 3D representation and supervision improvements, resolution scaling up, and the advantages of linear transformers.
- Score: 67.37531089151034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by generative paradigms in image and video, 3D shape generation has made notable progress, enabling the rapid synthesis of high-fidelity 3D assets from a single image. However, current methods still face challenges, including the lack of intricate details, overly smoothed surfaces, and fragmented thin-shell structures. These limitations leave the generated 3D assets still one step short of meeting the standards favored by artists. In this paper, we present ShapeGen, which achieves high-quality image-to-3D shape generation through 3D representation and supervision improvements, resolution scaling up, and the advantages of linear transformers. These advancements allow the generated assets to be seamlessly integrated into 3D pipelines, facilitating their widespread adoption across various applications. Through extensive experiments, we validate the impact of these improvements on overall performance. Ultimately, thanks to the synergistic effects of these enhancements, ShapeGen achieves a significant leap in image-to-3D generation, establishing a new state-of-the-art performance.
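The abstract credits linear transformers with enabling resolution scaling. The digest does not describe ShapeGen's actual architecture, so purely as an illustrative sketch, here is kernelized linear attention in the style of Katharopoulos et al. (2020), which replaces the O(N²) cost of softmax attention with O(N·d²) by reassociating the matrix products:

```python
import numpy as np

def feature_map(x):
    # ELU(x) + 1 keeps features positive, a common kernel choice
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention: O(N * d^2) instead of O(N^2 * d).

    Q, K: (N, d) queries/keys; V: (N, d_v) values.
    Computes phi(Q) @ (phi(K).T @ V) with a row-wise normalizer,
    never materializing the N x N attention matrix.
    """
    Qf, Kf = feature_map(Q), feature_map(K)
    KV = Kf.T @ V                   # (d, d_v): keys/values summarized once
    Z = Qf @ Kf.sum(axis=0)         # (N,): per-query normalizer
    return (Qf @ KV) / (Z[:, None] + eps)

rng = np.random.default_rng(0)
N, d = 1024, 64                     # cost stays linear as N grows
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)                    # -> (1024, 64)
```

Because keys and values are compressed into a single d x d_v summary, doubling the token count (e.g., from a higher-resolution latent grid) roughly doubles, rather than quadruples, the attention cost, which is the property that makes resolution scaling tractable.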
Related papers
- Self-Evolving 3D Scene Generation from a Single Image [44.87957263540352]
EvoScene is a training-free framework that progressively reconstructs complete 3D scenes from single images.
EvoScene alternates between 2D and 3D domains, gradually improving both structure and appearance.
arXiv Detail & Related papers (2025-12-09T18:44:21Z)
- Photo3D: Advancing Photorealistic 3D Generation through Structure-Aligned Detail Enhancement [12.855027334688382]
Photo3D is a framework for advancing 3D generation, driven by image data from the GPT-4o-Image model.
We present a realistic detail enhancement scheme that leverages perceptual feature adaptation and semantic structure matching to enforce appearance consistency.
Our scheme is general to different 3D-native generators, and we present dedicated training strategies to facilitate the optimization of geometry-texture coupled and decoupled 3D-native generation paradigms.
arXiv Detail & Related papers (2025-12-09T12:33:48Z)
- Single Image to High-Quality 3D Object via Latent Features [7.610379621632961]
We introduce LatentDreamer, a framework for generating 3D objects from single images.
A pre-trained variational autoencoder maps 3D geometries to latent features.
LatentDreamer generates coarse geometries, refined geometries, and realistic textures sequentially.
arXiv Detail & Related papers (2025-11-24T02:31:04Z)
- TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models [69.0220314849478]
TripoSG is a new streamlined shape diffusion paradigm capable of generating high-fidelity 3D meshes with precise correspondence to input images.
The resulting 3D shapes exhibit enhanced detail due to high-resolution capabilities and demonstrate exceptional fidelity to input images.
To foster progress and innovation in the field of 3D generation, we will make our model publicly available.
arXiv Detail & Related papers (2025-02-10T16:07:54Z)
- LAM3D: Large Image-Point-Cloud Alignment Model for 3D Reconstruction from Single Image [64.94932577552458]
Large Reconstruction Models have made significant strides in the realm of automated 3D content generation from single or multiple input images.
Despite their success, these models often produce 3D meshes with geometric inaccuracies, stemming from the inherent challenges of deducing 3D shapes solely from image data.
We introduce a novel framework, the Large Image and Point Cloud Alignment Model (LAM3D), which utilizes 3D point cloud data to enhance the fidelity of generated 3D meshes.
arXiv Detail & Related papers (2024-05-24T15:09:12Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent.
In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations.
We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.