Related papers: Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

URL: http://arxiv.org/abs/2501.12202v2
Date: Wed, 22 Jan 2025 12:01:39 GMT
Title: Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
Authors: Zibo Zhao, Zeqiang Lai, Qingxiang Lin, Yunfei Zhao, Haolin Liu, Shuhui Yang, Yifei Feng, Mingxin Yang, Sheng Zhang, Xianghui Yang, Huiwen Shi, Sicong Liu, Junta Wu, Yihang Lian, Fan Yang, Ruining Tang, Zebin He, Xinzhou Wang, Jian Liu, Xuhui Zuo, Zhuo Chen, Biwen Lei, Haohan Weng, Jing Xu, Yiling Zhu, Xinhai Liu, Lixin Xu, Changrong Hu, Tianyu Huang, Lifu Wang, Jihong Zhang, Meng Chen, Liang Dong, Yiwen Jia, Yulin Cai, Jiaao Yu, Yixuan Tang, Hao Zhang, Zheng Ye, Peng He, Runzhou Wu, Chao Zhang, Yonghao Tan, Jie Xiao, Yangyu Tao, Jianchen Zhu, Jinbao Xue, Kai Liu, Chongqing Zhao, Xinming Wu, Zhichao Hu, Lei Qin, Jianbing Peng, Zhan Li, Minghui Chen, Xipeng Zhang, Lin Niu, Paige Wang, Yingkai Wang, Haozhao Kuang, Zhongyi Fan, Xu Zheng, Weihao Zhuang, YingPing He, Tian Liu, Yong Yang, Di Wang, Yuhong Liu, Jie Jiang, Jingwei Huang, Chunchao Guo,
Abstract summary: Hunyuan3D 2.0 is an advanced large-scale 3D synthesis system for generating high-resolution textured 3D assets.<n>The shape generative model, built on a scalable flow-based diffusion transformer, aims to create geometry that properly aligns with a given condition image.<n>The texture synthesis model, benefiting from strong geometric and diffusion priors, produces high-resolution and vibrant texture maps.
Score: 42.50869050898905
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present Hunyuan3D 2.0, an advanced large-scale 3D synthesis system for generating high-resolution textured 3D assets. This system includes two foundation components: a large-scale shape generation model -- Hunyuan3D-DiT, and a large-scale texture synthesis model -- Hunyuan3D-Paint. The shape generative model, built on a scalable flow-based diffusion transformer, aims to create geometry that properly aligns with a given condition image, laying a solid foundation for downstream applications. The texture synthesis model, benefiting from strong geometric and diffusion priors, produces high-resolution and vibrant texture maps for either generated or hand-crafted meshes. Furthermore, we build Hunyuan3D-Studio -- a versatile, user-friendly production platform that simplifies the re-creation process of 3D assets. It allows both professional and amateur users to manipulate or even animate their meshes efficiently. We systematically evaluate our models, showing that Hunyuan3D 2.0 outperforms previous state-of-the-art models, including the open-source models and closed-source models in geometry details, condition alignment, texture quality, and etc. Hunyuan3D 2.0 is publicly released in order to fill the gaps in the open-source 3D community for large-scale foundation generative models. The code and pre-trained weights of our models are available at: https://github.com/Tencent/Hunyuan3D-2

Related papers

Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details [23.393893197088843]
Hunyuan3D 2.5 is a robust suite of 3D diffusion models aimed at generating high-fidelity and detailed textured 3D assets.<n>In terms of shape generation, we introduce a new shape foundation model -- LATTICE, which is trained with scaled high-quality datasets, model-size, and compute.<n>In terms of texture generation, it is upgraded with phyiscal-based rendering (PBR) via a novel multi-view architecture extended from Hunyuan3D 2.0 Paint model.
arXiv Detail & Related papers (2025-06-19T17:57:40Z)
Hunyuan3D 2.1: From Images to High-Fidelity 3D Assets with Production-Ready PBR Material [36.1986525609658]
Hunyuan3D 2.1 is an advanced system for producing high-resolution, textured 3D assets.<n>This tutorial offers a step-by-step guide on processing 3D data, training a 3D generative model, and evaluating its performance.
arXiv Detail & Related papers (2025-06-18T13:14:46Z)
Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation [15.374873279207623]
We introduce Kiss3DGen (Keep It Simple and Straightforward in 3D Generation), an efficient framework for generating, editing, and enhancing 3D objects. Specifically, we fine-tune a diffusion model to generate ''3D Bundle Image'', a tiled representation composed of multi-view images and their corresponding normal maps. This simple method effectively transforms the 3D generation problem into a 2D image generation task, maximizing the utilization of knowledge in pretrained diffusion models.
arXiv Detail & Related papers (2025-03-03T10:07:19Z)
Structured 3D Latents for Scalable and Versatile 3D Generation [28.672494137267837]
We introduce a novel 3D generation method for versatile and high-quality 3D asset creation.<n>The cornerstone is a unified Structured LATent representation which allows decoding to different output formats.<n>This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model.
arXiv Detail & Related papers (2024-12-02T13:58:38Z)
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion [86.25111098482537]
We introduce 3DTopia-XL, a scalable native 3D generative model designed to overcome limitations of existing methods. 3DTopia-XL leverages a novel primitive-based 3D representation, PrimX, which encodes detailed shape, albedo, and material field into a compact tensorial format. On top of the novel representation, we propose a generative framework based on Diffusion Transformer (DiT), which comprises 1) Primitive Patch Compression, 2) and Latent Primitive Diffusion. We conduct extensive qualitative and quantitative experiments to demonstrate that 3DTopia-XL significantly outperforms existing methods in generating high-
arXiv Detail & Related papers (2024-09-19T17:59:06Z)
DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts. Our model is directly trained on extensive noisy and unaligned in-the-wild' 3D assets. We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
Breathing New Life into 3D Assets with Generative Repainting [74.80184575267106]
Diffusion-based text-to-image models ignited immense attention from the vision community, artists, and content creators. Recent works have proposed various pipelines powered by the entanglement of diffusion models and neural fields. We explore the power of pretrained 2D diffusion models and standard 3D neural radiance fields as independent, standalone tools. Our pipeline accepts any legacy renderable geometry, such as textured or untextured meshes, and orchestrates the interaction between 2D generative refinement and 3D consistency enforcement tools.
arXiv Detail & Related papers (2023-09-15T16:34:51Z)
GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [72.15855070133425]
We introduce GET3D, a Generative model that directly generates Explicit Textured 3D meshes with complex topology, rich geometric details, and high-fidelity textures. GET3D is able to generate high-quality 3D textured meshes, ranging from cars, chairs, animals, motorbikes and human characters to buildings.
arXiv Detail & Related papers (2022-09-22T17:16:19Z)
Generative VoxelNet: Learning Energy-Based Models for 3D Shape Synthesis and Analysis [143.22192229456306]
This paper proposes a deep 3D energy-based model to represent volumetric shapes. The benefits of the proposed model are six-fold. Experiments demonstrate that the proposed model can generate high-quality 3D shape patterns.
arXiv Detail & Related papers (2020-12-25T06:09:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.