SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation
- URL: http://arxiv.org/abs/2403.10166v1
- Date: Fri, 15 Mar 2024 10:18:56 GMT
- Title: SemanticHuman-HD: High-Resolution Semantic Disentangled 3D Human Generation
- Authors: Peng Zheng, Tao Liu, Zili Yi, Rui Ma
- Abstract summary: We introduce SemanticHuman-HD, the first method to achieve semantic disentangled human image synthesis.
SemanticHuman-HD is also the first method to achieve 3D-aware image synthesis at $1024^2$ resolution.
Our method opens up exciting possibilities for various applications, including 3D garment generation, semantic-aware image synthesis, controllable image synthesis, and out-of-domain image synthesis.
- Score: 12.063815354055052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the development of neural radiance fields and generative models, numerous methods have been proposed for learning 3D human generation from 2D images. These methods allow control over the pose of the generated 3D human and enable rendering from different viewpoints. However, none of these methods explores semantic disentanglement in human image synthesis, i.e., they cannot disentangle the generation of different semantic parts, such as the body, tops, and bottoms. Furthermore, existing methods are limited to synthesizing images at $512^2$ resolution due to the high computational cost of neural radiance fields. To address these limitations, we introduce SemanticHuman-HD, the first method to achieve semantic disentangled human image synthesis. Notably, SemanticHuman-HD is also the first method to achieve 3D-aware image synthesis at $1024^2$ resolution, benefiting from our proposed 3D-aware super-resolution module. By leveraging depth maps and semantic masks as guidance for the 3D-aware super-resolution, we significantly reduce the number of sampling points during volume rendering, thereby reducing the computational cost. Our comparative experiments demonstrate the superiority of our method. The effectiveness of each proposed component is also verified through ablation studies. Moreover, our method opens up exciting possibilities for various applications, including 3D garment generation, semantic-aware image synthesis, controllable image synthesis, and out-of-domain image synthesis.
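To make the guided super-resolution idea concrete, below is a minimal sketch, assuming a PyTorch setting, of how coarse depth and semantic masks could gate high-resolution volume rendering: background rays are skipped entirely, and foreground rays are sampled only in a narrow band around the upsampled coarse depth. All names, shapes, and the banded-sampling rule are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of depth/semantics-guided sampling for 3D-aware
# super-resolution. All names, shapes, and the banded-sampling rule are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn.functional as F

def guided_sample_depths(depth_lr, seg_lr, hr_size, n_band=8, band=0.05):
    """Return per-ray sample depths for the high-resolution pass.

    depth_lr: (1, 1, h, w) coarse depth from the low-res rendering.
    seg_lr:   (1, 1, h, w) coarse semantic mask (0 = background).
    hr_size:  (H, W) target resolution, e.g. (1024, 1024).
    """
    # Upsample coarse depth and semantics to the target resolution.
    depth_hr = F.interpolate(depth_lr, size=hr_size, mode='bilinear',
                             align_corners=False)
    seg_hr = F.interpolate(seg_lr, size=hr_size, mode='nearest')

    fg = seg_hr[0, 0] > 0                # background rays get no samples
    d = depth_hr[0, 0][fg]               # (R,) coarse depth per kept ray
    # Sample only a thin band [d - band, d + band] along each kept ray,
    # instead of densely sampling the whole ray.
    offsets = torch.linspace(-band, band, n_band)
    z_vals = d[:, None] + offsets[None, :]          # (R, n_band)
    return z_vals, fg

z, fg = guided_sample_depths(torch.rand(1, 1, 128, 128),
                             (torch.rand(1, 1, 128, 128) > 0.5).float(),
                             (1024, 1024))
```

Skipping background rays and shrinking each foreground ray to a thin band around the coarse depth is, per the abstract, where the computational savings that enable $1024^2$ rendering would come from.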
Related papers
- InceptionHuman: Controllable Prompt-to-NeRF for Photorealistic 3D Human Generation [61.62346472443454]
InceptionHuman is a prompt-to-NeRF framework that allows easy control via a combination of prompts in different modalities to generate photorealistic 3D humans.
InceptionHuman achieves consistent 3D human generation within a progressively refined NeRF space.
arXiv Detail & Related papers (2023-11-27T15:49:41Z)
- Single-Image 3D Human Digitization with Shape-Guided Diffusion [31.99621159464388]
NeRF and its variants typically require videos or images from different viewpoints.
We present an approach to generate a 360-degree view of a person with a consistent, high-resolution appearance from a single input image.
arXiv Detail & Related papers (2023-11-15T18:59:56Z)
- GETAvatar: Generative Textured Meshes for Animatable Human Avatars [69.56959932421057]
We study the problem of 3D-aware full-body human generation, aiming at creating animatable human avatars with high-quality geometries and textures.
We propose GETAvatar, a Generative model that directly generates Explicit Textured 3D meshes for animatable human Avatars.
arXiv Detail & Related papers (2023-10-04T10:30:24Z)
- Semantic 3D-aware Portrait Synthesis and Manipulation Based on Compositional Neural Radiance Field [55.431697263581626]
We propose a Compositional Neural Radiance Field (CNeRF) for semantic 3D-aware portrait synthesis and manipulation.
CNeRF divides the image into semantic regions, learns an independent neural radiance field for each region, and finally fuses them to render the complete image (a minimal fusion sketch follows this entry).
Compared to state-of-the-art 3D-aware GAN methods, our approach enables fine-grained semantic region manipulation, while maintaining high-quality 3D-consistent synthesis.
arXiv Detail & Related papers (2023-02-03T07:17:46Z)
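Here is a minimal sketch in the spirit of CNeRF's per-region fields: independent fields are queried at shared sample points, densities are summed, and colors are blended by each region's density share. The density-weighted rule is a common compositional-NeRF choice and an assumption here, not necessarily CNeRF's exact formulation.

```python
# Sketch of fusing per-semantic-region radiance fields into one field,
# in the spirit of CNeRF's "one NeRF per region, then fuse" idea. The
# density-weighted fusion rule is an assumption, not CNeRF's exact math.
import torch

def fuse_regions(sigmas, colors, eps=1e-8):
    """Fuse K per-region fields evaluated at shared sample points.

    sigmas: (K, N) per-region densities at N sample points.
    colors: (K, N, 3) per-region RGB at the same points.
    Returns fused (N,) density and (N, 3) color.
    """
    sigma = sigmas.sum(dim=0)                       # total density
    weights = sigmas / (sigma[None] + eps)          # per-region share
    rgb = (weights[..., None] * colors).sum(dim=0)  # density-weighted color
    return sigma, rgb

# Example: fuse body / tops / bottoms fields at 128 sample points.
K, N = 3, 128
sigma, rgb = fuse_regions(torch.rand(K, N), torch.rand(K, N, 3))
```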
- Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator [68.0533826852601]
3D-aware image synthesis aims at learning a generative model that can render photo-realistic 2D images while capturing decent underlying 3D shapes.
However, existing methods fail to recover even moderately accurate 3D shapes.
We propose a geometry-aware discriminator to improve 3D-aware GANs; a toy sketch of one possible design follows this entry.
arXiv Detail & Related papers (2022-09-30T17:59:37Z)
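One plausible reading of "geometry-aware", and purely an assumption here, is a discriminator that scores the rendered depth alongside the RGB image, so the generator is penalized directly for implausible geometry. The 4-channel design below is illustrative, not the paper's actual architecture.

```python
# Toy "geometry-aware" discriminator: judge RGB and rendered depth
# jointly. This 4-channel design is an illustrative assumption only.
import torch
import torch.nn as nn

class GeometryAwareDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(                 # tiny conv stack for clarity
            nn.Conv2d(4, 64, 4, stride=2, padding=1),   # RGB + depth input
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, 1),                    # real/fake logit
        )

    def forward(self, rgb, depth):
        # Concatenate image and geometry channels before scoring.
        return self.net(torch.cat([rgb, depth], dim=1))

d = GeometryAwareDiscriminator()
logit = d(torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64))
```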
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
- 3D-Aware Semantic-Guided Generative Model for Human Synthesis [67.86621343494998]
This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis.
Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines.
arXiv Detail & Related papers (2021-12-02T17:10:53Z)
- Detailed 3D Human Body Reconstruction from Multi-view Images Combining Voxel Super-Resolution and Learned Implicit Representation [12.459968574683625]
We propose a coarse-to-fine method to reconstruct a detailed 3D human body from multi-view images.
The coarse 3D models are estimated by learning an implicit representation based on multi-scale features.
Refined, detailed 3D human body models are then produced by voxel super-resolution, which preserves fine details (a minimal upsampler sketch follows this entry).
arXiv Detail & Related papers (2020-12-11T08:07:39Z)
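The coarse-to-fine step can be sketched as a small 3D CNN that upsamples a coarse occupancy grid to a finer one; the architecture below is an illustrative assumption, not the paper's network.

```python
# Minimal voxel super-resolution sketch: a 3D CNN that doubles the
# resolution of a coarse occupancy grid. Illustrative assumption only.
import torch
import torch.nn as nn

class VoxelSuperResolution(nn.Module):
    """Upsample a coarse occupancy grid 2x along each axis."""
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, ch, 3, padding=1),
            nn.ReLU(),
            # Transposed conv doubles the resolution along each axis.
            nn.ConvTranspose3d(ch, ch, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv3d(ch, 1, 3, padding=1),     # refined occupancy logits
        )

    def forward(self, coarse):                  # (B, 1, D, H, W)
        return self.net(coarse)                 # (B, 1, 2D, 2H, 2W)

sr = VoxelSuperResolution()
fine = sr(torch.rand(1, 1, 32, 32, 32))         # -> (1, 1, 64, 64, 64)
```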
- GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis [43.4859484191223]
We propose a generative model for radiance fields, which have recently proven successful for novel view synthesis of a single scene.
By introducing a multi-scale patch-based discriminator, we demonstrate synthesis of high-resolution images while training our model from unposed 2D images alone (a patch-sampling sketch follows this entry).
arXiv Detail & Related papers (2020-07-05T20:37:39Z)
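GRAF's patch trick is worth a short sketch: the discriminator always sees a fixed K x K patch, but the stride (scale) at which its pixels are spaced is drawn at random, so only K*K rays need rendering per image regardless of output resolution. The function below sketches that sampling; the names and the scale range are assumptions.

```python
# Sketch of GRAF-style multi-scale patch sampling: a K x K patch at a
# random stride keeps the rendering cost constant across image scales.
# Names and the scale range are illustrative assumptions.
import torch

def sample_patch_coords(img_size, K=16):
    """Return (K*K, 2) pixel coordinates of one random-scale patch."""
    # Draw a random stride; at stride s the patch spans s*(K-1)+1 pixels.
    s_max = (img_size - 1) // (K - 1)
    s = torch.randint(1, s_max + 1, (1,)).item()
    span = s * (K - 1)
    # Random top-left corner keeping the whole patch inside the image.
    u0 = torch.randint(0, img_size - span, (1,)).item()
    v0 = torch.randint(0, img_size - span, (1,)).item()
    us = torch.arange(K) * s + u0
    vs = torch.arange(K) * s + v0
    grid = torch.stack(torch.meshgrid(us, vs, indexing='ij'), dim=-1)
    return grid.reshape(-1, 2)                  # rays to render for this patch

coords = sample_patch_coords(1024)              # 256 rays instead of 1024^2
```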