InceptionHuman: Controllable Prompt-to-NeRF for Photorealistic 3D Human Generation
- URL: http://arxiv.org/abs/2311.16499v2
- Date: Tue, 6 Aug 2024 06:31:34 GMT
- Title: InceptionHuman: Controllable Prompt-to-NeRF for Photorealistic 3D Human Generation
- Authors: Shiu-hong Kao, Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang
- Abstract summary: InceptionHuman is a prompt-to-NeRF framework that allows easy control via a combination of prompts in different modalities to generate photorealistic 3D humans.
InceptionHuman achieves consistent 3D human generation within a progressively refined NeRF space.
- Score: 61.62346472443454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents InceptionHuman, a prompt-to-NeRF framework that allows easy control via a combination of prompts in different modalities (e.g., text, poses, edges, segmentation maps) as inputs to generate photorealistic 3D humans. While many works have focused on generating 3D human models, they suffer from one or more of the following: lack of distinctive features, unnatural shading/shadows, unnatural poses/clothes, limited views, etc. InceptionHuman achieves consistent 3D human generation within a progressively refined NeRF space with two novel modules, Iterative Pose-Aware Refinement (IPAR) and Progressive-Augmented Reconstruction (PAR). IPAR iteratively refines the diffusion-generated images and synthesizes high-quality 3D-aware views by considering the RGB values of close-pose views. PAR employs a pretrained diffusion prior to augment the generated synthetic views and adds regularization for view-independent appearance. Overall, the synthesis of photorealistic novel views empowers the resulting 3D human NeRF from 360-degree perspectives. Extensive qualitative and quantitative comparisons show that InceptionHuman achieves state-of-the-art quality.
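The abstract describes an alternating two-module loop. Below is a minimal structural sketch of how such a pipeline could be organized; every name here (ipar_refine, par_augment, train_inceptionhuman) is a hypothetical placeholder inferred from the abstract, not the authors' actual API, and the diffusion and NeRF components are reduced to stubs.

```python
# Hypothetical sketch of the IPAR/PAR loop described in the abstract.
# All names are placeholders; the paper's real interfaces may differ.
import numpy as np

def ipar_refine(view: np.ndarray, close_pose_views: list) -> np.ndarray:
    """IPAR stand-in: pull a diffusion-generated view toward the RGB
    statistics of close-pose views to keep the set 3D-consistent."""
    if not close_pose_views:
        return view
    anchor = np.mean(close_pose_views, axis=0)
    return 0.5 * view + 0.5 * anchor  # real method uses diffusion refinement

def par_augment(view: np.ndarray) -> np.ndarray:
    """PAR stand-in: a pretrained diffusion prior would augment the view;
    mild noise is used here only as a placeholder."""
    return view + 0.01 * np.random.randn(*view.shape)

def train_inceptionhuman(seed_views: list, n_rounds: int = 3) -> list:
    views = list(seed_views)
    for _ in range(n_rounds):  # progressively refined NeRF space
        # Close-pose selection is omitted; all views are averaged here.
        views = [ipar_refine(v, views) for v in views]  # IPAR
        views = [par_augment(v) for v in views]         # PAR
        # ...fit/finetune the NeRF on `views` with a view-independence
        # regularizer at this point (omitted)...
    return views

if __name__ == "__main__":
    seeds = [np.random.rand(8, 8, 3) for _ in range(4)]
    refined = train_inceptionhuman(seeds)
    print(len(refined), refined[0].shape)
```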
Related papers
- Progress and Prospects in 3D Generative AI: A Technical Overview including 3D human [51.58094069317723]
This paper aims to provide a comprehensive overview and summary of the relevant papers published mostly during the latter half of 2023.
It begins by discussing AI-generated 3D object models, followed by generated 3D human models, and finally generated 3D human motions, culminating in a summary and a vision for the future.
arXiv Detail & Related papers (2024-01-05T03:41:38Z)
- HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion [53.1558345421646]
We propose HumanRef, a framework for 3D human generation from a single-view input.
To ensure the generated 3D model is photorealistic and consistent with the input image, HumanRef introduces a novel method called reference-guided score distillation sampling.
Experimental results demonstrate that HumanRef outperforms state-of-the-art methods in generating 3D clothed humans.
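Reference-guided score distillation sampling builds on vanilla score distillation sampling (SDS). The sketch below shows plain SDS in PyTorch under assumed interfaces: `add_noise` and `predict_noise` are hypothetical stand-ins for a diffusion model's scheduler and denoiser, and HumanRef's reference guidance (conditioning on the input image) is not reproduced here.

```python
# Minimal sketch of vanilla SDS; the reference-guided variant additionally
# conditions the denoiser on the input image. `diffusion.add_noise` and
# `diffusion.predict_noise` are assumed interfaces, not a real library API.
import torch

def sds_grad(rendered: torch.Tensor, diffusion, t: torch.Tensor,
             cond: torch.Tensor) -> torch.Tensor:
    noise = torch.randn_like(rendered)
    noisy = diffusion.add_noise(rendered, noise, t)  # forward diffusion
    with torch.no_grad():
        eps_pred = diffusion.predict_noise(noisy, t, cond)  # frozen denoiser
    w = 1.0  # timestep weighting w(t); schedule-dependent in practice
    # This residual is backpropagated through the renderer into the 3D params.
    return w * (eps_pred - noise)
```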
arXiv Detail & Related papers (2023-11-28T17:06:28Z)
- Single-Image 3D Human Digitization with Shape-Guided Diffusion [31.99621159464388]
NeRF and its variants typically require videos or images from different viewpoints.
We present an approach to generate a 360-degree view of a person with a consistent, high-resolution appearance from a single input image.
arXiv Detail & Related papers (2023-11-15T18:59:56Z)
- SHERF: Generalizable Human NeRF from a Single Image [59.10589479808622]
SHERF is the first generalizable Human NeRF model for recovering animatable 3D humans from a single input image.
We propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
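As a rough illustration of combining the three feature granularities the summary mentions, here is a hypothetical fusion step in PyTorch; the shapes and names are assumptions for the sketch, not SHERF's actual implementation.

```python
# Hypothetical fusion of global, point-level, and pixel-aligned features
# for N query points; SHERF's real hierarchy and shapes may differ.
import torch

def fuse_hierarchical(global_feat: torch.Tensor,   # (B, Cg)
                      point_feat: torch.Tensor,    # (B, N, Cp)
                      pixel_feat: torch.Tensor     # (B, N, Cx)
                      ) -> torch.Tensor:
    B, N, _ = point_feat.shape
    g = global_feat.unsqueeze(1).expand(B, N, -1)  # broadcast to every point
    return torch.cat([g, point_feat, pixel_feat], dim=-1)  # (B, N, Cg+Cp+Cx)

if __name__ == "__main__":
    out = fuse_hierarchical(torch.randn(2, 16),
                            torch.randn(2, 100, 32),
                            torch.randn(2, 100, 8))
    print(out.shape)  # torch.Size([2, 100, 56])
```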
arXiv Detail & Related papers (2023-03-22T17:59:12Z)
- Refining 3D Human Texture Estimation from a Single Image [3.8761064607384195]
Estimating 3D human texture from a single image is essential in graphics and vision.
We propose a framework that adaptively samples the input with a deformable convolution whose offsets are learned by a deep neural network.
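The sketch below illustrates adaptive sampling with a deformable convolution using `torchvision.ops.DeformConv2d`, in the spirit of the summary; the module and offset-network sizes are assumptions, and the paper's actual architecture is not reproduced.

```python
# Adaptive sampling via deformable convolution: a small network predicts
# 2D offsets (dx, dy) per kernel tap, which steer where the conv samples.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class AdaptiveSampler(nn.Module):
    def __init__(self, channels: int = 64, k: int = 3):
        super().__init__()
        # 2 * k * k offset channels: one (dx, dy) pair per kernel location.
        self.offset_net = nn.Conv2d(channels, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(channels, channels, k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        offsets = self.offset_net(x)   # learned sampling offsets
        return self.deform(x, offsets)  # sample the input adaptively

if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    print(AdaptiveSampler()(x).shape)  # torch.Size([1, 64, 32, 32])
```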
arXiv Detail & Related papers (2023-03-06T19:53:50Z)
- HDHumans: A Hybrid Approach for High-fidelity Digital Humans [107.19426606778808]
HDHumans is the first method for HD human character synthesis that jointly produces an accurate and temporally coherent 3D deforming surface.
Our method is carefully designed to achieve a synergy between classical surface deformation and neural radiance fields (NeRF).
arXiv Detail & Related papers (2022-10-21T14:42:11Z)
- Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
We present a novel view synthesis framework that generates realistic renders of any human from unseen views, given sparse RGB-D input captured by a single-view sensor.
An enhancer network improves the overall fidelity, even in areas occluded in the original view, producing crisp renders with fine details.
arXiv Detail & Related papers (2021-12-27T20:13:53Z)
- 3D-Aware Semantic-Guided Generative Model for Human Synthesis [67.86621343494998]
This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis.
Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines.
arXiv Detail & Related papers (2021-12-02T17:10:53Z)