Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent
Synthetic Images
- URL: http://arxiv.org/abs/2311.16499v1
- Date: Mon, 27 Nov 2023 15:49:41 GMT
- Title: Deceptive-Human: Prompt-to-NeRF 3D Human Generation with 3D-Consistent
Synthetic Images
- Authors: Shiu-hong Kao, Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang
- Abstract summary: Deceptive-Human is a novel framework capitalizing on state-of-the-art control diffusion models (e.g., ControlNet) to generate a high-quality controllable 3D human NeRF.
Our method is versatile and readily accommodates multimodal inputs, including a text prompt and additional data such as a 3D mesh, poses, and seed images.
The resulting 3D human NeRF model empowers the synthesis of highly photorealistic views from 360-degree perspectives.
- Score: 67.31920821192323
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents Deceptive-Human, a novel Prompt-to-NeRF framework
capitalizing on state-of-the-art control diffusion models (e.g., ControlNet) to
generate a high-quality controllable 3D human NeRF. Different from direct 3D
generative approaches, e.g., DreamFusion and DreamHuman, Deceptive-Human
employs a progressive refinement technique to elevate the reconstruction
quality. This is achieved by utilizing high-quality synthetic human images
generated through ControlNet with a view-consistent loss. Our method is
versatile and readily extensible, accommodating multimodal inputs, including a
text prompt and additional data such as 3D mesh, poses, and seed images. The
resulting 3D human NeRF model empowers the synthesis of highly photorealistic
novel views from 360-degree perspectives. The key to our Deceptive-Human for
hallucinating multi-view consistent synthetic human images lies in our
progressive finetuning strategy. This strategy involves iteratively enhancing
views using the provided multimodal inputs at each intermediate step to improve
the human NeRF model. Within this iterative refinement process, view-dependent
appearances are systematically eliminated to prevent interference with the
underlying density estimation. Extensive qualitative and quantitative
experimental comparison shows that our Deceptive-Human models achieve
state-of-the-art application quality.
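The abstract describes a progressive finetuning loop: synthesize view-consistent human images with a control diffusion model, fine-tune the NeRF on them, and repeat while suppressing view-dependent appearance. Below is a minimal structural sketch of how such a loop could be organized; every function here is a hypothetical placeholder, not the authors' actual code or API.
```python
# Structural sketch only: each helper is a stub standing in for the real
# NeRF / ControlNet components referenced in the abstract.
from typing import List, Optional


def init_nerf() -> dict:
    # Placeholder for a coarse human NeRF model (density + appearance fields).
    return {"density": None, "appearance": None}


def render_conditions(nerf: dict, poses: Optional[list], mesh: Optional[object]) -> list:
    # Placeholder: render pose/depth conditioning maps from the current NeRF or mesh.
    return poses or []


def synthesize_views(prompt: str, conditions: list, consistency_target: dict) -> List[str]:
    # Placeholder: call a control diffusion model (e.g., ControlNet) on the text
    # prompt and conditioning maps, penalizing deviation from the current NeRF's
    # renders (the view-consistent loss mentioned in the abstract).
    return [f"view({prompt}, pose={c})" for c in conditions]


def finetune_nerf(nerf: dict, views: List[object], suppress_view_dependence: bool) -> dict:
    # Placeholder: optimize the NeRF on the synthetic views; view-dependent
    # appearance is suppressed so it does not corrupt the density estimate.
    nerf["density"] = views
    nerf["appearance"] = None if suppress_view_dependence else views
    return nerf


def deceptive_human(prompt, poses=None, mesh=None, seed_images=None, num_rounds=3):
    """Iteratively refine a human NeRF from synthetic, view-consistent images."""
    nerf = init_nerf()
    for _ in range(num_rounds):
        conditions = render_conditions(nerf, poses, mesh)
        views = (seed_images or []) + synthesize_views(prompt, conditions, nerf)
        nerf = finetune_nerf(nerf, views, suppress_view_dependence=True)
    return nerf
```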
Related papers
- AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion [56.12859795754579]
AdaHuman is a novel framework that generates high-fidelity animatable 3D avatars from a single in-the-wild image.
AdaHuman incorporates two key innovations: a pose-conditioned 3D joint diffusion model and a compositional 3DGS refinement module.
arXiv Detail & Related papers (2025-05-30T17:59:54Z)
- HumanGif: Single-View Human Diffusion with Generative Prior [25.516544735593087]
We propose HumanGif, a single-view human diffusion model with generative priors.
Specifically, we formulate the single-view-based 3D human novel view and pose synthesis as a single-view-conditioned human diffusion process.
We show that HumanGif achieves the best perceptual performance, with better generalizability for novel view and pose synthesis.
arXiv Detail & Related papers (2025-02-17T17:55:27Z)
- Progress and Prospects in 3D Generative AI: A Technical Overview including 3D human [51.58094069317723]
This paper aims to provide a comprehensive overview and summary of the relevant papers published mostly during the latter half of 2023.
It begins by discussing AI-generated 3D object models, followed by generated 3D human models and generated 3D human motions, culminating in a conclusive summary and a vision for the future.
arXiv Detail & Related papers (2024-01-05T03:41:38Z)
- HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion [53.1558345421646]
We propose HumanRef, a 3D human generation framework from a single-view input.
To ensure the generated 3D model is photorealistic and consistent with the input image, HumanRef introduces a novel method called reference-guided score distillation sampling.
Experimental results demonstrate that HumanRef outperforms state-of-the-art methods in generating 3D clothed humans.
arXiv Detail & Related papers (2023-11-28T17:06:28Z)
- Single-Image 3D Human Digitization with Shape-Guided Diffusion [31.99621159464388]
NeRF and its variants typically require videos or images from different viewpoints.
We present an approach to generate a 360-degree view of a person with a consistent, high-resolution appearance from a single input image.
arXiv Detail & Related papers (2023-11-15T18:59:56Z)
- SHERF: Generalizable Human NeRF from a Single Image [59.10589479808622]
SHERF is the first generalizable Human NeRF model for recovering animatable 3D humans from a single input image.
We propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.
arXiv Detail & Related papers (2023-03-22T17:59:12Z)
- Refining 3D Human Texture Estimation from a Single Image [3.8761064607384195]
Estimating 3D human texture from a single image is essential in graphics and vision.
We propose a framework that adaptively samples the input by a deformable convolution where offsets are learned via a deep neural network.
arXiv Detail & Related papers (2023-03-06T19:53:50Z)
- HDHumans: A Hybrid Approach for High-fidelity Digital Humans [107.19426606778808]
HDHumans is the first method for HD human character synthesis that jointly produces an accurate and temporally coherent 3D deforming surface.
Our method is carefully designed to achieve a synergy between classical surface deformation and neural radiance fields (NeRF).
arXiv Detail & Related papers (2022-10-21T14:42:11Z)
- Human View Synthesis using a Single Sparse RGB-D Input [16.764379184593256]
We present a novel view synthesis framework to generate realistic renders from unseen views of any human captured from a single-view sensor with sparse RGB-D.
An enhancer network improves the overall fidelity, even in areas occluded in the original view, producing crisp renders with fine details.
arXiv Detail & Related papers (2021-12-27T20:13:53Z)
- 3D-Aware Semantic-Guided Generative Model for Human Synthesis [67.86621343494998]
This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis.
Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines.
arXiv Detail & Related papers (2021-12-02T17:10:53Z)