EVA3D: Compositional 3D Human Generation from 2D Image Collections
- URL: http://arxiv.org/abs/2210.04888v1
- Date: Mon, 10 Oct 2022 17:59:31 GMT
- Title: EVA3D: Compositional 3D Human Generation from 2D Image Collections
- Authors: Fangzhou Hong, Zhaoxi Chen, Yushi Lan, Liang Pan, Ziwei Liu
- Abstract summary: EVA3D is an unconditional 3D human generative model learned from 2D image collections only.
It can sample 3D humans with detailed geometry and render high-quality images (up to 512x256) without bells and whistles.
It achieves state-of-the-art 3D human generation performance regarding both geometry and texture quality.
- Score: 27.70991135165909
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Inverse graphics aims to recover 3D models from 2D observations. Utilizing
differentiable rendering, recent 3D-aware generative models have shown
impressive results of rigid object generation using 2D images. However, it
remains challenging to generate articulated objects, like human bodies, due to
their complexity and diversity in poses and appearances. In this work, we
propose, EVA3D, an unconditional 3D human generative model learned from 2D
image collections only. EVA3D can sample 3D humans with detailed geometry and
render high-quality images (up to 512x256) without bells and whistles (e.g.
super resolution). At the core of EVA3D is a compositional human NeRF
representation, which divides the human body into local parts. Each part is
represented by an individual volume. This compositional representation enables
1) inherent human priors, 2) adaptive allocation of network parameters, 3)
efficient training and rendering. Moreover, to accommodate the
characteristics of sparse 2D human image collections (e.g. imbalanced pose
distribution), we propose a pose-guided sampling strategy for better GAN
learning. Extensive experiments validate that EVA3D achieves state-of-the-art
3D human generation performance regarding both geometry and texture quality.
Notably, EVA3D demonstrates great potential and scalability to
"inverse-graphics" diverse human bodies with a clean framework.
Related papers
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z) - En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D
Synthetic Data [36.51674664590734]
We present En3D, an enhanced generative scheme for sculpting high-quality 3D human avatars.
Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and pose priors, our approach aims to develop a zero-shot 3D generative scheme capable of producing 3D humans.
arXiv Detail & Related papers (2024-01-02T12:06:31Z) - AG3D: Learning to Generate 3D Avatars from 2D Image Collections [96.28021214088746]
We propose a new adversarial generative model of realistic 3D people from 2D images.
Our method captures shape and deformation of the body and loose clothing by adopting a holistic 3D generator.
We experimentally find that our method outperforms previous 3D- and articulation-aware methods in terms of geometry and appearance.
arXiv Detail & Related papers (2023-05-03T17:56:24Z) - 3DHumanGAN: 3D-Aware Human Image Generation with 3D Pose Mapping [37.14866512377012]
3DHumanGAN is a 3D-aware generative adversarial network that synthesizes photorealistic images of full-body humans.
We propose a novel generator architecture in which a 2D convolutional backbone is modulated by a 3D pose mapping network.
arXiv Detail & Related papers (2022-12-14T17:59:03Z) - DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance
Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z) - 3D-Aware Semantic-Guided Generative Model for Human Synthesis [67.86621343494998]
This paper proposes a 3D-aware Semantic-Guided Generative Model (3D-SGAN) for human image synthesis.
Our experiments on the DeepFashion dataset show that 3D-SGAN significantly outperforms the most recent baselines.
arXiv Detail & Related papers (2021-12-02T17:10:53Z) - Fully Understanding Generic Objects: Modeling, Segmentation, and
Reconstruction [33.95791350070165]
Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision.
We take an alternative approach with semi-supervised learning. That is, for a 2D image of a generic object, we decompose it into latent representations of category, shape and albedo.
We show that the complete shape and albedo modeling enables us to leverage real 2D images in both modeling and model fitting.
arXiv Detail & Related papers (2021-04-02T02:39:29Z) - Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is lacking training data that have complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z) - Towards Realistic 3D Embedding via View Alignment [53.89445873577063]
This paper presents an innovative View Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D background images realistically and automatically.
VA-GAN consists of a texture generator and a differential discriminator that are inter-connected and end-to-end trainable.
arXiv Detail & Related papers (2020-07-14T14:45:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.