Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized
Codebase
- URL: http://arxiv.org/abs/2306.12423v1
- Date: Wed, 21 Jun 2023 17:59:51 GMT
- Title: Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized
Codebase
- Authors: Qiuyu Wang, Zifan Shi, Kecheng Zheng, Yinghao Xu, Sida Peng, Yujun
Shen
- Abstract summary: We build a well-structured, dubbed Carver, through modularizing the generation process.
The reproduction of a range of cutting-edge algorithms demonstrates the availability of our modularized.
We perform a variety of in-depth analyses, such as the comparison across different types of point feature.
- Score: 30.334079854982843
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the rapid advance of 3D-aware image synthesis, existing studies
usually adopt a mixture of techniques and tricks, leaving it unclear how each
part contributes to the final performance in terms of generality. Following the
most popular and effective paradigm in this field, which incorporates a neural
radiance field (NeRF) into the generator of a generative adversarial network
(GAN), we build a well-structured codebase, dubbed Carver, through modularizing
the generation process. Such a design allows researchers to develop and replace
each module independently, and hence offers an opportunity to fairly compare
various approaches and recognize their contributions from the module
perspective. The reproduction of a range of cutting-edge algorithms
demonstrates the availability of our modularized codebase. We also perform a
variety of in-depth analyses, such as the comparison across different types of
point feature, the necessity of the tailing upsampler in the generator, the
reliance on the camera pose prior, etc., which deepen our understanding of
existing methods and point out some further directions of the research work. We
release code and models at https://github.com/qiuyu96/Carver to facilitate the
development and evaluation of this field.
Related papers
- Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields [50.12118098874321]
We introduce a latent 3D diffusion process for neural voxel fields, enabling generation at significantly higher resolutions.
A part-aware shape decoder is introduced to integrate the part codes into the neural voxel fields, guiding the accurate part decomposition.
The results demonstrate the superior generative capabilities of our proposed method in part-aware shape generation, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2024-05-02T04:31:17Z) - Diverse Part Synthesis for 3D Shape Creation [5.203892715273396]
Methods that use neural networks for 3D shapes in the form of a part-based representation have been introduced over the last few years.
Current methods do not allow easily regenerating individual shape parts according to user preferences.
We investigate techniques that allow the user to generate multiple, diverse suggestions for individual parts.
arXiv Detail & Related papers (2024-01-17T17:55:06Z) - Human as Points: Explicit Point-based 3D Human Reconstruction from
Single-view RGB Images [78.56114271538061]
We introduce an explicit point-based human reconstruction framework called HaP.
Our approach is featured by fully-explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space.
Our results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design.
arXiv Detail & Related papers (2023-11-06T05:52:29Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided domain-regularized and a encoder to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Learning Modulated Transformation in GANs [69.95217723100413]
We equip the generator in generative adversarial networks (GANs) with a plug-and-play module, termed as modulated transformation module (MTM)
MTM predicts spatial offsets under the control of latent codes, based on which the convolution operation can be applied at variable locations.
It is noteworthy that towards human generation on the challenging TaiChi dataset, we improve the FID of StyleGAN3 from 21.36 to 13.60, demonstrating the efficacy of learning modulated geometry transformation.
arXiv Detail & Related papers (2023-08-29T17:51:22Z) - SDFEst: Categorical Pose and Shape Estimation of Objects from RGB-D
using Signed Distance Fields [5.71097144710995]
We present a modular pipeline for pose and shape estimation of objects from RGB-D images.
We integrate a generative shape model with a novel network to enable 6D pose and shape estimation from a single or multiple views.
We demonstrate the benefits of our approach over state-of-the-art methods in several experiments on both synthetic and real data.
arXiv Detail & Related papers (2022-07-11T13:53:50Z) - Image Synthesis with Adversarial Networks: a Comprehensive Survey and
Case Studies [41.00383742615389]
Generative Adversarial Networks (GANs) have been extremely successful in various application domains such as computer vision, medicine, and natural language processing.
GANs are powerful models for learning complex distributions to synthesize semantically meaningful samples.
Given the current fast GANs development, in this survey, we provide a comprehensive review of adversarial models for image synthesis.
arXiv Detail & Related papers (2020-12-26T13:30:42Z) - Neural Function Modules with Sparse Arguments: A Dynamic Approach to
Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aims to introduce the same structural capability into deep learning.
Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems.
The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z) - Unsupervised Controllable Generation with Self-Training [90.04287577605723]
controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.