Progressive Learning of 3D Reconstruction Network from 2D GAN Data
- URL: http://arxiv.org/abs/2305.11102v1
- Date: Thu, 18 May 2023 16:45:51 GMT
- Title: Progressive Learning of 3D Reconstruction Network from 2D GAN Data
- Authors: Aysegul Dundar, Jun Gao, Andrew Tao, Bryan Catanzaro
- Abstract summary: This paper presents a method to reconstruct high-quality textured 3D models from single images.
Current methods rely on datasets with expensive annotations: multi-view images and their camera parameters; our method instead uses GAN-generated multi-view images, which have negligible annotation cost.
We show significant improvements over previous methods whether they were trained on GAN generated multi-view images or on real images with expensive annotations.
- Score: 33.42114674602613
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a method to reconstruct high-quality textured 3D models
from single images. Current methods rely on datasets with expensive
annotations: multi-view images and their camera parameters. Our method relies
on GAN generated multi-view image datasets which have a negligible annotation
cost. However, they are not strictly multi-view consistent and sometimes GANs
output distorted images. This results in degraded reconstruction qualities. In
this work, to overcome these limitations of generated datasets, we have two
main contributions which lead us to achieve state-of-the-art results on
challenging objects: 1) A robust multi-stage learning scheme that gradually
relies more on the model's own predictions when calculating losses, 2) A novel
adversarial learning pipeline with online pseudo-ground truth generations to
achieve fine details. Our work provides a bridge from 2D supervisions of GAN
models to 3D reconstruction models and removes the expensive annotation
efforts. We show significant improvements over previous methods whether they
were trained on GAN generated multi-view images or on real images with
expensive annotations. Please visit our web-page for 3D visuals:
https://research.nvidia.com/labs/adlr/progressive-3d-learning
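As an illustration of contribution (1), the sketch below shows one possible reading of "gradually relies more on the model's own predictions": the reconstruction loss blends the GAN-generated multi-view images with the model's own detached renderings, shifting weight toward the latter at later stages. `recon_net`, `renderer`, and the tensor layouts are hypothetical placeholders, not the paper's API.

```python
# Hypothetical sketch: supervision gradually shifts from the imperfect
# GAN-generated multi-view images toward the model's own (detached)
# renderings, so view inconsistencies and distortions matter less later on.
import torch
import torch.nn.functional as F

def stage_weight(stage: int, num_stages: int) -> float:
    # Fraction of supervision taken from the model's own predictions.
    return stage / max(num_stages - 1, 1)

def progressive_recon_loss(recon_net, renderer, views, cameras, stage, num_stages):
    # views:   [B, V, 3, H, W] GAN-generated multi-view images (assumed layout)
    # cameras: [B, V, ...]     their (approximate) camera parameters
    mesh, texture = recon_net(views[:, 0])  # reconstruct from a single view

    # Re-render the prediction at every generated viewpoint.
    rendered = torch.stack(
        [renderer(mesh, texture, cam) for cam in cameras.unbind(dim=1)], dim=1
    )

    # Blend GAN views with the model's own detached renderings; later stages
    # rely more on the latter (online pseudo ground truth).
    alpha = stage_weight(stage, num_stages)
    pseudo_gt = (1.0 - alpha) * views + alpha * rendered.detach()
    return F.l1_loss(rendered, pseudo_gt)
```

The abstract does not specify how the pseudo ground truth is actually formed, so the linear blend above is only illustrative; contribution (2) additionally pairs such online pseudo ground truth with an adversarial loss to recover fine detail.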
Related papers
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
- Inverse Neural Rendering for Explainable Multi-Object Tracking [35.072142773300655]
We recast 3D multi-object tracking from RGB cameras as an Inverse Rendering (IR) problem.
We optimize an image loss over generative latent spaces that inherently disentangle shape and appearance properties.
We validate the generalization and scaling capabilities of our method by learning the generative prior exclusively from synthetic data.
arXiv Detail & Related papers (2024-04-18T17:37:53Z)
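The latent-space optimization described in the Inverse Neural Rendering entry above can be pictured as a plain fitting loop over a generative prior; `generator`, `renderer`, `latent_dim`, and the MSE objective below are assumed stand-ins, not the paper's actual components.

```python
# Illustrative latent fitting loop (not code from the paper): optimize an
# image loss over the latent code of a pretrained generative prior whose
# latent space disentangles shape and appearance.
import torch
import torch.nn.functional as F

def fit_latent(generator, renderer, observed, camera, steps=200, lr=5e-2):
    latent = torch.zeros(1, generator.latent_dim, requires_grad=True)
    optimizer = torch.optim.Adam([latent], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        shape, appearance = generator(latent)           # disentangled properties
        rendered = renderer(shape, appearance, camera)  # differentiable rendering
        loss = F.mse_loss(rendered, observed)           # image-space loss
        loss.backward()
        optimizer.step()
    return latent.detach()
```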
- ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation scheme that renders more 3D-consistent images at any viewpoint.
arXiv Detail & Related papers (2024-03-04T07:57:05Z)
- Geometry aware 3D generation from in-the-wild images in ImageNet [18.157263188192434]
We propose a method for reconstructing 3D geometry from the diverse and unstructured ImageNet dataset without camera pose information.
We use an efficient triplane representation to learn 3D models from 2D images and modify the architecture of the generator backbone based on StyleGAN2.
The trained generator can produce class-conditional 3D models as well as renderings from arbitrary viewpoints.
arXiv Detail & Related papers (2024-01-31T23:06:39Z)
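The "efficient triplane representation" mentioned in the entry above is a common construction; as a rough, generic sketch (not the paper's code), features for a 3D point are gathered by projecting it onto three axis-aligned feature planes and summing the bilinearly sampled results. The (plane, coordinate) pairing below is illustrative.

```python
# Generic triplane feature lookup: project each 3D point onto the XY, XZ,
# and YZ feature planes, bilinearly sample each plane, and sum the features.
import torch
import torch.nn.functional as F

def sample_triplane(planes, points):
    # planes: [3, C, H, W] feature planes; points: [N, 3] in [-1, 1]
    xy = points[:, [0, 1]]
    xz = points[:, [0, 2]]
    yz = points[:, [1, 2]]
    feats = []
    for plane, coords in zip(planes, (xy, xz, yz)):
        grid = coords.view(1, 1, -1, 2)                     # [1, 1, N, 2]
        sampled = F.grid_sample(plane.unsqueeze(0), grid,
                                mode="bilinear", align_corners=True)
        feats.append(sampled.view(plane.shape[0], -1).t())  # [N, C]
    return sum(feats)                                        # [N, C] features
```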
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model [86.37536249046943]
DMV3D is a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion.
Our reconstruction model incorporates a triplane NeRF representation and can denoise noisy multi-view images via NeRF reconstruction and rendering.
arXiv Detail & Related papers (2023-11-15T18:58:41Z)
- IT3D: Improved Text-to-3D Generation with Explicit View Synthesis [71.68595192524843]
This study presents a novel strategy that leverages explicitly synthesized multi-view images to address these issues.
Our approach uses image-to-image pipelines, powered by LDMs, to generate posed high-quality images.
For the incorporated discriminator, the synthesized multi-view images are considered real data, while the renderings of the optimized 3D models function as fake data.
arXiv Detail & Related papers (2023-08-22T14:39:17Z)
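The real/fake assignment described in the IT3D entry above maps onto a standard GAN objective; the functions below are a generic sketch in which `discriminator`, `synthesized_views`, and `rendered_views` are assumed placeholders rather than the paper's components.

```python
# Generic sketch of the assignment described above: synthesized posed images
# count as real, renderings of the 3D model being optimized count as fake.
import torch
import torch.nn.functional as F

def discriminator_loss(discriminator, synthesized_views, rendered_views):
    real_logits = discriminator(synthesized_views)
    fake_logits = discriminator(rendered_views.detach())  # no gradient into the 3D model here
    return (
        F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
        + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    )

def generator_loss(discriminator, rendered_views):
    # Gradient flows back into the 3D model so its renderings look "real".
    logits = discriminator(rendered_views)
    return F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```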
- GAN2X: Non-Lambertian Inverse Rendering of Image GANs [85.76426471872855]
We present GAN2X, a new method for unsupervised inverse rendering that only uses unpaired images for training.
Unlike previous Shape-from-GAN approaches that mainly focus on 3D shapes, we make the first attempt to also recover non-Lambertian material properties by exploiting the pseudo-paired data generated by a GAN.
Experiments demonstrate that GAN2X can accurately decompose 2D images to 3D shape, albedo, and specular properties for different object categories, and achieves the state-of-the-art performance for unsupervised single-view 3D face reconstruction.
arXiv Detail & Related papers (2022-06-18T16:58:49Z)
- Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering [101.56891506498755]
Differentiable rendering has paved the way to training neural networks to perform "inverse graphics" tasks.
We show that our approach significantly outperforms state-of-the-art inverse graphics networks trained on existing datasets.
arXiv Detail & Related papers (2020-10-18T22:29:07Z)
- Leveraging 2D Data to Learn Textured 3D Mesh Generation [33.32377849866736]
We present the first generative model of textured 3D meshes.
We train our model to explain a distribution of images by modelling each image as a 3D foreground object.
It learns to generate meshes that when rendered, produce images similar to those in its training set.
arXiv Detail & Related papers (2020-04-08T18:00:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.