GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes
- URL: http://arxiv.org/abs/2106.13215v1
- Date: Thu, 24 Jun 2021 17:47:58 GMT
- Title: GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes
- Authors: Youssef A. Mejjati and Isa Milefchik and Aaron Gokaslan and Oliver Wang and Kwang In Kim and James Tompkin
- Abstract summary: We present an algorithm that learns a coarse 3D representation of objects from unposed multi-view 2D mask supervision.
In contrast to existing voxel-based methods for unposed object reconstruction, our approach learns to represent the generated shape and pose with self-supervised canonical 3D anisotropic Gaussians and per-image transforms.
We show results on synthetic datasets with realistic lighting, and demonstrate object insertion with interactive posing.
- Score: 48.642181362172906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an algorithm that learns a coarse 3D representation of objects
from unposed multi-view 2D mask supervision, then uses it to generate detailed
mask and image texture. In contrast to existing voxel-based methods for unposed
object reconstruction, our approach learns to represent the generated shape and
pose with a set of self-supervised canonical 3D anisotropic Gaussians via a
perspective camera, and a set of per-image transforms. We show that this
approach can robustly estimate a 3D space for the camera and object, while
recent baselines sometimes struggle to reconstruct coherent 3D spaces in this
setting. We show results on synthetic datasets with realistic lighting, and
demonstrate object insertion with interactive posing. With our work, we help
move towards structured representations that handle more real-world variation
in learning-based object reconstruction.
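To make the representation concrete, below is a minimal sketch, not the authors' code, of the core idea in the abstract: a set of anisotropic 3D Gaussians posed by a per-image rigid transform and projected through a perspective camera into a soft silhouette. The EWA-style first-order covariance projection, the focal length, and the per-pixel max compositing are illustrative assumptions; in the paper these components are learned self-supervised, and the coarse mask is further refined into a detailed mask and image texture.

```python
# Minimal sketch (illustrative, not the GaussiGAN implementation):
# render a soft silhouette from anisotropic 3D Gaussians through a
# pinhole perspective camera, after a per-image rigid transform.
import numpy as np

def perspective_project(mu_cam, f):
    """Project a camera-space 3D point to the image plane (pinhole model)."""
    x, y, z = mu_cam
    return np.array([f * x / z, f * y / z])

def projected_covariance(mu_cam, sigma_cam, f):
    """First-order (Jacobian) approximation of the projected 2D covariance,
    in the spirit of EWA splatting. An assumption of this sketch."""
    x, y, z = mu_cam
    J = np.array([[f / z, 0.0, -f * x / z**2],
                  [0.0, f / z, -f * y / z**2]])
    return J @ sigma_cam @ J.T

def render_soft_mask(mus, sigmas, R, t, f, H, W):
    """Soft silhouette: per-pixel max over the responses of all Gaussians."""
    ys, xs = np.mgrid[0:H, 0:W]
    # Image-plane coordinates centered at the principal point.
    px = np.stack([xs - W / 2, ys - H / 2], axis=-1).astype(np.float64)
    mask = np.zeros((H, W))
    for mu, sigma in zip(mus, sigmas):
        mu_cam = R @ mu + t              # per-image rigid transform
        sigma_cam = R @ sigma @ R.T
        mu2 = perspective_project(mu_cam, f)
        S2 = projected_covariance(mu_cam, sigma_cam, f)
        d = px - mu2
        inv = np.linalg.inv(S2 + 1e-6 * np.eye(2))
        # Squared Mahalanobis distance -> Gaussian response in [0, 1].
        m = np.einsum('hwi,ij,hwj->hw', d, inv, d)
        mask = np.maximum(mask, np.exp(-0.5 * m))
    return mask

# Example: two anisotropic Gaussians viewed from an identity camera pose.
mus = [np.array([0.0, 0.0, 4.0]), np.array([0.4, 0.2, 4.5])]
sigmas = [np.diag([0.20, 0.05, 0.10]), np.diag([0.05, 0.15, 0.05])]
mask = render_soft_mask(mus, sigmas, np.eye(3), np.zeros(3),
                        f=200.0, H=128, W=128)
```

The per-pixel max keeps the silhouette bounded in [0, 1] and differentiable almost everywhere, which is what lets mask supervision drive the Gaussian parameters; a sum-based compositing would be an equally plausible choice in this sketch.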
Related papers
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- Style Agnostic 3D Reconstruction via Adversarial Style Transfer [23.304453155586312]
Reconstructing the 3D geometry of an object from an image is a major challenge in computer vision.
We propose an approach that enables differentiable rendering-based learning of 3D objects from images with backgrounds.
arXiv Detail & Related papers (2021-10-20T21:24:44Z)
- DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension [71.71234436165255]
We contribute DensePose 3D, a method that can learn such reconstructions in a weakly supervised fashion from 2D image annotations only.
Because it does not require 3D scans, DensePose 3D can be used for learning a wide range of articulated categories such as different animal species.
We show significant improvements compared to state-of-the-art non-rigid structure-from-motion baselines on both synthetic and real data on categories of humans and animals.
arXiv Detail & Related papers (2021-08-31T18:33:55Z)
- Learning Canonical 3D Object Representation for Fine-Grained Recognition [77.33501114409036]
We propose a novel framework for fine-grained object recognition that learns to recover object variation in 3D space from a single image.
We represent an object as a composition of 3D shape and its appearance, while eliminating the effect of camera viewpoint.
By incorporating 3D shape and appearance jointly in a deep representation, our method learns the discriminative representation of the object.
arXiv Detail & Related papers (2021-08-10T12:19:34Z)
- Sparse Pose Trajectory Completion [87.31270669154452]
We propose a method to complete object pose trajectories, even when trained on a dataset where objects appear only in sparsely sampled views.
This is achieved with a cross-modal pose trajectory transfer mechanism.
Our method is evaluated on the Pix3D and ShapeNet datasets.
arXiv Detail & Related papers (2021-05-01T00:07:21Z)
- Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction [33.95791350070165]
Inferring 3D structure of a generic object from a 2D image is a long-standing objective of computer vision.
We take an alternative approach with semi-supervised learning. That is, for a 2D image of a generic object, we decompose it into latent representations of category, shape and albedo.
We show that the complete shape and albedo modeling enables us to leverage real 2D images in both modeling and model fitting.
arXiv Detail & Related papers (2021-04-02T02:39:29Z)
- Weakly Supervised Learning of Multi-Object 3D Scene Decompositions Using Deep Shape Priors [69.02332607843569]
PriSMONet is a novel approach for learning Multi-Object 3D scene decomposition and representations from single images.
A recurrent encoder regresses a latent representation of 3D shape, pose and texture of each object from an input RGB image.
We evaluate the accuracy of our model in inferring 3D scene layout, demonstrate its generative capabilities, assess its generalization to real images, and point out benefits of the learned representation.
arXiv Detail & Related papers (2020-10-08T14:49:23Z)
- Towards Realistic 3D Embedding via View Alignment [53.89445873577063]
This paper presents an innovative View Alignment GAN (VA-GAN) that composes new images by embedding 3D models into 2D background images realistically and automatically.
VA-GAN consists of a texture generator and a differential discriminator that are inter-connected and end-to-end trainable.
arXiv Detail & Related papers (2020-07-14T14:45:00Z)