Look, Evolve and Mold: Learning 3D Shape Manifold via Single-view Synthetic Data
- URL: http://arxiv.org/abs/2103.04789v1
- Date: Mon, 8 Mar 2021 14:30:18 GMT
- Title: Look, Evolve and Mold: Learning 3D Shape Manifold via Single-view Synthetic Data
- Authors: Qianyu Feng, Yawei Luo, Keyang Luo, Yi Yang
- Abstract summary: We propose a domain-adaptive network for single-view 3D reconstruction, dubbed LEM, to generalize towards the natural scenario.
Experiments on several benchmarks demonstrate the effectiveness and robustness of the proposed method.
- Score: 32.54820023526409
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With daily observation and prior knowledge, it is easy for us humans to infer
the stereo structure from a single view. However, equipping deep models with
such an ability usually requires abundant supervision. It is promising that,
without elaborate 3D annotation, we can simply profit from synthetic
data, where pairwise ground truth is easy to access. Nevertheless, the domain
gap is not negligible, considering the varying texture, shape and context. To
overcome these difficulties, we propose a domain-adaptive network for
single-view 3D reconstruction, dubbed LEM, to generalize towards the natural
scenario by fulfilling several aspects: (1) Look: incorporating spatial
structure from the single view to enhance the representation; (2) Evolve:
leveraging the semantic information with unsupervised contrastive mapping,
resorting to shape priors; (3) Mold: transforming into the desired stereo
manifold with discernment and semantic knowledge. Extensive experiments on
several benchmarks demonstrate the effectiveness and robustness of the proposed
method, LEM, in learning the 3D shape manifold from synthetic data via a
single view.
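The "Evolve" step in the abstract mentions unsupervised contrastive mapping against shape priors, but the abstract does not specify the loss. As a rough illustration only, the sketch below uses a generic InfoNCE-style contrastive loss, which is a common choice and not necessarily what LEM uses. The function names (`cosine`, `info_nce`), the temperature value, and the toy feature vectors are all assumptions made for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax of the positive pair's similarity.

    Hypothetical helper, not taken from the paper. The loss is small when
    the anchor is closer to the positive than to any negative.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

# Toy data (assumed): an image feature and three candidate shape priors.
image_feat = [0.9, 0.1, 0.0]
matching_prior = [1.0, 0.0, 0.0]
other_priors = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Pairing the image with its matching prior gives a low loss.
loss_good = info_nce(image_feat, matching_prior, other_priors)
# Treating a mismatched prior as the positive gives a high loss.
loss_bad = info_nce(image_feat, other_priors[1],
                    [matching_prior, other_priors[0]])
print(loss_good < loss_bad)
```

Minimizing such a loss pulls an image embedding toward its corresponding shape-prior embedding and pushes it away from the others, which is one plausible reading of how semantic information could be aligned without labels.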
Related papers
- DiHuR: Diffusion-Guided Generalizable Human Reconstruction [51.31232435994026]
We introduce DiHuR, a Diffusion-guided model for generalizable Human 3D Reconstruction and view synthesis from sparse, minimally overlapping images.
Our method integrates two key priors in a coherent manner: the prior from generalizable feed-forward models and the 2D diffusion prior, and it requires only multi-view image training, without 3D supervision.
arXiv Detail & Related papers (2024-11-16T03:52:23Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Single-View View Synthesis with Self-Rectified Pseudo-Stereo [49.946151180828465]
We leverage the reliable and explicit stereo prior to generate a pseudo-stereo viewpoint.
We propose a self-rectified stereo synthesis to amend erroneous regions in an identify-rectify manner.
Our method outperforms state-of-the-art single-view view synthesis methods and stereo synthesis methods.
arXiv Detail & Related papers (2023-04-19T09:36:13Z)
- 3D-LatentMapper: View Agnostic Single-View Reconstruction of 3D Shapes [0.0]
We propose a novel framework that leverages the intermediate latent spaces of Vision Transformer (ViT) and a joint image-text representational model, CLIP, for fast and efficient Single View Reconstruction (SVR).
We use the ShapeNetV2 dataset and perform extensive experiments with comparisons to SOTA methods to demonstrate our method's effectiveness.
arXiv Detail & Related papers (2022-12-05T11:45:26Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Discovering 3D Parts from Image Collections [98.16987919686709]
We tackle the problem of 3D part discovery from only 2D image collections.
Instead of relying on manually annotated parts for supervision, we propose a self-supervised approach.
Our key insight is to learn a novel part shape prior that allows each part to fit an object shape faithfully while constrained to have simple geometry.
arXiv Detail & Related papers (2021-07-28T20:29:16Z)
- 3D Shape Reconstruction from Vision and Touch [62.59044232597045]
In 3D shape reconstruction, the complementary fusion of visual and haptic modalities remains largely unexplored.
We introduce a dataset of simulated touch and vision signals from the interaction between a robotic hand and a large array of 3D objects.
arXiv Detail & Related papers (2020-07-07T20:20:33Z)