Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance
Consistency
- URL: http://arxiv.org/abs/2204.10310v1
- Date: Thu, 21 Apr 2022 17:47:35 GMT
- Title: Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance
Consistency
- Authors: Tom Monnier, Matthew Fisher, Alexei A. Efros, Mathieu Aubry
- Abstract summary: Approaches to single-view reconstruction typically rely on viewpoint annotations, silhouettes, the absence of background, multiple views of the same instance, a template shape, or symmetry.
We avoid all of these supervisions and hypotheses by explicitly leveraging the consistency between images of different object instances.
Our main contributions are two approaches to leveraging cross-instance consistency: (i) progressive conditioning, a training strategy that gradually specializes the model from category to instances in a curriculum learning fashion; and (ii) swap reconstruction, a loss enforcing consistency between instances with similar shape or texture.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Approaches to single-view reconstruction typically rely on viewpoint
annotations, silhouettes, the absence of background, multiple views of the same
instance, a template shape, or symmetry. We avoid all of these supervisions and
hypotheses by explicitly leveraging the consistency between images of different
object instances. As a result, our method can learn from large collections of
unlabelled images depicting the same object category. Our main contributions
are two approaches to leveraging cross-instance consistency: (i) progressive
conditioning, a training strategy to gradually specialize the model from
category to instances in a curriculum learning fashion; (ii) swap
reconstruction, a loss enforcing consistency between instances having similar
shape or texture. Critical to the success of our method are also: our
structured autoencoding architecture decomposing an image into explicit shape,
texture, pose, and background; an adapted formulation of differentiable
rendering; and a new optimization scheme alternating between 3D and pose
learning. We compare our approach, UNICORN, both on the diverse synthetic
ShapeNet dataset - the classical benchmark for methods requiring multiple views
as supervision - and on standard real-image benchmarks (Pascal3D+ Car, CUB-200)
for which most methods require known templates and silhouette annotations. We
also showcase applicability to more challenging real-world collections
(CompCars, LSUN), where silhouettes are not available and images are not
cropped around the object.
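To make the first contribution concrete, below is a minimal PyTorch sketch of progressive conditioning. It assumes the curriculum is implemented by masking the instance latent code and growing the number of unmasked dimensions on a step schedule; the class name, the milestone schedule, and the masking mechanism are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class ProgressiveConditioning(nn.Module):
    """Minimal sketch of progressive conditioning (assumed mechanism:
    masking the conditioning code with a hypothetical step schedule).
    Only the first `active_dims` entries of the latent code reach the
    decoder; starting near zero forces a single category-level model,
    and unmasking more dimensions over training lets the model
    gradually specialize to individual instances."""

    def __init__(self, latent_dim: int, schedule: dict):
        super().__init__()
        self.latent_dim = latent_dim
        # Hypothetical schedule mapping training step -> active dims,
        # e.g. {0: 0, 20_000: 2, 60_000: 8, 120_000: latent_dim}.
        self.schedule = schedule
        self.active_dims = 0

    def update(self, step: int) -> None:
        # Unmask more of the code each time a milestone is passed.
        for milestone, dims in sorted(self.schedule.items()):
            if step >= milestone:
                self.active_dims = dims

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        mask = torch.zeros_like(z)
        mask[:, : self.active_dims] = 1.0
        return z * mask
```

Calling `update(step)` once per iteration and passing the latent code through the module before decoding reproduces the curriculum: early in training every instance shares the same category-level decoder output, and only later does the full instance code become available.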
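The swap reconstruction loss can be sketched in the same spirit: re-render each instance with the shape (or texture) code of a similar instance and still require it to reconstruct the original image. Picking nearest neighbours inside the mini-batch and using an MSE photometric loss are simplifying assumptions here, and `render` stands in for the full structured decoder (shape, texture, pose, background to image).

```python
import torch
import torch.nn.functional as F


def swap_reconstruction_loss(images, shape_codes, texture_codes, render):
    """Sketch of a swap reconstruction loss: each image is re-rendered
    with the shape code of its nearest neighbour in shape-code space,
    and the swapped rendering must still match the original image.
    Batch-level neighbours and an MSE loss are assumptions; `render`
    is a stand-in for the full decoder."""
    with torch.no_grad():
        dists = torch.cdist(shape_codes, shape_codes)  # pairwise L2 distances
        dists.fill_diagonal_(float("inf"))             # forbid self-matches
        neighbour = dists.argmin(dim=1)                # most similar instance
    swapped = render(shape_codes[neighbour], texture_codes)
    return F.mse_loss(swapped, images)
```

Because neighbours are selected among instances whose codes are already close, a correct model should barely change its rendering after the swap, which is what makes the loss a consistency constraint rather than a reconstruction shortcut.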
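Finally, the optimization scheme alternating between 3D and pose learning can be pictured as two optimizers taking turns; the strict even/odd alternation and the parameter grouping below are assumptions for illustration only.

```python
def alternating_step(step, loss, opt_3d, opt_pose):
    """Alternate updates between the 3D branch (shape/texture/background
    parameters in `opt_3d`) and the pose branch (`opt_pose`), a sketch
    of an alternating 3D-vs-pose scheme. The even/odd alternation is an
    assumption."""
    opt = opt_3d if step % 2 == 0 else opt_pose
    opt.zero_grad()
    loss.backward()
    opt.step()
```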
Related papers
- EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild (arXiv, 2024-11-21)
  Our work aims to reconstruct hand-object interactions from a single-view image. We first design a novel pipeline to estimate the underlying hand pose and object shape. With the initial reconstruction, we employ a prior-guided optimization scheme.
- ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-based Consistency (arXiv, 2023-04-13)
  We present ShapeClipper, a novel method that reconstructs 3D object shapes from real-world single-view RGB images. ShapeClipper learns shape reconstruction from a set of single-view segmented images. We evaluate our method over three challenging real-world datasets.
- Single-view 3D Mesh Reconstruction for Seen and Unseen Categories (arXiv, 2022-08-04)
  Single-view 3D mesh reconstruction is a fundamental computer vision task that aims at recovering 3D shapes from single-view RGB images. This paper tackles the task with a focus on model generalization to unseen categories, proposing an end-to-end two-stage network, GenMesh, to break the category boundaries in reconstruction.
- Multi-Category Mesh Reconstruction From Image Collections (arXiv, 2021-10-21)
  We present an alternative approach that infers the textured mesh of objects by combining a series of deformable 3D models with instance-specific deformation, pose, and texture parameters. Our method is trained with images of multiple object categories using only foreground masks and rough camera poses as supervision. Experiments show that the proposed framework can distinguish between different object categories and learn category-specific shape priors in an unsupervised manner.
- Unsupervised Layered Image Decomposition into Object Prototypes (arXiv, 2021-04-29)
  We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models. We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks. We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse), and object discovery from unfiltered social network images.
- A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views (arXiv, 2020-11-17)
  Estimating the 3D shape of an object from a single image or multiple images has gained popularity thanks to recent breakthroughs powered by deep learning. This paper proposes to rely on viewpoint-variant reconstructions by merging the visible information from the given views. To validate the proposed method, we perform a comprehensive evaluation on the ShapeNet reference benchmark in terms of relative pose estimation and 3D shape reconstruction.
- Self-supervised Single-view 3D Reconstruction via Semantic Consistency (arXiv, 2020-03-13)
  We learn a self-supervised, single-view 3D reconstruction model that predicts the shape, texture and camera pose of a target object. The proposed method does not necessitate 3D supervision, manually annotated keypoints, multi-view images of an object, or a prior 3D template.
This list is automatically generated from the titles and abstracts of the papers on this site.