ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and
Pose Optimization
- URL: http://arxiv.org/abs/2207.13691v1
- Date: Wed, 27 Jul 2022 17:59:31 GMT
- Title: ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and
Pose Optimization
- Authors: Muhammad Zubair Irshad, Sergey Zakharov, Rares Ambrus, Thomas Kollar,
Zsolt Kira, Adrien Gaidon
- Abstract summary: We present ShAPO, a method for joint multi-object detection, 3D textured reconstruction, 6D object pose and size estimation.
Key to ShAPO is a single-shot pipeline to regress shape, appearance and pose latent codes along with the masks of each object instance.
Our method significantly outperforms all baselines on the NOCS dataset with an 8% absolute improvement in mAP for 6D pose estimation.
- Score: 40.36229450208817
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our method studies the complex task of object-centric 3D understanding from a
single RGB-D observation. As it is an ill-posed problem, existing methods
suffer from low performance for both 3D shape and 6D pose and size estimation
in complex multi-object scenarios with occlusions. We present ShAPO, a method
for joint multi-object detection, 3D textured reconstruction, 6D object pose
and size estimation. Key to ShAPO is a single-shot pipeline to regress shape,
appearance and pose latent codes along with the masks of each object instance,
which are then further refined in a sparse-to-dense fashion. A novel
disentangled shape and appearance database of priors is first learned to embed
objects in their respective shape and appearance space. We also propose a
novel, octree-based differentiable optimization step, allowing us to further
improve object shape, pose and appearance simultaneously under the learned
latent space, in an analysis-by-synthesis fashion. Our novel joint implicit
textured object representation allows us to accurately identify and reconstruct
novel unseen objects without having access to their 3D meshes. Through
extensive experiments, we show that our method, trained on simulated indoor
scenes, accurately regresses the shape, appearance and pose of novel objects in
the real world with minimal fine-tuning. Our method significantly outperforms
all baselines on the NOCS dataset with an 8% absolute improvement in mAP for 6D
pose estimation. Project page:
https://zubair-irshad.github.io/projects/ShAPO.html
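The analysis-by-synthesis refinement mentioned in the abstract can be illustrated with a deliberately simplified sketch. This is not ShAPO's implementation: it assumes a one-parameter "shape latent" (a sphere radius), a translation-only pose, and an analytic signed distance function in place of the learned implicit decoder, and it omits appearance and the octree sampling entirely. All function names here (`sphere_sdf`, `sample_sphere`, `fit`) are hypothetical. The idea it shows is only the core loop: adjust shape and pose together so that the implicit surface passes through the observed points.

```python
import math
import random


def sphere_sdf(point, center, radius):
    """Signed distance to a sphere; a toy stand-in for a learned implicit decoder."""
    return math.dist(point, center) - radius


def sample_sphere(center, radius, n, rng):
    """Sample n surface points, standing in for back-projected depth observations."""
    points = []
    for _ in range(n):
        # A normalised Gaussian vector is uniformly distributed on the unit sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        points.append([c + radius * x / norm for c, x in zip(center, v)])
    return points


def fit(points, center, radius, lr=0.3, steps=300):
    """Jointly refine pose (center) and shape (radius) by gradient descent
    on the mean squared SDF residual at the observed points."""
    center = list(center)
    n = len(points)
    for _ in range(steps):
        grad_r = 0.0
        grad_c = [0.0, 0.0, 0.0]
        for p in points:
            d = math.dist(p, center)
            res = d - radius  # SDF value at an observed point; ideally zero
            grad_r += -2.0 * res / n
            for k in range(3):
                grad_c[k] += -2.0 * res * (p[k] - center[k]) / (d * n)
        radius -= lr * grad_r
        center = [c - lr * g for c, g in zip(center, grad_c)]
    return center, radius


rng = random.Random(0)
true_center, true_radius = [0.3, -0.2, 1.0], 0.7
observed = sample_sphere(true_center, true_radius, 400, rng)

# Start from a coarse initial estimate, as a single-shot regressor might provide.
est_center, est_radius = fit(observed, center=[0.1, 0.0, 0.8], radius=0.3)
print(est_center, est_radius)
```

In a real system the analytic gradients would be replaced by automatic differentiation through a neural decoder, and the residual would typically include appearance terms as well as geometry.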
Related papers
- 3D Surface Reconstruction in the Wild by Deforming Shape Priors from
Synthetic Data [24.97027425606138]
Reconstructing the underlying 3D surface of an object from a single image is a challenging problem.
We present a new method for joint category-specific 3D reconstruction and object pose estimation from a single image.
Our approach achieves state-of-the-art reconstruction performance across several real-world datasets.
arXiv Detail & Related papers (2023-02-24T20:37:27Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- Generative Category-Level Shape and Pose Estimation with Semantic
Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image.
To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space.
We show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and
Categorical 6D Pose and Size Estimation [19.284468553414918]
This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation.
Existing approaches mainly follow a complex multi-stage pipeline which first localizes and detects each object instance in the image and then regresses to either their 3D meshes or 6D poses.
We present a simple one-stage approach that jointly predicts the 3D shape and estimates the 6D pose and size in a bounding-box-free manner.
arXiv Detail & Related papers (2022-03-03T18:59:04Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose
Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods, and is robust in different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- Neural Articulated Radiance Field [90.91714894044253]
We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.
Experiments show that the proposed method is efficient and can generalize well to novel poses.
arXiv Detail & Related papers (2021-04-07T13:23:14Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Shape Prior Deformation for Categorical 6D Object Pose and Size
Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image.
We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)