CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and
Categorical 6D Pose and Size Estimation
- URL: http://arxiv.org/abs/2203.01929v1
- Date: Thu, 3 Mar 2022 18:59:04 GMT
- Title: CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and
Categorical 6D Pose and Size Estimation
- Authors: Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone,
Zsolt Kira
- Abstract summary: This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation.
Existing approaches mainly follow a complex multi-stage pipeline which first localizes and detects each object instance in the image and then regresses to either their 3D meshes or 6D poses.
We present a simple one-stage approach to predict both the 3D shape and estimate the 6D pose and size jointly in a bounding-box free manner.
- Score: 19.284468553414918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies the complex task of simultaneous multi-object 3D
reconstruction, 6D pose and size estimation from a single-view RGB-D
observation. In contrast to instance-level pose estimation, we focus on a more
challenging problem where CAD models are not available at inference time.
Existing approaches mainly follow a complex multi-stage pipeline which first
localizes and detects each object instance in the image and then regresses to
either their 3D meshes or 6D poses. These approaches suffer from
high computational cost and low performance in complex multi-object scenarios,
where occlusions can be present. Hence, we present a simple one-stage approach
to predict both the 3D shape and estimate the 6D pose and size jointly in a
bounding-box free manner. In particular, our method treats object instances as
spatial centers where each center denotes the complete shape of an object along
with its 6D pose and size. Through this per-pixel representation, our approach
can reconstruct in real-time (40 FPS) multiple novel object instances and
predict their 6D pose and sizes in a single-forward pass. Through extensive
experiments, we demonstrate that our approach significantly outperforms all
shape completion and categorical 6D pose and size estimation baselines on
multi-object ShapeNet and NOCS datasets respectively with a 12.6% absolute
improvement in mAP for 6D pose for novel real-world object instances.
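
The "objects as spatial centers" idea in the abstract can be illustrated with a minimal sketch. It assumes the network outputs a center heatmap plus a per-pixel descriptor map, and that each descriptor concatenates a shape code, a flattened 3x3 rotation, a translation, and a scalar size. The function names, the descriptor layout, and the dimensions below are hypothetical choices for illustration and are not taken from the paper or its code.

```python
import numpy as np

def extract_centers(heatmap, params, threshold=0.5):
    """Pick local maxima of a center heatmap and gather per-pixel descriptors.

    heatmap: (H, W) array of center scores.
    params:  (H, W, D) array; each pixel holds a D-dim object descriptor.
    Returns a list of (row, col, score, descriptor) tuples.
    """
    h, w = heatmap.shape
    # Pad with -inf so border pixels also get a full 3x3 neighborhood.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the nine shifted views and reduce: a simple 3x3 max filter.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    local_max = windows.max(axis=0)
    # A pixel is a peak if it equals its neighborhood max and clears the threshold.
    peaks = (heatmap == local_max) & (heatmap >= threshold)
    rows, cols = np.nonzero(peaks)
    return [(int(r), int(c), float(heatmap[r, c]), params[r, c])
            for r, c in zip(rows, cols)]

def split_descriptor(desc, shape_dim):
    """Split one descriptor under the ASSUMED layout:
    [shape code | 9 rotation entries | 3 translation entries | 1 size]."""
    shape_code = desc[:shape_dim]
    rotation = desc[shape_dim:shape_dim + 9].reshape(3, 3)
    translation = desc[shape_dim + 9:shape_dim + 12]
    size = float(desc[shape_dim + 12])
    return shape_code, rotation, translation, size

# Toy inputs: two synthetic centers on an 8x8 grid, 16-dim descriptors (3 + 13).
heatmap = np.zeros((8, 8))
heatmap[2, 3] = 0.9
heatmap[6, 1] = 0.7
params = np.arange(8 * 8 * 16, dtype=float).reshape(8, 8, 16)

detections = extract_centers(heatmap, params)
shape_code, rotation, translation, size = split_descriptor(detections[0][3], shape_dim=3)
```

Because every object is read off at a single peak pixel, all instances are recovered from one forward pass without bounding-box proposals, which is the property the abstract emphasizes.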
Related papers
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- MV6D: Multi-View 6D Pose Estimation on RGB-D Frames Using a Deep
Point-wise Voting Network [14.754297065772676]
We present a novel multi-view 6D pose estimation method called MV6D.
We base our approach on the PVN3D network that uses a single RGB-D image to predict keypoints of the target objects.
In contrast to current multi-view pose detection networks such as CosyPose, our MV6D can learn the fusion of multiple perspectives in an end-to-end manner.
arXiv Detail & Related papers (2022-08-01T23:34:43Z)
- ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and
Pose Optimization [40.36229450208817]
We present ShAPO, a method for joint multi-object detection, 3D textured reconstruction, 6D object pose and size estimation.
Key to ShAPO is a single-shot pipeline to regress shape, appearance and pose latent codes along with the masks of each object instance.
Our method significantly outperforms all baselines on the NOCS dataset with an 8% absolute improvement in mAP for 6D pose estimation.
arXiv Detail & Related papers (2022-07-27T17:59:31Z)
- Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- Learning Stereopsis from Geometric Synthesis for 6D Object Pose
Estimation [11.999630902627864]
Current monocular-based 6D object pose estimation methods generally achieve less competitive results than RGBD-based methods.
This paper proposes a 3D geometric volume based pose estimation method with a short baseline two-view setting.
Experiments show that our method outperforms state-of-the-art monocular-based methods, and is robust in different objects and scenes.
arXiv Detail & Related papers (2021-09-25T02:55:05Z)
- CosyPose: Consistent multi-view multi-object 6D pose estimation [48.097599674329004]
First, we present a single-view single-object 6D pose estimation method, which we use to generate 6D object pose hypotheses.
Second, we develop a robust method for matching individual 6D object pose hypotheses across different input images.
Third, we develop a method for global scene refinement given multiple object hypotheses and their correspondences across views.
arXiv Detail & Related papers (2020-08-19T14:11:56Z)
- Shape Prior Deformation for Categorical 6D Object Pose and Size
Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image.
We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z)
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular
Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.