GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence
- URL: http://arxiv.org/abs/2311.14155v2
- Date: Fri, 15 Mar 2024 15:05:31 GMT
- Authors: Van Nguyen Nguyen, Thibault Groueix, Mathieu Salzmann, Vincent Lepetit
- Abstract summary: GigaPose is a fast, robust, and accurate method for CAD-based novel object pose estimation in RGB images.
Our approach samples templates in a space of only two degrees of freedom instead of the usual three.
It achieves state-of-the-art accuracy and can be seamlessly integrated with existing refinement methods.
- Score: 64.77224422330737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present GigaPose, a fast, robust, and accurate method for CAD-based novel object pose estimation in RGB images. GigaPose first leverages discriminative "templates", rendered images of the CAD models, to recover the out-of-plane rotation, and then uses patch correspondences to estimate the four remaining pose parameters. Our approach samples templates in a space of only two degrees of freedom instead of the usual three and matches the input image to the templates using a fast nearest-neighbor search in feature space, resulting in a 35x speedup over the state of the art. Moreover, GigaPose is significantly more robust to segmentation errors. Our extensive evaluation on the seven core datasets of the BOP challenge demonstrates that it achieves state-of-the-art accuracy and can be seamlessly integrated with existing refinement methods. Additionally, we show the potential of GigaPose with 3D models predicted by recent work on 3D reconstruction from a single image, relaxing the need for CAD models and making 6D object pose estimation much more convenient. Our source code and trained models are publicly available at https://github.com/nv-nguyen/gigaPose
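To make the two-stage matching concrete, here is a minimal sketch under assumed inputs: toy 256-D descriptors, 162 viewpoint templates, and a per-patch scale and in-plane angle that the paper predicts with a network but that are simply passed in below. Function names are illustrative, not the released GigaPose API.
```python
# Minimal sketch of GigaPose-style two-stage matching, under assumed inputs.
import numpy as np

def match_template(query_desc, template_descs):
    """Fast nearest-neighbor lookup in feature space over templates that
    cover only the two out-of-plane rotation DoF (azimuth, elevation)."""
    q = query_desc / np.linalg.norm(query_desc)
    t = template_descs / np.linalg.norm(template_descs, axis=1, keepdims=True)
    return int(np.argmax(t @ q))            # index of best-matching template

def pose_from_one_correspondence(uv_query, uv_template, scale, inplane_deg):
    """Recover the four remaining DoF (in-plane rotation, 2D translation,
    relative scale) from one 2D-2D patch correspondence, given the scale
    and in-plane angle for that patch pair (network-predicted in the
    paper; passed in here)."""
    theta = np.deg2rad(inplane_deg)
    A = scale * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])
    t2d = np.asarray(uv_query) - A @ np.asarray(uv_template)
    return A, t2d   # 2D similarity transform: template pixels -> query pixels

# Toy usage: 162 viewpoint templates with 256-D descriptors.
rng = np.random.default_rng(0)
templates = rng.normal(size=(162, 256))
best = match_template(rng.normal(size=256), templates)
A, t2d = pose_from_one_correspondence((120.0, 80.0), (64.0, 64.0), 1.3, 15.0)
```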
Related papers
- GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting [23.724077890247834]
GS-Pose is a framework for localizing and estimating the 6D pose of novel objects.
It operates sequentially by locating the object in the input image, estimating its initial 6D pose, and refining the pose with a render-and-compare method.
Off-the-shelf toolchains and commodity hardware, such as mobile phones, can be used to capture new objects to be added to the database.
arXiv Detail & Related papers (2024-03-15T21:06:14Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- FoundPose: Unseen Object Pose Estimation with Foundation Features [11.32559845631345]
FoundPose is a model-based method for 6D pose estimation of unseen objects from a single RGB image.
The method can quickly onboard new objects using their 3D models without requiring any object- or task-specific training.
arXiv Detail & Related papers (2023-11-30T18:52:29Z)
- OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models [51.68715543630427]
OnePose relies on detecting repeatable image keypoints and is thus prone to failure on low-textured objects.
We propose a keypoint-free pose estimation pipeline to remove the need for repeatable keypoint detection.
A 2D-3D matching network directly establishes correspondences between the query image and the reconstructed point-cloud model, from which the pose can be recovered with PnP (a minimal sketch of that final step follows this entry).
arXiv Detail & Related papers (2023-01-18T17:47:13Z)
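The matching network itself is beyond a short sketch, but the standard final step such 2D-3D correspondences feed into, recovering the 6D pose with PnP and RANSAC, is easy to show. A minimal OpenCV version with synthetic correspondences and an assumed pinhole intrinsic matrix:
```python
# Minimal sketch: 6D pose from 2D-3D correspondences via PnP + RANSAC.
# Synthetic points stand in for a matching network's output.
import cv2
import numpy as np

rng = np.random.default_rng(1)
K = np.array([[600.0, 0.0, 320.0],          # assumed pinhole intrinsics
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

pts3d = rng.uniform(-0.1, 0.1, size=(50, 3))   # object points (meters)
rvec_gt = np.array([0.2, -0.1, 0.3])           # ground-truth rotation
tvec_gt = np.array([0.0, 0.0, 0.5])            # ground-truth translation
pts2d, _ = cv2.projectPoints(pts3d, rvec_gt, tvec_gt, K, None)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    pts3d.astype(np.float32), pts2d.astype(np.float32), K, None,
    reprojectionError=3.0, flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)       # 3x3 rotation of the recovered pose
```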
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render-and-compare strategy that can be applied to novel objects (a schematic render-and-compare loop is sketched after this entry).
Second, we introduce a novel approach to coarse pose estimation that leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
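To illustrate the render-and-compare idea, here is a schematic loop in which everything is a stand-in: a Gaussian-blob "renderer", plain correlation as the comparison, and random-search updates, whereas MegaPose learns the comparison and the update with networks:
```python
# Schematic render-and-compare refinement; all components are toy stand-ins.
import numpy as np

def render(pose, size=64):
    """Toy renderer: blob position follows (tx, ty), size follows tz.
    The rotation entries pose[:3] are ignored in this toy."""
    ys, xs = np.mgrid[0:size, 0:size]
    cx, cy = size / 2 + 40 * pose[3], size / 2 + 40 * pose[4]
    s = 8.0 * np.exp(pose[5])
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * s ** 2))

def similarity(a, b):
    """Normalized correlation between a render and the observed crop."""
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def refine(observed, pose, steps=5, proposals=16, sigma=0.2):
    """Propose perturbed poses, render each, keep the best-comparing one,
    and shrink the search radius -- a coarse render-and-compare loop."""
    rng = np.random.default_rng(0)
    for _ in range(steps):
        cands = [pose] + [pose + rng.normal(0, sigma, 6) for _ in range(proposals)]
        pose = max(cands, key=lambda p: similarity(render(p), observed))
        sigma *= 0.5
    return pose

observed = render(np.array([0, 0, 0, 0.3, -0.2, 0.1]))  # "input image" crop
refined = refine(observed, np.zeros(6))                  # start from identity
```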
- SPARC: Sparse Render-and-Compare for CAD model alignment in a single RGB image [21.77811443143683]
Estimating the 3D shapes and poses of static objects from a single image has important applications in robotics, augmented reality, and digital content creation.
We demonstrate that a sparse, iterative render-and-compare approach is more accurate and robust than relying on normalised object coordinates.
Our alignment procedure converges after just 3 iterations, improving on the state of the art on the challenging real-world ScanNet dataset.
arXiv Detail & Related papers (2022-10-03T16:02:10Z)
- OnePose: One-Shot Object Pose Estimation without CAD Models [30.307122037051126]
OnePose does not rely on CAD models and can handle objects of arbitrary categories without instance- or category-specific network training.
OnePose draws on ideas from visual localization and requires only a simple RGB video scan of the object to build a sparse SfM model of it.
To mitigate the slow runtime of existing visual localization methods, we propose a new graph attention network that directly matches 2D interest points in the query image with the 3D points in the SfM model (a toy attention-based matcher is sketched after this entry).
arXiv Detail & Related papers (2022-05-24T17:59:21Z)
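For a flavor of attention-based 2D-3D matching, the toy sketch below runs one randomly initialized cross-attention layer over keypoint and SfM-point descriptors before mutual-nearest-neighbor selection; it is illustrative only, not OnePose's trained graph attention architecture:
```python
# Toy attention-based 2D-3D matching in PyTorch (untrained, illustrative).
import torch
import torch.nn.functional as F

def match_2d_3d(desc2d, desc3d):
    """desc2d: (N, D) query-image keypoint features; desc3d: (M, D) SfM
    point features. Returns (K, 2) index pairs of mutual matches."""
    attn = torch.nn.MultiheadAttention(embed_dim=desc2d.shape[1],
                                       num_heads=4, batch_first=True)
    # Let 2D descriptors gather context from the 3D point cloud.
    ctx, _ = attn(desc2d[None], desc3d[None], desc3d[None])
    scores = F.normalize(ctx[0], dim=1) @ F.normalize(desc3d, dim=1).T
    i2j = scores.argmax(dim=1)          # best 3D point per 2D keypoint
    j2i = scores.argmax(dim=0)          # best 2D keypoint per 3D point
    keep = j2i[i2j] == torch.arange(len(i2j))   # keep mutual agreements
    return torch.stack([torch.arange(len(i2j))[keep], i2j[keep]], dim=1)

matches = match_2d_3d(torch.randn(100, 64), torch.randn(500, 64))
```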
- UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model [58.70130563417079]
We introduce a new 3D human-body model with a series of decoupled parameters that freely control the generation of the body.
Compared to the existing manually annotated DensePose-COCO dataset, the synthetic UltraPose provides ultra-dense image-to-surface correspondences without annotation cost or error.
arXiv Detail & Related papers (2021-10-28T16:24:55Z)
- Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image [58.953160501596805]
We propose a novel approach to constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion.
Our approach is more robust than the state of the art in real-world scenarios without exact CAD matches (a generic contrastive-embedding sketch follows this entry).
arXiv Detail & Related papers (2021-08-20T20:58:52Z)
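A common recipe for learning such a joint patch embedding is a symmetric contrastive (InfoNCE) objective. The sketch below uses that generic recipe with toy linear encoders; it is not Patch2CAD's actual architecture or loss:
```python
# Sketch of a joint 2D-image / 3D-CAD patch embedding trained with a
# symmetric InfoNCE loss; encoders are toy linear maps over flat patches.
import torch
import torch.nn.functional as F

img_enc = torch.nn.Linear(32 * 32 * 3, 128)   # placeholder image-patch encoder
cad_enc = torch.nn.Linear(32 * 32 * 1, 128)   # placeholder CAD-render encoder

def info_nce(img_patches, cad_patches, tau=0.07):
    """img_patches[i] and cad_patches[i] form a matching 2D/3D pair."""
    zi = F.normalize(img_enc(img_patches.flatten(1)), dim=1)
    zc = F.normalize(cad_enc(cad_patches.flatten(1)), dim=1)
    logits = zi @ zc.T / tau              # (B, B) similarity matrix
    target = torch.arange(len(zi))        # positives lie on the diagonal
    return (F.cross_entropy(logits, target) +
            F.cross_entropy(logits.T, target)) / 2

loss = info_nce(torch.randn(16, 3, 32, 32), torch.randn(16, 1, 32, 32))
loss.backward()   # gradients flow into both encoders
```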