ObjectMatch: Robust Registration using Canonical Object Correspondences
- URL: http://arxiv.org/abs/2212.01985v2
- Date: Fri, 24 Mar 2023 23:37:57 GMT
- Title: ObjectMatch: Robust Registration using Canonical Object Correspondences
- Authors: Can Gümeli, Angela Dai, Matthias Nießner
- Abstract summary: We present ObjectMatch, a semantic and object-centric camera pose estimator for RGB-D SLAM pipelines.
In registering RGB-D sequences, our method outperforms cutting-edge SLAM baselines in challenging, low-frame-rate scenarios.
- Score: 21.516657643120375
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present ObjectMatch, a semantic and object-centric camera pose estimator
for RGB-D SLAM pipelines. Modern camera pose estimators rely on direct
correspondences of overlapping regions between frames; however, they cannot
align camera frames with little or no overlap. In this work, we propose to
leverage indirect correspondences obtained via semantic object identification.
For instance, when an object is seen from the front in one frame and from the
back in another frame, we can provide additional pose constraints through
canonical object correspondences. We first propose a neural network to predict
such correspondences on a per-pixel level, which we then combine in our energy
formulation with state-of-the-art keypoint matching solved with a joint
Gauss-Newton optimization. In a pairwise setting, our method improves
registration recall of state-of-the-art feature matching, including from 24% to
45% in pairs with 10% or less inter-frame overlap. In registering RGB-D
sequences, our method outperforms cutting-edge SLAM baselines in challenging,
low-frame-rate scenarios, achieving more than 35% reduction in trajectory error
in multiple scenes.
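To make the energy formulation concrete, here is a minimal illustrative sketch (not the authors' implementation) of the kind of joint optimization the abstract describes: matched keypoints contribute direct residuals between two frames, while pixels on an identified object contribute indirect residuals through their predicted canonical coordinates and a shared, unknown object pose. SciPy's least-squares solver stands in for the paper's joint Gauss-Newton step, and all function and variable names here are hypothetical.

```python
# Illustrative sketch only: joint optimization over one camera pose and one
# object pose, mixing direct keypoint matches with indirect canonical object
# correspondences. Camera 0 is fixed at the identity to remove the global
# gauge freedom; scipy's solver replaces a hand-rolled Gauss-Newton.
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation as R


def apply_pose(p, pts):
    """Apply a 6-DoF pose p = [rotation vector (3), translation (3)] to Nx3 points."""
    return R.from_rotvec(p[:3]).apply(pts) + p[3:6]


def residuals(x, kp0, kp1, obs0, canon0, obs1, canon1, w_obj=1.0):
    """x = [camera-1 pose (6), object pose (6)], both expressed in frame 0.

    kp0, kp1:      matched 3D keypoints back-projected in frames 0 and 1 (direct term).
    obs0, obs1:    3D points on the identified object, back-projected in each frame.
    canon0/canon1: predicted canonical coordinates for those points (network output).
    """
    cam1, obj = x[:6], x[6:12]
    # Direct term: matched keypoints must coincide once frame 1 is aligned to frame 0.
    r_kp = kp0 - apply_pose(cam1, kp1)
    # Indirect term: observed object points must agree with the canonical model
    # placed at the jointly estimated object pose -- this constrains the cameras
    # even when the two frames barely overlap.
    r_obj0 = obs0 - apply_pose(obj, canon0)
    r_obj1 = apply_pose(cam1, obs1) - apply_pose(obj, canon1)
    return np.concatenate([r_kp.ravel(),
                           w_obj * r_obj0.ravel(),
                           w_obj * r_obj1.ravel()])


def register_pair(kp0, kp1, obs0, canon0, obs1, canon1):
    x0 = np.zeros(12)  # identity initialization for both poses
    sol = least_squares(residuals, x0,
                        args=(kp0, kp1, obs0, canon0, obs1, canon1))
    return sol.x[:6], sol.x[6:12]  # camera-1 pose, object pose
```

Because the object pose is shared by both frames, the indirect term couples their camera poses even when direct matches are scarce, which is the property the paper exploits for low-overlap pairs.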
Related papers
- RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images [13.051302134031808]
We introduce a novel method for calculating the 6DoF pose of an object using a single RGB-D image.
Unlike existing methods that either directly predict objects' poses or rely on sparse keypoints for pose recovery, our approach addresses this challenging task using dense correspondence.
arXiv Detail & Related papers (2024-05-14T10:10:45Z)
- FoundPose: Unseen Object Pose Estimation with Foundation Features [11.32559845631345]
FoundPose is a model-based method for 6D pose estimation of unseen objects from a single RGB image.
The method can quickly onboard new objects using their 3D models without requiring any object- or task-specific training.
arXiv Detail & Related papers (2023-11-30T18:52:29Z)
- RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z)
- Quantity-Aware Coarse-to-Fine Correspondence for Image-to-Point Cloud Registration [4.954184310509112]
Image-to-point cloud registration aims to determine the relative camera pose between an RGB image and a reference point cloud.
Matching individual points with pixels can be inherently ambiguous due to modality gaps.
We propose a framework to capture quantity-aware correspondences between local point sets and pixel patches.
arXiv Detail & Related papers (2023-07-14T03:55:54Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object-to-image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a point cloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- CheckerPose: Progressive Dense Keypoint Localization for Object Pose Estimation with Graph Neural Network [66.24726878647543]
Estimating the 6-DoF pose of a rigid object from a single RGB image is a crucial yet challenging task.
Recent studies have shown the great potential of dense correspondence-based solutions.
We propose a novel pose estimation algorithm named CheckerPose, which improves on three main aspects.
arXiv Detail & Related papers (2023-03-29T17:30:53Z)
- RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild [73.1276968007689]
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
We show that our approach outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.
arXiv Detail & Related papers (2022-08-11T17:59:59Z)
- Level Set-Based Camera Pose Estimation From Multiple 2D/3D Ellipse-Ellipsoid Correspondences [2.016317500787292]
We show that the definition of a cost function characterizing the projection of a 3D object onto a 2D object detection is not straightforward.
We develop an ellipse-ellipse cost based on level-set sampling, demonstrate its desirable properties for handling partially visible objects, and compare its performance with other common metrics.
arXiv Detail & Related papers (2022-07-16T14:09:54Z)
- Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z)
- DeepI2P: Image-to-Point Cloud Registration via Deep Classification [71.3121124994105]
DeepI2P is a novel approach for cross-modality registration between an image and a point cloud.
Our method estimates the relative rigid transformation between the coordinate frames of the camera and Lidar.
We circumvent the difficulty of cross-modality matching by converting the registration problem into a classification and inverse camera projection optimization problem.
arXiv Detail & Related papers (2021-04-08T04:27:32Z)
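Several of the related papers above (for example RDPN6D, CheckerPose, and the semantic keypoint-based approach) ultimately recover a 6-DoF pose from predicted 2D-3D correspondences. The snippet below is a generic, minimal sketch of that final correspondence-to-pose step using OpenCV's PnP-with-RANSAC solver; the input arrays are hypothetical placeholders for a network's predictions, and this is not any particular paper's implementation.

```python
# Generic sketch: recovering a 6-DoF object pose from predicted 2D-3D
# correspondences with PnP + RANSAC. The correspondence arrays are placeholders
# standing in for a network's dense or keypoint predictions.
import numpy as np
import cv2


def pose_from_correspondences(pts_3d, pts_2d, K):
    """pts_3d: Nx3 object-frame points; pts_2d: Nx2 pixel coordinates; K: 3x3 intrinsics."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d.astype(np.float64),
        pts_2d.astype(np.float64),
        K, distCoeffs=None,
        iterationsCount=200, reprojectionError=3.0,
    )
    if not ok:
        raise RuntimeError("PnP failed: not enough consistent correspondences")
    R_mat, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    return R_mat, tvec.reshape(3), inliers
```

The RANSAC loop discards outlier correspondences, which matters in practice because dense prediction networks inevitably produce some wrong matches.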