Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial
Keypoint Voting
- URL: http://arxiv.org/abs/2104.02527v2
- Date: Wed, 7 Apr 2021 21:29:19 GMT
- Title: Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial
Keypoint Voting
- Authors: Yangzheng Wu, Mohsen Zand, Ali Etemad, Michael Greenspan
- Abstract summary: We propose a novel keypoint voting scheme based on intersecting spheres, that is more accurate than existing schemes and allows for a smaller set of more disperse keypoints.
The scheme forms the basis of the proposed RCVPose method for 6 DoF pose estimation of 3D objects in RGB-D data.
- Score: 7.6997148655751895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel keypoint voting scheme based on intersecting spheres, that
is more accurate than existing schemes and allows for a smaller set of more
disperse keypoints. The scheme forms the basis of the proposed RCVPose method
for 6 DoF pose estimation of 3D objects in RGB-D data, which is particularly
effective at handling occlusions. A CNN is trained to estimate the distance
between the 3D point corresponding to the depth mode of each RGB pixel, and a
set of 3 disperse keypoints defined in the object frame. At inference, a sphere
of radius equal to this estimated distance is generated, centered at each 3D
point. The surface of these spheres votes to increment a 3D accumulator space,
the peaks of which indicate keypoint locations. The proposed radial voting
scheme is more accurate than previous vector or offset schemes, and robust to
disperse keypoints. Experiments demonstrate RCVPose to be highly accurate and
competitive, achieving state-of-the-art results on the LINEMOD (99.7%) and
YCB-Video (97.2%) datasets, and notably scoring 7.9% higher than previous
methods on the challenging Occlusion LINEMOD dataset (71.1%).
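The radial voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`radial_vote`, `peak_location`), the dense voxel grid, and the shell tolerance are assumptions made for illustration, and the CNN-predicted distances are replaced by exact distances computed from synthetic points.

```python
import numpy as np


def radial_vote(points, radii, grid_min, voxel_size, grid_shape, shell_tol=None):
    """Accumulate sphere-surface votes in a dense voxel grid (illustrative only).

    points : (N, 3) array of 3D scene points
    radii  : (N,) array of estimated point-to-keypoint distances
    """
    if shell_tol is None:
        # Assumption: a voxel is "on the surface" if within half a voxel of it.
        shell_tol = 0.5 * voxel_size
    grid_min = np.asarray(grid_min, dtype=float)
    # Coordinates of every voxel centre, shape (X, Y, Z, 3).
    axes = [grid_min[i] + (np.arange(grid_shape[i]) + 0.5) * voxel_size
            for i in range(3)]
    centres = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1)
    acc = np.zeros(grid_shape, dtype=np.int32)
    for p, r in zip(points, radii):
        # Distance from this scene point to every voxel centre.
        dist = np.linalg.norm(centres - p, axis=-1)
        # Voxels lying on the sphere of radius r around p each receive one vote.
        acc[np.abs(dist - r) <= shell_tol] += 1
    return acc, centres


def peak_location(acc, centres):
    """Return the centre of the voxel with the most votes."""
    idx = np.unravel_index(np.argmax(acc), acc.shape)
    return centres[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keypoint = np.array([0.09, -0.05, 0.21])            # hypothetical keypoint
    points = rng.uniform(-0.3, 0.3, size=(200, 3))      # synthetic scene points
    radii = np.linalg.norm(points - keypoint, axis=1)   # stand-in for CNN output
    acc, centres = radial_vote(points, radii, grid_min=(-0.5, -0.5, -0.5),
                               voxel_size=0.02, grid_shape=(50, 50, 50))
    print("estimated keypoint:", peak_location(acc, centres))
```

Each point contributes only a scalar distance, so a vote constrains the keypoint to a sphere rather than a ray or a single offset; the keypoint emerges where many spheres intersect. A practical system would likely avoid the dense per-point loop used here, which is kept only for readability.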
Related papers
- 3D Object Detection from Point Cloud via Voting Step Diffusion [52.9966883689137]
Existing voting-based methods often receive votes from the partial surfaces of individual objects together with severe noise, leading to sub-optimal detection performance.
We propose a new method to move random 3D points toward the high-density region of the distribution by estimating the score function of the distribution with a noise conditioned score network.
Experiments on two large scale indoor 3D scene datasets, SUN RGB-D and ScanNet V2, demonstrate the superiority of our proposed method.
arXiv Detail & Related papers (2024-03-21T05:04:52Z)
- Learning Better Keypoints for Multi-Object 6DoF Pose Estimation [1.0878040851638]
We train a graph network to select a set of disperse keypoints with similarly distributed votes.
These votes, learned by a regression network to accumulate evidence for the keypoint locations, can be regressed more accurately.
Experiments demonstrate the keypoints selected by KeyGNet improved the accuracy for all evaluation metrics of all seven datasets tested.
arXiv Detail & Related papers (2023-08-15T15:11:13Z)
- Keypoint Cascade Voting for Point Cloud Based 6DoF Pose Estimation [1.3439502310822147]
We propose a novel keypoint voting 6DoF object pose estimation method, which takes pure unordered point cloud geometry as input without RGB information.
The proposed cascaded keypoint voting method, called RCVPose3D, is based upon a novel architecture which separates the task of semantic segmentation from that of keypoint regression.
arXiv Detail & Related papers (2022-10-14T21:36:52Z)
- Neural Correspondence Field for Object Pose Estimation [67.96767010122633]
We propose a method for estimating the 6DoF pose of a rigid object with an available 3D model from a single RGB image.
Unlike classical correspondence-based methods which predict 3D object coordinates at pixels of the input image, the proposed method predicts 3D object coordinates at 3D query points sampled in the camera frustum.
arXiv Detail & Related papers (2022-07-30T01:48:23Z)
- VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the 'virtual' points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
arXiv Detail & Related papers (2021-11-29T08:51:20Z)
- KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation [43.839322860501596]
KDFNet is a novel method for 6D object pose estimation from RGB images.
We propose a continuous representation called Keypoint Distance Field (KDF) for projected 2D keypoint locations.
We use a fully convolutional neural network to regress the KDF for each keypoint.
arXiv Detail & Related papers (2021-09-21T12:17:24Z)
- SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation [98.83762558394345]
SO-Pose is a framework for regressing all 6 degrees-of-freedom (6DoF) for the object pose in a cluttered environment from a single RGB image.
We introduce a novel reasoning about self-occlusion, in order to establish a two-layer representation for 3D objects.
Using cross-layer consistencies that align correspondences, self-occlusion and 6D pose, we can further improve accuracy and robustness.
arXiv Detail & Related papers (2021-08-18T19:49:29Z)
- 3D Point-to-Keypoint Voting Network for 6D Pose Estimation [8.801404171357916]
We propose a framework for 6D pose estimation from RGB-D data based on spatial structure characteristics of 3D keypoints.
The proposed method is verified on two benchmark datasets, LINEMOD and OCCLUSION LINEMOD.
arXiv Detail & Related papers (2020-12-22T11:43:15Z)
- EPOS: Estimating 6D Pose of Objects with Symmetries [57.448933686429825]
We present a new method for estimating the 6D pose of rigid objects with available 3D models from a single RGB input.
An object is represented by compact surface fragments, which allow symmetries to be handled in a systematic manner.
Correspondences between densely sampled pixels and the fragments are predicted using an encoder-decoder network.
arXiv Detail & Related papers (2020-04-01T17:41:08Z)
- Robust 6D Object Pose Estimation by Learning RGB-D Features [59.580366107770764]
We propose a novel discrete-continuous formulation for rotation regression to resolve the local-optimum problem.
We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction.
Experiments on two benchmarks: LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-02-29T06:24:55Z)