3D Point-to-Keypoint Voting Network for 6D Pose Estimation
- URL: http://arxiv.org/abs/2012.11938v1
- Date: Tue, 22 Dec 2020 11:43:15 GMT
- Title: 3D Point-to-Keypoint Voting Network for 6D Pose Estimation
- Authors: Weitong Hua, Jiaxin Guo, Yue Wang and Rong Xiong
- Abstract summary: We propose a framework for 6D pose estimation from RGB-D data based on spatial structure characteristics of 3D keypoints.
The proposed method is verified on two benchmark datasets, LINEMOD and OCCLUSION LINEMOD.
- Score: 8.801404171357916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object 6D pose estimation is an important research topic in
computer vision due to its wide range of applications and the challenges posed
by the complexity and variability of real-world scenes. We believe that fully
exploring the spatial relationships between points can help improve pose
estimation performance, especially in scenes with background clutter and
partial occlusion. However, this information has usually been ignored in
previous work using RGB images or RGB-D data. In this paper, we propose a framework for
6D pose estimation from RGB-D data based on spatial structure characteristics
of 3D keypoints. We adopt point-wise dense feature embedding to vote for 3D
keypoints, which makes full use of the structure information of the rigid body.
After the direction vectors pointing to the keypoints are predicted by a CNN, we
use RANSAC voting to calculate the coordinates of the 3D keypoints; the pose
transformation can then be obtained by the least-squares method. In addition, a
spatial dimension sampling strategy for points is employed, which enables the
method to achieve excellent performance on small training sets. The proposed
method is verified on two benchmark datasets, LINEMOD and OCCLUSION LINEMOD.
The experimental results show that our method outperforms state-of-the-art
approaches, achieving ADD(-S) accuracy of 98.7% on the LINEMOD dataset and
52.6% on the OCCLUSION LINEMOD dataset in real time.
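The pipeline described in the abstract combines two standard geometric steps: recovering each 3D keypoint from the per-point direction vectors via RANSAC-style voting, and solving the rigid pose from model/scene keypoint pairs by least squares. The sketch below illustrates both steps with NumPy. It is not the authors' implementation; the function names, array shapes, iteration count, and inlier threshold are assumptions.

```python
import numpy as np


def lsq_point_from_lines(points, dirs):
    """Least-squares 3D point closest to a set of lines (anchor point + unit direction)."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(points, dirs):
        proj = np.eye(3) - np.outer(d, d)   # projector orthogonal to the line direction
        A += proj
        b += proj @ p
    return np.linalg.solve(A, b)


def ransac_keypoint_vote(points, dirs, iters=100, inlier_thresh=0.01, rng=None):
    """Recover one 3D keypoint from N scene points (N, 3) and their predicted
    unit direction vectors (N, 3) toward that keypoint, via RANSAC over ray pairs."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_inliers, best_cand = None, None
    for _ in range(iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        cand = lsq_point_from_lines(points[[i, j]], dirs[[i, j]])
        # distance from the candidate keypoint to every predicted ray
        v = cand - points
        dist = np.linalg.norm(v - np.sum(v * dirs, axis=1, keepdims=True) * dirs, axis=1)
        inliers = dist < inlier_thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_cand = inliers, cand
    if best_inliers.sum() >= 2:             # refine on the consensus set
        return lsq_point_from_lines(points[best_inliers], dirs[best_inliers])
    return best_cand


def kabsch_pose(model_kps, scene_kps):
    """Rigid (R, t) minimizing ||R @ model_kps[i] + t - scene_kps[i]||^2 (Kabsch/Umeyama)."""
    cm, cs = model_kps.mean(axis=0), scene_kps.mean(axis=0)
    H = (model_kps - cm).T @ (scene_kps - cs)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cs - R @ cm
    return R, t
```

In this sketch, `model_kps` would be the keypoints defined on the object model and `scene_kps` the voted keypoint locations in camera coordinates, so the returned (R, t) is the estimated 6D pose.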
Related papers
- PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic
Occupancy Prediction [72.75478398447396]
We propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively.
Considering the distance distribution of LiDAR point clouds, we construct the tri-perspective view in the cylindrical coordinate system.
We employ spatial group pooling to maintain structural details during projection and adopt 2D backbones to efficiently process each TPV plane.
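As a rough illustration of the cylindrical-coordinate step mentioned above (not PointOcc's code; plane resolutions and coordinate ranges are configuration details the summary does not give), the mapping from Cartesian LiDAR points to (rho, theta, z) is:

```python
import numpy as np

def to_cylindrical(points_xyz):
    """Map Cartesian LiDAR points (N, 3) to cylindrical coordinates (rho, theta, z)."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    rho = np.hypot(x, y)          # radial distance from the sensor's vertical axis
    theta = np.arctan2(y, x)      # azimuth angle in [-pi, pi]
    return np.stack([rho, theta, z], axis=1)
```

Each point would then be pooled onto the three TPV planes (rho-theta, theta-z, rho-z) according to these coordinates.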
arXiv Detail & Related papers (2023-08-31T17:57:17Z)
- Spatial Feature Mapping for 6DoF Object Pose Estimation [29.929911622127502]
This work aims to estimate the 6DoF (6D) object pose in background clutter.
Considering the strong occlusion and background noise, we propose to utilize the spatial structure to better tackle this challenging task.
arXiv Detail & Related papers (2022-06-03T21:44:10Z)
- RBGNet: Ray-based Grouping for 3D Object Detection [104.98776095895641]
We propose the RBGNet framework, a voting-based 3D detector for accurate 3D object detection from point clouds.
We propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays.
Our model achieves state-of-the-art 3D detection performance on ScanNet V2 and SUN RGB-D with remarkable performance gains.
arXiv Detail & Related papers (2022-04-05T14:42:57Z)
- ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation [76.31125154523056]
We present a discrete descriptor that can represent the object surface densely.
We also propose a coarse-to-fine training strategy, which enables fine-grained correspondence prediction.
arXiv Detail & Related papers (2022-03-17T16:16:24Z)
- Weakly Supervised Learning of Keypoints for 6D Object Pose Estimation [73.40404343241782]
We propose a weakly supervised 6D object pose estimation approach based on 2D keypoint detection.
Our approach achieves comparable performance with state-of-the-art fully supervised approaches.
arXiv Detail & Related papers (2022-03-07T16:23:47Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA proves effective in identifying valuable points related to foreground objects and improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
- KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation [43.839322860501596]
KDFNet is a novel method for 6D object pose estimation from RGB images.
We propose a continuous representation called Keypoint Distance Field (KDF) for projected 2D keypoint locations.
We use a fully convolutional neural network to regress the KDF for each keypoint.
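As a minimal sketch of the distance-field idea described above (assumed shapes; this is not the KDFNet code), the regression target for one keypoint is simply the per-pixel Euclidean distance to its projected 2D location:

```python
import numpy as np

def keypoint_distance_field(height, width, keypoint_xy):
    """Per-pixel Euclidean distance from every pixel to the projected keypoint (x, y)."""
    ys, xs = np.mgrid[0:height, 0:width]
    return np.hypot(xs - keypoint_xy[0], ys - keypoint_xy[1])
```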
arXiv Detail & Related papers (2021-09-21T12:17:24Z)
- Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting [7.6997148655751895]
We propose a novel keypoint voting scheme based on intersecting spheres that is more accurate than existing schemes and allows for a smaller set of more dispersed keypoints.
The scheme forms the basis of the proposed RCVPose method for 6 DoF pose estimation of 3D objects in RGB-D data.
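A minimal sketch of the intersecting-spheres idea (not the RCVPose accumulator): if every scene point votes with an estimated radial distance to a keypoint, the keypoint can be recovered from at least four non-degenerate votes by linearized least-squares trilateration.

```python
import numpy as np

def sphere_intersection(centers, radii):
    """Least-squares point x with ||x - centers[i]|| ~= radii[i]; needs >= 4 votes."""
    c0, r0 = centers[0], radii[0]
    # subtracting the first sphere equation from the others linearizes the system
    A = 2.0 * (centers[1:] - c0)
    b = (np.sum(centers[1:] ** 2, axis=1) - np.dot(c0, c0)
         + r0 ** 2 - radii[1:] ** 2)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```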
arXiv Detail & Related papers (2021-04-06T14:06:08Z)
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
- L6DNet: Light 6 DoF Network for Robust and Precise Object Pose Estimation with Small Datasets [0.0]
We propose a novel approach to perform 6 DoF object pose estimation from a single RGB-D image.
We adopt a hybrid pipeline in two stages: data-driven and geometric.
Our approach is more robust and accurate than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-03T17:41:29Z)