KGNv2: Separating Scale and Pose Prediction for Keypoint-based 6-DoF
Grasp Synthesis on RGB-D input
- URL: http://arxiv.org/abs/2303.05617v3
- Date: Mon, 1 May 2023 17:52:45 GMT
- Authors: Yiye Chen, Ruinian Xu, Yunzhi Lin, Hongyi Chen, Patricio A. Vela
- Abstract summary: Keypoint-based grasp detectors operating on image input have demonstrated promising results.
We devise a new grasp generation network that reduces the dependency on precise keypoint estimation.
- Score: 16.897624250286487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new 6-DoF grasp pose synthesis approach from 2D/2.5D input based
on keypoints. A keypoint-based grasp detector operating on image input demonstrated
promising results in a previous study, where the additional visual
information provided by color images compensates for the noisy depth
perception. However, it relies heavily on accurately predicting the location of
keypoints in the image space. In this paper, we devise a new grasp generation
network that reduces the dependency on precise keypoint estimation. Given an
RGB-D input, our network estimates both the grasp pose from keypoint detection
and the scale towards the camera. We further redesign the keypoint output
space to mitigate the negative impact of keypoint prediction noise on the
Perspective-n-Point (PnP) algorithm. Experiments show that the proposed method
outperforms the baseline by a large margin, validating the efficacy of our
approach. Finally, despite being trained on simple synthetic objects, our method
demonstrates sim-to-real capacity by showing competitive results in real-world
robot experiments.
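To illustrate why scale can be treated separately from a pose recovered from image keypoints, here is a minimal NumPy sketch of the pinhole projection's scale ambiguity. It is not the authors' network or their PnP pipeline. The intrinsics `K`, the four gripper keypoints, the pose `(R, t)`, and the `project` helper are all hypothetical values chosen for illustration.

```python
import numpy as np

def project(K, R, t, pts):
    """Pinhole projection of Nx3 object-frame points to Nx2 pixel coordinates."""
    cam = pts @ R.T + t              # object frame -> camera frame
    uvw = cam @ K.T                  # apply camera intrinsics
    return uvw[:, :2] / uvw[:, 2:3]  # perspective division by depth

# Hypothetical intrinsics and four hypothetical gripper keypoints (metres).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
keypoints = np.array([[-0.04, 0.0, 0.00],
                      [0.04, 0.0, 0.00],
                      [-0.04, 0.0, 0.06],
                      [0.04, 0.0, 0.06]])

# Hypothetical grasp pose: 30 degree rotation about the y-axis, 0.6 m away.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.05, -0.02, 0.6])

pixels = project(K, R, t, keypoints)

# Scaling the keypoint geometry and the translation by the same factor
# leaves every projected pixel unchanged, so image keypoints alone
# constrain the pose only up to scale.
scale = 1.7
pixels_scaled = project(K, R, scale * t, scale * keypoints)
same_projection = np.allclose(pixels, pixels_scaled)

# With a separately estimated distance to the camera, a unit-norm
# translation direction can be lifted back to a metric translation.
t_direction = t / np.linalg.norm(t)
distance = np.linalg.norm(t)  # stands in for a predicted scale
t_metric = distance * t_direction
```

In this sketch `same_projection` evaluates to true, so the keypoint projections cannot distinguish the two configurations; some additional cue, such as depth or a predicted scale, is needed to fix the metric translation.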
Related papers
- SRPose: Two-view Relative Pose Estimation with Sparse Keypoints [51.49105161103385]
SRPose is a sparse keypoint-based framework for two-view relative pose estimation in camera-to-world and object-to-camera scenarios.
It achieves competitive or superior performance compared to state-of-the-art methods in terms of accuracy and speed.
It is robust to different image sizes and camera intrinsics, and can be deployed with low computing resources.
(2024-07-11T05:46:35Z)
- RGB-based Category-level Object Pose Estimation via Decoupled Metric
Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
(2023-09-19T02:20:26Z)
- Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural
Network [52.29330138835208]
Accurately matching local features between a pair of images is a challenging computer vision task.
Previous studies typically use attention-based graph neural networks (GNNs) with fully-connected graphs over keypoints within/across images.
We propose MaKeGNN, a sparse attention-based GNN architecture which bypasses non-repeatable keypoints and leverages matchable ones to guide message passing.
(2023-07-04T02:50:44Z)
- Centroid Distance Keypoint Detector for Colored Point Clouds [32.74803728070627]
Keypoint detection serves as the basis for many computer vision and robotics applications.
We propose an efficient multi-modal keypoint detector that can extract both geometry-salient and color-salient keypoints in colored point clouds.
(2022-10-04T00:55:51Z)
- Keypoint-GraspNet: Keypoint-based 6-DoF Grasp Generation from the
Monocular RGB-D input [6.1938383008964495]
The proposed solution, Keypoint-GraspNet, detects the projection of gripper keypoints in the image space and recovers poses with a Perspective-n-Point (PnP) algorithm.
Metric-based evaluation reveals that our method outperforms the baselines in terms of grasp proposal accuracy, diversity, and time cost.
(2022-09-19T04:23:20Z)
- Probabilistic Spatial Distribution Prior Based Attentional Keypoints
Matching Network [19.708243062836104]
Keypoints matching is a pivotal component for many image-relevant applications such as image stitching and visual simultaneous localization and mapping.
In this paper, we demonstrate that the motion estimation from IMU integration can be used to exploit the spatial distribution prior of keypoints between images.
We present a projection loss for the proposed keypoints matching network, which gives a smooth edge between matching and un-matching keypoints.
(2021-11-17T09:52:03Z)
- Accurate Grid Keypoint Learning for Efficient Video Prediction [87.71109421608232]
Keypoint-based video prediction methods can consume substantial computing resources in training and deployment.
In this paper, we design a new grid keypoint learning framework, aiming at a robust and explainable intermediate keypoint representation for long-term efficient video prediction.
Our method outperforms state-of-the-art video prediction methods while saving more than 98% of computing resources.
(2021-07-28T05:04:30Z)
- 6D Object Pose Estimation using Keypoints and Part Affinity Fields [24.126513851779936]
The task of 6D object pose estimation from RGB images is an important requirement for autonomous service robots to be able to interact with the real world.
We present a two-step pipeline for estimating the 6 DoF translation and orientation of known objects.
(2021-07-05T14:41:19Z)
- Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression [81.05772887221333]
We study the dense keypoint regression framework that was previously inferior to the keypoint detection and grouping framework.
We present a simple yet effective approach, named disentangled keypoint regression (DEKR).
We empirically show that the proposed direct regression method outperforms keypoint detection and grouping methods.
(2021-04-06T05:54:46Z)
- Bottom-Up Human Pose Estimation by Ranking Heatmap-Guided Adaptive
Keypoint Estimates [76.51095823248104]
We present several schemes, rarely or not thoroughly studied before, for improving keypoint detection and grouping (keypoint regression) performance.
First, we exploit the keypoint heatmaps for pixel-wise keypoint regression, rather than keeping the two separate, to improve keypoint regression.
Second, we adopt a pixel-wise spatial transformer network to learn adaptive representations for handling the scale and orientation variance.
Third, we present a joint shape and heatvalue scoring scheme to promote the estimated poses that are more likely to be true poses.
(2020-06-28T01:14:59Z)