SD-Net: Symmetric-Aware Keypoint Prediction and Domain Adaptation for 6D Pose Estimation In Bin-picking Scenarios
- URL: http://arxiv.org/abs/2403.09317v1
- Date: Thu, 14 Mar 2024 12:08:44 GMT
- Title: SD-Net: Symmetric-Aware Keypoint Prediction and Domain Adaptation for 6D Pose Estimation In Bin-picking Scenarios
- Authors: Ding-Tao Huang, En-Te Lin, Lipeng Chen, Li-Fu Liu, Long Zeng
- Abstract summary: We propose SD-Net, a new 6D pose estimation network with symmetric-aware keypoint prediction and self-training domain adaptation.
At the keypoint prediction stage, we design a robust 3D keypoint selection strategy to locate 3D keypoints even in highly occluded scenes.
At the domain adaptation stage, we propose a self-training framework using a student-teacher training scheme.
On the public Siléane dataset, SD-Net achieves state-of-the-art results with an average precision of 96%.
- Score: 2.786599193929693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the success of 6D pose estimation in bin-picking scenarios, existing methods still struggle to produce accurate predictions for symmetric objects and real-world scenes. The primary bottlenecks include 1) ambiguous keypoints caused by object symmetries and 2) the domain gap between real and synthetic data. To circumvent these problems, we propose a new 6D pose estimation network with symmetric-aware keypoint prediction and self-training domain adaptation (SD-Net). SD-Net builds on pointwise keypoint regression and deep Hough voting to perform reliable keypoint detection under clutter and occlusion. Specifically, at the keypoint prediction stage, we design a robust 3D keypoint selection strategy that accounts for the symmetry class of objects and equivalent keypoints, which facilitates locating 3D keypoints even in highly occluded scenes. Additionally, we build an effective filtering algorithm on predicted keypoints to dynamically eliminate ambiguous and outlier keypoint candidates. At the domain adaptation stage, we propose a self-training framework using a student-teacher training scheme. To carefully distinguish reliable predictions, we harness a tailored heuristic for 3D geometry pseudo-labelling based on semi-chamfer distance. On the public Siléane dataset, SD-Net achieves state-of-the-art results, obtaining an average precision of 96%. Testing learning and generalization abilities on the public Parametric datasets, SD-Net outperforms the state-of-the-art method by 8%. The code is available at https://github.com/dingthuang/SD-Net.
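The abstract does not define the semi-chamfer distance used for pseudo-labelling; a plausible reading is a one-directional chamfer distance from the pose-transformed model points to the observed scene cloud, used as a confidence gate on teacher predictions. The sketch below follows that assumption; the function names and the distance threshold are hypothetical, not taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def semi_chamfer_distance(query_pts, target_pts):
    """One-directional chamfer distance: mean distance from each
    query point to its nearest neighbour in the target cloud.
    Lower values suggest a better geometric fit."""
    tree = cKDTree(target_pts)
    dists, _ = tree.query(query_pts, k=1)
    return float(np.mean(dists))

def filter_pseudo_labels(poses, model_points, scene_points, threshold=0.01):
    """Keep only teacher pose predictions (R, t) whose transformed
    model points lie close to the observed scene cloud.
    The threshold (in scene units) is a hypothetical choice."""
    kept = []
    for R, t in poses:
        transformed = model_points @ R.T + t
        if semi_chamfer_distance(transformed, scene_points) < threshold:
            kept.append((R, t))
    return kept
```

In a student-teacher loop, the surviving poses would serve as pseudo-labels for training the student on unlabelled real data.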
Related papers
- E$^3$-Net: Efficient E(3)-Equivariant Normal Estimation Network [47.77270862087191]
We propose E3-Net to achieve equivariance for normal estimation.
We introduce an efficient random frame method, which significantly reduces the training resources required for this task to just 1/8 of previous work.
Our method achieves superior results on both synthetic and real-world datasets, and outperforms current state-of-the-art techniques by a substantial margin.
arXiv Detail & Related papers (2024-06-01T07:53:36Z) - Learning Better Keypoints for Multi-Object 6DoF Pose Estimation [1.0878040851638]
We train a graph network to select a set of dispersed keypoints with similarly distributed votes.
These votes, learned by a regression network to accumulate evidence for the keypoint locations, can be regressed more accurately.
Experiments demonstrate that the keypoints selected by KeyGNet improved accuracy on all evaluation metrics across all seven datasets tested.
arXiv Detail & Related papers (2023-08-15T15:11:13Z) - SC3K: Self-supervised and Coherent 3D Keypoints Estimation from Rotated, Noisy, and Decimated Point Cloud Data [17.471342278936365]
We propose a new method to infer keypoints from arbitrary object categories in practical scenarios where point cloud data (PCD) are noisy, down-sampled and arbitrarily rotated.
We achieve these desiderata by proposing a new self-supervised training strategy for keypoints estimation.
We compare the keypoints estimated by the proposed approach with those of the state-of-the-art unsupervised approaches.
arXiv Detail & Related papers (2023-08-10T08:10:01Z) - DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection [6.096961718434965]
We study the problem of semi-supervised 3D object detection, which is of great importance considering the high annotation cost for cluttered 3D indoor scenes.
We resort to the robust and principled framework of self-teaching, which has triggered notable progress for semi-supervised learning recently.
We propose the first semi-supervised 3D detection algorithm that works in a single-stage manner and allows spatially dense training signals.
arXiv Detail & Related papers (2023-04-25T17:59:54Z) - Piecewise Planar Hulls for Semi-Supervised Learning of 3D Shape and Pose from 2D Images [133.68032636906133]
We study the problem of estimating 3D shape and pose of an object in terms of keypoints, from a single 2D image.
The shape and pose are learned directly from images collected by categories and their partial 2D keypoint annotations.
arXiv Detail & Related papers (2022-11-14T16:18:11Z) - PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
We present the Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z) - Accurate Grid Keypoint Learning for Efficient Video Prediction [87.71109421608232]
Keypoint-based video prediction methods can consume substantial computing resources in training and deployment.
In this paper, we design a new grid keypoint learning framework, aiming at a robust and explainable intermediate keypoint representation for long-term efficient video prediction.
Our method outperforms state-of-the-art video prediction methods while saving more than 98% of computing resources.
arXiv Detail & Related papers (2021-07-28T05:04:30Z) - Uncertainty-Aware Camera Pose Estimation from Points and Lines [101.03675842534415]
Perspective-n-Point-and-Line (PnPL) aims at fast, accurate and robust camera localization with respect to a 3D model from 2D-3D feature coordinates.
arXiv Detail & Related papers (2021-07-08T15:19:36Z) - 6D Object Pose Estimation using Keypoints and Part Affinity Fields [24.126513851779936]
The task of 6D object pose estimation from RGB images is an important requirement for autonomous service robots to be able to interact with the real world.
We present a two-step pipeline for estimating the 6 DoF translation and orientation of known objects.
arXiv Detail & Related papers (2021-07-05T14:41:19Z) - 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [76.42897462051067]
3DIoUMatch is a novel semi-supervised method for 3D object detection applicable to both indoor and outdoor scenes.
We leverage a teacher-student mutual learning framework to propagate information from the labeled to the unlabeled training set in the form of pseudo-labels.
Our method consistently improves state-of-the-art methods on both ScanNet and SUN-RGBD benchmarks by significant margins under all label ratios.
arXiv Detail & Related papers (2020-12-08T11:06:26Z) - Robust 6D Object Pose Estimation by Learning RGB-D Features [59.580366107770764]
We propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem.
We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction.
Experiments on two benchmarks: LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-02-29T06:24:55Z)
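The last entry's discrete-continuous rotation scheme (uniform anchors in SO(3), a predicted per-anchor deviation, and uncertainty scores for selection) can be sketched as follows. The anchor sampler below uses random unit quaternions as a stand-in for the paper's uniform anchor set, and the convention that a lower score means lower uncertainty is an assumption; all names here are hypothetical.

```python
import numpy as np

def sample_rotation_anchors(n, seed=0):
    """Stand-in anchor set: rotations from random unit quaternions,
    which are uniformly distributed on SO(3)."""
    rng = np.random.default_rng(seed)
    q = rng.normal(size=(n, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q.T
    # Standard quaternion-to-rotation-matrix conversion, batched.
    R = np.stack([
        np.stack([1 - 2*(y**2 + z**2), 2*(x*y - w*z), 2*(x*z + w*y)], -1),
        np.stack([2*(x*y + w*z), 1 - 2*(x**2 + z**2), 2*(y*z - w*x)], -1),
        np.stack([2*(x*z - w*y), 2*(y*z + w*x), 1 - 2*(x**2 + y**2)], -1),
    ], axis=1)
    return R  # shape (n, 3, 3)

def select_rotation(anchors, deviations, scores):
    """Pick the anchor with the lowest predicted uncertainty score
    and compose it with its predicted deviation rotation."""
    i = int(np.argmin(scores))
    return anchors[i] @ deviations[i]
```

In the actual network, `deviations` and `scores` would be regressed per anchor from the RGB-D features rather than supplied externally.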
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.