Related papers: Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation

URL: http://arxiv.org/abs/2311.09500v3
Date: Wed, 17 Jul 2024 15:10:09 GMT
Title: Pseudo-keypoint RKHS Learning for Self-supervised 6DoF Pose Estimation
Authors: Yangzheng Wu, Michael Greenspan
Abstract summary: We address the simulation-to-real domain gap in six degree-of-freedom pose estimation (6DoF PE). We propose a novel self-supervised keypoint voting-based 6DoF PE framework, effectively narrowing this gap using a learnable kernel in RKHS. We propose an adapter network, which is pre-trained on purely synthetic data with synthetic ground truth poses, and which evolves the network parameters from this source synthetic domain to the target real domain.
Score: 0.9208007322096533
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We address the simulation-to-real domain gap in six degree-of-freedom pose estimation (6DoF PE), and propose a novel self-supervised keypoint voting-based 6DoF PE framework, effectively narrowing this gap using a learnable kernel in RKHS. We formulate this domain gap as a distance in high-dimensional feature space, distinct from previous iterative matching methods. We propose an adapter network, which is pre-trained on purely synthetic data with synthetic ground truth poses, and which evolves the network parameters from this source synthetic domain to the target real domain. Importantly, the real data training only uses pseudo-poses estimated by pseudo-keypoints, and thereby requires no real ground truth data annotations. Our proposed method is called RKHSPose, and achieves state-of-the-art performance among self-supervised methods on three commonly used 6DoF PE datasets including LINEMOD (+4.2%), Occlusion LINEMOD (+2%), and YCB-Video (+3%). It also compares favorably to fully supervised methods on all six applicable BOP core datasets, achieving within -11.3% to +0.2% of the top fully supervised results.

Related papers

DPGLA: Bridging the Gap between Synthetic and Real Data for Unsupervised Domain Adaptation in 3D LiDAR Semantic Segmentation [3.75886080255807]
Self-training-based Unsupervised Domain Adaptation (UDA) has been widely used to improve point cloud semantic segmentation. We propose a Dynamic Pseudo-Label Filtering scheme to enhance real data utilization in point cloud UDA semantic segmentation.
arXiv Detail & Related papers (2025-10-27T17:05:59Z)
Data-Efficient Point Cloud Semantic Segmentation Pipeline for Unimproved Roads [0.0]
We present a data-efficient point cloud segmentation pipeline and training framework for robust segmentation of unimproved roads. Our method employs a two-stage training framework: first, a projection-based convolutional neural network is pre-trained on a mixture of public urban datasets and a small, curated in-domain dataset. Using only 50 labeled point clouds from our target domain, we show that our proposed training approach improves mean Intersection-over-Union from 33.5% to 51.8% and the overall accuracy from 85.5% to 90.8%.
arXiv Detail & Related papers (2025-08-26T20:00:36Z)
Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds. Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures. We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z)
Effective Data Pruning through Score Extrapolation [40.61665742457229]
We introduce a novel importance score extrapolation framework that requires training on only a small subset of data. We present two initial approaches in this framework to accurately predict sample importance for the entire dataset using patterns learned from this minimal subset. Our results indicate that score extrapolation is a promising direction to scale expensive score calculation methods, such as pruning, data attribution, or other tasks.
arXiv Detail & Related papers (2025-06-10T17:38:49Z)
PointSFDA: Source-free Domain Adaptation for Point Cloud Completion [27.48403130855686]
We propose an effective yet simple source-free domain adaptation framework for point cloud completion. PointSFDA uses only a pretrained source model and unlabeled target data for adaptation. Our method significantly improves the performance of state-of-the-art networks in cross-domain shape completion.
arXiv Detail & Related papers (2025-03-19T12:09:45Z)
Correspondence-Free SE(3) Point Cloud Registration in RKHS via Unsupervised Equivariant Learning [13.098807543505028]
This paper introduces a robust unsupervised SE(3) point cloud registration method that operates without requiring point correspondences. A novel RKHS distance metric is proposed, offering reliable performance amidst noise, outliers, and asymmetrical data. The proposed method outperforms classical and supervised methods in terms of registration accuracy on both synthetic (ModelNet40) and real-world (ETH3D) noisy, outlier-rich datasets.
arXiv Detail & Related papers (2024-07-29T17:57:38Z)
Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection [50.448520056844885]
We propose a novel framework for syn-to-real unsupervised domain adaptation in indoor 3D object detection. Our adaptation results from synthetic dataset 3D-FRONT to real-world datasets ScanNetV2 and SUN RGB-D demonstrate remarkable mAP25 improvements of 9.7% and 9.1% over Source-Only baselines.
arXiv Detail & Related papers (2024-06-17T08:18:41Z)
E$^3$-Net: Efficient E(3)-Equivariant Normal Estimation Network [47.77270862087191]
We propose E3-Net to achieve equivariance for normal estimation. We introduce an efficient random frame method, which significantly reduces the training resources required for this task to just 1/8 of previous work. Our method achieves superior results on both synthetic and real-world datasets, and outperforms current state-of-the-art techniques by a substantial margin.
arXiv Detail & Related papers (2024-06-01T07:53:36Z)
IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images [50.4538089115248]
Generalizable 3D object reconstruction from single-view RGB-D images remains a challenging task. We propose a novel approach, IPoD, which harmonizes implicit field learning with point diffusion. Experiments conducted on the CO3D-v2 dataset affirm the superiority of IPoD, achieving 7.8% improvement in F-score and 28.6% in Chamfer distance over existing methods.
arXiv Detail & Related papers (2024-03-30T07:17:37Z)
AdaTriplet-RA: Domain Matching via Adaptive Triplet and Reinforced Attention for Unsupervised Domain Adaptation [15.905869933337101]
Unsupervised domain adaptation (UDA) is a transfer learning task where the data and annotations of the source domain are available, but only unlabeled target data is accessible during training. We propose to improve the unsupervised domain adaptation task with an inter-domain sample matching scheme. We apply the widely-used and robust Triplet loss to match the inter-domain samples. To reduce the catastrophic effect of the inaccurate pseudo-labels generated during training, we propose a novel uncertainty measurement method to select reliable pseudo-labels automatically and progressively refine them.
arXiv Detail & Related papers (2022-11-16T13:04:24Z)
What Stops Learning-based 3D Registration from Working in the Real World? [53.68326201131434]
This work identifies the sources of 3D point cloud registration failures, analyzes the reasons behind them, and proposes solutions. Ultimately, this translates to a best-practice 3D registration network (BPNet), constituting the first learning-based method able to handle previously-unseen objects in real-world data. Our model generalizes to real data without any fine-tuning, reaching an accuracy of up to 67% on point clouds of unseen objects obtained with a commercial sensor.
arXiv Detail & Related papers (2021-11-19T19:24:27Z)
Inception Convolution with Efficient Dilation Search [121.41030859447487]
Dilated convolution is a critical variant of the standard convolution used to control effective receptive fields and handle large scale variance of objects. We propose a new variant of dilated convolution, namely inception (dilated) convolution, where the convolutions have independent dilation among different axes, channels and layers. To fit the complex inception convolution to the data in a practical way, we develop a simple yet effective dilation search algorithm (EDO) based on statistical optimization.
arXiv Detail & Related papers (2020-12-25T14:58:35Z)
3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [76.42897462051067]
3DIoUMatch is a novel semi-supervised method for 3D object detection applicable to both indoor and outdoor scenes. We leverage a teacher-student mutual learning framework to propagate information from the labeled to the unlabeled train set in the form of pseudo-labels. Our method consistently improves state-of-the-art methods on both ScanNet and SUN-RGBD benchmarks by significant margins under all label ratios.
arXiv Detail & Related papers (2020-12-08T11:06:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.