CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation
- URL: http://arxiv.org/abs/2211.13398v3
- Date: Fri, 29 Mar 2024 04:13:49 GMT
- Title: CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation
- Authors: Yang You, Wenhao He, Jin Liu, Hongkai Xiong, Weiming Wang, Cewu Lu,
- Abstract summary: We introduce a novel method, CPPF++, designed for sim-to-real pose estimation.
To address the challenge posed by vote collision, we propose a novel approach that involves modeling the voting uncertainty.
We incorporate several innovative modules, including noisy pair filtering, online alignment optimization, and a feature ensemble.
- Score: 67.12857074801731
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object pose estimation constitutes a critical area within the domain of 3D vision. While contemporary state-of-the-art methods that leverage real-world pose annotations have demonstrated commendable performance, the procurement of such real training data incurs substantial costs. This paper focuses on a specific setting wherein only 3D CAD models are utilized as a priori knowledge, devoid of any background or clutter information. We introduce a novel method, CPPF++, designed for sim-to-real pose estimation. This method builds upon the foundational point-pair voting scheme of CPPF, reformulating it through a probabilistic view. To address the challenge posed by vote collision, we propose a novel approach that involves modeling the voting uncertainty by estimating the probabilistic distribution of each point pair within the canonical space. Furthermore, we augment the contextual information provided by each voting unit through the introduction of N-point tuples. To enhance the robustness and accuracy of the model, we incorporate several innovative modules, including noisy pair filtering, online alignment optimization, and a tuple feature ensemble. Alongside these methodological advancements, we introduce a new category-level pose estimation dataset, named DiversePose 300. Empirical evidence demonstrates that our method significantly surpasses previous sim-to-real approaches and achieves comparable or superior performance on novel datasets. Our code is available on https://github.com/qq456cvb/CPPF2.
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z) - Visibility-Aware Keypoint Localization for 6DoF Object Pose Estimation [56.07676459156789]
Localizing 3D keypoints in a 2D image is an effective way to establish 3D-2D correspondences for 6DoF object pose estimation.
In this paper, we address this issue by localizing the important keypoints in terms of visibility.
We construct VAPO (Visibility-Aware POse estimator) by integrating the visibility-aware importance with a state-of-the-art pose estimation algorithm.
arXiv Detail & Related papers (2024-03-21T16:59:45Z) - Source-Free and Image-Only Unsupervised Domain Adaptation for Category
Level Object Pose Estimation [18.011044932979143]
3DUDA is a method capable of adapting to a nuisance-ridden target domain without 3D or depth data.
We represent object categories as simple cuboid meshes, and harness a generative model of neural feature activations.
We show that our method simulates fine-tuning on a global pseudo-labeled dataset under mild assumptions.
arXiv Detail & Related papers (2024-01-19T17:48:05Z) - FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z) - Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z) - CPPF: Towards Robust Category-Level 9D Pose Estimation in the Wild [45.93626858034774]
Category-level PPF voting method to achieve accurate, robust and generalizable 9D pose estimation in the wild.
A novel coarse-to-fine voting algorithm is proposed to eliminate noisy point pair samples and generate final predictions from the population.
Our method is on par with current state of the arts with real-world training data.
arXiv Detail & Related papers (2022-03-07T01:36:22Z) - Sim2Real Object-Centric Keypoint Detection and Description [40.58367357980036]
Keypoint detection and description play a central role in computer vision.
We propose the object-centric formulation, which requires further identifying which object each interest point belongs to.
We develop a sim2real contrastive learning mechanism that can generalize the model trained in simulation to real-world applications.
arXiv Detail & Related papers (2022-02-01T15:00:20Z) - Robust Ego and Object 6-DoF Motion Estimation and Tracking [5.162070820801102]
This paper proposes a robust solution to achieve accurate estimation and consistent track-ability for dynamic multi-body visual odometry.
A compact and effective framework is proposed leveraging recent advances in semantic instance-level segmentation and accurate optical flow estimation.
A novel formulation, jointly optimizing SE(3) motion and optical flow is introduced that improves the quality of the tracked points and the motion estimation accuracy.
arXiv Detail & Related papers (2020-07-28T05:12:56Z) - Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation [97.93687743378106]
Existing 3D pose estimation models suffer performance drop when applying to new scenarios with unseen poses.
We propose a novel framework, Inference Stage Optimization (ISO), for improving the generalizability of 3D pose models.
Remarkably, it yields new state-of-the-art of 83.6% 3D PCK on MPI-INF-3DHP, improving upon the previous best result by 9.7%.
arXiv Detail & Related papers (2020-07-04T09:45:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.