Related papers: SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

URL: http://arxiv.org/abs/2311.11125v3
Date: Fri, 22 Mar 2024 00:36:02 GMT
Title: SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation
Authors: Yamei Chen, Yan Di, Guangyao Zhai, Fabian Manhardt, Chenyangguang Zhang, Ruida Zhang, Federico Tombari, Nassir Navab, Benjamin Busam,
Abstract summary: Category-level object pose estimation, aiming to predict the 6D pose and 3D size of objects from known categories, typically struggles with large intra-class shape variation. We present SecondPose, a novel approach integrating object-specific geometric features with semantic category priors from DINOv2.
Score: 79.12683101131368
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Category-level object pose estimation, aiming to predict the 6D pose and 3D size of objects from known categories, typically struggles with large intra-class shape variation. Existing works utilizing mean shapes often fall short of capturing this variation. To address this issue, we present SecondPose, a novel approach integrating object-specific geometric features with semantic category priors from DINOv2. Leveraging the advantage of DINOv2 in providing SE(3)-consistent semantic features, we hierarchically extract two types of SE(3)-invariant geometric features to further encapsulate local-to-global object-specific information. These geometric features are then point-aligned with DINOv2 features to establish a consistent object representation under SE(3) transformations, facilitating the mapping from camera space to the pre-defined canonical space, thus further enhancing pose estimation. Extensive experiments on NOCS-REAL275 demonstrate that SecondPose achieves a 12.4% leap forward over the state-of-the-art. Moreover, on a more complex dataset HouseCat6D which provides photometrically challenging objects, SecondPose still surpasses other competitors by a large margin.

Related papers

Instance-Adaptive Keypoint Learning with Local-to-Global Geometric Aggregation for Category-Level Object Pose Estimation [19.117822086210513]
INKL-Pose is a novel category-level object pose estimation framework. It enables INstance-adaptive Keypoint Learning with local-to-global geometric aggregation. Experiments on CAMERA25, REAL275, and HouseCat6D demonstrate that INKL-Pose achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-04-21T14:37:37Z)
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation [42.48001557547222]
Category-level object pose estimation aims to determine the pose and size of novel objects in specific categories. Existing correspondence-based approaches typically adopt point-based representations to establish the correspondences between primitive observed points and normalized object coordinates. We introduce a novel architecture called SpherePose, which yields precise correspondence prediction through three core designs.
arXiv Detail & Related papers (2025-03-18T05:43:42Z)
Universal Features Guided Zero-Shot Category-Level Object Pose Estimation [52.29006019352873]
We propose a zero-shot method to achieve category-level 6-DOF object pose estimation. Our method exploits both 2D and 3D universal features of input RGB-D image to establish semantic similarity-based correspondences. Our method outperforms previous methods on the REAL275 and Wild6D benchmarks for unseen categories.
arXiv Detail & Related papers (2025-01-06T08:10:13Z)
RelPose++: Recovering 6D Poses from Sparse-view Observations [66.6922660401558]
We address the task of estimating 6D camera poses from sparse-view image sets (2-8 images) We build on the recent RelPose framework which learns a network that infers distributions over relative rotations over image pairs. Our final system results in large improvements in 6D pose prediction over prior art on both seen and unseen object categories.
arXiv Detail & Related papers (2023-05-08T17:59:58Z)
Generative Category-Level Shape and Pose Estimation with Semantic Primitives [27.692997522812615]
We propose a novel framework for category-level object shape and pose estimation from a single RGB-D image. To handle the intra-category variation, we adopt a semantic primitive representation that encodes diverse shapes into a unified latent space. We show that the proposed method achieves SOTA pose estimation performance and better generalization in the real-world dataset.
arXiv Detail & Related papers (2022-10-03T17:51:54Z)
Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object. Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting [103.74918834553249]
GPV-Pose is a novel framework for robust category-level pose estimation. It harnesses geometric insights to enhance the learning of category-level pose-sensitive features. It produces superior results to state-of-the-art competitors on common public benchmarks.
arXiv Detail & Related papers (2022-03-15T13:58:50Z)
CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds [97.63549045541296]
We propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances and per-part pose tracking for articulated objects. Our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVC) at the fastest FPS 12.
arXiv Detail & Related papers (2021-04-08T00:14:58Z)
FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation. The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
DualPoseNet: Category-level 6D Object Pose and Size Estimation using Dual Pose Network with Refined Learning of Pose Consistency [30.214100288708163]
Category-level 6D object pose and size estimation is to predict 9 degrees-of-freedom (9DoF) pose configurations of rotation, translation, and size for object instances. We propose a new method of Dual Pose Network with refined learning of pose consistency for this task, shortened as DualPoseNet.
arXiv Detail & Related papers (2021-03-11T08:33:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.