Novel Object 6D Pose Estimation with a Single Reference View
- URL: http://arxiv.org/abs/2503.05578v1
- Date: Fri, 07 Mar 2025 17:00:41 GMT
- Title: Novel Object 6D Pose Estimation with a Single Reference View
- Authors: Jian Liu, Wei Sun, Kai Zeng, Jin Zheng, Hui Yang, Lin Wang, Hossein Rahmani, Ajmal Mian
- Abstract summary: We propose a Single-Reference-based novel object 6D (SinRef-6D) pose estimation method. Our key idea is to iteratively establish point-wise alignment in the camera coordinate system based on state space models (SSMs). Once pre-trained on synthetic data, SinRef-6D can estimate the 6D pose of a novel object using only a single reference view.
- Score: 39.226579637659235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing novel object 6D pose estimation methods typically rely on CAD models or dense reference views, which are both difficult to acquire. Using only a single reference view is more scalable, but challenging due to large pose discrepancies and limited geometric and spatial information. To address these issues, we propose a Single-Reference-based novel object 6D (SinRef-6D) pose estimation method. Our key idea is to iteratively establish point-wise alignment in the camera coordinate system based on state space models (SSMs). Specifically, iterative camera-space point-wise alignment can effectively handle large pose discrepancies, while our proposed RGB and Points SSMs can capture long-range dependencies and spatial information from a single view, offering linear complexity and superior spatial modeling capability. Once pre-trained on synthetic data, SinRef-6D can estimate the 6D pose of a novel object using only a single reference view, without requiring retraining or a CAD model. Extensive experiments on six popular datasets and real-world robotic scenes demonstrate that we achieve on-par performance with CAD-based and dense reference view-based methods, despite operating in the more challenging single reference setting. Code will be released at https://github.com/CNJianLiu/SinRef-6D.
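The abstract's core idea, iteratively establishing point-wise alignment in the camera coordinate system, is not spelled out here. As a rough illustration only, the sketch below shows what iterative point-wise rigid alignment looks like with the classic Kabsch least-squares solver. This is a generic stand-in, not the paper's method: it assumes point correspondences are already given index-wise, whereas SinRef-6D would derive them from its learned RGB and Points SSM features; all function names are illustrative.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst (Kabsch algorithm)."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])          # guard against reflections
    R = Vt.T @ D @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

def iterative_alignment(ref_pts, obs_pts, n_iters=5):
    """Iteratively refine a pose by re-solving point-wise alignment.

    ref_pts: (N, 3) points from the single reference view, in camera space.
    obs_pts: (N, 3) corresponding points observed in the query view.
    Returns the accumulated rotation R_total and translation t_total.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = ref_pts.copy()
    for _ in range(n_iters):
        R, t = kabsch(cur, obs_pts)
        cur = cur @ R.T + t             # move the points by the new estimate
        R_total = R @ R_total           # compose with the running pose
        t_total = R @ t_total + t
    return R_total, t_total
```

With exact correspondences the first iteration already recovers the pose; the iterative loop matters in the realistic case where correspondences are re-estimated between steps, which is where the paper's ability to handle large pose discrepancies would come from.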
Related papers
- Any6D: Model-free 6D Pose Estimation of Novel Objects [76.30057578269668]
We introduce Any6D, a model-free framework for 6D object pose estimation.
It requires only a single RGB-D anchor image to estimate both the 6D pose and size of unknown objects in novel scenes.
We evaluate our method on five challenging datasets.
arXiv Detail & Related papers (2025-03-24T13:46:21Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- SA6D: Self-Adaptive Few-Shot 6D Pose Estimator for Novel and Occluded Objects [24.360831082478313]
We propose a few-shot pose estimation (FSPE) approach called SA6D.
It uses a self-adaptive segmentation module to identify the novel target object and construct a point cloud model of the target object.
We evaluate SA6D on real-world tabletop object datasets and demonstrate that SA6D outperforms existing FSPE methods.
arXiv Detail & Related papers (2023-08-31T08:19:26Z)
- Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot, Generalizable Approach using RGB Images [60.0898989456276]
We present a new framework named Cas6D for few-shot 6DoF pose estimation that is generalizable and uses only RGB images.
To address the false positives of target object detection in the extreme few-shot setting, our framework utilizes a self-supervised pre-trained ViT to learn robust feature representations.
Experimental results on the LINEMOD and GenMOP datasets demonstrate that Cas6D outperforms state-of-the-art methods by 9.2% and 3.8% accuracy (Proj-5) under the 32-shot setting.
arXiv Detail & Related papers (2023-06-13T07:45:42Z)
- Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild [47.80637472803838]
We introduce a self-supervised learning approach trained directly on large-scale real-world object videos for category-level 6D pose estimation in the wild.
Our framework reconstructs the canonical 3D shape of an object category and learns dense correspondences between input images and the canonical shape via surface embedding.
Surprisingly, our method, without any human annotations or simulators, can achieve on-par or even better performance than previous supervised or semi-supervised methods on in-the-wild images.
arXiv Detail & Related papers (2022-10-13T17:19:22Z)
- FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open-set problem, few-shot 6D object pose estimation: estimating the 6D pose of an unknown object from a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z)
- CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation [19.284468553414918]
This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation.
Existing approaches mainly follow a complex multi-stage pipeline which first localizes and detects each object instance in the image and then regresses to either their 3D meshes or 6D poses.
We present a simple one-stage approach to predict both the 3D shape and estimate the 6D pose and size jointly in a bounding-box free manner.
arXiv Detail & Related papers (2022-03-03T18:59:04Z)
- CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.