Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised
Learning Approach and A New Dataset
- URL: http://arxiv.org/abs/2206.15436v1
- Date: Thu, 30 Jun 2022 17:30:57 GMT
- Title: Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised
Learning Approach and A New Dataset
- Authors: Yang Fu and Xiaolong Wang
- Abstract summary: We generalize category-level 6D object pose estimation in the wild with semi-supervised learning.
We propose a new model, called Rendering for Pose estimation network RePoNet, that is jointly trained using the free ground-truths with the synthetic data.
Our method outperforms state-of-the-art methods on the previous dataset and our Wild6D test set (with manual annotations for evaluation) by a large margin.
- Score: 12.40437099476348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 6D object pose estimation is one of the fundamental problems in computer
vision and robotics research. While a lot of recent efforts have been made on
generalizing pose estimation to novel object instances within the same
category, namely category-level 6D pose estimation, it is still restricted in
constrained environments given the limited number of annotated data. In this
paper, we collect Wild6D, a new unlabeled RGBD object video dataset with
diverse instances and backgrounds. We utilize this data to generalize
category-level 6D object pose estimation in the wild with semi-supervised
learning. We propose a new model, called Rendering for Pose estimation network
RePoNet, that is jointly trained using the free ground-truths with the
synthetic data, and a silhouette matching objective function on the real-world
data. Without using any 3D annotations on real data, our method outperforms
state-of-the-art methods on the previous dataset and our Wild6D test set (with
manual annotations for evaluation) by a large margin. Project page with Wild6D
data: https://oasisyang.github.io/semi-pose .
Related papers
- Learning to Estimate 6DoF Pose from Limited Data: A Few-Shot,
Generalizable Approach using RGB Images [60.0898989456276]
We present a new framework named Cas6D for few-shot 6DoF pose estimation that is generalizable and uses only RGB images.
To address the false positives of target object detection in the extreme few-shot setting, our framework utilizes a self-supervised pre-trained ViT to learn robust feature representations.
Experimental results on the LINEMOD and GenMOP datasets demonstrate that Cas6D outperforms state-of-the-art methods by 9.2% and 3.8% accuracy (Proj-5) under the 32-shot setting.
arXiv Detail & Related papers (2023-06-13T07:45:42Z) - MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
We present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z) - Self-Supervised Geometric Correspondence for Category-Level 6D Object
Pose Estimation in the Wild [47.80637472803838]
We introduce a self-supervised learning approach trained directly on large-scale real-world object videos for category-level 6D pose estimation in the wild.
Our framework reconstructs the canonical 3D shape of an object category and learns dense correspondences between input images and the canonical shape via surface embedding.
Surprisingly, our method, without any human annotations or simulators, can achieve on-par or even better performance than previous supervised or semi-supervised methods on in-the-wild images.
arXiv Detail & Related papers (2022-10-13T17:19:22Z) - Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose estimation of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z) - Category-Agnostic 6D Pose Estimation with Conditional Neural Processes [19.387280883044482]
We present a novel meta-learning approach for 6D pose estimation on unknown objects.
Our algorithm learns object representation in a category-agnostic way, which endows it with strong generalization capabilities across object categories.
arXiv Detail & Related papers (2022-06-14T20:46:09Z) - FS6D: Few-Shot 6D Pose Estimation of Novel Objects [116.34922994123973]
6D object pose estimation networks are limited in their capability to scale to large numbers of object instances.
In this work, we study a new open set problem; the few-shot 6D object poses estimation: estimating the 6D pose of an unknown object by a few support views without extra training.
arXiv Detail & Related papers (2022-03-28T10:31:29Z) - NeRF-Pose: A First-Reconstruct-Then-Regress Approach for
Weakly-supervised 6D Object Pose Estimation [44.42449011619408]
We present a weakly-supervised reconstruction-based pipeline, named NeRF-Pose, which needs only 2D object segmentation and known relative camera poses during training.
A NeRF-enabled RAN+SAC algorithm is used to estimate stable and accurate pose from the predicted correspondences.
Experiments on LineMod-Occlusion show that the proposed method has state-of-the-art accuracy in comparison to the best 6D pose estimation methods.
arXiv Detail & Related papers (2022-03-09T15:28:02Z) - OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose
Estimation [12.773040823634908]
We propose a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask.
Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most of the existing methods, it generalizes well on new real-world objects without any fine-tuning.
We show that OVE6D outperforms some contemporary deep learning-based pose estimation methods specifically trained for individual objects or datasets with real-world training data.
arXiv Detail & Related papers (2022-03-02T12:51:33Z) - Single Shot 6D Object Pose Estimation [11.37625512264302]
We introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images.
A fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task.
With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously.
arXiv Detail & Related papers (2020-04-27T11:59:11Z) - CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular
Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.