OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose
Estimation
- URL: http://arxiv.org/abs/2203.01072v1
- Date: Wed, 2 Mar 2022 12:51:33 GMT
- Title: OVE6D: Object Viewpoint Encoding for Depth-based 6D Object Pose
Estimation
- Authors: Dingding Cai, Janne Heikkil\"a, Esa Rahtu
- Abstract summary: We propose a universal framework, called OVE6D, for model-based 6D object pose estimation from a single depth image and a target object mask.
Our model is trained using purely synthetic data rendered from ShapeNet, and, unlike most of the existing methods, it generalizes well on new real-world objects without any fine-tuning.
We show that OVE6D outperforms some contemporary deep learning-based pose estimation methods specifically trained for individual objects or datasets with real-world training data.
- Score: 12.773040823634908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a universal framework, called OVE6D, for model-based 6D
object pose estimation from a single depth image and a target object mask. Our
model is trained using purely synthetic data rendered from ShapeNet, and,
unlike most of the existing methods, it generalizes well on new real-world
objects without any fine-tuning. We achieve this by decomposing the 6D pose
into viewpoint, in-plane rotation around the camera optical axis and
translation, and introducing novel lightweight modules for estimating each
component in a cascaded manner. The resulting network contains less than 4M
parameters while demonstrating excellent performance on the challenging T-LESS
and Occluded LINEMOD datasets without any dataset-specific training. We show
that OVE6D outperforms some contemporary deep learning-based pose estimation
methods specifically trained for individual objects or datasets with real-world
training data.
The implementation and the pre-trained model will be made publicly available.
Related papers
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z) - MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
We present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z) - Imitrob: Imitation Learning Dataset for Training and Evaluating 6D
Object Pose Estimators [20.611000416051546]
This paper introduces a dataset for training and evaluating methods for 6D pose estimation of hand-held tools in task demonstrations captured by a standard RGB camera.
The dataset contains image sequences of nine different tools and twelve manipulation tasks with two camera viewpoints, four human subjects, and left/right hand.
arXiv Detail & Related papers (2022-09-16T14:43:46Z) - Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised
Learning Approach and A New Dataset [12.40437099476348]
We generalize category-level 6D object pose estimation in the wild with semi-supervised learning.
We propose a new model, called Rendering for Pose estimation network RePoNet, that is jointly trained using the free ground-truths with the synthetic data.
Our method outperforms state-of-the-art methods on the previous dataset and our Wild6D test set (with manual annotations for evaluation) by a large margin.
arXiv Detail & Related papers (2022-06-30T17:30:57Z) - Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose estimation of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z) - Category-Agnostic 6D Pose Estimation with Conditional Neural Processes [19.387280883044482]
We present a novel meta-learning approach for 6D pose estimation on unknown objects.
Our algorithm learns object representation in a category-agnostic way, which endows it with strong generalization capabilities across object categories.
arXiv Detail & Related papers (2022-06-14T20:46:09Z) - DONet: Learning Category-Level 6D Object Pose and Size Estimation from
Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z) - Shape Prior Deformation for Categorical 6D Object Pose and Size
Estimation [62.618227434286]
We present a novel learning approach to recover the 6D poses and sizes of unseen object instances from an RGB-D image.
We propose a deep network to reconstruct the 3D object model by explicitly modeling the deformation from a pre-learned categorical shape prior.
arXiv Detail & Related papers (2020-07-16T16:45:05Z) - Single Shot 6D Object Pose Estimation [11.37625512264302]
We introduce a novel single shot approach for 6D object pose estimation of rigid objects based on depth images.
A fully convolutional neural network is employed, where the 3D input data is spatially discretized and pose estimation is considered as a regression task.
With 65 fps on a GPU, our Object Pose Network (OP-Net) is extremely fast, is optimized end-to-end, and estimates the 6D pose of multiple objects in the image simultaneously.
arXiv Detail & Related papers (2020-04-27T11:59:11Z) - CPS++: Improving Class-level 6D Pose and Shape Estimation From Monocular
Images With Self-Supervised Learning [74.53664270194643]
Modern monocular 6D pose estimation methods can only cope with a handful of object instances.
We propose a novel method for class-level monocular 6D pose estimation, coupled with metric shape retrieval.
We experimentally demonstrate that we can retrieve precise 6D poses and metric shapes from a single RGB image.
arXiv Detail & Related papers (2020-03-12T15:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.