Zero-Shot Category-Level Object Pose Estimation
- URL: http://arxiv.org/abs/2204.03635v1
- Date: Thu, 7 Apr 2022 17:58:39 GMT
- Title: Zero-Shot Category-Level Object Pose Estimation
- Authors: Walter Goodwin, Sagar Vaze, Ioannis Havoutis, Ingmar Posner
- Abstract summary: We tackle the problem of estimating the pose of novel object categories in a zero-shot manner.
This extends much of the existing literature by removing the need for pose-labelled datasets or category-specific CAD models.
Our method provides a six-fold improvement in average rotation accuracy at 30 degrees.
- Score: 24.822189326540105
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Object pose estimation is an important component of most vision pipelines for
embodied agents, as well as in 3D vision more generally. In this paper we
tackle the problem of estimating the pose of novel object categories in a
zero-shot manner. This extends much of the existing literature by removing the
need for pose-labelled datasets or category-specific CAD models for training or
inference. Specifically, we make the following contributions. First, we
formalise the zero-shot, category-level pose estimation problem and frame it in
a way that is most applicable to real-world embodied agents. Secondly, we
propose a novel method based on semantic correspondences from a self-supervised
vision transformer to solve the pose estimation problem. We further re-purpose
the recent CO3D dataset to present a controlled and realistic test setting.
Finally, we demonstrate that all baselines for our proposed task perform
poorly, and show that our method provides a six-fold improvement in average
rotation accuracy at 30 degrees. Our code is available at
https://github.com/applied-ai-lab/zero-shot-pose.
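The "rotation accuracy at 30 degrees" figure quoted in the abstract can be illustrated with a small sketch. It assumes the metric is the fraction of predictions whose geodesic rotation error against the ground truth is below 30 degrees. The strict inequality and the helper names (`rotation_z`, `geodesic_error_deg`, `accuracy_at`) are assumptions made for illustration and are not taken from the authors' code.

```python
import math


def rotation_z(deg):
    """Rotation matrix (nested lists) for a rotation of `deg` degrees about z."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def geodesic_error_deg(r_pred, r_true):
    """Angle, in degrees, of the relative rotation r_pred^T r_true."""
    # trace(A^T B) equals the sum of the elementwise products of A and B.
    trace = sum(r_pred[i][j] * r_true[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def accuracy_at(preds, truths, threshold_deg=30.0):
    """Fraction of predictions with geodesic error below `threshold_deg`.

    Assumption: a strict '<' comparison; a given benchmark may use '<='.
    """
    if not preds or len(preds) != len(truths):
        raise ValueError("preds and truths must be non-empty and equal in length")
    errors = [geodesic_error_deg(p, t) for p, t in zip(preds, truths)]
    return sum(e < threshold_deg for e in errors) / len(errors)
```

For example, calling `accuracy_at` on a list of predicted rotation matrices and the matching list of ground-truth matrices returns a value between 0 and 1; a six-fold improvement would mean this value is six times that of a baseline evaluated on the same pairs.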
Related papers
- MFOS: Model-Free & One-Shot Object Pose Estimation [10.009454818723025]
We introduce a novel approach that can estimate in a single forward pass the pose of objects never seen during training, given minimum input.
We conduct extensive experiments and report state-of-the-art one-shot performance on the challenging LINEMOD benchmark.
arXiv Detail & Related papers (2023-10-03T09:12:07Z)
- ZeroPose: CAD-Prompted Zero-shot Object 6D Pose Estimation in Cluttered Scenes [19.993163470302097]
ZeroPose is a novel framework that performs pose estimation following a Discovery-Orientation-Registration (DOR) inference pipeline.
It generalizes to novel objects without requiring model retraining.
It achieves performance comparable to object-specific training methods and outperforms the state-of-the-art zero-shot method with a 50x inference speed improvement.
arXiv Detail & Related papers (2023-05-29T07:54:04Z)
- ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z)
- PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a pointcloud, we introduce IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z)
- NOPE: Novel Object Pose Estimation from a Single Image [67.11073133072527]
We propose an approach that takes a single image of a new object as input and predicts the relative pose of this object in new images without prior knowledge of the object's 3D model.
We achieve this by training a model to directly predict discriminative embeddings for viewpoints surrounding the object.
This prediction is done using a simple U-Net architecture with attention and conditioned on the desired pose, which yields extremely fast inference.
arXiv Detail & Related papers (2023-03-23T18:55:43Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- CPPF++: Uncertainty-Aware Sim2Real Object Pose Estimation by Vote Aggregation [67.12857074801731]
We introduce a novel method, CPPF++, designed for sim-to-real pose estimation.
To address the challenge posed by vote collision, we propose a novel approach that involves modeling the voting uncertainty.
We incorporate several innovative modules, including noisy pair filtering, online alignment optimization, and a feature ensemble.
arXiv Detail & Related papers (2022-11-24T03:27:00Z)
- Pose for Everything: Towards Category-Agnostic Pose Estimation [93.07415325374761]
Category-Agnostic Pose Estimation (CAPE) aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.
A transformer-based Keypoint Interaction Module (KIM) is proposed to capture both the interactions among different keypoints and the relationship between the support and query images.
We also introduce the Multi-category Pose (MP-100) dataset, a 2D pose dataset of 100 object categories containing over 20K instances, designed for developing CAPE algorithms.
arXiv Detail & Related papers (2022-07-21T09:40:54Z)
- Object Pose Estimation using Mid-level Visual Representations [5.220940151628735]
This work proposes a novel pose estimation model for object categories that can be effectively transferred to previously unseen environments.
Deep convolutional network models (CNNs) for pose estimation are typically trained and evaluated on datasets curated for object detection, pose estimation, or 3D reconstruction.
We show that the approach is favorable when it comes to generalization and transfer to novel environments.
arXiv Detail & Related papers (2022-03-02T22:49:17Z)
- Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation [30.04752448942084]
Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models.
We propose for the first time a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds.
arXiv Detail & Related papers (2021-10-30T06:46:44Z)
- ZePHyR: Zero-shot Pose Hypothesis Rating [36.52070583343388]
We introduce a novel method for zero-shot object pose estimation in clutter.
Our approach uses a hypothesis generation and scoring framework, with a focus on learning a scoring function that generalizes to objects not used for training.
We demonstrate how our system can be used by quickly scanning and building a model of a novel object, which can immediately be used by our method for pose estimation.
arXiv Detail & Related papers (2021-04-28T01:48:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.