MFOS: Model-Free & One-Shot Object Pose Estimation
- URL: http://arxiv.org/abs/2310.01897v1
- Date: Tue, 3 Oct 2023 09:12:07 GMT
- Title: MFOS: Model-Free & One-Shot Object Pose Estimation
- Authors: JongMin Lee, Yohann Cabon, Romain Brégier, Sungjoo Yoo, Jerome Revaud
- Abstract summary: We introduce a novel approach that can estimate in a single forward pass the pose of objects never seen during training, given minimum input.
We conduct extensive experiments and report state-of-the-art one-shot performance on the challenging LINEMOD benchmark.
- Score: 10.009454818723025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing learning-based methods for object pose estimation in RGB images are
mostly model-specific or category based. They lack the capability to generalize
to new object categories at test time, hence severely hindering their
practicability and scalability. Notably, recent attempts have been made to
solve this issue, but they still require accurate 3D data of the object surface
at both train and test time. In this paper, we introduce a novel approach that
can estimate in a single forward pass the pose of objects never seen during
training, given minimum input. In contrast to existing state-of-the-art
approaches, which rely on task-specific modules, our proposed model is entirely
based on a transformer architecture, which can benefit from recently proposed
3D-geometry general pretraining. We conduct extensive experiments and report
state-of-the-art one-shot performance on the challenging LINEMOD benchmark.
Finally, extensive ablations allow us to determine good practices with this
relatively new type of architecture in the field.
Related papers
- Comparative Evaluation of 3D Reconstruction Methods for Object Pose Estimation [22.830136701433613]
We propose a novel benchmark for measuring the impact of 3D reconstruction quality on pose estimation accuracy.
Detailed experiments with multiple state-of-the-art 3D reconstruction and object pose estimation approaches show that the geometry produced by modern reconstruction methods is often sufficient for accurate pose estimation.
arXiv Detail & Related papers (2024-08-15T15:58:11Z)
- FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects [55.77542145604758]
FoundationPose is a unified foundation model for 6D object pose estimation and tracking.
Our approach can be instantly applied at test-time to a novel object without fine-tuning.
arXiv Detail & Related papers (2023-12-13T18:28:09Z)
- GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence [5.500735640045456]
Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics.
We propose to utilize both geometric and semantic features obtained from a pre-trained foundation model.
This requires significantly less data to train than prior methods since the semantic features are robust to object texture and appearance.
arXiv Detail & Related papers (2023-11-23T02:35:38Z)
- ShapeShift: Superquadric-based Object Pose Estimation for Robotic Grasping [85.38689479346276]
Current techniques heavily rely on a reference 3D object, limiting their generalizability and making it expensive to expand to new object categories.
This paper proposes ShapeShift, a superquadric-based framework for object pose estimation that predicts the object's pose relative to a primitive shape which is fitted to the object.
arXiv Detail & Related papers (2023-04-10T20:55:41Z)
- NOPE: Novel Object Pose Estimation from a Single Image [67.11073133072527]
We propose an approach that takes a single image of a new object as input and predicts the relative pose of this object in new images without prior knowledge of the object's 3D model.
We achieve this by training a model to directly predict discriminative embeddings for viewpoints surrounding the object.
This prediction is done using a simple U-Net architecture with attention and conditioned on the desired pose, which yields extremely fast inference.
arXiv Detail & Related papers (2023-03-23T18:55:43Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- Stereo Neural Vernier Caliper [57.187088191829886]
We propose a new object-centric framework for learning-based stereo 3D object detection.
We tackle the problem of how to predict a refined update given an initial 3D cuboid guess.
Our approach achieves state-of-the-art performance on the KITTI benchmark.
arXiv Detail & Related papers (2022-03-21T14:36:07Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.