Templates for 3D Object Pose Estimation Revisited: Generalization to New
Objects and Robustness to Occlusions
- URL: http://arxiv.org/abs/2203.17234v1
- Date: Thu, 31 Mar 2022 17:50:35 GMT
- Authors: Van Nguyen Nguyen, Yinlin Hu, Yang Xiao, Mathieu Salzmann, Vincent
Lepetit
- Abstract summary: We present a method that can recognize new objects and estimate their 3D pose in RGB images even under partial occlusions.
It relies on a small set of training objects to learn local object representations.
We are the first to show generalization without retraining on the LINEMOD and Occlusion-LINEMOD datasets.
- Score: 79.34847067293649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a method that can recognize new objects and estimate their 3D pose
in RGB images even under partial occlusions. Our method requires neither a
training phase on these objects nor real images depicting them, only their CAD
models. It relies on a small set of training objects to learn local object
representations, which allow us to locally match the input image to a set of
"templates", rendered images of the CAD models for the new objects. In contrast
with the state-of-the-art methods, the new objects on which our method is
applied can be very different from the training objects. As a result, we are
the first to show generalization without retraining on the LINEMOD and
Occlusion-LINEMOD datasets. Our analysis of the failure modes of previous
template-based approaches further confirms the benefits of local features for
template matching. We outperform the state-of-the-art template matching methods
on the LINEMOD, Occlusion-LINEMOD and T-LESS datasets. Our source code and data
are publicly available at https://github.com/nv-nguyen/template-pose
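The local template matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes that the query image and each rendered template have already been encoded into small grids of local feature vectors, and that each template comes with a binary object mask. The function names, the `threshold` value, and the toy data are all hypothetical. Similarity is computed per location, restricted to the template mask, and locations with low similarity contribute nothing, so an occluded region lowers the score without dominating it.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def local_similarity(query, template, mask, threshold=0.2):
    """Average per-location cosine similarity over the template's object mask.

    Locations whose similarity is not above `threshold` contribute zero.
    The threshold value is an assumption made for this sketch.
    """
    total, count = 0.0, 0
    for q_row, t_row, m_row in zip(query, template, mask):
        for q, t, m in zip(q_row, t_row, m_row):
            if not m:
                continue  # background pixel of the template: ignore
            count += 1
            sim = cosine(q, t)
            if sim > threshold:
                total += sim
    return total / count if count else 0.0


def best_template(query, templates):
    """Return the label of the template with the highest local similarity.

    `templates` is a list of (label, feature_map, mask) tuples.
    """
    return max(templates, key=lambda t: local_similarity(query, t[1], t[2]))[0]


# Toy 2x2 feature maps with 2-D features (purely illustrative data).
full_mask = [[1, 1], [1, 1]]
view_a = [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
view_b = [[[0, 1], [0, 1]], [[1, 0], [1, 0]]]

# The query matches view_a except at one "occluded" location (top right).
query = [[[1, 0], [-1, 0]], [[0, 1], [0, 1]]]

templates = [("view_a", view_a, full_mask), ("view_b", view_b, full_mask)]
score_a = local_similarity(query, view_a, full_mask)
winner = best_template(query, templates)
```

In this toy case the occluded location is simply discarded, so `view_a` is still selected from the three locations that remain visible. In a full pipeline, the label of the best template would carry the viewpoint used to render it.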
Related papers
- FoundPose: Unseen Object Pose Estimation with Foundation Features [11.32559845631345]
FoundPose is a model-based method for 6D pose estimation of unseen objects from a single RGB image.
The method can quickly onboard new objects using their 3D models without requiring any object- or task-specific training.
arXiv Detail & Related papers (2023-11-30T18:52:29Z)
- NOPE: Novel Object Pose Estimation from a Single Image [67.11073133072527]
We propose an approach that takes a single image of a new object as input and predicts the relative pose of this object in new images without prior knowledge of the object's 3D model.
We achieve this by training a model to directly predict discriminative embeddings for viewpoints surrounding the object.
This prediction is done using a simple U-Net architecture with attention and conditioned on the desired pose, which yields extremely fast inference.
arXiv Detail & Related papers (2023-03-23T18:55:43Z)
- OnePose++: Keypoint-Free One-Shot Object Pose Estimation without CAD Models [51.68715543630427]
OnePose relies on detecting repeatable image keypoints and is thus prone to failure on low-textured objects.
We propose a keypoint-free pose estimation pipeline to remove the need for repeatable keypoint detection.
A 2D-3D matching network directly establishes 2D-3D correspondences between the query image and the reconstructed point-cloud model.
arXiv Detail & Related papers (2023-01-18T17:47:13Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- Multi-Category Mesh Reconstruction From Image Collections [90.24365811344987]
We present an alternative approach that infers the textured mesh of objects by combining a series of deformable 3D models with a set of instance-specific deformation, pose, and texture.
Our method is trained with images of multiple object categories using only foreground masks and rough camera poses as supervision.
Experiments show that the proposed framework can distinguish between different object categories and learn category-specific shape priors in an unsupervised manner.
arXiv Detail & Related papers (2021-10-21T16:32:31Z)
- 3D Object Detection and Pose Estimation of Unseen Objects in Color Images with Local Surface Embeddings [35.769234123059086]
We present an approach for detecting and estimating the 3D poses of objects in images that requires only an untextured CAD model.
Our approach combines Deep Learning and 3D geometry: It relies on an embedding of local 3D geometry to match the CAD models to the input images.
We show that we can use Mask-RCNN in a class-agnostic way to detect the new objects without retraining and thus drastically limit the number of possible correspondences.
arXiv Detail & Related papers (2020-10-08T15:57:06Z)
- Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction [79.98689027127855]
We propose a new representation of the 3D shape of common object categories that can be learned from a collection of 2D images of independent objects.
Our method builds in a novel way on concepts from parametric deformation models, non-parametric 3D reconstruction, and canonical embeddings.
It achieves state-of-the-art results in dense 3D reconstruction on public in-the-wild datasets of faces, cars, and birds.
arXiv Detail & Related papers (2020-08-28T15:44:05Z)
- Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images [30.240630713652035]
Recent methods for 6D pose estimation of objects assume either textured 3D models or real images that cover the entire range of target poses.
This paper proposes a method, Neural Object Learning (NOL), that creates synthetic images of objects in arbitrary poses by combining only a few observations from cluttered images.
arXiv Detail & Related papers (2020-05-07T19:33:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.