MPF6D: Masked Pyramid Fusion 6D Pose Estimation
- URL: http://arxiv.org/abs/2111.09378v1
- Date: Wed, 17 Nov 2021 20:23:54 GMT
- Title: MPF6D: Masked Pyramid Fusion 6D Pose Estimation
- Authors: Nuno Pereira and Luís A. Alexandre
- Abstract summary: We present a new method to estimate the 6D pose of objects that improves upon the accuracy of current proposals.
Our method runs in real time, with a low inference time of 0.12 seconds, and achieves high accuracy.
- Score: 1.2891210250935146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object pose estimation has multiple important applications, such as robotic
grasping and augmented reality. We present a new method to estimate the 6D pose
of objects that improves upon the accuracy of current proposals and can still
be used in real time. Our method uses RGB-D data as input to segment objects
and estimate their pose. It uses a neural network with multiple heads: one head
estimates the object class and generates the segmentation mask, a second
estimates the translation vector, and the last estimates the quaternion that
represents the object's rotation. These heads leverage a pyramid architecture
used during feature extraction and feature fusion. Our method runs in real time,
with a low inference time of 0.12 seconds, and has high accuracy. This
combination of fast inference and good accuracy makes our method suitable for
robotic pick-and-place tasks and augmented reality applications.
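As a rough illustration of the multi-head design described in the abstract, the PyTorch sketch below shows a fused feature map feeding three heads: class/mask, translation, and a unit quaternion for rotation. The channel sizes, layer choices, and the stand-in fused feature map are assumptions made for illustration, not the authors' implementation; the pyramid feature extraction and fusion that would produce the fused map is omitted here.

```python
# Minimal sketch of a three-headed pose network as described in the abstract:
# one head for classification + mask, one for translation, one for a unit
# quaternion. Layer and channel choices are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseHeads(nn.Module):
    def __init__(self, fused_channels=256, num_classes=22):
        super().__init__()
        # Head 1: per-pixel class scores that double as a segmentation mask.
        self.mask_head = nn.Conv2d(fused_channels, num_classes, kernel_size=1)
        # Head 2: 3D translation vector.
        self.trans_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(fused_channels, 3))
        # Head 3: 4D quaternion, normalized to unit length in forward().
        self.rot_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(fused_channels, 4))

    def forward(self, fused):
        mask_logits = self.mask_head(fused)                       # (B, C, H, W)
        translation = self.trans_head(fused)                      # (B, 3)
        quaternion = F.normalize(self.rot_head(fused), dim=-1)    # (B, 4), unit norm
        return mask_logits, translation, quaternion

# Stand-in for the RGB-D pyramid-fused feature map (hypothetical shape).
fused_features = torch.randn(1, 256, 60, 80)
mask_logits, t_vec, quat = PoseHeads()(fused_features)
```

Normalizing the rotation head's output keeps the predicted quaternion on the unit sphere, which is the usual way to make a 4D regression output a valid rotation.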
Related papers
- ZS6D: Zero-shot 6D Object Pose Estimation using Vision Transformers [9.899633398596672]
We introduce ZS6D for zero-shot 6D pose estimation of novel objects.
Visual descriptors, extracted using pre-trained Vision Transformers (ViT), are used for matching rendered templates.
Experiments are performed on LMO, YCBV, and TLESS datasets.
arXiv Detail & Related papers (2023-09-21T11:53:01Z)
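To make the template-matching step mentioned in the ZS6D entry concrete, here is a minimal sketch in the same spirit: a query descriptor from a pre-trained ViT is compared against descriptors of rendered templates by cosine similarity and the closest template is selected. The descriptor dimensionality and the random stand-in tensors are assumptions; this is not ZS6D's actual pipeline.

```python
# Illustrative cosine-similarity matching between a query ViT descriptor and
# descriptors of pre-rendered templates (hypothetical shapes, not ZS6D code).
import torch
import torch.nn.functional as F

def best_template(query_desc: torch.Tensor, template_descs: torch.Tensor) -> int:
    """query_desc: (D,), template_descs: (N, D); returns index of best match."""
    sims = F.cosine_similarity(query_desc.unsqueeze(0), template_descs, dim=-1)
    return int(sims.argmax())

# Stand-in descriptors: 100 rendered templates, 768-D ViT embeddings.
templates = F.normalize(torch.randn(100, 768), dim=-1)
query = F.normalize(torch.randn(768), dim=-1)
print("closest template:", best_template(query, templates))
```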
- LocPoseNet: Robust Location Prior for Unseen Object Pose Estimation [69.70498875887611]
LocPoseNet robustly learns a location prior for unseen objects.
Our method outperforms existing works by a large margin on LINEMOD and GenMOP.
arXiv Detail & Related papers (2022-11-29T15:21:34Z)
- CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers [51.142988196855484]
This paper introduces a novel method we call Cascaded Refinement Transformers, or CRT-6D.
We replace the commonly used dense intermediate representation with a sparse set of features sampled from the feature pyramid, called Os (Object Keypoint Features), where each element corresponds to an object keypoint.
We achieve inference 2x faster than the closest real-time state-of-the-art methods while supporting up to 21 objects with a single model.
arXiv Detail & Related papers (2022-10-21T04:06:52Z)
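The sparse sampling idea behind CRT-6D, taking features from the pyramid only at projected object keypoints instead of keeping a dense intermediate map, can be sketched with `grid_sample`. The single pyramid level, the random keypoint coordinates, and the tensor shapes below are simplified assumptions, not the CRT-6D implementation.

```python
# Sketch of sampling per-keypoint features from one pyramid level with
# torch.nn.functional.grid_sample (simplified illustration only).
import torch
import torch.nn.functional as F

def sample_keypoint_features(feat_map: torch.Tensor, kpts_uv: torch.Tensor) -> torch.Tensor:
    """feat_map: (B, C, H, W); kpts_uv: (B, K, 2) in normalized [-1, 1] coords.
    Returns (B, K, C): one feature vector per projected keypoint."""
    grid = kpts_uv.unsqueeze(2)                                    # (B, K, 1, 2)
    sampled = F.grid_sample(feat_map, grid, align_corners=False)   # (B, C, K, 1)
    return sampled.squeeze(-1).permute(0, 2, 1)                    # (B, K, C)

feats = torch.randn(1, 256, 32, 32)          # one pyramid level (stand-in)
keypoints = torch.rand(1, 16, 2) * 2 - 1     # 16 keypoints in [-1, 1]
print(sample_keypoint_features(feats, keypoints).shape)  # torch.Size([1, 16, 256])
```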
- Unseen Object 6D Pose Estimation: A Benchmark and Baselines [62.8809734237213]
We propose a new task that enables and facilitates algorithms to estimate the 6D pose of novel objects during testing.
We collect a dataset with both real and synthetic images and up to 48 unseen objects in the test set.
By training an end-to-end 3D correspondences network, our method finds corresponding points between an unseen object and a partial view RGBD image accurately and efficiently.
arXiv Detail & Related papers (2022-06-23T16:29:53Z)
- Coupled Iterative Refinement for 6D Multi-Object Pose Estimation [64.7198752089041]
Given a set of known 3D objects and an RGB or RGB-D input image, we detect and estimate the 6D pose of each object.
Our approach iteratively refines both pose and correspondence in a tightly coupled manner, allowing us to dynamically remove outliers to improve accuracy.
arXiv Detail & Related papers (2022-04-26T18:00:08Z)
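A minimal sketch of such a coupled refinement loop, under the assumption that correspondences are filtered and the pose re-solved on each pass: OpenCV's RANSAC PnP is used as a stand-in solver, and the learned correspondence update from the paper is only indicated by a comment.

```python
# Hedged sketch of a coupled refine loop: solve the pose, drop RANSAC outliers,
# and repeat. A learned system would re-predict correspondences between passes.
import cv2

def refine_pose(obj_pts, img_pts, K, iters=4):
    """obj_pts: (N, 3) float32 model points; img_pts: (N, 2) float32 pixels; K: 3x3."""
    rvec = tvec = None
    for _ in range(iters):
        if rvec is None:
            ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
        else:
            ok, rvec, tvec, inliers = cv2.solvePnPRansac(
                obj_pts, img_pts, K, None, rvec, tvec, useExtrinsicGuess=True)
        if not ok or inliers is None:
            break
        keep = inliers[:, 0]
        obj_pts, img_pts = obj_pts[keep], img_pts[keep]   # dynamic outlier removal
        # A learned method would re-predict the correspondences here,
        # conditioned on the current pose estimate (omitted in this sketch).
    return rvec, tvec
```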
- 6D Object Pose Estimation using Keypoints and Part Affinity Fields [24.126513851779936]
6D object pose estimation from RGB images is an important requirement for autonomous service robots to interact with the real world.
We present a two-step pipeline for estimating the 6 DoF translation and orientation of known objects.
arXiv Detail & Related papers (2021-07-05T14:41:19Z)
- Scale Normalized Image Pyramids with AutoFocus for Object Detection [75.71320993452372]
A scale normalized image pyramid (SNIP) is generated that, like human vision, only attends to objects within a fixed size range at different scales.
We propose an efficient spatial sub-sampling scheme which only operates on fixed-size sub-regions likely to contain objects.
The resulting algorithm is referred to as AutoFocus and results in a 2.5-5 times speed-up during inference when used with SNIP.
arXiv Detail & Related papers (2021-02-10T18:57:53Z)
- Spatial Attention Improves Iterative 6D Object Pose Estimation [52.365075652976735]
We propose a new method for 6D pose estimation refinement from RGB images.
Our main insight is that after the initial pose estimate, it is important to pay attention to distinct spatial features of the object.
We experimentally show that this approach learns to attend to salient spatial features and learns to ignore occluded parts of the object, leading to better pose estimation across datasets.
arXiv Detail & Related papers (2021-01-05T17:18:52Z)
- EfficientPose: An efficient, accurate and scalable end-to-end 6D multi-object pose estimation approach [0.0]
We introduce EfficientPose, a new approach for 6D object pose estimation.
It is highly accurate, efficient and scalable over a wide range of computational resources.
It can detect the 2D bounding box of multiple objects and instances as well as estimate their full 6D poses in a single shot.
arXiv Detail & Related papers (2020-11-09T10:23:55Z)