FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose
Estimation with Decoupled Rotation Mechanism
- URL: http://arxiv.org/abs/2103.07054v1
- Date: Fri, 12 Mar 2021 03:07:24 GMT
- Title: FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose
Estimation with Decoupled Rotation Mechanism
- Authors: Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Linlin Shen, Ales
Leonardis
- Abstract summary: We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
- Score: 49.89268018642999
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on category-level 6D pose and size estimation from
a monocular RGB-D image. Previous methods suffer from inefficient category-level
pose feature extraction which leads to low accuracy and inference speed. To
tackle this problem, we propose a fast shape-based network (FS-Net) with
efficient category-level feature extraction for 6D pose estimation. First, we
design an orientation-aware autoencoder with 3D graph convolution for latent
feature extraction. The learned latent feature is insensitive to point shift
and object size thanks to the shift and scale-invariance properties of the 3D
graph convolution. Then, to efficiently decode category-level rotation
information from the latent feature, we propose a novel decoupled rotation
mechanism that employs two decoders to complementarily access the rotation
information. Meanwhile, we estimate translation and size by two residuals,
which are the difference between the mean of object points and ground truth
translation, and the difference between the mean size of the category and
ground truth size, respectively. Finally, to increase the generalization
ability of FS-Net, we propose an online box-cage based 3D deformation mechanism
to augment the training data. Extensive experiments on two benchmark datasets
show that the proposed method achieves state-of-the-art performance in both
category- and instance-level 6D object pose estimation. Especially in
category-level pose estimation, without extra synthetic data, our method
outperforms existing methods by 6.3% on the NOCS-REAL dataset.
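
The residual formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the argument names, and the sign convention (residual = ground truth minus reference value) are assumptions made here for illustration, since the abstract only states that the residuals are differences against the mean of the object points and the mean category size.

```python
from statistics import fmean


def recover_translation_and_size(points, t_residual, category_mean_size, s_residual):
    """Turn predicted residuals back into an absolute translation and size.

    Assumed convention (not stated in the abstract):
      t_residual = ground-truth translation - mean of the object points
      s_residual = ground-truth size - mean size of the category
    """
    # Mean of the observed object points, computed per axis.
    centroid = [fmean(p[axis] for p in points) for axis in range(3)]
    # Add each residual back onto its reference value.
    translation = [c + r for c, r in zip(centroid, t_residual)]
    size = [m + r for m, r in zip(category_mean_size, s_residual)]
    return translation, size


# Hypothetical inputs: three object points and made-up network outputs.
points = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (1.0, 3.0, 1.0)]
translation, size = recover_translation_and_size(
    points,
    t_residual=(0.5, -1.0, 0.25),
    category_mean_size=(0.5, 0.5, 0.25),
    s_residual=(0.25, -0.25, 0.0),
)
print(translation, size)
```

Predicting residuals rather than absolute values keeps the regression targets small and centred, because the point mean and the category mean size already give a coarse estimate.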
Related papers
- RGB-based Category-level Object Pose Estimation via Decoupled Metric
Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z)
- VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning
Decoupled Rotations on the Spherical Representations [55.25238503204253]
We propose a novel rotation estimation network, termed VI-Net, to make the task easier.
To process the spherical signals, a Spherical Feature Pyramid Network is constructed based on a novel design of SPAtial Spherical Convolution.
Experiments on the benchmarking datasets confirm the efficacy of our method, which outperforms the existing ones by a large margin in the regime of high precision.
arXiv Detail & Related papers (2023-08-19T05:47:53Z)
- Category-Level 6D Object Pose Estimation with Flexible Vector-Based
Rotation Representation [51.67545893892129]
We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images.
We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning.
Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation.
arXiv Detail & Related papers (2022-12-09T02:13:43Z)
- Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering
of Neural Features [17.920305227880245]
We consider the problem of category-level 6D pose estimation from a single RGB image.
Our approach represents an object category as a cuboid mesh and learns a generative model of the neural feature activations at each mesh.
Our experiments demonstrate an enhanced category-level 6D pose estimation performance compared to prior work.
arXiv Detail & Related papers (2022-09-12T21:31:36Z)
- CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects
from Point Clouds [97.63549045541296]
We propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances and per-part pose tracking for articulated objects.
Our method achieves new state-of-the-art performance on category-level rigid object pose (NOCS-REAL275) and articulated object pose benchmarks (SAPIEN, BMVC) at the fastest FPS 12.
arXiv Detail & Related papers (2021-04-08T00:14:58Z)
- 3D Point-to-Keypoint Voting Network for 6D Pose Estimation [8.801404171357916]
We propose a framework for 6D pose estimation from RGB-D data based on spatial structure characteristics of 3D keypoints.
The proposed method is verified on two benchmark datasets, LINEMOD and OCCLUSION LINEMOD.
arXiv Detail & Related papers (2020-12-22T11:43:15Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- L6DNet: Light 6 DoF Network for Robust and Precise Object Pose
Estimation with Small Datasets [0.0]
We propose a novel approach to perform 6 DoF object pose estimation from a single RGB-D image.
We adopt a hybrid pipeline in two stages: data-driven and geometric.
Our approach is more robust and accurate than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-03T17:41:29Z)
- One Point, One Object: Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation [0.7252027234425334]
We propose a method for simultaneous 3D object segmentation and 6-DOF pose estimation in pure 3D point clouds scenes.
The key component of our method is a multi-task CNN architecture that can simultaneously predict the 3D object segmentation and 6-DOF pose estimation in pure 3D point clouds.
For experimental evaluation, we generate expanded training data for two state-of-the-art 3D object datasets by using Augmented Reality (AR).
arXiv Detail & Related papers (2019-12-27T13:48:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.