Category-Level 6D Object Pose Estimation with Flexible Vector-Based
Rotation Representation
- URL: http://arxiv.org/abs/2212.04632v1
- Date: Fri, 9 Dec 2022 02:13:43 GMT
- Title: Category-Level 6D Object Pose Estimation with Flexible Vector-Based
Rotation Representation
- Authors: Wei Chen, Xi Jia, Zhongqun Zhang, Hyung Jin Chang, Linlin Shen and
Ales Leonardis
- Abstract summary: We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images.
We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning.
Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation.
- Score: 51.67545893892129
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel 3D graph convolution based pipeline for
category-level 6D pose and size estimation from monocular RGB-D images. The
proposed method leverages an efficient 3D data augmentation and a novel
vector-based decoupled rotation representation. Specifically, we first design
an orientation-aware autoencoder with 3D graph convolution for latent feature
learning. The learned latent feature is insensitive to point shift and size
thanks to the shift and scale-invariance properties of the 3D graph
convolution. Then, to efficiently decode the rotation information from the
latent feature, we design a novel flexible vector-based decomposable rotation
representation that employs two decoders to complementarily access the rotation
information. The proposed rotation representation has two major advantages: 1)
its decoupled nature makes rotation estimation easier; 2) the flexible length
and rotation angle of the vectors allow us to find a more suitable vector
representation for a specific pose estimation task. Finally, we propose a 3D
deformation mechanism to increase the generalization ability of the pipeline.
Extensive experiments show that the proposed pipeline achieves state-of-the-art
performance on category-level tasks. Further, the experiments demonstrate that
the proposed rotation representation is more suitable for the pose estimation
tasks than other rotation representations.
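As an illustration of how a vector-based rotation representation can be turned back into a rotation matrix, here is a minimal sketch. It assumes two decoders each output one 3D vector that need not be unit length or mutually orthogonal, and it uses plain Gram-Schmidt orthogonalization, which is a standard technique substituted here for illustration and not necessarily the procedure used in the paper. The function name `rotation_from_two_vectors` and the example inputs are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        raise ValueError("cannot normalize a zero-length vector")
    return [c / n for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_two_vectors(v1, v2):
    """Build a 3x3 rotation matrix from two non-parallel vectors.

    Assumption: v1 and v2 stand in for the outputs of two separate
    decoders. Their lengths and the angle between them are arbitrary,
    so they are orthonormalized with Gram-Schmidt before use.
    """
    x = normalize(v1)
    # Remove the component of v2 that lies along x, then normalize.
    proj = dot(x, v2)
    y = normalize([b - proj * a for a, b in zip(x, v2)])
    # The third axis completes a right-handed frame.
    z = cross(x, y)
    # The axes x, y, z become the columns of the matrix (rows listed here).
    return [[x[i], y[i], z[i]] for i in range(3)]


# Hypothetical decoder outputs: neither unit length nor orthogonal.
R = rotation_from_two_vectors([0.0, 5.0, 0.0], [-0.5, 1.0, 0.0])
for row in R:
    print(row)
```

Because the two vectors are only used up to scale and are re-orthogonalized, their predicted lengths and relative angle remain free, which is the kind of flexibility the abstract refers to.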
Related papers
- RIDE: Boosting 3D Object Detection for LiDAR Point Clouds via Rotation-Invariant Analysis [15.42293045246587]
RIDE is a pioneering exploration of Rotation-Invariance for the 3D LiDAR-point-based object DEtector.
We design a bi-feature extractor that extracts (i) object-aware features, which are sensitive to rotation but preserve geometry well, and (ii) rotation-invariant features, which lose some geometric information but are robust to rotation.
Our RIDE is compatible and easy to plug into the existing one-stage and two-stage 3D detectors, and boosts both detection performance and rotation robustness.
arXiv Detail & Related papers (2024-08-28T08:53:33Z)
- VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations [55.25238503204253]
We propose a novel rotation estimation network, termed VI-Net, to make the task easier.
To process the spherical signals, a Spherical Feature Pyramid Network is constructed based on a novel design of SPAtial Spherical Convolution.
Experiments on the benchmark datasets confirm the efficacy of our method, which outperforms existing methods by a large margin in the high-precision regime.
arXiv Detail & Related papers (2023-08-19T05:47:53Z)
- E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs [61.552125054227595]
A new minimal solution is proposed to solve relative rotation estimation between two images without overlapping areas.
Based on E-Graph, the rotation estimation problem becomes simpler and more elegant.
We embed our rotation estimation strategy into a complete camera tracking and mapping system which obtains 6-DoF camera poses and a dense 3D mesh model.
arXiv Detail & Related papers (2022-07-20T16:11:48Z)
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
- SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration [57.28608414782315]
We introduce a new, yet conceptually simple, neural architecture, termed SpinNet, to extract local features.
Experiments on both indoor and outdoor datasets demonstrate that SpinNet outperforms existing state-of-the-art techniques.
arXiv Detail & Related papers (2020-11-24T15:00:56Z)
- Rotation-Invariant Local-to-Global Representation Learning for 3D Point Cloud [42.86112554931754]
We propose a local-to-global representation learning algorithm for 3D point cloud data.
Our model takes advantage of multi-level abstraction based on graph convolutional neural networks.
The proposed algorithm achieves state-of-the-art performance on rotation-augmented 3D object recognition and segmentation benchmarks.
arXiv Detail & Related papers (2020-10-07T10:30:20Z)
- A Smooth Representation of Belief over SO(3) for Deep Rotation Learning with Uncertainty [33.627068152037815]
We present a novel symmetric matrix representation of the 3D rotation group, SO(3), with two important properties that make it particularly suitable for learned models.
We empirically validate the benefits of our formulation by training deep neural rotation regressors on two data modalities.
This capability is key for safety-critical applications where detecting novel inputs can prevent catastrophic failure of learned models.
arXiv Detail & Related papers (2020-06-01T15:57:45Z)