G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with
Embedding Vector Features
- URL: http://arxiv.org/abs/2003.11089v2
- Date: Thu, 26 Mar 2020 08:36:23 GMT
- Title: G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with
Embedding Vector Features
- Authors: Wei Chen, Xi Jia, Hyung Jin Chang, Jinming Duan, Ales Leonardis
- Abstract summary: We propose a novel real-time 6D object pose estimation framework, named G2L-Net.
Our network operates in a divide-and-conquer fashion on point clouds obtained from RGB-D detection.
G2L-Net achieves state-of-the-art performance in terms of both accuracy and speed.
- Score: 39.77987181390717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel real-time 6D object pose estimation
framework, named G2L-Net. Our network operates on point clouds from RGB-D
detection in a divide-and-conquer fashion. Specifically, our network consists
of three steps. First, we extract the coarse object point cloud from the RGB-D
image by 2D detection. Second, we feed the coarse object point cloud to a
translation localization network to perform 3D segmentation and object
translation prediction. Third, via the predicted segmentation and translation,
we transform the fine object point cloud into a local canonical coordinate
system, in which we train a rotation localization network to estimate the
initial object rotation. In this third step, we define point-wise embedding
vector features to capture viewpoint-aware information. To obtain a more
accurate rotation, we adopt a rotation residual estimator that predicts the
residual between the initial rotation and the ground truth, boosting the
performance of the initial pose estimate. Despite stacking multiple steps in
the proposed coarse-to-fine framework, G2L-Net runs in real time. Extensive
experiments on two benchmark datasets show that G2L-Net achieves
state-of-the-art performance in terms of both accuracy and speed.
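
To make the three-step pipeline concrete, below is a minimal PyTorch sketch of the coarse-to-fine flow the abstract describes. All module names, layer widths, the PointNet-style point MLPs, and the quaternion output heads are illustrative assumptions for exposition, not the authors' implementation; in particular, the paper's embedding vector features and rotation residual estimator are reduced to simple stand-in heads.

```python
# Hypothetical sketch of the G2L-Net coarse-to-fine pipeline; architecture
# details are assumptions, not the paper's actual network.
import torch
import torch.nn as nn


class PointMLP(nn.Module):
    """Shared per-point MLP (PointNet-style backbone, assumed)."""
    def __init__(self, in_dim, dims):
        super().__init__()
        layers, d = [], in_dim
        for out in dims:
            layers += [nn.Conv1d(d, out, 1), nn.BatchNorm1d(out), nn.ReLU()]
            d = out
        self.net = nn.Sequential(*layers)

    def forward(self, x):                       # x: (B, 3, N)
        return self.net(x)


class TranslationLocalizationNet(nn.Module):
    """Step 2: 3D segmentation + translation from the coarse point cloud."""
    def __init__(self):
        super().__init__()
        self.feat = PointMLP(3, [64, 128, 1024])
        self.seg_head = nn.Conv1d(1024, 2, 1)    # object / background logits
        self.trans_head = nn.Linear(1024, 3)     # offset from the centroid

    def forward(self, pts):                      # pts: (B, 3, N)
        f = self.feat(pts)
        seg_logits = self.seg_head(f)            # (B, 2, N)
        g = f.max(dim=2).values                  # global max-pooled feature
        trans = pts.mean(dim=2) + self.trans_head(g)
        return seg_logits, trans


class RotationLocalizationNet(nn.Module):
    """Step 3: initial rotation plus a rotation residual, estimated from
    points in the local canonical frame (both heads are stand-ins)."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.feat = PointMLP(3, [64, 128, embed_dim])  # point-wise embeddings
        self.rot_head = nn.Linear(embed_dim, 4)        # initial rotation (quat)
        self.res_head = nn.Linear(embed_dim, 4)        # residual rotation (quat)

    def forward(self, canon_pts):                      # (B, 3, M)
        g = self.feat(canon_pts).max(dim=2).values
        q0 = nn.functional.normalize(self.rot_head(g), dim=1)
        dq = nn.functional.normalize(self.res_head(g), dim=1)
        return q0, dq            # compose dq with q0 for the refined rotation


def estimate_pose(coarse_pts, t_net, r_net):
    """coarse_pts: (B, 3, N) points cropped from the depth map by 2D detection."""
    seg_logits, trans = t_net(coarse_pts)
    mask = seg_logits.argmax(dim=1, keepdim=True).float()  # (B, 1, N)
    # Simplification: zero out background points so shapes stay fixed;
    # a real pipeline would gather only the segmented object points.
    fine_pts = coarse_pts * mask
    canon_pts = fine_pts - trans.unsqueeze(2)   # local canonical frame
    q0, dq = r_net(canon_pts)
    return trans, q0, dq


if __name__ == "__main__":
    pts = torch.randn(2, 3, 1024)               # fake coarse clouds from 2D crops
    t_net, r_net = TranslationLocalizationNet(), RotationLocalizationNet()
    trans, q0, dq = estimate_pose(pts, t_net, r_net)
    print(trans.shape, q0.shape, dq.shape)      # (2, 3), (2, 4), (2, 4)
```

The structural point the sketch preserves is the divide-and-conquer decoupling: rotation is estimated only after the predicted translation has been removed, so the rotation network sees points in a translation-free local canonical frame.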
Related papers
- PillarNeSt: Embracing Backbone Scaling and Pretraining for Pillar-based 3D Object Detection [33.00510927880774]
  We show the effectiveness of 2D backbone scaling and pretraining for pillar-based 3D object detectors.
  Our proposed pillar-based detector, PillarNeSt, outperforms existing 3D object detectors by a large margin on the nuScenes and Argoverse 2 datasets.
  arXiv Detail & Related papers (2023-11-29T16:11:33Z)
- PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds [29.15589024703907]
  In this paper, we revisit local point aggregators from the perspective of allocating computational resources.
  We find that the simplest pillar-based models perform surprisingly well in terms of both accuracy and latency.
  Our results challenge the common intuition that detailed geometry modeling is essential to achieving high performance in 3D object detection.
  arXiv Detail & Related papers (2023-05-08T17:59:14Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
  We present an unsupervised deep neural architecture called Flattening-Net that represents irregular 3D point clouds of arbitrary geometry and topology.
  Our method performs favorably against the current state-of-the-art competitors.
  arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
  We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
  Our method first generates high-quality 3D proposals by leveraging a class-aware local grouping strategy on object surface voxels.
  To recover the features of voxels missed by incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
  arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
  Reconstructing a 3D shape from a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
  Existing works employ the global feature extracted from the sketch to directly predict the 3D coordinates, but they usually lose fine details and are therefore not faithful to the input sketch.
  arXiv Detail & Related papers (2022-08-14T16:37:51Z)
- Structure Aware and Class Balanced 3D Object Detection on nuScenes Dataset [0.0]
  NuTonomy's nuScenes dataset greatly extends commonly used datasets such as KITTI.
  The localization precision of this model is affected by the loss of spatial information in the downscaled feature maps.
  We propose to enhance the performance of the CBGS model by designing an auxiliary network that makes full use of the structural information of the 3D point cloud.
  arXiv Detail & Related papers (2022-05-25T06:18:49Z)
- FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [49.89268018642999]
  We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
  The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
  arXiv Detail & Related papers (2021-03-12T03:07:24Z)
- PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection [100.60209139039472]
  We propose the Point-Voxel Region-based Convolutional Neural Networks (PV-RCNNs) for accurate 3D detection from point clouds.
  Our proposed PV-RCNNs significantly outperform previous state-of-the-art 3D detection methods on both the Waymo Open Dataset and the highly competitive KITTI benchmark.
  arXiv Detail & Related papers (2021-01-31T14:51:49Z)
- Spherical Interpolated Convolutional Network with Distance-Feature Density for 3D Semantic Segmentation of Point Clouds [24.85151376535356]
  A spherical interpolated convolution operator is proposed to replace the traditional grid-shaped 3D convolution operator.
  The proposed method achieves good performance on the ScanNet and Paris-Lille-3D datasets.
  arXiv Detail & Related papers (2020-11-27T15:35:12Z)