SphereVLAD++: Attention-based and Signal-enhanced Viewpoint Invariant
Descriptor
- URL: http://arxiv.org/abs/2207.02958v1
- Date: Wed, 6 Jul 2022 20:32:43 GMT
- Title: SphereVLAD++: Attention-based and Signal-enhanced Viewpoint Invariant
Descriptor
- Authors: Shiqi Zhao, Peng Yin, Ge Yi, and Sebastian Scherer
- Abstract summary: We develop SphereVLAD++, an attention-enhanced viewpoint invariant place recognition method.
We show that SphereVLAD++ outperforms all relevant state-of-the-art 3D place recognition methods under small or even totally reversed viewpoint differences.
- Score: 6.326554177747699
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: LiDAR-based localization is a fundamental module for large-scale
navigation tasks, such as last-mile delivery and autonomous driving, and
localization robustness relies heavily on viewpoints and 3D feature extraction.
Our previous work provides a viewpoint-invariant descriptor to deal with
viewpoint differences; however, the global descriptor suffers from a low
signal-to-noise ratio in unsupervised clustering, which reduces its ability to
extract distinguishable features. In this work, we develop SphereVLAD++, an
attention-enhanced viewpoint-invariant place recognition method. SphereVLAD++
projects the point cloud onto a spherical perspective for each unique area and
captures the contextual connections between local features and their
dependencies on the global 3D geometry distribution. As a result, clustered
elements within the global descriptor are conditioned on both local and global
geometries while preserving the original viewpoint-invariant property of
SphereVLAD. In our experiments, we evaluated the localization performance of
SphereVLAD++ on both the public KITTI360 dataset and self-generated datasets
from the city of Pittsburgh. The results show that SphereVLAD++ outperforms
all relevant state-of-the-art 3D place recognition methods under small or even
totally reversed viewpoint differences, achieving successful retrieval rates
0.69% and 15.81% higher than the second-best method. Low computation
requirements and high time efficiency also make it suitable for low-cost
robots.
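The pipeline the abstract describes — project the point cloud onto a spherical grid, enhance local features with attention, then aggregate them into a clustered global descriptor — can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the grid resolution, the single-head dot-product attention, and the soft-assignment VLAD aggregation are all simplifications.

```python
import numpy as np

def spherical_projection(points, n_theta=16, n_phi=32):
    """Project an (N, 3) point cloud onto an n_theta x n_phi spherical grid.

    Each cell stores the mean range of the points that fall into it; a
    viewpoint rotation then becomes (approximately) a shift of this grid.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-9
    theta = np.arccos(np.clip(z / r, -1.0, 1.0))      # polar angle in [0, pi]
    phi = np.arctan2(y, x) + np.pi                    # azimuth in [0, 2*pi)
    ti = np.minimum((theta / np.pi * n_theta).astype(int), n_theta - 1)
    pj = np.minimum((phi / (2 * np.pi) * n_phi).astype(int), n_phi - 1)
    grid = np.zeros((n_theta, n_phi))
    count = np.zeros((n_theta, n_phi))
    np.add.at(grid, (ti, pj), r)                      # unbuffered accumulation
    np.add.at(count, (ti, pj), 1)
    return grid / np.maximum(count, 1)

def self_attention(features):
    """Single-head self-attention over local features (L, D): the
    signal-enhancement step, re-weighting each feature by its
    contextual similarity to all others."""
    scores = features @ features.T / np.sqrt(features.shape[1])
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ features

def vlad(features, centroids):
    """VLAD-style aggregation: soft-assign features (L, D) to centroids
    (K, D) and accumulate residuals into one L2-normalized descriptor."""
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    a = np.exp(-d2)
    a /= a.sum(axis=1, keepdims=True)                 # soft assignment (L, K)
    resid = features[:, None, :] - centroids[None, :, :]
    desc = (a[:, :, None] * resid).sum(axis=0).ravel()
    return desc / (np.linalg.norm(desc) + 1e-9)
```

With attention applied before aggregation, the clustered elements are conditioned on the context of all local features, which is the "signal-enhanced" property the abstract refers to.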
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- Gaussian Splatting with Localized Points Management [52.009874685460694]
Localized Point Management (LPM) is capable of identifying those error-contributing zones in the highest demand for both point addition and geometry calibration.
LPM applies point densification in the identified zone, whilst resetting the opacity of those points residing in front of these regions so that a new opportunity is created to correct ill-conditioned points.
Notably, LPM improves both vanilla 3DGS and SpaceTimeGS to achieve state-of-the-art rendering quality while retaining real-time speeds.
arXiv Detail & Related papers (2024-06-06T16:55:07Z)
- DuEqNet: Dual-Equivariance Network in Outdoor 3D Object Detection for Autonomous Driving [4.489333751818157]
We propose DuEqNet, which is the first to introduce the concept of equivariance into 3D object detection networks.
The dual equivariance of our model can extract equivariant features at both local and global levels.
Our model achieves higher orientation accuracy and better prediction efficiency.
arXiv Detail & Related papers (2023-02-27T08:30:02Z)
- Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Bidirectional Feature Globalization for Few-shot Semantic Segmentation of 3D Point Cloud Scenes [1.8374319565577157]
We propose a bidirectional feature globalization (BFG) approach to embed global perception to local point features.
With prototype-to-point globalization (Pr2PoG), the global perception is embedded to local point features based on similarity weights from sparse prototypes to dense point features.
The sparse prototypes of each class embedded with global perception are summarized to a single prototype for few-shot 3D segmentation.
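The prototype-to-point globalization (Pr2PoG) step described above can be sketched as a similarity-weighted mixture of sparse prototypes added back onto dense point features. The softmax weighting and the residual fusion here are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def pr2pog(point_feats, prototypes):
    """Pr2PoG sketch: embed global perception into local point features.

    point_feats: (N, D) dense per-point features.
    prototypes:  (P, D) sparse class prototypes carrying global perception.
    Each point receives a mixture of prototypes weighted by similarity.
    """
    sim = point_feats @ prototypes.T                  # (N, P) similarity
    sim -= sim.max(axis=1, keepdims=True)             # numerical stability
    w = np.exp(sim)
    w /= w.sum(axis=1, keepdims=True)                 # softmax over prototypes
    globalized = w @ prototypes                       # (N, D) global signal
    return point_feats + globalized                   # residual fusion (assumed)
```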
arXiv Detail & Related papers (2022-08-13T15:04:20Z)
- Unsupervised Learning on 3D Point Clouds by Clustering and Contrasting [11.64827192421785]
Unsupervised representation learning is a promising direction for auto-extracting features without human intervention.
This paper proposes a general unsupervised approach, named ConClu, to perform the learning of point-wise and global features.
arXiv Detail & Related papers (2022-02-05T12:54:17Z)
- LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place Recognition [31.105598103211825]
We show that an additional training signal (local consistency loss) can guide the network to learn local features that are consistent across revisits.
We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net.
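A local consistency loss of the kind described above can be sketched as a penalty on feature distances between corresponding points of two revisits of the same place. The exact loss in LoGG3D-Net differs, so treat this as an illustrative stand-in with a hypothetical correspondence list.

```python
import numpy as np

def local_consistency_loss(feats_a, feats_b, corr):
    """Penalize inconsistent local features across two revisits.

    feats_a, feats_b: (N, D) local features from two scans of one place.
    corr: list of (i, j) index pairs marking corresponding points.
    Returns the mean squared feature distance over correspondences.
    """
    ia = [i for i, _ in corr]
    ib = [j for _, j in corr]
    d = feats_a[ia] - feats_b[ib]                # per-pair feature residuals
    return float((d ** 2).sum(axis=1).mean())
```

Minimizing this term pulls the local features of revisited points together, which in turn guides the global descriptor toward revisit-consistent geometry.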
arXiv Detail & Related papers (2021-09-17T03:32:43Z)
- Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection [59.765645791588454]
The recently introduced RTS3D builds an efficient 4D Feature-Consistency Embedding space for the intermediate representation of objects without depth supervision.
We propose a shape-prior non-uniform sampling strategy that performs dense sampling in the outer region and sparse sampling in the inner region.
Our proposed method achieves a 2.57% improvement on AP3D with almost no extra network parameters.
arXiv Detail & Related papers (2021-06-18T09:14:55Z)
- 3D Object Detection with Pointformer [29.935891419574602]
We propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.
A Local Transformer module is employed to model interactions among points in a local region, which learns context-dependent region features at an object level.
A Global Transformer is designed to learn context-aware representations at the scene level.
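The local-then-global attention scheme described above can be sketched as follows: attention restricted to each point's radius neighborhood, followed by attention over all points. The radius neighborhoods and the single-head, projection-free attention are simplifications of Pointformer's actual modules, chosen only to show the two-level structure.

```python
import numpy as np

def attention(q, k, v):
    """Single-head scaled dot-product attention (no learned projections)."""
    s = q @ k.T / np.sqrt(q.shape[1])
    s -= s.max(axis=1, keepdims=True)                 # numerical stability
    w = np.exp(s)
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

def local_then_global(points, feats, radius=1.0):
    """Local step: each point attends only to neighbors within `radius`,
    learning region-level context. Global step: attention over all points,
    producing scene-level context-aware representations."""
    n = len(points)
    local = np.zeros_like(feats)
    for i in range(n):
        nbr = np.linalg.norm(points - points[i], axis=1) <= radius
        local[i] = attention(feats[i:i + 1], feats[nbr], feats[nbr])[0]
    return attention(local, local, local)
```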
arXiv Detail & Related papers (2020-12-21T15:12:54Z)
- DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points.
For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner.
Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
arXiv Detail & Related papers (2020-07-17T20:21:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.