LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place
Recognition
- URL: http://arxiv.org/abs/2109.08336v1
- Date: Fri, 17 Sep 2021 03:32:43 GMT
- Authors: Kavisha Vidanapathirana, Milad Ramezani, Peyman Moghadam, Sridha
Sridharan, Clinton Fookes
- Abstract summary: We show that an additional training signal (local consistency loss) can guide the network to learn local features that are consistent across revisits.
We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net.
- Score: 31.105598103211825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-based place recognition is an efficient and effective solution for
enabling re-localization within a pre-built map or global data association for
Simultaneous Localization and Mapping (SLAM). The accuracy of such an approach
is heavily dependent on the quality of the extracted scene-level
representation. While end-to-end solutions, which learn a global descriptor
from input point clouds, have demonstrated promising results, such approaches
are limited in their ability to enforce desirable properties at the local
feature level. In this paper, we demonstrate that the inclusion of an
additional training signal (local consistency loss) can guide the network to
learn local features which are consistent across revisits, hence leading to
more repeatable global descriptors resulting in an overall improvement in place
recognition performance. We formulate our approach in an end-to-end trainable
architecture called LoGG3D-Net. Experiments on two large-scale public
benchmarks (KITTI and MulRan) show that our method achieves mean $F1_{max}$
scores of $0.939$ and $0.968$ on KITTI and MulRan, respectively, while operating
in near real-time.
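For readers unfamiliar with the $F1_{max}$ metric quoted above, the following is a minimal sketch of the generic definition: the highest F1 score obtained over all decision thresholds on a retrieval similarity score. The function name `f1_max` and the input layout (one similarity score and one binary revisit label per query) are assumptions made for illustration; this is not the authors' evaluation code, and the paper's protocol for deciding what counts as a true revisit is not reproduced here.

```python
def f1_max(scores, labels):
    """Return the maximum F1 score over all decision thresholds.

    scores: one similarity score per query (higher = more likely a revisit).
    labels: 1 if the query is a true revisit, 0 otherwise.
    Both names and this input layout are illustrative assumptions.
    """
    best = 0.0
    # Every distinct score is a candidate threshold.
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        if tp == 0:
            # Precision and recall are both zero (or undefined); F1 is 0.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    # Toy data: three true revisits, one false candidate.
    print(f1_max([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 1]))
```

A mean $F1_{max}$, as reported in the abstract, would then be the average of this value over the evaluated sequences of a benchmark.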
Related papers
- FUSELOC: Fusing Global and Local Descriptors to Disambiguate 2D-3D Matching in Visual Localization [57.59857784298536]
Direct 2D-3D matching algorithms require significantly less memory but suffer from lower accuracy due to the larger and more ambiguous search space.
We address this ambiguity by fusing local and global descriptors using a weighted average operator within a 2D-3D search framework.
We consistently improve the accuracy over local-only systems and achieve performance close to hierarchical methods while halving memory requirements.
(2024-08-21T23:42:16Z)
- Scalable and Efficient Hierarchical Visual Topological Mapping [3.114470292106496]
We evaluate state-of-the-art hand-crafted and learned global descriptors using a hierarchical topological mapping technique.
Based on our empirical analysis of multiple runs, we identify that continuity and distinctiveness are crucial characteristics for an optimal global descriptor.
(2024-04-07T17:30:57Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
(2022-09-21T02:33:07Z)
- SphereVLAD++: Attention-based and Signal-enhanced Viewpoint Invariant Descriptor [6.326554177747699]
We develop SphereVLAD++, an attention-enhanced viewpoint invariant place recognition method.
We show that SphereVLAD++ outperforms all relative state-of-the-art 3D place recognition methods under small or even totally reversed viewpoint differences.
(2022-07-06T20:32:43Z)
- Learning Consistency from High-quality Pseudo-labels for Weakly Supervised Object Localization [7.602783618330373]
We propose a two-stage approach to learn more consistent localization.
In the first stage, we propose a mask-based pseudo label generator algorithm, and use the pseudo-supervised learning method to initialize an object localization network.
In the second stage, we propose a simple and effective method for evaluating the confidence of pseudo-labels based on classification discrimination.
(2022-03-18T09:05:51Z)
- An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot Learning [77.72330187258498]
We propose a novel Entropy-guided Reinforced Partial Convolutional Network (ERPCNet).
ERPCNet extracts and aggregates localities based on semantic relevance and visual correlations without human-annotated regions.
It not only discovers global-cooperative localities dynamically but also converges faster for policy gradient optimization.
(2021-11-03T11:13:13Z)
- Conformer: Local Features Coupling Global Representations for Visual Recognition [72.9550481476101]
We propose a hybrid network structure, termed Conformer, to take advantage of convolutional operations and self-attention mechanisms for enhanced representation learning.
Experiments show that Conformer, under comparable parameter complexity, outperforms the visual transformer (DeiT-B) by 2.3% on ImageNet.
(2021-05-09T10:00:03Z)
- Re-rank Coarse Classification with Local Region Enhanced Features for Fine-Grained Image Recognition [22.83821575990778]
We re-rank the TopN classification results by using the local region enhanced embedding features to improve the Top1 accuracy.
To learn more effective semantic global features, we design a multi-level loss over an automatically constructed hierarchical category structure.
Our method achieves state-of-the-art performance on three benchmarks: CUB-200-2011, Stanford Cars, and FGVC Aircraft.
(2021-02-19T11:30:25Z)
- Gait Recognition via Effective Global-Local Feature Representation and Local Temporal Aggregation [28.721376937882958]
Gait recognition is one of the most important biometric technologies and has been applied in many fields.
Recent gait recognition frameworks represent each gait frame by descriptors extracted from either global appearances or local regions of humans.
We propose a novel feature extraction and fusion framework to achieve discriminative feature representations for gait recognition.
(2020-11-03T04:07:13Z)
- DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization [56.15308829924527]
We propose a Siamese network that jointly learns 3D local feature detection and description directly from raw 3D points.
For detecting 3D keypoints we predict the discriminativeness of the local descriptors in an unsupervised manner.
Experiments on various benchmarks demonstrate that our method achieves competitive results for both global point cloud retrieval and local point cloud registration.
(2020-07-17T20:21:22Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
(2020-01-23T06:55:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.