TReR: A Lightweight Transformer Re-Ranking Approach for 3D LiDAR Place
Recognition
- URL: http://arxiv.org/abs/2305.18013v1
- Date: Mon, 29 May 2023 11:10:38 GMT
- Title: TReR: A Lightweight Transformer Re-Ranking Approach for 3D LiDAR Place
Recognition
- Authors: Tiago Barros, Luís Garrote, Martin Aleksandrov, Cristiano Premebida, Urbano J. Nunes
- Abstract summary: 3D LiDAR-based localization methods have used retrieval-based place recognition to find revisited places efficiently.
This work tackles this problem from an information-retrieval perspective, adopting a first-retrieve-then-re-ranking paradigm.
The proposed approach relies on global descriptors only, being agnostic to the place recognition model.
- Score: 2.6619797838632966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous driving systems often require reliable loop closure detection to
guarantee reduced localization drift. Recently, 3D LiDAR-based localization
methods have used retrieval-based place recognition to find revisited places
efficiently. However, when deployed in challenging real-world scenarios, the
place recognition models become more complex, which comes at the cost of high
computational demand. This work tackles this problem from an
information-retrieval perspective, adopting a first-retrieve-then-re-ranking
paradigm, where an initial loop candidate ranking, generated from a 3D place
recognition model, is re-ordered by a proposed lightweight transformer-based
re-ranking approach (TReR). The proposed approach relies on global descriptors
only, being agnostic to the place recognition model. The experimental
evaluation, conducted on the KITTI Odometry dataset, where we compared TReR
with state-of-the-art re-ranking approaches such as alphaQE and SGV, indicates
robustness and efficiency when compared to alphaQE, while offering a good
trade-off between robustness and efficiency when compared to SGV.
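The first-retrieve-then-re-rank paradigm described in the abstract can be illustrated with a minimal sketch. This is not TReR's transformer. For brevity, the second stage uses alpha-weighted query expansion, in the spirit of the alphaQE baseline the abstract names. The 2-D descriptors, the function names, and the `n_expand` and `alpha` values are illustrative assumptions, not taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Dot product of two unit-norm vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, database, top_k):
    """First stage: rank database indices by descriptor similarity to the query."""
    scored = [(cosine(query, d), i) for i, d in enumerate(database)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best first, index breaks ties
    return [i for _, i in scored[:top_k]]


def alpha_qe_rerank(query, database, candidates, n_expand=2, alpha=3.0):
    """Second stage: re-order candidates with alpha-weighted query expansion.

    Hypothetical stand-in for a re-ranker; it only needs global descriptors,
    so it does not depend on which model produced them.
    """
    expanded = list(query)  # the query itself contributes with weight 1
    for i in candidates[:n_expand]:
        weight = max(cosine(query, database[i]), 0.0) ** alpha
        expanded = [e + weight * x for e, x in zip(expanded, database[i])]
    expanded = normalize(expanded)
    return sorted(candidates, key=lambda i: (-cosine(expanded, database[i]), i))


# Toy global descriptors (assumed values, not real LiDAR data).
database = [normalize(v) for v in ([1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0])]
query = normalize([0.9, 0.1])

candidates = retrieve(query, database, top_k=3)          # initial loop candidates
reranked = alpha_qe_rerank(query, database, candidates)  # re-ordered candidates
print(candidates, reranked)
```

Because the second stage only touches the short candidate list, its cost is independent of the database size, which is the efficiency argument behind re-ranking in general.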
Related papers
- GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving [9.023864430027333]
Multimodal place recognition methods have gained increasing attention due to their ability to overcome weaknesses of unimodal sensor systems.
We propose a 3D Gaussian-based multimodal place recognition neural network dubbed GSPR.
arXiv Detail & Related papers (2024-10-01T00:43:45Z)
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition [10.39935021754015]
We develop OverlapMamba, a novel network for place recognition as sequences.
Our method effectively detects loop closures even when traversing previously visited locations from different directions.
Relying on raw range view inputs, it outperforms typical LiDAR and multi-view combination methods in time complexity and speed.
arXiv Detail & Related papers (2024-05-13T17:46:35Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z)
- Poses as Queries: Image-to-LiDAR Map Localization with Transformers [5.704968411509063]
High-precision vehicle localization with commercial setups is a crucial technique for high-level autonomous driving tasks.
Estimating pose by finding correspondences between such cross-modal sensor data is challenging.
We propose a novel Transformer-based neural network to register 2D images into a 3D LiDAR map in an end-to-end manner.
arXiv Detail & Related papers (2023-05-07T14:57:58Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
- Probabilistic Appearance-Invariant Topometric Localization with New Place Awareness [23.615781318030454]
We present a new topometric localization system which incorporates full 3-dof odometry into the motion model and adds an "off-map" state within the state-estimation framework.
Our approach achieves major performance improvements over both existing and improved state-of-the-art systems.
arXiv Detail & Related papers (2021-07-16T05:01:40Z)
- Lite-FPN for Keypoint-based Monocular 3D Object Detection [18.03406686769539]
Keypoint-based monocular 3D object detection has made tremendous progress and achieved a strong speed-accuracy trade-off.
We propose a sort of lightweight feature pyramid network called Lite-FPN to achieve multi-scale feature fusion.
Our proposed method achieves significantly higher accuracy and frame rate at the same time.
arXiv Detail & Related papers (2021-05-01T14:44:31Z)
- Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework on the KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z)
- Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization [73.62550438861942]
This paper proposes a new paradigm for handling far-field multi-speaker data in an end-to-end neural network manner, called directional automatic speech recognition (D-ASR).
In D-ASR, the azimuth angle of the sources with respect to the microphone array is defined as a latent variable. This angle controls the quality of separation, which in turn determines the ASR performance.
arXiv Detail & Related papers (2020-10-30T20:26:28Z)