SeqOT: A Spatial-Temporal Transformer Network for Place Recognition
Using Sequential LiDAR Data
- URL: http://arxiv.org/abs/2209.07951v1
- Date: Fri, 16 Sep 2022 14:08:11 GMT
- Title: SeqOT: A Spatial-Temporal Transformer Network for Place Recognition
Using Sequential LiDAR Data
- Authors: Junyi Ma, Xieyuanli Chen, Jingyi Xu, Guangming Xiong
- Abstract summary: We propose a transformer-based network named SeqOT to exploit the temporal and spatial information provided by sequential range images.
We evaluate our approach on four datasets collected with different types of LiDAR sensors in different environments.
Our method operates online faster than the frame rate of the sensor.
- Score: 9.32516766412743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Place recognition is an important component for autonomous vehicles to
achieve loop closing or global localization. In this paper, we tackle the
problem of place recognition based on sequential 3D LiDAR scans obtained by an
onboard LiDAR sensor. We propose a transformer-based network named SeqOT to
exploit the temporal and spatial information provided by sequential range
images generated from the LiDAR data. It uses multi-scale transformers to
generate a global descriptor for each sequence of LiDAR range images in an
end-to-end fashion. During online operation, our SeqOT finds similar places by
matching such descriptors between the current query sequence and those stored
in the map. We evaluate our approach on four datasets collected with different
types of LiDAR sensors in different environments. The experimental results show
that our method outperforms the state-of-the-art LiDAR-based place recognition
methods and generalizes well across different environments. Furthermore, our
method operates online faster than the frame rate of the sensor. The
implementation of our method is released as open source at:
https://github.com/BIT-MJY/SeqOT.
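As a rough, non-authoritative sketch of the pipeline the abstract describes (per-frame features from range images, a transformer fusing them over time into one global descriptor, and cosine-similarity matching against descriptors stored in the map), consider the following; all names such as `SeqDescriptorNet` are illustrative placeholders, not the authors' released code:
```python
# Illustrative sketch (not the released SeqOT implementation): build one
# global descriptor per sequence of LiDAR range images with a transformer
# encoder, then match the query descriptor against map descriptors by
# cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqDescriptorNet(nn.Module):          # hypothetical name
    def __init__(self, feat_dim=256, n_heads=4, n_layers=2):
        super().__init__()
        # Per-frame backbone stub: one conv layer over range images.
        self.backbone = nn.Sequential(
            nn.Conv2d(1, feat_dim, kernel_size=3, padding=1),
            nn.AdaptiveAvgPool2d(1),        # one feat_dim vector per frame
        )
        enc_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=n_heads, batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(enc_layer, n_layers)

    def forward(self, range_seq):           # (B, T, 1, H, W)
        b, t = range_seq.shape[:2]
        frames = range_seq.flatten(0, 1)    # (B*T, 1, H, W)
        feats = self.backbone(frames).flatten(1).view(b, t, -1)
        fused = self.temporal_encoder(feats)           # attend across time
        return F.normalize(fused.mean(dim=1), dim=-1)  # (B, feat_dim)

def match(query_desc, map_descs):
    # query_desc: (D,), map_descs: (N, D), both L2-normalized.
    sims = map_descs @ query_desc
    return int(sims.argmax()), float(sims.max())

net = SeqDescriptorNet()
map_descs = net(torch.rand(10, 5, 1, 32, 360))        # 10 stored sequences
idx, score = match(net(torch.rand(1, 5, 1, 32, 360))[0], map_descs)
```
In the real system the per-frame backbone and the multi-scale transformers are far richer; the point here is only the sequence-in, single-descriptor-out interface and the online matching step.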
Related papers
- GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving [9.023864430027333]
Multimodal place recognition has gained increasing attention due to its ability to overcome the weaknesses of unimodal sensor systems.
We propose a 3D Gaussian-based multimodal place recognition neural network dubbed GSPR.
arXiv Detail & Related papers (2024-10-01T00:43:45Z)
- RaLF: Flow-based Global and Metric Radar Localization in LiDAR Maps [8.625083692154414]
We propose RaLF, a novel deep neural network-based approach for localizing radar scans in a LiDAR map of the environment.
RaLF is composed of radar and LiDAR feature encoders, a place recognition head that generates global descriptors, and a metric localization head that predicts the 3-DoF transformation between the radar scan and the map.
We extensively evaluate our approach on multiple real-world driving datasets and show that RaLF achieves state-of-the-art performance for both place recognition and metric localization.
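A minimal sketch of the two-head layout this summary describes, assuming one small CNN encoder per modality; the class name, dimensions, and head designs are placeholders rather than RaLF's actual architecture:
```python
# Minimal sketch of a two-head localization network: shared per-modality
# encoders, a place-recognition head emitting a global descriptor, and a
# metric head regressing a 3-DoF transform (x, y, yaw).
import torch
import torch.nn as nn
import torch.nn.functional as F

def encoder():
    # Tiny CNN stand-in for the radar / LiDAR feature encoders.
    return nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(4), nn.Flatten())

class DualHeadLocalizer(nn.Module):          # hypothetical name
    def __init__(self, feat_dim=32 * 16):
        super().__init__()
        self.radar_enc, self.lidar_enc = encoder(), encoder()
        self.desc_head = nn.Linear(feat_dim, 128)      # global descriptor
        self.pose_head = nn.Linear(2 * feat_dim, 3)    # (dx, dy, dyaw)

    def forward(self, radar_img, lidar_img):
        r, l = self.radar_enc(radar_img), self.lidar_enc(lidar_img)
        desc = F.normalize(self.desc_head(r), dim=-1)  # for retrieval
        pose = self.pose_head(torch.cat([r, l], dim=-1))
        return desc, pose

desc, pose = DualHeadLocalizer()(torch.rand(2, 1, 64, 64),
                                 torch.rand(2, 1, 64, 64))
```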
arXiv Detail & Related papers (2023-09-18T15:37:01Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- CVTNet: A Cross-View Transformer Network for Place Recognition Using LiDAR Data [15.144590078316252]
We propose a cross-view transformer-based network, dubbed CVTNet, to fuse the range image views (RIVs) and bird's eye views (BEVs) generated from the LiDAR data.
We evaluate our approach on three datasets collected with different sensor setups and environmental conditions.
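As an illustration of the cross-view idea, the sketch below lets RIV tokens and BEV tokens attend to each other with standard multi-head cross-attention; it is an assumption-laden toy, not the released CVTNet code:
```python
# Illustrative cross-view fusion: range-image-view tokens attend to
# bird's-eye-view tokens and vice versa, then both streams are pooled
# into one retrieval descriptor.
import torch
import torch.nn as nn

class CrossViewFusion(nn.Module):            # hypothetical name
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.riv_to_bev = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.bev_to_riv = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, riv_tokens, bev_tokens):   # (B, N, D), (B, M, D)
        riv_fused, _ = self.riv_to_bev(riv_tokens, bev_tokens, bev_tokens)
        bev_fused, _ = self.bev_to_riv(bev_tokens, riv_tokens, riv_tokens)
        # Pool both fused streams into one descriptor of size 2*D.
        return torch.cat([riv_fused.mean(1), bev_fused.mean(1)], dim=-1)

fusion = CrossViewFusion()
desc = fusion(torch.rand(1, 360, 128), torch.rand(1, 256, 128))  # (1, 256)
```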
arXiv Detail & Related papers (2023-02-03T11:37:20Z)
- Online Pole Segmentation on Range Images for Long-term LiDAR Localization in Urban Environments [32.34672033386747]
We present a novel, accurate, and fast pole extraction approach based on geometric features that runs online.
Our method performs all computations directly on range images generated from 3D LiDAR scans.
We use the extracted poles as pseudo labels to train a deep neural network for online range image-based pole segmentation.
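A toy rendering of the geometric pseudo-labeling idea: on a range image, thin structures such as poles sit clearly in front of the background, so a simple range-gap test can produce a rough pseudo label (the paper's extractor is considerably more elaborate):
```python
# Toy geometric pseudo-labeling: mark range-image pixels that are much
# closer than their horizontal neighbors (thin vertical structures such
# as poles stand out against the background), then use the mask as a
# pseudo label for training a segmentation network.
import numpy as np

def pole_pseudo_labels(range_img, gap=2.0):
    # range_img: (H, W) ranges in meters; 0 marks missing returns.
    left = np.roll(range_img, 3, axis=1)     # wrap-around is fine for 360 deg
    right = np.roll(range_img, -3, axis=1)
    valid = (range_img > 0) & (left > 0) & (right > 0)
    # A pole pixel is clearly in front of the background on both sides.
    mask = valid & (left - range_img > gap) & (right - range_img > gap)
    return mask.astype(np.uint8)             # pseudo label, 1 = pole

labels = pole_pseudo_labels(np.random.uniform(0, 50, (64, 900)))
```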
arXiv Detail & Related papers (2022-08-15T17:58:08Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- LiDARCap: Long-range Marker-less 3D Human Motion Capture with LiDAR Point Clouds [58.402752909624716]
Existing motion capture datasets are largely short-range and do not yet meet the needs of long-range applications.
We propose LiDARHuman26M, a new human motion capture dataset captured by LiDAR at a much longer range to overcome this limitation.
Our dataset also includes the ground truth human motions acquired by the IMU system and the synchronous RGB images.
arXiv Detail & Related papers (2022-03-28T12:52:45Z)
- Learning Moving-Object Tracking with FMCW LiDAR [53.05551269151209]
We propose a learning-based moving-object tracking method utilizing our newly developed LiDAR sensor, Frequency Modulated Continuous Wave (FMCW) LiDAR.
Given the labels, we propose a contrastive learning framework, which pulls together the features from the same instance in embedding space and pushes apart the features from different instances to improve the tracking quality.
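One common way to write such an instance-level contrastive objective is an InfoNCE-style loss over a batch of embeddings; the sketch below is that generic formulation, not necessarily the paper's exact loss:
```python
# InfoNCE-style instance contrastive loss: features of the same object
# instance are pulled together in embedding space, features of different
# instances are pushed apart.
import torch
import torch.nn.functional as F

def instance_contrastive_loss(feats, instance_ids, temperature=0.1):
    # feats: (N, D) embeddings; instance_ids: (N,) object identity per row.
    feats = F.normalize(feats, dim=-1)
    sims = feats @ feats.T / temperature               # (N, N) similarities
    same = instance_ids.unsqueeze(0) == instance_ids.unsqueeze(1)
    eye = torch.eye(len(feats), dtype=torch.bool)
    pos = same & ~eye                                  # positives: same instance
    # Log-softmax over all other samples, averaged over positive pairs.
    denom = torch.logsumexp(sims.masked_fill(eye, -1e9), 1, keepdim=True)
    log_prob = sims - denom
    return -(log_prob[pos]).mean()

loss = instance_contrastive_loss(torch.randn(8, 64),
                                 torch.tensor([0, 0, 1, 1, 2, 2, 3, 3]))
```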
arXiv Detail & Related papers (2022-03-02T09:11:36Z)
- Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving [121.44554957537613]
We propose a new transformer, called Temporal-Channel Transformer, to model the spatial-temporal domain and channel domain relationships for video object detection from LiDAR data.
Specifically, the temporal-channel encoder of the transformer is designed to encode the information of different channels and frames.
We achieve the state-of-the-art performance in grid voxel-based 3D object detection on the nuScenes benchmark.
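A rough sketch of attending over both the frame and the channel dimension of pooled voxel features, in the spirit of the temporal-channel encoder mentioned above; the layout and names are assumptions, not the paper's code:
```python
# Attend first across frames (temporal), then across channels, by
# transposing the feature map between the two attention stages.
import torch
import torch.nn as nn

class TemporalChannelEncoder(nn.Module):     # hypothetical name
    def __init__(self, n_frames=4, dim=64, heads=4):
        super().__init__()
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.channel_attn = nn.MultiheadAttention(n_frames, 1, batch_first=True)

    def forward(self, x):                    # (B, T, C) pooled voxel features
        t_out, _ = self.temporal_attn(x, x, x)          # frames attend to frames
        c_in = t_out.transpose(1, 2)                    # (B, C, T)
        c_out, _ = self.channel_attn(c_in, c_in, c_in)  # channels attend
        return c_out.transpose(1, 2)                    # back to (B, T, C)

enc = TemporalChannelEncoder()
y = enc(torch.rand(2, 4, 64))                # two samples, 4 frames, 64 channels
```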
arXiv Detail & Related papers (2020-11-27T09:35:39Z)
- Characterization of Multiple 3D LiDARs for Localization and Mapping using Normal Distributions Transform [54.46473014276162]
We present a detailed comparison of ten different 3D LiDAR sensors, covering a range of manufacturers, models, and laser configurations, for the tasks of mapping and vehicle localization.
Data used in this study is a subset of our LiDAR Benchmarking and Reference (LIBRE) dataset, captured independently from each sensor, from a vehicle driven on public urban roads multiple times, at different times of the day.
We analyze the performance and characteristics of each LiDAR for the tasks of (1) 3D mapping, including an assessment of map quality based on mean map entropy, and (2) 6-DOF localization using a ground truth reference map.
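Mean map entropy is commonly computed as the average Gaussian entropy of each map point's local neighborhood covariance; a small sketch under that standard definition (lower values indicate a crisper map):
```python
# Mean map entropy (MME): for each map point, take the covariance of its
# local neighborhood and average the Gaussian entropies
#   h(p) = 0.5 * ln |2 * pi * e * Sigma(p)|.
import numpy as np
from scipy.spatial import cKDTree

def mean_map_entropy(points, radius=0.5, min_neighbors=5):
    tree = cKDTree(points)
    entropies = []
    for neighbors in tree.query_ball_point(points, radius):
        if len(neighbors) < min_neighbors:
            continue                          # too sparse for a covariance
        cov = np.cov(points[neighbors].T)     # 3x3 local covariance
        det = np.linalg.det(2 * np.pi * np.e * cov)
        if det > 0:
            entropies.append(0.5 * np.log(det))
    return float(np.mean(entropies))

mme = mean_map_entropy(np.random.rand(2000, 3))
```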
arXiv Detail & Related papers (2020-04-03T05:05:36Z)