Related papers: UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input

URL: http://arxiv.org/abs/2307.00741v1
Date: Mon, 3 Jul 2023 04:10:55 GMT
Title: UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input
Authors: Muhammad Ibrahim, Naveed Akhtar, Saeed Anwar, and Ajmal Mian
Abstract summary: UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions. Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
Score: 51.150605800173366
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Localization is a fundamental task in robotics for autonomous navigation. Existing localization methods rely on a single input data modality or train several computational models to process different modalities. This leads to stringent computational requirements and sub-optimal results that fail to capitalize on the complementary information in other data streams. This paper proposes UnLoc, a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions. Our multi-stream network can handle LiDAR, Camera and Radar inputs for localization on demand, i.e., it can work with one or more input sensors, making it robust to sensor failure. UnLoc uses 3D sparse convolutions and cylindrical partitioning of the space to process LiDAR frames and implements ResNet blocks with a slot attention-based feature filtering module for the Radar and image modalities. We introduce a unique learnable modality encoding scheme to distinguish between the input sensor data. Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets. The results confirm the efficacy of our technique.

Related papers

Resource-Efficient Beam Prediction in mmWave Communications with Multimodal Realistic Simulation Framework [57.994965436344195]
Beamforming is a key technology in millimeter-wave (mmWave) communications that improves signal transmission by optimizing directionality and intensity. Multimodal sensing-aided beam prediction has gained significant attention, using various sensing data to predict user locations or network conditions. Despite its promising potential, the adoption of multimodal sensing-aided beam prediction is hindered by high computational complexity, high costs, and limited datasets.
arXiv Detail & Related papers (2025-04-07T15:38:25Z)
Real-Time Navigation for Autonomous Aerial Vehicles Using Video [11.414350041043326]
We introduce a novel Markov Decision Process (MDP) framework to reduce the workload of Computer Vision (CV) algorithms. We apply our proposed framework to both feature-based and neural-network-based object-detection tasks. These holistic tests show significant benefits in energy consumption and speed with only a limited loss in accuracy.
arXiv Detail & Related papers (2025-04-01T01:14:42Z)
The Oxford Spires Dataset: Benchmarking Large-Scale LiDAR-Visual Localisation, Reconstruction and Radiance Field Methods [10.265865092323041]
This paper introduces a large-scale multi-modal dataset captured in and around well-known landmarks in Oxford. We also establish benchmarks for tasks involving localisation, reconstruction, and novel-view synthesis. Our dataset and benchmarks are intended to facilitate better integration of radiance field methods and SLAM systems.
arXiv Detail & Related papers (2024-11-15T19:43:24Z)
GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving [9.023864430027333]
Multimodal place recognition methods have gained increasing attention due to their ability to overcome the weaknesses of unimodal sensor systems. We propose a 3D Gaussian-based multimodal place recognition neural network dubbed GSPR.
arXiv Detail & Related papers (2024-10-01T00:43:45Z)
OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition [10.39935021754015]
We develop OverlapMamba, a novel network for place recognition as sequences. Our method effectively detects loop closures, even when traversing previously visited locations from different directions. Relying on raw range view inputs, it outperforms typical LiDAR and multi-view combination methods in time complexity and speed.
arXiv Detail & Related papers (2024-05-13T17:46:35Z)
Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving [91.91552963872596]
We propose a new multi-modal visual grounding task, termed LiDAR Grounding. It jointly learns the LiDAR-based object detector with the language features and predicts the targeted region directly from the detector. Our work offers a deeper insight into the LiDAR-based grounding task and we expect it presents a promising direction for the autonomous driving community.
arXiv Detail & Related papers (2023-05-25T06:22:10Z)
Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection. With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
IR-MCL: Implicit Representation-Based Online Global Localization [31.77645160411745]
In this paper, we address the problem of estimating the robot's pose in an indoor environment using 2D LiDAR data. We propose a neural occupancy field (NOF) to implicitly represent the scene using a neural network. We show that we can accurately and efficiently localize a robot using our approach, surpassing the localization performance of state-of-the-art methods.
arXiv Detail & Related papers (2022-10-06T17:59:08Z)
SeqOT: A Spatial-Temporal Transformer Network for Place Recognition Using Sequential LiDAR Data [9.32516766412743]
We propose a transformer-based network named SeqOT to exploit the temporal and spatial information provided by sequential range images. We evaluate our approach on four datasets collected with different types of LiDAR sensors in different environments. Our method operates online faster than the frame rate of the sensor.
arXiv Detail & Related papers (2022-09-16T14:08:11Z)
Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector. The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference. Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on. We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
Kimera-Multi: Robust, Distributed, Dense Metric-Semantic SLAM for Multi-Robot Systems [92.26462290867963]
Kimera-Multi is the first multi-robot system that is robust and capable of identifying and rejecting incorrect inter- and intra-robot loop closures. We demonstrate Kimera-Multi in photo-realistic simulations, SLAM benchmarking datasets, and challenging outdoor datasets collected using ground robots.
arXiv Detail & Related papers (2021-06-28T03:56:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.