LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using
Online Camera Distillation
- URL: http://arxiv.org/abs/2304.11379v2
- Date: Mon, 5 Jun 2023 03:56:19 GMT
- Title: LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using
Online Camera Distillation
- Authors: Song Wang and Wentong Li and Wenyu Liu and Xiaolu Liu and Jianke Zhu
- Abstract summary: Semantic map construction under bird's-eye view (BEV) plays an essential role in autonomous driving.
In this paper, we propose an effective LiDAR-based method to build semantic maps.
We introduce a BEV feature pyramid decoder that learns robust multi-scale BEV features for semantic map construction.
- Score: 21.53150795218778
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic map construction under bird's-eye view (BEV) plays an essential role
in autonomous driving. In contrast to camera images, LiDAR inherently provides
accurate 3D observations from which the captured 3D features can be projected
onto the BEV space. However, the vanilla LiDAR-based BEV features often contain
considerable noise, since the spatial features carry little texture and few
semantic cues. In this paper, we propose an effective LiDAR-based method to
build semantic maps. Specifically, we introduce a BEV feature pyramid decoder
that learns robust multi-scale BEV features for semantic map construction,
which greatly boosts the accuracy of the LiDAR-based method. To mitigate the
defects caused by the lack of semantic cues in LiDAR data, we present an online
Camera-to-LiDAR distillation scheme to facilitate semantic learning from images
to point clouds. Our distillation scheme consists of feature-level and
logit-level distillation to absorb the semantic information from the camera in
BEV space. Experimental results on the challenging nuScenes dataset demonstrate
the efficacy of our proposed LiDAR2Map for semantic map construction: it
outperforms previous LiDAR-based methods by over 27.9% mIoU and even performs
better than the state-of-the-art camera-based approaches. Source code is
available at: https://github.com/songw-zju/LiDAR2Map.
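The two-level distillation scheme can be sketched in miniature. The toy, per-BEV-cell version below uses MSE for the feature-level term and temperature-softened KL divergence for the logit-level term; the abstract only names the two levels, so the specific losses, function names, and temperature value are illustrative assumptions, not the paper's exact formulation:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def feature_distill_loss(student_feat, teacher_feat):
    """Feature-level distillation: mean squared error between the
    student (LiDAR) and teacher (camera) BEV feature vectors."""
    n = len(student_feat)
    return sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n

def logit_distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Logit-level distillation: KL divergence from the teacher's
    softened class distribution to the student's, for one BEV cell."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy example: one BEV cell with a 4-dim feature and 3 semantic classes.
lidar_feat    = [0.2, 0.8, -0.1, 0.5]   # student BEV feature
camera_feat   = [0.3, 0.7,  0.0, 0.4]   # teacher BEV feature
lidar_logits  = [1.0, 0.5, -0.5]
camera_logits = [2.0, 0.2, -1.0]

loss = feature_distill_loss(lidar_feat, camera_feat) \
     + logit_distill_loss(lidar_logits, camera_logits)
```

In training, both terms would be summed over all BEV cells and added to the ordinary segmentation loss; the camera branch (teacher) is only needed during training, so inference remains LiDAR-only.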
Related papers
- Shelf-Supervised Multi-Modal Pre-Training for 3D Object Detection [52.66283064389691]
We propose a shelf-supervised approach for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data.
We show that image-based shelf-supervision is helpful for training LiDAR-only and multi-modal (RGB + LiDAR) detectors.
arXiv Detail & Related papers (2024-06-14T15:21:57Z)
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios [51.285561119993105]
We present BEV-MAE, an efficient masked autoencoder pre-training framework for LiDAR-based 3D object detection in autonomous driving.
Specifically, we propose a bird's eye view (BEV) guided masking strategy to guide the 3D encoder in learning feature representations.
We introduce a learnable point token to maintain a consistent receptive field size of the 3D encoder.
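A BEV-guided masking strategy can be illustrated with a toy sketch that picks random BEV grid cells and drops the points that fall inside them; all names and parameters here are hypothetical, and the actual BEV-MAE method masks encoder features rather than raw points:

```python
import random

def bev_guided_mask(points, grid_size=4, cell=1.0, mask_ratio=0.5, seed=0):
    """Mask a point cloud by BEV cell: sample a fraction of BEV grid
    cells and drop every point whose (x, y) lands in a masked cell.
    A pre-training objective would then reconstruct the masked regions."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(grid_size) for j in range(grid_size)]
    masked = set(rng.sample(cells, int(mask_ratio * len(cells))))
    visible = [p for p in points
               if (int(p[0] // cell), int(p[1] // cell)) not in masked]
    return visible, masked

# One point per cell of a 4x4 BEV grid; half the cells get masked.
pts = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
visible, masked = bev_guided_mask(pts)
```

Masking whole BEV cells (rather than individual points) is what makes the masking "BEV guided": the encoder must infer entire missing regions of the scene from their surroundings.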
arXiv Detail & Related papers (2022-12-12T08:15:03Z)
- BEV-LGKD: A Unified LiDAR-Guided Knowledge Distillation Framework for BEV 3D Object Detection [40.45938603642747]
We propose a unified framework named BEV-LGKD to transfer the knowledge in the teacher-student manner.
Our method only uses LiDAR points to guide the KD between RGB models.
arXiv Detail & Related papers (2022-12-01T16:17:39Z)
- BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection [17.526914782562528]
3D object detection from multiple image views is a challenging task for visual scene understanding.
We propose BEVDistill, a cross-modal BEV knowledge distillation framework for multi-view 3D object detection.
Our best model achieves 59.4 NDS on the nuScenes test leaderboard, achieving new state-of-the-art in comparison with various image-based detectors.
arXiv Detail & Related papers (2022-11-17T07:26:14Z)
- Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation [23.666607237164186]
We propose a novel deep neural network exploiting both spatial-temporal information and different representation modalities of LiDAR scans to improve LiDAR-MOS performance.
Specifically, we first use a range image-based dual-branch structure to separately deal with spatial and temporal information.
We also use a point refinement module via 3D sparse convolution to fuse the information from both LiDAR range image and point cloud representations.
arXiv Detail & Related papers (2022-07-05T17:59:17Z)
- Boosting 3D Object Detection by Simulating Multimodality on Point Clouds [51.87740119160152]
This paper presents a new approach to boost a single-modality (LiDAR) 3D object detector by teaching it to simulate features and responses that follow a multi-modality (LiDAR-image) detector.
The approach needs LiDAR-image data only when training the single-modality detector, and once well-trained, it only needs LiDAR data at inference.
Experimental results on the nuScenes dataset show that our approach outperforms all SOTA LiDAR-only 3D detectors.
arXiv Detail & Related papers (2022-06-30T01:44:30Z)
- A Simple Baseline for BEV Perception Without LiDAR [37.00868568802673]
Building 3D perception systems for autonomous vehicles that do not rely on LiDAR is a critical research problem.
Current methods use multi-view RGB data collected from cameras around the vehicle.
We propose a simple baseline model, where the "lifting" step simply averages features from all projected image locations.
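The averaging "lifting" step can be sketched directly, assuming a precomputed projection map from each BEV cell to the image-feature locations it projects onto (the function name and data layout are hypothetical; real systems compute the projections from camera intrinsics and extrinsics):

```python
def lift_by_average(image_feats, projections, num_cells):
    """Simple BEV lifting: each BEV cell's feature is the plain average
    of the image features at all locations it projects to, with no depth
    estimates or learned weights."""
    bev = [0.0] * num_cells
    for cell, idxs in projections.items():
        if idxs:  # cells with no projected locations stay zero
            bev[cell] = sum(image_feats[i] for i in idxs) / len(idxs)
    return bev

# Cell 0 projects onto two image locations, cell 1 onto one.
feats = [2.0, 4.0, 6.0]
bev = lift_by_average(feats, {0: [0, 1], 1: [2]}, num_cells=2)
# bev == [3.0, 6.0]
```

The point of the baseline is that this unweighted average, with no depth distribution or attention, is already competitive, which makes it a useful reference when evaluating more elaborate lifting schemes.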
arXiv Detail & Related papers (2022-06-16T06:57:32Z)
- LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection [96.63947479020631]
In many real-world applications, the LiDAR sensors used by mass-produced robots and vehicles usually have fewer beams than those used to collect large-scale public datasets.
We propose the LiDAR Distillation to bridge the domain gap induced by different LiDAR beams for 3D object detection.
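One common way to expose a model to the beam-induced domain gap is to generate low-beam pseudo scans from high-beam data. The sketch below keeps an evenly spaced subset of beams; the function name and the even-spacing rule are illustrative assumptions, not the paper's exact downsampling procedure:

```python
def downsample_beams(points_by_beam, target_beams):
    """Simulate a lower-cost LiDAR by keeping an evenly spaced subset of
    beams from a high-beam scan (e.g. 64 beams -> 32 beams).
    points_by_beam: list of per-beam point lists, ordered by beam index."""
    src = len(points_by_beam)
    keep = {round(i * (src - 1) / (target_beams - 1))
            for i in range(target_beams)}
    return [pts for b, pts in enumerate(points_by_beam) if b in keep]

# A 64-beam scan reduced to a 32-beam pseudo scan.
scan = [[(0.0, 0.0, b * 0.1)] for b in range(64)]
sparse = downsample_beams(scan, 32)
```

Training a teacher on the dense scans and distilling into a student that sees only the pseudo low-beam scans lets the student transfer to real low-beam sensors.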
arXiv Detail & Related papers (2022-03-28T17:59:02Z)
- MonoDistill: Learning Spatial Features for Monocular 3D Object Detection [80.74622486604886]
We propose a simple and effective scheme to introduce the spatial information from LiDAR signals to the monocular 3D detectors.
We use the resulting data to train a 3D detector with the same architecture as the baseline model.
Experimental results show that the proposed method can significantly boost the performance of the baseline model.
arXiv Detail & Related papers (2022-01-26T09:21:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.