SMART: Advancing Scalable Map Priors for Driving Topology Reasoning
- URL: http://arxiv.org/abs/2502.04329v1
- Date: Thu, 06 Feb 2025 18:59:57 GMT
- Title: SMART: Advancing Scalable Map Priors for Driving Topology Reasoning
- Authors: Junjie Ye, David Paz, Hengyuan Zhang, Yuliang Guo, Xinyu Huang, Henrik I. Christensen, Yue Wang, Liu Ren
- Abstract summary: Topology reasoning is crucial for autonomous driving as it enables comprehensive understanding of connectivity and relationships between lanes and traffic elements.
Recent approaches have shown success in perceiving driving topology using vehicle-mounted sensors.
We identify that the key factor in scalable lane perception and topology reasoning is eliminating the reliance on training data captured by consistent sensor configurations.
- Score: 24.614973933683352
- Abstract: Topology reasoning is crucial for autonomous driving as it enables comprehensive understanding of connectivity and relationships between lanes and traffic elements. While recent approaches have shown success in perceiving driving topology using vehicle-mounted sensors, their scalability is hindered by the reliance on training data captured by consistent sensor configurations. We identify that the key factor in scalable lane perception and topology reasoning is the elimination of this sensor-dependent feature. To address this, we propose SMART, a scalable solution that leverages easily available standard-definition (SD) and satellite maps to learn a map prior model, supervised by large-scale geo-referenced high-definition (HD) maps independent of sensor settings. Owing to scaled training, SMART alone achieves superior offline lane topology understanding using only SD and satellite inputs. Extensive experiments further demonstrate that SMART can be seamlessly integrated into any online topology reasoning method, yielding significant improvements of up to 28% on the OpenLane-V2 benchmark.
Related papers
- TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z) - LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces [1.1137304094345333]
We design an effective prompting framework for Large Language Models (LLMs) on high-level reasoning tasks.
We also design two strategies to enhance performance with long sensor traces, including summarization before reasoning and selective inclusion of historical traces.
Our framework can be implemented in an edge-cloud setup, running small LLMs on the edge for data summarization and performing high-level reasoning on the cloud for privacy preservation.
arXiv Detail & Related papers (2024-03-28T22:06:04Z) - Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps [51.24861159115138]
Standard Definition (SD) maps are more affordable and have worldwide coverage, offering a scalable alternative.
We propose a novel framework to integrate SD maps into online map prediction, together with a Transformer-based encoder, SD Map Representations from transFormers.
This enhancement consistently and significantly boosts (by up to 60%) lane detection and topology prediction on current state-of-the-art online map prediction methods.
arXiv Detail & Related papers (2023-11-07T15:42:22Z) - Cognitive TransFuser: Semantics-guided Transformer-based Sensor Fusion for Improved Waypoint Prediction [38.971222477695214]
The RGB-LiDAR-based multi-task feature fusion network, coined Cognitive TransFuser, augments and exceeds the baseline network by a significant margin for safer and more complete road navigation.
We validate the proposed network on the Town05 Short and Town05 Long benchmarks through extensive experiments, achieving real-time inference at up to 44.2 FPS.
arXiv Detail & Related papers (2023-08-04T03:59:10Z) - Energy-Based Models for Cross-Modal Localization using Convolutional Transformers [52.27061799824835]
We present a novel framework for localizing a ground vehicle mounted with a range sensor against satellite imagery in the absence of GPS.
We propose a method using convolutional transformers that performs accurate metric-level localization in a cross-modal manner.
We train our model end-to-end and demonstrate our approach achieving higher accuracy than the state-of-the-art on KITTI, Pandaset, and a custom dataset.
arXiv Detail & Related papers (2023-06-06T21:27:08Z) - Neurosymbolic hybrid approach to driver collision warning [64.02492460600905]
There are two main algorithmic approaches to autonomous driving systems.
Deep learning alone has achieved state-of-the-art results in many areas.
However, a deep learning model can be very difficult to debug when it does not work.
arXiv Detail & Related papers (2022-03-28T20:29:50Z) - Efficient and Robust LiDAR-Based End-to-End Navigation [132.52661670308606]
We present an efficient and robust LiDAR-based end-to-end navigation framework.
We propose Fast-LiDARNet that is based on sparse convolution kernel optimization and hardware-aware model design.
We then propose Hybrid Evidential Fusion that directly estimates the uncertainty of the prediction from only a single forward pass.
arXiv Detail & Related papers (2021-05-20T17:52:37Z) - Plants Don't Walk on the Street: Common-Sense Reasoning for Reliable Semantic Segmentation [0.7696728525672148]
We propose to use a partly human-designed, partly learned set of rules to describe relations between objects of a traffic scene on a high level of abstraction.
In doing so, we improve and robustify existing deep neural networks consuming low-level sensor information.
arXiv Detail & Related papers (2021-04-19T12:51:06Z) - Lite-HDSeg: LiDAR Semantic Segmentation Using Lite Harmonic Dense Convolutions [2.099922236065961]
We present Lite-HDSeg, a novel real-time convolutional neural network for semantic segmentation of full 3D LiDAR point clouds.
Our experimental results show that the proposed method outperforms state-of-the-art semantic segmentation approaches that can run in real time.
arXiv Detail & Related papers (2021-03-16T04:54:57Z) - A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectory information for driving behavior recognition.
We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.