MapNeXt: Revisiting Training and Scaling Practices for Online Vectorized
HD Map Construction
- URL: http://arxiv.org/abs/2401.07323v1
- Date: Sun, 14 Jan 2024 16:14:36 GMT
- Authors: Toyota Li
- Abstract summary: We present a full-scale upgrade of MapTR and propose MapNeXt, the next generation of HD map learning architecture.
MapNeXt-Huge achieves state-of-the-art performance on the challenging nuScenes benchmark.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-Definition (HD) maps are pivotal to autopilot navigation. Integrating
the capability of lightweight HD map construction at runtime into a
self-driving system has recently emerged as a promising direction. In this surge,
vision-only perception stands out, as a camera rig can still perceive the
stereo information, let alone its appealing signature of portability and
economy. The latest MapTR architecture solves the online HD map construction
task in an end-to-end fashion but its potential is yet to be explored. In this
work, we present a full-scale upgrade of MapTR and propose MapNeXt, the next
generation of HD map learning architecture, delivering major contributions from
the model training and scaling perspectives. After shedding light on the
training dynamics of MapTR and exploiting the supervision from map elements
thoroughly, MapNeXt-Tiny raises the mAP of MapTR-Tiny from 49.0% to 54.8%,
without any architectural modifications. Enjoying the fruit of map segmentation
pre-training, MapNeXt-Base further lifts the mAP to 63.9%, which already
outperforms the prior art, a multi-modality MapTR, by 1.4% while being
$\sim1.8\times$ faster. Towards pushing the performance frontier to the next
level, we draw two conclusions on practical model scaling: an increased number
of queries favors a larger decoder network for adequate digestion; a large backbone
steadily promotes the final accuracy without bells and whistles. Building upon
these two rules of thumb, MapNeXt-Huge achieves state-of-the-art performance on
the challenging nuScenes benchmark. Specifically, we push the mapless
vision-only single-model performance to be over 78% for the first time,
exceeding the best model from existing methods by 16%.
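The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it assumes the reported margins (1.4% and 16%) are absolute percentage points of mAP, which the abstract does not state explicitly.

```python
# mAP values quoted in the abstract (nuScenes, in percent).
MAPTR_TINY = 49.0
MAPNEXT_TINY = 54.8
MAPNEXT_BASE = 63.9
BASE_MARGIN = 1.4        # assumed to be absolute points over the prior art
MAPNEXT_HUGE_MIN = 78.0  # the abstract only says "over 78%"


def gain_points(new: float, old: float) -> float:
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(new - old, 1)


def gain_relative(new: float, old: float) -> float:
    """Relative gain as a percentage of the old value, rounded to one decimal."""
    return round(100.0 * (new - old) / old, 1)


# Tiny model: improvement from training changes alone.
tiny_gain = gain_points(MAPNEXT_TINY, MAPTR_TINY)
tiny_rel = gain_relative(MAPNEXT_TINY, MAPTR_TINY)

# Prior-art mAP implied by "outperforms ... by 1.4%" (under the assumption above).
implied_prior_art = round(MAPNEXT_BASE - BASE_MARGIN, 1)

# Lower bound on the Huge model's margin over that implied prior art.
huge_margin_min = gain_points(MAPNEXT_HUGE_MIN, implied_prior_art)

print(f"Tiny gain: {tiny_gain} points ({tiny_rel}% relative)")
print(f"Implied prior-art mAP: {implied_prior_art}")
print(f"Huge margin (lower bound): {huge_margin_min} points")
```

Under this reading, the implied prior-art mAP is 62.5%, and a result just above 78% gives a margin of at least 15.5 points, which is consistent with the roughly 16% quoted.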
Related papers
- TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z)
- Enhancing Online Road Network Perception and Reasoning with Standard Definition Maps [14.535963852751635]
We focus on leveraging lightweight and scalable priors, namely Standard Definition (SD) maps, in the development of online vectorized HD map representations.
A key finding is that SD map encoders are model agnostic and can be quickly adapted to new architectures that utilize bird's eye view (BEV) encoders.
Our results show that making use of SD maps as priors for the online mapping task can significantly speed up convergence and boost the performance of the online centerline perception task by 30% (mAP).
arXiv Detail & Related papers (2024-08-01T19:39:55Z)
- Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps [51.24861159115138]
Standard Definition (SD) maps are more affordable and have worldwide coverage, offering a scalable alternative.
We propose a novel framework to integrate SD maps into online map prediction and propose a Transformer-based encoder, SD Map Representations from transFormers.
This enhancement consistently and significantly boosts (by up to 60%) lane detection and topology prediction on current state-of-the-art online map prediction methods.
arXiv Detail & Related papers (2023-11-07T15:42:22Z)
- MachMap: End-to-End Vectorized Solution for Compact HD-Map Construction [24.517848530666907]
This report introduces the 1st place winning solution for the Autonomous Driving Challenge 2023 - Online HD-map Construction.
We elaborate an effective architecture, termed as MachMap, which formulates the task of HD-map construction as the point detection paradigm.
arXiv Detail & Related papers (2023-06-17T09:06:48Z)
- SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding [57.108301842535894]
We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images.
We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images.
SNAP can resolve the location of challenging image queries beyond the reach of traditional methods.
arXiv Detail & Related papers (2023-06-08T17:54:47Z)
- InstaGraM: Instance-level Graph Modeling for Vectorized HD Map Learning [6.062751776009753]
We propose an online HD map learning framework that detects HD map elements from onboard sensor observations.
InstaGraM, instance-level graph modeling of HD maps, brings accurate and fast end-to-end vectorized HD map learning.
Our proposed network outperforms previous models by up to 13.7 mAP with up to 33.8X faster time.
arXiv Detail & Related papers (2023-01-10T08:15:35Z)
- HDMapNet: An Online HD Map Construction and Evaluation Framework [23.19001503634617]
HD map construction is a crucial problem for autonomous driving.
Traditional HD maps are coupled with centimeter-level accurate localization which is unreliable in many scenarios.
Online map learning is a more scalable way to provide semantic and geometry priors to self-driving vehicles.
arXiv Detail & Related papers (2021-07-13T18:06:46Z)
- HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps [81.86923212296863]
HD maps are maps with precise definitions of road lanes with rich semantics of the traffic rules.
Only a small number of real-world road topologies and geometries are available, which significantly limits our ability to test out the self-driving stack.
We propose HDMapGen, a hierarchical graph generation model capable of producing high-quality and diverse HD maps.
arXiv Detail & Related papers (2021-06-28T17:59:30Z)
- MP3: A Unified Model to Map, Perceive, Predict and Plan [84.07678019017644]
MP3 is an end-to-end approach to mapless driving where the input is raw sensor data and a high-level command.
We show that our approach is significantly safer, more comfortable, and can follow commands better than the baselines in challenging long-term closed-loop simulations.
arXiv Detail & Related papers (2021-01-18T00:09:30Z)
- HDNET: Exploiting HD Maps for 3D Object Detection [99.49035895393934]
We show that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors.
We design a single stage detector that extracts geometric and semantic features from the HD maps.
As maps might not be available everywhere, we also propose a map prediction module that estimates the map on the fly from raw LiDAR data.
arXiv Detail & Related papers (2020-12-21T21:59:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.