FastMap: Fast Queries Initialization Based Vectorized HD Map Reconstruction Framework
- URL: http://arxiv.org/abs/2503.05492v1
- Date: Fri, 07 Mar 2025 15:01:55 GMT
- Title: FastMap: Fast Queries Initialization Based Vectorized HD Map Reconstruction Framework
- Authors: Haotian Hu, Jingwei Xu, Fanyi Wang, Toyota Li, Yaonong Wang, Laifeng Hu, Zhiwang Zhang
- Abstract summary: FastMap is an innovative framework designed to reduce decoder redundancy in existing approaches. Our framework eliminates the conventional practice of randomly initializing queries and instead incorporates a heatmap-guided query generation module. FastMap achieves state-of-the-art performance on both the nuScenes and Argoverse2 datasets, with its decoder operating 3.2× faster than the baseline.
- Score: 8.28438975701346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstruction of high-definition maps is a crucial task in perceiving the autonomous driving environment, as its accuracy directly impacts the reliability of prediction and planning capabilities in downstream modules. Current vectorized map reconstruction methods based on the DETR framework encounter limitations due to the redundancy in the decoder structure, necessitating the stacking of six decoder layers to maintain performance, which significantly hampers computational efficiency. To tackle this issue, we introduce FastMap, an innovative framework designed to reduce decoder redundancy in existing approaches. FastMap optimizes the decoder architecture by employing a single-layer, two-stage transformer that achieves multilevel representation capabilities. Our framework eliminates the conventional practice of randomly initializing queries and instead incorporates a heatmap-guided query generation module during the decoding phase, which effectively maps image features into structured query vectors using learnable positional encoding. Additionally, we propose a geometry-constrained point-to-line loss mechanism for FastMap, which adeptly addresses the challenge of distinguishing highly homogeneous features that often arise in traditional point-to-point loss computations. Extensive experiments demonstrate that FastMap achieves state-of-the-art performance on both the nuScenes and Argoverse2 datasets, with its decoder operating 3.2× faster than the baseline. Code and more demos are available at https://github.com/hht1996ok/FastMap.
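The abstract names two ideas without giving formulas: picking query locations from a heatmap instead of initializing them randomly, and penalizing predicted points by their distance to the ground-truth line rather than to a specific ground-truth point. The following is a minimal, hedged sketch of both ideas in plain Python. It is not taken from the FastMap code; every function name (`topk_heatmap_positions`, `point_to_segment_distance`, `point_to_line_loss`) and the choice of a mean nearest-segment distance are assumptions made purely for illustration.

```python
import math


def topk_heatmap_positions(heatmap, k):
    """Return (row, col) of the k highest-scoring cells of a 2D heatmap.

    Hypothetical stand-in for heatmap-guided query initialization:
    the selected cells would seed the decoder queries.
    """
    cells = [(score, r, c)
             for r, row in enumerate(heatmap)
             for c, score in enumerate(row)]
    cells.sort(key=lambda item: item[0], reverse=True)
    return [(r, c) for _, r, c in cells[:k]]


def point_to_segment_distance(p, a, b):
    """Euclidean distance from 2D point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: fall back to point-to-point distance.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_to_line_loss(pred_points, gt_polyline):
    """Mean distance from each predicted point to its nearest ground-truth segment.

    Assumed formulation for illustration only; the paper's loss may differ.
    """
    segments = list(zip(gt_polyline[:-1], gt_polyline[1:]))
    total = 0.0
    for p in pred_points:
        total += min(point_to_segment_distance(p, a, b) for a, b in segments)
    return total / len(pred_points)


if __name__ == "__main__":
    heatmap = [[0.1, 0.9],
               [0.8, 0.2]]
    print(topk_heatmap_positions(heatmap, 2))

    gt = [(0.0, 0.0), (10.0, 0.0)]
    on_line = [(1.0, 0.0), (4.0, 0.0), (9.0, 0.0)]
    off_line = [(2.0, 1.0), (5.0, -1.0), (12.0, 0.0)]
    print(point_to_line_loss(on_line, gt))
    print(point_to_line_loss(off_line, gt))
```

In this sketch, predicted points that slide along the ground-truth line incur no penalty, whereas a point-to-point loss would penalize them for not coinciding with particular sampled ground-truth points. That is one plausible reading of why a point-to-line term helps with highly homogeneous features along a map element.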
Related papers
- Uni-PrevPredMap: Extending PrevPredMap to a Unified Framework of Prior-Informed Modeling for Online Vectorized HD Map Construction [9.166949877822807]
We present Uni-PrevPredMap, a unified prior-informed framework that integrates previous predictions and simulated outdated HD maps.
Uni-PrevPredMap achieves state-of-the-art performance in map-absent scenarios across established online vectorized HD map construction benchmarks.
arXiv Detail & Related papers (2025-04-09T07:36:17Z)
- ADMap: Anti-disturbance framework for reconstructing online vectorized HD map [9.218463154577616]
This paper proposes the Anti-disturbance Map reconstruction framework (ADMap).
To mitigate point-order jitter, the framework consists of three modules: Multi-Scale Perception Neck, Instance Interactive Attention (IIA), and Vector Direction Difference Loss (VDDL).
arXiv Detail & Related papers (2024-01-24T01:37:27Z)
- StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction [36.1596833523566]
We present StreamMapNet, a novel online mapping pipeline adept at long-sequence temporal modeling of videos.
StreamMapNet employs multi-point attention and temporal information which empowers the construction of large-range local HD maps with high stability.
arXiv Detail & Related papers (2023-08-24T05:22:43Z)
- ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation [50.01244854344167]
We bridge the performance gap between sparse and dense detectors by proposing Adaptive Sparse Anchor Generator (ASAG).
ASAG predicts dynamic anchors on patches rather than grids in a sparse way so that it alleviates the feature conflict problem.
Our method outperforms dense-initialized ones and achieves a better speed-accuracy trade-off.
arXiv Detail & Related papers (2023-08-18T02:06:49Z)
- MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction [40.07726377230152]
A high-definition (HD) map provides abundant and precise static environmental information about the driving scene.
We present Map TRansformer, an end-to-end framework for online vectorized HD map construction.
arXiv Detail & Related papers (2023-08-10T17:56:53Z)
- DETR Doesn't Need Multi-Scale or Locality Design [69.56292005230185]
This paper presents an improved DETR detector that maintains a "plain" nature.
It uses a single-scale feature map and global cross-attention calculations without specific locality constraints.
We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints.
arXiv Detail & Related papers (2023-08-03T17:59:04Z)
- MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction [33.30177029735497]
MapTR is a structured end-to-end framework for efficient online vectorized HD map construction.
MapTR achieves the best performance and efficiency among existing vectorized map construction approaches.
arXiv Detail & Related papers (2022-08-30T17:55:59Z)
- Learning to Localize Through Compressed Binary Maps [83.03367511221437]
We learn to compress the map representation such that it is optimal for the localization task.
Our experiments show that it is possible to learn a task-specific compression which reduces storage requirements by two orders of magnitude over general-purpose codecs.
arXiv Detail & Related papers (2020-12-20T14:47:15Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose one novel holistically-guided decoder which is introduced to obtain the high-resolution semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
- Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems.
We present three novel scenarios for localization and mapping which require the continuous update of feature representations.
Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
- EfficientFCN: Holistically-guided Decoding for Semantic Segmentation [49.27021844132522]
State-of-the-art semantic segmentation algorithms are mostly based on dilated Fully Convolutional Networks (dilatedFCN).
We propose the EfficientFCN, whose backbone is a common ImageNet pre-trained network without any dilated convolution.
Such a framework achieves comparable or even better performance than state-of-the-art methods with only 1/3 of the computational cost.
arXiv Detail & Related papers (2020-08-24T14:48:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.