MambaMap: Online Vectorized HD Map Construction using State Space Model
- URL: http://arxiv.org/abs/2507.20224v1
- Date: Sun, 27 Jul 2025 11:09:27 GMT
- Title: MambaMap: Online Vectorized HD Map Construction using State Space Model
- Authors: Ruizi Yang, Xiaolu Liu, Junbo Chen, Jianke Zhu
- Abstract summary: MambaMap is a novel framework that efficiently fuses long-range temporal features in the state space to construct online vectorized HD maps. Specifically, MambaMap incorporates a memory bank to store and utilize information from historical frames. In addition, we design innovative multi-directional and spatial-temporal scanning strategies to enhance feature extraction at both BEV and instance levels.
- Score: 11.15033113060733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-definition (HD) maps are essential for autonomous driving, as they provide precise road information for downstream tasks. Recent advances highlight the potential of temporal modeling in addressing challenges like occlusions and extended perception range. However, existing methods either fail to fully exploit temporal information or incur substantial computational overhead in handling extended sequences. To tackle these challenges, we propose MambaMap, a novel framework that efficiently fuses long-range temporal features in the state space to construct online vectorized HD maps. Specifically, MambaMap incorporates a memory bank to store and utilize information from historical frames, dynamically updating BEV features and instance queries to improve robustness against noise and occlusions. Moreover, we introduce a gating mechanism in the state space, selectively integrating dependencies of map elements with high computational efficiency. In addition, we design innovative multi-directional and spatial-temporal scanning strategies to enhance feature extraction at both BEV and instance levels. These strategies significantly boost the prediction accuracy of our approach while ensuring robust temporal consistency. Extensive experiments on the nuScenes and Argoverse2 datasets demonstrate that our proposed MambaMap approach outperforms state-of-the-art methods across various splits and perception ranges. Source code will be available at https://github.com/ZiziAmy/MambaMap.
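The abstract names three ingredients: a memory bank of historical frames, a gating mechanism in the state space, and multi-directional scanning. The sketch below is a toy illustration of those ideas only, not the authors' implementation. The names `scan_orders`, `gated_scan`, `gate_weight`, and `gate_bias` are hypothetical, features are reduced to scalars, and the four scan directions are an assumption, since the abstract does not specify them.

```python
import math
from collections import deque


def scan_orders(height, width):
    """Return four visiting orders over an H x W grid as lists of (row, col).

    Assumption: row-major and column-major traversals, each forward and
    reversed. The abstract does not say which directions MambaMap uses.
    """
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


def gated_scan(sequence, gate_weight=0.0, gate_bias=0.0):
    """Input-dependent gated linear recurrence over scalar features.

    h_t = a_t * h_{t-1} + (1 - a_t) * x_t, with a_t = sigmoid(w * x_t + b).
    This is a simplified stand-in for a selective state-space update.
    """
    state = 0.0
    states = []
    for x in sequence:
        keep = 1.0 / (1.0 + math.exp(-(gate_weight * x + gate_bias)))
        state = keep * state + (1.0 - keep) * x
        states.append(state)
    return states


# A fixed-size memory bank: the oldest frame is dropped once it is full.
memory_bank = deque(maxlen=3)
for frame_feature in [1.0, 2.0, 4.0, 8.0]:
    memory_bank.append(frame_feature)

# Fuse the stored history in temporal order with the gated recurrence.
fused = gated_scan(memory_bank)
orders = scan_orders(2, 3)
```

With the default gate parameters the gate is a constant 0.5, so each state is the average of the previous state and the new input. A learned, input-dependent gate would let the recurrence keep or discard history per frame.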
Related papers
- RTMap: Real-Time Recursive Mapping with Change Detection and Localization [8.343318095882232]
RTMap persistently crowdsources a multi-traversal HD map as a self-evolutional memory. On onboard agents, RTMap simultaneously addresses three core challenges in an end-to-end fashion. Experiments on several public autonomous driving datasets demonstrate our solid performance on both the prior-aided map quality and the localization accuracy.
arXiv Detail & Related papers (2025-07-01T17:32:30Z)
- MapExpert: Online HD Map Construction with Simple and Efficient Sparse Map Element Expert [7.086030137483952]
We introduce an expert-based online HD map method, termed MapExpert. MapExpert utilizes sparse experts, distributed by our routers, to describe various non-cubic map elements accurately.
arXiv Detail & Related papers (2024-12-17T09:19:44Z)
- Unveiling the Hidden: Online Vectorized HD Map Construction with Clip-Level Token Interaction and Propagation [14.480713752871521]
MapUnveiler is a novel paradigm of clip-level vectorized HD map construction.
It unveils the occluded map elements within a clip input by relating dense image representations with efficient clip tokens.
MapUnveiler associates inter-clip information through clip token propagation, effectively utilizing long-term temporal map information.
arXiv Detail & Related papers (2024-11-17T08:38:18Z)
- Neural Semantic Map-Learning for Autonomous Vehicles [85.8425492858912]
We present a mapping system that fuses local submaps gathered from a fleet of vehicles at a central instance to produce a coherent map of the road environment.
Our method jointly aligns and merges the noisy and incomplete local submaps using a scene-specific Neural Signed Distance Field.
We leverage memory-efficient sparse feature-grids to scale to large areas and introduce a confidence score to model uncertainty in scene reconstruction.
arXiv Detail & Related papers (2024-10-10T10:10:03Z)
- HRVMamba: High-Resolution Visual State Space Model for Dense Prediction [60.80423207808076]
State Space Models (SSMs) with efficient hardware-aware designs have demonstrated significant potential in computer vision tasks.
These models have been constrained by three key challenges: insufficient inductive bias, long-range forgetting, and low-resolution output representation.
We introduce the Dynamic Visual State Space (DVSS) block, which employs deformable convolution to mitigate the long-range forgetting problem.
We also introduce High-Resolution Visual State Space Model (HRVMamba) based on the DVSS block, which preserves high-resolution representations throughout the entire process.
arXiv Detail & Related papers (2024-10-04T06:19:29Z)
- MemFusionMap: Working Memory Fusion for Online Vectorized HD Map Construction [6.743612231580936]
We propose a novel temporal fusion model with enhanced temporal reasoning capabilities for online HD map construction.
Specifically, we contribute a working memory fusion module that improves the model's memory capacity to reason across a history of frames.
We also design a novel temporal overlap heatmap to explicitly inform the model about the temporal overlap information and vehicle trajectory.
arXiv Detail & Related papers (2024-09-26T03:16:39Z)
- SIGMA: Selective Gated Mamba for Sequential Recommendation [56.85338055215429]
Mamba, a recent advancement, has exhibited exceptional performance in time series prediction. We introduce a new framework named Selective Gated Mamba (SIGMA) for Sequential Recommendation. Our results indicate that SIGMA outperforms current models on five real-world datasets.
arXiv Detail & Related papers (2024-08-21T09:12:59Z)
- ADMap: Anti-disturbance framework for reconstructing online vectorized HD map [9.218463154577616]
This paper proposes the Anti-disturbance Map reconstruction framework (ADMap).
To mitigate point-order jitter, the framework consists of three modules: Multi-Scale Perception Neck, Instance Interactive Attention (IIA), and Vector Direction Difference Loss (VDDL).
arXiv Detail & Related papers (2024-01-24T01:37:27Z)
- StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction [36.1596833523566]
We present StreamMapNet, a novel online mapping pipeline adept at long-sequence temporal modeling of videos.
StreamMapNet employs multi-point attention and temporal information which empowers the construction of large-range local HD maps with high stability.
arXiv Detail & Related papers (2023-08-24T05:22:43Z)
- Online Map Vectorization for Autonomous Driving: A Rasterization Perspective [58.71769343511168]
We introduce a new rasterization-based evaluation metric, which has superior sensitivity and is better suited to real-world autonomous driving scenarios.
We also propose MapVR (Map Vectorization via Rasterization), a novel framework that applies differentiable rasterization to vectorized outputs and then performs precise, geometry-aware supervision on HD maps.
arXiv Detail & Related papers (2023-06-18T08:51:14Z)
- AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition [68.70214388982545]
Temporal modelling is the key for efficient video action recognition.
We introduce an adaptive temporal fusion network, called AdaFuse, that fuses channels from current and past feature maps.
Our approach can achieve about 40% computation savings with comparable accuracy to state-of-the-art methods.
arXiv Detail & Related papers (2021-02-10T23:31:02Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of temporal and spatial information.
We show that the proposed method outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- Radar-based Dynamic Occupancy Grid Mapping and Object Detection [55.74894405714851]
In recent years, the classical occupancy grid map approach has been extended to dynamic occupancy grid maps.
This paper presents the further development of a previous approach.
The data of multiple radar sensors are fused, and a grid-based object tracking and mapping method is applied.
arXiv Detail & Related papers (2020-08-09T09:26:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.