IC-Mapper: Instance-Centric Spatio-Temporal Modeling for Online Vectorized Map Construction
- URL: http://arxiv.org/abs/2503.03882v1
- Date: Wed, 05 Mar 2025 20:28:34 GMT
- Title: IC-Mapper: Instance-Centric Spatio-Temporal Modeling for Online Vectorized Map Construction
- Authors: Jiangtong Zhu, Zhao Yang, Yinan Shi, Jianwu Fang, Jianru Xue
- Abstract summary: IC-Mapper is an instance-centric online mapping framework, which comprises two primary components. We perform point sampling on the historical global map from a spatial dimension and integrate it with the detection results of instances corresponding to the current frame to achieve real-time expansion and update of the map.
- Score: 18.975185033472968
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Online vector map construction based on visual data can bypass the processes of data collection, post-processing, and manual annotation required by traditional map construction, which significantly enhances map-building efficiency. However, existing work treats the online mapping task as a local range perception task, overlooking the spatial scalability required for map construction. We propose IC-Mapper, an instance-centric online mapping framework, which comprises two primary components: 1) Instance-centric temporal association module: For the detection queries of adjacent frames, we measure them in both feature and geometric dimensions to obtain the matching correspondence between instances across frames. 2) Instance-centric spatial fusion module: We perform point sampling on the historical global map from a spatial dimension and integrate it with the detection results of instances corresponding to the current frame to achieve real-time expansion and update of the map. Based on the nuScenes dataset, we evaluate our approach on detection, tracking, and global mapping metrics. Experimental results demonstrate the superiority of IC-Mapper against other state-of-the-art methods. Code will be released on https://github.com/Brickzhuantou/IC-Mapper.
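The temporal association step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cost terms (cosine distance on query features, symmetric Chamfer distance on polyline points), the weights `w_feat` and `w_geom`, the `max_cost` gate, and the greedy one-to-one assignment are all assumptions chosen for illustration, as are the `feat` and `points` field names.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors (assumed feature cost)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two 2D point lists (assumed geometric cost)."""
    def one_way(src, dst):
        return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)
    return 0.5 * (one_way(p, q) + one_way(q, p))


def associate(prev, curr, w_feat=1.0, w_geom=1.0, max_cost=2.0):
    """Match current-frame instances to previous-frame instances.

    Each instance is a dict with hypothetical keys "feat" (feature vector)
    and "points" (list of (x, y) tuples in a shared coordinate frame).
    Returns a dict mapping current index -> previous index. Current
    instances left out of the dict are treated as newly observed.
    """
    candidates = []
    for i, p in enumerate(prev):
        for j, c in enumerate(curr):
            cost = (w_feat * cosine_distance(p["feat"], c["feat"])
                    + w_geom * chamfer_distance(p["points"], c["points"]))
            if cost <= max_cost:  # gate out implausible pairs
                candidates.append((cost, i, j))

    # Greedy one-to-one assignment, cheapest pairs first.
    matches, used_prev = {}, set()
    for cost, i, j in sorted(candidates):
        if i in used_prev or j in matches:
            continue
        matches[j] = i
        used_prev.add(i)
    return matches
```

A greedy assignment keeps the sketch dependency-free; an optimal one-to-one assignment such as the Hungarian algorithm could be substituted without changing the cost definition.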
Related papers
- InteractionMap: Improving Online Vectorized HDMap Construction with Interaction [0.4551615447454768]
State-of-the-art map vectorization methods are mainly based on DETR-like framework to generate HD maps in an end-to-end manner.
In this paper, we propose InteractionMap, which improves previous map vectorization methods by fully leveraging local-to-global information interaction.
arXiv Detail & Related papers (2025-03-27T16:23:15Z) - HisTrackMap: Global Vectorized High-Definition Map Construction via History Map Tracking [24.21124150354725]
We propose a novel end-to-end tracking framework for global map construction by temporally tracking map elements' historical trajectories.
We introduce a Map-Trajectory Prior Fusion module within this tracking framework, leveraging historical priors for tracked instances to improve temporal smoothness and continuity.
Substantial experiments on the nuScenes and Argoverse2 datasets demonstrate that the proposed method outperforms state-of-the-art (SOTA) methods in both single-frame and temporal metrics.
arXiv Detail & Related papers (2025-03-10T10:44:43Z) - TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z) - Neural Semantic Map-Learning for Autonomous Vehicles [85.8425492858912]
We present a mapping system that fuses local submaps gathered from a fleet of vehicles at a central instance to produce a coherent map of the road environment.
Our method jointly aligns and merges the noisy and incomplete local submaps using a scene-specific Neural Signed Distance Field.
We leverage memory-efficient sparse feature-grids to scale to large areas and introduce a confidence score to model uncertainty in scene reconstruction.
arXiv Detail & Related papers (2024-10-10T10:10:03Z) - MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction [75.93907511203317]
We propose MGMapNet (Multi-Granularity Map Network) to model map elements with a multi-granularity representation.
The proposed MGMapNet achieves state-of-the-art performance, surpassing MapTRv2 by 5.3 mAP on nuScenes and 4.4 mAP on Argoverse2, respectively.
arXiv Detail & Related papers (2024-10-10T09:05:23Z) - DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction [20.6143278960295]
This paper focuses on temporal instance consistency and temporal map consistency learning.
DTCLMapper is a dual-stream temporal consistency learning module that combines instance embedding with geometry maps.
Experiments on well-recognized benchmarks indicate that the proposed DTCLMapper achieves state-of-the-art performance in vectorized mapping tasks.
arXiv Detail & Related papers (2024-05-09T02:58:55Z) - InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping [41.59891369655983]
InsMapper harnesses inner-instance information for vectorized high-definition mapping through transformers.
InsMapper surpasses the previous state-of-the-art method, demonstrating its effectiveness and generality.
arXiv Detail & Related papers (2023-08-16T17:58:28Z) - MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction [40.07726377230152]
High-definition (HD) maps provide abundant and precise static environmental information of the driving scene.
We present Map TRansformer, an end-to-end framework for online vectorized HD map construction.
arXiv Detail & Related papers (2023-08-10T17:56:53Z) - TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement [64.11385310305612]
We present a novel model for Tracking Any Point (TAP) that effectively tracks any queried point on any physical surface throughout a video sequence.
Our approach employs two stages: (1) a matching stage, which independently locates a suitable candidate point match for the query point on every other frame, and (2) a refinement stage, which updates both the trajectory and query features based on local correlations.
The resulting model surpasses all baseline methods by a significant margin on the TAP-Vid benchmark, as demonstrated by an approximate 20% absolute average Jaccard (AJ) improvement on DAVIS.
arXiv Detail & Related papers (2023-06-14T17:07:51Z) - 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z) - Graph Sampling Based Deep Metric Learning for Generalizable Person Re-Identification [114.56752624945142]
We argue that the most popular random sampling method, the well-known PK sampler, is not informative and efficient for deep metric learning.
We propose an efficient mini batch sampling method called Graph Sampling (GS) for large-scale metric learning.
arXiv Detail & Related papers (2021-04-04T06:44:15Z) - Enabling Visual Action Planning for Object Manipulation through Latent Space Roadmap [72.01609575400498]
We present a framework for visual action planning of complex manipulation tasks with high-dimensional state spaces.
We propose a Latent Space Roadmap (LSR) for task planning, a graph-based structure capturing globally the system dynamics in a low-dimensional latent space.
We present a thorough investigation of our framework on two simulated box stacking tasks and a folding task executed on a real robot.
arXiv Detail & Related papers (2021-03-03T17:48:26Z) - Rethinking Localization Map: Towards Accurate Object Perception with Self-Enhancement Maps [78.2581910688094]
This work introduces a novel self-enhancement method to harvest accurate object localization maps and object boundaries with only category labels as supervision.
In particular, the proposed Self-Enhancement Maps achieve the state-of-the-art localization accuracy of 54.88% on ILSVRC.
arXiv Detail & Related papers (2020-06-09T12:35:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.