Related papers: MapDiffusion: Generative Diffusion for Vectorized Online HD Map Construction and Uncertainty Estimation in Autonomous Driving

MapDiffusion: Generative Diffusion for Vectorized Online HD Map Construction and Uncertainty Estimation in Autonomous Driving

URL: http://arxiv.org/abs/2507.21423v1
Date: Tue, 29 Jul 2025 01:16:40 GMT
Title: MapDiffusion: Generative Diffusion for Vectorized Online HD Map Construction and Uncertainty Estimation in Autonomous Driving
Authors: Thomas Monninger, Zihan Zhang, Zhipeng Mo, Md Zafar Anwar, Steffen Staab, Sihao Ding,
Abstract summary: Autonomous driving requires an understanding of the static environment from sensor data.<n>Traditional map construction models provide deterministic point estimates.<n>We propose MapDiffusion, a novel generative approach that learns the full distribution of possible vectorized maps.
Score: 24.962900390344235
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Autonomous driving requires an understanding of the static environment from sensor data. Learned Bird's-Eye View (BEV) encoders are commonly used to fuse multiple inputs, and a vector decoder predicts a vectorized map representation from the latent BEV grid. However, traditional map construction models provide deterministic point estimates, failing to capture uncertainty and the inherent ambiguities of real-world environments, such as occlusions and missing lane markings. We propose MapDiffusion, a novel generative approach that leverages the diffusion paradigm to learn the full distribution of possible vectorized maps. Instead of predicting a single deterministic output from learned queries, MapDiffusion iteratively refines randomly initialized queries, conditioned on a BEV latent grid, to generate multiple plausible map samples. This allows aggregating samples to improve prediction accuracy and deriving uncertainty estimates that directly correlate with scene ambiguity. Extensive experiments on the nuScenes dataset demonstrate that MapDiffusion achieves state-of-the-art performance in online map construction, surpassing the baseline by 5% in single-sample performance. We further show that aggregating multiple samples consistently improves performance along the ROC curve, validating the benefit of distribution modeling. Additionally, our uncertainty estimates are significantly higher in occluded areas, reinforcing their value in identifying regions with ambiguous sensor input. By modeling the full map distribution, MapDiffusion enhances the robustness and reliability of online vectorized HD map construction, enabling uncertainty-aware decision-making for autonomous vehicles in complex environments.

Related papers

Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction [17.16231247910372]
UIGenMap is an uncertainty-instructed structure injection approach for generalizable HD map vectorization.<n>We introduce the perspective-view (PV) detection branch to obtain explicit structural features.<n>Experiments on challenging geographically disjoint (geo-based) data demonstrate that our UIGenMap achieves superior performance.
arXiv Detail & Related papers (2025-03-29T15:01:38Z)
VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization [108.68014173017583]
Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics for the environmental elements around the ego car. We propose to utilize a generative model similar to the Vector Quantized-Variational AutoEncoder (VQ-VAE) to acquire prior knowledge for the high-level BEV semantics in the tokenized discrete space. Thanks to the obtained BEV tokens accompanied with a codebook embedding encapsulating the semantics for different BEV elements in the groundtruth maps, we are able to directly align the sparse backbone image features with the obtained BEV tokens
arXiv Detail & Related papers (2024-11-03T16:09:47Z)
Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods. We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model. We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z)
SemVecNet: Generalizable Vector Map Generation for Arbitrary Sensor Configurations [3.8472678261304587]
We propose a modular pipeline for vector map generation with improved generalization to sensor configurations. By adopting a BEV semantic map robust to different sensor configurations, our proposed approach significantly improves the generalization performance.
arXiv Detail & Related papers (2024-04-30T23:45:16Z)
Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely-used scene representations for visual perception in Autonomous Vehicles (AVs) Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV. Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
EgoVM: Achieving Precise Ego-Localization using Lightweight Vectorized Maps [9.450650025266379]
We present EgoVM, an end-to-end localization network that achieves comparable localization accuracy to prior state-of-the-art methods. We employ a set of learnable semantic embeddings to encode the semantic types of map elements and supervise them with semantic segmentation. We adopt a robust histogram-based pose solver to estimate the optimal pose by searching exhaustively over candidate poses.
arXiv Detail & Related papers (2023-07-18T06:07:25Z)
Online Map Vectorization for Autonomous Driving: A Rasterization Perspective [58.71769343511168]
We introduce a newization-based evaluation metric, which has superior sensitivity and is better suited to real-world autonomous driving scenarios. We also propose MapVR (Map Vectorization via Rasterization), a novel framework that applies differentiableization to preciseized outputs and then performs geometry-aware supervision on HD maps.
arXiv Detail & Related papers (2023-06-18T08:51:14Z)
Generating detailed saliency maps using model-agnostic methods [0.0]
We focus on a model-agnostic explainability method called RISE, elaborate on observed shortcomings of its grid-based approach. modifications, collectively called VRISE (Voronoi-RISE), are meant to, respectively, improve the accuracy of maps generated using large occlusions. We compare accuracy of saliency maps produced by VRISE and RISE on the validation split of ILSVRC2012, using a saliency-guided content insertion/deletion metric and a localization metric based on bounding boxes.
arXiv Detail & Related papers (2022-09-04T21:34:46Z)
Robust Monocular Localization in Sparse HD Maps Leveraging Multi-Task Uncertainty Estimation [28.35592701148056]
We present a novel monocular localization approach based on a sliding-window pose graph. We propose an efficient multi-task uncertainty-aware perception module. Our approach enables robust and accurate 6D localization in challenging urban scenarios.
arXiv Detail & Related papers (2021-10-20T13:46:15Z)
PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences. We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction. Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z)
Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass. We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.