MapPrior: Bird's-Eye View Map Layout Estimation with Generative Models
- URL: http://arxiv.org/abs/2308.12963v1
- Date: Thu, 24 Aug 2023 17:58:30 GMT
- Title: MapPrior: Bird's-Eye View Map Layout Estimation with Generative Models
- Authors: Xiyue Zhu, Vlas Zyrianov, Zhijian Liu, Shenlong Wang
- Abstract summary: MapPrior is a novel BEV perception framework that combines a traditional BEV perception model with a learned generative model for semantic map layouts.
At the time of submission, MapPrior outperforms the strongest competing method, with significantly improved MMD and ECE scores in camera- and LiDAR-based BEV perception.
- Score: 24.681557413829317
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite tremendous advancements in bird's-eye view (BEV) perception, existing
models fall short in generating realistic and coherent semantic map layouts,
and they fail to account for uncertainties arising from partial sensor
information (such as occlusion or limited coverage). In this work, we introduce
MapPrior, a novel BEV perception framework that combines a traditional
discriminative BEV perception model with a learned generative model for
semantic map layouts. Our MapPrior delivers predictions with better accuracy,
realism, and uncertainty awareness. We evaluate our model on the large-scale
nuScenes benchmark. At the time of submission, MapPrior outperforms the
strongest competing method, with significantly improved MMD and ECE scores in
camera- and LiDAR-based BEV perception.
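For readers unfamiliar with the ECE (expected calibration error) score mentioned in the abstract, the sketch below shows the common binned definition: predictions are grouped by confidence, and the gap between average confidence and accuracy in each bin is averaged, weighted by bin size. This is an illustrative sketch only, not the paper's evaluation code. The function name `expected_calibration_error`, the choice of 10 equal-width bins, and the flat list inputs (for example, one entry per BEV grid cell) are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: size-weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probabilities in [0, 1] for the chosen class.
    correct: booleans, True where the prediction matched the label.
    n_bins: number of equal-width confidence bins (assumed value, not
            taken from the paper).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: [count, sum of confidences, number correct].
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of 1.0 is
        # placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += 1
        bins[idx][1] += conf
        bins[idx][2] += 1 if ok else 0

    ece = 0.0
    for count, conf_sum, n_correct in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        ece += (count / n) * abs(accuracy - avg_conf)
    return ece


# Four predictions made at 90% confidence, three of them correct:
# the model is overconfident by about 0.15.
score = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                   [True, True, True, False])
```

A lower value indicates better-calibrated confidence; a score of 0 means that, within every bin, the stated confidence matches the observed accuracy.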
Related papers
- BEVTraj: Map-Free End-to-End Trajectory Prediction in Bird's-Eye View with Deformable Attention and Sparse Goal Proposals [0.8166364251367625]
We propose Bird's-Eye View Trajectory Prediction (BEVTraj) for autonomous driving.
It operates directly in the bird's-eye view (BEV) space utilizing real-time sensor data without relying on pre-built maps.
It achieves performance comparable to state-of-the-art HD map-based models while offering greater flexibility.
arXiv Detail & Related papers (2025-09-12T09:17:54Z)
- MapDiffusion: Generative Diffusion for Vectorized Online HD Map Construction and Uncertainty Estimation in Autonomous Driving [24.962900390344235]
Autonomous driving requires an understanding of the static environment from sensor data.
Traditional map construction models provide deterministic point estimates.
We propose MapDiffusion, a novel generative approach that learns the full distribution of possible vectorized maps.
arXiv Detail & Related papers (2025-07-29T01:16:40Z)
- TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z)
- VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization [108.68014173017583]
Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics for the environmental elements around the ego car.
We propose to utilize a generative model similar to the Vector Quantized-Variational AutoEncoder (VQ-VAE) to acquire prior knowledge for the high-level BEV semantics in the tokenized discrete space.
Thanks to the obtained BEV tokens, accompanied by a codebook embedding that encapsulates the semantics of different BEV elements in the ground-truth maps, we are able to directly align the sparse backbone image features with the obtained BEV tokens.
arXiv Detail & Related papers (2024-11-03T16:09:47Z)
- Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data [3.1968751101341173]
Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks.
While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets.
We show that a more scalable approach towards generalizable map prediction can be enabled by using two large-scale crowd-sourced mapping platforms.
arXiv Detail & Related papers (2024-07-11T17:57:22Z)
- Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving [55.93813178692077]
We present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms.
We assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction.
Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data.
arXiv Detail & Related papers (2024-05-27T17:59:39Z)
- Zero-BEV: Zero-shot Projection of Any First-Person Modality to BEV Maps [13.524499163234342]
We propose a new model capable of performing zero-shot projections of any modality available in a first person view to the corresponding BEV map.
We experimentally show that the model outperforms competing methods, in particular the widely used baseline resorting to monocular depth estimation.
arXiv Detail & Related papers (2024-02-21T14:50:24Z)
- Diffusion-Based Particle-DETR for BEV Perception [94.88305708174796]
Bird-Eye-View (BEV) is one of the most widely used scene representations for visual perception in Autonomous Vehicles (AVs).
Recent diffusion-based methods offer a promising approach to uncertainty modeling for visual perception but fail to effectively detect small objects in the large coverage of the BEV.
Here, we address this problem by combining the diffusion paradigm with current state-of-the-art 3D object detectors in BEV.
arXiv Detail & Related papers (2023-12-18T09:52:14Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- "The Pedestrian next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping [45.94778766867247]
Estimating a semantically segmented bird's-eye-view map from a single image has become a popular technique for autonomous control and navigation.
We show an increase in localization error with distance from the camera.
We propose a graph neural network which predicts BEV objects from a monocular image by spatially reasoning about an object within the context of other objects.
arXiv Detail & Related papers (2022-04-06T17:23:13Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.