NeMO: Neural Map Growing System for Spatiotemporal Fusion in
Bird's-Eye-View and BDD-Map Benchmark
- URL: http://arxiv.org/abs/2306.04540v1
- Date: Wed, 7 Jun 2023 15:46:15 GMT
- Title: NeMO: Neural Map Growing System for Spatiotemporal Fusion in
Bird's-Eye-View and BDD-Map Benchmark
- Authors: Xi Zhu, Xiya Cao, Zhiwei Dong, Caifa Zhou, Qiangbo Liu, Wei Li,
Yongliang Wang
- Abstract summary: Vision-centric Bird's-Eye View representation is essential for autonomous driving systems.
This work outlines a new paradigm, named NeMO, which generates local maps using a readable and writable big map.
Assuming that the features of all BEV grids follow the same distribution, we adopt a single shared-weight neural network to update every grid of the big map.
- Score: 9.430779563669908
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Vision-centric Bird's-Eye View (BEV) representation is essential for
autonomous driving systems (ADS). Multi-frame temporal fusion which leverages
historical information has been demonstrated to provide more comprehensive
perception results. While most research focuses on ego-centric maps of fixed
settings, long-range local map generation remains less explored. This work
outlines a new paradigm, named NeMO, which generates local maps using a
readable and writable big map, a learning-based fusion module, and an
interaction mechanism between the two. Assuming that the features of all BEV
grids follow the same distribution, we adopt a single shared-weight neural
network to update every grid of the big map. This paradigm
supports the fusion of longer time series and the generation of long-range BEV
local maps. Furthermore, we release BDD-Map, a BDD100K-based dataset
incorporating map element annotations, including lane lines, boundaries, and
pedestrian crossings. Experiments on the NuScenes and BDD-Map datasets
demonstrate that NeMO outperforms state-of-the-art map segmentation methods. We
also provide a new scene-level BEV map evaluation setting along with the
corresponding baseline for a more comprehensive comparison.
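As a concrete illustration of the read-fuse-write loop the abstract describes, the following is a minimal PyTorch sketch. The BigMap container, the grid indexing, and the use of nn.GRUCell as the shared-weight updater are assumptions made for exposition, not the authors' published interface.

# Hypothetical sketch of a NeMO-style big-map update. Names and the
# GRU-based fusion are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

class BigMap:
    """A readable and writable global BEV feature map."""
    def __init__(self, height, width, channels):
        self.features = torch.zeros(height, width, channels)

    def read(self, ys, xs):
        # Fetch stored features for the grids touched by the current frame.
        return self.features[ys, xs]

    def write(self, ys, xs, feats):
        self.features[ys, xs] = feats

class GridFusion(nn.Module):
    """One shared-weight network applied identically to every BEV grid,
    reflecting the assumption that all grids share one feature
    distribution."""
    def __init__(self, channels):
        super().__init__()
        self.cell = nn.GRUCell(channels, channels)

    def forward(self, obs_feats, map_feats):
        # obs_feats, map_feats: (N, C) for the N grids seen this frame.
        return self.cell(obs_feats, map_feats)

C = 64
big_map = BigMap(1024, 1024, C)
fuse = GridFusion(C)

# One fusion step: read the touched grids, fuse them with the current
# BEV observation, and write the result back into the big map.
ys = torch.randint(0, 1024, (500,))
xs = torch.randint(0, 1024, (500,))
obs = torch.randn(500, C)  # per-grid BEV features from an image encoder
with torch.no_grad():
    big_map.write(ys, xs, fuse(obs, big_map.read(ys, xs)))

Because the updater is shared across grids, the same weights can grow a map of arbitrary extent, which is what supports the long-range local maps the abstract claims.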
Related papers
- TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior [70.84644266024571]
We propose to train a perception model to "see" standard definition maps (SDMaps).
We encode SDMap elements into neural spatial map representations and instance tokens, and then incorporate such complementary features as prior information.
Based on the lane segment representation framework, the model simultaneously predicts lanes, centrelines and their topology.
arXiv Detail & Related papers (2024-11-22T06:13:42Z)
- VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization [108.68014173017583]
Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics for the environmental elements around the ego car.
We propose to utilize a generative model similar to the Vector Quantized-Variational AutoEncoder (VQ-VAE) to acquire prior knowledge for the high-level BEV semantics in the tokenized discrete space.
Thanks to the obtained BEV tokens, accompanied by a codebook embedding that encapsulates the semantics of different BEV elements in the ground-truth maps, we can directly align the sparse backbone image features with these tokens (a minimal quantization sketch follows this entry).
arXiv Detail & Related papers (2024-11-03T16:09:47Z)
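The quantization step behind VQ-Map-style tokenization can be sketched in a few lines; the shapes and the plain nearest-neighbour lookup below are illustrative assumptions, not the paper's exact model, which learns the codebook end to end.

# Hypothetical nearest-codebook lookup for BEV tokenization.
import torch

def quantize(feats, codebook):
    # feats: (N, C) BEV grid features; codebook: (K, C) embeddings.
    dists = torch.cdist(feats, codebook)   # (N, K) pairwise distances
    tokens = dists.argmin(dim=1)           # discrete BEV token per grid
    return tokens, codebook[tokens]        # token ids and quantized feats

feats = torch.randn(100, 32)
codebook = torch.randn(512, 32)
tokens, quantized = quantize(feats, codebook)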
- Enhancing Vectorized Map Perception with Historical Rasterized Maps [37.48510990922406]
We propose HRMapNet, which leverages a low-cost historical rasterized map to enhance online vectorized map perception.
The historical map can be easily constructed from past predicted vectorized results and provides valuable complementary information (a toy rasterization sketch follows this entry).
HRMapNet can be integrated with most online vectorized map perception methods.
arXiv Detail & Related papers (2024-09-01T05:22:33Z)
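To make the "historical rasterized map" idea concrete, here is a toy rasterizer that burns past predicted polylines into a BEV grid; the resolution, sampling scheme, and single-class grid are assumptions, not HRMapNet's actual pipeline.

# Hypothetical rasterization of past vectorized predictions into a
# single-channel historical map (assumed 0.15 m/cell resolution).
import numpy as np

def rasterize(polylines, shape, res=0.15):
    # polylines: list of (N, 2) arrays of map-frame points in metres.
    grid = np.zeros(shape, dtype=np.uint8)
    for line in polylines:
        for (x0, y0), (x1, y1) in zip(line[:-1], line[1:]):
            # Sample each segment densely enough to touch every cell.
            n = max(2, int(np.hypot(x1 - x0, y1 - y0) / res) + 1)
            xs = np.linspace(x0, x1, n) / res
            ys = np.linspace(y0, y1, n) / res
            ok = (ys >= 0) & (ys < shape[0]) & (xs >= 0) & (xs < shape[1])
            grid[ys[ok].astype(int), xs[ok].astype(int)] = 1
    return grid

lane = np.array([[1.0, 1.0], [6.0, 2.5], [12.0, 2.0]])
historical_map = rasterize([lane], shape=(200, 200))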
- Progressive Query Refinement Framework for Bird's-Eye-View Semantic Segmentation from Surrounding Images [3.495246564946556]
We introduce the Multi-Resolution (MR) concept into Bird's-Eye-View (BEV) semantic segmentation for autonomous driving.
We propose a visual feature interaction network that promotes interactions between features across images and across feature levels.
We evaluate our model on a large-scale real-world dataset.
arXiv Detail & Related papers (2024-07-24T05:00:31Z)
- Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data [3.1968751101341173]
Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks.
Recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, but their generalizability is limited to small regions captured by current autonomous vehicle-based datasets.
We show that a more scalable approach towards generalizable map prediction can be enabled by using two large-scale crowd-sourced mapping platforms.
arXiv Detail & Related papers (2024-07-11T17:57:22Z)
- MV-Map: Offboard HD-Map Generation with Multi-view Consistency [29.797769409113105]
Bird's-eye-view (BEV) perception models can be useful for building high-definition maps (HD-Maps) with less human labor.
However, their results are often unreliable and show noticeable inconsistencies in the predicted HD-Maps from different viewpoints.
This paper advocates a more practical 'offboard' HD-Map generation setup that removes the computation constraints.
arXiv Detail & Related papers (2023-05-15T17:59:15Z)
- Neural Map Prior for Autonomous Driving [17.198729798817094]
High-definition (HD) semantic maps are crucial in enabling autonomous vehicles to navigate urban environments.
The traditional method of creating offline HD maps involves a labor-intensive manual annotation process.
Recent studies have proposed an alternative approach that generates local maps using online sensor observations.
In this study, we propose Neural Map Prior (NMP), a neural representation of global maps.
arXiv Detail & Related papers (2023-04-17T17:58:40Z)
- BEVBert: Multimodal Map Pre-training for Language-guided Navigation [75.23388288113817]
We propose a new spatially aware, map-based pre-training paradigm for vision-and-language navigation (VLN).
We build a local metric map to explicitly aggregate incomplete observations and remove duplicates, while modeling navigation dependency in a global topological map.
Based on the hybrid map, we devise a pre-training framework to learn a multimodal map representation, which enhances spatially aware cross-modal reasoning and thereby facilitates language-guided navigation.
arXiv Detail & Related papers (2022-12-08T16:27:54Z)
- Long-term Visual Map Sparsification with Heterogeneous GNN [47.12309045366042]
In this paper, we aim to overcome the environmental changes and reduce the map size at the same time by selecting points that are valuable to future localization.
Inspired by recent progress in Graph Neural Networks (GNNs), we propose the first work that models SfM maps as heterogeneous graphs and predicts 3D point importance scores with a GNN.
Two novel supervisions are proposed: 1) a data-fitting term for selecting points valuable for future localization based on training queries; 2) a K-Cover term for selecting sparse points with full map coverage (a greedy selection sketch follows this entry).
arXiv Detail & Related papers (2022-03-29T01:46:12Z)
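The K-Cover supervision can be read as a coverage-constrained selection problem. The greedy routine below sketches that reading with hypothetical region bins and random scores standing in for the paper's learned GNN importance scores; it is an interpretation for illustration, not the paper's training objective.

# Hypothetical greedy K-Cover selection: keep high-scoring points while
# guaranteeing at least k points per map region.
import numpy as np

def k_cover_select(scores, regions, k, budget):
    order = np.argsort(-scores)          # best-scoring points first
    kept, per_region = [], {}
    for i in order:                      # pass 1: satisfy coverage
        r = regions[i]
        if per_region.get(r, 0) < k:
            kept.append(i)
            per_region[r] = per_region.get(r, 0) + 1
    chosen = set(kept)
    for i in order:                      # pass 2: spend remaining budget
        if len(kept) >= budget:
            break
        if i not in chosen:
            kept.append(i)
            chosen.add(i)
    return np.asarray(kept[:budget])

scores = np.random.rand(1000)             # stand-in importance scores
regions = np.random.randint(0, 50, 1000)  # stand-in region ids
selected = k_cover_select(scores, regions, k=3, budget=300)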
- HDMapGen: A Hierarchical Graph Generative Model of High Definition Maps [81.86923212296863]
HD maps are maps that precisely define road lanes together with rich semantics of the traffic rules.
Only a small number of real-world road topologies and geometries are available, which significantly limits our ability to test the self-driving stack.
We propose HDMapGen, a hierarchical graph generation model capable of producing high-quality and diverse HD maps.
arXiv Detail & Related papers (2021-06-28T17:59:30Z)
- Label Decoupling Framework for Salient Object Detection [157.96262922808245]
Recent methods mainly focus on aggregating multi-level features from fully convolutional networks (FCNs) and introducing edge information as auxiliary supervision.
We propose a label decoupling framework (LDF) which consists of a label decoupling procedure and a feature interaction network (FIN).
Experiments on six benchmark datasets demonstrate that LDF outperforms state-of-the-art approaches on different evaluation metrics.
arXiv Detail & Related papers (2020-08-25T14:23:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.