Related papers: BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight

URL: http://arxiv.org/abs/2407.08526v1
Date: Thu, 11 Jul 2024 14:15:48 GMT
Title: BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight
Authors: Hang Wu, Zhenghao Zhang, Siyuan Lin, Tong Qin, Jin Pan, Qiang Zhao, Chunjing Xu, Ming Yang
Abstract summary: We propose BLOS-BEV, a novel BEV segmentation model that incorporates SD maps for accurate beyond line-of-sight perception, up to 200m. Our approach is applicable to common BEV architectures and can achieve excellent results by incorporating information derived from SD maps.
Score: 30.45553559416835
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Bird's-eye-view (BEV) representation is crucial for the perception function in autonomous driving tasks. It is difficult to balance the accuracy, efficiency and range of BEV representation. The existing works are restricted to a limited perception range within 50 meters. Extending the BEV representation range can greatly benefit downstream tasks such as topology reasoning, scene understanding, and planning by offering more comprehensive information and reaction time. The Standard-Definition (SD) navigation maps can provide a lightweight representation of road structure topology, characterized by ease of acquisition and low maintenance costs. An intuitive idea is to combine the close-range visual information from onboard cameras with the beyond line-of-sight (BLOS) environmental priors from SD maps to realize expanded perceptual capabilities. In this paper, we propose BLOS-BEV, a novel BEV segmentation model that incorporates SD maps for accurate beyond line-of-sight perception, up to 200m. Our approach is applicable to common BEV architectures and can achieve excellent results by incorporating information derived from SD maps. We explore various feature fusion schemes to effectively integrate the visual BEV representations and semantic features from the SD map, aiming to leverage the complementary information from both sources optimally. Extensive experiments demonstrate that our approach achieves state-of-the-art performance in BEV segmentation on the nuScenes and Argoverse benchmarks. Through multi-modal inputs, BEV segmentation is significantly enhanced at close ranges below 50m, while also demonstrating superior performance in long-range scenarios, surpassing other methods by over 20% mIoU at distances ranging from 50-200m.

Related papers

ChatBEV: A Visual Language Model that Understands BEV Maps [58.3005092762598]
We introduce ChatBEV-QA, a novel BEV VQA benchmark containing over 137k questions. This benchmark is constructed using a novel data collection pipeline that generates scalable and informative VQA data for BEV maps. We propose a language-driven traffic scene generation pipeline, where ChatBEV facilitates map understanding and text-aligned navigation guidance.
arXiv Detail & Related papers (2025-03-18T06:12:38Z)
BEVMOSNet: Multimodal Fusion for BEV Moving Object Segmentation [3.613463012025065]
We introduce BEVMOSNet, the first end-to-end multimodal fusion leveraging cameras, LiDAR, and radar to precisely predict the moving objects in bird's-eye-view (BEV). We show an overall improvement in IoU score of 36.59% compared to the vision-based unimodal baseline BEV-MoSeg.
arXiv Detail & Related papers (2025-03-05T09:03:46Z)
TS-CGNet: Temporal-Spatial Fusion Meets Centerline-Guided Diffusion for BEV Mapping [14.11655533977291]
This paper proposes TS-CGNet, which leverages Temporal-Spatial fusion with Centerline-Guided diffusion. This framework is designed for integration into any existing network for building BEV maps.
arXiv Detail & Related papers (2025-03-04T13:00:30Z)
SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset [101.51012770913627]
Bird's-eye view (BEV) perception has garnered significant attention in autonomous driving in recent years. SimBEV is a randomized synthetic data generation tool that is extensively configurable and scalable. SimBEV is used to create the SimBEV dataset, a large collection of annotated perception data from diverse driving scenarios.
arXiv Detail & Related papers (2025-02-04T00:00:06Z)
VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization [108.68014173017583]
Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics for the environmental elements around the ego car. We propose to utilize a generative model similar to the Vector Quantized-Variational AutoEncoder (VQ-VAE) to acquire prior knowledge for the high-level BEV semantics in the tokenized discrete space. Thanks to the obtained BEV tokens accompanied with a codebook embedding encapsulating the semantics for different BEV elements in the ground-truth maps, we are able to directly align the sparse backbone image features with the obtained BEV tokens.
arXiv Detail & Related papers (2024-11-03T16:09:47Z)
OE-BevSeg: An Object Informed and Environment Aware Multimodal Framework for Bird's-eye-view Vehicle Semantic Segmentation [57.2213693781672]
Bird's-eye-view (BEV) semantic segmentation is becoming crucial in autonomous driving systems. We propose OE-BevSeg, an end-to-end multimodal framework that enhances BEV segmentation performance. Our approach achieves state-of-the-art results by a large margin on the nuScenes dataset for vehicle segmentation.
arXiv Detail & Related papers (2024-07-18T03:48:22Z)
Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data [3.1968751101341173]
Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. We show that a more scalable approach towards generalizable map prediction can be enabled by using two large-scale crowd-sourced mapping platforms.
arXiv Detail & Related papers (2024-07-11T17:57:22Z)
Improving Bird's Eye View Semantic Segmentation by Task Decomposition [42.57351039508863]
We decompose the original BEV segmentation task into two stages, namely BEV map reconstruction and RGB-BEV feature alignment. Our approach simplifies the complexity of combining perception and generation into distinct steps, equipping the model to handle intricate and challenging scenes effectively.
arXiv Detail & Related papers (2024-04-02T13:19:45Z)
U-BEV: Height-aware Bird's-Eye-View Segmentation and Neural Map-based Relocalization [81.76044207714637]
Relocalization is essential for intelligent vehicles when GPS reception is insufficient or sensor-based localization fails. Recent advances in Bird's-Eye-View (BEV) segmentation allow for accurate estimation of local scene appearance. This paper presents U-BEV, a U-Net inspired architecture that extends the current state-of-the-art by allowing the BEV to reason about the scene on multiple height layers before flattening the BEV features.
arXiv Detail & Related papers (2023-10-20T18:57:38Z)
Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline [76.48192454417138]
Bird's-Eye View (BEV) representation is promising as the foundation for next-generation Autonomous Vehicle (AV) perception. This paper proposes a framework, termed Fast-BEV, which is capable of performing faster BEV perception on the on-vehicle chips.
arXiv Detail & Related papers (2023-01-29T18:43:31Z)
Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view. Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe [115.31507979199564]
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending and drawing extensive attention from both industry and academia. As sensor configurations get more complex, integrating multi-source information from different sensors and representing features in a unified view become vitally important. The core problems for BEV perception lie in (a) how to reconstruct the lost 3D information via view transformation from perspective view to BEV; (b) how to acquire ground truth annotations in BEV grid; and (d) how to adapt and generalize algorithms as sensor configurations vary across different scenarios.
arXiv Detail & Related papers (2022-09-12T15:29:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.