F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera
Images for Automated Driving
- URL: http://arxiv.org/abs/2303.03651v2
- Date: Tue, 1 Aug 2023 19:41:49 GMT
- Title: F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera
Images for Automated Driving
- Authors: Ekta U. Samani, Feng Tao, Harshavardhan R. Dasari, Sihao Ding, Ashis
G. Banerjee
- Abstract summary: We introduce a baseline, F2BEV, to generate BEV height maps and BEV semantic segmentation maps from fisheye images.
F2BEV consists of a distortion-aware spatial cross attention module for querying and consolidating spatial information.
We evaluate single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset.
- Score: 3.286961611175469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bird's Eye View (BEV) representations are tremendously useful for
perception-related automated driving tasks. However, generating BEVs from
surround-view fisheye camera images is challenging due to the strong
distortions introduced by such wide-angle lenses. We take the first step in
addressing this challenge and introduce a baseline, F2BEV, to generate
discretized BEV height maps and BEV semantic segmentation maps from fisheye
images. F2BEV consists of a distortion-aware spatial cross attention module for
querying and consolidating spatial information from fisheye image features in a
transformer-style architecture followed by a task-specific head. We evaluate
single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset,
all of which generate better BEV height and segmentation maps (in terms of the
IoU) than a state-of-the-art BEV generation method operating on undistorted
fisheye images. We also demonstrate discretized height map generation from
real-world fisheye images using F2BEV. Our dataset is publicly available at
https://github.com/volvo-cars/FB-SSEM-dataset
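The abstract compares methods by the IoU of the generated BEV maps. As a rough illustration only, the sketch below computes per-class and mean IoU between two small label grids in plain Python. It is a generic definition of the metric, not the paper's evaluation code; the function names `per_class_iou` and `mean_iou`, the grid layout, and the choice to skip classes absent from both maps are all assumptions made for this example.

```python
from typing import Dict, Sequence

# A BEV map represented as rows of integer class labels (assumed layout).
Grid = Sequence[Sequence[int]]


def per_class_iou(pred: Grid, target: Grid, num_classes: int) -> Dict[int, float]:
    """Return the IoU of every class that occurs in pred or target."""
    same_shape = len(pred) == len(target) and all(
        len(p_row) == len(t_row) for p_row, t_row in zip(pred, target)
    )
    if not same_shape:
        raise ValueError("pred and target must have the same shape")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The cell belongs to both masks of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # The cell belongs to exactly one mask of each class.
                union[p] += 1
                union[t] += 1

    # Classes absent from both maps have an undefined IoU and are skipped.
    return {c: intersection[c] / union[c] for c in range(num_classes) if union[c] > 0}


def mean_iou(pred: Grid, target: Grid, num_classes: int) -> float:
    """Average the per-class IoU over the classes that are present."""
    ious = per_class_iou(pred, target, num_classes)
    return sum(ious.values()) / len(ious) if ious else 0.0


if __name__ == "__main__":
    predicted = [[0, 0, 1], [1, 1, 2]]
    ground_truth = [[0, 1, 1], [1, 1, 2]]
    print(per_class_iou(predicted, ground_truth, num_classes=3))
    print(mean_iou(predicted, ground_truth, num_classes=3))
```

For a discretized height map, the same functions would apply if each height bin is treated as a class label, though whether the paper scores height maps this way is not stated here.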
Related papers
- Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data [3.1968751101341173]
Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks.
While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets.
We show that a more scalable approach towards generalizable map prediction can be enabled by using two large-scale crowd-sourced mapping platforms.
arXiv Detail & Related papers (2024-07-11T17:57:22Z)
- RoadBEV: Road Surface Reconstruction in Bird's Eye View [55.0558717607946]
Vision-based online road reconstruction is a promising way to capture road information in advance.
The recent technique of Bird's-Eye-View (BEV) perception offers immense potential for more reliable and accurate reconstruction.
This paper proposes two simple yet effective models for road elevation reconstruction in BEV, named RoadBEV-mono and RoadBEV-stereo.
arXiv Detail & Related papers (2024-04-09T20:24:29Z)
- DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning [7.012508171229966]
There is limited work on BEV segmentation for surround-view fisheye cameras, commonly used in commercial vehicles.
We create a synthetic dataset using the Cognata simulator comprising diverse road types, weather, and lighting conditions.
We generalize the BEV segmentation to work with any camera model; this is useful for mixing diverse cameras.
arXiv Detail & Related papers (2024-04-09T14:43:19Z)
- DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception [111.13119809216313]
Camera-only Bird's Eye View (BEV) has demonstrated great potential in environment perception in a 3D space.
Unsupervised domain adaptive BEV, which enables effective learning from various unlabelled target data, remains largely under-explored.
We design DA-BEV, the first domain adaptive camera-only BEV framework that addresses domain adaptive BEV challenges by exploiting the complementary nature of image-view features and BEV features.
arXiv Detail & Related papers (2024-01-13T04:21:24Z)
- FB-BEV: BEV Representation from Forward-Backward View Transformations [131.11787050205697]
We propose a novel View Transformation Module (VTM) for Bird-Eye-View (BEV) representation.
We instantiate the proposed module with FB-BEV, which achieves a new state-of-the-art result of 62.4% NDS on the nuScenes test set.
arXiv Detail & Related papers (2023-08-04T10:26:55Z)
- SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection [46.92706423094971]
We propose Semantic-Aware BEV Pooling (SA-BEVPool), which can filter out background information according to the semantic segmentation of image features.
We also propose BEV-Paste, an effective data augmentation strategy that closely matches the semantic-aware BEV feature.
Experiments on nuScenes show that SA-BEV achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-21T10:28:19Z)
- BEV-SAN: Accurate BEV 3D Object Detection via Slice Attention Networks [28.024042528077125]
Bird's-Eye-View (BEV) 3D Object Detection is a crucial multi-view technique for autonomous driving systems.
We propose a novel method named BEV Slice Attention Network (BEV-SAN) for exploiting the intrinsic characteristics of different heights.
arXiv Detail & Related papers (2022-12-02T15:14:48Z)
- LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation [43.12994451281451]
We present 'LaRa', an efficient encoder-decoder, transformer-based model for vehicle semantic segmentation from multiple cameras.
Our approach uses a system of cross-attention to aggregate information over multiple sensors into a compact, yet rich, collection of latent representations.
arXiv Detail & Related papers (2022-06-27T13:37:50Z)
- GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation [105.19949897812494]
Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving.
We present a novel two-stage Geometry Prior-based Transformation framework named GitNet.
arXiv Detail & Related papers (2022-04-16T06:46:45Z)
- M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation [145.6041893646006]
M^2BEV is a unified framework that jointly performs 3D object detection and map segmentation.
M^2BEV infers both tasks with a unified model and improves efficiency.
arXiv Detail & Related papers (2022-04-11T13:43:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.