Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation
- URL: http://arxiv.org/abs/2207.01878v1
- Date: Tue, 5 Jul 2022 08:20:36 GMT
- Title: Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation
- Authors: Zhi Liu, Shaoyu Chen, Xiaojie Guo, Xinggang Wang, Tianheng Cheng,
Hongmei Zhu, Qian Zhang, Wenyu Liu, Yi Zhang
- Abstract summary: We propose PolarBEV for vision-based uneven BEV representation learning.
PolarBEV maintains real-time inference speed on a single 2080Ti GPU.
- Score: 42.071461405587264
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work, we propose PolarBEV for vision-based uneven BEV representation
learning. To adapt to the foreshortening effect of camera imaging, we rasterize
the BEV space both angularly and radially, and introduce polar embedding
decomposition to model the associations among polar grids. The polar grids are
rearranged into an array-like regular representation for efficient processing.
In addition, to determine the 2D-to-3D correspondence, we iteratively update the
BEV surface based on a hypothetical plane and adopt a height-based feature
transformation. PolarBEV maintains real-time inference speed on a single 2080Ti
GPU and outperforms other methods on both BEV semantic segmentation and BEV
instance segmentation. Thorough ablations are presented to validate the design.
The code will be released at https://github.com/SuperZ-Liu/PolarBEV.
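
The abstract names two concrete mechanisms: angular-radial rasterization of the BEV plane into a regular polar array, and a height-based 2D-to-3D feature transformation that starts from a hypothetical flat plane. The sketch below is a minimal illustration of those two steps only (the polar embedding decomposition and the iterative surface refinement are omitted); it assumes a single pinhole camera and PyTorch, and every function and parameter name (build_polar_grid, height_based_sampling, num_azimuth, and so on) is illustrative rather than taken from the released code.

# Illustrative sketch only; not the authors' released code.
# (1) Rasterize the BEV plane angularly and radially so polar cells form a
#     regular [num_azimuth, num_radius] array.
# (2) Lift each cell to 3D with a per-cell height (initialized from a flat
#     hypothetical plane) and project it into the image to gather features.
import torch
import torch.nn.functional as F


def build_polar_grid(num_azimuth=128, num_radius=80, max_radius=50.0):
    """Return BEV (x, y) cell centres arranged as a regular [A, R, 2] array."""
    azimuth = torch.linspace(-torch.pi, torch.pi, num_azimuth + 1)[:-1]  # angular bins
    radius = torch.linspace(0.5, max_radius, num_radius)                 # radial bins
    a, r = torch.meshgrid(azimuth, radius, indexing="ij")
    return torch.stack((r * torch.cos(a), r * torch.sin(a)), dim=-1)     # ego-frame x, y


def height_based_sampling(img_feat, grid_xy, height, intrinsics, cam_to_ego):
    """Project polar BEV cells with per-cell heights into one camera and sample features.

    img_feat   : [C, H, W] feature map of a single camera
    grid_xy    : [A, R, 2] polar-grid cell centres on the ground plane (ego frame)
    height     : [A, R]    current per-cell surface height (a flat plane at first)
    intrinsics : [3, 3]    pinhole camera intrinsics
    cam_to_ego : [4, 4]    camera-to-ego rigid transform
    """
    A, R, _ = grid_xy.shape
    pts_ego = torch.cat((grid_xy, height.unsqueeze(-1)), dim=-1).reshape(-1, 3)  # [A*R, 3]
    ego_to_cam = torch.inverse(cam_to_ego)
    pts_cam = (ego_to_cam[:3, :3] @ pts_ego.T + ego_to_cam[:3, 3:4]).T           # to camera frame
    uvz = (intrinsics @ pts_cam.T).T                                             # pinhole projection
    uv = uvz[:, :2] / uvz[:, 2:3].clamp(min=1e-5)
    _, H, W = img_feat.shape
    uv_norm = torch.stack((uv[:, 0] / (W - 1), uv[:, 1] / (H - 1)), dim=-1) * 2 - 1
    sampled = F.grid_sample(img_feat[None], uv_norm.view(1, A, R, 2), align_corners=True)
    valid = (uvz[:, 2] > 0).view(1, 1, A, R)        # keep only cells in front of the camera
    return sampled * valid                          # [1, C, A, R] regular polar BEV features


# Toy usage (identity extrinsics, random features) just to check shapes.
grid = build_polar_grid()
flat_plane = torch.zeros(grid.shape[:2])            # hypothetical flat surface, height = 0
bev_feat = height_based_sampling(
    img_feat=torch.randn(64, 32, 88),
    grid_xy=grid,
    height=flat_plane,
    intrinsics=torch.tensor([[500.0, 0.0, 44.0], [0.0, 500.0, 16.0], [0.0, 0.0, 1.0]]),
    cam_to_ego=torch.eye(4),
)
print(bev_feat.shape)                               # torch.Size([1, 64, 128, 80])

Because the cells are indexed by (azimuth, radius), the sampled features form a dense azimuth-by-radius tensor that ordinary 2D convolutions can process, which is one way to read the "array-like regular representation" mentioned in the abstract.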
Related papers
- VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization [108.68014173017583]
Bird's-eye-view (BEV) map layout estimation requires an accurate and full understanding of the semantics of the environmental elements around the ego car.
We propose to utilize a generative model similar to the Vector Quantized-Variational AutoEncoder (VQ-VAE) to acquire prior knowledge of the high-level BEV semantics in a tokenized discrete space.
Thanks to the obtained BEV tokens, accompanied by a codebook embedding encapsulating the semantics of the different BEV elements in the ground-truth maps, we can directly align the sparse backbone image features with the obtained BEV tokens (a generic vector-quantization sketch illustrating this tokenization appears at the end of this page).
arXiv Detail & Related papers (2024-11-03T16:09:47Z)
- PolarBEVDet: Exploring Polar Representation for Multi-View 3D Object Detection in Bird's-Eye-View [5.0458717114406975]
We propose to employ the polar BEV representation in place of the Cartesian BEV representation.
Experiments on nuScenes show that PolarBEVDet achieves superior performance.
arXiv Detail & Related papers (2024-08-29T01:42:38Z)
- Improving Bird's Eye View Semantic Segmentation by Task Decomposition [42.57351039508863]
We decompose the original BEV segmentation task into two stages, namely BEV map reconstruction and RGB-BEV feature alignment.
By separating perception and generation into distinct steps, our approach reduces complexity and equips the model to handle intricate and challenging scenes effectively.
arXiv Detail & Related papers (2024-04-02T13:19:45Z)
- U-BEV: Height-aware Bird's-Eye-View Segmentation and Neural Map-based Relocalization [81.76044207714637]
Relocalization is essential for intelligent vehicles when GPS reception is insufficient or sensor-based localization fails.
Recent advances in Bird's-Eye-View (BEV) segmentation allow for accurate estimation of local scene appearance.
This paper presents U-BEV, a U-Net inspired architecture that extends the current state-of-the-art by allowing the BEV to reason about the scene on multiple height layers before flattening the BEV features.
arXiv Detail & Related papers (2023-10-20T18:57:38Z)
- FB-BEV: BEV Representation from Forward-Backward View Transformations [131.11787050205697]
We propose a novel View Transformation Module (VTM) for Bird's-Eye-View (BEV) representation.
We instantiate the proposed module with FB-BEV, which achieves a new state-of-the-art result of 62.4% NDS on the nuScenes test set.
arXiv Detail & Related papers (2023-08-04T10:26:55Z)
- BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy [58.92659367605442]
We present BEV-IO, a new 3D detection paradigm to enhance BEV representation with instance occupancy information.
We show that BEV-IO can outperform state-of-the-art methods while adding only a negligible number of parameters and negligible computational overhead.
arXiv Detail & Related papers (2023-05-26T11:16:12Z)
- Leveraging BEV Representation for 360-degree Visual Place Recognition [14.497501941931759]
This paper investigates the advantages of using Bird's Eye View representation in 360-degree visual place recognition (VPR).
We propose a novel network architecture that utilizes the BEV representation in feature extraction, feature aggregation, and vision-LiDAR fusion.
The proposed BEV-based method is evaluated in ablation and comparative studies on two datasets.
arXiv Detail & Related papers (2023-05-23T08:29:42Z)
- GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation [105.19949897812494]
Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving.
We present a novel two-stage Geometry Prior-based Transformation framework named GitNet.
arXiv Detail & Related papers (2022-04-16T06:46:45Z)
- Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images [4.449481309681663]
We present the first end-to-end learning approach for directly predicting dense panoptic segmentation maps in the Bird's-Eye-View (BEV).
Our architecture follows the top-down paradigm and incorporates a novel dense transformer module.
We derive a mathematical formulation for the sensitivity of the FV-BEV transformation which allows us to intelligently weight pixels in the BEV space.
arXiv Detail & Related papers (2021-08-06T17:59:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.
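
The VQ-Map entry above describes tokenizing high-level BEV semantics with a VQ-VAE-style codebook and aligning image features with the resulting BEV tokens. As a generic illustration of that tokenization step, the sketch below performs standard nearest-neighbour vector quantization; it is not VQ-Map's actual model, and the function name, tensor shapes, and codebook size are assumptions.

# Generic nearest-neighbour vector quantization, shown only to illustrate the
# "tokenized discrete space" idea in the VQ-Map entry above; not the authors' code.
import torch


def quantize_bev_features(bev_feat, codebook):
    """Map continuous BEV features to discrete tokens via the nearest codebook entry.

    bev_feat : [B, C, H, W] continuous BEV feature map
    codebook : [K, C]       embedding table, one row per BEV token
    Returns token indices [B, H, W] and the quantized features [B, C, H, W].
    """
    B, C, H, W = bev_feat.shape
    flat = bev_feat.permute(0, 2, 3, 1).reshape(-1, C)     # one row per BEV cell
    dist = torch.cdist(flat, codebook)                     # L2 distance to every code
    tokens = dist.argmin(dim=1)                            # index of the nearest code
    quantized = codebook[tokens].view(B, H, W, C).permute(0, 3, 1, 2)
    return tokens.view(B, H, W), quantized


codebook = torch.randn(512, 64)                            # hypothetical 512-entry codebook
tokens, quantized = quantize_bev_features(torch.randn(2, 64, 100, 100), codebook)
print(tokens.shape, quantized.shape)                       # [2, 100, 100] and [2, 64, 100, 100]

Each BEV cell is thereby reduced to an integer token drawn from the codebook, so downstream components can operate in the discrete token space, which is the sense in which the entry speaks of a "tokenized discrete space".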