Leveraging BEV Representation for 360-degree Visual Place Recognition
- URL: http://arxiv.org/abs/2305.13814v1
- Date: Tue, 23 May 2023 08:29:42 GMT
- Title: Leveraging BEV Representation for 360-degree Visual Place Recognition
- Authors: Xuecheng Xu, Yanmei Jiao, Sha Lu, Xiaqing Ding, Rong Xiong, Yue Wang
- Abstract summary: This paper investigates the advantages of using Bird's Eye View (BEV) representation in 360-degree visual place recognition (VPR).
We propose a novel network architecture that utilizes the BEV representation in feature extraction, feature aggregation, and vision-LiDAR fusion.
The proposed BEV-based method is evaluated in ablation and comparative studies on two datasets.
- Score: 14.497501941931759
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This paper investigates the advantages of using Bird's Eye View (BEV)
representation in 360-degree visual place recognition (VPR). We propose a novel
network architecture that utilizes the BEV representation in feature
extraction, feature aggregation, and vision-LiDAR fusion, which bridges visual
cues and spatial awareness. Our method extracts image features using standard
convolutional networks and combines the features according to pre-defined 3D
grid spatial points. To alleviate the mechanical and time misalignments between
cameras, we further introduce deformable attention to learn the compensation.
Upon the BEV feature representation, we then employ the polar transform and the
Discrete Fourier transform for aggregation, which is shown to be
rotation-invariant. In addition, the image and point cloud cues can be easily
stated in the same coordinates, which benefits sensor fusion for place
recognition. The proposed BEV-based method is evaluated in ablation and
comparative studies on two datasets, including on-the-road and off-the-road
scenarios. The experimental results verify the hypothesis that BEV can benefit
VPR by its superior performance compared to baseline methods. To the best of
our knowledge, this is the first trial of employing BEV representation in this
task.
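The rotation-invariance claim for the polar transform plus Discrete Fourier transform step can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `bev_to_polar` and `rotation_invariant_descriptor`, the nearest-neighbour sampling, and the feature-map sizes are assumptions chosen for illustration. The sketch relies on the standard observation that a yaw rotation of a BEV map about its centre becomes a circular shift along the angle axis after a polar transform, and that the DFT magnitude is unchanged by circular shifts.

```python
import numpy as np


def bev_to_polar(bev, n_radius, n_theta):
    """Resample a (C, H, W) BEV feature map onto a (C, n_radius, n_theta) polar grid.

    Hypothetical helper: nearest-neighbour sampling around the map centre.
    """
    _, h, w = bev.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = min(cy, cx)
    radii = np.linspace(0.0, max_r, n_radius)
    thetas = np.arange(n_theta) * (2.0 * np.pi / n_theta)
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")
    rows = np.clip(np.rint(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    cols = np.clip(np.rint(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return bev[:, rows, cols]


def rotation_invariant_descriptor(polar):
    """Magnitude of the DFT taken along the angle axis of a polar feature map."""
    spectrum = np.fft.fft(polar, axis=-1)
    return np.abs(spectrum)


# Random stand-in for a learned BEV feature map (8 channels, 65 x 65 cells).
rng = np.random.default_rng(0)
bev = rng.standard_normal((8, 65, 65))
polar = bev_to_polar(bev, n_radius=32, n_theta=120)

# Assumption: a yaw change is modelled as a circular shift of 17 angular bins.
shifted = np.roll(polar, shift=17, axis=-1)

desc_a = rotation_invariant_descriptor(polar)
desc_b = rotation_invariant_descriptor(shifted)
print(np.allclose(desc_a, desc_b))
```

The two descriptors should agree up to floating-point error even though the polar maps themselves differ. With real data, a rotation that is not a whole number of angular bins, or resampling error in the polar transform, would make the invariance only approximate.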
Related papers
- Improving Bird's Eye View Semantic Segmentation by Task Decomposition [42.57351039508863]
We decompose the original BEV segmentation task into two stages, namely BEV map reconstruction and RGB-BEV feature alignment.
Our approach simplifies the complexity of combining perception and generation into distinct steps, equipping the model to handle intricate and challenging scenes effectively.
arXiv Detail & Related papers (2024-04-02T13:19:45Z)
- BEV$^2$PR: BEV-Enhanced Visual Place Recognition with Structural Cues [44.96177875644304]
We propose a new image-based visual place recognition (VPR) framework by exploiting the structural cues in bird's-eye view (BEV) from a single camera.
The BEV$^2$PR framework generates a composite descriptor with both visual cues and spatial awareness based on a single camera.
arXiv Detail & Related papers (2024-03-11T10:46:43Z)
- DA-BEV: Unsupervised Domain Adaptation for Bird's Eye View Perception [104.87876441265593]
Camera-only Bird's Eye View (BEV) has demonstrated great potential in environment perception in a 3D space.
Unsupervised domain adaptive BEV, which enables effective learning from various unlabelled target data, is far under-explored.
We design DA-BEV, the first domain adaptive camera-only BEV framework that addresses domain adaptive BEV challenges by exploiting the complementary nature of image-view features and BEV features.
arXiv Detail & Related papers (2024-01-13T04:21:24Z)
- FB-BEV: BEV Representation from Forward-Backward View Transformations [131.11787050205697]
We propose a novel View Transformation Module (VTM) for Bird-Eye-View (BEV) representation.
We instantiate the proposed module with FB-BEV, which achieves a new state-of-the-art result of 62.4% NDS on the nuScenes test set.
arXiv Detail & Related papers (2023-08-04T10:26:55Z)
- BEV-IO: Enhancing Bird's-Eye-View 3D Detection with Instance Occupancy [58.92659367605442]
We present BEV-IO, a new 3D detection paradigm to enhance BEV representation with instance occupancy information.
We show that BEV-IO can outperform state-of-the-art methods while only adding a negligible increase in parameters and computational overhead.
arXiv Detail & Related papers (2023-05-26T11:16:12Z)
- BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View Images [20.30997801125592]
We explore the potential of a different representation in place recognition, i.e. bird's eye view (BEV) images.
A simple VGGNet trained on BEV images achieves comparable performance with the state-of-the-art place recognition methods in scenes of slight viewpoint changes.
We develop a method to estimate the position of the query cloud, extending the usage of place recognition.
arXiv Detail & Related papers (2023-02-28T05:37:45Z)
- Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe [115.31507979199564]
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending and drawing extensive attention both from industry and academia.
As sensor configurations get more complex, integrating multi-source information from different sensors and representing features in a unified view become of vital importance.
The core problems for BEV perception lie in (a) how to reconstruct the lost 3D information via view transformation from perspective view to BEV; (b) how to acquire ground truth annotations in BEV grid; and (d) how to adapt and generalize algorithms as sensor configurations vary across different scenarios.
arXiv Detail & Related papers (2022-09-12T15:29:13Z)
- GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation [105.19949897812494]
Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving.
We present a novel two-stage Geometry Prior-based Transformation framework named GitNet.
arXiv Detail & Related papers (2022-04-16T06:46:45Z)
- M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation [145.6041893646006]
M$^2$BEV is a unified framework that jointly performs 3D object detection and map segmentation.
M$^2$BEV infers both tasks with a unified model and improves efficiency.
arXiv Detail & Related papers (2022-04-11T13:43:25Z)
- Bird's-Eye-View Panoptic Segmentation Using Monocular Frontal View Images [4.449481309681663]
We present the first end-to-end learning approach for directly predicting dense panoptic segmentation maps in the Bird's-Eye-View (BEV).
Our architecture follows the top-down paradigm and incorporates a novel dense transformer module.
We derive a mathematical formulation for the sensitivity of the FV-BEV transformation which allows us to intelligently weight pixels in the BEV space.
arXiv Detail & Related papers (2021-08-06T17:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.