Deep Perspective Transformation Based Vehicle Localization on Bird's Eye View
- URL: http://arxiv.org/abs/2311.06796v1
- Date: Sun, 12 Nov 2023 10:16:42 GMT
- Title: Deep Perspective Transformation Based Vehicle Localization on Bird's Eye View
- Authors: Abtin Mahyar, Hossein Motamednia, Dara Rahmati
- Abstract summary: Traditional approaches rely on installing multiple sensors to simulate the environment.
We propose an alternative solution by generating a top-down representation of the scene.
We present an architecture that transforms perspective view RGB images into bird's-eye-view maps with segmented surrounding vehicles.
- Score: 0.49747156441456597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An accurate understanding of a self-driving vehicle's surrounding environment
is crucial for its navigation system. To enhance the effectiveness of existing
algorithms and facilitate further research, it is essential to provide
comprehensive data to the routing system. Traditional approaches rely on
installing multiple sensors to simulate the environment, leading to high costs
and complexity. In this paper, we propose an alternative solution by generating
a top-down representation of the scene, enabling the extraction of distances
and directions of other cars relative to the ego vehicle. We introduce a new
synthesized dataset that offers extensive information about the ego vehicle and
its environment in each frame, providing valuable resources for similar
downstream tasks. Additionally, we present an architecture that transforms
perspective view RGB images into bird's-eye-view maps with segmented
surrounding vehicles. This approach offers an efficient and cost-effective
method for capturing crucial environmental information for self-driving cars.
Code and dataset are available at
https://github.com/IPM-HPC/Perspective-BEV-Transformer.
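To make the idea concrete, below is a minimal, illustrative PyTorch sketch of the two steps the abstract describes: a small encoder-decoder that maps a perspective RGB image to a BEV vehicle-segmentation grid, and a helper that reads distances and bearings of detected vehicles relative to the ego vehicle off that grid. It is not the authors' architecture; the class name, grid size, and `meters_per_pixel` resolution are assumptions made for this example.

```python
# Illustrative sketch only -- not the paper's architecture. Assumes a 256x256
# RGB input, a 128x128 BEV grid with the ego vehicle at the bottom-centre cell,
# and a made-up grid resolution (meters_per_pixel).
import math
import torch
import torch.nn as nn

class PerspectiveToBEV(nn.Module):
    """Tiny encoder-decoder: perspective RGB -> 128x128 BEV vehicle logits."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # The view change is learned implicitly: a fully connected bottleneck
        # mixes all image locations before the BEV grid is decoded.
        self.bottleneck = nn.Sequential(
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(128 * 8 * 8, 128 * 16 * 16), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),  # vehicle logits
        )

    def forward(self, rgb):                                   # rgb: (B, 3, H, W)
        z = self.bottleneck(self.encoder(rgb))
        return self.decoder(z.view(-1, 128, 16, 16))          # (B, 1, 128, 128)

def vehicles_relative_to_ego(bev_mask, meters_per_pixel=0.5):
    """Turn a binary BEV mask into (distance, bearing) pairs w.r.t. the ego
    vehicle, assumed to sit at the bottom-centre of the grid facing up."""
    h, w = bev_mask.shape
    ego_y, ego_x = h - 1.0, w / 2.0
    ys, xs = torch.nonzero(bev_mask, as_tuple=True)
    out = []
    for y, x in zip(ys.tolist(), xs.tolist()):
        dy, dx = ego_y - y, x - ego_x                # forward and lateral offsets
        dist = math.hypot(dy, dx) * meters_per_pixel
        bearing = math.degrees(math.atan2(dx, dy))   # 0 deg = straight ahead
        out.append((dist, bearing))
    return out

bev = PerspectiveToBEV()(torch.randn(1, 3, 256, 256))
print(bev.shape)  # torch.Size([1, 1, 128, 128])
```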
Related papers
- BEVSeg2TP: Surround View Camera Bird's-Eye-View Based Joint Vehicle Segmentation and Ego Vehicle Trajectory Prediction [4.328789276903559]
Trajectory prediction is a key task for vehicle autonomy.
There is a growing interest in learning-based trajectory prediction.
We show that there is potential to improve perception performance.
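As a rough illustration of the joint set-up implied by the title (shared features feeding a vehicle-segmentation head and an ego-trajectory head), here is a hedged PyTorch sketch; the module name, sizes, and number of waypoints are invented, and it is not the BEVSeg2TP model.

```python
# Minimal sketch of a shared backbone with one segmentation head and one
# trajectory head; not the BEVSeg2TP architecture.
import torch
import torch.nn as nn

class JointSegAndTrajectory(nn.Module):
    def __init__(self, n_waypoints=6):
        super().__init__()
        self.n_waypoints = n_waypoints
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(64, 2, 1)          # vehicle / background logits
        self.traj_head = nn.Sequential(              # ego (x, y) waypoints
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, n_waypoints * 2),
        )

    def forward(self, img):
        feats = self.backbone(img)
        seg = self.seg_head(feats)                                    # (B, 2, H/4, W/4)
        traj = self.traj_head(feats).view(-1, self.n_waypoints, 2)   # (B, 6, 2)
        return seg, traj

seg, traj = JointSegAndTrajectory()(torch.randn(2, 3, 128, 128))
print(seg.shape, traj.shape)  # (2, 2, 32, 32) and (2, 6, 2)
```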
arXiv Detail & Related papers (2023-12-20T15:02:37Z)
- RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving [67.09546127265034]
Road surface reconstruction helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems.
We introduce the Road Surface Reconstruction dataset, a real-world, high-resolution, and high-precision dataset collected with a specialized platform in diverse driving conditions.
It covers common road types and contains approximately 16,000 pairs of stereo images, original point clouds, and ground-truth depth/disparity maps.
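For readers unfamiliar with the depth/disparity ground truth mentioned above, this small NumPy example shows the standard rectified-stereo relation depth = focal_length x baseline / disparity; the focal length and baseline values below are made up.

```python
# Toy example of the stereo geometry behind depth/disparity ground truth.
import numpy as np

focal_length_px = 1200.0        # assumed horizontal focal length in pixels
baseline_m = 0.12               # assumed distance between the two cameras

disparity = np.array([[48.0, 24.0],
                      [12.0,  6.0]])          # pixels, from stereo matching
depth_m = focal_length_px * baseline_m / disparity
print(depth_m)   # [[ 3.  6.] [12. 24.]] metres -- larger disparity = closer
```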
arXiv Detail & Related papers (2023-10-03T17:59:32Z)
- Street-View Image Generation from a Bird's-Eye View Layout [95.36869800896335]
Bird's-Eye View (BEV) Perception has received increasing attention in recent years.
Data-driven simulation for autonomous driving has been a focal point of recent research.
We propose BEVGen, a conditional generative model that synthesizes realistic and spatially consistent surrounding images.
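The sketch below only illustrates the conditioning interface such a model exposes (a BEV semantic layout plus a camera index in, one surround-view image out); it is not BEVGen, and all layer sizes and class counts are invented.

```python
# Interface-level sketch of layout-conditioned image synthesis; not BEVGen.
import torch
import torch.nn as nn

class LayoutConditionedDecoder(nn.Module):
    def __init__(self, n_layout_classes=8, n_cameras=6):
        super().__init__()
        self.layout_enc = nn.Sequential(
            nn.Conv2d(n_layout_classes, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.cam_embed = nn.Embedding(n_cameras, 128)   # which surround camera to render
        self.to_image = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, bev_layout, cam_idx):
        feats = self.layout_enc(bev_layout)                         # (B, 128, H/4, W/4)
        feats = feats + self.cam_embed(cam_idx)[:, :, None, None]   # add camera conditioning
        return self.to_image(feats)                                 # (B, 3, H, W)

img = LayoutConditionedDecoder()(torch.randn(1, 8, 64, 64), torch.tensor([2]))
print(img.shape)  # torch.Size([1, 3, 64, 64])
```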
arXiv Detail & Related papers (2023-01-11T18:39:34Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird's-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
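The displacement error quoted above is commonly computed as the mean L2 distance between predicted and ground-truth trajectory points; a small worked example with invented numbers follows.

```python
# Sketch of an open-loop displacement-error (ADE-style) computation.
import numpy as np

pred = np.array([[1.0, 0.1], [2.1, 0.3], [3.2, 0.8]])   # predicted (x, y) waypoints, metres
gt   = np.array([[1.0, 0.0], [2.0, 0.2], [3.0, 0.5]])   # ground-truth waypoints

ade = np.linalg.norm(pred - gt, axis=1).mean()
print(f"displacement error: {ade:.2f} m")   # 0.20 m for these made-up numbers
```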
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- Vision Transformer for Learning Driving Policies in Complex Multi-Agent Environments [17.825845543579195]
We propose to use a Vision Transformer (ViT) to learn a driving policy in urban settings with bird's-eye-view (BEV) input images.
The ViT network learns the global context of the scene more effectively than earlier proposed Convolutional Neural Networks (ConvNets).
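A minimal sketch of this idea, not the paper's model: patchify the BEV image, run a transformer encoder over the patch tokens, and regress driving commands from a learned class token. All sizes and the two-dimensional action output are arbitrary choices for the example.

```python
# Illustrative ViT-style policy over BEV images; not the paper's architecture.
import torch
import torch.nn as nn

class BEVViTPolicy(nn.Module):
    def __init__(self, img=64, patch=8, dim=128, n_actions=2):  # e.g. steer, throttle
        super().__init__()
        n_patches = (img // patch) ** 2
        self.patchify = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))          # learned class token
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(dim, n_actions)

    def forward(self, bev):                                      # (B, 3, 64, 64)
        tokens = self.patchify(bev).flatten(2).transpose(1, 2)   # (B, 64, dim)
        tokens = torch.cat([self.cls.expand(len(bev), -1, -1), tokens], dim=1)
        tokens = self.encoder(tokens + self.pos)                 # global self-attention
        return self.head(tokens[:, 0])                           # (B, n_actions)

actions = BEVViTPolicy()(torch.randn(4, 3, 64, 64))
print(actions.shape)  # torch.Size([4, 2])
```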
arXiv Detail & Related papers (2021-09-14T08:18:47Z)
- Attention-based Neural Network for Driving Environment Complexity Perception [123.93460670568554]
This paper proposes a novel attention-based neural network model to predict the complexity level of the surrounding driving environment.
It consists of a Yolo-v3 object detection algorithm, a heat map generation algorithm, CNN-based feature extractors, and attention-based feature extractors.
The proposed attention-based network achieves 91.22% average accuracy in classifying the surrounding environment complexity.
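The following is a rough sketch of only the attention-pooling part of such a pipeline (omitting the YOLO-v3 detections and heat maps the paper also uses): CNN features become spatial tokens, attention weights pool them, and the pooled feature is classified into complexity levels. Names and sizes are invented.

```python
# Illustrative attention-pooled complexity classifier; not the paper's pipeline.
import torch
import torch.nn as nn

class ComplexityClassifier(nn.Module):
    def __init__(self, n_levels=3):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.attn_score = nn.Linear(64, 1)        # one attention weight per location
        self.classifier = nn.Linear(64, n_levels)

    def forward(self, img):
        feats = self.cnn(img)                                     # (B, 64, H/4, W/4)
        tokens = feats.flatten(2).transpose(1, 2)                 # (B, N, 64)
        weights = torch.softmax(self.attn_score(tokens), dim=1)   # (B, N, 1)
        pooled = (weights * tokens).sum(dim=1)                    # (B, 64)
        return self.classifier(pooled)                            # complexity logits

logits = ComplexityClassifier()(torch.randn(2, 3, 128, 128))
print(logits.shape)  # torch.Size([2, 3])
```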
arXiv Detail & Related papers (2021-06-21T17:27:11Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
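Here is a schematic sketch of attention-based sensor fusion in this spirit, not TransFuser itself: tokens from an image branch and a LiDAR-BEV branch are concatenated and mixed with self-attention so each modality can attend to the other. All dimensions and the single-waypoint output are assumptions.

```python
# Illustrative attention-based image/LiDAR fusion; not the TransFuser model.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.img_branch = nn.Conv2d(3, dim, kernel_size=16, stride=16)    # image patch tokens
        self.lidar_branch = nn.Conv2d(1, dim, kernel_size=16, stride=16)  # LiDAR-BEV patch tokens
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 2)                                     # e.g. one (x, y) waypoint

    def forward(self, img, lidar_bev):
        img_tok = self.img_branch(img).flatten(2).transpose(1, 2)         # (B, Ni, dim)
        lid_tok = self.lidar_branch(lidar_bev).flatten(2).transpose(1, 2) # (B, Nl, dim)
        fused = self.fusion(torch.cat([img_tok, lid_tok], dim=1))         # joint self-attention
        return self.head(fused.mean(dim=1))

wp = AttentionFusion()(torch.randn(1, 3, 128, 256), torch.randn(1, 1, 128, 128))
print(wp.shape)  # torch.Size([1, 2])
```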
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
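The core geometric operation behind fitting a 3D car model to a real image is projecting model points into the image with a pinhole camera; a short worked example follows, with made-up intrinsics, pose, and model points.

```python
# Worked example of 3D -> 2D pinhole projection; values are invented.
import numpy as np

K = np.array([[800.0,   0.0, 320.0],      # assumed camera intrinsics
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                              # assumed rotation (camera-aligned)
t = np.array([0.0, 0.0, 10.0])             # car placed 10 m in front of the camera

car_points = np.array([[-1.0,  0.0, 0.0],  # a few 3D points on the car model (metres)
                       [ 1.0,  0.0, 0.0],
                       [ 0.0, -0.5, 2.0]])

cam = (R @ car_points.T).T + t             # model -> camera coordinates
proj = (K @ cam.T).T                       # camera -> homogeneous pixel coordinates
pixels = proj[:, :2] / proj[:, 2:3]        # perspective divide
print(pixels)                              # [[240. 240.] [400. 240.] [320. 206.67]]
```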
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Multi-View Fusion of Sensor Data for Improved Perception and Prediction in Autonomous Driving [11.312620949473938]
We present an end-to-end method for object detection and trajectory prediction utilizing multi-view representations of LiDAR and camera images.
Our model builds on a state-of-the-art Bird's-Eye View (BEV) network that fuses voxelized features from a sequence of historical LiDAR data.
We extend this model with additional LiDAR Range-View (RV) features that use the raw LiDAR information in its native, non-quantized representation.
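A small sketch of the BEV voxelization step described above, with made-up grid extents and resolution: LiDAR points are binned into a top-down grid, here as a simple max-height map.

```python
# Illustrative LiDAR-to-BEV voxelization; grid parameters are assumptions.
import numpy as np

def lidar_to_bev(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0), res=0.5):
    """points: (N, 3) array of (x, y, z) in metres, ego vehicle at the origin."""
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((h, w), dtype=np.float32)                 # max height per cell
    xi = ((points[:, 0] - x_range[0]) / res).astype(int)
    yi = ((points[:, 1] - y_range[0]) / res).astype(int)
    valid = (xi >= 0) & (xi < h) & (yi >= 0) & (yi < w)
    for x, y, z in zip(xi[valid], yi[valid], points[valid, 2]):
        bev[x, y] = max(bev[x, y], z)
    return bev

pts = np.random.uniform([0, -25, 0], [50, 25, 3], size=(1000, 3))
print(lidar_to_bev(pts).shape)   # (100, 100)
```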
arXiv Detail & Related papers (2020-08-27T03:32:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.