Grid-Centric Traffic Scenario Perception for Autonomous Driving: A Comprehensive Review
- URL: http://arxiv.org/abs/2303.01212v2
- Date: Sun, 9 Jun 2024 12:58:10 GMT
- Title: Grid-Centric Traffic Scenario Perception for Autonomous Driving: A Comprehensive Review
- Authors: Yining Shi, Kun Jiang, Jiusi Li, Zelin Qian, Junze Wen, Mengmeng Yang, Ke Wang, Diange Yang
- Abstract summary: Grid-centric perception is more robust in open-world driving scenarios with endless long-tailed, semantically unknown obstacles.
Recent research demonstrates the great advantages of grid-centric perception, such as comprehensive fine-grained environmental representation.
We organize previous and current knowledge of occupancy grid techniques along the main vein from 2D BEV grids to 3D occupancy to 4D occupancy forecasting.
- Score: 13.047382354329736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Grid-centric perception is a crucial field for mobile robot perception and navigation. Nonetheless, it is less prevalent than object-centric perception, because autonomous vehicles must accurately perceive highly dynamic, large-scale traffic scenarios and the complexity and computational cost of grid-centric perception are high. In recent years, the rapid development of deep learning techniques and hardware has provided fresh insights into the evolution of grid-centric perception. The fundamental difference between grid-centric and object-centric pipelines is that grid-centric perception follows a geometry-first paradigm, which is more robust in open-world driving scenarios with endless long-tailed, semantically unknown obstacles. Recent research demonstrates the great advantages of grid-centric perception, such as comprehensive fine-grained environmental representation, greater robustness to occlusion and irregularly shaped objects, better ground estimation, and safer planning policies. There is also a growing trend of expanding the capacity of occupancy networks to 4D scene perception and prediction, and the latest techniques are closely related to new research topics such as 4D occupancy forecasting, generative AI, and world models in autonomous driving. Given the lack of current surveys for this rapidly expanding field, we present a hierarchically structured review of grid-centric perception for autonomous vehicles. We organize previous and current knowledge of occupancy grid techniques along the main vein from 2D BEV grids to 3D occupancy to 4D occupancy forecasting. We additionally summarize label-efficient occupancy learning and the role of grid-centric perception in driving systems. Lastly, we summarize current research trends and provide future outlooks.
Related papers
- A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective [20.798308029074786]
3D occupancy perception technology aims to observe and understand dense 3D environments for autonomous vehicles.
Similar to traditional bird's-eye view (BEV) perception, 3D occupancy perception has the nature of multi-source input and the necessity for information fusion.
arXiv Detail & Related papers (2024-05-08T16:10:46Z)
- Vision-based 3D occupancy prediction in autonomous driving: a review and outlook [19.939380586314673]
We introduce the background of vision-based 3D occupancy prediction and discuss the challenges in this task.
We conduct a comprehensive survey of the progress in vision-based 3D occupancy prediction from three aspects.
We present a summary of prevailing research trends and propose some inspiring future outlooks.
arXiv Detail & Related papers (2024-05-04T07:39:25Z)
- 3D Object Visibility Prediction in Autonomous Driving [6.802572869909114]
We present a novel attribute and its corresponding algorithm: 3D object visibility.
Our proposal of this attribute and its computational strategy aims to expand the capabilities for downstream tasks.
arXiv Detail & Related papers (2024-03-06T13:07:42Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- End-to-end Autonomous Driving: Challenges and Frontiers [45.391430626264764]
We provide a comprehensive analysis of more than 270 papers, covering the motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving.
We delve into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, amongst others.
We discuss current advancements in foundation models and visual pre-training, as well as how to incorporate these techniques within the end-to-end driving framework.
arXiv Detail & Related papers (2023-06-29T14:17:24Z)
- Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling [96.31941517446859]
We propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for the policy pretraining in visuomotor driving.
We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos.
In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input.
In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only.
arXiv Detail & Related papers (2023-01-03T08:52:49Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- Predicting Future Occupancy Grids in Dynamic Environment with Spatio-Temporal Learning [63.25627328308978]
We propose a spatio-temporal prediction network pipeline to generate future occupancy predictions.
Compared to current SOTA, our approach predicts occupancy for a longer horizon of 3 seconds.
We publicly release our grid occupancy dataset based on nuScenes to support further research.
arXiv Detail & Related papers (2022-05-06T13:45:32Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Radar-based Dynamic Occupancy Grid Mapping and Object Detection [55.74894405714851]
In recent years, the classical occupancy grid map approach has been extended to dynamic occupancy grid maps.
This paper presents the further development of a previous approach.
The data of multiple radar sensors are fused, and a grid-based object tracking and mapping method is applied.
arXiv Detail & Related papers (2020-08-09T09:26:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.