Vision-Based Environmental Perception for Autonomous Driving
- URL: http://arxiv.org/abs/2212.11453v1
- Date: Thu, 22 Dec 2022 01:59:58 GMT
- Title: Vision-Based Environmental Perception for Autonomous Driving
- Authors: Fei Liu, Zihao Lu, Xianke Lin
- Abstract summary: Visual perception plays an important role in autonomous driving.
Recent deep learning-based methods offer better reliability and processing speed.
A monocular camera uses image data from a single viewpoint to estimate object depth.
Simultaneous Localization and Mapping (SLAM) can establish a model of the road environment.
- Score: 4.138893879750758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual perception plays an important role in autonomous driving. One of the primary tasks is object detection and identification. Since the vision sensor is rich in color and texture information, it can quickly and accurately identify various types of road information. The commonly used technique is based on extracting and calculating various features of the image. Recently developed deep learning-based methods offer better reliability and processing speed and have a greater advantage in recognizing complex elements. For depth estimation, vision sensors are also used for ranging due to their small size and low cost. A monocular camera uses image data from a single viewpoint as input to estimate object depth. In contrast, stereo vision is based on parallax and the matching of feature points between different views, and the application of deep learning further improves its accuracy. In addition, Simultaneous Localization and Mapping (SLAM) can establish a model of the road environment, thus helping the vehicle perceive its surroundings and complete its tasks. In this paper, we introduce and compare various methods of object detection and identification, then explain the development of depth estimation and compare various methods based on monocular, stereo, and RGB-D sensors, next review and compare various methods of SLAM, and finally summarize the current problems and present the future development trends of vision technologies.
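As a concrete illustration of the parallax-based ranging the abstract describes, the sketch below converts a stereo disparity map into metric depth using the standard relation Z = f * B / d. This is a minimal example under assumed conditions, not the paper's implementation: the file names, focal length, and baseline are placeholder values, and a rectified stereo pair is assumed.

```python
# Minimal stereo depth-from-disparity sketch (illustrative only).
# Assumes a rectified stereo pair; focal length and baseline are assumed values.
import cv2
import numpy as np

FOCAL_PX = 700.0    # focal length in pixels (assumed calibration value)
BASELINE_M = 0.12   # distance between the two cameras in meters (assumed)

# Hypothetical input file names for the left and right views.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching finds, for each pixel, the horizontal shift (disparity)
# between the two views; a larger disparity means a closer object.
matcher = cv2.StereoBM_create(numDisparities=96, blockSize=15)
# StereoBM returns fixed-point disparities scaled by 16; convert to pixels.
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

# Triangulation from parallax: Z = f * B / d, valid only where disparity > 0.
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = FOCAL_PX * BASELINE_M / disparity[valid]
```

Deep learning-based stereo methods replace the hand-crafted block matching step with a learned matching network, but the final disparity-to-depth conversion follows the same geometric relation.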
Related papers
- A Comprehensive Review of 3D Object Detection in Autonomous Driving: Technological Advances and Future Directions [11.071271817366739]
3D object perception has become a crucial component in the development of autonomous driving systems.
This review extensively summarizes traditional 3D object detection methods, focusing on camera-based, LiDAR-based, and fusion detection techniques.
We discuss future directions, including methods to improve accuracy such as temporal perception, occupancy grids, and end-to-end learning frameworks.
arXiv Detail & Related papers (2024-08-28T01:08:33Z)
- How to deal with glare for improved perception of Autonomous Vehicles [0.0]
Vision sensors are versatile and can capture a wide range of visual cues, such as color, texture, shape, and depth.
However, vision-based environment perception systems can be easily affected by glare in the presence of a bright light source.
arXiv Detail & Related papers (2024-04-17T02:05:05Z)
- Cohere3D: Exploiting Temporal Coherence for Unsupervised Representation Learning of Vision-based Autonomous Driving [73.3702076688159]
We propose a novel contrastive learning algorithm, Cohere3D, to learn coherent instance representations in a long-term input sequence.
We evaluate our algorithm by finetuning the pretrained model on various downstream perception, prediction, and planning tasks.
arXiv Detail & Related papers (2024-02-23T19:43:01Z)
- Camera-Radar Perception for Autonomous Vehicles and ADAS: Concepts, Datasets and Metrics [77.34726150561087]
This work presents a study of the current landscape of camera- and radar-based perception for ADAS and autonomous vehicles.
Concepts and characteristics related to both sensors, as well as to their fusion, are presented.
We give an overview of the Deep Learning-based detection and segmentation tasks, and the main datasets, metrics, challenges, and open questions in vehicle perception.
arXiv Detail & Related papers (2023-03-08T00:48:32Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird's-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents [49.904531485843464]
In this paper, we discuss the main challenge: insufficient, or even no, labeled data for real-world indoor environments.
We describe MMISM (Multi-modality input Multi-task output Indoor Scene understanding Model) to tackle the above challenges.
MMISM considers RGB images as well as sparse Lidar points as inputs and 3D object detection, depth completion, human pose estimation, and semantic segmentation as output tasks.
We show that MMISM performs on par or even better than single-task models.
arXiv Detail & Related papers (2022-09-27T04:49:19Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Comparative study of 3D object detection frameworks based on LiDAR data and sensor fusion techniques [0.0]
The perception system plays a significant role in providing an accurate interpretation of a vehicle's environment in real-time.
Deep learning techniques transform the huge amount of data from the sensors into semantic information.
3D object detection methods, by utilizing the additional pose data from sensors such as LiDAR and stereo cameras, provide information on the size and location of the object.
arXiv Detail & Related papers (2022-02-05T09:34:58Z)
- Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z)
- OmniDet: Surround View Cameras based Multi-task Visual Perception Network for Autonomous Driving [10.3540046389057]
This work presents a multi-task visual perception network on unrectified fisheye images.
It consists of six primary tasks necessary for an autonomous driving system.
We demonstrate that the jointly trained model performs better than the respective single task versions.
arXiv Detail & Related papers (2021-02-15T10:46:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.