Multi-modal Fusion Technology based on Vehicle Information: A Survey
- URL: http://arxiv.org/abs/2211.06080v1
- Date: Fri, 11 Nov 2022 09:25:53 GMT
- Title: Multi-modal Fusion Technology based on Vehicle Information: A Survey
- Authors: Yan Gong, Jianli Lu, Jiayi Wu, Wenzhuo Liu
- Abstract summary: The current multi-modal fusion methods mainly focus on camera data and LiDAR data, but pay little attention to the kinematic information provided by the bottom sensors of the vehicle.
This information is not affected by complex external scenes, making it more robust and reliable.
New ideas for future multi-modal fusion technology in autonomous driving tasks are proposed to promote further use of vehicle bottom information.
- Score: 0.7646713951724012
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-modal fusion is a basic task of autonomous driving system perception,
which has attracted many scholars' interest in recent years. The current
multi-modal fusion methods mainly focus on camera data and LiDAR data, but pay
little attention to the kinematic information provided by the bottom sensors of
the vehicle, such as acceleration, vehicle speed, and rotation angle. This
information is not affected by complex external scenes and is therefore more
robust and reliable. In this paper, we introduce the existing application fields of
vehicle bottom information and the research progress of related methods, as
well as the multi-modal fusion methods based on bottom information. We also
describe the relevant vehicle bottom information datasets in detail to
facilitate further research. In addition, new
future ideas of multi-modal fusion technology for autonomous driving tasks are
proposed to promote the further utilization of vehicle bottom information.
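To make the fusion idea concrete, the sketch below shows one plausible way to combine features from a camera/LiDAR perception backbone with encoded vehicle-bottom kinematic signals (speed, acceleration, steering angle). The module names, dimensions, and late-fusion design are illustrative assumptions, not the architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class KinematicFusionHead(nn.Module):
    """Illustrative late-fusion head: concatenates camera/LiDAR features with
    encoded vehicle-bottom kinematic signals before a task-specific layer."""

    def __init__(self, perception_dim=256, kinematic_dim=3, hidden_dim=64, out_dim=10):
        super().__init__()
        # Small MLP encodes the low-dimensional kinematic vector.
        self.kinematic_encoder = nn.Sequential(
            nn.Linear(kinematic_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # Joint head consumes the concatenated perception + kinematic features.
        self.head = nn.Linear(perception_dim + hidden_dim, out_dim)

    def forward(self, perception_feat, kinematics):
        # perception_feat: (B, perception_dim) from a camera/LiDAR backbone
        # kinematics:      (B, 3) = [speed, acceleration, rotation angle]
        k = self.kinematic_encoder(kinematics)
        return self.head(torch.cat([perception_feat, k], dim=-1))

# Example: batch of 4 samples with pre-extracted perception features.
fused = KinematicFusionHead()(torch.randn(4, 256), torch.randn(4, 3))
print(fused.shape)  # torch.Size([4, 10])
```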
Related papers
- Foundations and Recent Trends in Multimodal Mobile Agents: A Survey [57.677161006710065]
Mobile agents are essential for automating tasks in complex and dynamic mobile environments.
Recent advancements enhance real-time adaptability and multimodal interaction.
We categorize these advancements into two main approaches: prompt-based methods and training-based methods.
arXiv Detail & Related papers (2024-11-04T11:50:58Z) - A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving [9.962648957398923]
This paper focuses on a comprehensive survey of radar-vision (RV) fusion based on deep learning methods for 3D object detection in autonomous driving.
For end-to-end fusion, currently the most promising fusion strategy, we provide a deeper classification of methods, including 3D bounding box prediction-based and BEV-based approaches.
arXiv Detail & Related papers (2024-06-02T11:37:50Z) - G-MEMP: Gaze-Enhanced Multimodal Ego-Motion Prediction in Driving [71.9040410238973]
We focus on inferring the ego trajectory of a driver's vehicle using their gaze data.
We then develop G-MEMP, a novel multimodal ego-trajectory prediction network that combines GPS and video input with gaze data.
The results show that G-MEMP significantly outperforms state-of-the-art methods in both benchmarks.
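As a rough illustration of how GPS, video, and gaze streams might be fused for ego-trajectory prediction, here is a minimal sketch; the encoders, dimensions, and decoder are hypothetical and are not the authors' G-MEMP architecture.

```python
import torch
import torch.nn as nn

class MultimodalEgoTrajectoryPredictor(nn.Module):
    """Hypothetical sketch: separate encoders for GPS, video, and gaze streams,
    fused by concatenation and decoded into future (x, y) waypoints."""

    def __init__(self, video_dim=512, gps_dim=2, gaze_dim=2, hidden=128, horizon=12):
        super().__init__()
        self.horizon = horizon
        self.gps_enc = nn.GRU(gps_dim, hidden, batch_first=True)
        self.gaze_enc = nn.GRU(gaze_dim, hidden, batch_first=True)
        self.video_proj = nn.Linear(video_dim, hidden)   # pre-extracted video features
        self.decoder = nn.Linear(3 * hidden, horizon * 2)

    def forward(self, gps_seq, gaze_seq, video_feat):
        # gps_seq: (B, T, 2), gaze_seq: (B, T, 2), video_feat: (B, video_dim)
        _, h_gps = self.gps_enc(gps_seq)
        _, h_gaze = self.gaze_enc(gaze_seq)
        fused = torch.cat([h_gps[-1], h_gaze[-1], self.video_proj(video_feat)], dim=-1)
        return self.decoder(fused).view(-1, self.horizon, 2)  # future waypoints

traj = MultimodalEgoTrajectoryPredictor()(
    torch.randn(4, 20, 2), torch.randn(4, 20, 2), torch.randn(4, 512))
print(traj.shape)  # torch.Size([4, 12, 2])
```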
arXiv Detail & Related papers (2023-12-13T23:06:30Z) - Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future [130.87142103774752]
This review systematically assesses over seventy open-source autonomous driving datasets.
It offers insights into various aspects, such as the principles underlying the creation of high-quality datasets.
It also delves into the scientific and technical challenges that warrant resolution.
arXiv Detail & Related papers (2023-12-06T10:46:53Z) - LLM4Drive: A Survey of Large Language Models for Autonomous Driving [62.10344445241105]
Large language models (LLMs) have demonstrated abilities including understanding context, logical reasoning, and generating answers.
In this paper, we systematically review a research line about Large Language Models for Autonomous Driving (LLM4AD).
arXiv Detail & Related papers (2023-11-02T07:23:33Z) - HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for
Autonomous Driving [95.42203932627102]
3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians.
Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages.
Our method makes efficient use of these complementary signals in a semi-supervised fashion and outperforms existing methods by a large margin.
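A minimal sketch of pixel-aligned LiDAR feature sampling followed by Transformer refinement is given below; the feature dimensions and module choices are assumptions for illustration, not the HUM3DIL implementation.

```python
import torch
import torch.nn as nn

def pixel_aligned_lidar_features(points_uv, point_feats, image_feat_map):
    """Sample image features at the projected (u, v) pixel locations of LiDAR
    points and concatenate them with the per-point features.
    points_uv: (B, N, 2) normalized to [-1, 1]; image_feat_map: (B, C, H, W)."""
    grid = points_uv.unsqueeze(2)                          # (B, N, 1, 2) sampling grid
    sampled = nn.functional.grid_sample(
        image_feat_map, grid, align_corners=False)         # (B, C, N, 1)
    sampled = sampled.squeeze(-1).permute(0, 2, 1)         # (B, N, C)
    return torch.cat([point_feats, sampled], dim=-1)       # pixel-aligned fusion

# Illustrative refinement: a small Transformer encoder over the fused point tokens.
fused = pixel_aligned_lidar_features(
    torch.rand(2, 100, 2) * 2 - 1, torch.randn(2, 100, 3), torch.randn(2, 64, 32, 32))
refiner = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=67, nhead=1, batch_first=True), num_layers=2)
refined = refiner(fused)  # (2, 100, 67) refined multi-modal point features
```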
arXiv Detail & Related papers (2022-12-15T11:15:14Z) - Multi-modal Sensor Fusion for Auto Driving Perception: A Survey [22.734013343067407]
We provide a literature review of the existing multi-modal-based methods for perception tasks in autonomous driving.
We propose an innovative taxonomy that divides them into two major classes and four minor classes from the perspective of the fusion stage.
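To illustrate what "fusion stage" means in such taxonomies, the toy sketch below contrasts early (feature-level) and late (decision-level) fusion; all modules and dimensions are placeholders rather than methods from the survey.

```python
import torch
import torch.nn as nn

# The same camera and LiDAR inputs can be combined before feature extraction
# (early fusion) or only at the prediction level (late fusion).
cam, lidar = torch.randn(4, 128), torch.randn(4, 128)

# Early fusion: concatenate modalities, then run one joint network.
early_net = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 10))
early_out = early_net(torch.cat([cam, lidar], dim=-1))

# Late fusion: run separate per-modality networks, then combine their outputs.
cam_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
lidar_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
late_out = (cam_net(cam) + lidar_net(lidar)) / 2
```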
arXiv Detail & Related papers (2022-02-06T04:18:45Z) - OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with
Vehicle-to-Vehicle Communication [13.633468133727]
We present the first large-scale open simulated dataset for Vehicle-to-Vehicle perception.
It contains over 70 interesting scenes, 11,464 frames, and 232,913 annotated 3D vehicle bounding boxes.
arXiv Detail & Related papers (2021-09-16T00:52:41Z) - Multi-Modal 3D Object Detection in Autonomous Driving: a Survey [10.913958563906931]
Self-driving cars are equipped with a suite of sensors to conduct robust and accurate environment perception.
As the number and type of sensors keep increasing, combining them for better perception is becoming a natural trend.
This survey is devoted to reviewing recent fusion-based 3D detection deep learning models that leverage multiple sensor data sources.
arXiv Detail & Related papers (2021-06-24T02:52:12Z) - One Million Scenes for Autonomous Driving: ONCE Dataset [91.94189514073354]
We introduce the ONCE dataset for 3D object detection in the autonomous driving scenario.
The data is selected from 144 driving hours, which is 20x longer than the largest 3D autonomous driving dataset available.
We reproduce and evaluate a variety of self-supervised and semi-supervised methods on the ONCE dataset.
arXiv Detail & Related papers (2021-06-21T12:28:08Z) - Deep Learning for Image and Point Cloud Fusion in Autonomous Driving: A
Review [15.10767676137607]
Camera-LiDAR fusion is an emerging research theme.
This paper reviews recent deep-learning-based data fusion approaches that leverage both image and point cloud.
We identify gaps and overlooked challenges between current academic research and real-world applications.
arXiv Detail & Related papers (2020-04-10T20:43:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.