Camera-Lidar Integration: Probabilistic sensor fusion for semantic mapping
- URL: http://arxiv.org/abs/2007.05490v1
- Date: Thu, 9 Jul 2020 07:59:39 GMT
- Title: Camera-Lidar Integration: Probabilistic sensor fusion for semantic mapping
- Authors: Julie Stephany Berrio, Mao Shan, Stewart Worrall, Eduardo Nebot
- Abstract summary: An automated vehicle must be able to perceive and recognise objects and obstacles in a three-dimensional world while navigating in a constantly changing environment.
We present a probabilistic pipeline that incorporates uncertainties from the sensor readings (cameras, lidar, IMU and wheel encoders), compensation for the motion of the vehicle, and label probabilities for the semantic images.
- Score: 8.18198392834469
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An automated vehicle operating in an urban environment must be able to
perceive and recognise objects and obstacles in a three-dimensional world while
navigating in a constantly changing environment. In order to plan and execute
accurate sophisticated driving maneuvers, a high-level contextual understanding
of the surroundings is essential. Due to the recent progress in image
processing, it is now possible to obtain high definition semantic information
in 2D from monocular cameras, though cameras cannot reliably provide the highly
accurate 3D information provided by lasers. The fusion of these two sensor
modalities can overcome the shortcomings of each individual sensor, though
there are a number of important challenges that need to be addressed in a
probabilistic manner. In this paper, we address the common, yet challenging,
lidar/camera/semantic fusion problems which are seldom approached in a wholly
probabilistic manner. Our approach is capable of using a multi-sensor platform
to build a three-dimensional semantic voxelized map that considers the
uncertainty of all of the processes involved. We present a probabilistic
pipeline that incorporates uncertainties from the sensor readings (cameras,
lidar, IMU and wheel encoders), compensation for the motion of the vehicle, and
heuristic label probabilities for the semantic images. We also present a novel
and efficient viewpoint validation algorithm to check for occlusions from the
camera frames. A probabilistic projection is performed from the camera images
to the lidar point cloud. Each labelled lidar scan then feeds into an octree
map building algorithm that updates the class probabilities of the map voxels
every time a new observation is available. We validate our approach using a set
of qualitative and quantitative experimental tests on the USyd Dataset.
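The map update described in the abstract boils down to a recursive Bayesian update of a per-voxel categorical distribution over semantic classes. The sketch below is a minimal illustration under that assumption (plain numpy, no octree bookkeeping, ray casting or motion compensation); it is not the authors' implementation.

```python
import numpy as np

def update_voxel(prior: np.ndarray, observation: np.ndarray) -> np.ndarray:
    """Recursive Bayesian update of one voxel's class distribution.

    prior       -- current per-class probabilities stored in the voxel
    observation -- per-class label probabilities of a new labelled lidar
                   point falling inside the voxel (e.g. softmax scores
                   projected from the semantic image)
    """
    posterior = prior * observation      # element-wise Bayes numerator
    return posterior / posterior.sum()   # renormalise to a distribution

# Example with three classes (road, vegetation, building): an uncertain
# voxel drifts towards "road" after one confident observation.
voxel = np.array([0.4, 0.3, 0.3])
point_label = np.array([0.8, 0.1, 0.1])
voxel = update_voxel(voxel, point_label)
print(voxel)  # -> approximately [0.84, 0.08, 0.08]
```

Repeating this update every time a new labelled scan arrives is what keeps the voxel map's class probabilities consistent with the accumulated observations.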
Related papers
- Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System [0.0]
We propose a novel approach to address the problem of camera and radar sensor fusion for 3D object detection in autonomous vehicle perception systems.
Our approach builds on recent advances in deep learning and leverages the strengths of both sensors to improve object detection performance.
Our results show that the proposed approach achieves superior performance over single-sensor solutions and could directly compete with other top-level fusion methods.
arXiv Detail & Related papers (2024-04-25T12:04:31Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization [13.473742114288616]
We propose a framework that can autonomously detect and localize objects in a known environment.
The framework consists of three key elements: understanding the environment through RGB data, estimating depth through multi-modal sensor fusion, and managing artifacts.
Experiments show that the proposed framework can accurately detect 98% of the objects in the real sample environment, without post-processing.
arXiv Detail & Related papers (2023-07-03T15:51:39Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, the fusion for detection can be effectively performed by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Exploring Contextual Representation and Multi-Modality for End-to-End Autonomous Driving [58.879758550901364]
Recent perception systems enhance spatial understanding with sensor fusion but often lack full environmental context.
We introduce a framework that integrates three cameras to emulate the human field of view, coupled with top-down bird-eye-view semantic data to enhance contextual representation.
Our method achieves a displacement error of 0.67 m in open-loop settings, surpassing current methods by 6.9% on the nuScenes dataset.
arXiv Detail & Related papers (2022-10-13T05:56:20Z)
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models.
We benchmark the state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose SurroundDepth, a method that incorporates information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data [80.14669385741202]
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method does not require any point cloud or image annotations.
arXiv Detail & Related papers (2022-03-30T12:40:30Z)
- GenRadar: Self-supervised Probabilistic Camera Synthesis based on Radar Frequencies [12.707035083920227]
This work combines the complementary strengths of both sensor types in a unique self-learning fusion approach for a probabilistic scene reconstruction.
A proposed algorithm exploits similarities and establishes correspondences between both domains at different feature levels during training.
The resulting discrete tokens are finally transformed back into an instructive view of the surroundings, allowing potential dangers to be perceived visually.
arXiv Detail & Related papers (2021-07-19T15:00:28Z)
- CFTrack: Center-based Radar and Camera Fusion for 3D Multi-Object Tracking [9.62721286522053]
We propose an end-to-end network for joint object detection and tracking based on radar and camera sensor fusion.
Our proposed method uses a center-based radar-camera fusion algorithm for object detection and utilizes a greedy algorithm for object association (a generic sketch of such an association step follows this entry).
We evaluate our method on the challenging nuScenes dataset, where it achieves 20.0 AMOTA and outperforms all vision-based 3D tracking methods in the benchmark.
arXiv Detail & Related papers (2021-07-11T23:56:53Z)
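The greedy association step mentioned in the CFTrack entry can be illustrated with a generic center-distance matcher. This is an assumed simplification for illustration only, not the paper's exact algorithm: detections and existing tracks are paired in order of increasing center distance, subject to a distance gate.

```python
import numpy as np

def greedy_associate(track_centers, det_centers, max_dist=2.0):
    """Greedily match detections to tracks by center distance.

    Returns a list of (track_index, detection_index) pairs, taken in
    order of increasing distance; each track and detection is used at
    most once, and pairs farther apart than `max_dist` are rejected.
    """
    # Pairwise Euclidean distances between track and detection centers.
    dists = np.linalg.norm(
        track_centers[:, None, :] - det_centers[None, :, :], axis=-1)
    matches, used_t, used_d = [], set(), set()
    for flat in np.argsort(dists, axis=None):       # smallest distance first
        t, d = np.unravel_index(flat, dists.shape)
        if t in used_t or d in used_d or dists[t, d] > max_dist:
            continue
        matches.append((int(t), int(d)))
        used_t.add(t)
        used_d.add(d)
    return matches

tracks = np.array([[0.0, 0.0], [5.0, 5.0]])
detections = np.array([[4.8, 5.1], [0.3, -0.2]])
print(greedy_associate(tracks, detections))  # [(1, 0), (0, 1)]
```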
- Semantic sensor fusion: from camera to sparse lidar information [7.489722641968593]
This paper presents an approach to fusing different sensory information: Light Detection and Ranging (lidar) scans and camera images.
The transference of semantic information between the labelled image and the lidar point cloud is performed in four steps; a generic projection sketch follows this entry.
arXiv Detail & Related papers (2020-03-04T03:09:33Z)
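Both the last entry and the main paper above transfer per-pixel semantic labels to lidar points by projecting the points into the camera image. The following is a minimal pinhole-projection sketch with hypothetical calibration values; the actual pipelines additionally handle ego-motion compensation, lens distortion, label uncertainty and occlusion checks.

```python
import numpy as np

def project_to_image(points_lidar, T_cam_lidar, K):
    """Project Nx3 lidar points into pixel coordinates.

    points_lidar -- (N, 3) points in the lidar frame
    T_cam_lidar  -- (4, 4) extrinsic transform from lidar to camera frame
    K            -- (3, 3) camera intrinsic matrix
    Returns (N, 2) pixel coordinates and a boolean mask of points that
    lie in front of the camera (positive depth).
    """
    ones = np.ones((points_lidar.shape[0], 1))
    pts_h = np.hstack([points_lidar, ones])      # homogeneous coordinates
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]   # into the camera frame
    in_front = pts_cam[:, 2] > 0.0
    uvw = (K @ pts_cam.T).T                      # apply intrinsics
    uv = uvw[:, :2] / uvw[:, 2:3]                # perspective divide
    return uv, in_front

# Hypothetical calibration: camera looking along the lidar x-axis.
K = np.array([[700.0, 0.0, 640.0],
              [0.0, 700.0, 360.0],
              [0.0, 0.0, 1.0]])
T = np.array([[0.0, -1.0, 0.0, 0.0],   # lidar y -> camera -x
              [0.0, 0.0, -1.0, 0.0],   # lidar z -> camera -y
              [1.0, 0.0, 0.0, 0.0],    # lidar x -> camera z (depth)
              [0.0, 0.0, 0.0, 1.0]])
pts = np.array([[10.0, 0.5, -0.2]])    # one point 10 m ahead of the lidar
uv, valid = project_to_image(pts, T, K)
print(uv[valid])  # pixel coordinates where the point's label is sampled
```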
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.