Semantic sensor fusion: from camera to sparse lidar information
- URL: http://arxiv.org/abs/2003.01871v1
- Date: Wed, 4 Mar 2020 03:09:33 GMT
- Title: Semantic sensor fusion: from camera to sparse lidar information
- Authors: Julie Stephany Berrio, Mao Shan, Stewart Worrall, James Ward, Eduardo
Nebot
- Abstract summary: This paper presents an approach to fuse two different sources of sensory information: Light Detection and Ranging (lidar) scans and camera images.
The transfer of semantic information between the labelled image and the lidar point cloud is performed in four steps.
- Score: 7.489722641968593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To navigate through urban roads, an automated vehicle must be able to
perceive and recognize objects in a three-dimensional environment. A high-level
contextual understanding of the surroundings is necessary to plan and execute
accurate driving maneuvers. This paper presents an approach to fuse two
different sources of sensory information: Light Detection and Ranging (lidar)
scans and camera images. The output of a convolutional neural network (CNN) is
used as a classifier to obtain the labels of the environment. The
transfer of semantic information between the labelled image and the lidar point cloud is
performed in four steps: initially, we use heuristic methods to associate
probabilities to all the semantic classes contained in the labelled images.
Then, the lidar points are corrected to compensate for the vehicle's motion
given the difference between the timestamps of each lidar scan and camera
image. In the third step, we project each lidar point onto the corresponding
camera image to calculate its pixel coordinates. In the last step, we perform the transfer of semantic information
from the heuristic probability images to the lidar frame, while removing the
lidar information that is not visible to the camera. We tested our approach on
the Usyd Dataset \cite{usyd_dataset}, obtaining qualitative and quantitative
results that demonstrate the validity of our probabilistic sensory fusion
approach.
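
The four transfer steps above can be condensed into a short numerical sketch. The code below is only an illustrative outline, not the authors' implementation: the constant-velocity motion model, the pinhole projection, and all names (motion_correct, T_cam_lidar, K, prob_img, which stands in for the step-1 heuristic class-probability image) are assumptions introduced here.

```python
# Minimal sketch of the four-step camera-to-lidar label transfer described in
# the abstract. Everything here (constant-velocity motion model, pinhole
# projection, variable names) is an illustrative assumption, not the paper's code.
import numpy as np

def motion_correct(points, v_forward, yaw_rate, dt):
    """Step 2 (assumed constant-velocity model): move each lidar point to the
    camera timestamp, given the vehicle's forward speed, yaw rate and the
    time offset dt between the lidar scan and the image."""
    yaw = yaw_rate * dt
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    t = np.array([v_forward * dt, 0.0, 0.0])      # displacement along x (forward)
    return (R @ points.T).T + t

def project_to_image(points, T_cam_lidar, K):
    """Step 3: transform the corrected points into the camera frame with the
    4x4 extrinsic T_cam_lidar and project them with the 3x3 intrinsics K."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0.1                    # discard points behind the camera
    uv = (K @ cam.T).T[:, :2] / np.clip(cam[:, 2:3], 1e-6, None)
    return uv, in_front

def transfer_labels(points, uv, in_front, prob_img):
    """Step 4: copy the per-class probability vector of the pixel each visible
    point projects onto; prob_img is the step-1 heuristic probability image
    (H x W x num_classes). Points outside the image are removed."""
    h, w, _ = prob_img.shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    visible = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return points[visible], prob_img[v[visible], u[visible]]
```

Note that "not visible to the camera" is approximated here purely by field-of-view checks (behind the camera or outside the image bounds); the paper's own visibility handling may be more involved.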
Related papers
- Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System [0.0]
We propose a novel approach to address the problem of camera and radar sensor fusion for 3D object detection in autonomous vehicle perception systems.
Our approach builds on recent advances in deep learning and leverages the strengths of both sensors to improve object detection performance.
Our results show that the proposed approach achieves superior performance over single-sensor solutions and could directly compete with other top-level fusion methods.
arXiv Detail & Related papers (2024-04-25T12:04:31Z)
- Unleash the Potential of Image Branch for Cross-modal 3D Object Detection [67.94357336206136]
We present a new cross-modal 3D object detector, namely UPIDet, which aims to unleash the potential of the image branch in two ways.
First, UPIDet introduces a new 2D auxiliary task called normalized local coordinate map estimation.
Second, we discover that the representational capability of the point cloud backbone can be enhanced through the gradients backpropagated from the training objectives of the image branch.
arXiv Detail & Related papers (2023-01-22T08:26:58Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network [70.53093934205057]
The 3D object detection task from lidar or camera sensors is essential for autonomous driving.
We propose a novel semantic passing framework, named SPNet, to boost the performance of existing lidar-based 3D detection models.
arXiv Detail & Related papers (2022-07-12T12:35:34Z)
- Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data [80.14669385741202]
We propose a self-supervised pre-training method for 3D perception models tailored to autonomous driving data.
We leverage the availability of synchronized and calibrated image and Lidar sensors in autonomous driving setups.
Our method does not require any point cloud or image annotations.
arXiv Detail & Related papers (2022-03-30T12:40:30Z)
- Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation [32.33170182669095]
This work investigates learning pixel-wise semantic image segmentation in urban scenes without any manual annotation, just from the raw non-curated data collected by cars.
We propose a novel method for cross-modal unsupervised learning of semantic image segmentation by leveraging synchronized LiDAR and image data.
arXiv Detail & Related papers (2022-03-21T17:35:46Z)
- Content-Based Detection of Temporal Metadata Manipulation [91.34308819261905]
We propose an end-to-end approach to verify whether the purported time of capture of an image is consistent with its content and geographic location.
The central idea is the use of supervised consistency verification, in which we predict the probability that the image content, capture time, and geographical location are consistent.
Our approach improves upon previous work on a large benchmark dataset, increasing the classification accuracy from 59.03% to 81.07%.
arXiv Detail & Related papers (2021-03-08T13:16:19Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- YOdar: Uncertainty-based Sensor Fusion for Vehicle Detection with Camera and Radar Sensors [4.396860522241306]
We present an uncertainty-based method for sensor fusion with camera and radar data.
In our experiments we combine the YOLOv3 object detection network with a customized 1D radar segmentation network.
Our experiments show that this uncertainty-aware fusion approach significantly outperforms single-sensor baselines (a toy illustration of this style of late fusion is sketched at the end of this list).
arXiv Detail & Related papers (2020-10-07T10:40:02Z)
- Camera-Lidar Integration: Probabilistic sensor fusion for semantic mapping [8.18198392834469]
An automated vehicle must be able to perceive and recognise objects/obstacles in a three-dimensional world while navigating in a constantly changing environment.
We present a probabilistic pipeline that incorporates uncertainties from the sensor readings (cameras, lidar, IMU and wheel encoders), compensation for the motion of the vehicle, and label probabilities for the semantic images.
arXiv Detail & Related papers (2020-07-09T07:59:39Z)
- A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird's Eye View [0.0]
Distances can be more easily estimated when the camera perspective is transformed to a bird's eye view (BEV).
This paper describes a methodology to obtain a corrected 360° BEV image given images from multiple vehicle-mounted cameras.
The neural network approach does not rely on manually labeled data, but is trained on a synthetic dataset in such a way that it generalizes well to real-world data (the classical flat-ground IPM baseline is sketched after this list for comparison).
arXiv Detail & Related papers (2020-05-08T14:54:13Z)
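
The last entry above learns a corrected camera-to-BEV mapping with a network trained on synthetic data. For comparison only, the sketch below shows the classical flat-ground inverse perspective mapping (IPM) that purely geometric camera-to-BEV warping relies on; it is not the paper's method, and every camera parameter is an invented assumption.

```python
# Classical flat-ground IPM: warp a front-camera image onto a metric bird's
# eye view grid, assuming every pixel lies on the ground plane z = 0.
import numpy as np
import cv2

def ipm_matrix(K, R, t, m_per_px, bev_w, bev_h):
    """Homography mapping camera pixels to BEV pixels.
    K: 3x3 intrinsics; R, t: world(ground)-to-camera extrinsics."""
    # World->image homography restricted to the ground plane z = 0: K [r1 r2 t]
    H_world_to_img = K @ np.column_stack([R[:, 0], R[:, 1], t])
    # BEV pixel (u, v) -> ground metres (x forward, y left), vehicle at the
    # bottom centre of the BEV image.
    G = np.array([[0.0,       -m_per_px, m_per_px * bev_h],
                  [-m_per_px,  0.0,      m_per_px * bev_w / 2.0],
                  [0.0,        0.0,      1.0]])
    # Camera image -> BEV pixels (cv2.warpPerspective expects the src->dst map).
    return np.linalg.inv(H_world_to_img @ G)

def warp_to_bev(image, M, bev_w, bev_h):
    """Resample the camera image onto the BEV grid (flat-world assumption)."""
    return cv2.warpPerspective(image, M, (bev_w, bev_h))

# Invented example: 1280x720 camera, level and forward-looking, 1.6 m above ground.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]])  # world -> camera
t = np.array([0.0, 1.6, 0.0])
M = ipm_matrix(K, R, t, m_per_px=0.05, bev_w=400, bev_h=400)  # 20 m x 20 m BEV patch
```

Because every pixel is assumed to lie on the ground, anything with height (vehicles, pedestrians) gets smeared in the warped view; this is the kind of error that learned BEV approaches aim to avoid.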
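
The YOdar entry above fuses camera and radar detections while accounting for their uncertainties, which is also the broader theme of the main paper. As a toy illustration of that general idea (not of YOdar's actual aggregation scheme), the snippet below combines two independently obtained detection probabilities for the same region with Bayes' rule.

```python
# Toy uncertainty-aware late fusion: combine a camera detection probability and
# a radar detection probability for the same image region / angular slice.
# Generic Bayes-rule illustration only; not the method of any paper listed above.

def fuse_independent(p_cam: float, p_radar: float, prior: float = 0.5) -> float:
    """Posterior that an object is present, assuming each sensor score is a
    posterior obtained independently from the same prior."""
    prior_odds = prior / (1.0 - prior)
    lr_cam = (p_cam / (1.0 - p_cam)) / prior_odds        # camera likelihood ratio
    lr_radar = (p_radar / (1.0 - p_radar)) / prior_odds  # radar likelihood ratio
    post_odds = prior_odds * lr_cam * lr_radar
    return post_odds / (1.0 + post_odds)

# A weak camera detection backed by a confident radar return becomes a
# confident fused detection, while two weak cues stay weak.
print(fuse_independent(0.55, 0.90))   # ~0.92
print(fuse_independent(0.55, 0.55))   # ~0.60
```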