Analyzing General-Purpose Deep-Learning Detection and Segmentation
Models with Images from a Lidar as a Camera Sensor
- URL: http://arxiv.org/abs/2203.04064v1
- Date: Tue, 8 Mar 2022 13:14:43 GMT
- Title: Analyzing General-Purpose Deep-Learning Detection and Segmentation
Models with Images from a Lidar as a Camera Sensor
- Authors: Yu Xianjia, Sahar Salimpour, Jorge Peña Queralta, Tomi Westerlund
- Abstract summary: This work explores the potential of general-purpose DL perception algorithms for processing image-like outputs of advanced lidar sensors.
Rather than processing the three-dimensional point cloud data, this is, to the best of our knowledge, the first work to focus on low-resolution images with a 360° field of view.
We show that with adequate preprocessing, general-purpose DL models can process these images, opening the door to their use in environmental conditions where vision sensors present inherent limitations.
- Score: 0.06554326244334865
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Over the last decade, robotic perception algorithms have significantly
benefited from the rapid advances in deep learning (DL). Indeed, a significant
amount of the autonomy stack of different commercial and research platforms
relies on DL for situational awareness, especially vision sensors. This work
explores the potential of general-purpose DL perception algorithms,
specifically detection and segmentation neural networks, for processing
image-like outputs of advanced lidar sensors. Rather than processing the
three-dimensional point cloud data, this is, to the best of our knowledge, the
first work to focus on low-resolution images with a 360° field of view
obtained with lidar sensors by encoding either depth, reflectivity, or
near-infrared light in the image pixels. We show that with adequate
preprocessing, general-purpose DL models can process these images, opening the
door to their usage in environmental conditions where vision sensors present
inherent limitations. We provide both a qualitative and quantitative analysis
of the performance of a variety of neural network architectures. We believe
that using DL models built for visual cameras offers significant advantages,
owing to their much wider availability and maturity compared to point
cloud-based perception.
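
To make the idea concrete, here is a minimal sketch of the pipeline the abstract describes: three lidar channels (depth, reflectivity, near-infrared) are normalized into a camera-like RGB frame and handed to an off-the-shelf COCO-pretrained detector. The 128x2048 frame size, the percentile normalization, and the choice of torchvision's Mask R-CNN are illustrative assumptions, not the exact preprocessing or models evaluated in the paper.

```python
import numpy as np
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

def lidar_channels_to_rgb(depth, reflectivity, nir, lo=1.0, hi=99.0):
    """Encode three lidar channels as a float32 RGB image in [0, 1].

    Each input is an (H, W) array, as produced by lidars that expose
    image-like output (e.g. 128 beams x 2048 columns per revolution).
    Percentile clipping tames the heavy-tailed raw values before scaling.
    """
    def norm(ch):
        ch = ch.astype(np.float32)
        p_lo, p_hi = np.percentile(ch, [lo, hi])
        return np.clip((ch - p_lo) / max(p_hi - p_lo, 1e-6), 0.0, 1.0)
    return np.stack([norm(depth), norm(reflectivity), norm(nir)], axis=0)

# Random stand-in for one sensor revolution (assumed 128 x 2048 resolution).
H, W = 128, 2048
rng = np.random.default_rng(0)
frame = lidar_channels_to_rgb(
    rng.integers(0, 2**16, (H, W)),  # depth / range channel
    rng.integers(0, 2**16, (H, W)),  # reflectivity channel
    rng.integers(0, 2**16, (H, W)),  # near-infrared channel
)

# Off-the-shelf detector trained on ordinary camera images (COCO).
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()
with torch.no_grad():
    pred = model([torch.from_numpy(frame)])[0]

keep = pred["scores"] > 0.5
print(pred["boxes"][keep], pred["labels"][keep])
```

In practice the 360° panorama could also be split into overlapping crops closer to the aspect ratios such detectors were trained on; the paper's quantitative analysis compares a variety of architectures rather than this single model.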
Related papers
- BenchCloudVision: A Benchmark Analysis of Deep Learning Approaches for
Cloud Detection and Segmentation in Remote Sensing Imagery [0.0]
This paper examines seven cutting-edge semantic segmentation and detection algorithms applied to cloud identification.
To increase the model's adaptability, critical elements including the type of imagery and the number of spectral bands used during training are analyzed.
The research aims to produce machine learning algorithms that can perform cloud segmentation using only a few spectral bands.
arXiv Detail & Related papers (2024-02-21T16:32:43Z)
- VirtualPainting: Addressing Sparsity with Virtual Points and
Distance-Aware Data Augmentation for 3D Object Detection [3.5259183508202976]
We present an innovative approach that involves the generation of virtual LiDAR points using camera images.
We also enhance these virtual points with semantic labels obtained from image-based segmentation networks.
Our approach offers a versatile solution that can be seamlessly integrated into various 3D frameworks and 2D semantic segmentation methods.
arXiv Detail & Related papers (2023-12-26T18:03:05Z)
- PointHPS: Cascaded 3D Human Pose and Shape Estimation from Point Clouds [99.60575439926963]
We propose a principled framework, PointHPS, for accurate 3D HPS from point clouds captured in real-world settings.
PointHPS iteratively refines point features through a cascaded architecture.
Extensive experiments demonstrate that PointHPS, with its powerful point feature extraction and processing scheme, outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-08-28T11:10:14Z)
- DensePose From WiFi [86.61881052177228]
We develop a deep neural network that maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions.
Our model can estimate the dense pose of multiple subjects, with comparable performance to image-based approaches.
arXiv Detail & Related papers (2022-12-31T16:48:43Z)
- Ponder: Point Cloud Pre-training via Neural Rendering [93.34522605321514]
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural rendering.
The learned point-cloud representation can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but also low-level tasks like 3D reconstruction and image rendering.
arXiv Detail & Related papers (2022-12-31T08:58:39Z)
- Depth Monocular Estimation with Attention-based Encoder-Decoder Network
from Single Image [7.753378095194288]
Vision-based approaches have recently received much attention and can overcome the drawbacks of dedicated depth sensors.
In this work, we explore an extreme scenario in vision-based settings: estimating a depth map from a single monocular image severely plagued by grid artifacts and blurry edges.
Our novel approach can find the focus of the current image with minimal overhead and avoid losses of depth features.
arXiv Detail & Related papers (2022-10-24T23:01:25Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and without relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
- Neural Ray Surfaces for Self-Supervised Learning of Depth and Ego-motion [51.19260542887099]
We show that self-supervision can be used to learn accurate depth and ego-motion estimation without prior knowledge of the camera model.
Inspired by the geometric model of Grossberg and Nayar, we introduce Neural Ray Surfaces (NRS), convolutional networks that represent pixel-wise projection rays.
We demonstrate the use of NRS for self-supervised learning of visual odometry and depth estimation from raw videos obtained using a wide variety of camera systems.
arXiv Detail & Related papers (2020-08-15T02:29:13Z)
- View Invariant Human Body Detection and Pose Estimation from Multiple
Depth Sensors [0.7080990243618376]
We propose an end-to-end multi-person 3D pose estimation network, Point R-CNN, using multiple point cloud sources.
We conduct extensive experiments to simulate challenging real world cases, such as individual camera failures, various target appearances, and complex cluttered scenes.
At the same time, we show that our end-to-end network greatly outperforms cascaded state-of-the-art models.
arXiv Detail & Related papers (2020-05-08T19:06:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.