Related papers: Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving?

Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving?

URL: http://arxiv.org/abs/2410.08365v1
Date: Thu, 10 Oct 2024 20:47:33 GMT
Title: Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving?
Authors: Samir Abou Haidar, Alexandre Chariot, Mehdi Darouich, Cyril Joly, Jean-Emmanuel Deschaud,
Abstract summary: Scene semantic segmentation can be achieved by directly integrating 3D spatial data with specialized deep neural networks. We investigate various 3D semantic segmentation methodologies and analyze their performance and capabilities for resource-constrained inference on embedded NVIDIA Jetson platforms.
Score: 42.348499880894686
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Within a perception framework for autonomous mobile and robotic systems, semantic analysis of 3D point clouds typically generated by LiDARs is key to numerous applications, such as object detection and recognition, and scene reconstruction. Scene semantic segmentation can be achieved by directly integrating 3D spatial data with specialized deep neural networks. Although this type of data provides rich geometric information regarding the surrounding environment, it also presents numerous challenges: its unstructured and sparse nature, its unpredictable size, and its demanding computational requirements. These characteristics hinder the real-time semantic analysis, particularly on resource-constrained hardware architectures that constitute the main computational components of numerous robotic applications. Therefore, in this paper, we investigate various 3D semantic segmentation methodologies and analyze their performance and capabilities for resource-constrained inference on embedded NVIDIA Jetson platforms. We evaluate them for a fair comparison through a standardized training protocol and data augmentations, providing benchmark results on the Jetson AGX Orin and AGX Xavier series for two large-scale outdoor datasets: SemanticKITTI and nuScenes.

Related papers

R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation [74.41728218960465]
We propose a real-to-real 3D data generation framework (R2RGen) that directly augments the pointcloud observation-action pairs to generate real-world data.<n>R2RGen substantially enhances data efficiency on extensive experiments and demonstrates strong potential for scaling and application on mobile manipulation.
arXiv Detail & Related papers (2025-10-09T17:55:44Z)
SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes [105.8644620467576]
We introduce Stextscurprise3D, a novel dataset designed to evaluate language-guided spatial reasoning segmentation in complex 3D scenes.<n>Stextscurprise3D consists of more than 200k vision language pairs across 900+ detailed indoor scenes from ScanNet++ v2.<n>The dataset contains 89k+ human-annotated spatial queries deliberately crafted without object name.
arXiv Detail & Related papers (2025-07-10T14:01:24Z)
A large-scale, physically-based synthetic dataset for satellite pose estimation [0.0]
This paper introduces the DLVS3-HST-V1 dataset, which focuses on the Hubble Space Telescope (HST) as a complex, articulated target.<n>The dataset is generated using advanced real-time and offline rendering technologies, integrating high-fidelity 3D models, dynamic lighting, and physically accurate material properties.<n>The pipeline supports the creation of large-scale, richly annotated image sets with ground-truth 6-DoF pose and keypoint data, semantic segmentation, depth, and normal maps.
arXiv Detail & Related papers (2025-06-15T09:24:32Z)
E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models [78.1674905950243]
We present the first comprehensive benchmark for 3D geometric foundation models (GFMs)<n>GFMs directly predict dense 3D representations in a single feed-forward pass, eliminating the need for slow or unavailable precomputed camera parameters.<n>We evaluate 16 state-of-the-art GFMs, revealing their strengths and limitations across tasks and domains.<n>All code, evaluation scripts, and processed data will be publicly released to accelerate research in 3D spatial intelligence.
arXiv Detail & Related papers (2025-06-02T17:53:09Z)
Automating 3D Dataset Generation with Neural Radiance Fields [0.0]
Training performant detection models require diverse, precisely annotated, and large scale datasets. We propose a pipeline for automatic generation of 3D datasets for arbitrary objects. Our pipeline is fast, easy to use and has a high degree of automation.
arXiv Detail & Related papers (2025-03-20T10:01:32Z)
Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters. We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile parts parameters. The network predictions yield a large scale of 3D objects with pseudo labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in Deep Semantic in the context of vision for autonomous vehicles. Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z)
Organelle-specific segmentation, spatial analysis, and visualization of volume electron microscopy datasets [1.5332481598232222]
Volume electron microscopy is the method of choice for the in-situ interrogation of cellular ultrastructure at the nanometer scale. Recent technical advances have led to a rapid increase in large raw image datasets that require computational strategies for segmentation and spatial analysis. We describe a practical and annotation-efficient pipeline for organelle-specific segmentation, spatial analysis, and visualization of large volume electron microscopy datasets.
arXiv Detail & Related papers (2023-03-07T13:23:31Z)
LiDAR-CS Dataset: LiDAR Point Cloud Dataset with Cross-Sensors for 3D Object Detection [36.77084564823707]
deep learning methods heavily rely on annotated data and often face domain generalization issues. LiDAR-CS dataset is the first dataset that addresses the sensor-related gaps in the domain of 3D object detection in real traffic.
arXiv Detail & Related papers (2023-01-29T19:10:35Z)
MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis. The proposed dataset contains 100,000 images and 25 different object types. We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution. A natural remedy is to utilize the 3D voxelization and 3D convolution network. We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
S3Net: 3D LiDAR Sparse Semantic Segmentation Network [1.330528227599978]
S3Net is a novel convolutional neural network for LiDAR point cloud semantic segmentation. It adopts an encoder-decoder backbone that consists of Sparse Intra-channel Attention Module (SIntraAM) and Sparse Inter-channel Attention Module (SInterAM)
arXiv Detail & Related papers (2021-03-15T22:15:24Z)
H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [4.263987603222371]
This paper introduces a 3D dataset which is unique in three ways. It depicts the village of Hessigheim (Germany) henceforth referred to as H3D. It is designed for promoting research in the field of 3D data analysis on one hand and to evaluate and rank emerging approaches.
arXiv Detail & Related papers (2021-02-10T09:33:48Z)
RELLIS-3D Dataset: Data, Benchmarks and Analysis [16.803548871633957]
RELLIS-3D is a multimodal dataset collected in an off-road environment. The data was collected on the Rellis Campus of Texas A&M University.
arXiv Detail & Related papers (2020-11-17T18:28:01Z)
Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving. Current 3D semantic segmentation networks focus on convolutional architectures that perform great for well represented classes. We propose a novel Aware 3D Semantic Detection (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.