SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving
- URL: http://arxiv.org/abs/2306.09001v3
- Date: Mon, 30 Sep 2024 13:27:59 GMT
- Title: SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving
- Authors: Yiming Li, Sihang Li, Xinhao Liu, Moonjun Gong, Kenan Li, Nuo Chen, Zijun Wang, Zhiheng Li, Tao Jiang, Fisher Yu, Yue Wang, Hang Zhao, Zhiding Yu, Chen Feng,
- Abstract summary: SSCBench is a benchmark that integrates scenes from widely used automotive datasets.
We benchmark models using monocular, trinocular, and cloud input to assess the performance gap.
We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
- Score: 87.8761593366609
- License:
- Abstract: Monocular scene understanding is a foundational component of autonomous systems. Within the spectrum of monocular perception topics, one crucial and useful task for holistic 3D scene understanding is semantic scene completion (SSC), which jointly completes semantic information and geometric details from RGB input. However, progress in SSC, particularly in large-scale street views, is hindered by the scarcity of high-quality datasets. To address this issue, we introduce SSCBench, a comprehensive benchmark that integrates scenes from widely used automotive datasets (e.g., KITTI-360, nuScenes, and Waymo). SSCBench follows an established setup and format in the community, facilitating the easy exploration of SSC methods in various street views. We benchmark models using monocular, trinocular, and point cloud input to assess the performance gap resulting from sensor coverage and modality. Moreover, we have unified semantic labels across diverse datasets to simplify cross-domain generalization testing. We commit to including more datasets and SSC models to drive further advancements in this field.
Related papers
- Towards 3D Semantic Scene Completion for Autonomous Driving: A Meta-Learning Framework Empowered by Deformable Large-Kernel Attention and Mamba Model [1.6835437621159244]
We introduce MetaSSC, a novel meta-learning-based framework for semantic scene completion (SSC)
Our approach begins with a voxel-based semantic segmentation (SS) pretraining task, aimed at exploring the semantics and geometry of incomplete regions.
Using simulated cooperative perception datasets, we supervise the perception training of a single vehicle using aggregated sensor data.
This meta-knowledge is then adapted to the target domain through a dual-phase training strategy, enabling efficient deployment.
arXiv Detail & Related papers (2024-11-06T05:11:25Z) - HS3-Bench: A Benchmark and Strong Baseline for Hyperspectral Semantic Segmentation in Driving Scenarios [3.7498611358320733]
There is no standard benchmark available to measure progress on semantic segmentation in driving scenarios.
In this paper, we provide the HyperSpectral Semantic benchmark (HS3-Bench)
It combines annotated hyperspectral images from three driving scenario datasets and provides standardized metrics, implementations, and evaluation protocols.
arXiv Detail & Related papers (2024-09-17T14:00:49Z) - MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations [55.022519020409405]
This paper builds the first largest ever multi-modal 3D scene dataset and benchmark with hierarchical grounded language annotations, MMScan.
The resulting multi-modal 3D dataset encompasses 1.4M meta-annotated captions on 109k objects and 7.7k regions as well as over 3.04M diverse samples for 3D visual grounding and question-answering benchmarks.
arXiv Detail & Related papers (2024-06-13T17:59:30Z) - SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z) - Camera-based 3D Semantic Scene Completion with Sparse Guidance Network [18.415854443539786]
We propose a camera-based semantic scene completion framework called SGN.
SGN propagates semantics from semantic-aware seed voxels to the whole scene based on spatial geometry cues.
Our experimental results demonstrate the superiority of our SGN over existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T04:17:27Z) - SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation
Separation and BEV Fusion [17.459062337718677]
We propose to solve outdoor SSC from the perspective of representation separation and BEV fusion.
We present the network, named SSC-RS, which uses separate branches with deep supervision to explicitly disentangle the learning procedure of the semantic and geometric representations.
A BEV fusion network equipped with the proposed Adaptive Representation Fusion (ARF) module is presented to aggregate the multi-scale features effectively and efficiently.
arXiv Detail & Related papers (2023-06-27T10:02:45Z) - Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation
for autonomous vehicles [63.20765930558542]
3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization.
We propose a new dataset, Navya 3D (Navya3DSeg), with a diverse label space corresponding to a large scale production grade operational domain.
It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds.
arXiv Detail & Related papers (2023-02-16T13:41:19Z) - MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense
Top-View Understanding [27.867824780748606]
We introduce MASS - a Multi-Attentional Semantic model for dense top-view understanding of driving scenes.
Our framework operates on pillar- and occupancy features and comprises three attention-based building blocks.
Our model is shown to be very effective for 3D object detection validated on the KITTI-3D dataset.
arXiv Detail & Related papers (2021-07-01T10:19:32Z) - Semantic Segmentation on Swiss3DCities: A Benchmark Study on Aerial
Photogrammetric 3D Pointcloud Dataset [67.44497676652173]
We introduce a new outdoor urban 3D pointcloud dataset, covering a total area of 2.7 $km2$, sampled from three Swiss cities.
The dataset is manually annotated for semantic segmentation with per-point labels, and is built using photogrammetry from images acquired by multirotors equipped with high-resolution cameras.
arXiv Detail & Related papers (2020-12-23T21:48:47Z) - Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical
Understanding of Outdoor Scene [76.4183572058063]
We present a richly-annotated 3D point cloud dataset for multiple outdoor scene understanding tasks.
The dataset has been point-wisely annotated with both hierarchical and instance-based labels.
We formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
arXiv Detail & Related papers (2020-08-11T19:10:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.