SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation
Separation and BEV Fusion
- URL: http://arxiv.org/abs/2306.15349v1
- Date: Tue, 27 Jun 2023 10:02:45 GMT
- Title: SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation
Separation and BEV Fusion
- Authors: Jianbiao Mei, Yu Yang, Mengmeng Wang, Tianxin Huang, Xuemeng Yang and
Yong Liu
- Abstract summary: We propose to solve outdoor SSC from the perspective of representation separation and BEV fusion.
We present the network, named SSC-RS, which uses separate branches with deep supervision to explicitly disentangle the learning procedure of the semantic and geometric representations.
A BEV fusion network equipped with the proposed Adaptive Representation Fusion (ARF) module is presented to aggregate the multi-scale features effectively and efficiently.
- Score: 17.459062337718677
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Semantic scene completion (SSC) jointly predicts the semantics and geometry
of the entire 3D scene, which plays an essential role in 3D scene understanding
for autonomous driving systems. SSC has achieved rapid progress with the help
of semantic context in segmentation. However, how to effectively exploit the
relationships between the semantic context in semantic segmentation and
geometric structure in scene completion remains under exploration. In this
paper, we propose to solve outdoor SSC from the perspective of representation
separation and BEV fusion. Specifically, we present the network, named SSC-RS,
which uses separate branches with deep supervision to explicitly disentangle
the learning procedure of the semantic and geometric representations. And a BEV
fusion network equipped with the proposed Adaptive Representation Fusion (ARF)
module is presented to aggregate the multi-scale features effectively and
efficiently. Due to the low computational burden and powerful representation
ability, our model has good generality while running in real-time. Extensive
experiments on SemanticKITTI demonstrate our SSC-RS achieves state-of-the-art
performance.
Related papers
- Hierarchical Context Alignment with Disentangled Geometric and Temporal Modeling for Semantic Occupancy Prediction [61.484280369655536]
Camera-based 3D Semantic Occupancy Prediction (SOP) is crucial for understanding complex 3D scenes from limited 2D image observations.
Existing SOP methods typically aggregate contextual features to assist the occupancy representation learning.
We introduce a new Hierarchical context alignment paradigm for a more accurate SOP (Hi-SOP)
arXiv Detail & Related papers (2024-12-11T09:53:10Z) - InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception [17.530797215534456]
3D scene understanding has become an essential area of research with applications in autonomous driving, robotics, and augmented reality.
We propose InstanceGaussian, a method that jointly learns appearance and semantic features while adaptively aggregating instances.
Our approach achieves state-of-the-art performance in category-agnostic, open-vocabulary 3D point-level segmentation.
arXiv Detail & Related papers (2024-11-28T16:08:36Z) - Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose Stextsuperscript2RM to achieve high-quality cross-modality fusion.
It follows a working strategy of trilogy: distributing language feature, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z) - S^2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR [50.435592120607815]
Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR)
Previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection.
In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR.
arXiv Detail & Related papers (2024-02-22T11:40:49Z) - S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching
for Autonomous Driving [40.305452898732774]
S$3$M-Net is a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously.
S$3$M-Net shares the features extracted from RGB images between both tasks, resulting in an improved overall scene understanding capability.
arXiv Detail & Related papers (2024-01-21T06:47:33Z) - Camera-based 3D Semantic Scene Completion with Sparse Guidance Network [18.415854443539786]
We propose a camera-based semantic scene completion framework called SGN.
SGN propagates semantics from semantic-aware seed voxels to the whole scene based on spatial geometry cues.
Our experimental results demonstrate the superiority of our SGN over existing state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T04:17:27Z) - SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving [87.8761593366609]
SSCBench is a benchmark that integrates scenes from widely used automotive datasets.
We benchmark models using monocular, trinocular, and cloud input to assess the performance gap.
We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
arXiv Detail & Related papers (2023-06-15T09:56:33Z) - A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented,
Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in Deep Semantic in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z) - Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds [9.489733900529204]
We propose an end-to-end semantic segmentation-assisted scene completion network.
The network takes a raw point cloud as input, and merges the features from the segmentation branch into the completion branch hierarchically.
Our method achieves competitive performance on Semantic KITTI dataset with low latency.
arXiv Detail & Related papers (2021-09-23T15:55:45Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-adaptive and domain-supervised semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.