SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation
Separation and BEV Fusion
- URL: http://arxiv.org/abs/2306.15349v1
- Date: Tue, 27 Jun 2023 10:02:45 GMT
- Title: SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation
Separation and BEV Fusion
- Authors: Jianbiao Mei, Yu Yang, Mengmeng Wang, Tianxin Huang, Xuemeng Yang and
Yong Liu
- Abstract summary: We propose to solve outdoor SSC from the perspective of representation separation and BEV fusion.
We present the network, named SSC-RS, which uses separate branches with deep supervision to explicitly disentangle the learning procedure of the semantic and geometric representations.
A BEV fusion network equipped with the proposed Adaptive Representation Fusion (ARF) module is presented to aggregate the multi-scale features effectively and efficiently.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Semantic scene completion (SSC) jointly predicts the semantics and geometry
of the entire 3D scene, which plays an essential role in 3D scene understanding
for autonomous driving systems. SSC has achieved rapid progress with the help
of semantic context in segmentation. However, how to effectively exploit the
relationships between the semantic context in semantic segmentation and
geometric structure in scene completion remains under exploration. In this
paper, we propose to solve outdoor SSC from the perspective of representation
separation and BEV fusion. Specifically, we present the network, named SSC-RS,
which uses separate branches with deep supervision to explicitly disentangle
the learning procedure of the semantic and geometric representations. A BEV
fusion network equipped with the proposed Adaptive Representation Fusion (ARF)
module is presented to aggregate the multi-scale features effectively and
efficiently. Due to the low computational burden and powerful representation
ability, our model has good generality while running in real-time. Extensive
experiments on SemanticKITTI demonstrate our SSC-RS achieves state-of-the-art
performance.
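The adaptive fusion idea described in the abstract can be sketched as follows. This is a minimal, illustrative version only: the function name, the global-pooling-plus-softmax gating design, and the tensor shapes are assumptions for the sketch, not the paper's actual ARF implementation.

```python
import numpy as np

def adaptive_representation_fusion(sem_bev, geo_bev):
    """Sketch of adaptively fusing semantic and geometric BEV features.

    sem_bev, geo_bev: arrays of shape (C, H, W), the BEV feature maps from
    the semantic and completion branches. Per-channel gating weights are
    derived via global average pooling and a softmax across the two
    branches (a common attention pattern; the real ARF module may differ).
    """
    stacked = np.stack([sem_bev, geo_bev])             # (2, C, H, W)
    # Global average pooling -> per-branch, per-channel descriptors (2, C)
    desc = stacked.mean(axis=(2, 3))
    # Softmax over the two branches yields adaptive per-channel weights
    e = np.exp(desc - desc.max(axis=0, keepdims=True))
    w = e / e.sum(axis=0, keepdims=True)               # (2, C), sums to 1
    # Channel-wise weighted sum fuses the two representations
    fused = (w[:, :, None, None] * stacked).sum(axis=0)
    return fused                                       # (C, H, W)
```

Because the weights are computed from the inputs themselves, the fusion can emphasize whichever branch carries the stronger signal per channel, rather than using a fixed sum or concatenation.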
Related papers
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S$^2$RM to achieve high-quality cross-modality fusion.
It follows a three-stage strategy: distributing language features, spatial semantic recurrent coparsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z) - S$^3$M-Net: Joint Learning of Semantic Segmentation and Stereo Matching
for Autonomous Driving [40.305452898732774]
S$^3$M-Net is a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously.
S$^3$M-Net shares the features extracted from RGB images between both tasks, resulting in an improved overall scene understanding capability.
arXiv Detail & Related papers (2024-01-21T06:47:33Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - Camera-based 3D Semantic Scene Completion with Sparse Guidance Network [20.876048262597255]
Semantic scene completion (SSC) aims to predict the semantic occupancy of each voxel in the entire 3D scene from limited observations.
We propose an end-to-end camera-based SSC framework, termed SGN, to diffuse semantics from the semantic- and occupancy-aware seed voxels to the whole scene.
arXiv Detail & Related papers (2023-12-10T04:17:27Z) - SSCBench: Monocular 3D Semantic Scene Completion Benchmark in Street
Views [89.8436375840446]
SSCBench is a benchmark that integrates scenes from widely used automotive datasets.
We benchmark models using monocular, trinocular, and point cloud input to assess the performance gap.
We have unified semantic labels across diverse datasets to simplify cross-domain generalization testing.
arXiv Detail & Related papers (2023-06-15T09:56:33Z) - A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented,
Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in Deep Semantic Segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z) - Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds [9.489733900529204]
We propose an end-to-end semantic segmentation-assisted scene completion network.
The network takes a raw point cloud as input, and merges the features from the segmentation branch into the completion branch hierarchically.
Our method achieves competitive performance on the SemanticKITTI dataset with low latency.
arXiv Detail & Related papers (2021-09-23T15:55:45Z) - Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with
Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z) - Improving Point Cloud Semantic Segmentation by Learning 3D Object
Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform great for well represented classes.
We propose a novel Detection-Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z) - SASO: Joint 3D Semantic-Instance Segmentation via Multi-scale Semantic
Association and Salient Point Clustering Optimization [8.519716460338518]
We propose a novel 3D point cloud segmentation framework named SASO, which jointly performs semantic and instance segmentation tasks.
For semantic segmentation task, inspired by the inherent correlation among objects in spatial context, we propose a Multi-scale Semantic Association (MSA) module.
For instance segmentation task, different from previous works that utilize clustering only in inference procedure, we propose a Salient Point Clustering Optimization (SPCO) module.
arXiv Detail & Related papers (2020-06-25T08:55:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.