SegNet4D: Efficient Instance-Aware 4D Semantic Segmentation for LiDAR Point Cloud
- URL: http://arxiv.org/abs/2406.16279v3
- Date: Tue, 03 Dec 2024 12:09:33 GMT
- Title: SegNet4D: Efficient Instance-Aware 4D Semantic Segmentation for LiDAR Point Cloud
- Authors: Neng Wang, Ruibin Guo, Chenghao Shi, Ziyue Wang, Hui Zhang, Huimin Lu, Zhiqiang Zheng, Xieyuanli Chen,
- Abstract summary: We introduce SegNet4D, a novel real-time 4D semantic segmentation network. SegNet4D addresses 4D segmentation as two tasks: single-scan semantic segmentation and moving object segmentation. Our approach surpasses the state of the art in both multi-scan semantic segmentation and moving object segmentation.
- Score: 10.442390215931503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 4D LiDAR semantic segmentation, also referred to as multi-scan semantic segmentation, plays a crucial role in enhancing the environmental understanding capabilities of autonomous vehicles and robots. It classifies the semantic category of each LiDAR measurement point and detects whether it is dynamic, a critical ability for tasks like obstacle avoidance and autonomous navigation. Existing approaches often rely on computationally heavy 4D convolutions or recursive networks, which result in poor real-time performance, making them unsuitable for online robotics and autonomous driving applications. In this paper, we introduce SegNet4D, a novel real-time 4D semantic segmentation network offering both efficiency and strong semantic understanding. SegNet4D addresses 4D segmentation as two tasks: single-scan semantic segmentation and moving object segmentation, each tackled by a separate network head. Both results are combined in a motion-semantic fusion module to achieve comprehensive 4D segmentation. Additionally, instance information is extracted from the current scan and exploited for instance-wise segmentation consistency. Our approach surpasses the state of the art in both multi-scan semantic segmentation and moving object segmentation while offering greater efficiency, enabling real-time operation. Moreover, its effectiveness and efficiency have also been validated on a real-world unmanned ground platform. Our code will be released at https://github.com/nubot-nudt/SegNet4D.
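The two-head design described above can be illustrated with a minimal sketch: a per-point single-scan semantic label and a per-point moving/static flag are fused into one multi-scan label. The label IDs and the `MOVING_VARIANT` mapping below are hypothetical, loosely following the SemanticKITTI convention in which movable classes have a dedicated "moving" variant; this is not the paper's actual fusion module, only an assumption about what the fused output looks like.

```python
# Hypothetical single-scan label IDs and their moving counterparts
# (SemanticKITTI-style multi-scan labels give movable classes a
# separate "moving" variant; static points keep their original label).
MOVING_VARIANT = {
    1: 101,  # car        -> moving-car
    2: 102,  # person     -> moving-person
    3: 103,  # bicyclist  -> moving-bicyclist
}

def fuse_motion_semantics(semantic_labels, moving_flags):
    """Combine single-scan semantics with moving-object flags into
    multi-scan (4D) labels: points flagged as moving are remapped to
    the moving variant of their class when one exists."""
    fused = []
    for label, is_moving in zip(semantic_labels, moving_flags):
        if is_moving and label in MOVING_VARIANT:
            fused.append(MOVING_VARIANT[label])
        else:
            fused.append(label)
    return fused

# e.g. a moving car (1), a parked car (1), a walking person (2),
# and a moving point of a non-movable class (7) stays unchanged:
print(fuse_motion_semantics([1, 1, 2, 7], [True, False, True, True]))
# -> [101, 1, 102, 7]
```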
Related papers
- BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis [33.53327976669034]
We revisit 3D semantic segmentation through a more granular lens, shedding light on subtle complexities that are typically overshadowed by broader performance metrics.
We introduce an innovative 3D semantic segmentation network called BFANet that incorporates detailed analysis of semantic boundary features.
arXiv Detail & Related papers (2025-03-16T15:13:11Z)
- A Novel Decomposed Feature-Oriented Framework for Open-Set Semantic Segmentation on LiDAR Data [6.427051055902494]
We propose a feature-oriented framework for open-set semantic segmentation on LiDAR data.
We design a dual-decoder network to simultaneously perform closed-set semantic segmentation and generate distinctive features for unknown objects.
By integrating the results of closed-set semantic segmentation and anomaly detection, we achieve effective feature-driven LiDAR open-set semantic segmentation.
arXiv Detail & Related papers (2025-03-14T05:40:05Z)
- 3D Part Segmentation via Geometric Aggregation of 2D Visual Features [57.20161517451834]
Supervised 3D part segmentation models are tailored for a fixed set of objects and parts, limiting their transferability to open-set, real-world scenarios.
Recent works have explored vision-language models (VLMs) as a promising alternative, using multi-view rendering and textual prompting to identify object parts.
To address these limitations, we propose COPS, a COmprehensive model for Parts that blends semantics extracted from visual concepts and 3D geometry to effectively identify object parts.
arXiv Detail & Related papers (2024-12-05T15:27:58Z)
- Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision.
However, densely labeling 3D point clouds for fully-supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- SegPoint: Segment Any Point Cloud via Large Language Model [62.69797122055389]
We propose a model, called SegPoint, to produce point-wise segmentation masks across a diverse range of tasks.
SegPoint is the first model to address varied segmentation tasks within a single framework.
arXiv Detail & Related papers (2024-07-18T17:58:03Z)
- Instance Consistency Regularization for Semi-Supervised 3D Instance Segmentation [50.51125319374404]
We propose a novel self-training network InsTeacher3D to explore and exploit pure instance knowledge from unlabeled data.
Experimental results on multiple large-scale datasets show that the InsTeacher3D significantly outperforms prior state-of-the-art semi-supervised approaches.
arXiv Detail & Related papers (2024-06-24T16:35:58Z)
- SAM-guided Graph Cut for 3D Instance Segmentation [60.75119991853605]
This paper addresses the challenge of 3D instance segmentation by simultaneously leveraging 3D geometric and multi-view image information.
We introduce a novel 3D-to-2D query framework to effectively exploit 2D segmentation models for 3D instance segmentation.
Our method achieves robust segmentation performance and can generalize across different types of scenes.
arXiv Detail & Related papers (2023-12-13T18:59:58Z)
- Exploiting the Complementarity of 2D and 3D Networks to Address Domain-Shift in 3D Semantic Segmentation [14.30113021974841]
3D semantic segmentation is a critical task in many real-world applications, such as autonomous driving, robotics, and mixed reality.
A possible solution is to combine the 3D information with others coming from sensors featuring a different modality, such as RGB cameras.
Recent multi-modal 3D semantic segmentation networks exploit these modalities relying on two branches that process the 2D and 3D information independently.
arXiv Detail & Related papers (2023-04-06T10:59:43Z)
- Semi-Weakly Supervised Object Kinematic Motion Prediction [56.282759127180306]
Given a 3D object, kinematic motion prediction aims to identify the mobile parts as well as the corresponding motion parameters.
We propose a graph neural network to learn the map between hierarchical part-level segmentation and mobile parts parameters.
The network predictions yield a large scale of 3D objects with pseudo labeled mobility information.
arXiv Detail & Related papers (2023-03-31T02:37:36Z)
- Semantics-Guided Moving Object Segmentation with 3D LiDAR [32.84782551737681]
Moving object segmentation (MOS) is a task to distinguish moving objects from the surrounding static environment.
We propose a semantics-guided convolutional neural network for moving object segmentation.
arXiv Detail & Related papers (2022-05-06T12:59:54Z)
- Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-supervised and domain-adaptive semantic segmentation.
It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z)
- Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation [95.74244714914052]
Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes.
We propose the Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information online.
PCAN outperforms current video instance tracking and segmentation competition winners on Youtube-VIS and BDD100K datasets.
arXiv Detail & Related papers (2021-06-22T17:57:24Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection-Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- 3D-MiniNet: Learning a 2D Representation from Point Clouds for Fast and Efficient 3D LIDAR Semantic Segmentation [9.581605678437032]
3D-MiniNet is a novel approach for LIDAR semantic segmentation that combines 3D and 2D learning layers.
It first learns a 2D representation from the raw points through a novel projection which extracts local and global information from the 3D data.
The resulting 2D semantic labels are then re-projected back to 3D space and refined by a post-processing module.
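Projection-based pipelines of this kind typically map each point to range-image coordinates via a spherical projection before 2D processing. The sketch below shows the standard mapping; the sensor parameters (64 beams, a +3° to -25° vertical field of view, 2048 columns) are illustrative Velodyne HDL-64-like assumptions, not values taken from the paper.

```python
import math

def spherical_project(x, y, z, width=2048, height=64,
                      fov_up=math.radians(3.0),
                      fov_down=math.radians(-25.0)):
    """Map a 3D LiDAR point to (u, v) range-image pixel coordinates.
    Sensor parameters are illustrative (Velodyne HDL-64-like)."""
    r = math.sqrt(x * x + y * y + z * z)
    yaw = math.atan2(y, x)        # horizontal (azimuth) angle
    pitch = math.asin(z / r)      # vertical (elevation) angle
    fov = fov_up - fov_down
    u = 0.5 * (1.0 - yaw / math.pi) * width         # azimuth -> column
    v = (1.0 - (pitch - fov_down) / fov) * height   # elevation -> row
    # clamp to valid pixel indices
    u = min(width - 1, max(0, int(u)))
    v = min(height - 1, max(0, int(v)))
    return u, v

# A point straight ahead of the sensor lands in the middle column:
print(spherical_project(10.0, 0.0, 0.0))  # -> (1024, 6)
```

Re-projection back to 3D then amounts to looking up, for every original point, the 2D label at its (u, v) cell.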
arXiv Detail & Related papers (2020-02-25T14:33:50Z)
- Real-time Fusion Network for RGB-D Semantic Segmentation Incorporating Unexpected Obstacle Detection for Road-driving Images [13.3382165879322]
We propose a real-time fusion semantic segmentation network termed RFNet.
RFNet runs swiftly enough to satisfy the requirements of autonomous driving applications.
On Cityscapes, our method outperforms previous state-of-the-art semantic segmenters, with excellent accuracy and 22Hz inference speed.
arXiv Detail & Related papers (2020-02-24T22:17:25Z)
- An Abstraction Model for Semantic Segmentation Algorithms [9.561123408923489]
Semantic segmentation is used in many tasks, such as cancer detection, robot-assisted surgery, satellite image analysis, and self-driving cars.
In this paper, we present an abstraction model for semantic segmentation that offers a comprehensive view of the field.
We compare different approaches and analyze each of the four abstraction blocks' importance in each method's operation.
arXiv Detail & Related papers (2019-12-27T05:39:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.