LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
- URL: http://arxiv.org/abs/2011.11964v2
- Date: Tue, 1 Dec 2020 05:49:08 GMT
- Title: LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
- Authors: Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
- Abstract summary: LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
- Score: 56.71765153629892
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the rapid advances of autonomous driving, it becomes critical to equip
its sensing system with more holistic 3D perception. However, existing works
focus on parsing either the objects (e.g. cars and pedestrians) or scenes (e.g.
trees and buildings) from the LiDAR sensor. In this work, we address the task
of LiDAR-based panoptic segmentation, which aims to parse both objects and
scenes in a unified manner. As one of the first endeavors towards this new and
challenging task, we propose the Dynamic Shifting Network (DS-Net), which
serves as an effective panoptic segmentation framework in the point cloud
realm. In particular, DS-Net has three appealing properties: 1) strong backbone
design. DS-Net adopts the cylinder convolution that is specifically designed
for LiDAR point clouds. The extracted features are shared by the semantic
branch and the instance branch, which operates in a bottom-up clustering style.
2) Dynamic Shifting for complex point distributions. We observe that
commonly-used clustering algorithms like BFS or DBSCAN are incapable of
handling complex autonomous driving scenes with non-uniform point cloud
distributions and varying instance sizes. Thus, we present an efficient
learnable clustering module, dynamic shifting, which adapts kernel functions
on-the-fly for different instances. 3) Consensus-driven Fusion. Finally,
consensus-driven fusion is used to deal with the disagreement between semantic
and instance predictions. To comprehensively evaluate the performance of
LiDAR-based panoptic segmentation, we construct and curate benchmarks from two
large-scale autonomous driving LiDAR datasets, SemanticKITTI and nuScenes.
Extensive experiments demonstrate that our proposed DS-Net achieves superior
accuracies over current state-of-the-art methods. Notably, we achieve 1st place
on the public leaderboard of SemanticKITTI, outperforming 2nd place by 2.6% in
terms of the PQ metric.
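To make the mechanisms above concrete, here is a minimal PyTorch-style sketch of one dynamic-shifting iteration and of consensus-driven fusion. It illustrates the ideas in the abstract rather than the authors' implementation: the bandwidth candidates, tensor shapes, and function names are assumptions, and the per-point kernel weights are presumed to come from a small learned network (e.g., an MLP followed by a softmax).

```python
import torch

def dynamic_shift_step(points, weights, bandwidths=(0.2, 0.8, 1.6)):
    """One mean-shift-style iteration: each candidate bandwidth proposes a
    shifted position; learned per-point weights blend the proposals."""
    dist = torch.cdist(points, points)                  # (N, N) pairwise distances
    proposals = []
    for bw in bandwidths:                               # illustrative values
        mask = (dist < bw).float()                      # flat-kernel neighbourhood
        proposals.append(mask @ points / mask.sum(1, keepdim=True))
    proposals = torch.stack(proposals, dim=1)           # (N, K, 3)
    return (weights.unsqueeze(-1) * proposals).sum(1)   # weights: (N, K), softmaxed

def consensus_fusion(sem_labels, inst_ids):
    """Resolve semantic/instance disagreement: each instance adopts the
    majority semantic label of its points."""
    fused = sem_labels.clone()
    for inst in inst_ids.unique():
        pts = inst_ids == inst
        fused[pts] = sem_labels[pts].mode().values
    return fused
```

By contrast, a fixed-bandwidth baseline such as DBSCAN applies one neighbourhood radius everywhere, which is exactly what the abstract argues breaks down when instance sizes and point densities vary across a scene.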
Related papers
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
We establish dense 2D-3D correspondences to estimate the rigid pose.
arXiv Detail & Related papers (2024-01-23T02:41:06Z)
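The final step named above, estimating a rigid pose from dense 2D-3D correspondences, is classically solved in closed form by the Kabsch (Procrustes) algorithm once correspondences are lifted to 3D-3D pairs. A minimal numpy sketch of that standard step, not the paper's code (the function name is ours):

```python
import numpy as np

def kabsch_rigid_pose(P, Q):
    """Least-squares rigid transform (R, t) with R @ P[i] + t ~= Q[i], via SVD."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP
```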
- LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment [63.83894701779067]
We propose LCPS, the first LiDAR-Camera Panoptic Segmentation network.
In our approach, we conduct LiDAR-Camera fusion in three stages.
Our fusion strategy improves PQ performance by about 6.9% over the LiDAR-only baseline on the nuScenes dataset.
arXiv Detail & Related papers (2023-08-03T10:57:58Z)
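A building block shared by most LiDAR-camera fusion pipelines, and presumably by the staged fusion sketched above, is projecting each point into the image with the calibration matrices and gathering pixel-aligned camera features. A generic numpy sketch (the camera model and names are assumptions, not LCPS internals):

```python
import numpy as np

def gather_camera_features(xyz, feat_map, K, T_cam_lidar):
    """Project LiDAR points into the image and sample per-point camera features."""
    h = np.c_[xyz, np.ones(len(xyz))]          # homogeneous LiDAR coordinates
    cam = (h @ T_cam_lidar.T)[:, :3]           # points in the camera frame
    valid = cam[:, 2] > 0                      # keep points in front of the camera
    uv = cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]                # perspective division
    u = np.clip(uv[:, 0].astype(int), 0, feat_map.shape[1] - 1)
    v = np.clip(uv[:, 1].astype(int), 0, feat_map.shape[0] - 1)
    feats = feat_map[v, u].copy()              # (N, C) pixel-aligned features
    feats[~valid] = 0                          # zero out points behind the camera
    return feats
```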
- LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving [34.119642131912485]
We present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS).
LWSIS uses off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.
Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the annotation cost of dense 2D masks.
arXiv Detail & Related papers (2022-12-07T08:08:01Z)
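The weak supervision signal here can be pictured simply: LiDAR points falling inside an object's 3D box, once projected into the image, mark pixels that must belong to that instance. A toy sketch of deriving such labels, with axis-aligned boxes and a user-supplied projection for brevity (LWSIS itself handles oriented boxes and label noise):

```python
import numpy as np

def weak_mask_labels(xyz, box_min, box_max, project, img_hw):
    """Mark image pixels hit by LiDAR points inside a 3D box as positives."""
    inside = np.all((xyz >= box_min) & (xyz <= box_max), axis=1)
    uv = project(xyz[inside])                        # (M, 2) pixel coordinates
    mask = np.zeros(img_hw, dtype=bool)
    u = np.clip(uv[:, 0].astype(int), 0, img_hw[1] - 1)
    v = np.clip(uv[:, 1].astype(int), 0, img_hw[0] - 1)
    mask[v, u] = True                                # sparse positive pixels
    return mask
```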
- Panoptic-PHNet: Towards Real-Time and High-Precision LiDAR Panoptic Segmentation via Clustering Pseudo Heatmap [9.770808277353128]
We propose a fast and high-performance LiDAR-based framework, referred to as Panoptic-PHNet.
We introduce a clustering pseudo heatmap as a new paradigm, which, followed by a center grouping module, yields instance centers for efficient clustering.
For backbone design, we fuse the fine-grained voxel features and the 2D Bird's Eye View (BEV) features with different receptive fields to utilize both detailed and global information.
arXiv Detail & Related papers (2022-05-14T08:16:13Z)
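The pseudo-heatmap paradigm can be illustrated in a few lines: take local maxima of a predicted BEV center heatmap as instance centers, then group each foreground point to its nearest center. A hedged toy version, not Panoptic-PHNet's implementation (the threshold and top-k values are arbitrary):

```python
import torch
import torch.nn.functional as F

def centers_from_heatmap(heat, k=50, thresh=0.3):
    """Peak-pick a BEV heatmap: local maxima above a threshold become centers."""
    pooled = F.max_pool2d(heat[None, None], 3, stride=1, padding=1)[0, 0]
    peaks = (heat == pooled) & (heat > thresh)
    ys, xs = torch.nonzero(peaks, as_tuple=True)
    top = heat[ys, xs].topk(min(k, len(ys))).indices
    return torch.stack([xs[top], ys[top]], dim=1).float()  # (M, 2) center cells

def group_points(bev_xy, centers):
    """Assign each foreground point (BEV coordinates) to its nearest center."""
    return torch.cdist(bev_xy, centers).argmin(dim=1)
```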
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks.
We extend DS-Net to 4D panoptic LiDAR segmentation by the temporally unified instance clustering on aligned LiDAR frames.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
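The 4D extension hinges on pose-aligning consecutive scans so that instances can be clustered jointly across time; the clustering is the paper's contribution, while the alignment step is standard and sketched below (4x4 ego poses assumed given):

```python
import numpy as np

def align_and_merge(frames, poses):
    """Transform each LiDAR frame into a shared world frame, then merge them
    so instance clustering can operate on the combined 4D point set."""
    merged = []
    for pts, T in zip(frames, poses):        # pts: (N, 3); T: (4, 4) ego pose
        h = np.c_[pts, np.ones(len(pts))]    # homogeneous coordinates
        merged.append((h @ T.T)[:, :3])
    return np.concatenate(merged, axis=0)
```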
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all monocular 3D object detectors on the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- CPSeg: Cluster-free Panoptic Segmentation of 3D LiDAR Point Clouds [2.891413712995641]
We propose a novel real-time end-to-end panoptic segmentation network for LiDAR point clouds, called CPSeg.
CPSeg comprises a shared encoder, a dual decoder, a task-aware attention module (TAM) and a cluster-free instance segmentation head.
arXiv Detail & Related papers (2021-11-02T16:44:06Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for outdoor LiDAR segmentation, in which a cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
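The cylindrical partition itself is plain coordinate math: bin points by radius, azimuth, and height instead of x, y, z, so cell size grows with range and roughly tracks the LiDAR's density falloff. A sketch with illustrative bounds (treat every constant as an assumption, though 480x360x32 matches common Cylinder3D configurations):

```python
import numpy as np

def cylindrical_voxel_ids(xyz, grid=(480, 360, 32), rho_max=50.0, z_lim=(-4.0, 2.0)):
    """Map Cartesian LiDAR points to (radius, azimuth, height) voxel indices."""
    rho = np.linalg.norm(xyz[:, :2], axis=1)
    phi = np.arctan2(xyz[:, 1], xyz[:, 0])               # azimuth in [-pi, pi)
    z = np.clip(xyz[:, 2], *z_lim)
    r_id = np.clip((rho / rho_max * grid[0]).astype(int), 0, grid[0] - 1)
    p_id = ((phi + np.pi) / (2 * np.pi) * grid[1]).astype(int) % grid[1]
    z_id = np.clip(((z - z_lim[0]) / (z_lim[1] - z_lim[0]) * grid[2]).astype(int),
                   0, grid[2] - 1)
    return np.stack([r_id, p_id, z_id], axis=1)          # (N, 3) voxel indices
```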
- (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network [3.6967381030744515]
We propose AF2-S3Net, an end-to-end encoder-decoder CNN for 3D LiDAR semantic segmentation.
We present a novel multi-branch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder.
Our experimental results show that the proposed method outperforms the state-of-the-art approaches on the large-scale SemanticKITTI benchmark.
arXiv Detail & Related papers (2021-02-08T21:04:21Z)
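Attentive multi-branch fusion of the kind named above is commonly realised by letting a small gating network re-weight each branch's features before summation. A generic PyTorch sketch, with shapes and module names that are ours rather than the paper's:

```python
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    """Blend B feature branches with learned per-point attention weights."""
    def __init__(self, channels, branches=3):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(branches * channels, branches),
            nn.Softmax(dim=-1),
        )

    def forward(self, feats):                      # feats: list of B (N, C) tensors
        stacked = torch.stack(feats, dim=1)        # (N, B, C)
        w = self.gate(stacked.flatten(1))          # (N, B) attention weights
        return (w.unsqueeze(-1) * stacked).sum(1)  # (N, C) fused features
```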
This list is automatically generated from the titles and abstracts of the papers on this site.