LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
- URL: http://arxiv.org/abs/2011.11964v2
- Date: Tue, 1 Dec 2020 05:49:08 GMT
- Title: LiDAR-based Panoptic Segmentation via Dynamic Shifting Network
- Authors: Fangzhou Hong, Hui Zhou, Xinge Zhu, Hongsheng Li, Ziwei Liu
- Abstract summary: LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
- Score: 56.71765153629892
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With the rapid advances of autonomous driving, it becomes critical to equip
its sensing system with more holistic 3D perception. However, existing works
focus on parsing either the objects (e.g. cars and pedestrians) or scenes (e.g.
trees and buildings) from the LiDAR sensor. In this work, we address the task
of LiDAR-based panoptic segmentation, which aims to parse both objects and
scenes in a unified manner. As one of the first endeavors towards this new and
challenging task, we propose the Dynamic Shifting Network (DS-Net), which
serves as an effective panoptic segmentation framework in the point cloud
realm. In particular, DS-Net has three appealing properties: 1) strong backbone
design. DS-Net adopts the cylinder convolution that is specifically designed
for LiDAR point clouds. The extracted features are shared by the semantic
branch and the instance branch, which operates in a bottom-up clustering style.
2) Dynamic Shifting for complex point distributions. We observe that
commonly-used clustering algorithms like BFS or DBSCAN are incapable of
handling complex autonomous driving scenes with non-uniform point cloud
distributions and varying instance sizes. Thus, we present an efficient
learnable clustering module, dynamic shifting, which adapts kernel functions
on-the-fly for different instances. 3) Consensus-driven Fusion. Finally,
consensus-driven fusion is used to deal with the disagreement between semantic
and instance predictions. To comprehensively evaluate the performance of
LiDAR-based panoptic segmentation, we construct and curate benchmarks from two
large-scale autonomous driving LiDAR datasets, SemanticKITTI and nuScenes.
Extensive experiments demonstrate that our proposed DS-Net achieves superior
accuracies over current state-of-the-art methods. Notably, we achieve 1st place
on the public leaderboard of SemanticKITTI, outperforming 2nd place by 2.6% in
terms of the PQ metric.
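To make the mechanisms above concrete, here is a minimal PyTorch-style sketch of one dynamic-shifting iteration and of consensus-driven fusion. It illustrates the ideas in the abstract rather than the authors' implementation: the bandwidth candidates, tensor shapes, and function names are assumptions, and the per-point kernel weights are presumed to come from a small learned network (e.g., an MLP followed by a softmax).

```python
import torch

def dynamic_shift_step(points, weights, bandwidths=(0.2, 0.8, 1.6)):
    """One mean-shift-style iteration: each candidate bandwidth proposes a
    shifted position; learned per-point weights blend the proposals."""
    dist = torch.cdist(points, points)                  # (N, N) pairwise distances
    proposals = []
    for bw in bandwidths:                               # illustrative values
        mask = (dist < bw).float()                      # flat-kernel neighbourhood
        proposals.append(mask @ points / mask.sum(1, keepdim=True))
    proposals = torch.stack(proposals, dim=1)           # (N, K, 3)
    return (weights.unsqueeze(-1) * proposals).sum(1)   # weights: (N, K), softmaxed

def consensus_fusion(sem_labels, inst_ids):
    """Resolve semantic/instance disagreement: each instance adopts the
    majority semantic label of its points."""
    fused = sem_labels.clone()
    for inst in inst_ids.unique():
        pts = inst_ids == inst
        fused[pts] = sem_labels[pts].mode().values
    return fused
```

By contrast, a fixed-bandwidth baseline such as DBSCAN applies one neighbourhood radius everywhere, which is exactly what the abstract argues breaks down when instance sizes and point densities vary across a scene.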
Related papers
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
We establish dense 2D-3D correspondences to estimate the rigid pose.
arXiv Detail & Related papers (2024-01-23T02:41:06Z)
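The final step named above, estimating a rigid pose from dense 2D-3D correspondences, is classically solved in closed form by the Kabsch (Procrustes) algorithm once correspondences are lifted to 3D-3D pairs. A minimal numpy sketch of that standard step, not the paper's code (the function name is ours):

```python
import numpy as np

def kabsch_rigid_pose(P, Q):
    """Least-squares rigid transform (R, t) with R @ P[i] + t ~= Q[i], via SVD."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cQ - R @ cP
```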
- LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and Semantic-Aware Alignment [63.83894701779067]
We propose LCPS, the first LiDAR-Camera Panoptic Segmentation network.
In our approach, we conduct LiDAR-Camera fusion in three stages.
Our fusion strategy improves PQ performance by about 6.9% over the LiDAR-only baseline on the nuScenes dataset.
arXiv Detail & Related papers (2023-08-03T10:57:58Z)
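A building block shared by most LiDAR-camera fusion pipelines, and presumably by the staged fusion sketched above, is projecting each point into the image with the calibration matrices and gathering pixel-aligned camera features. A generic numpy sketch (the camera model and names are assumptions, not LCPS internals):

```python
import numpy as np

def gather_camera_features(xyz, feat_map, K, T_cam_lidar):
    """Project LiDAR points into the image and sample per-point camera features."""
    h = np.c_[xyz, np.ones(len(xyz))]          # homogeneous LiDAR coordinates
    cam = (h @ T_cam_lidar.T)[:, :3]           # points in the camera frame
    valid = cam[:, 2] > 0                      # keep points in front of the camera
    uv = cam @ K.T
    uv = uv[:, :2] / uv[:, 2:3]                # perspective division
    u = np.clip(uv[:, 0].astype(int), 0, feat_map.shape[1] - 1)
    v = np.clip(uv[:, 1].astype(int), 0, feat_map.shape[0] - 1)
    feats = feat_map[v, u].copy()              # (N, C) pixel-aligned features
    feats[~valid] = 0                          # zero out points behind the camera
    return feats
```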
- LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving [34.119642131912485]
We present a more artful framework, LiDAR-guided Weakly Supervised Instance Segmentation (LWSIS).
LWSIS uses off-the-shelf 3D data, i.e., point clouds together with 3D boxes, as natural weak supervision for training 2D image instance segmentation models.
Our LWSIS not only exploits the complementary information in multimodal data during training, but also significantly reduces the annotation cost of dense 2D masks.
arXiv Detail & Related papers (2022-12-07T08:08:01Z)
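The weak supervision signal here can be pictured simply: LiDAR points falling inside an object's 3D box, once projected into the image, mark pixels that must belong to that instance. A toy sketch of deriving such labels, with axis-aligned boxes and a user-supplied projection for brevity (LWSIS itself handles oriented boxes and label noise):

```python
import numpy as np

def weak_mask_labels(xyz, box_min, box_max, project, img_hw):
    """Mark image pixels hit by LiDAR points inside a 3D box as positives."""
    inside = np.all((xyz >= box_min) & (xyz <= box_max), axis=1)
    uv = project(xyz[inside])                        # (M, 2) pixel coordinates
    mask = np.zeros(img_hw, dtype=bool)
    u = np.clip(uv[:, 0].astype(int), 0, img_hw[1] - 1)
    v = np.clip(uv[:, 1].astype(int), 0, img_hw[0] - 1)
    mask[v, u] = True                                # sparse positive pixels
    return mask
```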
- Panoptic-PHNet: Towards Real-Time and High-Precision LiDAR Panoptic Segmentation via Clustering Pseudo Heatmap [9.770808277353128]
We propose a fast and high-performance LiDAR-based framework, referred to as Panoptic-PHNet.
We introduce a clustering pseudo heatmap as a new paradigm, which, followed by a center grouping module, yields instance centers for efficient clustering.
For backbone design, we fuse the fine-grained voxel features and the 2D Bird's Eye View (BEV) features with different receptive fields to utilize both detailed and global information.
arXiv Detail & Related papers (2022-05-14T08:16:13Z)
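The pseudo-heatmap paradigm can be illustrated in a few lines: take local maxima of a predicted BEV center heatmap as instance centers, then group each foreground point to its nearest center. A hedged toy version, not Panoptic-PHNet's implementation (the threshold and top-k values are arbitrary):

```python
import torch
import torch.nn.functional as F

def centers_from_heatmap(heat, k=50, thresh=0.3):
    """Peak-pick a BEV heatmap: local maxima above a threshold become centers."""
    pooled = F.max_pool2d(heat[None, None], 3, stride=1, padding=1)[0, 0]
    peaks = (heat == pooled) & (heat > thresh)
    ys, xs = torch.nonzero(peaks, as_tuple=True)
    top = heat[ys, xs].topk(min(k, len(ys))).indices
    return torch.stack([xs[top], ys[top]], dim=1).float()  # (M, 2) center cells

def group_points(bev_xy, centers):
    """Assign each foreground point (BEV coordinates) to its nearest center."""
    return torch.cdist(bev_xy, centers).argmin(dim=1)
```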
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks.
We extend DS-Net to 4D panoptic LiDAR segmentation by the temporally unified instance clustering on aligned LiDAR frames.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
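The 4D extension hinges on pose-aligning consecutive scans so that instances can be clustered jointly across time; the clustering is the paper's contribution, while the alignment step is standard and sketched below (4x4 ego poses assumed given):

```python
import numpy as np

def align_and_merge(frames, poses):
    """Transform each LiDAR frame into a shared world frame, then merge them
    so instance clustering can operate on the combined 4D point set."""
    merged = []
    for pts, T in zip(frames, poses):        # pts: (N, 3); T: (4, 4) ego pose
        h = np.c_[pts, np.ones(len(pts))]    # homogeneous coordinates
        merged.append((h @ T.T)[:, :3])
    return np.concatenate(merged, axis=0)
```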
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all monocular 3D object detectors on the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- CPSeg: Cluster-free Panoptic Segmentation of 3D LiDAR Point Clouds [2.891413712995641]
We propose a novel real-time end-to-end panoptic segmentation network for LiDAR point clouds, called CPSeg.
CPSeg comprises a shared encoder, a dual decoder, a task-aware attention module (TAM) and a cluster-free instance segmentation head.
arXiv Detail & Related papers (2021-11-02T16:44:06Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for outdoor LiDAR segmentation, in which a cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
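The cylindrical partition itself is plain coordinate math: bin points by radius, azimuth, and height instead of x, y, z, so cell size grows with range and roughly tracks the LiDAR's density falloff. A sketch with illustrative bounds (treat every constant as an assumption, though 480x360x32 matches common Cylinder3D configurations):

```python
import numpy as np

def cylindrical_voxel_ids(xyz, grid=(480, 360, 32), rho_max=50.0, z_lim=(-4.0, 2.0)):
    """Map Cartesian LiDAR points to (radius, azimuth, height) voxel indices."""
    rho = np.linalg.norm(xyz[:, :2], axis=1)
    phi = np.arctan2(xyz[:, 1], xyz[:, 0])               # azimuth in [-pi, pi)
    z = np.clip(xyz[:, 2], *z_lim)
    r_id = np.clip((rho / rho_max * grid[0]).astype(int), 0, grid[0] - 1)
    p_id = ((phi + np.pi) / (2 * np.pi) * grid[1]).astype(int) % grid[1]
    z_id = np.clip(((z - z_lim[0]) / (z_lim[1] - z_lim[0]) * grid[2]).astype(int),
                   0, grid[2] - 1)
    return np.stack([r_id, p_id, z_id], axis=1)          # (N, 3) voxel indices
```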
- (AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network [3.6967381030744515]
We propose AF2-S3Net, an end-to-end encoder-decoder CNN for 3D LiDAR semantic segmentation.
We present a novel multi-branch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder.
Our experimental results show that the proposed method outperforms the state-of-the-art approaches on the large-scale SemanticKITTI benchmark.
arXiv Detail & Related papers (2021-02-08T21:04:21Z)
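Attentive multi-branch fusion of the kind named above is commonly realised by letting a small gating network re-weight each branch's features before summation. A generic PyTorch sketch, with shapes and module names that are ours rather than the paper's:

```python
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    """Blend B feature branches with learned per-point attention weights."""
    def __init__(self, channels, branches=3):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(branches * channels, branches),
            nn.Softmax(dim=-1),
        )

    def forward(self, feats):                      # feats: list of B (N, C) tensors
        stacked = torch.stack(feats, dim=1)        # (N, B, C)
        w = self.gate(stacked.flatten(1))          # (N, B) attention weights
        return (w.unsqueeze(-1) * stacked).sum(1)  # (N, C) fused features
```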
This list is automatically generated from the titles and abstracts of the papers on this site.