LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and
Semantic-Aware Alignment
- URL: http://arxiv.org/abs/2308.01686v2
- Date: Fri, 11 Aug 2023 18:32:54 GMT
- Title: LiDAR-Camera Panoptic Segmentation via Geometry-Consistent and
Semantic-Aware Alignment
- Authors: Zhiwei Zhang, Zhizhong Zhang, Qian Yu, Ran Yi, Yuan Xie and Lizhuang
Ma
- Abstract summary: We propose LCPS, the first LiDAR-Camera Panoptic Segmentation network.
In our approach, we conduct LiDAR-Camera fusion in three stages.
Our fusion strategy improves PQ by about 6.9% over the LiDAR-only baseline on the nuScenes dataset.
- Score: 63.83894701779067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D panoptic segmentation is a challenging perception task that requires both
semantic segmentation and instance segmentation. In this task, we notice that
images could provide rich texture, color, and discriminative information, which
can complement LiDAR data for evident performance improvement, but their fusion
remains a challenging problem. To this end, we propose LCPS, the first
LiDAR-Camera Panoptic Segmentation network. In our approach, we conduct
LiDAR-Camera fusion in three stages: 1) an Asynchronous Compensation Pixel
Alignment (ACPA) module that calibrates the coordinate misalignment caused by
asynchronous problems between sensors; 2) a Semantic-Aware Region Alignment
(SARA) module that extends the one-to-one point-pixel mapping to one-to-many
semantic relations; 3) a Point-to-Voxel feature Propagation (PVP) module that
integrates both geometric and semantic fusion information for the entire point
cloud. Our fusion strategy improves PQ by about 6.9% over the LiDAR-only
baseline on the nuScenes dataset. Extensive quantitative and qualitative
experiments further demonstrate the effectiveness of our novel framework. The
code will be released at https://github.com/zhangzw12319/lcps.git.
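The geometry-consistent alignment the paper builds on starts from the standard point-to-pixel projection: each LiDAR point is transformed into the camera frame via the extrinsic calibration and projected through the intrinsic matrix. A minimal NumPy sketch (illustrative only; the function name, arguments, and matrix conventions are assumptions, not the paper's implementation):

```python
import numpy as np

def project_points_to_image(points_lidar, T_cam_from_lidar, K, img_w, img_h):
    """Project LiDAR points (N, 3) into the image plane.

    T_cam_from_lidar: 4x4 extrinsic transform (LiDAR frame -> camera frame).
    K: 3x3 camera intrinsic matrix.
    Returns in-view pixel coordinates (M, 2) and the boolean mask (N,)
    of points that project inside the image.
    """
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])   # (N, 4) homogeneous
    cam = (T_cam_from_lidar @ homo.T).T[:, :3]          # points in camera frame
    in_front = cam[:, 2] > 1e-6                         # keep points ahead of the camera
    uv = (K @ cam.T).T                                  # perspective projection
    uv = uv[:, :2] / uv[:, 2:3]                         # divide by depth
    in_view = (in_front
               & (uv[:, 0] >= 0) & (uv[:, 0] < img_w)
               & (uv[:, 1] >= 0) & (uv[:, 1] < img_h))
    return uv[in_view], in_view
```

This one-to-one mapping is exactly what the ACPA stage corrects for sensor asynchrony and what SARA then generalizes to one-to-many semantic relations.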
Related papers
- LiOn-XA: Unsupervised Domain Adaptation via LiDAR-Only Cross-Modal Adversarial Training [61.26381389532653]
LiOn-XA is an unsupervised domain adaptation (UDA) approach that combines LiDAR-Only Cross-Modal (X) learning with Adversarial training for 3D LiDAR point cloud semantic segmentation.
Our experiments on 3 real-to-real adaptation scenarios demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-21T09:50:17Z)
- Self-supervised Learning of LiDAR 3D Point Clouds via 2D-3D Neural Calibration [107.61458720202984]
This paper introduces a novel self-supervised learning framework for enhancing 3D perception in autonomous driving scenes.
We propose the learnable transformation alignment to bridge the domain gap between image and point cloud data.
We establish dense 2D-3D correspondences to estimate the rigid pose.
arXiv Detail & Related papers (2024-01-23T02:41:06Z)
- UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase [43.95911443801265]
We present a unified multi-modal LiDAR segmentation network, termed UniSeg.
It accomplishes semantic segmentation and panoptic segmentation simultaneously.
We also construct OpenPCSeg, the largest and most comprehensive outdoor LiDAR segmentation codebase.
arXiv Detail & Related papers (2023-09-11T16:00:22Z)
- From One to Many: Dynamic Cross Attention Networks for LiDAR and Camera Fusion [12.792769704561024]
Existing fusion methods tend to align each 3D point to only one projected image pixel based on calibration.
We propose a Dynamic Cross Attention (DCA) module with a novel one-to-many cross-modality mapping.
The whole fusion architecture named Dynamic Cross Attention Network (DCAN) exploits multi-level image features and adapts to multiple representations of point clouds.
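The one-to-many idea can be sketched by gathering a k x k patch of image features around each projected point, rather than sampling a single pixel. A minimal NumPy illustration (the function and its edge-clamping behavior are assumptions for exposition, not DCA's actual attention mechanism):

```python
import numpy as np

def gather_patch_features(feat_map, uv, k=3):
    """For each projected point, gather a k x k neighborhood of image
    features around its pixel (one-to-many cross-modality mapping).

    feat_map: (H, W, C) image feature map.
    uv: (N, 2) pixel coordinates (u, v) of projected LiDAR points.
    Returns (N, k*k, C) neighbor features, clamped at image borders.
    """
    h, w, c = feat_map.shape
    r = k // 2
    # Enumerate all (du, dv) offsets in the k x k window.
    offsets = np.stack(np.meshgrid(np.arange(-r, r + 1),
                                   np.arange(-r, r + 1)), -1).reshape(-1, 2)
    px = np.round(uv).astype(int)                  # (N, 2) integer pixels
    neigh = px[:, None, :] + offsets[None, :, :]   # (N, k*k, 2)
    us = np.clip(neigh[..., 0], 0, w - 1)          # clamp to image bounds
    vs = np.clip(neigh[..., 1], 0, h - 1)
    return feat_map[vs, us]                        # (N, k*k, C)
```

A cross-attention module such as DCA would then weight these k*k candidate features per point instead of treating them uniformly.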
arXiv Detail & Related papers (2022-09-25T16:10:14Z)
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods in both tasks.
We extend DS-Net to 4D panoptic LiDAR segmentation by the temporally unified instance clustering on aligned LiDAR frames.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
- AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection [46.03951171790736]
We propose AutoAlign, an automatic feature fusion strategy for 3D object detection.
We show that our approach leads to improvements of 2.3 mAP and 7.0 mAP on the KITTI and nuScenes datasets, respectively.
arXiv Detail & Related papers (2022-01-17T16:08:57Z)
- Similarity-Aware Fusion Network for 3D Semantic Segmentation [87.51314162700315]
We propose a similarity-aware fusion network (SAFNet) to adaptively fuse 2D images and 3D point clouds for 3D semantic segmentation.
We employ a late fusion strategy where we first learn the geometric and contextual similarities between the input and back-projected (from 2D pixels) point clouds.
We show that SAFNet significantly outperforms existing state-of-the-art fusion-based approaches under varying levels of data integrity.
arXiv Detail & Related papers (2021-07-04T09:28:18Z)
- LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.