4D Panoptic Segmentation as Invariant and Equivariant Field Prediction
- URL: http://arxiv.org/abs/2303.15651v2
- Date: Tue, 12 Sep 2023 22:33:53 GMT
- Title: 4D Panoptic Segmentation as Invariant and Equivariant Field Prediction
- Authors: Minghan Zhu, Shizhong Han, Hong Cai, Shubhankar Borse, Maani Ghaffari,
Fatih Porikli
- Abstract summary: We develop rotation-equivariant neural networks for 4D panoptic segmentation.
We show that our models achieve higher accuracy with lower computational costs compared to their non-equivariant counterparts.
Our method sets the new state-of-the-art performance and achieves 1st place on the SemanticKITTI 4D Panoptic Segmentation leaderboard.
- Score: 48.57732508537554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop rotation-equivariant neural networks for 4D
panoptic segmentation. 4D panoptic segmentation is a benchmark task for
autonomous driving that requires recognizing semantic classes and object
instances on the road based on LiDAR scans, as well as assigning temporally
consistent IDs to instances across time. We observe that the driving scenario
is symmetric to rotations on the ground plane. Therefore, rotation-equivariance
could provide better generalization and more robust feature learning.
Specifically, we review the object instance clustering strategies and restate
the centerness-based approach and the offset-based approach as the prediction
of invariant scalar fields and equivariant vector fields. Other sub-tasks are
also unified from this perspective, and different invariant and equivariant
layers are designed to facilitate their predictions. Through evaluation on the
standard 4D panoptic segmentation benchmark of SemanticKITTI, we show that our
equivariant models achieve higher accuracy with lower computational costs
compared to their non-equivariant counterparts. Moreover, our method sets the
new state-of-the-art performance and achieves 1st place on the SemanticKITTI 4D
Panoptic Segmentation leaderboard.
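The abstract's restatement of centerness- and offset-based clustering as invariant scalar and equivariant vector fields can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the toy points, the instance center, and the Gaussian `centerness` form are assumptions chosen only to make the symmetry properties checkable under a ground-plane (z-axis) rotation.

```python
import numpy as np

def rotation_z(theta):
    """Rotation about the vertical axis, i.e. on the ground plane."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Toy instance: a few points and their instance center (illustrative values)
points = np.random.default_rng(0).normal(size=(5, 3))
center = np.array([1.0, 2.0, 0.5])

def centerness(p, c):
    # Invariant scalar field: depends only on the distance to the center
    return np.exp(-np.linalg.norm(p - c, axis=-1))

def offset(p, c):
    # Equivariant vector field: points from each point toward the center
    return c - p

R = rotation_z(0.7)
p_rot, c_rot = points @ R.T, R @ center

# Scalar targets are unchanged by the rotation (invariance) ...
assert np.allclose(centerness(p_rot, c_rot), centerness(points, center))
# ... while vector targets rotate along with the inputs (equivariance)
assert np.allclose(offset(p_rot, c_rot), offset(points, center) @ R.T)
```

A rotation-equivariant network that predicts these fields inherits the same guarantees by construction, rather than having to learn them from augmented data.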
Related papers
- Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection [37.142470149311904]
We propose a spatio-temporal equivariant learning framework that considers spatial and temporal augmentations jointly.
We show our pre-training method for 3D object detection which outperforms existing equivariant and invariant approaches in many settings.
arXiv Detail & Related papers (2024-04-17T20:41:49Z)
- Mask4Former: Mask Transformer for 4D Panoptic Segmentation [13.99703660936949]
Mask4Former is the first transformer-based approach unifying semantic instance segmentation and tracking.
Our model directly predicts semantic instances and their temporal associations without relying on hand-crafted, non-learned association strategies.
Mask4Former achieves a new state-of-the-art on the SemanticKITTI test set with a score of 68.4 LSTQ.
arXiv Detail & Related papers (2023-09-28T03:30:50Z)
- Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation [49.56131393810713]
We present an SE(3) equivariant architecture and a training strategy to tackle this task in an unsupervised manner.
Our method excels in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs.
arXiv Detail & Related papers (2023-06-08T22:55:32Z)
- Optimization Dynamics of Equivariant and Augmented Neural Networks [2.7918308693131135]
We investigate the optimization of neural networks on symmetric data.
We compare the strategy of constraining the architecture to be equivariant to that of using data augmentation.
Our analysis reveals that even in the latter situation, stationary points may be unstable for augmented training although they are stable for the manifestly equivariant models.
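The two strategies this entry compares can be connected by the classic group-averaging (Reynolds operator) construction: averaging an arbitrary linear map over a symmetry group projects it onto the equivariant subspace that a constrained architecture would occupy from the start. A hedged NumPy sketch, not taken from the paper, using the planar rotation group C4:

```python
import numpy as np

def rot2(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

group = [rot2(k * np.pi / 2) for k in range(4)]  # cyclic group C4

rng = np.random.default_rng(1)
W = rng.normal(size=(2, 2))  # arbitrary (non-equivariant) linear layer

# Reynolds/group averaging: project W onto the equivariant subspace
W_eq = sum(R.T @ W @ R for R in group) / len(group)

# The averaged map commutes with every group element
for R in group:
    assert np.allclose(R @ W_eq, W_eq @ R)
```

Augmented training only pushes the learned map toward this subspace through the loss, which is consistent with the entry's observation that its stationary points can be unstable where the manifestly equivariant model's are stable.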
arXiv Detail & Related papers (2023-03-23T17:26:12Z)
- Self-Supervised Learning for Group Equivariant Neural Networks [75.62232699377877]
Group equivariant neural networks are models whose structure is constrained to commute with transformations of the input.
We propose two concepts for self-supervised tasks: equivariant pretext labels and invariant contrastive loss.
Experiments on standard image recognition benchmarks demonstrate that equivariant neural networks benefit from the proposed self-supervised tasks.
arXiv Detail & Related papers (2023-03-08T08:11:26Z)
- PointInst3D: Segmenting 3D Instances by Points [136.7261709896713]
We propose a fully-convolutional 3D point cloud instance segmentation method that works in a per-point prediction fashion.
We find the key to its success is assigning a suitable target to each sampled point.
Our approach achieves promising results on both ScanNet and S3DIS benchmarks.
arXiv Detail & Related papers (2022-04-25T02:41:46Z)
- Equivariant Point Network for 3D Point Cloud Analysis [17.689949017410836]
We propose an effective and practical SE(3) (3D translation and rotation) equivariant network for point cloud analysis.
First, we present SE(3) separable point convolution, a novel framework that breaks down the 6D convolution into two separable convolutional operators.
Second, we introduce an attention layer to effectively harness the expressiveness of the equivariant features.
arXiv Detail & Related papers (2021-03-25T21:57:10Z)
- Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
- Quaternion Equivariant Capsule Networks for 3D Point Clouds [58.566467950463306]
We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations.
We connect dynamic routing between capsules to the well-known Weiszfeld algorithm.
Based on our operator, we build a capsule network that disentangles geometry from pose.
arXiv Detail & Related papers (2019-12-27T13:51:17Z)
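The Weiszfeld algorithm that the capsule-network entry above connects to dynamic routing is the classical fixed-point iteration for the geometric median. A minimal NumPy sketch of that iteration (the routing connection itself is not reproduced here, and the sample points are illustrative):

```python
import numpy as np

def weiszfeld(points, iters=200, eps=1e-9):
    """Geometric median via iteratively reweighted averaging."""
    x = points.mean(axis=0)  # start from the centroid
    for _ in range(iters):
        d = np.linalg.norm(points - x, axis=1)
        w = 1.0 / np.maximum(d, eps)  # inverse-distance weights
        x = (w[:, None] * points).sum(axis=0) / w.sum()
    return x

pts = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0], [2.0, 0.5]])
median = weiszfeld(pts)
```

Unlike the centroid, the geometric median minimizes the summed Euclidean distance to the points, which makes it robust to outliers; this robust-aggregation view is what links it to routing between capsules.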
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.