4D Panoptic LiDAR Segmentation
- URL: http://arxiv.org/abs/2102.12472v1
- Date: Wed, 24 Feb 2021 18:56:16 GMT
- Title: 4D Panoptic LiDAR Segmentation
- Authors: Mehmet Aygün, Aljoša Ošep, Mark Weber, Maxim Maximov, Cyrill Stachniss, Jens Behley, Laura Leal-Taixé
- Abstract summary: We propose 4D panoptic LiDAR segmentation to assign a semantic class and a temporally-consistent instance ID to a sequence of 3D points.
Inspired by recent advances in benchmarking of multi-object tracking, we propose to adopt a new evaluation metric that separates the semantic and point-to-instance association aspects of the task.
- Score: 27.677435778317054
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Temporal semantic scene understanding is critical for self-driving cars or
robots operating in dynamic environments. In this paper, we propose 4D panoptic
LiDAR segmentation to assign a semantic class and a temporally-consistent
instance ID to a sequence of 3D points. To this end, we present an approach and
a point-centric evaluation metric. Our approach determines a semantic class for
every point while modeling object instances as probability distributions in the
4D spatio-temporal domain. We process multiple point clouds in parallel and
resolve point-to-instance associations, effectively alleviating the need for
explicit temporal data association. Inspired by recent advances in benchmarking
of multi-object tracking, we propose to adopt a new evaluation metric that
separates the semantic and point-to-instance association aspects of the task.
With this work, we aim at paving the road for future developments of temporal
LiDAR panoptic perception.
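Two of the ideas above lend themselves to short illustrations. First, a minimal sketch of soft point-to-instance assignment when every instance is modeled as a probability distribution over the 4D (x, y, z, t) domain. We assume Gaussian densities here; all names, parameters, and the background handling are illustrative, not the paper's reference implementation.

```python
import numpy as np

def assign_points_to_instances(points_4d, means, covs, priors, bg_density=1e-4):
    """Soft point-to-instance assignment with Gaussian instance models.

    points_4d: (N, 4) points stacked from several scans, last column = time
    means:     (K, 4) instance centers; covs: (K, 4, 4); priors: (K,)
    Returns an (N,) instance index per point, -1 where background wins.
    Because one density covers the whole spatio-temporal volume, points from
    different frames get the same ID without explicit temporal matching.
    """
    N, K = len(points_4d), len(means)
    dens = np.full((N, K + 1), bg_density)          # last column = background
    for k in range(K):
        diff = points_4d - means[k]
        inv = np.linalg.inv(covs[k])
        norm = 1.0 / np.sqrt((2 * np.pi) ** 4 * np.linalg.det(covs[k]))
        maha = np.einsum('ni,ij,nj->n', diff, inv, diff)
        dens[:, k] = priors[k] * norm * np.exp(-0.5 * maha)
    labels = dens.argmax(axis=1)
    labels[labels == K] = -1
    return labels
```

Second, the point-centric metric decomposes into a classification term and an association term combined by a geometric mean. The sketch below assumes the published form LSTQ = sqrt(S_cls * S_assoc), where S_assoc rewards every predicted instance overlapping a ground-truth 4D instance by its point overlap times its IoU; the exact aggregation in the official evaluation code may differ in details.

```python
import numpy as np

def lstq(sem_pred, sem_gt, inst_pred, inst_gt, num_classes):
    """Point-centric 4D panoptic metric sketch. All arrays are flattened over
    the whole sequence, one entry per 3D point; instance IDs are assumed
    consistent across time, and ID 0 marks points belonging to no instance.
    """
    # Semantic term: mean IoU over classes (classification quality only).
    ious = []
    for c in range(num_classes):
        inter = np.sum((sem_pred == c) & (sem_gt == c))
        union = np.sum((sem_pred == c) | (sem_gt == c))
        if union > 0:
            ious.append(inter / union)
    s_cls = float(np.mean(ious)) if ious else 0.0

    # Association term, independent of the predicted semantics.
    terms = []
    for t in np.unique(inst_gt):
        if t == 0:
            continue
        t_mask = inst_gt == t
        score = 0.0
        for s in np.unique(inst_pred[t_mask]):
            if s == 0:
                continue
            s_mask = inst_pred == s
            tpa = np.sum(s_mask & t_mask)                 # |s ∩ t|
            score += tpa * tpa / np.sum(s_mask | t_mask)  # |s∩t| * IoU(s,t)
        terms.append(score / np.sum(t_mask))
    s_assoc = float(np.mean(terms)) if terms else 0.0

    return float(np.sqrt(s_cls * s_assoc))
```

The geometric mean forces both aspects to be good at once: a method that segments perfectly but swaps instance IDs over time scores low, and vice versa, which is what separating the two aspects buys over entangled metrics.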
Related papers
- Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z)
- SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues.
Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects.
We re-think this approach and suggest that both object detection and motion-inspired pseudo-labeling can be tackled in a data-driven manner.
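The heuristic baseline this entry builds on can be condensed to: gate points by estimated motion, then cluster the moving points spatially into pseudo-instances. Below is a minimal sketch of that baseline; the given scene flow, the speed gate, and the DBSCAN parameters are our illustrative assumptions, and SeMoLi itself replaces such hand-tuned grouping with a learned, data-driven one.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def pseudo_label_moving_objects(points, flow, min_speed=0.5, eps=0.8, min_pts=10):
    """Motion-cue pseudo-labeling sketch (not SeMoLi itself).

    points: (N, 3) LiDAR points of one frame
    flow:   (N, 3) estimated per-point scene flow, meters per frame
    Returns (N,) pseudo instance IDs; -1 marks background or noise.
    """
    labels = np.full(len(points), -1, dtype=int)
    moving = np.linalg.norm(flow, axis=1) > min_speed   # keep only movers
    if moving.sum() < min_pts:
        return labels
    # Each spatial cluster of moving points becomes one pseudo-labeled object.
    labels[moving] = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(points[moving])
    return labels
```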
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
- Mask4Former: Mask Transformer for 4D Panoptic Segmentation [13.99703660936949]
Mask4Former is the first transformer-based approach unifying semantic instance segmentation and tracking.
Our model directly predicts semantic instances and their temporal associations without relying on hand-crafted, non-learned association strategies.
Mask4Former achieves a new state of the art on the SemanticKITTI test set with a score of 68.4 LSTQ.
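As a shape-level illustration of what "unifying segmentation and tracking" means here: each learned query decodes one mask over the superimposed point clouds of several scans, so a detection and its temporal association come from the same query. This toy head is our sketch, not Mask4Former's actual decoder.

```python
import torch
import torch.nn as nn

class QueryMaskHead(nn.Module):
    """Toy mask-transformer head over a superimposed 4D point cloud."""

    def __init__(self, num_queries, dim, num_classes):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)
        self.cls_head = nn.Linear(dim, num_classes + 1)  # +1 = "no object"

    def forward(self, point_feats):
        # point_feats: (N, D) features of all points from several stacked scans.
        q = self.queries.weight                  # (Q, D)
        mask_logits = q @ point_feats.t()        # (Q, N): one 4D mask per query
        class_logits = self.cls_head(q)          # (Q, C+1)
        # A point that falls in query q's mask in any scan carries the same
        # instance identity, so no separate association step is needed.
        return mask_logits, class_logits
```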
arXiv Detail & Related papers (2023-09-28T03:30:50Z)
- A Spatiotemporal Correspondence Approach to Unsupervised LiDAR Segmentation with Traffic Applications [16.260518238832887]
The key idea is to leverage the nature of a dynamic point cloud sequence and introduce drastically stronger augmentation.
We alternate between clustering points into semantic groups and optimizing using point-wise spatiotemporal labels.
Our method can learn discriminative features in an unsupervised fashion.
arXiv Detail & Related papers (2023-08-23T21:32:46Z)
- Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos [63.94040814459116]
Self-supervised methods have shown remarkable progress in learning high-level semantics and low-level temporal correspondence.
We propose a novel semantic-aware masked slot attention on top of the fused semantic features and correspondence maps.
We adopt semantic- and instance-level temporal consistency as self-supervision to encourage temporally coherent object-centric representations.
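The masked slot attention mentioned above presumably builds on the standard slot attention recipe (Locatello et al., 2020), with semantics restricting which locations each slot may attend to. The sketch below gates attention with a boolean mask; how the paper actually injects the fused semantic features may differ, so treat this as an assumption.

```python
import torch
import torch.nn as nn

class MaskedSlotAttention(nn.Module):
    """Slot-attention-style grouping with an external semantic gate."""

    def __init__(self, num_slots, dim, iters=3):
        super().__init__()
        self.iters, self.scale = iters, dim ** -0.5
        self.slots_init = nn.Parameter(torch.randn(1, num_slots, dim))
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, feats, sem_mask=None):
        # feats: (B, N, D) fused per-location features
        # sem_mask: (B, N) bool, True where slots are allowed to attend
        B, _, D = feats.shape
        slots = self.slots_init.expand(B, -1, -1)
        k, v = self.to_k(feats), self.to_v(feats)
        for _ in range(self.iters):
            attn = torch.einsum('bsd,bnd->bsn', self.to_q(slots), k) * self.scale
            if sem_mask is not None:
                attn = attn.masked_fill(~sem_mask[:, None, :], -1e9)
            attn = attn.softmax(dim=1)                # slots compete per location
            attn = attn / (attn.sum(-1, keepdim=True) + 1e-8)
            updates = torch.einsum('bsn,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, D),
                             slots.reshape(-1, D)).view(B, -1, D)
        return slots
```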
arXiv Detail & Related papers (2023-08-19T09:12:13Z)
- Learning Monocular Depth in Dynamic Environment via Context-aware Temporal Attention [9.837958401514141]
We present CTA-Depth, a Context-aware Temporal Attention guided network for multi-frame monocular depth estimation.
Our approach achieves significant improvements over state-of-the-art approaches on three benchmark datasets.
arXiv Detail & Related papers (2023-05-12T11:48:32Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
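A hedged reading of the "contrastive sequence enhancement" idea: pull the tracked object's current-frame feature toward its own history in the memory bank and away from other objects' features. The memory layout (positive at index 0) and the InfoNCE form below are illustrative assumptions, not the paper's definition.

```python
import torch
import torch.nn.functional as F

def sequence_contrastive_loss(cur_feat, memory, temperature=0.07):
    """cur_feat: (D,) feature of the tracked object in the current frame.
    memory:   (M, D) memory-bank features; memory[0] is assumed to be this
    tracklet's most recent historical feature (the positive), while the
    remaining rows come from other objects (the negatives).
    """
    sims = F.cosine_similarity(cur_feat[None, :], memory, dim=1) / temperature
    target = torch.zeros(1, dtype=torch.long)   # the positive sits at index 0
    return F.cross_entropy(sims[None, :], target)
```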
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- Instance Segmentation with Cross-Modal Consistency [13.524441194366544]
We introduce a novel approach to instance segmentation that jointly leverages measurements from multiple sensor modalities.
Our technique applies contrastive learning to points in the scene both across sensor modalities and the temporal domain.
We demonstrate that this formulation encourages the models to learn embeddings that are invariant to viewpoint variations.
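The cross-modal consistency objective can be illustrated with a symmetric InfoNCE loss over paired per-point embeddings; this generic form is our assumption about the setup, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def cross_modal_info_nce(lidar_emb, cam_emb, temperature=0.1):
    """lidar_emb, cam_emb: (N, D) embeddings of the same N scene points from
    the LiDAR and camera branches; row i of each tensor describes the same
    physical point (the positive pair), all other rows act as negatives.
    """
    lidar_emb = F.normalize(lidar_emb, dim=1)
    cam_emb = F.normalize(cam_emb, dim=1)
    logits = lidar_emb @ cam_emb.t() / temperature   # (N, N) similarities
    targets = torch.arange(len(lidar_emb))           # positives on the diagonal
    # Symmetrize so each modality is pulled toward the other.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

Embeddings that minimize such a loss must agree across sensors and viewpoints, which is one way to read the claimed invariance to viewpoint variations.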
arXiv Detail & Related papers (2022-10-14T21:17:19Z)
- Learning Spatial and Temporal Variations for 4D Point Cloud Segmentation [0.39373541926236766]
We argue that the temporal information across the frames provides crucial knowledge for 3D scene perception.
We design a temporal variation-aware module and a temporal voxel-point refiner to capture the temporal variation in the 4D point cloud.
arXiv Detail & Related papers (2022-07-11T07:36:26Z)
- Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z)