Unsupervised 4D LiDAR Moving Object Segmentation in Stationary Settings
with Multivariate Occupancy Time Series
- URL: http://arxiv.org/abs/2212.14750v1
- Date: Fri, 30 Dec 2022 14:48:14 GMT
- Title: Unsupervised 4D LiDAR Moving Object Segmentation in Stationary Settings
with Multivariate Occupancy Time Series
- Authors: Thomas Kreutz, Max Mühlhäuser, and Alejandro Sanchez Guinea
- Abstract summary: We address the problem of unsupervised moving object segmentation (MOS) in 4D LiDAR data recorded from a stationary sensor, where no ground truth annotations are involved.
We propose a novel 4D LiDAR representation based on multivariate time series that relaxes unsupervised MOS to a time series clustering problem.
Experiments on stationary scenes from the Raw KITTI dataset show that our fully unsupervised approach achieves performance that is comparable to that of supervised state-of-the-art approaches.
- Score: 62.997667081978825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we address the problem of unsupervised moving object
segmentation (MOS) in 4D LiDAR data recorded from a stationary sensor, where no
ground truth annotations are involved. Deep learning-based state-of-the-art
methods for LiDAR MOS strongly depend on annotated ground truth data, which is expensive to obtain and scarce. To close this gap in the
stationary setting, we propose a novel 4D LiDAR representation based on
multivariate time series that relaxes the problem of unsupervised MOS to a time
series clustering problem. More specifically, we propose modeling the change in
occupancy of a voxel by a multivariate occupancy time series (MOTS), which
captures spatio-temporal occupancy changes on the voxel level and its
surrounding neighborhood. To perform unsupervised MOS, we train a neural
network in a self-supervised manner to encode MOTS into voxel-level feature
representations, which can be partitioned by a clustering algorithm into moving or stationary voxels. Experiments on stationary scenes from the Raw KITTI dataset show
that our fully unsupervised approach achieves performance that is comparable to
that of supervised state-of-the-art approaches.
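To make the pipeline in the abstract more concrete, here is a minimal, hypothetical sketch: it builds a toy multivariate occupancy time series (MOTS) per voxel from a binary occupancy grid (the voxel plus its spatial neighborhood over time), encodes each series with a small autoencoder as a generic self-supervised stand-in (the paper's actual network and training objective are not given in this summary), and partitions the embeddings with KMeans into two clusters, i.e., moving vs. stationary.

```python
# Illustrative sketch only: the exact MOTS layout, network, and training
# signal are assumptions; reconstruction + KMeans are stand-ins here.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def build_mots(occupancy, radius=1):
    """occupancy: (T, X, Y, Z) binary grid from a stationary sensor.
    Returns one multivariate time series per voxel, shape (V, T, C),
    where the C channels are the voxel itself plus its spatial neighbors."""
    T, X, Y, Z = occupancy.shape
    offsets = [(dx, dy, dz)
               for dx in range(-radius, radius + 1)
               for dy in range(-radius, radius + 1)
               for dz in range(-radius, radius + 1)]
    padded = np.pad(occupancy, ((0, 0),) + ((radius, radius),) * 3)
    channels = [padded[:, radius + dx:radius + dx + X,
                          radius + dy:radius + dy + Y,
                          radius + dz:radius + dz + Z]
                for dx, dy, dz in offsets]
    mots = np.stack(channels, axis=-1)                     # (T, X, Y, Z, C)
    return mots.reshape(T, -1, len(offsets)).transpose(1, 0, 2)  # (V, T, C)

class MOTSEncoder(nn.Module):
    """Tiny autoencoder used as a generic self-supervised stand-in."""
    def __init__(self, t, c, d=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(t * c, 64),
                                 nn.ReLU(), nn.Linear(64, d))
        self.dec = nn.Sequential(nn.Linear(d, 64), nn.ReLU(),
                                 nn.Linear(64, t * c))
    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z).view(x.shape)

occupancy = (np.random.rand(20, 16, 16, 4) > 0.7).astype(np.float32)  # toy grid
mots = torch.tensor(build_mots(occupancy), dtype=torch.float32)       # (V, T, C)
model = MOTSEncoder(t=mots.shape[1], c=mots.shape[2])
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(50):                         # short self-supervised fit (toy)
    z, rec = model(mots)
    loss = ((rec - mots) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
labels = KMeans(n_clusters=2, n_init=10).fit_predict(z.detach().numpy())
print(labels.shape)                         # one moving/stationary label per voxel
```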
Related papers
- Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent [53.637837706712794]
We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
arXiv Detail & Related papers (2024-05-27T22:15:23Z)
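As a loose illustration of the masked-trajectory idea in the UniTraj entry above, the sketch below hides a fraction of multi-agent trajectory steps and reconstructs them with a standard Transformer encoder. The Ghost Spatial Masking module is not reproduced; a random mask and all sizes are toy assumptions.

```python
# Generic masked-trajectory sketch; a random mask stands in for GSM.
import torch
import torch.nn as nn

N_AGENTS, T, D_IN, D_MODEL = 5, 30, 2, 64
traj = torch.randn(N_AGENTS, T, D_IN)                 # toy (agents, time, xy)
mask = torch.rand(N_AGENTS, T) < 0.3                  # hide ~30% of the steps

embed = nn.Linear(D_IN, D_MODEL)
mask_token = nn.Parameter(torch.zeros(D_MODEL))
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True),
    num_layers=2)
head = nn.Linear(D_MODEL, D_IN)

x = embed(traj)                                        # (A, T, D_MODEL)
x = torch.where(mask.unsqueeze(-1), mask_token.expand_as(x), x)
tokens = x.reshape(1, N_AGENTS * T, D_MODEL)           # all agents/steps as one sequence
recon = head(encoder(tokens)).reshape(N_AGENTS, T, D_IN)
loss = ((recon - traj) ** 2)[mask].mean()              # loss only on hidden steps
print(float(loss))
```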
- Semi-supervised Open-World Object Detection [74.95267079505145]
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z)
- Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking [52.393359791978035]
Motion2VecSets is a 4D diffusion model for dynamic surface reconstruction from point cloud sequences.
We parameterize 4D dynamics with latent sets instead of using global latent codes.
For more temporally-coherent object tracking, we synchronously denoise deformation latent sets and exchange information across multiple frames.
arXiv Detail & Related papers (2024-01-12T15:05:08Z)
- InsMOS: Instance-Aware Moving Object Segmentation in LiDAR Data [13.196031553445117]
We propose a novel network that addresses the challenge of segmenting moving objects in 3D LiDAR scans.
Our method exploits a sequence of point clouds as input and quantizes them into 4D voxels.
We use 4D sparse convolutions to extract motion features from the 4D voxels and inject them into the current scan.
arXiv Detail & Related papers (2023-03-07T14:12:52Z)
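A minimal sketch of the voxelization step in the InsMOS entry above: consecutive point clouds are quantized into sparse 4D (time, x, y, z) voxel coordinates. The grid extent and voxel size are arbitrary example values, and InsMOS's 4D sparse convolutions are not included.

```python
import numpy as np

def voxelize_sequence(scans, voxel_size=0.2, bounds=(-50.0, 50.0)):
    """scans: list of (N_i, 3) point arrays from consecutive LiDAR frames.
    Returns unique (t, ix, iy, iz) voxel coordinates and per-voxel point counts."""
    coords = []
    for t, pts in enumerate(scans):
        keep = np.all((pts >= bounds[0]) & (pts < bounds[1]), axis=1)
        idx = np.floor((pts[keep] - bounds[0]) / voxel_size).astype(np.int32)
        coords.append(np.concatenate(
            [np.full((idx.shape[0], 1), t, dtype=np.int32), idx], axis=1))
    coords = np.concatenate(coords, axis=0)                 # (N, 4) = (t, ix, iy, iz)
    voxels, counts = np.unique(coords, axis=0, return_counts=True)
    return voxels, counts

scans = [np.random.uniform(-50, 50, size=(1000, 3)) for _ in range(5)]  # toy frames
voxels, counts = voxelize_sequence(scans)
print(voxels.shape, counts.shape)          # sparse 4D occupancy: coordinates + counts
```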
- Optimal Transport for Change Detection on LiDAR Point Clouds [16.552050876277242]
Unsupervised change detection between airborne LiDAR point clouds is difficult due to mismatched spatial support and noise from the acquisition system.
We propose an unsupervised approach based on the computation of the transport of 3D LiDAR points over two temporal supports.
Our method allows for unsupervised multi-class classification and outperforms the previous state-of-the-art unsupervised approaches by a significant margin.
arXiv Detail & Related papers (2023-02-14T13:08:07Z)
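A toy illustration of change detection via optimal transport between two point clouds, as in the entry above, using a plain entropic Sinkhorn iteration; the cost, regularization, and threshold below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def sinkhorn_plan(A, B, reg=0.2, n_iter=200):
    """A: (n, 3), B: (m, 3). Entropic-regularized transport plan between
    uniform point masses; returns an (n, m) coupling matrix."""
    n, m = len(A), len(B)
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    K = np.exp(-cost / reg)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    u, v = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    for _ in range(n_iter):                 # alternate marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
A = rng.uniform(0, 10, size=(300, 3))                               # epoch 1 (toy)
B = np.vstack([A[:250] + 0.05, rng.uniform(0, 10, size=(80, 3))])   # epoch 2 (toy)
P = sinkhorn_plan(A, B)
dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
travel = (P * dist).sum(axis=1) / P.sum(axis=1)     # mean transported distance
changed = travel > 1.0                              # hypothetical threshold
print(changed.sum(), "of", len(A), "points flagged as changed")
```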
- Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation [23.666607237164186]
We propose a novel deep neural network exploiting both spatial-temporal information and different representation modalities of LiDAR scans to improve LiDAR-MOS performance.
Specifically, we first use a range image-based dual-branch structure to separately deal with spatial and temporal information.
We also use a point refinement module via 3D sparse convolution to fuse the information from both LiDAR range image and point cloud representations.
arXiv Detail & Related papers (2022-07-05T17:59:17Z)
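As background for the range-image branch mentioned in the entry above, here is a minimal spherical projection of a LiDAR scan into a range image; the resolution and vertical field of view are example values only, not the paper's configuration.

```python
import numpy as np

def to_range_image(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """points: (N, 3) xyz. Returns an (h, w) range image (0 where empty)."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(points[:, 1], points[:, 0])              # azimuth
    pitch = np.arcsin(np.clip(points[:, 2] / np.maximum(r, 1e-8), -1, 1))
    u = ((0.5 * (1.0 - yaw / np.pi)) * w).astype(np.int32)     # column index
    v = ((1.0 - (pitch - fov_down) / (fov_up - fov_down)) * h).astype(np.int32)
    u, v = np.clip(u, 0, w - 1), np.clip(v, 0, h - 1)
    img = np.zeros((h, w), dtype=np.float32)
    order = np.argsort(-r)              # write far points first so near ones win
    img[v[order], u[order]] = r[order]
    return img

scan = np.random.uniform(-30, 30, size=(5000, 3))              # toy point cloud
range_img = to_range_image(scan)
print(range_img.shape)                                          # (64, 1024)
```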
- LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
We extend DS-Net to 4D panoptic LiDAR segmentation by temporally unified instance clustering on aligned LiDAR frames.
Our proposed DS-Net achieves superior accuracy over current state-of-the-art methods in both tasks.
arXiv Detail & Related papers (2022-03-14T15:25:42Z)
- LiMoSeg: Real-time Bird's Eye View based LiDAR Motion Segmentation [8.184561295177623]
This paper proposes a novel real-time architecture for motion segmentation of Light Detection and Ranging (LiDAR) data.
We use two successive scans of LiDAR data in 2D Bird's Eye View representation to perform pixel-wise classification as static or moving.
We demonstrate a low latency of 8 ms on a commonly used automotive embedded platform, namely Nvidia Jetson Xavier.
arXiv Detail & Related papers (2021-11-08T23:40:55Z)
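A rough sketch of the input preparation implied by the LiMoSeg entry above: two successive scans rasterized into 2D bird's-eye-view occupancy grids and stacked as channels, the kind of input a per-pixel static/moving classifier could consume. Grid extent and resolution are arbitrary here.

```python
import numpy as np

def to_bev(points, extent=50.0, res=0.25):
    """points: (N, 3). Returns an (H, W) occupancy grid over x, y in [-extent, extent)."""
    size = int(2 * extent / res)
    keep = np.all(np.abs(points[:, :2]) < extent, axis=1)
    ij = np.floor((points[keep, :2] + extent) / res).astype(np.int32)
    bev = np.zeros((size, size), dtype=np.float32)
    bev[ij[:, 0], ij[:, 1]] = 1.0
    return bev

scan_t0 = np.random.uniform(-60, 60, size=(8000, 3))    # toy successive scans
scan_t1 = np.random.uniform(-60, 60, size=(8000, 3))
pair = np.stack([to_bev(scan_t0), to_bev(scan_t1)])      # (2, H, W) network input
print(pair.shape)                                         # (2, 400, 400)
```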
- Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding [8.45908939323268]
We propose a self-supervised framework for learning generalizable representations for non-stationary time series.
Our motivation stems from the medical field, where the ability to model the dynamic nature of time series data is especially valuable.
arXiv Detail & Related papers (2021-06-01T19:53:24Z)
- SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing [131.04304632759033]
Small and cluttered objects are common in real-world scenes and are challenging to detect.
In this paper, we introduce the idea of denoising to object detection.
Instance-level denoising is performed on the feature map to enhance the detection of small and cluttered objects.
arXiv Detail & Related papers (2020-04-28T06:03:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.