ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale
LiDAR Point Clouds
- URL: http://arxiv.org/abs/2304.12589v1
- Date: Tue, 25 Apr 2023 05:46:24 GMT
- Title: ContrastMotion: Self-supervised Scene Motion Learning for Large-Scale
LiDAR Point Clouds
- Authors: Xiangze Jia, Hui Zhou, Xinge Zhu, Yandong Guo, Ji Zhang, Yuexin Ma
- Abstract summary: We propose a novel self-supervised motion estimator for LiDAR-based autonomous driving via BEV representation.
We predict scene motion via feature-level consistency between pillars in consecutive frames, which can eliminate the effect caused by noise points and view-changing point clouds in dynamic scenes.
- Score: 21.6511040107249
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel self-supervised motion estimator for
LiDAR-based autonomous driving via BEV representation. Unlike the commonly
adopted self-supervised strategies that enforce data-level structure consistency, we
predict scene motion via feature-level consistency between pillars in
consecutive frames, which eliminates the effect of noise points and
view-changing point clouds in dynamic scenes. Specifically, we propose
\textit{Soft Discriminative Loss} that provides the network with more
pseudo-supervised signals to learn discriminative and robust features in a
contrastive learning manner. We also propose \textit{Gated Multi-frame Fusion}
block that learns valid compensation between point cloud frames automatically
to enhance feature extraction. Finally, \textit{pillar association} is proposed
to predict pillar correspondence probabilities based on feature distance, from
which scene motion is further predicted. Extensive experiments show the
effectiveness and superiority of our \textbf{ContrastMotion} on both scene flow
and motion prediction tasks. The code will be released soon.
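The pillar-association step described in the abstract (correspondence probabilities derived from feature distances, then scene motion read off from the matched pillars) can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the function name, the softmax-over-negative-distance formulation, the temperature `tau`, and the expected-displacement readout are all assumptions.

```python
import numpy as np

def pillar_association(feat_t0, feat_t1, coords_t0, coords_t1, tau=0.1):
    """Illustrative sketch: soft pillar correspondence from feature distance.

    feat_t0:   (N, C) pillar features at frame t
    feat_t1:   (M, C) pillar features at frame t+1
    coords_t0: (N, 2) BEV pillar centers at frame t
    coords_t1: (M, 2) BEV pillar centers at frame t+1
    """
    # Pairwise Euclidean feature distances between pillars of the two frames.
    d = np.linalg.norm(feat_t0[:, None, :] - feat_t1[None, :, :], axis=-1)  # (N, M)
    # Turn distances into correspondence probabilities (closer => more likely).
    logits = -d / tau
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)                               # rows sum to 1
    # Scene motion: expected displacement under the correspondence distribution.
    expected_coords = probs @ coords_t1                                     # (N, 2)
    return expected_coords - coords_t0                                      # (N, 2) BEV motion

# Toy usage: identical features => identity matching, uniform (1, 1) shift recovered.
rng = np.random.default_rng(0)
f0 = rng.normal(size=(4, 8)); f1 = f0.copy()
c0 = rng.normal(size=(4, 2)); c1 = c0 + 1.0
motion = pillar_association(f0, f1, c0, c1)
```

With identical features the correspondence matrix collapses to the identity, so the estimated motion equals the true per-pillar displacement.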
Related papers
- Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories [28.701879490459675]
We aim to learn an implicit motion field parameterized by a neural network to predict the movement of novel points within the same domain.
We exploit the intrinsic regularization provided by SIREN, and modify the input layer to produce a temporally smooth motion field.
Our experiments assess the model's performance in predicting unseen point trajectories and its application in temporal mesh alignment with deformation.
arXiv Detail & Related papers (2024-06-05T21:02:10Z) - Regularizing Self-supervised 3D Scene Flows with Surface Awareness and Cyclic Consistency [3.124750429062221]
We introduce two new consistency losses that enlarge clusters while preventing them from spreading over distinct objects.
The proposed losses are model-independent and can thus be used in a plug-and-play fashion to significantly improve the performance of existing models.
We also showcase the effectiveness and generalization capability of our framework on four standard sensor-unique driving datasets.
arXiv Detail & Related papers (2023-12-12T11:00:39Z) - Self-Supervised 3D Scene Flow Estimation and Motion Prediction using
Local Rigidity Prior [100.98123802027847]
We investigate self-supervised 3D scene flow estimation and class-agnostic motion prediction on point clouds.
We generate pseudo scene flow labels for self-supervised learning through piecewise rigid motion estimation.
Our method achieves new state-of-the-art performance in self-supervised scene flow learning.
arXiv Detail & Related papers (2023-10-17T14:06:55Z) - A Spatiotemporal Correspondence Approach to Unsupervised LiDAR
Segmentation with Traffic Applications [16.260518238832887]
The key idea is to leverage the nature of a dynamic point cloud sequence and introduce drastically stronger augmentation scenarios.
We alternate between optimizing semantic groupings and clustering using point-wise temporal labels.
Our method can learn discriminative features in an unsupervised learning fashion.
arXiv Detail & Related papers (2023-08-23T21:32:46Z) - Point Contrastive Prediction with Semantic Clustering for
Self-Supervised Learning on Point Cloud Videos [71.20376514273367]
We propose a unified point cloud video self-supervised learning framework for object-centric and scene-centric data.
Our method outperforms supervised counterparts on a wide range of downstream tasks.
arXiv Detail & Related papers (2023-08-18T02:17:47Z) - Data Augmentation-free Unsupervised Learning for 3D Point Cloud
Understanding [61.30276576646909]
We propose an augmentation-free unsupervised approach for point clouds to learn transferable point-level features via soft clustering, named SoftClu.
We exploit the affiliation of points to their clusters as a proxy to enable self-training through a pseudo-label prediction task.
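The soft-clustering pseudo-label idea summarized above (point-to-cluster affiliations used as self-training targets) can be sketched roughly as follows. This is a hedged sketch, not the SoftClu authors' code: the function name, the distance-based softmax, and the temperature `tau` are illustrative assumptions.

```python
import numpy as np

def soft_cluster_pseudo_labels(point_feats, centroids, tau=0.5):
    """Illustrative sketch: soft point-to-cluster affiliations as pseudo-labels.

    point_feats: (N, C) per-point features
    centroids:   (K, C) cluster centers in feature space
    Returns soft affiliations (N, K) and hard pseudo-labels (N,).
    """
    # Distance of every point to every cluster center.
    d = np.linalg.norm(point_feats[:, None, :] - centroids[None, :, :], axis=-1)  # (N, K)
    # Softmax over negative distance: nearer clusters get higher affiliation.
    logits = -d / tau
    soft = np.exp(logits - logits.max(axis=1, keepdims=True))
    soft /= soft.sum(axis=1, keepdims=True)      # soft affiliation, rows sum to 1
    hard = soft.argmax(axis=1)                   # pseudo-label for self-training
    return soft, hard

# Toy usage: two points near centroid 0, one near centroid 1.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
cents = np.array([[0.0, 0.0], [5.0, 5.0]])
soft, hard = soft_cluster_pseudo_labels(feats, cents)
```

The hard labels can then supervise a pseudo-label prediction task, while the soft affiliations retain uncertainty about cluster membership.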
arXiv Detail & Related papers (2022-10-06T10:18:16Z) - Domain Knowledge Driven Pseudo Labels for Interpretable Goal-Conditioned
Interactive Trajectory Prediction [29.701029725302586]
We study the joint trajectory prediction problem with the goal-conditioned framework.
We introduce a conditional-variational-autoencoder-based (CVAE) model to explicitly encode different interaction modes into the latent space.
We propose a novel approach to avoid KL vanishing and induce an interpretable interactive latent space with pseudo labels.
arXiv Detail & Related papers (2022-03-28T21:41:21Z) - SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation [71.2856098776959]
Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform.
We propose a novel architecture named Sparse Convolution-Transformer Network (SCTN) that equips the sparse convolution with the transformer.
We show that the learned relation-based contextual information is rich and helpful for matching corresponding points, benefiting scene flow estimation.
arXiv Detail & Related papers (2021-05-10T15:16:14Z) - Self-Supervised Pillar Motion Learning for Autonomous Driving [10.921208239968827]
We propose a learning framework that leverages free supervisory signals from point clouds and paired camera images to estimate motion purely via self-supervision.
Our model involves a point cloud based structural consistency augmented with probabilistic motion masking as well as a cross-sensor motion regularization to realize the desired self-supervision.
arXiv Detail & Related papers (2021-04-18T02:32:08Z) - Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z) - LiDAR-based Panoptic Segmentation via Dynamic Shifting Network [56.71765153629892]
LiDAR-based panoptic segmentation aims to parse both objects and scenes in a unified manner.
We propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm.
Our proposed DS-Net achieves superior accuracies over current state-of-the-art methods.
arXiv Detail & Related papers (2020-11-24T08:44:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.