Learning Scene Dynamics from Point Cloud Sequences
- URL: http://arxiv.org/abs/2111.08755v1
- Date: Tue, 16 Nov 2021 19:52:46 GMT
- Title: Learning Scene Dynamics from Point Cloud Sequences
- Authors: Pan He, Patrick Emami, Sanjay Ranka, Anand Rangarajan
- Abstract summary: We propose a novel problem -- sequential scene flow estimation (SSFE) -- that aims to predict 3D scene flow for all pairs of point clouds in a sequence.
We introduce the SPCM-Net architecture, which solves this problem by computing multi-scale correlations between neighboring point clouds.
We demonstrate that this approach can be effectively modified for sequential point cloud forecasting.
- Score: 8.163697683448811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding 3D scenes is a critical prerequisite for autonomous agents.
Recently, LiDAR and other sensors have made large amounts of data available in
the form of temporal sequences of point cloud frames. In this work, we propose
a novel problem -- sequential scene flow estimation (SSFE) -- that aims to
predict 3D scene flow for all pairs of point clouds in a given sequence. This
is unlike the previously studied problem of scene flow estimation which focuses
on two frames.
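To make the task concrete, here is a toy illustration (not the paper's method) of what a scene flow field is: each point in frame t is assigned a 3D motion vector pointing to its position in frame t+1. SSFE extends this to every consecutive pair of frames in a sequence.

```python
import numpy as np

# Toy illustration: scene flow assigns each point in frame t a 3D motion
# vector pointing to its position in frame t+1. Here we fabricate a simple
# rigid translation so the ground-truth flow is known exactly.
rng = np.random.default_rng(0)
points_t = rng.standard_normal((100, 3))      # frame t: N x 3 point cloud
true_motion = np.array([0.5, 0.0, -0.1])      # a simple rigid translation
points_t1 = points_t + true_motion            # frame t+1

# With known correspondences, the flow is just the per-point displacement.
flow = points_t1 - points_t                   # N x 3 scene flow field
# Every row of `flow` equals the translation [0.5, 0.0, -0.1].
```

In real LiDAR data correspondences are unknown and the motion is non-rigid, which is what makes estimation hard; SSFE additionally requires predicting such a field for all frame pairs in a sequence.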
We introduce the SPCM-Net architecture, which solves this problem by
computing multi-scale spatiotemporal correlations between neighboring point
clouds and then aggregating the correlation across time with an order-invariant
recurrent unit. Our experimental evaluation confirms that recurrent processing
of point cloud sequences results in significantly better SSFE compared to using
only two frames. Additionally, we demonstrate that this approach can be
effectively modified for sequential point cloud forecasting (SPF), a related
problem that demands forecasting future point cloud frames.
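The aggregation idea above can be sketched in a few lines. This is a hedged, minimal stand-in for SPCM-Net (the weight matrices, pooling choice, and tanh update are illustrative assumptions, not the paper's architecture): per-pair correlation features are pooled over each point's neighbors with max(), which is invariant to the order in which neighbors are listed, and then folded into a recurrent state across time.

```python
import numpy as np

# Hedged sketch (not SPCM-Net itself): aggregate per-pair correlation
# features across time with a simple recurrent update. Within each step,
# features are pooled over neighbors with max(), which is invariant to
# the order in which neighboring points are listed.
def aggregate_sequence(corr_feats, W_x, W_h):
    """corr_feats: T x N x K x D (frame pairs, points, neighbors, feat dim)."""
    T, N, K, D = corr_feats.shape
    h = np.zeros((N, D))                      # recurrent state per point
    for t in range(T):
        pooled = corr_feats[t].max(axis=1)    # N x D, order-invariant pooling
        h = np.tanh(pooled @ W_x + h @ W_h)   # simple recurrent update
    return h

rng = np.random.default_rng(1)
feats = rng.standard_normal((4, 32, 8, 16))   # 4 frame pairs, 32 points
W_x = rng.standard_normal((16, 16)) * 0.1
W_h = rng.standard_normal((16, 16)) * 0.1
state = aggregate_sequence(feats, W_x, W_h)   # final state: 32 x 16
```

Because the pooling is a max over the neighbor axis, permuting the neighbor order leaves the output unchanged, which is the order-invariance property the abstract refers to.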
Our experimental results are evaluated using a new benchmark for both SSFE
and SPF consisting of synthetic and real datasets. Previously, datasets for
scene flow estimation have been limited to two frames. We provide non-trivial
extensions to these datasets for multi-frame estimation and prediction. Due to
the difficulty of obtaining ground truth motion for real-world datasets, we use
self-supervised training and evaluation metrics. We believe that this benchmark
will be pivotal to future research in this area. All code for benchmark and
models will be made accessible.
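One widely used self-supervised metric of the kind the abstract alludes to is the Chamfer distance, which scores a warped or forecast cloud against the next observed frame without requiring point correspondences or ground-truth motion. A minimal sketch (the symmetric mean-of-minima variant is one common convention; the paper's exact metric may differ):

```python
import numpy as np

# Hedged sketch: symmetric Chamfer distance between two point clouds.
# It needs no correspondences, making it usable as a self-supervised
# loss/metric when ground-truth motion is unavailable.
def chamfer_distance(a, b):
    """a: N x 3, b: M x 3. Returns a non-negative scalar."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # N x M
    return d.min(axis=1).mean() + d.min(axis=0).mean()

rng = np.random.default_rng(2)
cloud = rng.standard_normal((64, 3))
zero = chamfer_distance(cloud, cloud)           # identical clouds -> 0.0
small = chamfer_distance(cloud, cloud + 0.01)   # small perturbation -> small value
```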
Related papers
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - Point Cloud Pre-training with Diffusion Models [62.12279263217138]
We propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif)
PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection.
arXiv Detail & Related papers (2023-11-25T08:10:05Z) - Pushing the Limits of Pre-training for Time Series Forecasting in the CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show that our pre-trained method is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z) - SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow [25.577386156273256]
Scene flow estimation is a long-standing problem in computer vision, where the goal is to find the 3D motion of a scene from its consecutive observations.
We introduce SCOOP, a new method for scene flow estimation that can be learned on a small amount of data without employing ground-truth flow supervision.
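A correspondence-based estimate of the kind SCOOP builds on can be sketched simply (this is an illustrative nearest-neighbor baseline, not SCOOP's learned correspondence or its optimization-based refinement): each point's flow is initialized as the offset to its nearest neighbor in the next frame, requiring no flow labels.

```python
import numpy as np

# Hedged sketch (not SCOOP itself): correspondence-based flow initialization.
# Each source point's flow is the offset to its nearest neighbor in the
# next frame -- no ground-truth flow supervision is needed.
def nn_flow(src, dst):
    """src: N x 3, dst: M x 3. Returns N x 3 per-point flow vectors."""
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)  # N x M
    nn = d.argmin(axis=1)            # index of each source point's match
    return dst[nn] - src

# Well-separated points shifted by a small translation: nearest-neighbor
# matching recovers the true motion exactly.
src = np.arange(10, dtype=float)[:, None] * np.ones(3)
dst = src + np.array([0.1, 0.0, 0.0])
flow = nn_flow(src, dst)
```

For dense, noisy clouds this initialization is ambiguous, which is why methods like SCOOP refine it, e.g. with learned correspondence confidence and smoothness-based optimization.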
arXiv Detail & Related papers (2022-11-25T10:52:02Z) - RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point
Clouds [44.034836961967144]
3D motion estimation including scene flow and point cloud registration has drawn increasing interest.
Recent methods employ deep neural networks to construct the cost volume for estimating accurate 3D flow.
We decompose the problem into two interlaced stages, where the 3D flows are optimized point-wisely at the first stage and then globally regularized in a recurrent network at the second stage.
arXiv Detail & Related papers (2022-05-23T04:04:30Z) - IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding
Alignment [58.8330387551499]
We formulate the problem as estimation of point-wise trajectories (i.e., smooth curves)
We propose IDEA-Net, an end-to-end deep learning framework, which disentangles the problem under the assistance of the explicitly learned temporal consistency.
We demonstrate the effectiveness of our method on various point cloud sequences and observe large improvement over state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2022-03-22T10:14:08Z) - Residual 3D Scene Flow Learning with Context-Aware Feature Extraction [11.394559627312743]
We propose a novel context-aware set conv layer to exploit contextual structure information of Euclidean space.
We also propose an explicit residual flow learning structure in the residual flow refinement layer to cope with long-distance movement.
Our method achieves state-of-the-art performance, surpassing, to the best of our knowledge, all previous works by at least 25%.
arXiv Detail & Related papers (2021-09-10T06:15:18Z) - SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation [71.2856098776959]
Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform.
We propose a novel architecture named Sparse Convolution-Transformer Network (SCTN) that equips the sparse convolution with the transformer.
We show that the learned relation-based contextual information is rich and helpful for matching corresponding points, benefiting scene flow estimation.
arXiv Detail & Related papers (2021-05-10T15:16:14Z) - Scalable Scene Flow from Point Clouds in the Real World [30.437100097997245]
We introduce a new large-scale benchmark for scene flow based on the Waymo Open Dataset.
We show how previous works were limited by the amount of real LiDAR data available.
We introduce the model architecture FastFlow3D that provides real time inference on the full point cloud.
arXiv Detail & Related papers (2021-03-01T20:56:05Z) - Occlusion Guided Scene Flow Estimation on 3D Point Clouds [4.518012967046983]
3D scene flow estimation is a vital tool in perceiving our environment given depth or range sensors.
Here we propose a new scene flow architecture called OGSF-Net which tightly couples the learning for both flow and occlusions between frames.
This tight coupling results in more accurate flow prediction in space.
arXiv Detail & Related papers (2020-11-30T15:22:03Z) - Learning multiview 3D point cloud registration [74.39499501822682]
We present a novel, end-to-end learnable, multiview 3D point cloud registration algorithm.
Our approach outperforms the state-of-the-art by a significant margin, while being end-to-end trainable and computationally less costly.
arXiv Detail & Related papers (2020-01-15T03:42:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.