Regularizing Self-supervised 3D Scene Flows with Surface Awareness and Cyclic Consistency
- URL: http://arxiv.org/abs/2312.08879v3
- Date: Tue, 13 Aug 2024 14:32:37 GMT
- Title: Regularizing Self-supervised 3D Scene Flows with Surface Awareness and Cyclic Consistency
- Authors: Patrik Vacek, David Hurych, Karel Zimmermann, Patrick Perez, Tomas Svoboda
- Abstract summary: We introduce two new consistency losses that enlarge clusters while preventing them from spreading over distinct objects.
The proposed losses are model-independent and can thus be used in a plug-and-play fashion to significantly improve the performance of existing models.
We also showcase the effectiveness and generalization capability of our framework on four standard sensor-unique driving datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning without supervision how to predict 3D scene flows from point clouds is essential to many perception systems. We propose a novel learning framework for this task which improves the necessary regularization. Relying on the assumption that scene elements are mostly rigid, current smoothness losses are built on the definition of "rigid clusters" in the input point clouds. The definition of these clusters is challenging and has a significant impact on the quality of predicted flows. We introduce two new consistency losses that enlarge clusters while preventing them from spreading over distinct objects. In particular, we enforce temporal consistency with a forward-backward cyclic loss and spatial consistency by considering surface orientation similarity in addition to spatial proximity. The proposed losses are model-independent and can thus be used in a plug-and-play fashion to significantly improve the performance of existing models, as demonstrated on the two most widely used architectures. We also showcase the effectiveness and generalization capability of our framework on four standard sensor-unique driving datasets, achieving state-of-the-art performance in 3D scene flow estimation. Our code is available at https://github.com/ctu-vras/sac-flow.
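The two consistency ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the abstract does not give the exact loss definitions); the function names, the brute-force nearest-neighbour search, and the `normal_weight` parameter are all assumptions made for illustration. The cyclic term checks that warping points forward and then applying a backward flow returns them to their start, and the smoothness term picks neighbours by both position and surface normal so that nearby points on differently oriented surfaces are not forced to share a flow.

```python
import numpy as np


def cyclic_consistency_loss(points, flow_fwd, flow_bwd_fn):
    """Forward-backward cycle residual (hypothetical helper).

    points:      (N, 3) source point cloud
    flow_fwd:    (N, 3) predicted forward flow for each point
    flow_bwd_fn: callable mapping warped points (N, 3) to backward flow (N, 3);
                 stands in for running the flow model in the reverse direction
    """
    warped = points + flow_fwd
    flow_bwd = flow_bwd_fn(warped)
    # A consistent cycle satisfies flow_fwd + flow_bwd == 0 for every point.
    return np.linalg.norm(flow_fwd + flow_bwd, axis=1).mean()


def surface_aware_smoothness(points, normals, flow, k=4, normal_weight=1.0):
    """Flow smoothness over neighbours chosen by position AND normal (assumed form).

    normals are assumed to be unit vectors with consistent orientation.
    normal_weight=0 reduces to plain spatial nearest neighbours.
    """
    # Joint feature: 3D position concatenated with the scaled surface normal.
    feats = np.hstack([points, normal_weight * normals])
    # Brute-force pairwise distances; fine for a small illustrative cloud.
    dist = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]  # (N, k) neighbour indices
    # Penalise flow differences between each point and its neighbours.
    diff = flow[:, None, :] - flow[idx]
    return np.linalg.norm(diff, axis=2).mean()
```

With `normal_weight=0` the neighbourhoods are purely spatial, so a static ground point next to a moving wall point would be pulled toward the same flow; increasing `normal_weight` keeps neighbourhoods on the same surface. In a training setup both terms would be added to an existing self-supervised objective, with weights chosen by the user.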
Related papers
- ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Prediction [89.89610257714006]
Existing methods prioritize higher accuracy to cater to the demands of these tasks.
We introduce a series of targeted improvements for 3D semantic occupancy prediction and flow estimation.
Our purely convolutional architecture framework, named ALOcc, achieves an optimal tradeoff between speed and accuracy.
arXiv Detail & Related papers (2024-11-12T11:32:56Z) - OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the Semantic KITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z) - Let-It-Flow: Simultaneous Optimization of 3D Flow and Object Clustering [2.763111962660262]
We study the problem of self-supervised 3D scene flow estimation from real large-scale raw point cloud sequences.
We propose a novel clustering approach that allows for combination of overlapping soft clusters as well as non-overlapping rigid clusters.
Our method especially excels in resolving flow in complicated dynamic scenes with multiple independently moving objects close to each other.
arXiv Detail & Related papers (2024-04-12T10:04:03Z) - STARFlow: Spatial Temporal Feature Re-embedding with Attentive Learning for Real-world Scene Flow [5.476991379461233]
We propose global attentive flow embedding to match all-to-all point pairs in both feature space and Euclidean space.
We leverage novel domain adaptive losses to bridge the gap of motion inference from synthetic to real-world.
Our approach achieves state-of-the-art performance across various datasets, with particularly outstanding results on real-world LiDAR-scanned datasets.
arXiv Detail & Related papers (2024-03-11T04:56:10Z) - On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation [56.97699793236174]
We study two kinds of robust cross-view consistency in this paper.
We exploit the temporal coherence in both depth feature space and 3D voxel space for self-supervised monocular depth estimation.
Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-19T03:46:13Z) - IDEA-Net: Dynamic 3D Point Cloud Interpolation via Deep Embedding Alignment [58.8330387551499]
We formulate the problem as estimation of point-wise trajectories (i.e., smooth curves).
We propose IDEA-Net, an end-to-end deep learning framework, which disentangles the problem under the assistance of the explicitly learned temporal consistency.
We demonstrate the effectiveness of our method on various point cloud sequences and observe large improvement over state-of-the-art methods both quantitatively and visually.
arXiv Detail & Related papers (2022-03-22T10:14:08Z) - Residual 3D Scene Flow Learning with Context-Aware Feature Extraction [11.394559627312743]
We propose a novel context-aware set conv layer to exploit contextual structure information of Euclidean space.
We also propose an explicit residual flow learning structure in the residual flow refinement layer to cope with long-distance movement.
To the best of our knowledge, our method achieves state-of-the-art performance, surpassing all previous works by at least 25%.
arXiv Detail & Related papers (2021-09-10T06:15:18Z) - SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation [71.2856098776959]
Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform.
We propose a novel architecture named Sparse Convolution-Transformer Network (SCTN) that equips the sparse convolution with the transformer.
We show that the learned relation-based contextual information is rich and helpful for matching corresponding points, benefiting scene flow estimation.
arXiv Detail & Related papers (2021-05-10T15:16:14Z) - Occlusion Guided Self-supervised Scene Flow Estimation on 3D Point Clouds [4.518012967046983]
Understanding the flow in 3D space of sparsely sampled points between two consecutive time frames is the cornerstone of modern geometric-driven systems.
This work presents a new self-supervised training method and an architecture for the 3D scene flow estimation under occlusions.
arXiv Detail & Related papers (2021-04-10T09:55:19Z) - Occlusion Guided Scene Flow Estimation on 3D Point Clouds [4.518012967046983]
3D scene flow estimation is a vital tool in perceiving our environment given depth or range sensors.
Here we propose a new scene flow architecture called OGSF-Net which tightly couples the learning for both flow and occlusions between frames.
Their coupled symbiosis results in a more accurate prediction of flow in space.
arXiv Detail & Related papers (2020-11-30T15:22:03Z)