STARFlow: Spatial Temporal Feature Re-embedding with Attentive Learning for Real-world Scene Flow
- URL: http://arxiv.org/abs/2403.07032v2
- Date: Thu, 14 Nov 2024 06:36:57 GMT
- Title: STARFlow: Spatial Temporal Feature Re-embedding with Attentive Learning for Real-world Scene Flow
- Authors: Zhiyang Lu, Qinghan Chen, Ming Cheng
- Abstract summary: We propose global attentive flow embedding to match all-to-all point pairs in both feature space and Euclidean space.
We leverage novel domain adaptive losses to bridge the gap of motion inference from synthetic to real-world.
Our approach achieves state-of-the-art performance across various datasets, with particularly outstanding results on real-world LiDAR-scanned datasets.
- Score: 5.476991379461233
- License:
- Abstract: Scene flow prediction is a crucial underlying task in understanding dynamic scenes, as it offers fundamental motion information. However, contemporary scene flow methods encounter three major challenges. Firstly, flow estimation based solely on local receptive fields lacks long-range dependency matching of point pairs. To address this issue, we propose global attentive flow embedding to match all-to-all point pairs in both feature space and Euclidean space, providing global initialization before local refinement. Secondly, non-rigid objects deform after warping, which leads to variations in the spatiotemporal relation between consecutive frames. For a more precise estimation of residual flow, a spatial temporal feature re-embedding module is devised to acquire the sequence features after deformation. Furthermore, previous methods generalize poorly due to the significant domain gap between synthesized and LiDAR-scanned datasets. We leverage novel domain adaptive losses to effectively bridge the gap of motion inference from synthetic to real-world. Experiments demonstrate that our approach achieves state-of-the-art performance across various datasets, with particularly outstanding results on real-world LiDAR-scanned datasets. Our code is available at https://github.com/O-VIGIA/StarFlow.
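The all-to-all matching idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `global_soft_flow`, the temperature parameters `tau_feat` and `tau_dist`, and the choice to simply add a scaled dot-product feature affinity to a negative squared-distance affinity are all assumptions made for illustration. It only shows how every point in the first frame can be softly matched against every point in the second frame to obtain a global initial flow, which a local refinement stage would then correct.

```python
import numpy as np

def global_soft_flow(p1, p2, f1, f2, tau_feat=1.0, tau_dist=1.0):
    """Hypothetical sketch: initial flow for p1 via all-to-all soft matching to p2.

    p1: (N, 3) and p2: (M, 3) point coordinates.
    f1: (N, C) and f2: (M, C) per-point features.
    tau_feat, tau_dist: assumed temperatures, not taken from the paper.
    """
    # Feature-space affinity: scaled dot product between every point pair.
    feat_logits = (f1 @ f2.T) / (np.sqrt(f1.shape[1]) * tau_feat)
    # Euclidean-space affinity: negative squared distance between every point pair.
    diff = p1[:, None, :] - p2[None, :, :]
    dist_logits = -np.sum(diff ** 2, axis=-1) / tau_dist
    logits = feat_logits + dist_logits
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Soft correspondence in frame 2 minus the source position gives the flow.
    return weights @ p2 - p1
```

Calling `global_soft_flow(p1, p2, f1, f2)` returns an `(N, 3)` array. The pairwise matrices cost O(N·M) memory, which is the usual price of global matching and one reason such a step would typically run on a downsampled point set.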
Related papers
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- SSRFlow: Semantic-aware Fusion with Spatial Temporal Re-embedding for Real-world Scene Flow [6.995663556921384]
Scene flow provides the 3D motion field of the first frame from two consecutive point clouds.
We propose a novel approach called Dual Cross Attentive (DCA) for the latent fusion and alignment between two frames based on semantic contexts.
We leverage novel domain adaptive losses to effectively bridge the gap of motion inference from synthetic to real-world.
arXiv Detail & Related papers (2024-07-31T02:28:40Z)
- Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z)
- Let-It-Flow: Simultaneous Optimization of 3D Flow and Object Clustering [2.763111962660262]
We study the problem of self-supervised 3D scene flow estimation from real large-scale raw point cloud sequences.
We propose a novel clustering approach that allows for combination of overlapping soft clusters as well as non-overlapping rigid clusters.
Our method especially excels in resolving flow in complicated dynamic scenes with multiple independently moving objects close to each other.
arXiv Detail & Related papers (2024-04-12T10:04:03Z)
- Regularizing Self-supervised 3D Scene Flows with Surface Awareness and Cyclic Consistency [3.124750429062221]
We introduce two new consistency losses that enlarge clusters while preventing them from spreading over distinct objects.
The proposed losses are model-independent and can thus be used in a plug-and-play fashion to significantly improve the performance of existing models.
We also showcase the effectiveness and generalization capability of our framework on four standard sensor-unique driving datasets.
arXiv Detail & Related papers (2023-12-12T11:00:39Z)
- Multi-Body Neural Scene Flow [37.31530794244607]
We show that multi-body rigidity can be achieved without the cumbersome and brittle strategy of constraining the $SE(3)$ parameters of each rigid body.
This is achieved by regularizing the scene flow optimization to encourage isometry in flow predictions for rigid bodies.
We conduct extensive experiments on real-world datasets and demonstrate that our approach outperforms the state-of-the-art in 3D scene flow and long-term point-wise 4D trajectory prediction.
arXiv Detail & Related papers (2023-10-16T11:37:53Z)
- Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z)
- SCOOP: Self-Supervised Correspondence and Optimization-Based Scene Flow [25.577386156273256]
Scene flow estimation is a long-standing problem in computer vision, where the goal is to find the 3D motion of a scene from its consecutive observations.
We introduce SCOOP, a new method for scene flow estimation that can be learned on a small amount of data without employing ground-truth flow supervision.
arXiv Detail & Related papers (2022-11-25T10:52:02Z)
- Domain-incremental Cardiac Image Segmentation with Style-oriented Replay and Domain-sensitive Feature Whitening [67.6394526631557]
A model should incrementally learn from each incoming dataset and progressively update with improved functionality as time goes by.
In medical scenarios, this is particularly challenging as accessing or storing past data is commonly not allowed due to data privacy.
We propose a novel domain-incremental learning framework to recover past domain inputs first and then regularly replay them during model optimization.
arXiv Detail & Related papers (2022-11-09T13:07:36Z)
- Learning to Estimate Hidden Motions with Global Motion Aggregation [71.12650817490318]
Occlusions pose a significant challenge to optical flow algorithms that rely on local evidence.
We introduce a global motion aggregation module to find long-range dependencies between pixels in the first image.
We demonstrate that the optical flow estimates in the occluded regions can be significantly improved without damaging the performance in non-occluded regions.
arXiv Detail & Related papers (2021-04-06T10:32:03Z)
- ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation [111.56730703473411]
Training deep neural networks (DNNs) on LiDAR data requires large-scale point-wise annotations.
Simulation-to-real domain adaptation (SRDA) trains a DNN using unlimited synthetic data with automatically generated labels.
ePointDA consists of three modules: self-supervised dropout noise rendering, statistics-invariant and spatially-adaptive feature alignment, and transferable segmentation learning.
arXiv Detail & Related papers (2020-09-07T23:46:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.