ZeroFlow: Scalable Scene Flow via Distillation
- URL: http://arxiv.org/abs/2305.10424v8
- Date: Thu, 14 Mar 2024 16:38:36 GMT
- Title: ZeroFlow: Scalable Scene Flow via Distillation
- Authors: Kyle Vedder, Neehar Peri, Nathaniel Chodosh, Ishan Khatri, Eric Eaton, Dinesh Jayaraman, Yang Liu, Deva Ramanan, James Hays
- Abstract summary: Scene flow estimation is the task of describing the 3D motion field between temporally successive point clouds.
State-of-the-art methods use strong priors and test-time optimization techniques, but require on the order of tens of seconds to process full-size point clouds.
We propose Scene Flow via Distillation, a simple, scalable distillation framework that uses a label-free optimization method to produce pseudo-labels to supervise a feedforward model.
- Score: 66.70820145266029
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scene flow estimation is the task of describing the 3D motion field between temporally successive point clouds. State-of-the-art methods use strong priors and test-time optimization techniques, but require on the order of tens of seconds to process full-size point clouds, making them unusable as computer vision primitives for real-time applications such as open world object detection. Feedforward methods are considerably faster, running on the order of tens to hundreds of milliseconds for full-size point clouds, but require expensive human supervision. To address both limitations, we propose Scene Flow via Distillation, a simple, scalable distillation framework that uses a label-free optimization method to produce pseudo-labels to supervise a feedforward model. Our instantiation of this framework, ZeroFlow, achieves state-of-the-art performance on the Argoverse 2 Self-Supervised Scene Flow Challenge while using zero human labels by simply training on large-scale, diverse unlabeled data. At test-time, ZeroFlow is over 1000x faster than label-free state-of-the-art optimization-based methods on full-size point clouds (34 FPS vs 0.028 FPS) and over 1000x cheaper to train on unlabeled data compared to the cost of human annotation ($394 vs ~$750,000). To facilitate further research, we release our code, trained model weights, and high quality pseudo-labels for the Argoverse 2 and Waymo Open datasets at https://vedder.io/zeroflow.html
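The recipe has two stages: an expensive label-free optimizer pseudo-labels raw point cloud pairs offline, and a fast feedforward network is then trained on those pseudo-labels as if they were ground truth. Below is a minimal NumPy sketch of that recipe; the nearest-neighbor `optimize_flow` teacher and the linear student are toy stand-ins for the paper's test-time optimizer and feedforward network, so every name and detail here is illustrative, not the released code.

```python
import numpy as np

rng = np.random.default_rng(0)

def optimize_flow(pc0, pc1, iters=100, lr=0.2):
    """Toy label-free teacher: gradient-descend a nearest-neighbor
    (Chamfer-style) distance so that pc0 + flow lands on pc1."""
    flow = np.zeros_like(pc0)
    for _ in range(iters):
        moved = pc0 + flow
        d2 = ((moved[:, None, :] - pc1[None, :, :]) ** 2).sum(-1)
        flow += lr * (pc1[d2.argmin(axis=1)] - moved)  # step toward matches
    return flow

def features(pc0, pc1):
    """Toy per-point features: own position plus the next frame's mean."""
    ctx = np.repeat(pc1.mean(axis=0, keepdims=True), len(pc0), axis=0)
    return np.concatenate([pc0, ctx], axis=1)          # (N, 6)

# Stage 1: pseudo-label an unlabeled pair offline with the slow teacher.
pc0 = rng.normal(size=(64, 3))
pc1 = pc0 + np.array([1.0, 0.0, 0.0])                  # true flow: +x shift
pseudo = optimize_flow(pc0, pc1)

# Stage 2: fit a fast feedforward student (here: linear) to the pseudo-labels.
X = features(pc0, pc1)
W = np.zeros((6, 3))
for _ in range(500):
    err = X @ W - pseudo                               # L2 flow loss
    W -= 0.01 * X.T @ err / len(pc0)                   # gradient step
```

At deployment only the student runs, which is where the abstract's 1000x test-time speedup over the optimization-based teacher comes from.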
Related papers
- Neural Eulerian Scene Flow Fields [59.57980592109722]
EulerFlow works out-of-the-box without tuning across multiple domains.
It exhibits emergent 3D point tracking behavior by solving its estimated ODE over long-time horizons.
It outperforms all prior art on the Argoverse 2 2024 Scene Flow Challenge.
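Because the summary leans on the ODE view, here is a toy illustration of the emergent-tracking claim: once a velocity field v(x, t) is in hand, integrating dx/dt = v(x, t) with explicit Euler steps yields long-horizon point tracks. EulerFlow fits a neural field to the data; the hand-written rotation-plus-drift field below is purely an assumed stand-in.

```python
import numpy as np

def velocity(x, t):
    """Stand-in for a learned velocity field v(x, t): a fixed rotation
    about the z-axis plus a slow upward drift."""
    omega = 0.5
    return np.stack([-omega * x[:, 1], omega * x[:, 0],
                     np.full(len(x), 0.1)], axis=1)

def track(points, t0=0.0, t1=5.0, dt=0.05):
    """Point tracking by solving dx/dt = v(x, t) with Euler steps."""
    x, t, traj = points.copy(), t0, [points.copy()]
    while t < t1:
        x = x + dt * velocity(x, t)
        t += dt
        traj.append(x.copy())
    return np.stack(traj)        # (steps + 1, N, 3) trajectories

tracks = track(np.random.default_rng(0).normal(size=(8, 3)))
```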
arXiv Detail & Related papers (2024-10-02T20:56:45Z)
- 3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling [21.726386822643995]
We present a novel approach to generate a large number of 3D scene flow pseudo labels for real-world LiDAR point clouds.
Specifically, we employ the assumption of rigid body motion to simulate potential object-level rigid movements in autonomous driving scenarios.
Because the target point clouds are synthesized exactly from the augmented motion parameters, we obtain large numbers of 3D scene flow labels that are highly consistent with real scenarios.
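A minimal sketch of that auto-labelling loop, assuming per-object point masks are available: sample a rigid motion per object, synthesize the target cloud by applying it, and read the flow label off the motion itself, so the label is exact by construction. The helper names below are hypothetical, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def rigid(yaw, t):
    """A rigid motion restricted to a yaw rotation plus a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return R, np.asarray(t)

def pseudo_label(pc, obj_masks):
    """Synthesize a target cloud by moving each object rigidly; the flow
    label is exact because the target is generated from the motion."""
    target, flow = pc.copy(), np.zeros_like(pc)
    for mask in obj_masks:
        R, t = rigid(rng.uniform(-0.1, 0.1), rng.uniform(-1.0, 1.0, 3))
        center = pc[mask].mean(axis=0)
        moved = (pc[mask] - center) @ R.T + center + t
        target[mask], flow[mask] = moved, moved - pc[mask]
    return target, flow

pc = rng.normal(size=(100, 3))
masks = [np.arange(100) < 50, np.arange(100) >= 50]
target_pc, flow_label = pseudo_label(pc, masks)
```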
arXiv Detail & Related papers (2024-02-28T08:12:31Z)
- Dense Optical Tracking: Connecting the Dots [82.79642869586587]
DOT is a novel, simple, and efficient method for point tracking in video.
We show that DOT is significantly more accurate than current optical flow techniques, outperforms sophisticated "universal trackers" like OmniMotion, and is on par with, or better than, the best point tracking algorithms like CoTracker.
arXiv Detail & Related papers (2023-12-01T18:59:59Z)
- InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation [33.70116170511312]
We propose a novel text-conditioned pipeline to turn Stable Diffusion (SD) into an ultra-fast one-step model.
We create the first one-step diffusion-based text-to-image generator with SD-level image quality, achieving an FID of $23.3$ on MS COCO 2017-5k.
arXiv Detail & Related papers (2023-09-12T16:42:09Z)
- PointFlowHop: Green and Interpretable Scene Flow Estimation from Consecutive Point Clouds [49.7285297470392]
An efficient 3D scene flow estimation method called PointFlowHop is proposed in this work.
PointFlowHop takes two consecutive point clouds and determines the 3D flow vectors for every point in the first point cloud.
It decomposes the scene flow estimation task into a set of subtasks, including ego-motion compensation, object association and object-wise motion estimation.
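A compressed NumPy sketch of that decomposition, under strong simplifying assumptions: ego-motion comes from a Kabsch fit on corresponding static points, object association is taken as given (index pairs), and each object gets its own rigid fit. PointFlowHop's actual green-learning features and association stage are replaced by these stubs.

```python
import numpy as np

def kabsch(P, Q):
    """Least-squares rigid motion (R, t) mapping points P onto Q."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def decomposed_flow(pc0, pc1, static_mask, obj_pairs):
    """Stage 1: ego-motion from static points. Stage 2: association,
    assumed given here. Stage 3: per-object rigid motion estimation."""
    R, t = kabsch(pc0[static_mask], pc1[static_mask])
    flow = pc0 @ R.T + t - pc0                 # ego-induced flow everywhere
    for idx0, idx1 in obj_pairs:               # dynamic objects override it
        Ro, to = kabsch(pc0[idx0], pc1[idx1])
        flow[idx0] = pc0[idx0] @ Ro.T + to - pc0[idx0]
    return flow

# Toy usage: a static scene plus one object translated by (1, 0, 0).
rng = np.random.default_rng(0)
pc0 = rng.normal(size=(60, 3))
pc1 = pc0.copy()
pc1[40:] += np.array([1.0, 0.0, 0.0])
flow = decomposed_flow(pc0, pc1, np.arange(60) < 40,
                       [(np.arange(40, 60), np.arange(40, 60))])
```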
arXiv Detail & Related papers (2023-02-27T23:06:01Z)
- RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds [44.034836961967144]
3D motion estimation, including scene flow and point cloud registration, has drawn increasing interest.
Recent methods employ deep neural networks to construct the cost volume for estimating accurate 3D flow.
We decompose the problem into two interlaced stages: the 3D flows are optimized point-wise in the first stage and then globally regularized by a recurrent network in the second stage.
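A rough, learning-free stand-in for those two interlaced stages: stage one picks each point's closest-point flow independently (accurate locally, noisy globally), and stage two repeatedly blends each flow vector with its neighbors', imitating the role, though not the mechanics, of the learned recurrent regularizer.

```python
import numpy as np

def two_stage_flow(pc0, pc1, k=8, smooth_iters=5):
    """Stage 1: independent point-wise closest-point flow.
    Stage 2: iterative neighborhood smoothing as a stand-in for the
    learned recurrent global regularization."""
    d2 = ((pc0[:, None, :] - pc1[None, :, :]) ** 2).sum(-1)
    flow = pc1[d2.argmin(axis=1)] - pc0                  # stage 1
    nn = ((pc0[:, None, :] - pc0[None, :, :]) ** 2).sum(-1)
    nn = nn.argsort(axis=1)[:, :k]                       # k nearest (incl. self)
    for _ in range(smooth_iters):                        # stage 2
        flow = 0.5 * flow + 0.5 * flow[nn].mean(axis=1)
    return flow
```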
arXiv Detail & Related papers (2022-05-23T04:04:30Z)
- Learning Scene Flow in 3D Point Clouds with Noisy Pseudo Labels [71.11151016581806]
We propose a novel scene flow method that captures 3D motions from point clouds without relying on ground-truth scene flow annotations.
Our method not only outperforms state-of-the-art self-supervised approaches, but also outperforms some supervised approaches that use accurate ground-truth flows.
arXiv Detail & Related papers (2022-03-23T18:20:03Z)
- AutoFlow: Learning a Better Training Set for Optical Flow [62.40293188964933]
AutoFlow is a method to render training data for optical flow.
AutoFlow achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT.
arXiv Detail & Related papers (2021-04-29T17:55:23Z)
- Scene Flow from Point Clouds with or without Learning [47.03163552693887]
Scene flow is the three-dimensional (3D) motion field of a scene.
Current learning-based approaches seek to estimate the scene flow directly from point clouds.
We present a simple and interpretable objective function to recover the scene flow from point clouds.
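One plausible instantiation of such an objective, sketched on the assumption that it pairs a nearest-neighbor data term with a neighborhood-smoothness penalty (the paper's exact regularizer may differ): plain gradient descent on the per-point flow, with no learned components anywhere.

```python
import numpy as np

def recover_flow(pc0, pc1, iters=200, lr=0.1, lam=1.0, k=8):
    """Minimize: NN distance of (pc0 + flow) to pc1, plus lam * local
    flow smoothness, by gradient descent on the flow itself."""
    nn = ((pc0[:, None, :] - pc0[None, :, :]) ** 2).sum(-1)
    nn = nn.argsort(axis=1)[:, 1:k + 1]             # neighbors, excluding self
    flow = np.zeros_like(pc0)
    for _ in range(iters):
        moved = pc0 + flow
        d2 = ((moved[:, None, :] - pc1[None, :, :]) ** 2).sum(-1)
        data_grad = moved - pc1[d2.argmin(axis=1)]  # pull onto next frame
        smooth_grad = flow - flow[nn].mean(axis=1)  # penalize local variation
        flow -= lr * (data_grad + lam * smooth_grad)
    return flow
```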
arXiv Detail & Related papers (2020-10-31T17:24:48Z)