Unsupervised Learning of 3D Scene Flow from Monocular Camera
- URL: http://arxiv.org/abs/2206.03673v1
- Date: Wed, 8 Jun 2022 04:57:27 GMT
- Title: Unsupervised Learning of 3D Scene Flow from Monocular Camera
- Authors: Guangming Wang, Xiaoyu Tian, Ruiqi Ding, and Hesheng Wang
- Abstract summary: It is difficult to obtain ground-truth scene flow in real scenes, so recent studies rely on synthetic data for training.
This paper proposes a novel unsupervised learning method for scene flow that utilizes the images of two consecutive frames taken by a monocular camera.
Our method achieves the goal of training a scene flow network with real-world data, bridging the gap between training data and test data.
- Score: 21.34395959441377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scene flow represents the motion of points in 3D space and is the
counterpart of optical flow, which represents the motion of pixels in the 2D
image. However, it is difficult to obtain ground-truth scene flow in real
scenes, so recent studies rely on synthetic data for training. How to train a
scene flow network with unsupervised methods on real-world data is therefore of
crucial significance. This paper proposes a novel unsupervised learning method
for scene flow, which utilizes the images of two consecutive frames taken by a
monocular camera, without ground-truth scene flow, for training. Our method
achieves the goal of training a scene flow network with real-world data, which
bridges the gap between training data and test data and broadens the scope of
data available for training. Unsupervised learning of scene flow in this paper
consists of two main parts: (i) depth estimation and camera pose estimation,
and (ii) scene flow estimation based on four loss functions. Depth estimation
and camera pose estimation produce the depth maps and the camera pose between
two consecutive frames, which provide further information for the subsequent
scene flow estimation. We then use a depth consistency loss, a dynamic-static
consistency loss, a Chamfer loss, and a Laplacian regularization loss to train
the scene flow network in an unsupervised manner. To our knowledge, this is the
first paper to realize unsupervised learning of 3D scene flow from a monocular
camera. Experimental results on KITTI show that our unsupervised scene flow
method achieves strong performance compared to the traditional methods
Iterative Closest Point (ICP) and Fast Global Registration (FGR). The source
code is available at: https://github.com/IRMVLab/3DUnMonoFlow.
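The geometric quantities behind the pipeline above can be sketched in a few lines of NumPy. This is an illustrative assumption, not the paper's implementation (the actual network architecture, loss weights, and the depth-consistency and dynamic-static-consistency terms are not given in this summary): a depth map plus camera intrinsics back-project pixels to a 3D point cloud, and the Chamfer and Laplacian terms then supervise a predicted flow field.

```python
import numpy as np

def backproject(depth, K):
    """Lift a depth map (H, W) to an (H*W, 3) point cloud using intrinsics K (3x3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)  # homogeneous pixels
    rays = pix @ np.linalg.inv(K).T                                  # camera-frame rays
    return rays * depth.reshape(-1, 1)                               # scale rays by depth

def chamfer_loss(pred, target):
    """Symmetric Chamfer distance between point clouds of shape (N, 3) and (M, 3)."""
    d = np.linalg.norm(pred[:, None, :] - target[None, :, :], axis=-1)  # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def laplacian_regularization(points, flow, k=4):
    """Penalize flow differences among k-nearest neighbors (local smoothness)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude each point from its own neighbors
    nn = np.argsort(d, axis=1)[:, :k]         # indices of the k nearest neighbors
    return np.mean(np.linalg.norm(flow[:, None, :] - flow[nn], axis=-1))
```

In the full method these terms would be combined with the other two losses and backpropagated through the scene flow network; the sketch only shows the geometry involved, with brute-force nearest-neighbor search that a real implementation would replace with a KD-tree or GPU kernel.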
Related papers
- FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses
via Pixel-Aligned Scene Flow [26.528667940013598]
Reconstruction of 3D neural fields from posed images has emerged as a promising method for self-supervised representation learning.
A key challenge preventing the deployment of these 3D scene learners on large-scale video data is their dependence on precise camera poses from structure-from-motion.
We propose a method that jointly reconstructs camera poses and 3D neural scene representations online and in a single forward pass.
arXiv Detail & Related papers (2023-05-31T20:58:46Z)
- Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a "long-burst" (forty-two 12-megapixel RAW frames captured in a two-second sequence), there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z)
- 3D Scene Flow Estimation on Pseudo-LiDAR: Bridging the Gap on Estimating Point Motion [19.419030878019974]
3D scene flow characterizes how the points at the current time flow to the next time in the 3D Euclidean space.
The stability of the predicted scene flow is improved by introducing the dense nature of 2D pixels into the 3D space.
Disparity consistency loss is proposed to achieve more effective unsupervised learning of 3D scene flow.
arXiv Detail & Related papers (2022-09-27T03:27:09Z)
- CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving [57.16921612272783]
We propose CO3, namely Cooperative Contrastive Learning and Contextual Shape Prediction, to learn 3D representation for outdoor-scene point clouds in an unsupervised manner.
We believe CO3 will facilitate understanding LiDAR point clouds in outdoor scenes.
arXiv Detail & Related papers (2022-06-08T17:37:58Z)
- Learning Scene Flow in 3D Point Clouds with Noisy Pseudo Labels [71.11151016581806]
We propose a novel scene flow method that captures 3D motions from point clouds without relying on ground-truth scene flow annotations.
Our method not only outperforms state-of-the-art self-supervised approaches, but also outperforms some supervised approaches that use accurate ground-truth flows.
arXiv Detail & Related papers (2022-03-23T18:20:03Z)
- Self-Supervised Multi-Frame Monocular Scene Flow [61.588808225321735]
We introduce a multi-frame monocular scene flow network based on self-supervised learning.
We observe state-of-the-art accuracy among monocular scene flow methods based on self-supervised learning.
arXiv Detail & Related papers (2021-05-05T17:49:55Z)
- Weakly Supervised Learning of Rigid 3D Scene Flow [81.37165332656612]
We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies.
We showcase the effectiveness and generalization capacity of our method on four different autonomous driving datasets.
arXiv Detail & Related papers (2021-02-17T18:58:02Z)
- Do not trust the neighbors! Adversarial Metric Learning for Self-Supervised Scene Flow Estimation [0.0]
Scene flow is the task of estimating 3D motion vectors to individual points of a dynamic 3D scene.
We propose a 3D scene flow benchmark and a novel self-supervised setup for training flow models.
We find that our setup is able to keep motion coherence and preserve local geometries, which many self-supervised baselines fail to grasp.
arXiv Detail & Related papers (2020-11-01T17:41:32Z)
- Adversarial Self-Supervised Scene Flow Estimation [15.278302535191866]
This work proposes a metric learning approach for self-supervised scene flow estimation.
We outline a benchmark for self-supervised scene flow estimation: the Scene Flow Sandbox.
arXiv Detail & Related papers (2020-11-01T16:37:37Z)
- Self-Supervised Human Depth Estimation from Monocular Videos [99.39414134919117]
Previous methods for estimating detailed human depth often require supervised training with ground-truth depth data.
This paper presents a self-supervised method that can be trained on YouTube videos without known depth.
Experiments demonstrate that our method enjoys better generalization and performs much better on data in the wild.
arXiv Detail & Related papers (2020-05-07T09:45:11Z)
- Self-Supervised Monocular Scene Flow Estimation [27.477810324117016]
We propose a novel monocular scene flow method that yields competitive accuracy and real-time performance.
By taking an inverse problem view, we design a single convolutional neural network (CNN) that successfully estimates depth and 3D motion simultaneously.
arXiv Detail & Related papers (2020-04-08T17:55:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.