Consistency Guided Scene Flow Estimation
- URL: http://arxiv.org/abs/2006.11242v2
- Date: Mon, 17 Aug 2020 09:58:47 GMT
- Title: Consistency Guided Scene Flow Estimation
- Authors: Yuhua Chen, Luc Van Gool, Cordelia Schmid, Cristian Sminchisescu
- Abstract summary: CGSF is a self-supervised framework for the joint reconstruction of 3D scene structure and motion from stereo video.
We show that the proposed model can reliably predict disparity and scene flow in challenging imagery.
It achieves better generalization than the state-of-the-art, and adapts quickly and robustly to unseen domains.
- Score: 159.24395181068218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consistency Guided Scene Flow Estimation (CGSF) is a self-supervised
framework for the joint reconstruction of 3D scene structure and motion from
stereo video. The model takes two temporal stereo pairs as input, and predicts
disparity and scene flow. The model self-adapts at test time by iteratively
refining its predictions. The refinement process is guided by a consistency
loss, which combines stereo and temporal photo-consistency with a geometric
term that couples disparity and 3D motion. To handle inherent modeling error in
the consistency loss (e.g. Lambertian assumptions) and for better
generalization, we further introduce a learned, output refinement network,
which takes the initial predictions, the loss, and the gradient as input, and
efficiently predicts a correlated output update. In multiple experiments,
including ablation studies, we show that the proposed model can reliably
predict disparity and scene flow in challenging imagery, achieves better
generalization than the state-of-the-art, and adapts quickly and robustly to
unseen domains.
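The test-time self-adaptation loop described in the abstract, iteratively refining predictions under a consistency loss, can be sketched in a simplified form. This is a toy illustration, not the authors' implementation: the quadratic loss below is a stand-in for the photometric and geometric terms, and all function names are hypothetical.

```python
import numpy as np

def consistency_loss_grad(pred, target):
    """Gradient of a toy quadratic consistency loss ||pred - target||^2.

    In CGSF the loss would combine stereo/temporal photo-consistency
    and a geometric term coupling disparity and 3D motion; a simple
    least-squares stand-in keeps this sketch self-contained.
    """
    return 2.0 * (pred - target)

def refine_at_test_time(initial_pred, target, steps=100, lr=0.1):
    """Iteratively refine an initial prediction by gradient descent
    on the consistency loss -- the basic self-adaptation loop."""
    pred = initial_pred.copy()
    for _ in range(steps):
        pred -= lr * consistency_loss_grad(pred, target)
    return pred
```

In the paper, a learned output-refinement network replaces the raw gradient step: it takes the initial predictions, the loss, and the gradient as input and predicts a correlated update, which lets it absorb modeling error in the consistency loss (e.g. violated Lambertian assumptions).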
Related papers
- Multi-Contextual Predictions with Vision Transformer for Video Anomaly Detection [22.098399083491937]
Understanding of the temporal context of a video plays a vital role in anomaly detection.
We design a transformer model with three different contextual prediction streams: masked, whole and partial.
By learning to predict the missing frames of consecutive normal frames, our model can effectively learn various normality patterns in the video.
arXiv Detail & Related papers (2022-06-17T05:54:31Z)
- Uncertainty-Aware Adaptation for Self-Supervised 3D Human Pose Estimation [70.32536356351706]
We introduce MRP-Net that constitutes a common deep network backbone with two output heads subscribing to two diverse configurations.
We derive suitable measures to quantify prediction uncertainty at both pose and joint level.
We present a comprehensive evaluation of the proposed approach and demonstrate state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2022-03-29T07:14:58Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
PDC-Net+ is an Enhanced Probabilistic Dense Correspondence Network capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z)
- Self-Supervised Multi-Frame Monocular Scene Flow [61.588808225321735]
We introduce a multi-frame monocular scene flow network based on self-supervised learning.
We observe state-of-the-art accuracy among monocular scene flow methods based on self-supervised learning.
arXiv Detail & Related papers (2021-05-05T17:49:55Z)
- FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation [87.74617110803189]
Estimating the 3D motion of points in a scene, known as scene flow, is a core problem in computer vision.
We present a recurrent architecture that learns a single step of an unrolled iterative alignment procedure for refining scene flow predictions.
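The model-unrolling idea, learning a single refinement step and applying it recurrently, can be sketched as follows. This is a hypothetical illustration, not the FlowStep3D architecture: the learned step is stood in by a simple damped correction, and the alignment residual is computed against an oracle target purely for demonstration.

```python
import numpy as np

def learned_step(flow, residual, alpha=0.5):
    """Stand-in for a learned update rule: in FlowStep3D this single
    step would be a trained network; here a damped residual correction
    illustrates applying the same step recurrently."""
    return flow + alpha * residual

def unrolled_refinement(flow_init, target_flow, iters=20):
    """Apply the single learned step repeatedly, as in an unrolled
    iterative alignment procedure (oracle residual, for illustration)."""
    flow = flow_init.copy()
    for _ in range(iters):
        residual = target_flow - flow  # alignment error to correct
        flow = learned_step(flow, residual)
    return flow
```

The design point is that only one step is learned; iterating it at inference refines the scene flow estimate progressively, much like the gradient-guided refinement in CGSF above but with the update rule itself trained.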
arXiv Detail & Related papers (2020-11-19T23:23:48Z)
- Self-Supervised Monocular Scene Flow Estimation [27.477810324117016]
We propose a novel monocular scene flow method that yields competitive accuracy and real-time performance.
By taking an inverse problem view, we design a single convolutional neural network (CNN) that successfully estimates depth and 3D motion simultaneously.
arXiv Detail & Related papers (2020-04-08T17:55:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.