Dimensions of Motion: Learning to Predict a Subspace of Optical Flow from a Single Image
- URL: http://arxiv.org/abs/2112.01502v1
- Date: Thu, 2 Dec 2021 18:52:54 GMT
- Title: Dimensions of Motion: Learning to Predict a Subspace of Optical Flow from a Single Image
- Authors: Richard Strong Bowen, Richard Tucker, Ramin Zabih, Noah Snavely
- Abstract summary: We introduce the problem of predicting, from a single video frame, a low-dimensional subspace of optical flow which includes the actual instantaneous optical flow.
We show how several natural scene assumptions allow us to identify an appropriate flow subspace via a set of basis flow fields parameterized by disparity.
This provides a new approach to learning these tasks in an unsupervised fashion using monocular input video without requiring camera intrinsics or poses.
- Score: 50.9686256513627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the problem of predicting, from a single video frame, a low-dimensional subspace of optical flow which includes the actual instantaneous optical flow. We show how several natural scene assumptions allow us to identify an appropriate flow subspace via a set of basis flow fields parameterized by disparity and a representation of object instances. The flow subspace, together with a novel loss function, can be used for the tasks of predicting monocular depth or predicting depth plus an object instance embedding. This provides a new approach to learning these tasks in an unsupervised fashion using monocular input video without requiring camera intrinsics or poses.
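The loss described above can be pictured as a least-squares projection: the network predicts a set of basis flow fields, and an observed flow is penalized by how much of it lies outside their span. A minimal sketch of that idea, with illustrative names and shapes rather than the authors' implementation:

```python
# Minimal sketch (not the authors' code): score an observed flow by the
# residual of its least-squares projection onto K predicted basis fields.
import numpy as np

def subspace_flow_loss(basis, flow):
    """basis: (K, H, W, 2) predicted basis flow fields;
    flow: (H, W, 2) observed optical flow (e.g., from a second frame).
    Returns the mean squared residual after projecting onto span(basis)."""
    K = basis.shape[0]
    A = basis.reshape(K, -1).T                      # (H*W*2, K)
    b = flow.reshape(-1)                            # (H*W*2,)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)  # best coefficients
    residual = b - A @ coeffs                       # part outside the span
    return float(np.mean(residual ** 2))

# Toy check: a flow lying exactly in a 2-D subspace has ~zero loss.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 8, 8, 2))
flow = 0.3 * basis[0] - 1.7 * basis[1]
print(subspace_flow_loss(basis, flow))  # ~0
```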
Related papers
- Skin the sheep not only once: Reusing Various Depth Datasets to Drive the Learning of Optical Flow [25.23550076996421]
We propose to leverage the geometric connection between optical flow estimation and stereo matching.
We turn monocular depth datasets into stereo ones via virtual disparity.
We also introduce virtual camera motion into stereo data to produce additional flows along the vertical direction.
arXiv Detail & Related papers (2023-10-03T06:56:07Z) - Multi-Object Discovery by Low-Dimensional Object Motion [0.0]
- Multi-Object Discovery by Low-Dimensional Object Motion [0.0]
We propose to model pixel-wise geometry and object motion to remove ambiguity in reconstructing flow from a single image.
We achieve state-of-the-art results in unsupervised multi-object segmentation on synthetic and real-world datasets by modeling the scene structure and object motion.
arXiv Detail & Related papers (2023-07-16T12:35:46Z) - Unsupervised Learning Optical Flow in Multi-frame Dynamic Environment
Using Temporal Dynamic Modeling [7.111443975103329]
In this paper, we explore optical flow estimation from multi-frame sequences of dynamic scenes.
We use motion priors of the adjacent frames to provide more reliable supervision of the occluded regions.
Experiments on KITTI 2012, KITTI 2015, Sintel Clean, and Sintel Final datasets demonstrate the effectiveness of our methods.
arXiv Detail & Related papers (2023-04-14T14:32:02Z) - Adaptive Multi-source Predictor for Zero-shot Video Object Segmentation [68.56443382421878]
- Adaptive Multi-source Predictor for Zero-shot Video Object Segmentation [68.56443382421878]
We propose a novel adaptive multi-source predictor for zero-shot video object segmentation (ZVOS).
In the static object predictor, the RGB source is simultaneously converted into depth and static saliency sources.
Experiments show that the proposed model outperforms the state-of-the-art methods on three challenging ZVOS benchmarks.
arXiv Detail & Related papers (2023-03-18T10:19:29Z) - CbwLoss: Constrained Bidirectional Weighted Loss for Self-supervised
Learning of Depth and Pose [13.581694284209885]
Photometric differences are used to train neural networks for estimating depth and camera pose from unlabeled monocular videos.
In this paper, we handle moving objects and occlusions by exploiting differences between the flow fields and depth structures generated by affine transformation and view synthesis.
We mitigate the effect of textureless regions on model optimization by measuring differences between features that carry more semantic and contextual information, without adding extra networks.
arXiv Detail & Related papers (2022-12-12T12:18:24Z) - Motion-inductive Self-supervised Object Discovery in Videos [99.35664705038728]
- Motion-inductive Self-supervised Object Discovery in Videos [99.35664705038728]
We propose a model that processes consecutive RGB frames and infers the optical flow between any pair of frames using a layered representation.
We demonstrate superior performance over previous state-of-the-art methods on three public video segmentation datasets.
arXiv Detail & Related papers (2022-10-01T08:38:28Z) - Optical Flow Estimation from a Single Motion-blurred Image [66.2061278123057]
Motion blur in an image can be of practical interest for fundamental computer vision problems.
We propose a novel framework to estimate optical flow from a single motion-blurred image in an end-to-end manner.
arXiv Detail & Related papers (2021-03-04T12:45:18Z) - Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection
Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z) - Joint Unsupervised Learning of Optical Flow and Egomotion with Bi-Level
Optimization [59.9673626329892]
We exploit the global relationship between optical flow and camera motion using epipolar geometry.
We use implicit differentiation to enable back-propagation through the lower-level geometric optimization layer independent of its implementation.
arXiv Detail & Related papers (2020-02-26T22:28:00Z)