Unsupervised Learning Optical Flow in Multi-frame Dynamic Environment
Using Temporal Dynamic Modeling
- URL: http://arxiv.org/abs/2304.07159v1
- Date: Fri, 14 Apr 2023 14:32:02 GMT
- Title: Unsupervised Learning Optical Flow in Multi-frame Dynamic Environment
Using Temporal Dynamic Modeling
- Authors: Zitang Sun, Shin'ya Nishida, and Zhengbo Luo
- Abstract summary: In this paper, we explore the optical flow estimation from multiple-frame sequences of dynamic scenes.
We use motion priors of the adjacent frames to provide more reliable supervision of the occluded regions.
Experiments on KITTI 2012, KITTI 2015, Sintel Clean, and Sintel Final datasets demonstrate the effectiveness of our methods.
- Score: 7.111443975103329
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For visual estimation of optical flow, a crucial function for many
vision tasks, unsupervised learning using the supervision of view synthesis has
emerged as a promising alternative to supervised methods, since ground-truth
flow is not readily available in many cases. However, unsupervised learning is
likely to be unstable when pixel tracking is lost due to occlusion and motion
blur, or the pixel matching is impaired due to variation in image content and
spatial structure over time. In natural environments, dynamic occlusion or
object variation is a relatively slow temporal process spanning several frames.
We therefore explore optical flow estimation from multi-frame sequences of
dynamic scenes, whereas most existing unsupervised approaches are based on
temporally static models. We handle unsupervised
optical flow estimation with a temporal dynamic model by introducing a
spatial-temporal dual recurrent block based on the predictive coding structure,
which feeds the previous frame's high-level motion prior into the current
optical flow estimator. Assuming temporal smoothness of optical flow, we use motion priors
of the adjacent frames to provide more reliable supervision of the occluded
regions. To grasp the essence of challenging scenes, we simulate various
scenarios across long sequences, including dynamic occlusion, content
variation, and spatial variation, and adopt self-supervised distillation to
make the model understand the object's motion patterns in a prolonged dynamic
environment. Experiments on KITTI 2012, KITTI 2015, Sintel Clean, and Sintel
Final datasets demonstrate the effectiveness of our methods on unsupervised
optical flow estimation. The proposed method achieves state-of-the-art
performance with lower memory overhead.
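As a rough, hypothetical sketch of the occlusion-handling idea described above (not the authors' implementation; all names and shapes are illustrative), occluded pixels can fall back from the photometric loss to a temporal-smoothness prior taken from the adjacent frame's flow:

```python
import numpy as np

def occlusion_aware_loss(flow_t, flow_prev, occ_mask, photo_err):
    """Blend supervision: photometric error where pixels are visible,
    adjacent-frame motion prior where they are occluded.

    flow_t, flow_prev : (H, W, 2) flow fields at frames t and t-1
    occ_mask          : (H, W) boolean, True where a pixel is occluded at t
    photo_err         : (H, W) per-pixel photometric error at frame t
    """
    # Under the temporal-smoothness assumption, the previous frame's flow
    # acts as a pseudo-label for pixels whose photometric match is lost.
    prior_err = np.linalg.norm(flow_t - flow_prev, axis=-1)  # (H, W)
    per_pixel = np.where(occ_mask, prior_err, photo_err)
    return per_pixel.mean()
```

In a real pipeline the occlusion mask would typically come from a forward-backward consistency check, and the prior flow would be warped into the current frame before comparison.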
Related papers
- Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation [34.529280562470746]
We introduce a novel self-supervised loss combining the Contrast Maximization framework with a non-linear motion prior in the form of pixel-level trajectories.
Its effectiveness is demonstrated in two scenarios: in dense continuous-time motion estimation, our method improves the zero-shot performance of a synthetically trained model by 29%.
arXiv Detail & Related papers (2024-07-15T15:18:28Z)
- LEAP-VO: Long-term Effective Any Point Tracking for Visual Odometry [52.131996528655094]
We present the Long-term Effective Any Point Tracking (LEAP) module.
LEAP innovatively combines visual, inter-track, and temporal cues with mindfully selected anchors for dynamic track estimation.
Based on these traits, we develop LEAP-VO, a robust visual odometry system adept at handling occlusions and dynamic scenes.
arXiv Detail & Related papers (2024-01-03T18:57:27Z)
- Motion-aware Memory Network for Fast Video Salient Object Detection [15.967509480432266]
We design a space-time memory (STM)-based network, which extracts useful temporal information of the current frame from adjacent frames as the temporal branch of VSOD.
In the encoding stage, we generate high-level temporal features by using high-level features from the current and its adjacent frames.
In the decoding stage, we propose an effective fusion strategy for spatial and temporal branches.
The proposed model does not require optical flow or other preprocessing, and can reach a speed of nearly 100 FPS during inference.
arXiv Detail & Related papers (2022-08-01T15:56:19Z)
- Dimensions of Motion: Learning to Predict a Subspace of Optical Flow from a Single Image [50.9686256513627]
We introduce the problem of predicting, from a single video frame, a low-dimensional subspace of optical flow which includes the actual instantaneous optical flow.
We show how several natural scene assumptions allow us to identify an appropriate flow subspace via a set of basis flow fields parameterized by disparity.
This provides a new approach to learning these tasks in an unsupervised fashion using monocular input video without requiring camera intrinsics or poses.
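The subspace idea summarized above can be sketched in code (illustratively; this is not the paper's implementation, and the basis construction below is a simplified assumption): for a translating camera with unit focal length, each translation axis induces one basis flow field scaled per-pixel by disparity, and the flow is a linear combination of those basis fields.

```python
import numpy as np

def translational_flow_basis(disparity):
    """Hypothetical basis flow fields induced by camera translation.

    Flow from camera translation is proportional to per-pixel disparity
    (inverse depth), so each translation axis yields one basis field.
    disparity : (H, W) array
    returns   : (3, H, W, 2) basis fields for x-, y-, z-translation
    """
    H, W = disparity.shape
    # Pixel coordinates centered at the principal point (unit focal length).
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    xs -= (W - 1) / 2.0
    ys -= (H - 1) / 2.0
    bx = np.stack([np.ones((H, W)), np.zeros((H, W))], axis=-1)  # x-translation
    by = np.stack([np.zeros((H, W)), np.ones((H, W))], axis=-1)  # y-translation
    bz = np.stack([xs, ys], axis=-1)                             # z-translation (radial)
    return disparity[None, ..., None] * np.stack([bx, by, bz])

def flow_from_subspace(coeffs, basis):
    """Flow as a linear combination of basis fields: sum_k c_k * B_k."""
    return np.tensordot(coeffs, basis, axes=1)  # (H, W, 2)
```

The actual flow then lies in the low-dimensional span of these fields, so predicting the basis from a single frame constrains the flow without knowing the camera motion coefficients.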
arXiv Detail & Related papers (2021-12-02T18:52:54Z)
- MoCo-Flow: Neural Motion Consensus Flow for Dynamic Humans in Stationary Monocular Cameras [98.40768911788854]
We introduce MoCo-Flow, a representation that models the dynamic scene using a 4D continuous time-variant function.
At the heart of our work lies a novel optimization formulation, which is constrained by a motion consensus regularization on the motion flow.
We extensively evaluate MoCo-Flow on several datasets that contain human motions of varying complexity.
arXiv Detail & Related papers (2021-06-08T16:03:50Z)
- Unsupervised Motion Representation Enhanced Network for Action Recognition [4.42249337449125]
Motion representation between consecutive frames has proven to greatly benefit video understanding.
The TV-L1 method, an effective optical flow solver, is time-consuming and requires substantial storage for caching the extracted optical flow.
We propose UF-TSN, a novel end-to-end action recognition approach enhanced with an embedded lightweight unsupervised optical flow estimator.
arXiv Detail & Related papers (2021-03-05T04:14:32Z)
- Optical Flow Estimation from a Single Motion-blurred Image [66.2061278123057]
Motion blur in an image can be of practical interest for fundamental computer vision problems.
We propose a novel framework to estimate optical flow from a single motion-blurred image in an end-to-end manner.
arXiv Detail & Related papers (2021-03-04T12:45:18Z)
- A Deep Temporal Fusion Framework for Scene Flow Using a Learnable Motion Model and Occlusions [17.66624674542256]
We propose a novel data-driven approach for temporal fusion of scene flow estimates in a multi-frame setup.
In a second step, a neural network combines bi-directional scene flow estimates from a common reference frame, yielding a refined estimate.
This way, our approach provides a fast multi-frame extension for a variety of scene flow estimators, which outperforms the underlying dual-frame approaches.
arXiv Detail & Related papers (2020-11-03T10:14:11Z)
- What Matters in Unsupervised Optical Flow [51.45112526506455]
We compare and analyze a set of key components in unsupervised optical flow.
We construct a number of novel improvements to unsupervised flow models.
We present a new unsupervised flow technique that significantly outperforms the previous state-of-the-art.
arXiv Detail & Related papers (2020-06-08T19:36:26Z)
- Joint Unsupervised Learning of Optical Flow and Egomotion with Bi-Level Optimization [59.9673626329892]
We exploit the global relationship between optical flow and camera motion using epipolar geometry.
We use implicit differentiation to enable back-propagation through the lower-level geometric optimization layer independent of its implementation.
arXiv Detail & Related papers (2020-02-26T22:28:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.