StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video
Sequences
- URL: http://arxiv.org/abs/2311.17099v1
- Date: Tue, 28 Nov 2023 07:53:51 GMT
- Title: StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video
Sequences
- Authors: Shangkun Sun, Jiaming Liu, Thomas H. Li, Huaxia Li, Guoqing Liu, Wei
Gao
- Abstract summary: Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation.
We present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video input, attaining a similar level of time efficiency to two-frame networks.
StreamFlow excels on the challenging KITTI and Sintel datasets, with particular improvement in occluded areas.
- Score: 31.210626775505407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Occlusions between consecutive frames have long posed a significant challenge
in optical flow estimation. The inherent ambiguity introduced by occlusions
directly violates the brightness constancy constraint and considerably hinders
pixel-to-pixel matching. To address this issue, multi-frame optical flow
methods leverage adjacent frames to mitigate the local ambiguity. Nevertheless,
prior multi-frame methods predominantly adopt recursive flow estimation,
resulting in a considerable computational overlap. In contrast, we propose a
streamlined in-batch framework that eliminates the need for extensive redundant
recursive computations while concurrently developing effective spatio-temporal
modeling approaches under in-batch estimation constraints. Specifically, we
present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video
input, attaining a similar level of time efficiency to two-frame networks.
Furthermore, we introduce an efficient Integrative Spatio-temporal Coherence
(ISC) modeling method for effective spatio-temporal modeling during the
encoding phase, which introduces no additional parameter overhead.
Additionally, we devise a Global Temporal Regressor (GTR) that effectively
explores temporal relations during decoding. Benefiting from the efficient SIM
pipeline and effective modules, StreamFlow not only excels in terms of
performance on the challenging KITTI and Sintel datasets, with particular
improvement in occluded areas but also attains a remarkable $63.82\%$
enhancement in speed compared with previous multi-frame methods. The code will
be available soon at https://github.com/littlespray/StreamFlow.
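The core efficiency argument of the abstract, that recursive two-frame estimation re-encodes interior frames while an in-batch pipeline encodes each frame once, can be illustrated with a toy cost count. This is not the authors' code; the function names and the simplified cost model (one encoder call per frame appearance) are illustrative assumptions.

```python
def encoder_calls_recursive(num_frames: int) -> int:
    """A two-frame network run on each consecutive pair (I_t, I_{t+1})
    encodes both frames of every pair, so each interior frame is
    encoded twice: 2 * (T - 1) calls in total."""
    return 2 * (num_frames - 1)


def encoder_calls_in_batch(num_frames: int) -> int:
    """An in-batch multi-frame pipeline encodes every frame exactly
    once and predicts all T - 1 flow fields from shared features."""
    return num_frames


if __name__ == "__main__":
    for T in (2, 4, 8):
        print(f"T={T}: recursive={encoder_calls_recursive(T)}, "
              f"in-batch={encoder_calls_in_batch(T)}")
```

Under this toy model the two pipelines cost the same at T=2, and the recursive overhead grows toward 2x as the clip length increases, which is consistent with the abstract's claim of near two-frame efficiency for video input.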
Related papers
- Motion-Aware Video Frame Interpolation [49.49668436390514]
We introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames.
It not only extracts global semantic relationships and spatial details from input frames with different receptive fields, but also effectively reduces the required computational cost and complexity.
arXiv Detail & Related papers (2024-02-05T11:00:14Z) - Dynamic Frame Interpolation in Wavelet Domain [57.25341639095404]
Video frame interpolation is an important low-level computer vision task that can increase the frame rate for a more fluent visual experience.
Existing methods have achieved great success by employing advanced motion models and synthesis networks.
WaveletVFI reduces computation by up to 40% while maintaining similar accuracy, making it more efficient than other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-07T06:41:15Z) - AccFlow: Backward Accumulation for Long-Range Optical Flow [70.4251045372285]
This paper proposes a novel recurrent framework called AccFlow for long-range optical flow estimation.
We demonstrate the superiority of backward accumulation over conventional forward accumulation.
Experiments validate the effectiveness of AccFlow in handling long-range optical flow estimation.
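Long-range flow accumulation, the setting AccFlow addresses, chains short-range flows by sampling the second flow field at the location displaced by the first. A minimal 1-D toy sketch of this composition (illustrative only; AccFlow's specific contribution is backward rather than forward accumulation, which this sketch does not model):

```python
import numpy as np


def chain_flows(f_ab: np.ndarray, f_bc: np.ndarray) -> np.ndarray:
    """Compose per-pixel flows on a 1-D grid:
    f_ac(x) = f_ab(x) + f_bc(x + f_ab(x)),
    sampling f_bc at the displaced (rounded, clipped) position."""
    n = len(f_ab)
    displaced = np.arange(n) + f_ab
    idx = np.clip(np.round(displaced).astype(int), 0, n - 1)
    return f_ab + f_bc[idx]


# A scene moving +1 pixel per frame: chaining two unit flows
# yields the accumulated two-frame displacement of +2.
f12 = np.ones(5)
f23 = np.ones(5)
f13 = chain_flows(f12, f23)
```

Errors in each short-range flow compound through the sampling step, which is why the direction and scheme of accumulation matter in practice.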
arXiv Detail & Related papers (2023-08-25T01:51:26Z) - Progressive Motion Context Refine Network for Efficient Video Frame
Interpolation [10.369068266836154]
Flow-based frame interpolation methods have achieved great success by first modeling optical flow between target and input frames, and then building a synthesis network for target frame generation.
We propose a novel Progressive Motion Context Refine Network (PMCRNet) to predict motion fields and image context jointly for higher efficiency.
Experiments on multiple benchmarks show that the proposed approach not only achieves favorable quantitative results but also significantly reduces model size and running time.
arXiv Detail & Related papers (2022-11-11T06:29:03Z) - IFRNet: Intermediate Feature Refine Network for Efficient Frame
Interpolation [44.04110765492441]
We devise an efficient encoder-decoder based network, termed IFRNet, for fast intermediate frame synthesizing.
Experiments on various benchmarks demonstrate the excellent performance and fast inference speed of the proposed approach.
arXiv Detail & Related papers (2022-05-29T10:18:18Z) - Optical-Flow-Reuse-Based Bidirectional Recurrent Network for Space-Time
Video Super-Resolution [52.899234731501075]
Space-time video super-resolution (ST-VSR) simultaneously increases the spatial resolution and frame rate for a given video.
Existing methods typically suffer from difficulties in how to efficiently leverage information from a large range of neighboring frames.
We propose a coarse-to-fine bidirectional recurrent neural network instead of using ConvLSTM to leverage knowledge between adjacent frames.
arXiv Detail & Related papers (2021-10-13T15:21:30Z) - FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation [97.99012124785177]
FLAVR is a flexible and efficient architecture that uses 3D space-time convolutions to enable end-to-end learning and inference for video frame interpolation.
We demonstrate that FLAVR can serve as a useful self-supervised pretext task for action recognition, optical flow estimation, and motion magnification.
arXiv Detail & Related papers (2020-12-15T18:59:30Z) - All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced
Motion Modeling [52.425236515695914]
State-of-the-art methods are iterative solutions that interpolate one frame at a time.
This work introduces a true multi-frame interpolator.
It utilizes a pyramidal style network in the temporal domain to complete the multi-frame task in one-shot.
arXiv Detail & Related papers (2020-07-23T02:34:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.