HMAFlow: Learning More Accurate Optical Flow via Hierarchical Motion Field Alignment
- URL: http://arxiv.org/abs/2409.05531v2
- Date: Sun, 15 Sep 2024 06:37:55 GMT
- Title: HMAFlow: Learning More Accurate Optical Flow via Hierarchical Motion Field Alignment
- Authors: Dianbo Ma, Kousuke Imamura, Ziyan Gao, Xiangjie Wang, Satoshi Yamane
- Abstract summary: We present a novel method, dubbed HMAFlow, to improve optical flow estimation in challenging scenes.
The proposed model mainly consists of two core components: a Hierarchical Motion Field Alignment (HMA) module and a Correlation Self-Attention (CSA) module.
Experimental results demonstrate that our model achieves the best generalization performance compared to other state-of-the-art methods.
- Score: 0.5825410941577593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical flow estimation is a fundamental and long-standing visual task. In this work, we present a novel method, dubbed HMAFlow, to improve optical flow estimation in challenging scenes, particularly those involving small objects. The proposed model mainly consists of two core components: a Hierarchical Motion Field Alignment (HMA) module and a Correlation Self-Attention (CSA) module. In addition, we rebuild 4D cost volumes by employing a Multi-Scale Correlation Search (MCS) layer and replacing average pooling in common cost volumes with a search strategy utilizing multiple search ranges. Experimental results demonstrate that our model achieves the best generalization performance compared to other state-of-the-art methods. Specifically, compared with RAFT, our method achieves relative error reductions of 14.2% and 3.4% on the clean pass and final pass of the Sintel online benchmark, respectively. On the KITTI test benchmark, HMAFlow surpasses RAFT and GMA in the Fl-all metric by relative margins of 6.8% and 7.7%, respectively. To facilitate future research, our code will be made available at https://github.com/BooTurbo/HMAFlow.
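The relative margins quoted in the abstract (for example, 14.2% against RAFT on the Sintel clean pass) are presumably computed as the error difference divided by the baseline error. That reading is an assumption, since the listing does not state the formula. A minimal sketch of the arithmetic follows; the function name `relative_error_reduction` and the input values are hypothetical and are not taken from the paper or from any benchmark table.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return how much new_error improves on baseline_error, in percent.

    Assumed definition: 100 * (baseline - new) / baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical error values chosen only to illustrate the arithmetic.
example = relative_error_reduction(baseline_error=2.0, new_error=1.5)
print(f"{example:.1f}% relative reduction")  # prints "25.0% relative reduction"
```

Under this definition a positive result means the new method has lower error than the baseline, and a result of zero means no change.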
Related papers
- ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video [26.01796507893086]
This paper proposes a 3D motion perception method called ScaleFlow++ that is easy to generalize.
With just a pair of RGB images, ScaleFlow++ can robustly estimate optical flow and motion-in-depth (MID).
On KITTI, ScaleFlow++ achieved the best monocular scene flow estimation performance, reducing SF-all from 6.21 to 5.79.
arXiv Detail & Related papers (2024-09-16T11:59:27Z)
- ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video [15.629496237910999]
This paper proposes a 3D motion perception method called ScaleFlow++ that is easy to generalize.
With just a pair of RGB images, ScaleFlow++ can robustly estimate optical flow and motion-in-depth (MID).
On KITTI, ScaleFlow++ achieved the best monocular scene flow estimation performance, reducing SF-all from 6.21 to 5.79.
arXiv Detail & Related papers (2024-07-13T07:58:48Z)
- Re-Evaluating LiDAR Scene Flow for Autonomous Driving [80.37947791534985]
Popular benchmarks for self-supervised LiDAR scene flow have unrealistic rates of dynamic motion, unrealistic correspondences, and unrealistic sampling patterns.
We evaluate a suite of top methods on a suite of real-world datasets.
We show that despite the emphasis placed on learning, most performance gains are caused by pre- and post-processing steps.
arXiv Detail & Related papers (2023-04-04T22:45:50Z)
- Rethinking Optical Flow from Geometric Matching Consistent Perspective [38.014569953980754]
We propose a rethinking of previous optical flow estimation.
We use geometric image matching (GIM) as a pre-training task for optical flow estimation (MatchFlow), which yields better feature representations.
Our method achieves 11.5% and 10.1% error reduction from GMA on Sintel clean pass and KITTI test set.
arXiv Detail & Related papers (2023-03-15T06:00:38Z)
- Comparative Study of Coupling and Autoregressive Flows through Robust Statistical Tests [0.0]
We propose an in-depth comparison of coupling and autoregressive flows, both of the affine and rational quadratic type.
We focus on a set of multimodal target distributions of increasing dimensionality, ranging from 4 to 400.
Our results indicate that the A-RQS algorithm stands out both in terms of accuracy and training speed.
arXiv Detail & Related papers (2023-02-23T13:34:01Z)
- Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation [3.1869033681682124]
This paper presents a novel scene flow estimation architecture using bidirectional flow embedding layers.
The proposed bidirectional layer learns features along both forward and backward directions, enhancing the estimation performance.
In addition, hierarchical feature extraction and warping improve the performance and reduce computational overhead.
arXiv Detail & Related papers (2022-07-15T15:14:53Z)
- FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
- Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework [76.70603443624012]
We propose a novel one-stream tracking (OSTrack) framework that unifies feature learning and relation modeling.
In this way, discriminative target-oriented features can be dynamically extracted by mutual guidance.
OSTrack achieves state-of-the-art performance on multiple benchmarks; in particular, it shows impressive results on the one-shot tracking benchmark GOT-10k.
arXiv Detail & Related papers (2022-03-22T18:37:11Z)
- GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
- Learning to Generate Content-Aware Dynamic Detectors [62.74209921174237]
We introduce a new perspective on designing efficient detectors: automatically generating sample-adaptive model architectures.
We introduce a coarse-to-fine strategy tailored for object detection to guide the learning of dynamic routing.
Experiments on the MS-COCO dataset demonstrate that CADDet achieves 1.8 higher mAP with 10% fewer FLOPs compared with vanilla routing.
arXiv Detail & Related papers (2020-12-08T08:05:20Z)
- FPCR-Net: Feature Pyramidal Correlation and Residual Reconstruction for Optical Flow Estimation [72.41370576242116]
We propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.
It consists of two main modules: pyramid correlation mapping and residual reconstruction.
Experimental results show that the proposed scheme achieves state-of-the-art performance, with improvements of 0.80, 1.15 and 0.10 in average end-point error (AEE) over competing baseline methods.
arXiv Detail & Related papers (2020-01-17T07:13:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.