DCVNet: Dilated Cost Volume Networks for Fast Optical Flow
- URL: http://arxiv.org/abs/2103.17271v2
- Date: Mon, 18 Mar 2024 17:59:33 GMT
- Title: DCVNet: Dilated Cost Volume Networks for Fast Optical Flow
- Authors: Huaizu Jiang, Erik Learned-Miller
- Abstract summary: The cost volume, capturing the similarity of possible correspondences across two input images, is a key ingredient in state-of-the-art optical flow approaches.
We propose an alternative by constructing cost volumes with different dilation factors to capture small and large displacements simultaneously.
A U-Net with skip connections is employed to convert the dilated cost volumes into interpolation weights among all captured displacements to obtain the optical flow.
- Score: 5.526631378837701
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The cost volume, capturing the similarity of possible correspondences across two input images, is a key ingredient in state-of-the-art optical flow approaches. When sampling correspondences to build the cost volume, a large neighborhood radius is required to deal with large displacements, introducing a significant computational burden. To address this, coarse-to-fine or recurrent processing of the cost volume is usually adopted, where correspondence sampling in a local neighborhood with a small radius suffices. In this paper, we propose an alternative by constructing cost volumes with different dilation factors to capture small and large displacements simultaneously. A U-Net with skip connections is employed to convert the dilated cost volumes into interpolation weights between all possible captured displacements to get the optical flow. Our proposed model DCVNet only needs to process the cost volume once in a simple feedforward manner and does not rely on the sequential processing strategy. DCVNet obtains comparable accuracy to existing approaches and achieves real-time inference (30 fps on a mid-end 1080ti GPU). The code and model weights are available at https://github.com/neu-vi/ezflow.
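To make the mechanism concrete, here is a minimal sketch of the dilated-cost-volume idea, assuming PyTorch; the function names, tensor shapes, and dilation set are illustrative assumptions rather than the authors' implementation, and the U-Net that predicts the interpolation weights is omitted:

```python
import torch
import torch.nn.functional as F

def dilated_cost_volume(feat1, feat2, radius=2, dilation=1):
    """Correlation over a (2*radius+1)^2 neighborhood whose offsets are
    spaced `dilation` pixels apart, so larger dilations cover larger
    displacements at the same compute cost. feat1, feat2: (B, C, H, W)."""
    B, C, H, W = feat1.shape
    pad = radius * dilation
    f2 = F.pad(feat2, (pad, pad, pad, pad))
    costs = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = pad + dy * dilation, pad + dx * dilation
            shifted = f2[:, :, y0:y0 + H, x0:x0 + W]
            costs.append((feat1 * shifted).sum(1, keepdim=True) / C ** 0.5)
    return torch.cat(costs, 1)  # (B, (2*radius+1)**2, H, W)

def flow_from_weights(logits, radius=2, dilations=(1, 2, 4, 8)):
    """Convert per-pixel logits over all captured displacements (one block
    of (2*radius+1)^2 offsets per dilation) into flow via a weighted sum."""
    offsets = [(dx * d, dy * d)
               for d in dilations
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    offsets = torch.tensor(offsets, dtype=torch.float32,
                           device=logits.device)      # (N, 2) as (u, v)
    weights = logits.softmax(dim=1)                   # interpolation weights
    return torch.einsum('bnhw,nc->bchw', weights, offsets)  # (B, 2, H, W)
```

In this sketch, the cost volumes for each dilation would be concatenated and fed to the U-Net, whose per-pixel output has len(dilations) * (2*radius+1)**2 channels; the flow is then a weighted average of the corresponding displacement vectors.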
Related papers
- DCVSMNet: Double Cost Volume Stereo Matching Network [0.0]
DCVSMNet is a fast stereo matching network with a 67 ms inference time and strong generalization ability.
Results on several benchmark datasets show that DCVSMNet achieves better accuracy than fast methods such as CGI-Stereo and BGNet, at the cost of greater inference time.
arXiv Detail & Related papers (2024-02-26T10:42:25Z)
- Instant Complexity Reduction in CNNs using Locality-Sensitive Hashing [50.79602839359522]
We propose HASTE (Hashing for Tractable Efficiency), a parameter-free and data-free module that acts as a plug-and-play replacement for any regular convolution module.
We are able to drastically compress latent feature maps without sacrificing much accuracy by using locality-sensitive hashing (LSH).
In particular, we are able to instantly drop 46.72% of FLOPs while only losing 1.25% accuracy by just swapping the convolution modules in a ResNet34 on CIFAR-10 for our HASTE module.
arXiv Detail & Related papers (2023-09-29T13:09:40Z)
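As a loose illustration of the LSH step summarized above (HASTE's actual module is parameter-free and hashes local patches; everything here, including names and the per-channel signature, is an assumption), channels whose random-hyperplane signatures collide can be treated as redundant and merged:

```python
import torch

def lsh_channel_buckets(x, num_hyperplanes=8, seed=0):
    """Sign-of-random-projection LSH over the channels of a feature map
    x: (B, C, H, W). Returns one integer bucket id per channel; channels
    sharing a bucket are approximately similar, so averaging them shrinks
    the input to the next convolution and saves FLOPs."""
    B, C, H, W = x.shape
    g = torch.Generator().manual_seed(seed)
    planes = torch.randn(H * W, num_hyperplanes, generator=g).to(x.device)
    signatures = x.mean(0).reshape(C, H * W)   # batch-averaged channel maps
    bits = (signatures @ planes > 0).long()    # (C, num_hyperplanes) hash bits
    powers = 2 ** torch.arange(num_hyperplanes, device=x.device)
    return (bits * powers).sum(1)              # (C,) bucket ids
```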
- Dynamic Frame Interpolation in Wavelet Domain [57.25341639095404]
Video frame interpolation is an important low-level computer vision task, which can increase the frame rate for a more fluent visual experience.
Existing methods have achieved great success by employing advanced motion models and synthesis networks.
WaveletVFI can reduce computation by up to 40% while maintaining similar accuracy, making it more efficient than other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-07T06:41:15Z)
- DIFT: Dynamic Iterative Field Transforms for Memory Efficient Optical Flow [44.57023882737517]
We introduce a lightweight, low-latency, and memory-efficient model for optical flow estimation.
DIFT is feasible for edge applications such as mobile, XR, micro UAVs, robotics and cameras.
We demonstrate the first real-time cost-volume-based optical flow DL architecture on the Snapdragon 8 Gen 1 HTP, an efficient mobile AI accelerator.
arXiv Detail & Related papers (2023-06-09T06:10:59Z)
- LLA-FLOW: A Lightweight Local Aggregation on Cost Volume for Optical Flow Estimation [35.922073542578055]
Some methods insert stacked transformer modules that allow the network to use global information of the cost volume for estimation.
But the global information aggregation often incurs serious memory and time costs during training and inference, which hinders model deployment.
We draw inspiration from the traditional local region constraint and design the local similarity aggregation (LSA) and the shifted local similarity aggregation (SLSA) modules.
Experiments on the final pass of Sintel show that our approach requires lower cost while maintaining competitive performance.
arXiv Detail & Related papers (2023-04-17T09:22:05Z)
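The LSA/SLSA modules are described above only at a high level, so the following is an assumption-heavy sketch of one way local similarity aggregation could work: each pixel's cost vector is averaged over a small window, weighted by feature similarity to the window center:

```python
import torch
import torch.nn.functional as F

def local_similarity_aggregation(cost, feat, radius=1):
    """Aggregate each pixel's cost over a (2*radius+1)^2 window, weighted
    by softmax feature similarity to the center pixel.
    cost: (B, D, H, W); feat: (B, C, H, W)."""
    B, D, H, W = cost.shape
    C, k = feat.shape[1], 2 * radius + 1
    feat_win = F.unfold(feat, k, padding=radius).view(B, C, k * k, H * W)
    sim = (feat_win * feat.view(B, C, 1, H * W)).sum(1) / C ** 0.5
    w = sim.softmax(dim=1)                                  # (B, k*k, H*W)
    cost_win = F.unfold(cost, k, padding=radius).view(B, D, k * k, H * W)
    out = (cost_win * w.unsqueeze(1)).sum(2)                # (B, D, H*W)
    return out.view(B, D, H, W)
```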
- Correlate-and-Excite: Real-Time Stereo Matching via Guided Cost Volume Excitation [65.83008812026635]
We construct Guided Cost volume Excitation (GCE) and show that simple channel excitation of the cost volume, guided by the image, can improve performance considerably.
We present an end-to-end network that we call Correlate-and-Excite (CoEx).
arXiv Detail & Related papers (2021-08-12T14:32:26Z)
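A small sketch of the guided-excitation idea named above, assuming the simplest possible form (per-pixel channel gates computed from image features and multiplied into the cost volume); the real GCE sits inside CoEx's stereo aggregation network, and the module name and shapes here are assumptions:

```python
import torch
import torch.nn as nn

class GuidedCostExcitation(nn.Module):
    """Gate the cost volume's channels (the displacement dimension) with
    per-pixel weights predicted from image features."""
    def __init__(self, feat_channels, cost_channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(feat_channels, cost_channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, cost, image_feat):
        # cost: (B, D, H, W); image_feat: (B, C, H, W)
        return cost * self.gate(image_feat)
```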
- Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and that accurate flow estimation can be achieved with only a fraction of its elements.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z)
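As a toy illustration of keeping only a few matches per pixel, the snippet below selects the k strongest correlations; note that it still materializes the dense volume first, which a memory-efficient implementation would avoid (e.g., by chunking), so it only shows the sparsification step, and all names are assumptions:

```python
import torch

def topk_correlations(feat1, feat2, k=32):
    """Dense all-pairs correlation followed by per-pixel top-k selection.
    feat1, feat2: (B, C, H, W). Returns (values, indices), each shaped
    (B, H*W, k); indices are flattened positions in the second image."""
    B, C, H, W = feat1.shape
    f1 = feat1.flatten(2).transpose(1, 2)      # (B, H*W, C)
    f2 = feat2.flatten(2)                      # (B, C, H*W)
    corr = torch.bmm(f1, f2) / C ** 0.5        # (B, H*W, H*W) dense volume
    vals, idx = corr.topk(k, dim=2)            # keep k best matches per pixel
    return vals, idx
```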
- Displacement-Invariant Matching Cost Learning for Accurate Optical Flow Estimation [109.64756528516631]
Learning matching costs has been shown to be critical to the success of state-of-the-art deep stereo matching methods.
This paper proposes a novel solution that is able to bypass the requirement of building a 5D feature volume.
Our approach achieves state-of-the-art accuracy on various datasets, and outperforms all published optical flow methods on the Sintel benchmark.
arXiv Detail & Related papers (2020-10-28T09:57:00Z)
- LiteFlowNet3: Resolving Correspondence Ambiguity for More Accurate Optical Flow Estimation [99.19322851246972]
We introduce LiteFlowNet3, a deep network consisting of two specialized modules to address the problem of optical flow estimation.
LiteFlowNet3 not only achieves promising results on public benchmarks but also has a small model size and a fast runtime.
arXiv Detail & Related papers (2020-07-18T03:30:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.