Efficient Correlation Volume Sampling for Ultra-High-Resolution Optical Flow Estimation
- URL: http://arxiv.org/abs/2505.16942v1
- Date: Thu, 22 May 2025 17:30:38 GMT
- Title: Efficient Correlation Volume Sampling for Ultra-High-Resolution Optical Flow Estimation
- Authors: Karlis Martins Briedis, Markus Gross, Christopher Schroers,
- Abstract summary: We propose a more efficient implementation of the all-pairs correlation volume sampling, still matching the exact mathematical operator as defined by RAFT.<n>Our approach outperforms on-demand sampling by up to 90% while maintaining low memory usage, and performs on par with the default implementation with up to 95% lower memory usage.
- Score: 10.244450933742089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent optical flow estimation methods often employ local cost sampling from a dense all-pairs correlation volume. This results in quadratic computational and memory complexity in the number of pixels. Although an alternative memory-efficient implementation with on-demand cost computation exists, this is slower in practice and therefore prior methods typically process images at reduced resolutions, missing fine-grained details. To address this, we propose a more efficient implementation of the all-pairs correlation volume sampling, still matching the exact mathematical operator as defined by RAFT. Our approach outperforms on-demand sampling by up to 90% while maintaining low memory usage, and performs on par with the default implementation with up to 95% lower memory usage. As cost sampling makes up a significant portion of the overall runtime, this can translate to up to 50% savings for the total end-to-end model inference in memory-constrained environments. Our evaluation of existing methods includes an 8K ultra-high-resolution dataset and an additional inference-time modification of the recent SEA-RAFT method. With this, we achieve state-of-the-art results at high resolutions both in accuracy and efficiency.
Related papers
- MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation [41.94295877935867]
MEMFOF is a memory-efficient multi-frame optical flow method.<n>Method requires only 2.09 GB of GPU memory at runtime for 1080p inputs, and 28.5 GB during training.<n>Method ranks first on the Spring benchmark with a 1-pixel (1px) outlier rate of 3.289.
arXiv Detail & Related papers (2025-06-29T09:01:42Z) - Accelerating Diffusion Sampling with Optimized Time Steps [69.21208434350567]
Diffusion probabilistic models (DPMs) have shown remarkable performance in high-resolution image synthesis.
Their sampling efficiency is still to be desired due to the typically large number of sampling steps.
Recent advancements in high-order numerical ODE solvers for DPMs have enabled the generation of high-quality images with much fewer sampling steps.
arXiv Detail & Related papers (2024-02-27T10:13:30Z) - A Finite-Horizon Approach to Active Level Set Estimation [0.7366405857677227]
We consider the problem of active learning in the context of spatial sampling for level set estimation (LSE)
We present a finite-horizon search procedure to perform LSE in one dimension while optimally balancing both the final estimation error and the distance traveled for a fixed number of samples.
We show that the resulting optimization problem can be solved in closed form and that the resulting policy generalizes existing approaches to this problem.
arXiv Detail & Related papers (2023-10-18T14:11:41Z) - Efficient distributed representations with linear-time attention scores normalization [3.8673630752805437]
We propose a linear-time approximation of the attention score normalization constants for embedding vectors with bounded norms.
The accuracy of our estimation formula surpasses competing kernel methods by even orders of magnitude.
The proposed algorithm is highly interpretable and easily adapted to an arbitrary embedding problem.
arXiv Detail & Related papers (2023-03-30T15:48:26Z) - DIP: Deep Inverse Patchmatch for High-Resolution Optical Flow [7.73554718719193]
We propose a novel Patchmatch-based framework to work on high-resolution optical flow estimation.
It can get high-precision results with lower memory benefiting from propagation and local search of Patchmatch.
Our method ranks first on all the metrics on the popular KITTI2015 benchmark, and ranks second on EPE on the Sintel clean benchmark among published optical flow methods.
arXiv Detail & Related papers (2022-04-01T10:13:59Z) - Dense Optical Flow from Event Cameras [55.79329250951028]
We propose to incorporate feature correlation and sequential processing into dense optical flow estimation from event cameras.
Our proposed approach computes dense optical flow and reduces the end-point error by 23% on MVSEC.
arXiv Detail & Related papers (2021-08-24T07:39:08Z) - SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z) - Normalized Convolution Upsampling for Refined Optical Flow Estimation [23.652615797842085]
Normalized Convolution UPsampler (NCUP) is an efficient joint upsampling approach to produce the full-resolution flow during the training of optical flow CNNs.
Our proposed approach formulates the upsampling task as a sparse problem and employs the normalized convolutional neural networks to solve it.
We achieve state-of-the-art results on Sintel benchmark with 6% error reduction, and on-par on the KITTI dataset, while having 7.5% fewer parameters.
arXiv Detail & Related papers (2021-02-13T18:34:03Z) - Fully Quantized Image Super-Resolution Networks [81.75002888152159]
We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
We apply our quantization scheme on multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
Our FQSR using low bits quantization can achieve on par performance compared with the full-precision counterparts on five benchmark datasets.
arXiv Detail & Related papers (2020-11-29T03:53:49Z) - Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stages multi-scale features.
We build an extremely light-weighted model, namely CSNet, which achieves comparable performance with about 0.2% (100k) of large models on popular object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.