BAT: Learning Event-based Optical Flow with Bidirectional Adaptive Temporal Correlation
- URL: http://arxiv.org/abs/2503.03256v1
- Date: Wed, 05 Mar 2025 08:20:16 GMT
- Title: BAT: Learning Event-based Optical Flow with Bidirectional Adaptive Temporal Correlation
- Authors: Gangwei Xu, Haotong Lin, Zhaoxing Zhang, Hongcheng Luo, Haiyang Sun, Xin Yang,
- Abstract summary: Event cameras deliver visual information characterized by a high dynamic range and high temporal resolution.<n>Current advanced optical flow methods for event cameras largely adopt established image-based frameworks.<n>We present BAT, an innovative framework that estimates event-based optical flow using bidirectional adaptive temporal correlation.
- Score: 9.216990457540941
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event cameras deliver visual information characterized by a high dynamic range and high temporal resolution, offering significant advantages in estimating optical flow for complex lighting conditions and fast-moving objects. Current advanced optical flow methods for event cameras largely adopt established image-based frameworks. However, the spatial sparsity of event data limits their performance. In this paper, we present BAT, an innovative framework that estimates event-based optical flow using bidirectional adaptive temporal correlation. BAT includes three novel designs: 1) a bidirectional temporal correlation that transforms bidirectional temporally dense motion cues into spatially dense ones, enabling accurate and spatially dense optical flow estimation; 2) an adaptive temporal sampling strategy for maintaining temporal consistency in correlation; 3) spatially adaptive temporal motion aggregation to efficiently and adaptively aggregate consistent target motion features into adjacent motion features while suppressing inconsistent ones. Our results rank $1^{st}$ on the DSEC-Flow benchmark, outperforming existing state-of-the-art methods by a large margin while also exhibiting sharp edges and high-quality details. Notably, our BAT can accurately predict future optical flow using only past events, significantly outperforming E-RAFT's warm-start approach. Code: \textcolor{magenta}{https://github.com/gangweiX/BAT}.
Related papers
- SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs.
We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions.
With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z) - EMoTive: Event-guided Trajectory Modeling for 3D Motion Estimation [59.33052312107478]
Event cameras offer possibilities for 3D motion estimation through continuous adaptive pixel-level responses to scene changes.
This paper presents EMove, a novel event-based framework that models-uniform trajectories via event-guided parametric curves.
For motion representation, we introduce a density-aware adaptation mechanism to fuse spatial and temporal features under event guidance.
The final 3D motion estimation is achieved through multi-temporal sampling of parametric trajectories, flows and depth motion fields.
arXiv Detail & Related papers (2025-03-14T13:15:54Z) - Spatially-guided Temporal Aggregation for Robust Event-RGB Optical Flow Estimation [47.75348821902489]
Current optical flow methods exploit the stable appearance of frame (or RGB) data to establish robust correspondences across time.<n>Event cameras, on the other hand, provide high-temporal-resolution motion cues and excel in challenging scenarios.<n>This study introduces a novel approach that uses a spatially dense modality to guide the aggregation of the temporally dense event modality.
arXiv Detail & Related papers (2025-01-01T13:40:09Z) - Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation [34.529280562470746]
We introduce a novel self-supervised loss combining the Contrast Maximization framework with a non-linear motion prior in the form of pixel-level trajectories.
Their effectiveness is demonstrated in two scenarios: In dense continuous-time motion estimation, our method improves the zero-shot performance of a synthetically trained model by 29%.
arXiv Detail & Related papers (2024-07-15T15:18:28Z) - PASTA: Towards Flexible and Efficient HDR Imaging Via Progressively Aggregated Spatio-Temporal Alignment [91.38256332633544]
PASTA is a Progressively Aggregated Spatio-Temporal Alignment framework for HDR deghosting.
Our approach achieves effectiveness and efficiency by harnessing hierarchical representation during feature distanglement.
Experimental results showcase PASTA's superiority over current SOTA methods in both visual quality and performance metrics.
arXiv Detail & Related papers (2024-03-15T15:05:29Z) - TMA: Temporal Motion Aggregation for Event-based Optical Flow [27.49029251605363]
Event cameras have the ability to record continuous and detailed trajectories of objects with high temporal resolution.
Most existing learning-based approaches for event optical flow estimation ignore the inherent temporal continuity of event data.
We propose a novel Temporal Motion Aggregation (TMA) approach to unlock its potential.
arXiv Detail & Related papers (2023-03-21T06:51:31Z) - Learning Dense and Continuous Optical Flow from an Event Camera [28.77846425802558]
Event cameras such as DAVIS can simultaneously output high temporal resolution events and low frame-rate intensity images.
Most of the existing optical flow estimation methods are based on two consecutive image frames and can only estimate discrete flow at a fixed time interval.
We propose a novel deep learning-based dense and continuous optical flow estimation framework from a single image with event streams.
arXiv Detail & Related papers (2022-11-16T17:53:18Z) - Correlating sparse sensing for large-scale traffic speed estimation: A
Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor (LETC) framework featuring both lowrankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z) - Dense Continuous-Time Optical Flow from Events and Frames [27.1850072968441]
We show that it is possible to compute per-pixel, continuous-time optical flow using events from an event camera.
We leverage these benefits to predict pixel trajectories densely in continuous time via parameterized B'ezier curves.
Our model is the first method that can regress dense pixel trajectories from event data.
arXiv Detail & Related papers (2022-03-25T14:29:41Z) - RGB Stream Is Enough for Temporal Action Detection [3.2689702143620147]
State-of-the-art temporal action detectors to date are based on two-stream input including RGB frames and optical flow.
optical flow is a hand-designed representation which not only requires heavy computation, but also makes it methodologically unsatisfactory that two-stream methods are often not learned end-to-end jointly with the flow.
We argue that optical flow is dispensable in high-accuracy temporal action detection and image level data augmentation is the key solution to avoid performance degradation when optical flow is removed.
arXiv Detail & Related papers (2021-07-09T11:10:11Z) - TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel indicates equal contribution method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z) - Unsupervised Motion Representation Enhanced Network for Action
Recognition [4.42249337449125]
Motion representation between consecutive frames has proven to have great promotion to video understanding.
TV-L1 method, an effective optical flow solver, is time-consuming and expensive in storage for caching the extracted optical flow.
We propose UF-TSN, a novel end-to-end action recognition approach enhanced with an embedded lightweight unsupervised optical flow estimator.
arXiv Detail & Related papers (2021-03-05T04:14:32Z) - PAN: Towards Fast Action Recognition via Learning Persistence of
Appearance [60.75488333935592]
Most state-of-the-art methods heavily rely on dense optical flow as motion representation.
In this paper, we shed light on fast action recognition by lifting the reliance on optical flow.
We design a novel motion cue called Persistence of Appearance (PA)
In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.
arXiv Detail & Related papers (2020-08-08T07:09:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.