Injecting Frame-Event Complementary Fusion into Diffusion for Optical Flow in Challenging Scenes
- URL: http://arxiv.org/abs/2510.10577v1
- Date: Sun, 12 Oct 2025 12:52:31 GMT
- Title: Injecting Frame-Event Complementary Fusion into Diffusion for Optical Flow in Challenging Scenes
- Authors: Haonan Wang, Hanyu Zhou, Haoyue Liu, Luxin Yan
- Abstract summary: In degraded scenes, the frame camera provides dense appearance saturation but sparse boundary completeness due to its long imaging time and low dynamic range. In contrast, the event camera offers sparse appearance saturation, while its short imaging time and high dynamic range give rise to dense boundary completeness. We propose a novel optical flow estimation framework, Diff-ABFlow, based on diffusion models with frame-event appearance-boundary fusion.
- Score: 41.822043262920296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical flow estimation has achieved promising results in conventional scenes but struggles in high-speed and low-light scenes, which suffer from motion blur and insufficient illumination. These conditions weaken texture, amplify noise, and deteriorate the appearance saturation and boundary completeness of frame images, both of which are necessary for motion feature matching. In degraded scenes, the frame camera provides dense appearance saturation but sparse boundary completeness due to its long imaging time and low dynamic range. In contrast, the event camera offers sparse appearance saturation, while its short imaging time and high dynamic range give rise to dense boundary completeness. Existing methods typically use feature fusion or domain adaptation to introduce event data and improve boundary completeness. However, the appearance features remain deteriorated, which severely affects both the commonly adopted discriminative models, which learn the mapping from visual features to motion fields, and generative models, which generate motion fields from given visual features. We therefore introduce diffusion models, which learn the mapping from noisy flow to clear flow and are thus not affected by the deteriorated visual features. Building on this, we propose Diff-ABFlow, a novel optical flow estimation framework based on diffusion models with frame-event appearance-boundary fusion.
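The abstract's core idea, denoising a flow field with a diffusion model conditioned on fused frame (appearance) and event (boundary) features, can be sketched roughly as below. Module names, channel sizes, and the gated fusion scheme are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: conditional diffusion denoising of optical flow with a
# hypothetical frame-event appearance-boundary fusion module (assumed design).
import torch
import torch.nn as nn


class AppearanceBoundaryFusion(nn.Module):
    """Fuse frame-derived appearance features with event-derived boundary features."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, frame_feat: torch.Tensor, event_feat: torch.Tensor) -> torch.Tensor:
        # Gated blend: rely on frame appearance where it is reliable,
        # fall back to event boundaries where the frame is degraded.
        g = self.gate(torch.cat([frame_feat, event_feat], dim=1))
        return g * frame_feat + (1.0 - g) * event_feat


class FlowDenoiser(nn.Module):
    """Predict the noise added to a flow field, conditioned on fused features."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 + channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 2, 3, padding=1),  # 2 channels: (u, v) flow noise
        )

    def forward(self, noisy_flow: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([noisy_flow, cond], dim=1))


if __name__ == "__main__":
    B, C, H, W = 2, 64, 32, 32
    frame_feat, event_feat = torch.randn(B, C, H, W), torch.randn(B, C, H, W)
    clean_flow = torch.randn(B, 2, H, W)

    fuse, denoiser = AppearanceBoundaryFusion(C), FlowDenoiser(C)
    cond = fuse(frame_feat, event_feat)

    # One DDPM-style training step: add noise to the flow, predict it back.
    noise = torch.randn_like(clean_flow)
    alpha_bar = torch.tensor(0.7)  # stand-in for a noise-schedule value at step t
    noisy_flow = alpha_bar.sqrt() * clean_flow + (1 - alpha_bar).sqrt() * noise
    loss = torch.mean((denoiser(noisy_flow, cond) - noise) ** 2)
    loss.backward()
```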
Related papers
- Fine-grained Defocus Blur Control for Generative Image Models [66.30016220484394]
Current text-to-image diffusion models excel at generating diverse, high-quality images. We introduce a novel text-to-image diffusion framework that leverages camera metadata. Our model enables superior fine-grained control without altering the depicted scene.
arXiv Detail & Related papers (2025-10-07T17:59:15Z) - BokehDiff: Neural Lens Blur with One-Step Diffusion [53.11429878683807]
We introduce BokehDiff, a lens blur rendering method that achieves physically accurate and visually appealing outcomes. Our method employs a physics-inspired self-attention module that aligns with the image formation process. We adapt the diffusion model to the one-step inference scheme without introducing additional noise, and achieve results of high quality and fidelity.
arXiv Detail & Related papers (2025-07-24T03:23:19Z) - FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation [51.110607281391154]
FlowMo is a training-free guidance method for enhancing motion coherence in text-to-video models. It estimates motion coherence by measuring the patch-wise variance across the temporal dimension and guides the model to reduce this variance dynamically during sampling.
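A rough sketch of the guidance signal this summary describes: measure patch-wise variance across time in an intermediate video representation, then step against its gradient during sampling. Tensor shapes, the patch size, and the step size below are assumptions for illustration, not FlowMo's actual code.

```python
# Illustrative patch-wise temporal variance penalty for motion-coherence guidance.
import torch
import torch.nn.functional as F


def patchwise_temporal_variance(latents: torch.Tensor, patch: int = 8) -> torch.Tensor:
    """latents: (B, T, C, H, W) video latents; returns a scalar coherence penalty."""
    B, T, C, H, W = latents.shape
    # Average each frame over non-overlapping spatial patches.
    pooled = F.avg_pool2d(latents.reshape(B * T, C, H, W), kernel_size=patch)
    pooled = pooled.reshape(B, T, C, -1)  # (B, T, C, num_patches)
    # Variance over the temporal dimension, averaged over everything else.
    return pooled.var(dim=1).mean()


# Conceptual use inside a sampling loop (training-free guidance):
latents = torch.randn(1, 16, 4, 64, 64, requires_grad=True)
penalty = patchwise_temporal_variance(latents)
penalty.backward()
guided = latents - 0.1 * latents.grad  # step against the variance gradient
```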
arXiv Detail & Related papers (2025-06-01T19:55:33Z) - Zero-TIG: Temporal Consistency-Aware Zero-Shot Illumination-Guided Low-light Video Enhancement [2.9695823613761316]
Low-light and underwater videos suffer from poor visibility, low contrast, and high noise. Existing approaches typically rely on paired ground truth, which limits their practicality and often fails to maintain temporal consistency. This paper introduces a novel zero-shot learning approach named Zero-TIG, leveraging the Retinex theory and optical flow techniques.
arXiv Detail & Related papers (2025-03-14T08:22:26Z) - Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion Models [26.79219274697864]
Bokeh Diffusion is a scene-consistent bokeh control framework. We introduce a hybrid training pipeline that aligns in-the-wild images with synthetic blur augmentations. Our approach enables flexible, lens-like blur control and supports downstream applications such as real image editing via inversion.
arXiv Detail & Related papers (2025-03-11T13:49:12Z) - DiffuEraser: A Diffusion Model for Video Inpainting [13.292164408616257]
We introduce DiffuEraser, a video inpainting model based on stable diffusion, to fill masked regions with greater details and more coherent structures. We also expand the temporal receptive fields of both the prior model and DiffuEraser, and further enhance consistency by leveraging the temporal smoothing property of Video Diffusion Models.
arXiv Detail & Related papers (2025-01-17T08:03:02Z) - SMURF: Continuous Dynamics for Motion-Deblurring Radiance Fields [13.684805723485157]
The presence of motion blur, resulting from slight camera movements during extended shutter exposures, poses a significant challenge. We propose sequential motion understanding radiance fields (SMURF), a novel approach that models continuous camera motion. Our model is evaluated against benchmark datasets and demonstrates state-of-the-art performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2024-03-12T11:32:57Z) - MoBluRF: Motion Deblurring Neural Radiance Fields for Blurry Monocular Video [25.964642223641057]
MoBluRF is a framework for synthesizing sharp spatio-temporal views from blurry monocular video. In the BRI stage, we reconstruct dynamic 3D scenes and jointly initialize the base rays, which are used to predict latent sharp rays. In the MDD stage, we introduce a novel Incremental Latent Sharp-rays Prediction (ILSP) approach for the blurry monocular video frames.
arXiv Detail & Related papers (2023-12-21T02:01:19Z) - TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z) - Motion-blurred Video Interpolation and Extrapolation [72.3254384191509]
We present a novel framework for deblurring, interpolating and extrapolating sharp frames from a motion-blurred video in an end-to-end manner.
To ensure temporal coherence across predicted frames and address potential temporal ambiguity, we propose a simple, yet effective flow-based rule.
arXiv Detail & Related papers (2021-03-04T12:18:25Z) - Exposure Trajectory Recovery from Motion Blur [90.75092808213371]
Motion blur in dynamic scenes is an important yet challenging research topic.
In this paper, we define exposure trajectories, which represent the motion information contained in a blurry image.
A novel motion offset estimation framework is proposed to model pixel-wise displacements of the latent sharp image.
arXiv Detail & Related papers (2020-10-06T05:23:33Z)
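The exposure-trajectory idea in the last entry can be illustrated with a small, hypothetical blur model (not the paper's code): a blurry frame approximated as the average of the latent sharp image warped by pixel-wise offsets sampled along the exposure. The function name, offset parameterization, and averaging are assumptions for illustration.

```python
# Illustrative blur rendering from per-pixel exposure-trajectory offsets.
import torch
import torch.nn.functional as F


def render_blur(sharp: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
    """sharp: (B, C, H, W); offsets: (B, T, H, W, 2) per-pixel displacements in pixels."""
    B, C, H, W = sharp.shape
    T = offsets.shape[1]
    # Base sampling grid in normalized [-1, 1] coordinates for grid_sample.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    base = torch.stack([xs, ys], dim=-1).expand(B, H, W, 2)
    frames = []
    for t in range(T):
        # Convert pixel offsets to the normalized coordinate range.
        norm = offsets[:, t] * torch.tensor([2.0 / (W - 1), 2.0 / (H - 1)])
        frames.append(F.grid_sample(sharp, base + norm, align_corners=True))
    return torch.stack(frames, dim=0).mean(dim=0)  # average over the exposure


blurry = render_blur(torch.rand(1, 3, 64, 64), torch.zeros(1, 5, 64, 64, 2))
```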