RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical
Flow and Scene Flow Estimation
- URL: http://arxiv.org/abs/2309.15082v1
- Date: Tue, 26 Sep 2023 17:23:55 GMT
- Title: RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical
Flow and Scene Flow Estimation
- Authors: Zhexiong Wan, Yuxin Mao, Jing Zhang, Yuchao Dai
- Abstract summary: In this paper, we incorporate RGB images, point clouds, and events for joint optical flow and scene flow estimation with our proposed multi-stage multimodal fusion model, RPEFlow.
Experiments on both synthetic and real datasets show that our model outperforms the existing state-of-the-art by a wide margin.
- Score: 43.358140897849616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, RGB image and point cloud fusion methods have been proposed
to jointly estimate 2D optical flow and 3D scene flow. However, as both
conventional RGB cameras and LiDAR sensors adopt a frame-based data acquisition
mechanism, their performance is limited by the fixed low sampling rates,
especially in highly-dynamic scenes. By contrast, the event camera can
asynchronously capture the intensity changes with a very high temporal
resolution, providing complementary dynamic information of the observed scenes.
In this paper, we incorporate RGB images, point clouds, and events for joint
optical flow and scene flow estimation with our proposed multi-stage multimodal
fusion model, RPEFlow. First, we present an attention fusion module with a
cross-attention mechanism to implicitly explore the internal cross-modal
correlation for 2D and 3D branches, respectively. Second, we introduce a mutual
information regularization term to explicitly model the complementary
information of three modalities for effective multimodal feature learning. We
also contribute a new synthetic dataset to advocate further research.
Experiments on both synthetic and real datasets show that our model outperforms
the existing state-of-the-art by a wide margin. Code and dataset are available
at https://npucvr.github.io/RPEFlow.
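The abstract describes an attention fusion module in which the 2D and 3D branches use cross-attention to exploit each other's features. The paper's actual architecture is not reproduced in this summary, so the following is only a minimal NumPy sketch of the cross-attention idea; the token shapes, the dot-product attention form, and the residual fusion are all illustrative assumptions, not RPEFlow's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Tokens of one modality attend over tokens of another modality.

    query_feats:   (Nq, d) features of the branch being enriched
    context_feats: (Nc, d) features of the other modality
    Returns (Nq, d) context summaries, one per query token.
    """
    d_k = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d_k)  # (Nq, Nc)
    return softmax(scores, axis=-1) @ context_feats        # (Nq, d)

# Toy features: 4 image tokens and 6 point tokens, 8-dim each.
rgb = np.random.default_rng(0).normal(size=(4, 8))
pts = np.random.default_rng(1).normal(size=(6, 8))

# Each branch is enriched with cues attended from the other modality
# (residual fusion here is an assumed, simplified choice).
fused_2d = rgb + cross_attention(rgb, pts)  # 2D branch + 3D cues
fused_3d = pts + cross_attention(pts, rgb)  # 3D branch + 2D cues
```

The key property is that each branch keeps its own token count and feature width while absorbing information from the other modality, which is what lets the 2D and 3D pipelines stay separate downstream.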
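The abstract also mentions a mutual information regularization term for multimodal feature learning, but this summary does not specify the estimator used. As a purely illustrative stand-in, the sketch below computes a generic InfoNCE-style lower bound on the mutual information between paired features from two modalities, which is one common way such a term is realized.

```python
import numpy as np

def infonce_lower_bound(f_a, f_b, temperature=0.1):
    """InfoNCE-style lower bound on I(A; B) for N paired feature rows.

    f_a, f_b: (N, d) features; row i of f_a is paired with row i of f_b.
    Returns a scalar bound, at most log(N).
    """
    # Cosine-normalize so similarities are bounded.
    a = f_a / np.linalg.norm(f_a, axis=1, keepdims=True)
    b = f_b / np.linalg.norm(f_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N) all-pairs similarity
    # Row-wise log-softmax; the diagonal holds the positive pairs.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    nce = log_probs.diagonal().mean()
    # I(A; B) >= log N + E[log p(positive | row)]
    return nce + np.log(len(f_a))
```

Maximizing such a bound during training encourages paired features from different modalities to stay mutually predictive; whether RPEFlow uses this particular estimator is not stated in the summary above.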
Related papers
- Spatially-guided Temporal Aggregation for Robust Event-RGB Optical Flow Estimation [47.75348821902489]
Current optical flow methods exploit the stable appearance of frame (or RGB) data to establish robust correspondences across time.
Event cameras, on the other hand, provide high-temporal-resolution motion cues and excel in challenging scenarios.
This study introduces a novel approach that uses a spatially dense modality to guide the aggregation of the temporally dense event modality.
arXiv Detail & Related papers (2025-01-01T13:40:09Z)
- Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection [70.84835546732738]
RGB-Thermal Salient Object Detection aims to pinpoint prominent objects within aligned pairs of visible and thermal infrared images.
Traditional encoder-decoder architectures may not have adequately considered the robustness against noise originating from defective modalities.
We propose the ConTriNet, a robust Confluent Triple-Flow Network employing a Divide-and-Conquer strategy.
arXiv Detail & Related papers (2024-12-02T14:44:39Z)
- Camera Motion Estimation from RGB-D-Inertial Scene Flow [9.192660643226372]
We introduce a novel formulation for camera motion estimation that integrates RGB-D images and inertial data through scene flow.
Our goal is to accurately estimate the camera motion in a rigid 3D environment, along with the state of the inertial measurement unit (IMU).
arXiv Detail & Related papers (2024-04-26T08:42:59Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refinement, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- Attentive Multimodal Fusion for Optical and Scene Flow [24.08052492109655]
Existing methods typically rely solely on RGB images or fuse the modalities at later stages.
We propose a novel deep neural network approach named FusionRAFT, which enables early-stage information fusion between sensor modalities.
Our approach exhibits improved robustness in the presence of noise and low-lighting conditions that affect the RGB images.
arXiv Detail & Related papers (2023-07-28T04:36:07Z)
- Revisiting Event-based Video Frame Interpolation [49.27404719898305]
Dynamic vision sensors, or event cameras, provide rich complementary information for video frame interpolation.
Estimating optical flow from events is arguably more difficult than from RGB information.
We propose a divide-and-conquer strategy in which event-based intermediate frame synthesis happens incrementally in multiple simplified stages.
arXiv Detail & Related papers (2023-07-24T06:51:07Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF)
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- Middle-level Fusion for Lightweight RGB-D Salient Object Detection [81.43951906434175]
A novel lightweight RGB-D SOD model is presented in this paper.
With IMFF and L modules incorporated in the middle-level fusion structure, our proposed model has only 3.9M parameters and runs at 33 FPS.
The experimental results on several benchmark datasets verify the effectiveness and superiority of the proposed method over some state-of-the-art methods.
arXiv Detail & Related papers (2021-04-23T11:37:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.