ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based
Motion Segmentation
- URL: http://arxiv.org/abs/2203.11732v1
- Date: Tue, 22 Mar 2022 13:40:26 GMT
- Title: ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based
Motion Segmentation
- Authors: Jinze Chen, Yang Wang, Yang Cao, Feng Wu, Zheng-Jun Zha
- Abstract summary: This paper presents a Motion Estimation (ME) module and an Event Denoising (ED) module jointly optimized in a mutually reinforced manner.
Taking temporal correlation as guidance, the ED module calculates the confidence that each event belongs to real activity, and transmits it to the ME module to update the energy function of motion segmentation for noise suppression.
- Score: 101.19290845597918
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dynamic Vision Sensor (DVS) can asynchronously output the events reflecting
apparent motion of objects with microsecond resolution, and shows great
application potential in monitoring and other fields. However, the output event stream of existing DVS devices inevitably contains background activity noise (BA noise) caused by dark current and junction leakage current, which disrupts the
temporal correlation of objects, resulting in deteriorated motion estimation
performance. In particular, existing filter-based denoising methods cannot be directly applied to suppress this noise in the event stream, since the events lack spatial correlation. To address this issue, this paper presents a novel
progressive framework, in which a Motion Estimation (ME) module and an Event
Denoising (ED) module are jointly optimized in a mutually reinforced manner.
Specifically, based on the maximum sharpness criterion, the ME module divides the input event stream into several segments by adaptive clustering in a motion-compensating warp field, and captures the temporal correlation of the event stream according to the clustered motion parameters. Taking temporal correlation as
guidance, the ED module calculates the confidence that each event belongs to real activity, and transmits it to the ME module to update the energy function of motion segmentation for noise suppression. The two steps are iteratively
updated until stable motion segmentation results are obtained. Extensive
experimental results on both synthetic and real datasets demonstrate the superiority of the proposed approach over State-Of-The-Art (SOTA) methods.
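As an illustration of how the two modules can reinforce each other, below is a minimal NumPy/SciPy sketch of the alternation described above, reduced to a single motion cluster with a constant-velocity warp: the ME step maximizes the sharpness (variance) of the confidence-weighted image of warped events, and the ED step assigns each event a confidence proportional to the local support of its motion-compensated location, so isolated BA noise receives low weight. All names, the warp model, the sharpness objective, and the confidence proxy are assumptions for illustration; the paper's actual method performs adaptive clustering into multiple motion segments with its own energy function.

```python
# Hypothetical, simplified sketch of the mutually reinforced ME/ED loop.
# Single motion cluster, constant-velocity warp; not the authors' implementation.
import numpy as np
from scipy.optimize import minimize


def event_image(xy, t, theta, weights, shape):
    """Accumulate motion-compensated (warped) events into an image.

    xy      : (N, 2) pixel coordinates (x, y)
    t       : (N,)   timestamps in seconds
    theta   : (2,)   constant 2D velocity (px/s), a simple parametric warp model
    weights : (N,)   per-event confidences
    shape   : (H, W) image size
    """
    h, w = shape
    warped = xy - t[:, None] * theta[None, :]              # warp to reference time t_ref = 0
    xi = np.clip(np.round(warped).astype(int), 0, [w - 1, h - 1])
    img = np.zeros(shape)
    np.add.at(img, (xi[:, 1], xi[:, 0]), weights)          # confidence-weighted accumulation
    return img, xi


def motion_estimation(xy, t, weights, shape, theta0):
    """ME step: fit motion parameters that maximize the sharpness (variance)
    of the weighted image of warped events -- a maximum-sharpness criterion."""
    neg_sharpness = lambda th: -event_image(xy, t, th, weights, shape)[0].var()
    return minimize(neg_sharpness, theta0, method="Nelder-Mead").x


def event_denoising(xy, t, theta, shape):
    """ED step: confidence that an event is real activity, proxied by the local
    support at its motion-compensated location; isolated BA noise lands in
    sparse bins and therefore receives low confidence."""
    img, xi = event_image(xy, t, theta, np.ones(len(t)), shape)
    support = img[xi[:, 1], xi[:, 0]]
    return support / (support.max() + 1e-9)


def progressive_loop(xy, t, shape, n_iter=5):
    """Alternate the ME and ED steps until the motion estimate stabilizes."""
    theta, conf = np.zeros(2), np.ones(len(t))
    for _ in range(n_iter):
        theta = motion_estimation(xy, t, conf, shape, theta)  # motion update under current confidences
        conf = event_denoising(xy, t, theta, shape)           # confidence update under current motion
    return theta, conf
```

In practice such a loop would be run per spatio-temporal window of events, and the per-event confidences after convergence would double as the denoising output; extending it toward the paper's setting would mean maintaining several motion clusters and updating the segmentation energy with the confidences rather than a single warp.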
Related papers
- Deformable Feature Alignment and Refinement for Moving Infrared Dim-small Target Detection [17.765101100010224]
We propose a Deformable Feature Alignment and Refinement (DFAR) method based on deformable convolution to explicitly use motion context in both the training and inference stages.
The proposed DFAR method achieves the state-of-the-art performance on two benchmark datasets including DAUB and IRDST.
arXiv Detail & Related papers (2024-07-10T00:42:25Z)
- Event-based Video Frame Interpolation with Edge Guided Motion Refinement [28.331148083668857]
We introduce an end-to-end E-VFI learning method to efficiently utilize edge features from event signals for motion flow and warping enhancement.
Our method incorporates an Edge Guided Attentive (EGA) module, which rectifies estimated video motion through attentive aggregation.
Experiments on both synthetic and real datasets show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-04-28T12:13:34Z)
- Motion-aware Latent Diffusion Models for Video Frame Interpolation [51.78737270917301]
Motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity.
We propose a novel diffusion framework, motion-aware latent diffusion models (MADiff)
Our method achieves state-of-the-art performance, significantly outperforming existing approaches.
arXiv Detail & Related papers (2024-04-21T05:09:56Z)
- Explicit Motion Handling and Interactive Prompting for Video Camouflaged Object Detection [23.059829327898818]
Existing video camouflaged object detection approaches take noisy motion estimation as input or model motion implicitly.
We propose a novel Explicit Motion handling and Interactive Prompting framework for VCOD, dubbed EMIP, which handles motion cues explicitly.
EMIP is characterized by a two-stream architecture for simultaneously conducting camouflaged segmentation and optical flow estimation.
arXiv Detail & Related papers (2024-03-04T12:11:07Z)
- Implicit Event-RGBD Neural SLAM [54.74363487009845]
Implicit neural SLAM has achieved remarkable progress recently.
Existing methods face significant challenges in non-ideal scenarios.
We propose EN-SLAM, the first event-RGBD implicit neural SLAM framework.
arXiv Detail & Related papers (2023-11-18T08:48:58Z)
- DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to the groundtruth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z)
- Representation Learning for Compressed Video Action Recognition via Attentive Cross-modal Interaction with Motion Enhancement [28.570085937225976]
This paper proposes a novel framework, namely Attentive Cross-modal Interaction Network with Motion Enhancement.
It follows a two-stream architecture, with one stream for the RGB modality and the other for the motion modality.
Experiments on the UCF-101, HMDB-51 and Kinetics-400 benchmarks demonstrate the effectiveness and efficiency of MEACI-Net.
arXiv Detail & Related papers (2022-05-07T06:26:49Z)
- Implicit Motion Handling for Video Camouflaged Object Detection [60.98467179649398]
We propose a new video camouflaged object detection (VCOD) framework.
It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
arXiv Detail & Related papers (2022-03-14T17:55:41Z)
- EAN: Event Adaptive Network for Enhanced Action Recognition [66.81780707955852]
We propose a unified action recognition framework to investigate the dynamic nature of video content.
First, when extracting local cues, we generate dynamic-scale spatial-temporal kernels to adaptively fit the diverse events.
Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer.
arXiv Detail & Related papers (2021-07-22T15:57:18Z)
- TSI: Temporal Saliency Integration for Video Action Recognition [32.18535820790586]
We propose a Temporal Saliency Integration (TSI) block, which mainly contains a Salient Motion Excitation (SME) module and a Cross-scale Temporal Integration (CTI) module.
SME aims to highlight the motion-sensitive area through local-global motion modeling.
CTI is designed to perform multi-scale temporal modeling through a group of separate 1D convolutions.
arXiv Detail & Related papers (2021-06-02T11:43:49Z)