Bilateral Event Mining and Complementary for Event Stream Super-Resolution
- URL: http://arxiv.org/abs/2405.10037v1
- Date: Thu, 16 May 2024 12:16:25 GMT
- Title: Bilateral Event Mining and Complementary for Event Stream Super-Resolution
- Authors: Zhilin Huang, Quanmin Liang, Yijie Yu, Chujun Qin, Xiawu Zheng, Kai Huang, Zikun Zhou, Wenming Yang
- Abstract summary: Event Stream Super-Resolution (ESR) aims to address the challenge of insufficient spatial resolution in event streams.
We propose a bilateral event mining and complementary network (BMCNet) to fully leverage the potential of each event.
Our method significantly enhances the performance of event-based downstream tasks such as object recognition and video reconstruction.
- Score: 28.254644673666903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Event Stream Super-Resolution (ESR) aims to address the challenge of insufficient spatial resolution in event streams, which holds great significance for the application of event cameras in complex scenarios. Previous works for ESR often process positive and negative events in a mixed paradigm. This paradigm limits their ability to effectively model the unique characteristics of each event and mutually refine each other by considering their correlations. In this paper, we propose a bilateral event mining and complementary network (BMCNet) to fully leverage the potential of each event and capture the shared information to complement each other simultaneously. Specifically, we resort to a two-stream network to accomplish comprehensive mining of each type of events individually. To facilitate the exchange of information between two streams, we propose a bilateral information exchange (BIE) module. This module is layer-wisely embedded between two streams, enabling the effective propagation of hierarchical global information while alleviating the impact of invalid information brought by inherent characteristics of events. The experimental results demonstrate that our approach outperforms the previous state-of-the-art methods in ESR, achieving performance improvements of over 11% on both real and synthetic datasets. Moreover, our method significantly enhances the performance of event-based downstream tasks such as object recognition and video reconstruction. Our code is available at https://github.com/Lqm26/BMCNet-ESR.
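To illustrate the polarity separation that the abstract describes, here is a minimal sketch, not the authors' implementation, that splits an event stream by polarity and accumulates each polarity into its own count map, the kind of per-polarity input a two-stream network could consume. The event layout `(x, y, t, p)` with `p` in {+1, -1} and the function name `split_by_polarity` are assumptions made for this example; the released code may represent events differently.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp, polarity), polarity is +1 or -1.
Event = Tuple[int, int, float, int]


def split_by_polarity(
    events: List[Event], width: int, height: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Accumulate positive and negative events into separate count maps."""
    pos = [[0] * width for _ in range(height)]
    neg = [[0] * width for _ in range(height)]
    for x, y, _t, p in events:
        if not (0 <= x < width and 0 <= y < height):
            continue  # skip events that fall outside the sensor area
        if p > 0:
            pos[y][x] += 1
        else:
            neg[y][x] += 1
    return pos, neg


events = [(0, 0, 0.01, 1), (1, 0, 0.02, -1), (0, 0, 0.03, 1), (1, 1, 0.04, -1)]
pos_map, neg_map = split_by_polarity(events, width=2, height=2)
print(pos_map)  # [[2, 0], [0, 0]]
print(neg_map)  # [[0, 1], [0, 1]]
```

Keeping the two maps separate preserves per-polarity statistics that a mixed representation would merge; how the streams then exchange information (the BIE module) is specific to the paper and is not modeled here.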
Related papers
- EIFNet: Leveraging Event-Image Fusion for Robust Semantic Segmentation [0.18416014644193066]
Event cameras offer high dynamic range and fine temporal resolution, enabling robust scene understanding in challenging environments.
We propose EIFNet, a multi-modal fusion network that combines the strengths of both event and frame-based inputs.
EIFNet achieves state-of-the-art performance, demonstrating its effectiveness in event-based semantic segmentation.
arXiv Detail & Related papers (2025-07-29T16:19:55Z)
- SMamba: Sparse Mamba for Event-based Object Detection [17.141967728323714]
Transformer-based methods have achieved remarkable performance in event-based object detection, owing to the global modeling ability.
To mitigate cost, some researchers propose window attention based sparsification strategies to discard unimportant regions.
We propose Sparse Mamba, which performs adaptive sparsification to reduce computational effort while maintaining global modeling ability.
arXiv Detail & Related papers (2025-01-21T08:33:32Z)
- InterFormer: Towards Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction [72.50606292994341]
We propose a novel module named InterFormer to learn heterogeneous information interaction in an interleaving style.
Our proposed InterFormer achieves state-of-the-art performance on three public datasets and a large-scale industrial dataset.
arXiv Detail & Related papers (2024-11-15T00:20:36Z)
- Generating Event-oriented Attribution for Movies via Two-Stage Prefix-Enhanced Multimodal LLM [47.786978666537436]
We propose a Two-Stage Prefix-Enhanced MLLM (TSPE) approach for event attribution in movie videos.
In the local stage, we introduce an interaction-aware prefix that guides the model to focus on the relevant multimodal information within a single clip.
In the global stage, we strengthen the connections between associated events using an inferential knowledge graph.
arXiv Detail & Related papers (2024-09-14T08:30:59Z)
- Efficient Event Stream Super-Resolution with Recursive Multi-Branch Fusion [30.746523517295007]
We propose an efficient Recursive Multi-Branch Information Fusion Network (RMFNet) to separate positive and negative events.
FEM efficiently promotes the fusion and exchange of information between positive and negative branches.
Our approach achieves over 17% and 31% improvement on synthetic and real datasets, accompanied by a 2.3X acceleration.
arXiv Detail & Related papers (2024-06-28T04:10:21Z)
- Retain, Blend, and Exchange: A Quality-aware Spatial-Stereo Fusion Approach for Event Stream Recognition [57.74076383449153]
We propose a novel dual-stream framework for event stream-based pattern recognition via differentiated fusion, termed EFV++.
It models two common event representations simultaneously, i.e., event images and event voxels.
We achieve new state-of-the-art performance on the Bullying10k dataset, i.e., 90.51%, which exceeds the second place by +2.21%.
arXiv Detail & Related papers (2024-06-27T02:32:46Z)
- CrossZoom: Simultaneously Motion Deblurring and Event Super-Resolving [38.96663258582471]
CrossZoom is a novel unified neural Network (CZ-Net) to jointly recover sharp latent sequences within the exposure period of a blurry input and the corresponding High-Resolution (HR) events.
We present a multi-scale blur-event fusion architecture that leverages the scale-variant properties and effectively fuses cross-modality information to achieve cross-enhancement.
We propose a new dataset containing HR sharp-blurry images and the corresponding HR-LR event streams to facilitate future research.
arXiv Detail & Related papers (2023-09-29T03:27:53Z)
- A Dual-Stream Recurrence-Attention Network With Global-Local Awareness for Emotion Recognition in Textual Dialog [41.72374101704424]
We propose a simple and effective Dual-stream Recurrence-Attention Network (DualRAN).
DualRAN eschews the complex components of current methods and focuses on combining recurrence-based methods with attention-based ones.
We show that DualRAN boosts the weighted F1 scores by 1.43% and 0.64% on the IEMOCAP and MELD datasets, respectively.
arXiv Detail & Related papers (2023-07-02T01:25:47Z)
- Abnormal Event Detection via Hypergraph Contrastive Learning [54.80429341415227]
Abnormal event detection plays an important role in many real applications.
In this paper, we study the unsupervised abnormal event detection problem in Attributed Heterogeneous Information Network.
A novel hypergraph contrastive learning method, named AEHCL, is proposed to fully capture abnormal event patterns.
arXiv Detail & Related papers (2023-04-02T08:23:20Z)
- Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation [79.02808071245634]
Event-based cameras are bio-inspired sensors that capture brightness change of every pixel in an asynchronous manner.
Event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as 3D tensor representation.
Long memory is encoded in the hidden state of adaptive convLSTMs while short memory is modeled by computing spatial-temporal correlation between event pillars.
arXiv Detail & Related papers (2023-03-17T12:12:41Z)
- Event Voxel Set Transformer for Spatiotemporal Representation Learning on Event Streams [19.957857885844838]
Event cameras are neuromorphic vision sensors that record a scene as sparse and asynchronous event streams.
We propose an attention-aware model named Event Voxel Set Transformer (EVSTr) for efficient representation learning on event streams.
Experiments show that EVSTr achieves state-of-the-art performance while maintaining low model complexity.
arXiv Detail & Related papers (2023-03-07T12:48:02Z)
- Learning Constraints and Descriptive Segmentation for Subevent Detection [74.48201657623218]
We propose an approach to learning and enforcing constraints that capture dependencies between subevent detection and EventSeg prediction.
We adopt Rectifier Networks for constraint learning and then convert the learned constraints to a regularization term in the loss function of the neural model.
arXiv Detail & Related papers (2021-09-13T20:50:37Z)
- Full-Duplex Strategy for Video Object Segmentation [141.43983376262815]
Full-Duplex Strategy Network (FSNet) is a novel framework for video object segmentation (VOS).
Our FSNet performs cross-modal feature-passing (i.e., transmission and receiving) simultaneously before the fusion and decoding stage.
We show that our FSNet outperforms other state-of-the-art methods on both the VOS and video salient object detection tasks.
arXiv Detail & Related papers (2021-08-06T14:50:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.