Efficient Event Stream Super-Resolution with Recursive Multi-Branch Fusion
- URL: http://arxiv.org/abs/2406.19640v1
- Date: Fri, 28 Jun 2024 04:10:21 GMT
- Title: Efficient Event Stream Super-Resolution with Recursive Multi-Branch Fusion
- Authors: Quanmin Liang, Zhilin Huang, Xiawu Zheng, Feidiao Yang, Jun Peng, Kai Huang, Yonghong Tian
- Abstract summary: We propose an efficient Recursive Multi-Branch Information Fusion Network (RMFNet) to separate positive and negative events.
FEM efficiently promotes the fusion and exchange of information between positive and negative branches.
Our approach achieves over 17% and 31% improvement on synthetic and real datasets, accompanied by a 2.3X acceleration.
- Score: 30.746523517295007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current Event Stream Super-Resolution (ESR) methods overlook the redundant and complementary information present in positive and negative events within the event stream, employing a direct mixing approach for super-resolution, which may lead to detail loss and inefficiency. To address these issues, we propose an efficient Recursive Multi-Branch Information Fusion Network (RMFNet) that separates positive and negative events for complementary information extraction, followed by mutual supplementation and refinement. Particularly, we introduce Feature Fusion Modules (FFM) and Feature Exchange Modules (FEM). FFM is designed for the fusion of contextual information within neighboring event streams, leveraging the coupling relationship between positive and negative events to alleviate the misleading of noises in the respective branches. FEM efficiently promotes the fusion and exchange of information between positive and negative branches, enabling superior local information enhancement and global information complementation. Experimental results demonstrate that our approach achieves over 17% and 31% improvement on synthetic and real datasets, accompanied by a 2.3X acceleration. Furthermore, we evaluate our method on two downstream event-driven applications, \emph{i.e.}, object recognition and video reconstruction, achieving remarkable results that outperform existing methods. Our code and Supplementary Material are available at https://github.com/Lqm26/RMFNet.
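The abstract describes separating an event stream by polarity, fusing neighboring context (FFM), and exchanging information between the positive and negative branches (FEM). A minimal NumPy sketch of that two-branch idea is below; the module names come from the abstract, but the actual layer definitions, the mixing weights `alpha`/`beta`, and the event layout `(x, y, t, p)` are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def split_polarity(events):
    """Split an event array with rows (x, y, t, p) into positive
    and negative polarity streams, as RMFNet's front end does."""
    pos = events[events[:, 3] > 0]
    neg = events[events[:, 3] <= 0]
    return pos, neg

def ffm(feat_prev, feat_curr, alpha=0.5):
    """Hypothetical Feature Fusion Module: blend contextual features
    from a neighboring event window into the current one."""
    return alpha * feat_prev + (1 - alpha) * feat_curr

def fem(pos_feat, neg_feat, beta=0.1):
    """Hypothetical Feature Exchange Module: each branch keeps most of
    its own features and receives a fraction beta from the other."""
    pos_out = (1 - beta) * pos_feat + beta * neg_feat
    neg_out = (1 - beta) * neg_feat + beta * pos_feat
    return pos_out, neg_out

# Toy usage: three events, split by polarity, then branch feature exchange.
events = np.array([[0., 0., 0., 1.],
                   [1., 1., 1., -1.],
                   [2., 2., 2., 1.]])
pos, neg = split_polarity(events)
p, n = fem(np.ones((4, 4)), np.zeros((4, 4)), beta=0.1)
```

In the paper the fusion and exchange are learned convolutional modules applied recursively; the weighted sums above only illustrate the information flow between branches.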
Related papers
- Bilateral Event Mining and Complementary for Event Stream Super-Resolution [28.254644673666903]
Event Stream Super-Resolution (ESR) aims to address the challenge of insufficient spatial resolution in event streams.
We propose a bilateral event mining and complementary network (BMCNet) to fully leverage the potential of each event.
Our method significantly enhances the performance of event-based downstream tasks such as object recognition and video reconstruction.
arXiv Detail & Related papers (2024-05-16T12:16:25Z)
- Feature Decoupling-Recycling Network for Fast Interactive Segmentation [79.22497777645806]
Recent interactive segmentation methods iteratively take source image, user guidance and previously predicted mask as the input.
We propose the Feature Decoupling-Recycling Network (FDRN), which decouples the modeling components based on their intrinsic discrepancies.
arXiv Detail & Related papers (2023-08-07T12:26:34Z)
- Efficient Image Super-Resolution with Feature Interaction Weighted Hybrid Network [100.0415874554937]
We propose a Feature Interaction Weighted Hybrid Network (FIWHN) to alleviate the above problem.
Specifically, FIWHN consists of a series of novel Wide-residual Distillation Interaction Blocks (WDIB) as the backbone.
To complement the global features lacking in the CNN model, we introduced the Transformer into our model and explored a new way of combining the CNN and Transformer.
arXiv Detail & Related papers (2022-12-29T05:57:29Z)
- FF2: A Feature Fusion Two-Stream Framework for Punctuation Restoration [27.14686854704104]
We propose a Feature Fusion two-stream framework (FF2) for punctuation restoration.
Specifically, one stream leverages a pre-trained language model to capture the semantic feature, while another auxiliary module captures the feature at hand.
Without additional data, the experimental results on the popular benchmark IWSLT demonstrate that FF2 achieves new SOTA performance.
arXiv Detail & Related papers (2022-11-09T06:18:17Z)
- Magic ELF: Image Deraining Meets Association Learning and Transformer [63.761812092934576]
This paper aims to unify CNN and Transformer to take advantage of their learning merits for image deraining.
A novel multi-input attention module (MAM) is proposed to associate rain removal and background recovery.
Our proposed method (dubbed as ELF) outperforms the state-of-the-art approach (MPRNet) by 0.25 dB on average.
arXiv Detail & Related papers (2022-07-21T12:50:54Z)
- Transformer-based Context Condensation for Boosting Feature Pyramids in Object Detection [77.50110439560152]
Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF).
We propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results.
In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency.
arXiv Detail & Related papers (2022-07-14T01:45:03Z)
- Decoupled Side Information Fusion for Sequential Recommendation [6.515279047538104]
We propose Decoupled Side Information Fusion for Sequential Recommendation (DIF-SR).
It moves the side information from the input to the attention layer and decouples the attention calculation of various side information and item representation.
Our proposed solution stably outperforms state-of-the-art SR models.
arXiv Detail & Related papers (2022-04-23T10:53:36Z)
- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis [96.46952672172021]
The Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
The model takes two bimodal pairs as input due to the known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
- Deep feature selection-and-fusion for RGB-D semantic segmentation [8.831857715361624]
This work proposes a unified and efficient feature selection-and-fusion network (FSFNet).
FSFNet contains a symmetric cross-modality residual fusion module used for explicit fusion of multi-modality information.
Compared with the state-of-the-art methods, experimental evaluations demonstrate that the proposed model achieves competitive performance on two public datasets.
arXiv Detail & Related papers (2021-05-10T04:02:32Z)
- Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such exchanging process is also guaranteed by sharing convolutional filters yet keeping separate BN layers across modalities, which, as an add-on benefit, allows our multimodal architecture to be almost as compact as a unimodal network.
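The summary above describes exchanging channels between modality sub-networks while keeping separate BN layers. A small NumPy sketch of that idea follows: channels whose BN scaling factor is near zero are treated as uninformative and replaced by the other modality's channels. The threshold value and function signature are assumptions for illustration, not the paper's code.

```python
import numpy as np

def channel_exchange(feat_a, feat_b, gamma_a, gamma_b, thresh=0.02):
    """Exchange feature channels between two modalities.

    feat_a, feat_b: feature maps of shape (C, H, W).
    gamma_a, gamma_b: per-channel BN scaling factors of shape (C,);
    a small |gamma| suggests the channel carries little information,
    so it is replaced by the corresponding channel of the other modality.
    """
    out_a, out_b = feat_a.copy(), feat_b.copy()
    swap_a = np.abs(gamma_a) < thresh  # channels to replace in modality A
    swap_b = np.abs(gamma_b) < thresh  # channels to replace in modality B
    out_a[swap_a] = feat_b[swap_a]
    out_b[swap_b] = feat_a[swap_b]
    return out_a, out_b

# Toy usage: channel 1 of modality A has a near-zero BN factor,
# so it receives modality B's channel 1.
a, b = channel_exchange(np.ones((2, 2, 2)), np.zeros((2, 2, 2)),
                        gamma_a=np.array([0.5, 0.0]),
                        gamma_b=np.array([0.3, 0.4]))
```

The exchange itself introduces no parameters; only the shared convolutions and the per-modality BN layers are learned, which is what makes the fusion "parameter-free".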
arXiv Detail & Related papers (2020-11-10T09:53:20Z)
- Dual Semantic Fusion Network for Video Object Detection [35.175552056938635]
We propose a dual semantic fusion network (DSFNet) to fully exploit both frame-level and instance-level semantics in a unified fusion framework without external guidance.
The proposed DSFNet can generate more robust features through the multi-granularity fusion and avoid being affected by the instability of external guidance.
arXiv Detail & Related papers (2020-09-16T06:49:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.