Related papers: EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction

EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction

URL: http://arxiv.org/abs/2503.19721v3
Date: Tue, 01 Apr 2025 02:49:17 GMT
Title: EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction
Authors: Chengjie Ge, Xueyang Fu, Peng He, Kunyu Wang, Chengzhi Cao, Zheng-Jun Zha,
Abstract summary: EventMamba is a specialized model designed for event-based video reconstruction tasks.<n>We show that EventMamba markedly improves speed while delivering superior visual quality compared to Transformer-based methods.
Score: 66.84997711357101
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Leveraging its robust linear global modeling capability, Mamba has notably excelled in computer vision. Despite its success, existing Mamba-based vision models have overlooked the nuances of event-driven tasks, especially in video reconstruction. Event-based video reconstruction (EBVR) demands spatial translation invariance and close attention to local event relationships in the spatio-temporal domain. Unfortunately, conventional Mamba algorithms apply static window partitions and standard reshape scanning methods, leading to significant losses in local connectivity. To overcome these limitations, we introduce EventMamba--a specialized model designed for EBVR tasks. EventMamba innovates by incorporating random window offset (RWO) in the spatial domain, moving away from the restrictive fixed partitioning. Additionally, it features a new consistent traversal serialization approach in the spatio-temporal domain, which maintains the proximity of adjacent events both spatially and temporally. These enhancements enable EventMamba to retain Mamba's robust modeling capabilities while significantly preserving the spatio-temporal locality of event data. Comprehensive testing on multiple datasets shows that EventMamba markedly enhances video reconstruction, drastically improving computation speed while delivering superior visual quality compared to Transformer-based methods.

Related papers

TSkel-Mamba: Temporal Dynamic Modeling via State Space Model for Human Skeleton-based Action Recognition [59.99922360648663]
TSkel-Mamba is a hybrid Transformer-Mamba framework that effectively captures both spatial and temporal dynamics.<n>The MTI module employs multi-scale Cycle operators to capture cross-channel temporal interactions, a critical factor in action recognition.
arXiv Detail & Related papers (2025-12-12T11:55:16Z)
Gather-Scatter Mamba: Accelerating Propagation with Efficient State Space Model [15.551773379039675]
State Space Models (SSMs) have historically played a central role in sequential modeling.<n>Recent advances in selective SSMs like Mamba offer a compelling alternative.<n>We propose a hybrid architecture that combines shifted window self-attention for spatial context aggregation with Mamba-based selective scanning for efficient temporal propagation.
arXiv Detail & Related papers (2025-10-01T13:11:13Z)
CETUS: Causal Event-Driven Temporal Modeling With Unified Variable-Rate Scheduling [18.82030002020162]
Event cameras capture asynchronous pixel-level brightness changes with microsecond temporal resolution.<n>Existing methods often convert event streams into intermediate representations such as frames, voxel grids, or point clouds.<n>We propose the Variable-Rate Spatial Event Mamba, a novel architecture that directly processes raw event streams without intermediate representations.
arXiv Detail & Related papers (2025-09-17T07:55:37Z)
Trajectory-aware Shifted State Space Models for Online Video Super-Resolution [57.87099307245989]
This paper presents a novel online VSR method based on Trajectory-aware Shifted SSMs (TS-Mamba)<n>TS-Mamba first constructs the trajectories within a video to select the most similar tokens from the previous frames.<n>Our TS-Mamba achieves state-of-the-art performance in most cases and over 22.7% reduction complexity (in MACs)
arXiv Detail & Related papers (2025-08-14T08:42:15Z)
Damba-ST: Domain-Adaptive Mamba for Efficient Urban Spatio-Temporal Prediction [27.924276998605816]
We propose Damba-ST, a domain-adaptive Mamba-based model for efficient urban-temporal prediction.<n>Damba-ST retains Mamba's linear complexity while significantly enhancing complexity to heterogeneous domains.<n>It achieves state-of-the-art performance on prediction tasks and demonstrates strong zero-shot generalization.
arXiv Detail & Related papers (2025-06-22T15:40:01Z)
MambaFlow: A Novel and Flow-guided State Space Model for Scene Flow Estimation [5.369567679302849]
We propose Mamba, a novel scene flow estimation network with a mamba-based decoder.<n>MambaFlow achieves state-of-the-art performance with real-time inference speed among existing works.<n>Experiments on the Argoverse 2 benchmark demonstrate that MambaFlow achieves state-of-the-art performance with real-time inference speed.
arXiv Detail & Related papers (2025-02-24T07:05:49Z)
STNMamba: Mamba-based Spatial-Temporal Normality Learning for Video Anomaly Detection [48.997518615379995]
Video anomaly detection (VAD) has been extensively researched due to its potential for intelligent video systems.<n>Most existing methods based on CNNs and transformers still suffer from substantial computational burdens.<n>We propose a lightweight and effective Mamba-based network named STNMamba to enhance the learning of spatial-temporal normality.
arXiv Detail & Related papers (2024-12-28T08:49:23Z)
Mamba-CL: Optimizing Selective State Space Model in Null Space for Continual Learning [54.19222454702032]
Continual Learning aims to equip AI models with the ability to learn a sequence of tasks over time, without forgetting previously learned knowledge. State Space Models (SSMs) have achieved notable success in computer vision. We introduce Mamba-CL, a framework that continuously fine-tunes the core SSMs of the large-scale Mamba foundation model.
arXiv Detail & Related papers (2024-11-23T06:36:16Z)
MambaVT: Spatio-Temporal Contextual Modeling for robust RGB-T Tracking [51.28485682954006]
We propose a pure Mamba-based framework (MambaVT) to fully exploit intrinsic-temporal contextual modeling for robust visible-thermal tracking. Specifically, we devise the long-range cross-frame integration component to globally adapt to target appearance variations. Experiments show the significant potential of vision Mamba for RGB-T tracking, with MambaVT achieving state-of-the-art performance on four mainstream benchmarks.
arXiv Detail & Related papers (2024-08-15T02:29:00Z)
Cross-Scan Mamba with Masked Training for Robust Spectral Imaging [51.557804095896174]
We propose the Cross-Scanning Mamba, named CS-Mamba, that employs a Spatial-Spectral SSM for global-local balanced context encoding. Experiment results show that our CS-Mamba achieves state-of-the-art performance and the masked training method can better reconstruct smooth features to improve the visual quality.
arXiv Detail & Related papers (2024-08-01T15:14:10Z)
DiM-Gesture: Co-Speech Gesture Generation with Adaptive Layer Normalization Mamba-2 framework [2.187990941788468]
generative model crafted to create highly personalized 3D full-body gestures solely from raw speech audio. Model integrates a Mamba-based fuzzy feature extractor with a non-autoregressive Adaptive Layer Normalization (AdaLN) Mamba-2 diffusion architecture.
arXiv Detail & Related papers (2024-08-01T08:22:47Z)
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba [11.400397931501338]
Event cameras efficiently detect changes in ambient light with low latency and high dynamic range while consuming minimal power. Most current approach to processing event data often involves converting it into frame-based representations. Point Cloud is a popular representation for 3D processing and is better suited to match the sparse and asynchronous nature of the event camera. We propose EventMamba, an efficient and effective Point Cloud framework that achieves competitive results even compared to the state-of-the-art (SOTA) frame-based method.
arXiv Detail & Related papers (2024-05-09T21:47:46Z)
MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking [50.26836546224782]
Event-based eye tracking has shown great promise with the high temporal resolution and low redundancy. The diversity and abruptness of eye movement patterns, including blinking, fixating, saccades, and smooth pursuit, pose significant challenges for eye localization. This paper proposes a bidirectional long-term sequence modeling and time-varying state selection mechanism to fully utilize contextual temporal information.
arXiv Detail & Related papers (2024-04-18T11:09:25Z)
MambaIR: A Simple Baseline for Image Restoration with State-Space Model [46.827053426281715]
We introduce MambaIR, which introduces both local enhancement and channel attention to improve the vanilla Mamba. Our method outperforms SwinIR by up to 0.45dB on image SR, using similar computational cost but with a global receptive field.
arXiv Detail & Related papers (2024-02-23T23:15:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.