Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition
- URL: http://arxiv.org/abs/2503.17132v2
- Date: Thu, 27 Mar 2025 11:35:37 GMT
- Title: Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition
- Authors: Siyuan Yang, Shilin Lu, Shizheng Wang, Meng Hwa Er, Zengwei Zheng, Alex C. Kot
- Abstract summary: This paper explores the promising interplay between spiking neural networks (SNNs) and event-based cameras for privacy-preserving human action recognition (HAR). We introduce two novel frameworks to address SNNs' limited ability to process long-term temporal information: temporal segment-based SNN (TS-SNN) and 3D convolutional SNN (3D-SNN). To promote further research in event-based HAR, we create a dataset, FallingDetection-CeleX, collected using the high-resolution CeleX-V event camera.
- Score: 31.528007074074043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper explores the promising interplay between spiking neural networks (SNNs) and event-based cameras for privacy-preserving human action recognition (HAR). The unique feature of event cameras in capturing only the outlines of motion, combined with SNNs' proficiency in processing spatiotemporal data through spikes, establishes a highly synergistic compatibility for event-based HAR. Previous studies, however, have been limited by SNNs' ability to process long-term temporal information, essential for precise HAR. In this paper, we introduce two novel frameworks to address this: temporal segment-based SNN (\textit{TS-SNN}) and 3D convolutional SNN (\textit{3D-SNN}). The \textit{TS-SNN} extracts long-term temporal information by dividing actions into shorter segments, while the \textit{3D-SNN} replaces 2D spatial elements with 3D components to facilitate the transmission of temporal information. To promote further research in event-based HAR, we create a dataset, \textit{FallingDetection-CeleX}, collected using the high-resolution CeleX-V event camera $(1280 \times 800)$, comprising 7 distinct actions. Extensive experimental results show that our proposed frameworks surpass state-of-the-art SNN methods on our newly collected dataset and three other neuromorphic datasets, showcasing their effectiveness in handling long-range temporal information for event-based HAR.
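The segment-based idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' TS-SNN: the function names (`events_to_segments`, `lif_spike_counts`), the count-based event binning, the leaky integrate-and-fire (LIF) parameters `tau` and `v_th`, and the averaging of per-segment spike counts are all assumptions made for illustration. The abstract only states that actions are divided into shorter segments.

```python
import numpy as np

def events_to_segments(t, x, y, p, num_segments, bins_per_segment, height, width):
    """Split an event stream into equal-duration segments and bin each into frames.

    t, x, y, p are 1-D arrays (timestamp, column, row, polarity in {0, 1}).
    Returns an array of shape (num_segments, bins_per_segment, 2, height, width).
    """
    t = np.asarray(t, dtype=np.float64)
    total_bins = num_segments * bins_per_segment
    span = max(t.max() - t.min(), 1e-9)  # avoid division by zero
    # Map each timestamp to a global bin index in [0, total_bins - 1].
    b = np.minimum(((t - t.min()) / span * total_bins).astype(int), total_bins - 1)
    frames = np.zeros((total_bins, 2, height, width), dtype=np.float32)
    # Accumulate event counts; np.add.at handles repeated indices correctly.
    np.add.at(frames, (b, np.asarray(p), np.asarray(y), np.asarray(x)), 1.0)
    return frames.reshape(num_segments, bins_per_segment, 2, height, width)

def lif_spike_counts(inputs, tau=2.0, v_th=1.0):
    """Run LIF neurons over time axis 0 and return the spike count per neuron."""
    v = np.zeros_like(inputs[0])
    counts = np.zeros_like(inputs[0])
    for x_t in inputs:
        v = v + (x_t - v) / tau                 # leaky integration
        spikes = (v >= v_th).astype(inputs.dtype)
        v = v * (1.0 - spikes)                  # hard reset after a spike
        counts += spikes
    return counts

# Hypothetical usage on synthetic events (sensor size chosen arbitrarily).
rng = np.random.default_rng(0)
n = 5000
t = np.sort(rng.uniform(0.0, 1.0, n))
x = rng.integers(0, 32, n)
y = rng.integers(0, 24, n)
p = rng.integers(0, 2, n)
segments = events_to_segments(t, x, y, p, num_segments=4,
                              bins_per_segment=5, height=24, width=32)
# Each short segment is processed independently, then results are averaged
# (assumed consensus step, in the spirit of temporal segment networks).
per_segment = np.stack([lif_spike_counts(seg) for seg in segments])
consensus = per_segment.mean(axis=0)
```

Processing each segment with a fresh membrane state keeps the number of simulation steps per pass short, while the consensus step combines evidence from the whole action. A 3D-convolutional variant would instead mix information across the time-bin axis inside each layer.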
Related papers
- Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions [0.0]
Spiking Neural Networks (SNNs) are a class of network models capable of processing temporal information.
This paper focuses on enhancing SNNs' unique ability to process temporal information.
To improve SNNs' handling of temporal information, this paper proposes replacing traditional 2D convolutions with 3D convolutions.
arXiv Detail & Related papers (2024-12-23T15:32:26Z) - Enhancing SNN-based Spatio-Temporal Learning: A Benchmark Dataset and Cross-Modality Attention Model [30.66645039322337]
High-quality benchmark datasets are of great importance to the advances of Spiking Neural Networks (SNNs).
Yet, the SNN-based cross-modal fusion remains underexplored.
In this work, we present a neuromorphic dataset that can better exploit the inherent spatio-temporal properties of SNNs.
arXiv Detail & Related papers (2024-10-21T06:59:04Z) - Towards Low-latency Event-based Visual Recognition with Hybrid Step-wise Distillation Spiking Neural Networks [50.32980443749865]
Spiking neural networks (SNNs) have garnered significant attention for their low power consumption and high biological interpretability.
Current SNNs struggle to balance accuracy and latency in neuromorphic datasets.
We propose the Hybrid Step-wise Distillation (HSD) method, tailored for neuromorphic datasets.
arXiv Detail & Related papers (2024-09-19T06:52:34Z) - SFOD: Spiking Fusion Object Detector [10.888008544975662]
Spiking Fusion Object Detector (SFOD) is a simple and efficient approach to SNN-based object detection.
We design a Spiking Fusion Module, achieving the first-time fusion of feature maps from different scales in SNNs applied to event cameras.
We establish state-of-the-art classification results based on SNNs, achieving 93.7% accuracy on the NCAR dataset.
arXiv Detail & Related papers (2024-03-22T13:24:50Z) - Efficient and Effective Time-Series Forecasting with Spiking Neural Networks [47.371024581669516]
Spiking neural networks (SNNs) provide a unique pathway for capturing the intricacies of temporal data.
Applying SNNs to time-series forecasting is challenging due to difficulties in effective temporal alignment, complexities in encoding processes, and the absence of standardized guidelines for model selection.
We propose a framework for SNNs in time-series forecasting tasks, leveraging the efficiency of spiking neurons in processing temporal information.
arXiv Detail & Related papers (2024-02-02T16:23:50Z) - Event-based Human Pose Tracking by Spiking Spatiotemporal Transformer [20.188995900488717]
We present a dedicated end-to-end sparse deep approach for event-based pose tracking.
This is the first time that 3D human pose tracking is obtained from events only.
Our approach also achieves a significant reduction of 80% in FLOPS.
arXiv Detail & Related papers (2023-03-16T22:56:12Z) - Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding [61.57847727651068]
Temporal sentence grounding aims to localize a target segment in an untrimmed video semantically according to a given sentence query.
Most previous works focus on learning frame-level features of each whole frame in the entire video, and directly match them with the textual information.
We propose a novel Motion- and Appearance-guided 3D Semantic Reasoning Network (MA3SRN), which incorporates optical-flow-guided motion-aware, detection-based appearance-aware, and 3D-aware object-level features.
arXiv Detail & Related papers (2022-03-06T13:57:09Z) - Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames.
Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks.
We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z) - SpikeMS: Deep Spiking Neural Network for Motion Segmentation [7.491944503744111]
SpikeMS is the first deep encoder-decoder SNN architecture for the real-world large-scale problem of motion segmentation.
We show that SpikeMS is capable of incremental predictions, or predictions from smaller amounts of test data than it is trained on.
arXiv Detail & Related papers (2021-05-13T21:34:55Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps, respectively.
arXiv Detail & Related papers (2020-03-11T08:45:53Z) - Event-Based Angular Velocity Regression with Spiking Networks [51.145071093099396]
Spiking Neural Networks (SNNs) process information conveyed as temporal spikes rather than numeric values.
We propose, for the first time, a temporal regression problem of numerical values given events from an event camera.
We show that we can successfully train an SNN to perform angular velocity regression.
arXiv Detail & Related papers (2020-03-05T17:37:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.