Efficient Gesture Recognition on Spiking Convolutional Networks Through
Sensor Fusion of Event-Based and Depth Data
- URL: http://arxiv.org/abs/2401.17064v1
- Date: Tue, 30 Jan 2024 14:42:35 GMT
- Title: Efficient Gesture Recognition on Spiking Convolutional Networks Through
Sensor Fusion of Event-Based and Depth Data
- Authors: Lea Steffen, Thomas Trapp, Arne Roennau, R\"udiger Dillmann
- Abstract summary: This work proposes a Spiking Convolutional Neural Network, processing event- and depth data for gesture recognition.
The network is simulated using the open-source neuromorphic computing framework LAVA for offline training and evaluation on an embedded system.
- Score: 1.474723404975345
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As intelligent systems become increasingly important in our daily lives, new
ways of interaction are needed. Classical user interfaces pose issues for the
physically impaired and are partially not practical or convenient. Gesture
recognition is an alternative, but often not reactive enough when conventional
cameras are used. This work proposes a Spiking Convolutional Neural Network,
processing event- and depth data for gesture recognition. The network is
simulated using the open-source neuromorphic computing framework LAVA for
offline training and evaluation on an embedded system. For the evaluation three
open source data sets are used. Since these do not represent the applied
bi-modality, a new data set with synchronized event- and depth data was
recorded. The results show the viability of temporal encoding on depth
information and modality fusion, even on differently encoded data, to be
beneficial to network performance and generalization capabilities.
Related papers
- NIDS Neural Networks Using Sliding Time Window Data Processing with Trainable Activations and its Generalization Capability [0.0]
This paper presents neural networks for network intrusion detection systems (NIDS) that operate on flow data preprocessed with a time window.
It requires only eleven features which do not rely on deep packet inspection and can be found in most NIDS datasets and easily obtained from conventional flow collectors.
The reported training accuracy exceeds 99% for the proposed method with as little as twenty neural network input features.
arXiv Detail & Related papers (2024-10-24T11:36:19Z) - Assessing Neural Network Representations During Training Using
Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - Neuro-Symbolic Artificial Intelligence (AI) for Intent based Semantic
Communication [85.06664206117088]
6G networks must consider semantics and effectiveness (at end-user) of the data transmission.
NeSy AI is proposed as a pillar for learning causal structure behind the observed data.
GFlowNet is leveraged for the first time in a wireless system to learn the probabilistic structure which generates the data.
arXiv Detail & Related papers (2022-05-22T07:11:57Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded
Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z) - AttDLNet: Attention-based DL Network for 3D LiDAR Place Recognition [0.6352264764099531]
This paper proposes a novel 3D LiDAR-based deep learning network named AttDLNet.
It exploits an attention mechanism to selectively focus on long-range context and interfeature relationships.
Results show that the encoder network features are already very descriptive, but adding attention to the network further improves performance.
arXiv Detail & Related papers (2021-06-17T16:34:37Z) - Learning from Event Cameras with Sparse Spiking Convolutional Neural
Networks [0.0]
Convolutional neural networks (CNNs) are now the de facto solution for computer vision problems.
We propose an end-to-end biologically inspired approach using event cameras and spiking neural networks (SNNs)
Our method enables the training of sparse spiking neural networks directly on event data, using the popular deep learning framework PyTorch.
arXiv Detail & Related papers (2021-04-26T13:52:01Z) - Towards Improved Human Action Recognition Using Convolutional Neural
Networks and Multimodal Fusion of Depth and Inertial Sensor Data [1.52292571922932]
This paper attempts at improving the accuracy of Human Action Recognition (HAR) by fusion of depth and inertial sensor data.
We transform the depth data into Sequential Front view Images(SFI) and fine-tune the pre-trained AlexNet on these images.
Inertial data is converted into Signal Images (SI) and another convolutional neural network (CNN) is trained on these images.
arXiv Detail & Related papers (2020-08-22T03:41:34Z) - Event-based Asynchronous Sparse Convolutional Networks [54.094244806123235]
Event cameras are bio-inspired sensors that respond to per-pixel brightness changes in the form of asynchronous and sparse "events"
We present a general framework for converting models trained on synchronous image-like event representations into asynchronous models with identical output.
We show both theoretically and experimentally that this drastically reduces the computational complexity and latency of high-capacity, synchronous neural networks.
arXiv Detail & Related papers (2020-03-20T08:39:49Z) - Modality Compensation Network: Cross-Modal Adaptation for Action
Recognition [77.24983234113957]
We propose a Modality Compensation Network (MCN) to explore the relationships of different modalities.
Our model bridges data from source and auxiliary modalities by a modality adaptation block to achieve adaptive representation learning.
Experimental results reveal that MCN outperforms state-of-the-art approaches on four widely-used action recognition benchmarks.
arXiv Detail & Related papers (2020-01-31T04:51:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.