Corticomorphic Hybrid CNN-SNN Architecture for EEG-based Low-footprint
Low-latency Auditory Attention Detection
- URL: http://arxiv.org/abs/2307.08501v1
- Date: Thu, 13 Jul 2023 20:33:39 GMT
- Title: Corticomorphic Hybrid CNN-SNN Architecture for EEG-based Low-footprint
Low-latency Auditory Attention Detection
- Authors: Richard Gall, Deniz Kocanaogullari, Murat Akcakaya, Deniz Erdogmus,
Rajkumar Kubendran
- Abstract summary: In a multi-speaker "cocktail party" scenario, a listener can selectively attend to a speaker of interest.
Current trends in EEG-based auditory attention detection using artificial neural networks (ANN) are not practical for edge-computing platforms.
We propose a hybrid convolutional neural network-spiking neural network (CNN-SNN) architecture, inspired by the auditory cortex.
- Score: 8.549433398954738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a multi-speaker "cocktail party" scenario, a listener can selectively
attend to a speaker of interest. Studies into the human auditory attention
network demonstrate cortical entrainment to speech envelopes resulting in
highly correlated Electroencephalography (EEG) measurements. Current trends in
EEG-based auditory attention detection (AAD) using artificial neural networks
(ANN) are not practical for edge-computing platforms due to longer decision
windows using several EEG channels, with higher power consumption and larger
memory footprint requirements. Nor are ANNs capable of accurately modeling the
brain's top-down attention network since the cortical organization is complex
and layered. In this paper, we propose a hybrid convolutional neural
network-spiking neural network (CNN-SNN) corticomorphic architecture, inspired
by the auditory cortex, which uses EEG data along with multi-speaker speech
envelopes to successfully decode auditory attention with low latency down to 1
second, using only 8 EEG electrodes strategically placed close to the auditory
cortex, at a significantly higher accuracy of 91.03%, compared to the
state-of-the-art. Simultaneously, when compared to a traditional CNN reference
model, our model uses ~15% fewer parameters at a lower bit precision resulting
in ~57% memory footprint reduction. The results show great promise for
edge-computing in brain-embedded devices, like smart hearing aids.
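The spiking layers at the heart of such a hybrid CNN-SNN can be illustrated with a minimal leaky integrate-and-fire (LIF) neuron: inputs are integrated into a leaky membrane potential, and a binary spike is emitted on each threshold crossing. This is a generic sketch of spiking dynamics; the leak, threshold, and input values are illustrative assumptions, not parameters from the paper.

```python
# Minimal leaky integrate-and-fire (LIF) neuron, the basic unit of an
# SNN layer. All constants here are illustrative, not from the paper.

def lif_simulate(inputs, leak=0.9, threshold=1.0, v_reset=0.0):
    """Simulate one LIF neuron over a sequence of input currents.

    Returns the binary spike train: 1 where the membrane potential
    crossed the threshold, 0 elsewhere.
    """
    v = v_reset
    spikes = []
    for i in inputs:
        v = leak * v + i          # leaky integration of the input current
        if v >= threshold:        # threshold crossing emits a spike
            spikes.append(1)
            v = v_reset           # hard reset after spiking
        else:
            spikes.append(0)
    return spikes

# A constant drive of 0.3 accumulates until the threshold is crossed,
# then the potential resets and integration restarts.
train = lif_simulate([0.3] * 10)  # -> [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```

Because each neuron communicates only sparse binary events instead of dense activations, layers built from such units can run at low power on neuromorphic or edge hardware.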
Related papers
- sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection
with Spiking Neural Networks [51.516451451719654]
Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient.
This paper introduces a novel SNN-based Voice Activity Detection model, referred to as sVAD.
It provides effective auditory feature representation through SincNet and 1D convolution, and improves noise robustness with attention mechanisms.
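A SincNet-style front-end, as used above, constrains each first-layer convolutional kernel to a band-pass sinc filter so that only the two cutoff frequencies are learned. A minimal sketch of building one such kernel (filter length, cutoffs, and the Hamming window are illustrative assumptions):

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def sincnet_bandpass(f1, f2, length=101):
    """Band-pass FIR kernel parameterized only by its cutoff frequencies
    (normalized to [0, 0.5]): the difference of two low-pass sinc
    filters, tapered by a Hamming window."""
    assert 0 <= f1 < f2 <= 0.5
    mid = length // 2
    kernel = []
    for n in range(length):
        t = n - mid
        # Difference of two ideal low-pass filters gives a band-pass.
        h = 2 * f2 * sinc(2 * f2 * t) - 2 * f1 * sinc(2 * f1 * t)
        # Hamming window tapers the truncated sinc to reduce ripple.
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        kernel.append(h * w)
    return kernel

# The center tap equals 2 * (f2 - f1); here 2 * (0.2 - 0.1) = 0.2.
k = sincnet_bandpass(0.1, 0.2)
```

Training then adjusts only `f1` and `f2` per kernel, which is why such front-ends need far fewer parameters than free-form convolutions.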
arXiv Detail & Related papers (2024-03-09T02:55:44Z)
- Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks [53.31894108974566]
Spiking-LEAF is a learnable auditory front-end meticulously designed for SNN-based speech processing.
On keyword spotting and speaker identification tasks, the proposed Spiking-LEAF outperforms SOTA spiking auditory front-ends.
arXiv Detail & Related papers (2023-09-18T04:03:05Z)
- ElectrodeNet -- A Deep Learning Based Sound Coding Strategy for Cochlear Implants [9.468136300919062]
ElectrodeNet is a deep learning based sound coding strategy for the cochlear implant (CI).
The extended ElectrodeNet-CS strategy further incorporates channel selection (CS).
Deep neural network (DNN), convolutional neural network (CNN), and long short-term memory (LSTM) models were trained using the Fast Fourier Transform (FFT) bins and channel envelopes obtained from processing clean speech with the ACE strategy.
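The FFT bins feeding such models are just the magnitudes of the discrete Fourier transform of each audio frame. A minimal sketch (a naive O(N^2) DFT for illustration; a real front-end would use an FFT; the frame length and test tone are illustrative assumptions):

```python
import cmath
import math

def dft_bins(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame --
    the kind of spectral input a DNN/CNN/LSTM sound coder consumes."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

# A pure cosine at bin 3 of a 16-sample frame concentrates its energy
# in that bin (magnitude N/2 = 8); all other bins are near zero.
frame = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]
mags = dft_bins(frame)
```

Per-channel envelopes are then typically obtained by grouping these bin magnitudes into the implant's frequency bands.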
arXiv Detail & Related papers (2023-05-26T09:06:04Z)
- Mental arithmetic task classification with convolutional neural network based on spectral-temporal features from EEG [0.47248250311484113]
Deep neural networks (DNN) show significant advantages in computer vision applications.
We present a shallow neural network that uses mainly two convolutional layers, with relatively few parameters, to quickly learn spectral-temporal features from EEG.
Experimental results showed that the shallow CNN model outperformed all the other models and achieved the highest classification accuracy of 90.68%.
arXiv Detail & Related papers (2022-09-26T02:15:22Z)
- EEG-BBNet: a Hybrid Framework for Brain Biometric using Graph Connectivity [1.1498015270151059]
We present EEG-BBNet, a hybrid network which integrates convolutional neural networks (CNN) with graph convolutional neural networks (GCNN)
Our models outperform all baselines in the event-related potential (ERP) task with average correct recognition rates of up to 99.26% using intra-session data.
arXiv Detail & Related papers (2022-08-17T10:18:22Z)
- Convolutional Spiking Neural Networks for Detecting Anticipatory Brain Potentials Using Electroencephalogram [0.21847754147782888]
Spiking neural networks (SNNs) are receiving increased attention because they mimic synaptic connections in biological systems and produce spike trains.
Recently, convolutional layers have been added to combine the feature extraction power of convolutional networks with the computational efficiency of SNNs.
This paper studies the feasibility of using a convolutional spiking neural network (CSNN) to detect anticipatory slow cortical potentials.
arXiv Detail & Related papers (2022-08-14T19:04:15Z)
- Braille Letter Reading: A Benchmark for Spatio-Temporal Pattern Recognition on Neuromorphic Hardware [50.380319968947035]
Recent deep learning approaches have reached high accuracy in such tasks, but their implementation on conventional embedded solutions is still computationally and energy expensive.
We propose a new benchmark for computing tactile pattern recognition at the edge through letters reading.
We trained and compared feed-forward and recurrent spiking neural networks (SNNs) offline using back-propagation through time with surrogate gradients, then deployed them on the Intel Loihi neuromorphic chip for efficient inference.
Our results show that the LSTM outperforms the recurrent SNN in terms of accuracy by 14%. However, the recurrent SNN on Loihi is 237 times more energy efficient.
arXiv Detail & Related papers (2022-05-30T14:30:45Z)
- Extracting the Locus of Attention at a Cocktail Party from Single-Trial EEG using a Joint CNN-LSTM Model [0.1529342790344802]
The human brain performs remarkably well in segregating a particular speaker from interfering speakers in a multi-speaker scenario.
We present a joint convolutional neural network (CNN) - long short-term memory (LSTM) model to infer the auditory attention.
arXiv Detail & Related papers (2021-02-08T01:06:48Z)
- Neural Architecture Search For LF-MMI Trained Time Delay Neural Networks [61.76338096980383]
A range of neural architecture search (NAS) techniques are used to automatically learn two types of hyperparameters of state-of-the-art factored time delay neural networks (TDNNs)
These include the DARTS method integrating architecture selection with lattice-free MMI (LF-MMI) TDNN training.
Experiments conducted on a 300-hour Switchboard corpus suggest the auto-configured systems consistently outperform the baseline LF-MMI TDNN systems.
arXiv Detail & Related papers (2020-07-17T08:32:11Z)
- AutoSpeech: Neural Architecture Search for Speaker Recognition [108.69505815793028]
We propose the first neural architecture search approach for speaker recognition tasks, named AutoSpeech.
Our algorithm first identifies the optimal operation combination in a neural cell and then derives a CNN model by stacking the neural cell multiple times.
Results demonstrate that the derived CNN architectures significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 back-bones, while enjoying lower model complexity.
arXiv Detail & Related papers (2020-05-07T02:53:47Z)
- Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at two goals: a) improving the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reducing system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.