TAnet: A New Temporal Attention Network for EEG-based Auditory Spatial Attention Decoding with a Short Decision Window
- URL: http://arxiv.org/abs/2401.05819v2
- Date: Tue, 14 May 2024 13:06:26 GMT
- Title: TAnet: A New Temporal Attention Network for EEG-based Auditory Spatial Attention Decoding with a Short Decision Window
- Authors: Yuting Ding, Fei Chen
- Abstract summary: Auditory spatial attention detection (ASAD) is used to determine the direction of a listener's attention to a speaker.
An end-to-end temporal attention network (i.e., TAnet) was introduced in this work.
- Score: 2.9033818582958393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Auditory spatial attention detection (ASAD) is used to determine the direction of a listener's attention to a speaker by analyzing her/his electroencephalographic (EEG) signals. This study aimed to further improve the performance of ASAD with a short decision window (i.e., <1 s) rather than with long decision windows ranging from 1 to 5 seconds in previous studies. An end-to-end temporal attention network (i.e., TAnet) was introduced in this work. TAnet employs a multi-head attention (MHA) mechanism, which can more effectively capture the interactions among time steps in collected EEG signals and efficiently assign corresponding weights to those EEG time steps. Experiments demonstrated that, compared with the CNN-based method and recent ASAD methods, TAnet provided improved decoding performance in the KUL dataset, with decoding accuracies of 92.4% (decision window 0.1 s), 94.9% (0.25 s), 95.1% (0.3 s), 95.4% (0.4 s), and 95.5% (0.5 s) with short decision windows (i.e., <1 s). As a new ASAD model with a short decision window, TAnet can potentially facilitate the design of EEG-controlled intelligent hearing aids and sound recognition systems.
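The abstract describes two ingredients: cutting the EEG into short decision windows and applying multi-head attention across the time steps of each window. The sketch below illustrates both with generic scaled dot-product self-attention in plain Python. It is not the TAnet architecture itself, which the abstract does not specify in detail. The 128 Hz sampling rate, the 64 channels, the random projection matrices, and the function names `split_windows` and `temporal_self_attention` are all assumptions made for illustration, and the usual output projection of multi-head attention is left out for brevity.

```python
import math
import random


def split_windows(eeg, fs, window_s):
    """Cut a (time x channels) recording into non-overlapping decision windows."""
    size = max(1, round(fs * window_s))
    return [eeg[i:i + size] for i in range(0, len(eeg) - size + 1, size)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_self_attention(window, w_q, w_k, w_v, num_heads):
    """Multi-head scaled dot-product self-attention across time steps.

    window: T x C samples; w_q, w_k, w_v: C x D projections (D divisible
    by num_heads). Returns the T x D output and, per head, a T x T matrix
    whose row t holds the weights time step t assigns to every time step.
    """
    q, k, v = matmul(window, w_q), matmul(window, w_k), matmul(window, w_v)
    d_head = len(w_q[0]) // num_heads
    outputs = [[] for _ in window]
    all_weights = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        k_t = [list(col) for col in zip(*kh)]  # transpose to d_head x T
        scores = matmul(qh, k_t)               # T x T similarities
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_out = matmul(weights, vh)         # weighted mix of time steps
        for t, row in enumerate(head_out):
            outputs[t].extend(row)             # concatenate the heads
        all_weights.append(weights)
    return outputs, all_weights


def rand_matrix(rows, cols):
    """Random projection standing in for learned parameters (assumption)."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
fs, channels = 128, 64          # assumed sampling rate and channel count
d_model, heads = 8, 2           # illustrative sizes only
eeg = [[random.gauss(0.0, 1.0) for _ in range(channels)] for _ in range(fs)]  # 1 s of synthetic data
windows = split_windows(eeg, fs, 0.25)  # 0.25 s decision windows

w_q, w_k, w_v = (rand_matrix(channels, d_model) for _ in range(3))
out, weights = temporal_self_attention(windows[0], w_q, w_k, w_v, heads)
print(len(windows), len(out), len(out[0]), len(weights))
```

Each row of a head's weight matrix sums to 1, so it can be read as how strongly one time step attends to every other time step in the window. In a trained model the projections would be learned and the attention output would feed a classifier that predicts the attended direction.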
Related papers
- AI-in-the-Loop Sensing and Communication Joint Design for Edge Intelligence [65.29835430845893]
We propose a framework that enhances edge intelligence through AI-in-the-loop joint sensing and communication.
A key contribution of our work is establishing an explicit relationship between validation loss and the system's tunable parameters.
We show that our framework reduces communication energy consumption by up to 77 percent and sensing costs measured by the number of samples by up to 52 percent.
arXiv Detail & Related papers (2025-02-14T14:56:58Z)
- CEReBrO: Compact Encoder for Representations of Brain Oscillations Using Efficient Alternating Attention [53.539020807256904]
We introduce a Compact Encoder for Representations of Brain Oscillations using alternating attention (CEReBrO).
Our tokenization scheme represents EEG signals as per-channel patches.
We propose an alternating attention mechanism that jointly models intra-channel temporal dynamics and inter-channel spatial correlations, achieving 2x speed improvement with 6x less memory required compared to standard self-attention.
arXiv Detail & Related papers (2025-01-18T21:44:38Z)
- AADNet: Exploring EEG Spatiotemporal Information for Fast and Accurate Orientation and Timbre Detection of Auditory Attention Based on A Cue-Masked Paradigm [4.479495549911642]
Auditory attention decoding from electroencephalogram (EEG) could infer to which source the user is attending in noisy environments.
This study proposed a cue-masked auditory attention paradigm to avoid information leakage before the experiment.
An end-to-end deep learning model, AADNet, was proposed to exploit the temporal information from the short time window EEG signals.
arXiv Detail & Related papers (2025-01-07T06:51:17Z)
- DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection [49.196182908826565]
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment.
Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images.
This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input.
arXiv Detail & Related papers (2023-09-07T13:43:46Z)
- Corticomorphic Hybrid CNN-SNN Architecture for EEG-based Low-footprint Low-latency Auditory Attention Detection [8.549433398954738]
In a multi-speaker "cocktail party" scenario, a listener can selectively attend to a speaker of interest.
Current trends in EEG-based auditory attention detection using artificial neural networks (ANN) are not practical for edge-computing platforms.
We propose a hybrid convolutional neural network-spiking neural network (CNN-SNN) architecture, inspired by the auditory cortex.
arXiv Detail & Related papers (2023-07-13T20:33:39Z)
- Exploring linguistic feature and model combination for speech recognition based automatic AD detection [61.91708957996086]
Speech based automatic AD screening systems provide a non-intrusive and more scalable alternative to other clinical screening techniques.
Scarcity of specialist data leads to uncertainty in both model selection and feature learning when developing such systems.
This paper investigates the use of feature and model combination approaches to improve the robustness of domain fine-tuning of BERT and RoBERTa pre-trained text encoders.
arXiv Detail & Related papers (2022-06-28T05:09:01Z)
- SOUL: An Energy-Efficient Unsupervised Online Learning Seizure Detection Classifier [68.8204255655161]
Implantable devices that record neural activity and detect seizures have been adopted to issue warnings or trigger neurostimulation to suppress seizures.
For an implantable seizure detection system, a low power, at-the-edge, online learning algorithm can be employed to dynamically adapt to neural signal drifts.
SOUL was fabricated in TSMC's 28 nm process occupying 0.1 mm² and achieves 1.5 nJ/classification energy efficiency, which is at least 24x more efficient than state-of-the-art.
arXiv Detail & Related papers (2021-10-01T23:01:20Z)
- WNARS: WFST based Non-autoregressive Streaming End-to-End Speech Recognition [59.975078145303605]
We propose a novel framework, namely WNARS, using hybrid CTC-attention AED models and weighted finite-state transducers.
On the AISHELL-1 task, our WNARS achieves a character error rate of 5.22% with 640 ms latency, which, to the best of our knowledge, is the state-of-the-art performance for online ASR.
arXiv Detail & Related papers (2021-04-08T07:56:03Z)
- DENS-ECG: A Deep Learning Approach for ECG Signal Delineation [15.648061765081264]
This paper proposes a deep learning model for real-time segmentation of heartbeats.
The proposed algorithm, named DENS-ECG, combines a convolutional neural network (CNN) and a long short-term memory (LSTM) model.
arXiv Detail & Related papers (2020-05-18T13:13:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.