AADNet: Exploring EEG Spatiotemporal Information for Fast and Accurate Orientation and Timbre Detection of Auditory Attention Based on A Cue-Masked Paradigm
- URL: http://arxiv.org/abs/2501.03571v1
- Date: Tue, 07 Jan 2025 06:51:17 GMT
- Title: AADNet: Exploring EEG Spatiotemporal Information for Fast and Accurate Orientation and Timbre Detection of Auditory Attention Based on A Cue-Masked Paradigm
- Authors: Keren Shi, Xu Liu, Xue Yuan, Haijie Shang, Ruiting Dai, Hanbin Wang, Yunfa Fu, Ning Jiang, Jiayuan He
- Abstract summary: Auditory attention decoding from electroencephalogram (EEG) can infer which source the user is attending to in noisy environments.
This study proposed a cue-masked auditory attention paradigm to avoid information leakage before the experiment.
An end-to-end deep learning model, AADNet, was proposed to exploit the spatiotemporal information from short time window EEG signals.
- Score: 4.479495549911642
- License:
- Abstract: Auditory attention decoding from electroencephalogram (EEG) can infer which source the user is attending to in noisy environments. Decoding algorithms and experimental paradigm designs are crucial for the development of this technology in practical applications. To simulate real-world scenarios, this study proposed a cue-masked auditory attention paradigm to avoid information leakage before the experiment. To obtain high decoding accuracy with low latency, an end-to-end deep learning model, AADNet, was proposed to exploit the spatiotemporal information from a short time window of EEG signals. The results showed that with a 0.5-second EEG window, AADNet achieved an average accuracy of 93.46% and 91.09% in decoding auditory orientation attention (OA) and timbre attention (TA), respectively. It significantly outperformed five previous methods and did not require knowledge of the original audio source. This work demonstrated that the orientation and timbre of auditory attention can be detected from EEG signals quickly and accurately. The results are promising for real-time multi-property auditory attention decoding, facilitating the application of neuro-steered hearing aids and other assistive listening devices.
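The abstract does not spell out the network itself, but the idea of an end-to-end spatiotemporal model over a 0.5-second EEG window can be illustrated concretely. The sketch below is a minimal PyTorch example, not the authors' AADNet: the 64-channel montage, 128 Hz sampling rate (so 64 samples per 0.5 s window), layer sizes, and binary class setup are all illustrative assumptions in the style of common end-to-end EEG decoders.

```python
# Minimal sketch of an end-to-end spatiotemporal CNN for short-window EEG
# classification. All hyperparameters are illustrative assumptions: 64 EEG
# channels, 128 Hz sampling, a 0.5 s window (64 samples), 2 attention classes.
import torch
import torch.nn as nn

class ShortWindowEEGNet(nn.Module):
    def __init__(self, n_channels: int = 64, n_samples: int = 64, n_classes: int = 2):
        super().__init__()
        # Temporal convolution: frequency-selective filtering along the time axis.
        self.temporal = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=(1, 17), padding=(0, 8), bias=False),
            nn.BatchNorm2d(16),
        )
        # Spatial convolution: mixes information across all EEG channels at once.
        self.spatial = nn.Sequential(
            nn.Conv2d(16, 32, kernel_size=(n_channels, 1), bias=False),
            nn.BatchNorm2d(32),
            nn.ELU(),
            nn.AvgPool2d((1, 4)),
            nn.Dropout(0.5),
        )
        self.classifier = nn.Linear(32 * (n_samples // 4), n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x.unsqueeze(1)            # (batch, channels, samples) -> (batch, 1, C, T)
        x = self.temporal(x)
        x = self.spatial(x)
        return self.classifier(x.flatten(start_dim=1))

# Usage with one batch of 0.5 s windows:
model = ShortWindowEEGNet()
windows = torch.randn(8, 64, 64)      # 8 windows, 64 channels, 64 samples
logits = model(windows)               # (8, 2): one score per attention class
```

Note that such a decoder takes only the EEG window as input; consistent with the abstract, no reference to the original audio source is needed at inference time.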
Related papers
- Multi-class Decoding of Attended Speaker Direction Using Electroencephalogram and Audio Spatial Spectrum [13.036563238499026]
Decoding the directional focus of an attended speaker from listeners' electroencephalogram (EEG) signals is essential for developing brain-computer interfaces.
By integrating audio spatial spectra with EEG features, the decoding accuracy can be effectively improved.
The proposed Sp-EEG-Deformer model achieves notable 14-class decoding accuracies of 55.35% and 57.19% in leave-one-subject-out and leave-one-trial-out scenarios.
arXiv Detail & Related papers (2024-11-11T12:32:26Z)
- Single-word Auditory Attention Decoding Using Deep Learning Model [9.698931956476692]
Identifying auditory attention by comparing auditory stimuli with the corresponding brain responses is known as auditory attention decoding (AAD).
This paper presents a deep learning approach, based on EEGNet, to address this challenge.
arXiv Detail & Related papers (2024-10-15T21:57:19Z)
- Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models [52.04189118767758]
Generalization is a major issue for current audio deepfake detectors.
In this paper we study the potential of large-scale pre-trained models for audio deepfake detection.
arXiv Detail & Related papers (2024-05-03T15:27:11Z)
- TAnet: A New Temporal Attention Network for EEG-based Auditory Spatial Attention Decoding with a Short Decision Window [2.9033818582958393]
Auditory spatial attention detection (ASAD) is used to determine the direction of a listener's attention to a speaker.
An end-to-end temporal attention network (i.e., TAnet) was introduced in this work.
arXiv Detail & Related papers (2024-01-11T10:36:27Z)
- DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection [49.196182908826565]
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment.
Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images.
This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input.
arXiv Detail & Related papers (2023-09-07T13:43:46Z)
- EchoVest: Real-Time Sound Classification and Depth Perception Expressed through Transcutaneous Electrical Nerve Stimulation [0.0]
We have developed a new assistive device, EchoVest, for blind/deaf people to intuitively become more aware of their environment.
EchoVest transmits vibrations to the user's body by utilizing transcutaneous electric nerve stimulation (TENS) based on the source of the sounds.
We aimed to outperform CNN-based machine-learning models, the most commonly used models for classification tasks, in both accuracy and computational cost.
arXiv Detail & Related papers (2023-07-10T14:43:32Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Little Help Makes a Big Difference: Leveraging Active Learning to Improve Unsupervised Time Series Anomaly Detection [2.1684857243537334]
A large set of anomaly detection algorithms has been deployed for detecting unexpected network incidents.
Unsupervised anomaly detection algorithms often suffer from excessive false alarms.
We propose to use active learning to incorporate and benefit from operator feedback.
arXiv Detail & Related papers (2022-01-25T13:54:19Z)
- Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV).
We find that the difference between the ASV scores for the original and the re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples; a minimal sketch of this check appears after this list.
Our code will be made open source for future work to compare against.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
- Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants.
This paper proposes a Speech Enhancement model adapted to the task of WUW detection.
It aims at increasing the recognition rate and reducing false alarms in the presence of noise.
arXiv Detail & Related papers (2021-01-29T18:44:05Z)
- Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances [53.063441357826484]
Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions.
Speaker verification on short utterances in uncontrolled noisy environment conditions is one of the most challenging and highly demanded tasks.
This paper presents approaches aimed at achieving two goals: a) improve the quality of far-field speaker verification systems in the presence of environmental noise and reverberation, and b) reduce the system quality degradation for short utterances.
arXiv Detail & Related papers (2020-02-14T13:34:33Z)
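The vocoder-based adversarial-sample test described in "Spotting adversarial samples for speaker verification by neural vocoders" above reduces to a simple score comparison. The sketch below illustrates only that decision logic; `embed`, `asv_score`, and `vocoder_resynth` are hypothetical stand-ins for a trained speaker encoder, an ASV scoring backend, and a neural vocoder, and do not reproduce the authors' implementation.

```python
# Sketch of the vocoder score-difference test for spotting adversarial audio.
# All three helpers are placeholders (assumptions), not a real ASV pipeline.
import numpy as np

def embed(wave: np.ndarray) -> np.ndarray:
    # Stand-in speaker encoder; a real system uses a trained x-vector/ECAPA net.
    return wave[:256]

def asv_score(enroll_emb: np.ndarray, test_emb: np.ndarray) -> float:
    # Stand-in ASV backend: cosine similarity between speaker embeddings.
    return float(np.dot(enroll_emb, test_emb)
                 / (np.linalg.norm(enroll_emb) * np.linalg.norm(test_emb)))

def vocoder_resynth(wave: np.ndarray) -> np.ndarray:
    # Stand-in for a neural vocoder analysis/synthesis round trip; a real
    # system would pass the waveform through a trained vocoder.
    return wave.copy()

def looks_adversarial(enroll_wave: np.ndarray, test_wave: np.ndarray,
                      threshold: float = 0.1) -> bool:
    # Genuine audio keeps a similar ASV score after re-synthesis; adversarial
    # perturbations are disrupted by the vocoder, so the score shifts more.
    e = embed(enroll_wave)
    s_orig = asv_score(e, embed(test_wave))
    s_resyn = asv_score(e, embed(vocoder_resynth(test_wave)))
    return abs(s_orig - s_resyn) > threshold

# Toy usage (with the identity "vocoder" above, the scores match exactly):
rng = np.random.default_rng(0)
enroll, test = rng.standard_normal(16000), rng.standard_normal(16000)
print(looks_adversarial(enroll, test))   # False
```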