PhyAAt: Physiology of Auditory Attention to Speech Dataset
- URL: http://arxiv.org/abs/2005.11577v1
- Date: Sat, 23 May 2020 17:55:18 GMT
- Title: PhyAAt: Physiology of Auditory Attention to Speech Dataset
- Authors: Nikesh Bajaj, Jesús Requena Carrión, Francesco Bellotti
- Abstract summary: Auditory attention to natural speech is a complex brain process.
We present a dataset of physiological signals collected from an experiment on auditory attention to natural speech.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Auditory attention to natural speech is a complex brain process. Its
quantification from physiological signals can be valuable for improving and
widening the range of applications of current brain-computer interface systems;
however, it remains a challenging task. In this article, we present a dataset of
physiological signals collected from an experiment on auditory attention to
natural speech. In this experiment, auditory stimuli consisting of
reproductions of English sentences in different auditory conditions were
presented to 25 non-native participants, who were asked to transcribe the
sentences. During the experiment, 14 channel electroencephalogram, galvanic
skin response, and photoplethysmogram signals were collected from each
participant. Based on the number of correctly transcribed words, an attention
score was obtained for each auditory stimulus presented to subjects. A strong
correlation ($p \ll 0.0001$) between the attention score and the auditory
conditions was found. We also formulate four different predictive tasks
involving the collected dataset and develop a feature extraction framework. The
results for each predictive task are obtained using a Support Vector Machine
with spectral features, and are better than chance level. The dataset has been
made publicly available for further research, along with a Python library,
phyaat, to facilitate the preprocessing, modeling, and reproduction of the
results presented in this paper. The dataset and other resources are shared at
https://phyaat.github.io.
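The baseline described above (an SVM on spectral features) can be sketched as follows. This is a hypothetical illustration, not the authors' pipeline: it uses synthetic data in place of the PhyAAt recordings, assumes a 128 Hz sampling rate and theta/alpha/beta band powers as the spectral features, and does not use the phyaat library.

```python
# Hedged sketch: band-power spectral features + SVM classifier.
# Synthetic stand-in data; the real experiments use the PhyAAt recordings.
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
fs = 128  # EEG sampling rate in Hz (an assumption for this sketch)

def band_powers(epoch, fs, bands=((4, 8), (8, 13), (13, 30))):
    """Mean PSD per channel in theta, alpha, and beta bands (Welch estimate)."""
    freqs, psd = welch(epoch, fs=fs, nperseg=fs, axis=-1)
    return np.concatenate(
        [psd[:, (freqs >= lo) & (freqs < hi)].mean(axis=-1) for lo, hi in bands]
    )

# Synthetic stand-in: 60 epochs, 14 EEG channels, 2 s each, two classes
X_raw = rng.standard_normal((60, 14, 2 * fs))
y = rng.integers(0, 2, size=60)
X = np.array([band_powers(e, fs) for e in X_raw])  # shape (60, 14 * 3)

clf = SVC(kernel="rbf", C=1.0)
scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())
```

With random data the cross-validated accuracy hovers around chance; on real epochs labeled by attention score, the same feature matrix would feed the four predictive tasks formulated in the paper.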
Related papers
- Decoding speech perception from non-invasive brain recordings [48.46819575538446]
We introduce a model trained with contrastive-learning to decode self-supervised representations of perceived speech from non-invasive recordings.
Our model can identify, from 3 seconds of MEG signals, the corresponding speech segment with up to 41% accuracy out of more than 1,000 distinct possibilities.
arXiv Detail & Related papers (2022-08-25T10:01:43Z)
- Co-Located Human-Human Interaction Analysis using Nonverbal Cues: A Survey [71.43956423427397]
We aim to identify the nonverbal cues and computational methodologies resulting in effective performance.
This survey differs from its counterparts by involving the widest spectrum of social phenomena and interaction settings.
Some major observations: the most often used nonverbal cue is speaking activity, the most common computational method is the support vector machine, the typical interaction setting is a meeting of 3-4 persons, and the usual sensing approach is microphones and cameras.
arXiv Detail & Related papers (2022-07-20T13:37:57Z)
- End-to-End Binaural Speech Synthesis [71.1869877389535]
We present an end-to-end speech synthesis system that combines a low-bitrate audio system with a powerful decoder.
We demonstrate the capability of the adversarial loss in capturing environment effects needed to create an authentic auditory scene.
arXiv Detail & Related papers (2022-07-08T05:18:36Z)
- Toward a realistic model of speech processing in the brain with self-supervised learning [67.7130239674153]
Self-supervised algorithms trained on the raw waveform constitute a promising candidate.
We show that Wav2Vec 2.0 learns brain-like representations with as little as 600 hours of unlabelled speech.
arXiv Detail & Related papers (2022-06-03T17:01:46Z)
- Neural Language Taskonomy: Which NLP Tasks are the most Predictive of fMRI Brain Activity? [3.186888145772382]
Several popular Transformer based language models have been found to be successful for text-driven brain encoding.
In this work, we explore transfer learning from representations learned for ten popular natural language processing tasks.
Experiments across all 10 task representations provide the following cognitive insights.
arXiv Detail & Related papers (2022-05-03T10:23:08Z)
- Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition [48.56414496900755]
This work uses a neural implementation of convolutive sparse matrix factorization to decompose the articulatory data into interpretable gestures and gestural scores.
Phoneme recognition experiments were additionally performed to show that gestural scores indeed code phonological information successfully.
arXiv Detail & Related papers (2022-04-01T14:25:19Z)
- Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects [82.81964713263483]
A popular approach to decompose the neural bases of language consists in correlating, across individuals, the brain responses to different stimuli.
Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z)
- Learning spectro-temporal representations of complex sounds with parameterized neural networks [16.270691619752288]
We propose a parametrized neural network layer that computes specific spectro-temporal modulations based on Gabor kernels (Learnable STRFs).
We evaluated predictive capabilities of this layer on Speech Activity Detection, Speaker Verification, Urban Sound Classification and Zebra Finch Call Type Classification.
As this layer is fully interpretable, we used quantitative measures to describe the distribution of the learned spectro-temporal modulations.
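The Gabor-kernel idea behind Learnable STRFs can be illustrated with a fixed (non-learned) kernel over the time-frequency plane. This is a hedged sketch, not the authors' implementation: the kernel sizes, modulation rates, and envelope width below are arbitrary choices for illustration.

```python
# Illustrative sketch: a 2D Gabor kernel over the time-frequency plane,
# the building block behind Learnable STRFs (parameters here are arbitrary).
import numpy as np

def gabor_strf(n_freq=32, n_time=32, omega_f=0.1, omega_t=0.2, sigma=6.0):
    """Gabor kernel: a Gaussian envelope times a 2D sinusoidal carrier.
    omega_f and omega_t set the spectral and temporal modulation rates."""
    f = np.arange(n_freq) - n_freq // 2
    t = np.arange(n_time) - n_time // 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-(F**2 + T**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * (omega_f * F + omega_t * T))
    return envelope * carrier

kernel = gabor_strf()
print(kernel.shape)  # (32, 32)
```

In the paper's layer, parameters like the modulation rates are learned end-to-end; convolving a spectrogram with a bank of such kernels yields the spectro-temporal modulation features whose distribution the authors then analyze.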
arXiv Detail & Related papers (2021-03-12T07:53:47Z)
- Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews [9.728371067160941]
We train end-to-end neural network architectures to adapt to each task and evaluate each approach under the same metric.
Results do not depend on the demographics of the interviewee, highlighting the clinical relevance of our methods.
arXiv Detail & Related papers (2020-10-30T09:07:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.